Claude Code "400: no low surrogate in string" on every turn: repairing a permanently broken session transcript

A Claude Code session that returns API Error: 400 ... not valid JSON: no low surrogate in string on every turn is poisoned by a lone UTF-16 surrogate (a code point in U+D800–U+DFFF) baked into its on-disk transcript, and the fix is to close that session, strip only those lone surrogates from the offending line of the .jsonl file while leaving real emoji untouched, re-serialize that one line, and then claude --resume.

← hexisteme · notes · June 23, 2026

The poison is already on disk and is precisely targetable: a normal emoji is a single Python code point (e.g. U+1F9ED) and can never fall inside the surrogate range U+D800–U+DFFF, so deleting only that range removes the broken half-character with zero collateral damage to valid text. A cheap C-level byte pre-filter (scan for the \ud escape or raw ED A0–BF bytes before doing any per-line json.loads) cut a 174-file transcript scan from 3.4s to 1.1s, making it cheap enough to run automatically on every session start.

Why every turn fails: the surrogate is in the transcript, not the network

Claude Code persists each session to a JSONL transcript on disk (one JSON object per line, under your projects directory). Every turn replays the accumulated history back to the API. So if a single byte sequence in that history is invalid, the API rejects every subsequent request with the same error, at the same byte offset — the session is permanently bricked, and reopening it doesn't help because the bad data is reloaded from the file.

The specific failure is 400 The request body is not valid JSON: no low surrogate in string: line 1 column N (the mirror-image variant is no high surrogate). Non-BMP characters — emoji like 🧭, some extended CJK ideographs — are encoded in UTF-16 as a pair of surrogate code units: a high surrogate (U+D800–U+DBFF) followed by a low surrogate (U+DC00–U+DFFF). When a large tool output is truncated by a length limit and the cut lands exactly between the two halves of a pair, one orphaned half survives. That lone surrogate gets written into the transcript, replayed on every turn, and the API's strict JSON parser refuses it.

The triggering pattern is mundane: dumping a big, emoji-heavy blob into the context — a worker log peppered with status emoji, a daily-report job's output, a verbose ingest run — right before a large body that gets truncated mid-pair. Content-heavy projects (anything generating a lot of natural-language or creative text with non-BMP characters) re-hit this per session, not once.

The safe repair: strip only U+D800–U+DFFF, leave real emoji intact

The key insight that makes the fix safe is a property of Python's str: a valid emoji is a single code point (🧭 is U+1F9ED), so it can never land in the surrogate range U+D800–U+DFFF. Anything you find in that range is, by definition, a broken half. So you can delete exactly those code points and every legitimate character — emoji included — is left byte-for-byte untouched. You are not "removing emoji"; you are removing the orphaned halves that should never have been on disk.

The manual version of the fix, when you don't have a script handy:

import json

# read the one offending line, parse it, walk every string,
# drop only lone surrogates, re-serialize compactly
obj = json.loads(line)

def strip(o):
    if isinstance(o, str):
        return "".join(c for c in o if not (0xD800 <= ord(c) <= 0xDFFF))
    if isinstance(o, list):
        return [strip(x) for x in o]
    if isinstance(o, dict):
        return {k: strip(v) for k, v in o.items()}
    return o

fixed = json.dumps(strip(obj), ensure_ascii=False, separators=(",", ":"))

Write fixed back as that single line, leaving every other line of the transcript exactly as it was. Always copy the file to a .bak first. Re-serializing only the broken line keeps the diff minimal and preserves the rest of the conversation verbatim. After the rewrite, claude --resume and the 400 is gone.

Close the session before you touch the file

This is the step people skip, and it silently undoes the repair. While a session is open, Claude Code holds the transcript and will overwrite your edited file from its in-memory state — your fix vanishes the moment the next turn flushes. You must close the target session first, or verify nothing holds the file:

lsof -- ~/.claude/projects//.jsonl

Empty output means it's safe to edit. A good repair tool checks this for you and refuses (skips) any transcript that is currently held open, so an automated pass can never corrupt a live session. The trade-off: the one session you most want to fix — the one throwing 400s right now — may be the one you have open, so a fully automatic pass can skip exactly that file. That's the case where you fall back to the manual close-then-fix once.

Make it cheap enough to auto-repair on every session start

Because a content-heavy project re-breaks per session, a one-time manual fix isn't durable — you want the repair to run automatically before you ever see the error. The natural place is a SessionStart hook that scans recently-modified transcripts and silently cleans the closed ones. The problem is that per-line json.loads over every transcript in your projects tree is too slow to run on every launch.

The fix is a cheap byte-level pre-filter that runs before any JSON parsing. A lone surrogate only persists to disk in two shapes, and both are detectable by scanning raw bytes:

A file with neither signal is provably clean and skips the expensive parse entirely. In practice this matters: a 174-file scan dropped from 3.4s to 1.1s, cheap enough to run on every session start. Run it scoped to recent days only, quiet unless something was actually fixed, and skipping any open transcript via lsof. A representative one-shot invocation in a hook:

python3 fix-jsonl-surrogates.py --fix-all --recent 3 --quiet

Why you can't prevent it upstream (honest limitation)

This is a repair pattern, not a prevention pattern, and that's a deliberate concession to where the truncation happens. The cut that orphans a surrogate occurs inside the harness's own truncation logic, between the model output and the transcript write. The user-facing hook lifecycle (PreToolUse, PostToolUse, Stop, SessionStart, and so on) fires around lifecycle events, not in the middle of serializing the request body — so there is no interception point that can stop the bad byte from being written in the first place. Making the recovery fast and automatic is the only lever you actually control.

The pre-filter approach also has a real gap: if you immediately --resume the very session that's broken, the auto-repair pass may find the file held open and skip it (correctly, to avoid clobbering live state), so coverage is not 100% — that's the one case needing a manual close-then-fix. And do not lean on /compact as a fix: if a lone surrogate survives into the summary, the 400 persists; if it works, it worked by luck, not by design. The only genuine frequency-reduction is behavioral: avoid dumping huge emoji-saturated blobs (verbose logs, status-emoji-heavy output) into the context wholesale, or ASCII-ify such logs at the source.

FAQ

Q. What does '400 the request body is not valid JSON: no low surrogate in string' mean in Claude Code?
It means your session's on-disk transcript contains a lone UTF-16 surrogate — an orphaned half of a surrogate pair, a code point in U+D800–U+DFFF that has no matching partner. Non-BMP characters like emoji are stored as a high+low surrogate pair; when a large output is truncated mid-pair, one half is left behind. That half is replayed in every request, and the API's strict JSON parser rejects it, so the session fails on every turn at the same byte offset.

Q. Why does the error happen on every single turn instead of just once?
Because Claude Code replays the full session history (read from the transcript JSONL file) on every request. The invalid byte is part of that persisted history, so it's resent each turn and rejected each turn. Reopening the session doesn't help — the bad data is reloaded from disk. The session stays bricked until you edit the file itself.

Q. Will stripping surrogates delete my real emoji or corrupt the conversation?
No. A valid emoji is a single Unicode code point (for example 🧭 is U+1F9ED) and can never fall inside the surrogate range U+D800–U+DFFF. Only broken half-characters live in that range. Removing exactly that range leaves every legitimate character — emoji included — untouched. If you re-serialize only the one offending line and back up the file first, the rest of the transcript is preserved verbatim.

Q. Do I need to close the Claude Code session before editing the .jsonl transcript?
Yes — this is the step that silently breaks the fix if skipped. While the session is open, Claude Code holds the transcript and will overwrite your edited file from its in-memory state, erasing your repair. Close the session first, or run lsof on the file to confirm nothing holds it open. A safe repair tool checks this with lsof and skips any open transcript so it can never clobber a live session.

Q. Can I prevent this from happening instead of repairing it each time?
Not fully, because the mid-pair truncation that creates the lone surrogate happens inside the harness's own serialization, where no user-facing hook can intercept it. The hook lifecycle fires around events, not during request-body serialization. The practical mitigations are: (1) auto-repair on SessionStart with a cheap byte pre-filter so recovery is fast and silent, and (2) behavioral — avoid dumping huge emoji-heavy logs into context wholesale, or ASCII-ify those logs at the source to reduce how often it triggers.

Q. How do I make the auto-repair scan fast enough to run on every session start?
Add a byte-level pre-filter before any JSON parsing. A lone surrogate only persists as either the escape sequence \ud... or the raw UTF-8 bytes ED A0–BF (valid UTF-8 only allows ED followed by 80–9F). If a file contains neither signal it's provably clean and you skip the expensive per-line json.loads. This dropped a 174-file scan from 3.4s to 1.1s. Scope it to recent days, run quiet unless something was fixed, and skip open files.

← hexisteme · notes · CC-BY 4.0