Getting Reliable Structured Output from LLMs
The moment you move an AI feature from "chat" to "part of your app," you need the model to return structured data — usually JSON your code can parse — not a friendly paragraph. This is where a lot of AI features quietly break, because language models are built to produce text, and text is messy. Here's how to get reliable structure out of them.
Why this is hard
An LLM predicts likely text. Ask for JSON and it will usually produce valid JSON, but "usually" is a nightmare in production. It might wrap the JSON in a markdown code block, add a chatty preamble ("Sure! Here's the JSON:"), use single quotes, or get cut off mid-object when it hits the output length limit. Any of these breaks a naive JSON.parse(), and now your feature is throwing errors on real users.
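To make those failure modes concrete, here is a small sketch that feeds a few responses to Python's `json.loads`. The sample strings are made up for illustration, not captured from a real model.

```python
import json

FENCE = "`" * 3  # built this way to avoid a literal fence inside this example

# Illustrative strings only; these are invented, not real model output.
samples = {
    "clean": '{"category": "billing", "confidence": 0.9}',
    "fenced": FENCE + 'json\n{"category": "billing"}\n' + FENCE,
    "preamble": 'Sure! Here\'s the JSON:\n{"category": "billing"}',
    "single_quotes": "{'category': 'billing'}",
    "truncated": '{"category": "billing", "tags": ["refund", ',
}


def parses(text: str) -> bool:
    """Return True if the text is valid JSON exactly as received."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


results = {name: parses(text) for name, text in samples.items()}
print(results)
```

Only the clean sample parses. The other four all contain recognisable JSON, or most of it, which is why extraction and validation are worth the effort.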
Technique 1: Use native structured-output features
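The best fix, when available, is to not fight the problem at all. Many modern model APIs offer a structured-output feature: you provide a schema and the API constrains the response to conform to it. Some offer only a looser JSON mode, which promises syntactically valid JSON but not your particular fields, so check which one you are getting. If your provider offers schema-based output, use it; it eliminates most parsing failures at the source, although a response cut off at the length limit may still come back incomplete. This should be your first choice.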
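The request parameter that carries the schema differs between providers, so the sketch below stops short of any API call and shows only the part that carries over: a JSON Schema for the response. The field names (`category`, `confidence`, `tags`) and the 0 to 1 range are illustrative assumptions, and not every provider accepts every schema keyword, so check your provider's documentation.

```python
import json

# Hypothetical schema for a classification response. Field names and the
# 0..1 range are illustrative, not taken from any provider's documentation.
RESPONSE_SCHEMA = {
    "type": "object",
    "properties": {
        "category": {"type": "string"},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["category", "confidence", "tags"],
    "additionalProperties": False,
}

# Serialized form, ready to pass wherever your provider expects a schema.
schema_json = json.dumps(RESPONSE_SCHEMA, indent=2)
print(schema_json)
```

Marking every field as required and disallowing extra properties keeps the contract tight, which tends to make downstream code simpler.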
Technique 2: Ask precisely, with an example
When you're prompting directly, be explicit and show the shape you want:
Return ONLY valid JSON matching this structure, with no explanation
and no markdown fences:
{"category": "string", "confidence": 0.0, "tags": ["string"]}
Two things matter here: telling it to return only JSON with no extra prose, and giving a concrete example of the structure. Models tend to follow examples more reliably than they follow descriptions.
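In code, it helps to generate the example from one place so the prompt and your parser cannot drift apart. A minimal sketch, assuming a classification task; the function name `build_prompt` and the "Text to classify" label are hypothetical:

```python
import json

# The placeholder values double as a hint about each field's type.
EXAMPLE = {"category": "string", "confidence": 0.0, "tags": ["string"]}


def build_prompt(user_text: str) -> str:
    """Assemble the instruction, a concrete example, and the input text."""
    return (
        "Return ONLY valid JSON matching this structure, with no explanation\n"
        "and no markdown fences:\n"
        f"{json.dumps(EXAMPLE)}\n\n"
        f"Text to classify:\n{user_text}"
    )


print(build_prompt("My invoice was charged twice."))
```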
Technique 3: Always parse defensively
Never trust the raw response. Extract and validate before you use it:
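import json
import re

def parse_json(text: str):
    """Extract the outermost {...} span from a model response and parse it."""
    # Skip markdown fences and surrounding prose by matching from the first
    # "{" to the last "}" (greedy, so nested objects stay intact).
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        raise ValueError("no JSON object found")
    # json.loads raises json.JSONDecodeError (a ValueError) on bad syntax.
    return json.loads(match.group(0))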
Then validate the shape against a schema (Pydantic in Python, or a validation library in your stack). A response that parses but is missing a required field is still a bug — catch it here, not three screens later.
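Here is a standard-library-only sketch of that validation step for the three-field structure used earlier. A library such as Pydantic would express the same checks more concisely. The function name `validate_result` and the rule that `confidence` lies between 0 and 1 are assumptions for illustration.

```python
def validate_result(data: object) -> dict:
    """Check that parsed JSON has the expected fields and types."""
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    category = data.get("category")
    confidence = data.get("confidence")
    tags = data.get("tags")

    if not isinstance(category, str) or not category:
        raise ValueError("category must be a non-empty string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    # Assumed range; adjust if your confidence scale differs.
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        raise ValueError("tags must be a list of strings")

    return {"category": category, "confidence": float(confidence), "tags": tags}
```

Raising `ValueError` for every problem means syntax errors and shape errors can share one handler, since `json.JSONDecodeError` is itself a `ValueError`.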
Technique 4: Retry with the error
When parsing or validation fails, you often don't need to fail the whole request. Send the model its own broken output plus the error and ask it to fix it:
Your previous response was not valid JSON. Error: <message>.
Return corrected JSON only.
A single retry often resolves a malformed response. Cap it at one or two attempts so a stubborn failure can't loop forever.
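A capped retry loop might look like the sketch below. `call_model` is a hypothetical stand-in for whatever function sends a prompt to your provider and returns the raw text, and plain `json.loads` stands in for your full parse-and-validate step.

```python
import json
from typing import Callable, Optional


def get_json_with_retry(
    call_model: Callable[[str], str],  # hypothetical: prompt in, raw text out
    prompt: str,
    max_retries: int = 1,
) -> dict:
    """Ask for JSON; on failure, send the broken output and error back."""
    current_prompt = prompt
    last_error: Optional[Exception] = None

    for _ in range(max_retries + 1):
        raw = call_model(current_prompt)
        try:
            data = json.loads(raw)
            if not isinstance(data, dict):
                raise ValueError("expected a JSON object")
            return data
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            last_error = exc
            # Rebuild from the original prompt so retries do not snowball.
            current_prompt = (
                f"{prompt}\n\n"
                f"Your previous response was:\n{raw}\n\n"
                f"Your previous response was not valid JSON. Error: {exc}.\n"
                "Return corrected JSON only."
            )

    raise ValueError(
        f"no valid JSON after {max_retries + 1} attempts"
    ) from last_error
```

The loop makes at most `max_retries + 1` calls, then raises so the caller can fall back.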
Technique 5: Have a fallback
Sometimes the model just won't cooperate, or the provider is down. Decide in advance what happens then: a safe default value, a "couldn't process this" state, or a queue for later. What you must not do is let an unparsed response crash the feature or corrupt user data.
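One way to sketch the safe-default option is below. `call_model` is again a hypothetical function that sends a prompt and returns raw text, and the contents of `FALLBACK` are an assumption about what "unknown" should look like in your feature.

```python
import copy
import json
import logging
from typing import Callable

logger = logging.getLogger(__name__)

# Assumed safe default; choose values that downstream code treats as "unknown".
FALLBACK = {"category": "unknown", "confidence": 0.0, "tags": []}


def classify_or_fallback(call_model: Callable[[str], str], prompt: str) -> dict:
    """Return the parsed JSON object, or a copy of FALLBACK on any failure."""
    try:
        data = json.loads(call_model(prompt))
        if not isinstance(data, dict):
            raise ValueError("expected a JSON object")
        return data
    except Exception:
        # Covers provider outages as well as bad output; log it for review.
        logger.exception("structured output failed; using fallback")
        # Deep copy so callers cannot mutate the shared default.
        return copy.deepcopy(FALLBACK)
```

Catching every exception is deliberate here because this is the outermost boundary of the feature. Logging the failure keeps it visible even though the user never sees it.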
Keep the schema small
The more fields and nesting you demand, the more ways the model can drift. Ask for the minimum structure you need. If you need something complex, consider breaking it into smaller, separate calls — each with a simple schema — rather than one giant request.
Summary
Getting structured output from an LLM is a solved problem if you layer your defences: prefer native JSON/schema modes, prompt with an explicit example, parse and validate defensively, retry with the error message on failure, and always have a fallback. Treat the model as a helpful but unreliable narrator, and build the guardrails that turn "usually valid" into "always safe."