Structured Outputs

LLMs do not always follow instructions reliably. Structured outputs are a feature that ensures model responses follow a strict, user-defined format (such as a JSON or XML schema) instead of free-form text, making outputs predictable, machine-readable, and easy to integrate into applications.

Natively Supported Structured Outputs

A number of LLM services (e.g., vLLM, OpenAI, Anthropic Claude, AWS GovCloud Bedrock) include native support for producing structured outputs. To take advantage of this capability when it exists, supply a Pydantic model representing the desired output format to the response_format parameter of LLM.prompt.

Anthropic or OpenAI

from onprem import LLM
from pydantic import BaseModel

class ContactInfo(BaseModel):
    name: str
    email: str
    plan_interest: str
    demo_requested: bool

# Create LLM instance for Claude
llm = LLM("anthropic/claude-3-7-sonnet-latest")

# Use structured output - this should automatically use Claude's native API
result = llm.prompt(
    "Extract info from: John Smith (john@example.com) is interested in our Enterprise plan and wants to schedule a demo for next Tuesday at 2pm.",
    response_format=ContactInfo
)

print(f"Name: {result.name}")
print(f"Email: {result.email}")
print(f"Plan: {result.plan_interest}")
print(f"Demo: {result.demo_requested}")

The above approach using the response_format parameter works with both Anthropic and OpenAI as LLM backends.
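Under the hood, native structured outputs are driven by the JSON schema that Pydantic derives from your model. If you want to see what constrains the backend's generation, you can inspect the schema yourself (a quick illustration; the exact payload sent to each backend may differ):

```python
from pydantic import BaseModel

class ContactInfo(BaseModel):
    name: str
    email: str
    plan_interest: str
    demo_requested: bool

# JSON schema derived from the Pydantic model; native structured-output
# backends constrain generation against a schema like this
schema = ContactInfo.model_json_schema()
print(schema["required"])                                # ['name', 'email', 'plan_interest', 'demo_requested']
print(schema["properties"]["demo_requested"]["type"])    # 'boolean'
```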

AWS GovCloud Bedrock

A structured output example using AWS GovCloud Bedrock is shown here.

vLLM

For vLLM, you can generate structured outputs by passing vLLM's documented extra parameters via the extra_body argument, as illustrated below:

from onprem import LLM
llm = LLM(model_url='http://localhost:8666/v1', api_key='test123', model='MyGPT')

# classification-based structured outputs
result = llm.prompt('Classify this sentiment: vLLM is wonderful!',
                     extra_body={"structured_outputs": {"choice": ["positive", "negative"]}})
# OUTPUT: positive

# JSON-based structured outputs
from pydantic import BaseModel, Field
class MeasuredQuantity(BaseModel):
    value: str = Field(description="numerical value - number only")
    unit: str = Field(description="unit of measurement")
response_format = {"type": "json_schema",
                   "json_schema": {
                       "name": MeasuredQuantity.__name__.lower(),
                       "schema": MeasuredQuantity.model_json_schema()}}
result = llm.prompt('Extract unit and value from the following: He was going 35 mph.',
                    response_format=response_format)
# OUTPUT: { "value": "35", "unit": "mph" }
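Because the backend returns plain JSON text, you can validate it back into the Pydantic model to get a typed object. Here is a small illustration using the example output shown above:

```python
from pydantic import BaseModel, Field

class MeasuredQuantity(BaseModel):
    value: str = Field(description="numerical value - number only")
    unit: str = Field(description="unit of measurement")

# validate the raw JSON returned by the backend into a typed object
mq = MeasuredQuantity.model_validate_json('{ "value": "35", "unit": "mph" }')
print(mq.value, mq.unit)   # 35 mph
```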

# RegEx-based structured outputs
result = llm.prompt(
    "Generate an example email address for Alan Turing, who works in Enigma. End in "
    ".com and new line.",
    extra_body={"structured_outputs": {"regex": r"\w+@\w+\.com\n"}, "stop": ["\n"]},
)
# OUTPUT: Alan_Turing@enigma.com
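It can be useful to sanity-check a regex constraint locally before sending it to the server. A full match against the kind of output shown above confirms the pattern behaves as intended:

```python
import re

# same pattern passed to vLLM above; the generated text (before the
# stop token strips the trailing newline) must match it fully
pattern = r"\w+@\w+\.com\n"
assert re.fullmatch(pattern, "Alan_Turing@enigma.com\n") is not None
assert re.fullmatch(pattern, "not an email") is None
```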

Ollama

from onprem import LLM
from pydantic import BaseModel

class Pet(BaseModel):
  name: str
  animal: str
  age: int
  color: str | None
  favorite_toy: str | None

class PetList(BaseModel):
  pets: list[Pet]

llm = LLM('ollama/llama3.1')
result = llm.prompt('I have two cats named Luna and Loki...', format=PetList.model_json_schema())

When using an LLM backend that does not natively support structured outputs, supplying a Pydantic model via the response_format parameter of LLM.prompt automatically falls back to a prompt-based approach, described next.

Tip: When using natively-supported structured outputs, it is important to include an actual instruction in the prompt (e.g., “Classify this sentiment”, “Extract info from”, etc.). With prompt-based structured outputs (described below), the instruction can often be omitted.

Prompt-Based Structured Outputs

The LLM.pydantic_prompt method also allows you to specify the desired structure of the LLM's output as a Pydantic model. Internally, LLM.pydantic_prompt wraps the user-supplied prompt within a larger prompt telling the LLM to output results in a specific JSON format. It is sometimes less efficient and less reliable than the native methods described above, but it works with any LLM. Since calling LLM.prompt with the response_format parameter automatically invokes LLM.pydantic_prompt when necessary, you will typically not need to call LLM.pydantic_prompt directly.

from pydantic import BaseModel, Field

class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

from onprem import LLM
llm = LLM(default_model='llama', verbose=False)
structured_output = llm.pydantic_prompt('Tell me a joke.', pydantic_model=Joke)
{
  "setup": "Why couldn't the bicycle stand up by itself?",
  "punchline": "Because it was two-tired!"
}
structured_output
Joke(setup="Why couldn't the bicycle stand up by itself?", punchline='Because it was two-tired!')
print(structured_output.setup)
print()
print(structured_output.punchline)
Why couldn't the bicycle stand up by itself?

Because it was two-tired!
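The prompt-wrapping described above can be sketched roughly as follows. This is a simplified illustration, not OnPrem.LLM's actual implementation, and the wrap_prompt helper is hypothetical:

```python
import json
from pydantic import BaseModel, Field

class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

def wrap_prompt(user_prompt: str, pydantic_model: type[BaseModel]) -> str:
    # hypothetical sketch: embed the model's JSON schema in the instruction
    schema = json.dumps(pydantic_model.model_json_schema(), indent=2)
    return (f"{user_prompt}\n\n"
            "Respond ONLY with a JSON object conforming to this schema:\n"
            f"{schema}")

wrapped = wrap_prompt("Tell me a joke.", Joke)
```

The LLM's JSON reply is then parsed and validated against the Pydantic model, which is how a typed Joke object comes back to the caller.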

Guider

The Guider in OnPrem.LLM, a simple interface to the Guidance package, can be used to guide the output of an LLM based on conditions and constraints that you supply.

Let’s begin by creating an onprem.LLM instance.

from onprem import LLM
llm = LLM(n_gpu_layers=-1, verbose=False) # set based on your system

Next, let’s create a Guider instance.

from onprem.pipelines.guider import Guider
guider = Guider(llm)

The guider.prompt method accepts Guidance prompts as input. (You can refer to the Guidance documentation for information on how to construct such prompts.)

Here, we’ll show some examples (mostly taken from the Guidance documentation) and begin with importing some Guidance functions.

The select function

The select function allows you to guide the LLM to generate output from only a finite set of alternatives. The Guider.prompt method returns a dictionary with the answer associated with the key you supply in the prompt.

from guidance import select
guidance_program = f'Do you want a joke or a poem? A ' + select(['joke', 'poem'], name='answer') # example from Guidance documentation
guider.prompt(guidance_program)
Do you want a joke or a poem? A joke
{'answer': 'joke'}

The gen function

The gen function allows you to place conditions and constraints on the generated output.

from guidance import gen
guider.prompt(f'The capital of France is {gen("answer", max_tokens=1, stop=".")}')
The capital of France is Paris
{'answer': 'Paris'}

You can also use regular expressions to guide the output.

prompt = f"""Question: Luke has ten balls. He gives three to his brother. How many balls does he have left?
Answer: """ + gen('answer', regex=r'\d+')
guider.prompt(prompt)
Question: Luke has ten balls. He gives three to his brother. How many balls does he have left?
Answer: 7
{'answer': '7'}
prompt = 'Generate a list of numbers in descending order. 19, 18,' + gen('answer', max_tokens=50, stop_regex=r'[^\d]7[^\d]')
guider.prompt(prompt)
Generate a list of numbers in descending order. 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8,
{'answer': ' 17, 16, 15, 14, 13, 12, 11, 10, 9, 8,'}

Structured Outputs With Guider

Using select and gen, you can guide the LLM to produce outputs conforming to the structure that you want (e.g., JSON).

Let’s create a prompt for generating fictional D&D-type characters.

sample_weapons = ["sword", "axe", "mace", "spear", "bow", "crossbow"]
sample_armour = ["leather", "chainmail", "plate"]

def generate_character_prompt(
    character_one_liner,
    weapons: list[str] = sample_weapons,
    armour: list[str] = sample_armour,
    n_items: int = 3
):

    prompt = ''
    prompt += "{"
    prompt += f'"description" : "{character_one_liner}",'
    prompt += '"name" : "' + gen(name="character_name", stop='"') + '",'
    # With guidance, we can call a GPU rather than merely random.randint()
    prompt += '"age" : ' + gen(name="age", regex="[0-9]+") + ','
    prompt += '"armour" : "' + select(armour, name="armour") + '",'
    prompt += '"weapon" : "' + select(weapons, name="weapon") + '",'
    prompt += '"class" : "' + gen(name="character_class", stop='"') + '",'
    prompt += '"mantra" : "' + gen(name="mantra", stop='"') + '",'
    # Again, we can avoid calling random.randint() like a pleb
    prompt += '"strength" : ' + gen(name="strength", regex="[0-9]+") + ','
    prompt += '"quest_items" : [ '
    for i in range(n_items):
        prompt += '"' + gen(name="items", list_append=True, stop='"') + '"'
        # We now pause a moment to express our thoughts on the JSON
        # specification's dislike of trailing commas
        if i < n_items - 1:
            prompt += ','
    prompt += "]"
    prompt += "}"
    return prompt
d = guider.prompt(generate_character_prompt("A quick and nimble fighter"))
{"description" : "A quick and nimble fighter","name" : "Rogue","age" : 0,"armour" : "leather","weapon" : "crossbow","class" : "rogue","mantra" : "Stay nimble, stay quick.","strength" : 10,"quest_items" : [ "a set of thieves' tools","a map of the local area","a set of lockpicks"]}

The Generated Dictionary:

d
{'items': ['a set of lockpicks',
  'a map of the local area',
  "a set of thieves' tools"],
 'strength': '10',
 'mantra': 'Stay nimble, stay quick.',
 'character_class': 'rogue',
 'weapon': 'crossbow',
 'armour': 'leather',
 'age': '0',
 'character_name': 'Rogue'}

Convert to JSON

import json
print(json.dumps(d, indent=4))
{
    "items": [
        "a set of lockpicks",
        "a map of the local area",
        "a set of thieves' tools"
    ],
    "strength": "10",
    "mantra": "Stay nimble, stay quick.",
    "character_class": "rogue",
    "weapon": "crossbow",
    "armour": "leather",
    "age": "0",
    "character_name": "Rogue"
}