Response Generation & Parsing Tutorial (Without Inference)#

This Foundations tutorial shows how response-generation methods and parser utilities fit together. We focus on prompt contracts and parser behavior only, so there are no vLLM/OpenAI calls and no inference runs.

By the end, you will be able to inspect output instructions quickly and predict how parser helpers behave on clean and messy responses.

We will build one reusable pizza-themed setup, compare methods side by side, and then parse hand-crafted outputs. All examples are deterministic so you can inspect behavior directly.

import html
import json
import warnings

import pandas as pd
from IPython.display import HTML, display

from qstn.inference import (
    ChoiceResponseGenerationMethod,
    JSONReasoningResponseGenerationMethod,
    JSONSingleResponseGenerationMethod,
    LogprobResponseGenerationMethod,
)
from qstn.parser import parse_json, parse_json_battery, parse_json_str, parse_logprobs
from qstn.prompt_builder import LLMPrompt, generate_likert_options
from qstn.utilities import constants, placeholder
from qstn.utilities.constants import QuestionnairePresentation
from qstn.utilities.survey_objects import InferenceResult, QuestionLLMResponseTuple
from qstn.utilities.utils import create_one_dataframe

1. Minimal Setup#

We start with two pizza moments and one shared topping option set. This gives us a clean baseline that we can reuse in every block.

questionnaire_df = pd.DataFrame(
    {
        constants.QUESTIONNAIRE_ITEM_ID: [101, 102],
        constants.QUESTION_CONTENT: [
            "movie night with friends",
            "a rainy sunday at home",
        ],
    }
)

base_prompt = LLMPrompt(
    questionnaire_source=questionnaire_df,
    questionnaire_name="FoundationsResponseContract",
    system_prompt=(
        "You are a careful survey assistant.\n"
        f"{placeholder.PROMPT_AUTOMATIC_OUTPUT_INSTRUCTIONS}"
    ),
    prompt=(
        "Choose exactly one pizza topping for the following situation.\n"
        f"{placeholder.PROMPT_QUESTIONS}"
    ),
)

pizza_toppings = [
    "Pepperoni",
    "Mushrooms",
    "Pineapple",
    "Jalapenos",
    "Extra cheese",
]

shared_options = generate_likert_options(n=5, answer_texts=pizza_toppings)

question_stem = (
    f"Pick one topping for a pizza made for {placeholder.QUESTION_CONTENT}.\n"
    f"{placeholder.PROMPT_OPTIONS}"
)

base_prompt.prepare_prompt(question_stem=question_stem, answer_options=shared_options)

display(questionnaire_df)

	questionnaire_item_id	question_content
0	101	movie night with friends
1	102	a rainy sunday at home

generate_likert_options creates index-prefixed labels by default (for example 4: Jalapenos). The parser examples below intentionally use that exact format so the prompt contract and parsed payloads stay aligned.
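To make the label format concrete, the following plain-Python sketch rebuilds the index-prefixed labels and the options line that the rendered prompts use. The helper name `index_prefixed_labels` is hypothetical and not part of qstn; it only mirrors the observable output format.

```python
def index_prefixed_labels(answer_texts, start=1):
    """Hypothetical helper: mirror the '<index>: <text>' label format."""
    return [f"{i}: {text}" for i, text in enumerate(answer_texts, start=start)]


toppings = ["Pepperoni", "Mushrooms", "Pineapple", "Jalapenos", "Extra cheese"]
labels = index_prefixed_labels(toppings)

# Join the labels the same way the prompt shows them.
options_line = "Options are: " + ", ".join(labels)
print(options_line)
```

Hand-crafted parser inputs such as `4: Jalapenos` should copy a label from this list character for character.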

2. Block A: Automatic Output Instructions by Method#

Each variant below uses the same pipeline. The only thing that changes is response_generation_method=... inside generate_likert_options(...).

All response-generation methods use two independent controls:

constrain_answer_options=True (the default) limits answer fields or guided choices to the resolved AnswerOptions. Set it to False to remove only this answer-option restriction; JSON methods still enforce their JSON schema when output constraints remain enabled.
constrain_output=True (the default) sends JSON schemas or guided-choice constraints to the inference backend. Set it to False to retain automatic prompt instructions without sending an output constraint. Logprob collection remains active.
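One way to read the two flags together is as a small decision table. The sketch below is only a summary of the descriptions above, not qstn code: the function `constraint_sent` and its `method_kind` argument are hypothetical, and the `"choice"` branch assumes that a guided choice has nothing left to enforce once the answer-option restriction is removed.

```python
def constraint_sent(method_kind, constrain_answer_options=True, constrain_output=True):
    """Hypothetical summary: which output constraint reaches the backend.

    method_kind is "json" for the JSON methods and "choice" for the
    choice/logprob methods (labels invented for this sketch).
    """
    if not constrain_output:
        # Prompt instructions stay, but no constraint is sent.
        return "none"
    if method_kind == "json":
        if constrain_answer_options:
            return "json schema limited to answer options"
        return "json schema only"
    # Choice-style methods: the constraint is the guided choice itself.
    return "guided choice" if constrain_answer_options else "none"


for kind in ("json", "choice"):
    for options_flag in (True, False):
        for output_flag in (True, False):
            print(kind, options_flag, output_flag, "->",
                  constraint_sent(kind, options_flag, output_flag))
```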

For every method, we show the full SYSTEM PROMPT and PROMPT. The automatically injected instruction is highlighted in bold inside the system prompt.

You can customize the automatically injected instruction by changing the output_template of the response-generation method.

def show_variant_prompts(method_name: str, variant_options):
    variant = base_prompt.duplicate()
    variant.prepare_prompt(question_stem=question_stem, answer_options=variant_options)
    system_text, user_text = variant.get_prompt_for_questionnaire_type(
        questionnaire_type=QuestionnairePresentation.SINGLE_ITEM,
        item_position=0,
    )

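    # The base system prompt is a single line, so everything after the first
    # line is the automatically injected instruction.
    system_lines = [line.strip() for line in system_text.splitlines()]
    automatic_instruction = "\n".join(system_lines[1:]).strip() if len(system_lines) > 1 else ""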

    escaped_system = html.escape(system_text)
    if automatic_instruction:
        escaped_instruction = html.escape(automatic_instruction)
        highlighted_system = escaped_system.replace(
            escaped_instruction, f"<strong>{escaped_instruction}</strong>", 1
        )
        instruction_summary = f"<strong>{escaped_instruction}</strong>"
    else:
        highlighted_system = escaped_system
        instruction_summary = "<em>(none)</em>"

    print(f"=== {method_name} ===")
    display(HTML(f"<strong>SYSTEM PROMPT</strong><pre>{highlighted_system}</pre>"))
    display(HTML(f"<strong>PROMPT</strong><pre>{html.escape(user_text)}</pre>"))
    display(HTML(f"<strong>Automatic instruction</strong>: {instruction_summary}"))

2.1 No response-generation method#

This is the unconstrained baseline.

options = generate_likert_options(
    n=5,
    answer_texts=pizza_toppings,
    response_generation_method=None,
)
show_variant_prompts("No Response Generation Method", options)

=== No Response Generation Method ===

SYSTEM PROMPT

You are a careful survey assistant.

PROMPT

Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese

Automatic instruction: (none)

2.2 `ChoiceResponseGenerationMethod`#

This variant asks for one choice label without JSON wrapping. Answer options are attached through generate_likert_options. With the default shared settings above, generation is guided to one of those options.

options = generate_likert_options(
    n=5,
    answer_texts=pizza_toppings,
    response_generation_method=ChoiceResponseGenerationMethod(),
)
show_variant_prompts("ChoiceResponseGenerationMethod", options)

=== ChoiceResponseGenerationMethod ===

SYSTEM PROMPT

You are a careful survey assistant.
You only respond with the most probable answer option.

PROMPT

Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese

Automatic instruction: You only respond with the most probable answer option.

2.3 `JSONSingleResponseGenerationMethod`#

This version enforces a JSON object with one answer field.

options = generate_likert_options(
    n=5,
    answer_texts=pizza_toppings,
    response_generation_method=JSONSingleResponseGenerationMethod(),
)
show_variant_prompts("JSONSingleResponseGenerationMethod", options)

=== JSONSingleResponseGenerationMethod ===

SYSTEM PROMPT

You are a careful survey assistant.
You only respond with the most probable answer option in the following JSON format:
{
  "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
}

PROMPT

Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese

Automatic instruction: You only respond with the most probable answer option in the following JSON format: { "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese" }

2.4 `JSONReasoningResponseGenerationMethod`#

This version keeps both reasoning and final answer in structured JSON.

options = generate_likert_options(
    n=5,
    answer_texts=pizza_toppings,
    response_generation_method=JSONReasoningResponseGenerationMethod(),
)
show_variant_prompts("JSONReasoningResponseGenerationMethod", options)

=== JSONReasoningResponseGenerationMethod ===

SYSTEM PROMPT

You are a careful survey assistant.
You always reason about the possible answer options first.
You respond with your reasoning and the most probable answer option in the following JSON format:
{
  "reasoning": "your reasoning about the answer options",
  "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
}

PROMPT

Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese

Automatic instruction: You always reason about the possible answer options first. You respond with your reasoning and the most probable answer option in the following JSON format: { "reasoning": "your reasoning about the answer options", "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese" }

2.5 `LogprobResponseGenerationMethod`#

This variant requests token probabilities. Like the choice and JSON answer methods, it derives answer options automatically. constrain_answer_options=True (the default) additionally applies guided-choice generation; set it to False to collect logprobs without restricting tokens.

constrain_output=False also disables guided-choice generation, but logprob collection remains active. This lets the response-generation method supply prompt instructions and request probabilities without enforcing an output constraint.

options = generate_likert_options(
    n=5,
    answer_texts=pizza_toppings,
    response_generation_method=LogprobResponseGenerationMethod(),
)
show_variant_prompts("LogprobResponseGenerationMethod", options)

=== LogprobResponseGenerationMethod ===

SYSTEM PROMPT

You are a careful survey assistant.
You only respond with the most probable answer option.

PROMPT

Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese

Automatic instruction: You only respond with the most probable answer option.

3. Block B: Single-Item vs Battery Rendering#

Now we keep one JSON method and compare the two rendering modes.

json_prompt = base_prompt.duplicate()
json_options = generate_likert_options(
    n=5,
    answer_texts=pizza_toppings,
    response_generation_method=JSONSingleResponseGenerationMethod(),
)
json_prompt.prepare_prompt(question_stem=question_stem, answer_options=json_options)

<qstn.prompt_builder.LLMPrompt at 0x7366f7528b30>

3.1 `SINGLE_ITEM` rendering#

single_system, single_user = json_prompt.get_prompt_for_questionnaire_type(
    questionnaire_type=QuestionnairePresentation.SINGLE_ITEM,
    item_position=0,
)

single_lines = [line.strip() for line in single_system.splitlines()]
single_instruction = "\n".join(single_lines[1:]).strip() or "(none)"

print("=== SINGLE_ITEM: SYSTEM PROMPT ===")
print(single_system)
print("\n=== SINGLE_ITEM: PROMPT ===")
print(single_user)
print("\n=== SINGLE_ITEM: AUTOMATIC INSTRUCTION ===")
print(single_instruction)

=== SINGLE_ITEM: SYSTEM PROMPT ===
You are a careful survey assistant.
You only respond with the most probable answer option in the following JSON format:
{
  "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
}

=== SINGLE_ITEM: PROMPT ===
Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese

=== SINGLE_ITEM: AUTOMATIC INSTRUCTION ===
You only respond with the most probable answer option in the following JSON format:
{
"answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
}

3.2 `BATTERY` rendering#

The battery format nests one answer object per question, keyed by the question content, so richer patterns such as reasoning or verbalized distributions are supported without extra work.

battery_system, battery_user = json_prompt.get_prompt_for_questionnaire_type(
    questionnaire_type=QuestionnairePresentation.BATTERY,
)

battery_lines = [line.strip() for line in battery_system.splitlines()]
battery_instruction = "\n".join(battery_lines[1:]).strip() or "(none)"

# Battery answers are keyed by the question content itself.
expected_keys = questionnaire_df[constants.QUESTION_CONTENT].tolist()

print("=== BATTERY: SYSTEM PROMPT ===")
print(battery_system)
print("\n=== BATTERY: PROMPT ===")
print(battery_user)
print("\n=== BATTERY: AUTOMATIC INSTRUCTION ===")
print(battery_instruction)

=== BATTERY: SYSTEM PROMPT ===
You are a careful survey assistant.
You only respond with the most probable answer option in the following JSON format:
{
  "movie night with friends": {
    "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
  },
  "a rainy sunday at home": {
    "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
  }
}

=== BATTERY: PROMPT ===
Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese
Pick one topping for a pizza made for a rainy sunday at home.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese

=== BATTERY: AUTOMATIC INSTRUCTION ===
You only respond with the most probable answer option in the following JSON format:
{
"movie night with friends": {
"answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
},
"a rainy sunday at home": {
"answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
}
}
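The nesting can be pictured as repeating the single-item template once per question. The sketch below uses only the standard library; `battery_template` is a hypothetical helper and the template strings are shortened placeholders, so treat it as an illustration of the shape rather than of how qstn builds it.

```python
import json


def battery_template(question_contents, item_template):
    """Hypothetical helper: one copy of the item template per question."""
    return {content: dict(item_template) for content in question_contents}


questions = ["movie night with friends", "a rainy sunday at home"]

# Single-answer template, as in the JSON single method.
single = battery_template(questions, {"answer": "choose one of: <options>"})

# A richer template nests the same way, which is why reasoning carries over.
reasoning = battery_template(
    questions,
    {"reasoning": "your reasoning", "answer": "choose one of: <options>"},
)

print(json.dumps(single, indent=2))
print(json.dumps(reasoning, indent=2))
```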

4. Block C: Hand-Created JSON Strings with `parse_json_str`#

Three tiny inputs show the core behavior: valid JSON parses directly, malformed but fixable JSON is repaired, and plain text comes back as an empty string.

valid_json = '{"answer": "4: Jalapenos"}'
repairable_json = '{"answer": "4: Jalapenos",}'
invalid_json = "Jalapenos for sure."

examples = {
    "valid_json": valid_json,
    "repairable_json": repairable_json,
    "invalid_json": invalid_json,
}

for name, payload in examples.items():
    parsed = parse_json_str(payload)
    print(f"{name}: {parsed} (type={type(parsed).__name__})")

valid_json: {'answer': '4: Jalapenos'} (type=dict)
repairable_json: {'answer': '4: Jalapenos'} (type=dict)
invalid_json:  (type=str)
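The three outcomes can be imitated with a small standard-library sketch. It is not how parse_json_str is implemented; `lenient_parse` is a hypothetical function whose only repair is removing trailing commas, and that regex would also touch a comma before a brace inside a string value.

```python
import json
import re


def lenient_parse(text):
    """Hypothetical stand-in: parsed object, or "" when nothing is recoverable."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Illustrative repair: drop a comma that sits directly before } or ].
    repaired = re.sub(r",\s*([}\]])", r"\1", text)
    try:
        return json.loads(repaired)
    except json.JSONDecodeError:
        return ""


for payload in ('{"answer": "4: Jalapenos"}',
                '{"answer": "4: Jalapenos",}',
                "Jalapenos for sure."):
    parsed = lenient_parse(payload)
    print(repr(parsed), type(parsed).__name__)
```

Returning an empty string instead of raising keeps the caller's loop simple, at the cost of having to check the result type before using it.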

5. Block D: Synthetic Single-Item Survey Result with `parse_json`#

One valid row and one invalid row make the parser success/failure paths explicit.

q0 = json_prompt.get_question(0)
q1 = json_prompt.get_question(1)

single_item_result = InferenceResult(
    questionnaire=json_prompt,
    results={
        101: QuestionLLMResponseTuple(
            question=json_prompt.generate_question_prompt(q0),
            llm_response='{"answer": "4: Jalapenos"}',
            logprobs=None,
            reasoning=None,
        ),
        102: QuestionLLMResponseTuple(
            question=json_prompt.generate_question_prompt(q1),
            llm_response="Not valid JSON",
            logprobs=None,
            reasoning=None,
        ),
    },
)

single_parsed_df = create_one_dataframe(parse_json([single_item_result]))

display(single_parsed_df)
if "error_col" in single_parsed_df.columns:
    print(
        "Rows with parsing errors:", int((single_parsed_df["error_col"] == "ERROR: Parsing").sum())
    )

	questionnaire_name	questionnaire_item_id	question	answer	llm_response	error_col
0	FoundationsResponseContract	101	Pick one topping for a pizza made for movie ni...	4: Jalapenos	NaN	NaN
1	FoundationsResponseContract	102	Pick one topping for a pizza made for a rainy ...	NaN	Not valid JSON	ERROR: Parsing

Rows with parsing errors: 1
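The success and failure paths in the table above can be sketched row by row with the standard library. `parse_rows` is a hypothetical function, and the column names and the `ERROR: Parsing` marker are copied from the displayed dataframe; the real parser may differ in detail.

```python
import json

ERROR_PARSING = "ERROR: Parsing"  # marker string taken from the output above


def parse_rows(responses):
    """Hypothetical sketch: one row per item, with an error column on failure."""
    rows = []
    for item_id, raw in responses.items():
        row = {"questionnaire_item_id": item_id}
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, dict):
            # Success: each JSON field becomes a column.
            row.update(parsed)
        else:
            # Failure: keep the raw text so it can be inspected later.
            row["llm_response"] = raw
            row["error_col"] = ERROR_PARSING
        rows.append(row)
    return rows


rows = parse_rows({101: '{"answer": "4: Jalapenos"}', 102: "Not valid JSON"})
n_errors = sum(row.get("error_col") == ERROR_PARSING for row in rows)
print(rows)
print("Rows with parsing errors:", n_errors)
```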

6. Block E: Synthetic Battery Result with `parse_json_battery`#

Here we run the normal battery path, with each top-level key matching a question's content, and inspect the reshaped dataframe.

battery_q0 = json_prompt.get_question(0)
battery_q1 = json_prompt.get_question(1)

valid_battery_payload = {
    str(battery_q0.question_content): {"answer": "4: Jalapenos"},
    str(battery_q1.question_content): {"answer": "2: Mushrooms"},
}

battery_result = InferenceResult(
    questionnaire=json_prompt,
    results={
        -1: QuestionLLMResponseTuple(
            question="All questions answered in one battery prompt",
            llm_response=json.dumps(valid_battery_payload),
            logprobs=None,
            reasoning=None,
        )
    },
)

battery_parsed_df = create_one_dataframe(parse_json_battery([battery_result]))

display(battery_parsed_df)

	questionnaire_name	questionnaire_item_id	question	answer
0	FoundationsResponseContract	101	Pick one topping for a pizza made for movie ni...	4: Jalapenos
1	FoundationsResponseContract	102	Pick one topping for a pizza made for a rainy ...	2: Mushrooms
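The reshaping step amounts to turning one nested JSON object into one row per question. The following sketch shows that idea with the standard library only; `reshape_battery` and the `id_by_content` lookup are hypothetical names, not part of qstn.

```python
import json


def reshape_battery(raw_response, id_by_content):
    """Hypothetical sketch: one nested battery payload -> one row per question."""
    payload = json.loads(raw_response)
    rows = []
    for content, fields in payload.items():
        # The question content is the key, so it identifies the item.
        rows.append({"questionnaire_item_id": id_by_content[content], **fields})
    return rows


id_by_content = {"movie night with friends": 101, "a rainy sunday at home": 102}
raw = json.dumps(
    {
        "movie night with friends": {"answer": "4: Jalapenos"},
        "a rainy sunday at home": {"answer": "2: Mushrooms"},
    }
)
battery_rows = reshape_battery(raw, id_by_content)
print(battery_rows)
```

Because the lookup goes through the question content, a payload key that does not match a question exactly would raise a `KeyError` in this sketch.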

7. Block F: Hand-Created Logprobs with `parse_logprobs`#

By default, parse_logprobs preserves every returned token and normalizes probabilities across that complete set. Set filter_to_answer_options=True when you only want the choices attached to each question.

choice_aliases defines a complete custom mapping. Each dictionary key becomes one output column, and every listed spelling contributes to that column. For example, {"Yes": ["Y", "Yes", "Yeah"]} adds the returned probabilities for all three spellings into the single Yes column. Alias mappings override both default token parsing and answer-option filtering.

logprob_options = generate_likert_options(
    n=5,
    answer_texts=pizza_toppings,
    response_generation_method=LogprobResponseGenerationMethod(output_index_only=True),
)
logprob_prompt = base_prompt.duplicate()
logprob_prompt.prepare_prompt(question_stem=question_stem, answer_options=logprob_options)

logprob_result = InferenceResult(
    questionnaire=logprob_prompt,
    results={
        101: QuestionLLMResponseTuple(
            question=logprob_prompt.generate_question_prompt(logprob_prompt.get_question(0)),
            llm_response="4",
            logprobs={
                "1": -1.80,
                "1.": -2.20,
                "Pepperoni": -3.00,
                "2": -1.40,
                "3": -1.10,
                "4": -0.20,
                "5": -0.75,
                "x": -4.00,
            },
            reasoning=None,
        )
    },
)

parsed_all_tokens = create_one_dataframe(parse_logprobs([logprob_result]))
parsed_answer_options = create_one_dataframe(
    parse_logprobs([logprob_result], filter_to_answer_options=True)
)

choice_aliases = {
    "Pepperoni": ["1", "1.", "Pepperoni"],
    "Mushrooms": ["2"],
    "Pineapple": ["3"],
    "Jalapenos": ["4"],
    "Extra cheese": ["5"],
}
parsed_with_aliases = create_one_dataframe(
    parse_logprobs([logprob_result], choice_aliases=choice_aliases)
)

print("=== all returned tokens (default) ===")
display(parsed_all_tokens)

print("=== filtered to questionnaire answer options ===")
display(parsed_answer_options)

print("=== aliases aggregate the same returned tokens ===")
display(parsed_with_aliases)

=== all returned tokens (default) ===

	questionnaire_name	questionnaire_item_id	question	1	1.	Pepperoni	2	3	4	5	x
0	FoundationsResponseContract	101	Pick one topping for a pizza made for movie ni...	0.074635	0.050029	0.02248	0.111342	0.150296	0.369669	0.21328	0.00827

=== filtered to questionnaire answer options ===

	questionnaire_name	questionnaire_item_id	question	1	2	3	4	5
0	FoundationsResponseContract	101	Pick one topping for a pizza made for movie ni...	0.081193	0.121126	0.163504	0.402154	0.232023

=== aliases aggregate the same returned tokens ===

	questionnaire_name	questionnaire_item_id	question	Pepperoni	Mushrooms	Pineapple	Jalapenos	Extra cheese
0	FoundationsResponseContract	101	Pick one topping for a pizza made for movie ni...	0.148371	0.11227	0.151549	0.372751	0.215059
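The three tables above follow from the same arithmetic: exponentiate each logprob, optionally drop or merge tokens, then divide by the remaining total. The sketch below reproduces that calculation with the standard library; `normalize` and `aggregate` are hypothetical helpers and say nothing about how parse_logprobs is written internally.

```python
import math

logprobs = {"1": -1.80, "1.": -2.20, "Pepperoni": -3.00, "2": -1.40,
            "3": -1.10, "4": -0.20, "5": -0.75, "x": -4.00}


def normalize(logprobs, keep=None):
    """Exponentiate, optionally filter to `keep`, then rescale to sum to 1."""
    probs = {tok: math.exp(lp) for tok, lp in logprobs.items()
             if keep is None or tok in keep}
    total = sum(probs.values())
    return {tok: p / total for tok, p in probs.items()}


def aggregate(logprobs, aliases):
    """Sum the probabilities of all spellings per label, then rescale."""
    sums = {label: sum(math.exp(logprobs[tok]) for tok in toks if tok in logprobs)
            for label, toks in aliases.items()}
    total = sum(sums.values())
    return {label: s / total for label, s in sums.items()}


all_tokens = normalize(logprobs)
answer_only = normalize(logprobs, keep={"1", "2", "3", "4", "5"})
merged = aggregate(logprobs, {"Pepperoni": ["1", "1.", "Pepperoni"],
                              "Mushrooms": ["2"], "Pineapple": ["3"],
                              "Jalapenos": ["4"], "Extra cheese": ["5"]})

for name, table in [("all", all_tokens), ("filtered", answer_only), ("aliases", merged)]:
    print(name, {k: round(v, 6) for k, v in table.items()})
```

Note that the token `x` is excluded from the alias totals because no alias lists it, which is why the Jalapenos share differs slightly between the first and third tables.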

8. Block G: Core Failure Mode for `parse_logprobs`#

A missing logprob payload should be flagged explicitly.

missing_logprobs_result = InferenceResult(
    questionnaire=logprob_prompt,
    results={
        101: QuestionLLMResponseTuple(
            question=logprob_prompt.generate_question_prompt(logprob_prompt.get_question(0)),
            llm_response="No token probabilities returned",
            logprobs=None,
            reasoning=None,
        )
    },
)

with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    missing_logprobs_df = create_one_dataframe(
        parse_logprobs([missing_logprobs_result])
    )

display(missing_logprobs_df)
print("error_col values:", missing_logprobs_df["error_col"].dropna().tolist())

	questionnaire_name	questionnaire_item_id	question	error_col
0	FoundationsResponseContract	101	Pick one topping for a pizza made for movie ni...	MISSING_LOGPROBS

error_col values: ['MISSING_LOGPROBS']
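The pattern of warning once and then marking the row can be sketched as follows. This is an assumption-laden illustration: `flag_missing_logprobs` is a hypothetical function, the `MISSING_LOGPROBS` marker is copied from the output above, and the use of a `UserWarning` is inferred only from the warning filter in the cell above.

```python
import warnings

MISSING = "MISSING_LOGPROBS"  # marker string taken from the output above


def flag_missing_logprobs(item_id, logprobs):
    """Hypothetical sketch: mark the row instead of failing when logprobs are absent."""
    row = {"questionnaire_item_id": item_id}
    if logprobs is None:
        warnings.warn(f"No logprobs returned for item {item_id}", UserWarning)
        row["error_col"] = MISSING
    return row


# Silence the warning locally, as the tutorial cell does.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    row = flag_missing_logprobs(101, None)

print(row)
```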

9. Quick Recap#

Choice, Logprob, and JSON answer methods derive answer options from AnswerOptions and constrain generation to them by default.
constrain_answer_options=False removes the answer-option restriction, and constrain_output=False sends no output constraint at all; in both cases the method can still modify the prompt and collect log probabilities.
Responses from JSON methods can be parsed with parse_json / parse_json_battery.
parse_logprobs turns logprobs into normalized probabilities. Use filter_to_answer_options to keep only the answer-option tokens instead of all returned tokens, and choice_aliases to aggregate several tokens into one column.

When you are ready for real model runs, continue with the Guides (for example Tutorial 1).
