Response Generation & Parsing Tutorial (Without Inference)

This Foundations tutorial shows how response-generation methods and parser utilities fit together. We focus on prompt contracts and parser behavior only, so there are no vLLM/OpenAI calls and no inference runs.

By the end, you will be able to inspect output instructions quickly and predict how parser helpers behave on clean and messy responses.

We will build one reusable pizza-themed setup, compare methods side by side, and then parse hand-crafted outputs. All examples are deterministic so you can inspect behavior directly.

import html
import json

import pandas as pd
from IPython.display import HTML, display

from qstn.inference import (
    ChoiceResponseGenerationMethod,
    JSONReasoningResponseGenerationMethod,
    JSONSingleResponseGenerationMethod,
    LogprobResponseGenerationMethod,
)
from qstn.parser import parse_json, parse_json_battery, parse_json_str, parse_logprobs
from qstn.prompt_builder import LLMPrompt, generate_likert_options
from qstn.utilities import constants, placeholder
from qstn.utilities.constants import QuestionnairePresentation
from qstn.utilities.survey_objects import InferenceResult, QuestionLLMResponseTuple
from qstn.utilities.utils import create_one_dataframe

1. Minimal Setup

We start with two pizza moments and one shared topping option set. This gives us a clean baseline that we can reuse in every block.

questionnaire_df = pd.DataFrame(
    {
        constants.QUESTIONNAIRE_ITEM_ID: [101, 102],
        constants.QUESTION_CONTENT: [
            "movie night with friends",
            "a rainy sunday at home",
        ],
    }
)

base_prompt = LLMPrompt(
    questionnaire_source=questionnaire_df,
    questionnaire_name="FoundationsResponseContract",
    system_prompt=(
        "You are a careful survey assistant.\n"
        f"{placeholder.PROMPT_AUTOMATIC_OUTPUT_INSTRUCTIONS}"
    ),
    prompt=(
        "Choose exactly one pizza topping for the following situation.\n"
        f"{placeholder.PROMPT_QUESTIONS}"
    ),
)

pizza_toppings = [
    "Pepperoni",
    "Mushrooms",
    "Pineapple",
    "Jalapenos",
    "Extra cheese",
]

shared_options = generate_likert_options(n=5, answer_texts=pizza_toppings)

question_stem = (
    f"Pick one topping for a pizza made for {placeholder.QUESTION_CONTENT}.\n"
    f"{placeholder.PROMPT_OPTIONS}"
)

base_prompt.prepare_prompt(question_stem=question_stem, answer_options=shared_options)

display(questionnaire_df)
   questionnaire_item_id          question_content
0                    101  movie night with friends
1                    102    a rainy sunday at home

generate_likert_options creates index-prefixed labels by default (for example 4: Jalapenos). The parser examples below intentionally use that exact format so the prompt contract and parsed payloads stay aligned.
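
To make the label format concrete, the sketch below rebuilds the option line by hand in plain Python. It does not call generate_likert_options; format_options is a hypothetical helper that only mirrors the 1-based "index: text" pattern visible in the rendered prompts of this tutorial.

```python
# Illustrative only: mimics the "index: text" label pattern shown in this tutorial.
# format_options is a hypothetical helper, not part of qstn.
def format_options(answer_texts: list[str], start: int = 1) -> list[str]:
    """Prefix each answer text with its 1-based index."""
    return [f"{i}: {text}" for i, text in enumerate(answer_texts, start=start)]


pizza_toppings = ["Pepperoni", "Mushrooms", "Pineapple", "Jalapenos", "Extra cheese"]
labels = format_options(pizza_toppings)
option_line = "Options are: " + ", ".join(labels)

print(labels[3])
print(option_line)
```

Because the parser examples compare against these exact strings, a hand-written payload such as "Jalapenos" (without the "4: " prefix) would not match the label in the prompt contract.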

2. Block A: Automatic Output Instructions by Method

Each variant below uses the same pipeline. The only thing that changes is response_generation_method=... inside generate_likert_options(...).

For every method, we show the full SYSTEM PROMPT and PROMPT. The automatically injected instruction is highlighted in bold inside the system prompt.

You can customize the automatically injected instruction by changing the output_template of the response-generation method.

def show_variant_prompts(method_name: str, variant_options):
    variant = base_prompt.duplicate()
    variant.prepare_prompt(question_stem=question_stem, answer_options=variant_options)
    system_text, user_text = variant.get_prompt_for_questionnaire_type(
        questionnaire_type=QuestionnairePresentation.SINGLE_ITEM,
        item_position=0,
    )

    # Everything after the first line is the injected instruction. Keep it verbatim
    # (indentation included) so the replace() below can find it in the escaped prompt.
    automatic_instruction = system_text.partition("\n")[2].strip()

    escaped_system = html.escape(system_text)
    if automatic_instruction:
        escaped_instruction = html.escape(automatic_instruction)
        highlighted_system = escaped_system.replace(
            escaped_instruction, f"<strong>{escaped_instruction}</strong>", 1
        )
        instruction_summary = f"<strong>{escaped_instruction}</strong>"
    else:
        highlighted_system = escaped_system
        instruction_summary = "<em>(none)</em>"

    print(f"=== {method_name} ===")
    display(HTML(f"<strong>SYSTEM PROMPT</strong><pre>{highlighted_system}</pre>"))
    display(HTML(f"<strong>PROMPT</strong><pre>{html.escape(user_text)}</pre>"))
    display(HTML(f"<strong>Automatic instruction</strong>: {instruction_summary}"))
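
Highlighting by substring replacement only works when the highlighted text occurs verbatim in the escaped prompt. The standalone sketch below illustrates this; the highlight helper and the sample prompt are made up for this example. The tail of the prompt matches when taken verbatim, while a copy whose lines were stripped individually no longer matches an indented JSON block.

```python
import html


def highlight(full_text: str, fragment: str) -> str:
    """Escape full_text and wrap the first occurrence of fragment in <strong>."""
    escaped_full = html.escape(full_text)
    escaped_fragment = html.escape(fragment)
    return escaped_full.replace(escaped_fragment, f"<strong>{escaped_fragment}</strong>", 1)


# Hypothetical system prompt with an indented JSON template.
system_text = 'You are a careful survey assistant.\nRespond as JSON:\n{\n  "answer": "<option>"\n}'

# Verbatim tail: everything after the first line, indentation preserved.
verbatim_tail = system_text.partition("\n")[2]
# Per-line stripped tail: indentation removed, so it is no longer a substring.
stripped_tail = "\n".join(line.strip() for line in system_text.splitlines()[1:])

print("<strong>" in highlight(system_text, verbatim_tail))  # True
print("<strong>" in highlight(system_text, stripped_tail))  # False
```

str.replace leaves the text unchanged when the fragment is not found, so a mismatch shows up only as missing bold text, not as an error.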

2.1 No response-generation method

This is the unconstrained baseline: no output instruction is injected into the system prompt.

options = generate_likert_options(
    n=5,
    answer_texts=pizza_toppings,
    response_generation_method=None,
)
show_variant_prompts("No Response Generation Method", options)
=== No Response Generation Method ===
SYSTEM PROMPT
You are a careful survey assistant.
PROMPT
Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese
Automatic instruction: (none)

2.2 ChoiceResponseGenerationMethod

This variant asks for one choice label without JSON wrapping.

options = generate_likert_options(
    n=5,
    answer_texts=pizza_toppings,
    response_generation_method=ChoiceResponseGenerationMethod(
        allowed_choices_template="{options}"
    ),
)
show_variant_prompts("ChoiceResponseGenerationMethod", options)
=== ChoiceResponseGenerationMethod ===
SYSTEM PROMPT
You are a careful survey assistant.
You only respond with the most probable answer option.
PROMPT
Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese
Automatic instruction: You only respond with the most probable answer option.

2.3 JSONSingleResponseGenerationMethod

This version enforces a JSON object with one answer field.

options = generate_likert_options(
    n=5,
    answer_texts=pizza_toppings,
    response_generation_method=JSONSingleResponseGenerationMethod(),
)
show_variant_prompts("JSONSingleResponseGenerationMethod", options)
=== JSONSingleResponseGenerationMethod ===
SYSTEM PROMPT
You are a careful survey assistant.
You only respond with the most probable answer option in the following JSON format:
{
  "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
}
PROMPT
Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese
Automatic instruction: You only respond with the most probable answer option in the following JSON format: { "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese" }
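
The JSON contract above can also be checked by hand before any parser helper is involved. The following is a minimal sketch using only the standard library; check_single_answer is a hypothetical helper, not a qstn function, and the allowed labels are copied from the prompt shown above.

```python
import json
from typing import Optional

ALLOWED = ["1: Pepperoni", "2: Mushrooms", "3: Pineapple", "4: Jalapenos", "5: Extra cheese"]


def check_single_answer(raw: str, allowed: list[str]) -> Optional[str]:
    """Return the answer if raw is a JSON object whose "answer" is an allowed label."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not JSON at all
    if not isinstance(payload, dict):
        return None  # valid JSON, but not an object
    answer = payload.get("answer")
    return answer if answer in allowed else None


print(check_single_answer('{"answer": "4: Jalapenos"}', ALLOWED))
print(check_single_answer('{"answer": "Anchovies"}', ALLOWED))
print(check_single_answer("Jalapenos for sure.", ALLOWED))
```

A check like this separates two failure kinds that are easy to conflate: output that is not JSON, and JSON whose answer falls outside the option set.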

2.4 JSONReasoningResponseGenerationMethod

This version keeps both reasoning and final answer in structured JSON.

options = generate_likert_options(
    n=5,
    answer_texts=pizza_toppings,
    response_generation_method=JSONReasoningResponseGenerationMethod(),
)
show_variant_prompts("JSONReasoningResponseGenerationMethod", options)
=== JSONReasoningResponseGenerationMethod ===
SYSTEM PROMPT
You are a careful survey assistant.
You always reason about the possible answer options first.
You respond with your reasoning and the most probable answer option in the following JSON format:
{
  "reasoning": "your reasoning about the answer options",
  "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
}
PROMPT
Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese
Automatic instruction: You always reason about the possible answer options first. You respond with your reasoning and the most probable answer option in the following JSON format: { "reasoning": "your reasoning about the answer options", "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese" }

2.5 LogprobResponseGenerationMethod

This variant is built for token-probability post-processing. In this setup its injected instruction is identical to the one from ChoiceResponseGenerationMethod.

options = generate_likert_options(
    n=5,
    answer_texts=pizza_toppings,
    response_generation_method=LogprobResponseGenerationMethod(
        allowed_choices_template="{options}"
    ),
)
show_variant_prompts("LogprobResponseGenerationMethod", options)
=== LogprobResponseGenerationMethod ===
SYSTEM PROMPT
You are a careful survey assistant.
You only respond with the most probable answer option.
PROMPT
Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese
Automatic instruction: You only respond with the most probable answer option.

3. Block B: Single-Item vs Battery Rendering

Now we keep one JSON method and compare the two rendering modes.

json_prompt = base_prompt.duplicate()
json_options = generate_likert_options(
    n=5,
    answer_texts=pizza_toppings,
    response_generation_method=JSONSingleResponseGenerationMethod(),
)
json_prompt.prepare_prompt(question_stem=question_stem, answer_options=json_options)
<qstn.prompt_builder.LLMPrompt at 0x7d440cb08080>

3.1 SINGLE_ITEM rendering

single_system, single_user = json_prompt.get_prompt_for_questionnaire_type(
    questionnaire_type=QuestionnairePresentation.SINGLE_ITEM,
    item_position=0,
)

single_lines = [line.strip() for line in single_system.splitlines()]
single_instruction = "\n".join(single_lines[1:]).strip() or "(none)"

print("=== SINGLE_ITEM: SYSTEM PROMPT ===")
print(single_system)
print("\n=== SINGLE_ITEM: PROMPT ===")
print(single_user)
print("\n=== SINGLE_ITEM: AUTOMATIC INSTRUCTION ===")
print(single_instruction)
=== SINGLE_ITEM: SYSTEM PROMPT ===
You are a careful survey assistant.
You only respond with the most probable answer option in the following JSON format:
{
  "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
}

=== SINGLE_ITEM: PROMPT ===
Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese

=== SINGLE_ITEM: AUTOMATIC INSTRUCTION ===
You only respond with the most probable answer option in the following JSON format:
{
"answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
}

3.2 BATTERY rendering

The battery format nests one answer object per question, keyed by the question content, so richer patterns such as reasoning or verbalized distributions are supported automatically.

battery_system, battery_user = json_prompt.get_prompt_for_questionnaire_type(
    questionnaire_type=QuestionnairePresentation.BATTERY,
)

battery_lines = [line.strip() for line in battery_system.splitlines()]
battery_instruction = "\n".join(battery_lines[1:]).strip() or "(none)"

# Top-level keys the battery JSON should contain: one per question, equal to its content.
expected_keys = questionnaire_df[constants.QUESTION_CONTENT].tolist()

print("=== BATTERY: SYSTEM PROMPT ===")
print(battery_system)
print("\n=== BATTERY: PROMPT ===")
print(battery_user)
print("\n=== BATTERY: AUTOMATIC INSTRUCTION ===")
print(battery_instruction)
=== BATTERY: SYSTEM PROMPT ===
You are a careful survey assistant.
You only respond with the most probable answer option in the following JSON format:
{
  "movie night with friends": {
    "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
  },
  "a rainy sunday at home": {
    "answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
  }
}

=== BATTERY: PROMPT ===
Choose exactly one pizza topping for the following situation.
Pick one topping for a pizza made for movie night with friends.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese
Pick one topping for a pizza made for a rainy sunday at home.
Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese

=== BATTERY: AUTOMATIC INSTRUCTION ===
You only respond with the most probable answer option in the following JSON format:
{
"movie night with friends": {
"answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
},
"a rainy sunday at home": {
"answer": "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, 4: Jalapenos, 5: Extra cheese"
}
}
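
The nested layout in the battery system prompt can be reproduced with a plain dictionary, which makes the key structure easy to inspect. This is a sketch of the shape only, not of how LLMPrompt builds it internally; the variable names are illustrative.

```python
import json

option_hint = (
    "choose one of: Options are: 1: Pepperoni, 2: Mushrooms, 3: Pineapple, "
    "4: Jalapenos, 5: Extra cheese"
)
question_contents = ["movie night with friends", "a rainy sunday at home"]

# One nested answer object per question, keyed by the question content.
battery_template = {content: {"answer": option_hint} for content in question_contents}

rendered = json.dumps(battery_template, indent=2)
print(rendered)
```

With indent=2 the printed text has the same shape as the template in the BATTERY system prompt: the question content is the top-level key and "answer" sits one level below it.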

4. Block C: Hand-Created JSON Strings with parse_json_str

Three tiny inputs show the core behavior: valid JSON, repairable malformed JSON (a trailing comma), and plain text, which comes back as an empty string.

valid_json = '{"answer": "4: Jalapenos"}'
repairable_json = '{"answer": "4: Jalapenos",}'
invalid_json = "Jalapenos for sure."

examples = {
    "valid_json": valid_json,
    "repairable_json": repairable_json,
    "invalid_json": invalid_json,
}

for name, payload in examples.items():
    parsed = parse_json_str(payload)
    print(f"{name}: {parsed} (type={type(parsed).__name__})")
valid_json: {'answer': '4: Jalapenos'} (type=dict)
repairable_json: {'answer': '4: Jalapenos'} (type=dict)
invalid_json:  (type=str)
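
The three outcomes above can be imitated with a small lenient parser. The sketch below is one simple way to get the same results for these inputs; it is not the implementation of parse_json_str, whose repair strategy is not shown in this tutorial, and lenient_json_parse is a hypothetical name.

```python
import json
import re


def lenient_json_parse(text: str):
    """Parse text as JSON, retry once without trailing commas, return "" on failure."""
    # Second candidate drops a comma that directly precedes a closing brace or bracket.
    # Caveat: this naive regex would also touch such a comma inside a string value.
    candidates = [text, re.sub(r",\s*([}\]])", r"\1", text)]
    for candidate in candidates:
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    return ""


for name, payload in {
    "valid_json": '{"answer": "4: Jalapenos"}',
    "repairable_json": '{"answer": "4: Jalapenos",}',
    "invalid_json": "Jalapenos for sure.",
}.items():
    parsed = lenient_json_parse(payload)
    print(f"{name}: {parsed} (type={type(parsed).__name__})")
```

Returning an empty string instead of raising keeps batch processing simple, but callers then need to check the result type before indexing into it.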

5. Block D: Synthetic Single-Item Survey Result with parse_json

One valid row and one invalid row make the parser success/failure paths explicit.

q0 = json_prompt.get_question(0)
q1 = json_prompt.get_question(1)

single_item_result = InferenceResult(
    questionnaire=json_prompt,
    results={
        101: QuestionLLMResponseTuple(
            question=json_prompt.generate_question_prompt(q0),
            llm_response='{"answer": "4: Jalapenos"}',
            logprobs=None,
            reasoning=None,
        ),
        102: QuestionLLMResponseTuple(
            question=json_prompt.generate_question_prompt(q1),
            llm_response="Not valid JSON",
            logprobs=None,
            reasoning=None,
        ),
    },
)

single_parsed_df = create_one_dataframe(parse_json([single_item_result]))

display(single_parsed_df)
if "error_col" in single_parsed_df.columns:
    print(
        "Rows with parsing errors:", int((single_parsed_df["error_col"] == "ERROR: Parsing").sum())
    )
            questionnaire_name  questionnaire_item_id                                           question        answer    llm_response       error_col
0  FoundationsResponseContract                    101  Pick one topping for a pizza made for movie ni...  4: Jalapenos             NaN             NaN
1  FoundationsResponseContract                    102  Pick one topping for a pizza made for a rainy ...           NaN  Not valid JSON  ERROR: Parsing
Rows with parsing errors: 1
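
The table layout suggests a row-wise pattern: a parsed row carries the answer, while a failed row keeps the raw response next to an error marker. The sketch below mirrors that pattern with plain dictionaries. It is an assumption about the behavior as displayed, not the code of parse_json, and parse_rows is a hypothetical helper.

```python
import json

responses = {
    101: '{"answer": "4: Jalapenos"}',
    102: "Not valid JSON",
}


def parse_rows(raw_by_item: dict[int, str]) -> list[dict]:
    """Build one record per item, flagging unparseable responses instead of raising."""
    rows = []
    for item_id, raw in raw_by_item.items():
        row = {
            "questionnaire_item_id": item_id,
            "answer": None,
            "llm_response": None,
            "error_col": None,
        }
        try:
            payload = json.loads(raw)
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
            row["answer"] = payload.get("answer")
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            row["llm_response"] = raw
            row["error_col"] = "ERROR: Parsing"
        rows.append(row)
    return rows


rows = parse_rows(responses)
error_count = sum(row["error_col"] == "ERROR: Parsing" for row in rows)
print(rows)
print("Rows with parsing errors:", error_count)
```

Keeping the raw response on failed rows makes it possible to inspect or re-parse them later without rerunning inference.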

6. Block E: Synthetic Battery Result with parse_json_battery

Here we run the normal battery path with top-level keys that exactly match each question's content, then inspect the reshaped dataframe.

battery_q0 = json_prompt.get_question(0)
battery_q1 = json_prompt.get_question(1)

valid_battery_payload = {
    str(battery_q0.question_content): {"answer": "4: Jalapenos"},
    str(battery_q1.question_content): {"answer": "2: Mushrooms"},
}

battery_result = InferenceResult(
    questionnaire=json_prompt,
    results={
        -1: QuestionLLMResponseTuple(
            question="All questions answered in one battery prompt",
            llm_response=json.dumps(valid_battery_payload),
            logprobs=None,
            reasoning=None,
        )
    },
)

battery_parsed_df = create_one_dataframe(parse_json_battery([battery_result]))

display(battery_parsed_df)
            questionnaire_name  questionnaire_item_id                                           question        answer
0  FoundationsResponseContract                    101  Pick one topping for a pizza made for movie ni...  4: Jalapenos
1  FoundationsResponseContract                    102  Pick one topping for a pizza made for a rainy ...  2: Mushrooms
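
The reshaping step, one battery response in and one row per item out, can be sketched without the library. explode_battery below is a hypothetical helper that assumes, as in the payload above, that each top-level key equals the question content and holds a nested "answer".

```python
import json

items = {101: "movie night with friends", 102: "a rainy sunday at home"}
battery_response = json.dumps(
    {
        "movie night with friends": {"answer": "4: Jalapenos"},
        "a rainy sunday at home": {"answer": "2: Mushrooms"},
    }
)


def explode_battery(raw: str, items_by_id: dict[int, str]) -> list[dict]:
    """Turn one battery JSON object into one record per questionnaire item."""
    payload = json.loads(raw)
    rows = []
    for item_id, content in items_by_id.items():
        # Keys must match the question content exactly; a missing key yields answer=None.
        entry = payload.get(content)
        answer = entry.get("answer") if isinstance(entry, dict) else None
        rows.append({"questionnaire_item_id": item_id, "answer": answer})
    return rows


rows = explode_battery(battery_response, items)
print(rows)
```

A key that differs even slightly from the question content (for example in capitalization) is treated as missing in this sketch, which is why exact key matching matters in battery mode.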

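7. Block F: Hand-Created Logprobs with parse_logprobs

Now we parse token probabilities using the full five-option set. allowed_choices is passed once as a list of index tokens and once as a dict that maps each topping label to the tokens that count toward it. The extra token "x" is not an allowed choice, and the resulting probabilities sum to 1 across the five options.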

logprob_result = InferenceResult(
    questionnaire=base_prompt,
    results={
        101: QuestionLLMResponseTuple(
            question=base_prompt.generate_question_prompt(base_prompt.get_question(0)),
            llm_response="4",
            logprobs={
                "1": -1.80,
                "2": -1.40,
                "3": -1.10,
                "4": -0.20,
                "5": -0.75,
                "x": -4.00,
            },
            reasoning=None,
        )
    },
)

parsed_from_list = create_one_dataframe(parse_logprobs([logprob_result], ["1", "2", "3", "4", "5"]))

parsed_from_dict = create_one_dataframe(
    parse_logprobs(
        [logprob_result],
        {
            "Pepperoni": ["1", "1."],
            "Mushrooms": ["2", "2."],
            "Pineapple": ["3", "3."],
            "Jalapenos": ["4", "4."],
            "Extra cheese": ["5", "5."],
        },
    )
)

print("=== allowed_choices as full index list ===")
display(parsed_from_list)

print("=== allowed_choices as topping label map ===")
display(parsed_from_dict)
=== allowed_choices as full index list ===
            questionnaire_name  questionnaire_item_id                                           question         1         2         3         4         5
0  FoundationsResponseContract                    101  Pick one topping for a pizza made for movie ni...  0.081193  0.121126  0.163504  0.402154  0.232023
=== allowed_choices as topping label map ===
            questionnaire_name  questionnaire_item_id                                           question  Pepperoni  Mushrooms  Pineapple  Jalapenos  Extra cheese
0  FoundationsResponseContract                    101  Pick one topping for a pizza made for movie ni...   0.081193   0.121126   0.163504   0.402154      0.232023
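
The values in both tables are consistent with exponentiating each allowed token's log-probability and renormalizing over the allowed set only, so the "x" token drops out. This is inferred from the numbers shown, not from the library source. The sketch below reproduces the arithmetic in plain Python; normalize is a hypothetical helper.

```python
import math

logprobs = {"1": -1.80, "2": -1.40, "3": -1.10, "4": -0.20, "5": -0.75, "x": -4.00}
label_map = {
    "Pepperoni": ["1", "1."],
    "Mushrooms": ["2", "2."],
    "Pineapple": ["3", "3."],
    "Jalapenos": ["4", "4."],
    "Extra cheese": ["5", "5."],
}


def normalize(token_logprobs: dict[str, float], choices: dict[str, list[str]]) -> dict[str, float]:
    """Sum exp(logprob) over each label's tokens, then renormalize over the allowed labels."""
    mass = {
        label: sum(math.exp(token_logprobs[t]) for t in tokens if t in token_logprobs)
        for label, tokens in choices.items()
    }
    total = sum(mass.values())
    return {label: value / total for label, value in mass.items()}


probs = normalize(logprobs, label_map)
for label, p in probs.items():
    print(f"{label}: {p:.6f}")
```

For example, Jalapenos gets exp(-0.20) / (exp(-1.80) + exp(-1.40) + exp(-1.10) + exp(-0.20) + exp(-0.75)), which is about 0.8187 / 2.0359 ≈ 0.4022. Token variants such as "4." contribute nothing here because they do not appear in the hand-crafted logprobs.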

8. Block G: Core Failure Mode for parse_logprobs

A missing logprob payload should be flagged explicitly.

missing_logprobs_result = InferenceResult(
    questionnaire=base_prompt,
    results={
        101: QuestionLLMResponseTuple(
            question=base_prompt.generate_question_prompt(base_prompt.get_question(0)),
            llm_response="No token probabilities returned",
            logprobs=None,
            reasoning=None,
        )
    },
)

missing_logprobs_df = create_one_dataframe(
    parse_logprobs([missing_logprobs_result], ["1", "2", "3", "4", "5"])
)

display(missing_logprobs_df)
print("error_col values:", missing_logprobs_df["error_col"].dropna().tolist())
/tmp/ipykernel_3845664/1601005650.py:14: UserWarning: No logprobs found in InterviewResult. Make sure to use Logprob_AnswerProductionMethod to generate logprobs.
  parse_logprobs([missing_logprobs_result], ["1", "2", "3", "4", "5"])
            questionnaire_name  questionnaire_item_id                                           question    1    2    3    4    5         error_col
0  FoundationsResponseContract                    101  Pick one topping for a pizza made for movie ni...  NaN  NaN  NaN  NaN  NaN  MISSING_LOGPROBS
error_col values: ['MISSING_LOGPROBS']
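
The failure path shown above (a warning, NaN placeholders, and an error marker) can be sketched as a guard in front of the normalization step. probabilities_or_flag is a hypothetical helper. Treating "no allowed token present" the same way as "no logprobs at all" is a choice made for this sketch, not documented library behavior.

```python
import math
import warnings


def probabilities_or_flag(logprobs, choices):
    """Return per-choice probabilities, or NaN placeholders plus an error marker."""
    # Collect probability mass only for allowed choices that actually appear.
    mass = {c: math.exp(logprobs[c]) for c in choices if c in logprobs} if logprobs else {}
    if not mass:
        # Sketch-specific choice: flag the row instead of raising an exception.
        warnings.warn("No usable logprobs found; returning NaN placeholders.")
        row = {c: math.nan for c in choices}
        row["error_col"] = "MISSING_LOGPROBS"
        return row
    total = sum(mass.values())
    return {c: mass.get(c, 0.0) / total for c in choices}


row = probabilities_or_flag(None, ["1", "2", "3", "4", "5"])
print(row)
```

Flagging instead of raising keeps one bad response from aborting a whole batch; the marker column can then be used to filter such rows before analysis.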

9. Quick Recap

  • ChoiceResponseGenerationMethod is a lightweight path when one label is enough.

  • JSONSingleResponseGenerationMethod and JSONReasoningResponseGenerationMethod are the main structured-output options.

  • LogprobResponseGenerationMethod is for token-level probability aggregation.

  • JSON methods pair with parse_json / parse_json_battery; logprob methods pair with parse_logprobs.

  • In battery mode, exact key matching matters: each top-level key must equal the question content, with the answer nested under it.

When you are ready for real model runs, continue with the Guides (for example Tutorial 1).