Utilities


Constants#

class qstn.utilities.constants.QuestionnairePresentation(*values)[source]#

Bases: Enum

BATTERY: str = 'questionnaire_type_battery'#
SEQUENTIAL: str = 'questionnaire_type_sequential'#
SINGLE_ITEM: str = 'questionnaire_type_single_item'#

Placeholder#

Prompt Perturbation#

qstn.utilities.prompt_perturbations.apply_safe_perturbation(prompts, perturbation_func, **kwargs)[source]#

Splits each prompt in the list at curly-brace placeholders (e.g., {PROMPT_OPTIONS}) and applies perturbation_func ONLY to the text segments between them, so the placeholder keys are left unchanged.

Parameters:
  • prompts (List[str]) – The input prompts containing placeholders.

  • perturbation_func (function) – The function to apply to non-placeholder text.

  • **kwargs – Additional keyword arguments to pass to the perturbation function (e.g., probability).

Returns:

The prompts with perturbations applied safely.

Return type:

List[str]
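The splitting idea described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name perturb_outside_placeholders and the regular expression are assumptions, and the library may treat nested or doubled braces differently.

```python
import re
from typing import Callable, List

# Matches single-brace placeholders such as {PROMPT_OPTIONS}. The capturing
# group makes re.split keep the placeholders in its result.
PLACEHOLDER = re.compile(r"(\{[^{}]*\})")


def perturb_outside_placeholders(
    prompts: List[str], perturbation_func: Callable[..., str], **kwargs
) -> List[str]:
    """Hypothetical stand-in: perturb only the text between placeholders."""
    result = []
    for prompt in prompts:
        parts = PLACEHOLDER.split(prompt)
        # With one capturing group, odd indices hold the placeholders.
        pieces = [
            part if i % 2 == 1 else perturbation_func(part, **kwargs)
            for i, part in enumerate(parts)
        ]
        result.append("".join(pieces))
    return result


# str.upper serves as a deterministic perturbation for the demonstration.
perturbed = perturb_outside_placeholders(
    ["Choose one: {PROMPT_OPTIONS} please"], str.upper
)
print(perturbed)
```

Any of the perturbation functions documented below could be passed in place of str.upper, with probability forwarded through the keyword arguments.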

qstn.utilities.prompt_perturbations.key_typos(text, probability=0.1)[source]#

Randomly replaces characters with random alphabet letters to simulate typos.

Parameters:
  • text (str) – The input text to perturb.

  • probability (float) – The probability of replacing each character.

Returns:

The text with random character replacements based on the given probability.

Return type:

str
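A minimal sketch of per-character random replacement, for illustration only. The function name replace_random_chars, the rng argument, and the choice to replace only alphabetic characters with lowercase letters are assumptions; the library's behavior may differ.

```python
import random
import string


def replace_random_chars(text, probability=0.1, rng=None):
    """Hypothetical stand-in: replace letters at random with a-z."""
    rng = rng or random.Random()
    return "".join(
        # Assumption: only alphabetic characters are candidates, and the
        # replacement may coincide with the original letter by chance.
        rng.choice(string.ascii_lowercase)
        if ch.isalpha() and rng.random() < probability
        else ch
        for ch in text
    )


print(replace_random_chars("How satisfied are you?", probability=0.2))
```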

qstn.utilities.prompt_perturbations.keyboard_typos(text, probability=0.1)[source]#

Introduces typos based on keyboard proximity.

Parameters:
  • text (str) – The input text to perturb.

  • probability (float) – The probability of introducing a typo for each character.

Returns:

The text with keyboard-based typos introduced based on the given probability.

Return type:

str
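Keyboard-proximity typos can be sketched with a neighbor map. The map below is a hypothetical, partial QWERTY excerpt and neighbor_typos is an illustrative name; the key layout and casing rules used by the library are not documented here.

```python
import random

# Hypothetical, partial QWERTY neighbor map for illustration only.
NEIGHBORS = {
    "a": "qwsz",
    "s": "awedxz",
    "d": "serfcx",
    "e": "wsdr",
    "o": "iklp",
    "t": "rfgy",
}


def neighbor_typos(text, probability=0.1, rng=None):
    """Hypothetical stand-in: swap a character for an adjacent key."""
    rng = rng or random.Random()
    out = []
    for ch in text:
        options = NEIGHBORS.get(ch.lower())
        if options and rng.random() < probability:
            # Simplification: the replacement is always lowercase.
            out.append(rng.choice(options))
        else:
            out.append(ch)
    return "".join(out)


print(neighbor_typos("toasted", probability=0.3))
```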

qstn.utilities.prompt_perturbations.letter_swaps(text, probability=0.1)[source]#

Randomly swaps adjacent letters in the text.

Parameters:
  • text (str) – The input text to perturb.

  • probability (float) – The probability of swapping each adjacent letter pair.

Returns:

The text with adjacent letters swapped based on the given probability.

Return type:

str
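A minimal sketch of adjacent-letter swapping. The name swap_adjacent_letters and the rule that a swapped pair is skipped (so each letter moves at most once) are assumptions made for this illustration.

```python
import random


def swap_adjacent_letters(text, probability=0.1, rng=None):
    """Hypothetical stand-in: swap neighboring letters at random."""
    rng = rng or random.Random()
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        both_letters = chars[i].isalpha() and chars[i + 1].isalpha()
        if both_letters and rng.random() < probability:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so a letter moves at most once
        else:
            i += 1
    return "".join(chars)


print(swap_adjacent_letters("questionnaire", probability=0.3))
```

Because only letters are exchanged, the output always contains the same characters as the input, just in a different order.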

qstn.utilities.prompt_perturbations.make_paraphrase(all_prompts, model, instruction)[source]#

Uses a language model to paraphrase the input text.

Parameters:
  • all_prompts (list[str]) – The input prompts as a list to perturb.

  • model (str) – The language model to use for paraphrasing as a vllm LLM object.

  • instruction (str) – The instruction prompt for the model.

Returns:

The paraphrased text as a list of strings.

Return type:

List[str]

qstn.utilities.prompt_perturbations.make_synonyms(all_prompts, model, instruction)[source]#

Uses a language model to replace words with their synonyms.

Parameters:
  • all_prompts (list[str]) – The input prompts as a list to perturb.

  • model (str) – The language model to use for generating synonyms as a vllm LLM object.

  • instruction (str) – The instruction prompt for the model.

Returns:

The prompts with words replaced by their synonyms as a list of strings.

Return type:

List[str]

Prompt Templates#

Survey Objects#

class qstn.utilities.survey_objects.AnswerOptions(answer_texts, from_to_scale=False, list_prompt_template='Options are: {options}', scale_prompt_template='Options range from {start} to {end}', response_generation_method=None)[source]#

Bases: object

Stores answer options for a single question or a full questionnaire.

Parameters:
  • answer_texts (list) – A list of possible answer strings.

  • index (list | None) – Optionally store answer option indices separately, e.g., for structured outputs.

  • from_to_scale (bool) – If True, treat answer_texts as a scale [start, …, end].

  • list_prompt_template (str) – A format string for list-based options. Must contain an ‘{options}’ placeholder.

  • scale_prompt_template (str) – A format string for scale-based options. Must contain ‘{start}’ and ‘{end}’ placeholders.

  • response_generation_method (ResponseGenerationMethod | None)

answer_texts: AnswerTexts#
create_options_str()[source]#
Return type:

str

from_to_scale: bool = False#
list_prompt_template: str = 'Options are: {options}'#
response_generation_method: ResponseGenerationMethod | None = None#
scale_prompt_template: str = 'Options range from {start} to {end}'#
class qstn.utilities.survey_objects.AnswerTexts(answer_texts, indices=None, index_answer_seperator=': ', option_seperators=', ', only_scale=False)[source]#

Bases: object

Represents the answer choices for a questionnaire item.

This class manages the different formats of answer texts, including lists of options and scales. It can handle answers with or without indices.

Parameters:
  • answer_texts (list[str] | None)

  • indices (list[str] | None)

  • index_answer_seperator (str)

  • option_seperators (str)

  • only_scale (bool)

full_answers#

A list of the complete answer strings, including indices and separators if provided.

Type:

List[str]

answer_texts#

The text of the answer options.

Type:

Optional[List[str]]

indices#

The indices corresponding to the answer options.

Type:

Optional[List[str]]

index_answer_seperator#

The separator between an index and its corresponding answer text. Defaults to ": ".

Type:

str

option_seperators#

The separators used to join multiple answer options into a single string. Defaults to (", ",).

Type:

Tuple[str, …]

only_scale#

If True, the answers represent a scale, and only the first and last answer texts are used to create a range of options. Defaults to False.

Type:

bool

answer_texts: list[str] | None = None#
full_answers: list[str]#
get_list_answer_texts()[source]#

Returns the answer texts as a single string, joined by the option separators.

Returns:

A string representation of the list of answers.

Return type:

str

get_scale_answer_texts()[source]#

Returns the first and last answer texts for a scale.

Returns:

A tuple containing the first and last answer texts.

Return type:

Tuple[str, str]

index_answer_seperator: str = ': '#
indices: list[str] | None = None#
only_scale: bool = (False,)#
option_seperators: str = (', ',)#
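How indices, separators, and the two prompt templates might fit together can be sketched in plain Python. The example data is hypothetical and the composition shown here (including the use of the full answers as the scale endpoints) is an assumption; only the template strings and default separators are taken from the signatures above.

```python
# Hypothetical example data.
indices = ["1", "2", "3"]
answer_texts = ["Disagree", "Neutral", "Agree"]

# Defaults from the documented signatures.
index_answer_seperator = ": "
option_seperator = ", "
list_prompt_template = "Options are: {options}"
scale_prompt_template = "Options range from {start} to {end}"

# Assumed composition: index + separator + text for every option.
full_answers = [
    f"{index}{index_answer_seperator}{text}"
    for index, text in zip(indices, answer_texts)
]

# List presentation: all options joined by the option separator.
options_str = list_prompt_template.format(
    options=option_seperator.join(full_answers)
)

# Scale presentation: only the first and last option are shown.
scale_str = scale_prompt_template.format(
    start=full_answers[0], end=full_answers[-1]
)

print(options_str)
print(scale_str)
```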
class qstn.utilities.survey_objects.InferenceResult(questionnaire, results)[source]#

Bases: object

Contains a prompt and the corresponding responses by the LLM. Can return results as a dataframe or return the transcript of all questions and answers.

Parameters:
  • questionnaire (LLMPrompt)

  • results (dict[int, QuestionLLMResponseTuple])

get_questions_transcript()[source]#
Return type:

str

questionnaire: LLMPrompt#
results: dict[int, QuestionLLMResponseTuple]#
to_dataframe()[source]#
Return type:

DataFrame

class qstn.utilities.survey_objects.QuestionLLMResponseTuple(question, llm_response, logprobs, reasoning)[source]#

Bases: NamedTuple

Contains the question, llm_response and optionally logprobs and built-in reasoning.

Parameters:
  • question (str)

  • llm_response (str)

  • logprobs (dict[str, float] | None)

  • reasoning (str | None)

llm_response: str#

Alias for field number 1

logprobs: dict[str, float] | None#

Alias for field number 2

question: str#

Alias for field number 0

reasoning: str | None#

Alias for field number 3
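The "alias for field number" entries reflect standard NamedTuple behavior: each field is reachable by name and by position. The class below is a hypothetical stand-in with the same field order, defined locally so the example runs without the package installed.

```python
from typing import Dict, NamedTuple, Optional


class ResponseTuple(NamedTuple):
    """Hypothetical stand-in mirroring the documented field order."""

    question: str
    llm_response: str
    logprobs: Optional[Dict[str, float]]
    reasoning: Optional[str]


row = ResponseTuple("How satisfied are you?", "4", None, None)

# Access by name and by position refer to the same value.
print(row.llm_response, row[1])

# Tuples unpack in field order.
question, llm_response, logprobs, reasoning = row
```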

class qstn.utilities.survey_objects.QuestionnaireItem(item_id, question_content, question_stem=None, answer_options=None, prefilled_response=None)[source]#

Bases: object

Represents a single questionnaire item.

Parameters:
  • item_id (str)

  • question_content (str | int)

  • question_stem (str | None)

  • answer_options (AnswerOptions | None)

  • prefilled_response (str | None)

answer_options: AnswerOptions | None = None#
item_id: str#
prefilled_response: str | None = None#
question_content: str | int#
question_stem: str | None = None#

Util Functions#

qstn.utilities.utils.create_one_dataframe(parsed_results)[source]#

Concatenates a dictionary of DataFrames into a single DataFrame.

Parameters:

parsed_results (Dict[Any, pd.DataFrame]) – A dictionary mapping objects to DataFrames. Each key must be an object that has an interview_name attribute (e.g., a custom class instance). The values are the pandas DataFrames to be merged.

Returns:

A single DataFrame containing the vertically concatenated data from all input DataFrames. Returns an empty DataFrame if the input dictionary is empty.

Return type:

pd.DataFrame

qstn.utilities.utils.generate_seeds(seed, batch_size)[source]#

Generate a list of random seeds.

Parameters:
  • seed (int) – Base random seed

  • batch_size (int) – Number of seeds to generate

Returns:

Generated random seeds

Return type:

List[int]
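One common way to derive a reproducible list of seeds from a base seed is sketched below. The name derive_seeds, the use of random.Random, and the 32-bit range are assumptions; the library may generate its seeds differently.

```python
import random
from typing import List


def derive_seeds(seed: int, batch_size: int) -> List[int]:
    """Hypothetical stand-in: derive batch_size seeds from one base seed."""
    rng = random.Random(seed)  # local generator, global state is untouched
    return [rng.randrange(2**32) for _ in range(batch_size)]


seeds = derive_seeds(42, 4)
print(seeds)
```

Using a local generator keeps the result reproducible for a given base seed without affecting other code that relies on the global random state.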

qstn.utilities.utils.safe_format_with_regex(template_string, data)[source]#

Safely substitutes {{variable}} style placeholders using a regex.

Parameters:
  • template_string (str)

  • data (dict)

Return type:

str
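Regex-based substitution of {{variable}} placeholders can be sketched as follows. The name substitute_double_braces, the exact pattern, and the decision to leave unknown keys untouched are assumptions for illustration, not the library's documented behavior.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
DOUBLE_BRACE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def substitute_double_braces(template_string: str, data: dict) -> str:
    """Hypothetical stand-in: fill {{key}} placeholders from data."""

    def replace(match: re.Match) -> str:
        key = match.group(1)
        # Assumption: unknown keys are left in place instead of raising.
        return str(data[key]) if key in data else match.group(0)

    return DOUBLE_BRACE.sub(replace, template_string)


filled = substitute_double_braces(
    "Hello {{name}}, pick from {options}. {{missing}}", {"name": "Ada"}
)
print(filled)
```

Unlike str.format, this approach ignores single-brace text such as {options}, so templates can mix both placeholder styles without raising a KeyError.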