API: Prompt Manager and Other Eval Config Knobs
RFPromptManager
This class wraps around some LangChain APIs to manage dynamic few-shot example selection. It provides semantic similarity-based example selection to construct prompts with the most relevant examples for each input query.
The individual arguments (knobs) can be List valued or Range valued in an RFPromptManager.
That is how you can specify a base set of knob combinations from which a config group can be produced.
Also read the Multi-Config Specification page.
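To make the idea of a base set of knob combinations concrete, here is a minimal sketch that assumes a grid-style expansion, where every listed value of one knob is paired with every listed value of the others. It does not use RapidFire AI's own List() or Range(): plain Python lists stand in for List-valued knobs, the selector classes are represented by strings, the instructions string is made up, and expand_grid is a hypothetical helper that exists only for this illustration.

```python
from itertools import product


def expand_grid(knobs):
    """Expand a dict of knob -> value-or-list into one dict per combination.

    Hypothetical helper: a plain Python list plays the role of a
    List-valued knob; any other value is treated as a fixed knob.
    """
    names = list(knobs)
    # Wrap fixed values in a one-element list so product() handles them too.
    choices = [v if isinstance(v, list) else [v] for v in knobs.values()]
    return [dict(zip(names, combo)) for combo in product(*choices)]


configs = expand_grid(
    {
        "k": [3, 5],  # two candidate values
        "example_selector_cls": ["SemanticSimilarity", "MaxMarginalRelevance"],
        "instructions": "Solve the math problem step by step.",  # fixed knob
    }
)

for cfg in configs:
    print(cfg)
```

Two values of k times two selector choices give four combinations here. How a config group is actually produced from the base set is described on the Multi-Config Specification page.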
- class RFPromptManager
- Parameters:
instructions (str, optional) – The main instructions for the prompt that guide the generator’s behavior. This sets the overall task description and role for the assistant. Either this or instructions_file_path must be provided.
instructions_file_path (str, optional) – Path to a file containing the instructions. Use this as an alternative to the instructions parameter for loading instructions from a file, say, if they are very long.
examples (list[dict[str, str]], optional) – A list of example dictionaries for few-shot learning. Each example should be a dictionary with keys matching the expected input-output format (e.g., "question" and "answer").
embedding_cls (type[Embeddings], optional) – The embedding class to use for computing semantic similarity between examples and queries. Options include HuggingFaceEmbeddings and OpenAIEmbeddings. Pass the class itself, not an instance.
embedding_kwargs (dict[str, Any], optional) – Dictionary containing all parameters needed to initialize the embedding class above. Required parameters vary by embedding class. HuggingFaceEmbeddings needs model_name, model_kwargs and device, while OpenAIEmbeddings needs "model" and "api_key".
example_selector_cls (type[MaxMarginalRelevanceExampleSelector | SemanticSimilarityExampleSelector], optional) – The example selector class that determines how to choose relevant examples based on the input query. Must be either SemanticSimilarityExampleSelector or MaxMarginalRelevanceExampleSelector (for diversity) from LangChain.
example_prompt_template (PromptTemplate, optional) – A LangChain PromptTemplate that defines how to format each example. Should specify input_variables and a template string with placeholders matching the keys in the examples dictionaries.
k (int, optional) – Number of most similar or diverse examples to retrieve and include in the prompt for each query. Default is 3.
Example:
# Based on GSM8K chatbot tutorial notebook; specify your INSTRUCTIONS, examples, and OPENAI_API_KEY beforehand
fewshot_prompt_manager = RFPromptManager(
    instructions=INSTRUCTIONS,
    examples=examples,
    # Pass the embedding class itself; it is instantiated with embedding_kwargs
    embedding_cls=OpenAIEmbeddings,
    embedding_kwargs={"model": "text-embedding-3-small", "api_key": OPENAI_API_KEY},
    example_selector_cls=SemanticSimilarityExampleSelector,
    # Placeholders must match the keys of each dictionary in examples
    example_prompt_template=PromptTemplate(
        input_variables=["question", "answer"],
        template="Question: {question}\nAnswer: {answer}",
    ),
    k=5,
)
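The two selector options differ in how they trade relevance against diversity. The following is a toy sketch of that difference, not LangChain's implementation: the real selectors embed the examples with the chosen embedding class and search a vector store, whereas here the 2-D vectors are hand-written and the helper names (cosine, top_k_similar, mmr) and the lam weight are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(query, vectors, k):
    """Indices of the k vectors most similar to the query."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine(query, vectors[i]),
        reverse=True,
    )
    return ranked[:k]


def mmr(query, vectors, k, lam=0.5):
    """Greedy maximal marginal relevance selection.

    Each step picks the candidate with the best trade-off between
    similarity to the query and similarity to already selected items.
    """
    selected = []
    candidates = list(range(len(vectors)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(query, vectors[i])
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy "embeddings": items 0 and 1 are near-duplicates of each other.
example_vectors = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8], [0.0, 1.0]]
query_vector = [1.0, 0.2]

similar = top_k_similar(query_vector, example_vectors, k=2)
diverse = mmr(query_vector, example_vectors, k=2)
print("similarity:", similar)
print("mmr:", diverse)
```

With these toy vectors, pure similarity returns the two near-duplicate items, while the MMR variant keeps the best match and replaces the second pick with an item that adds new information.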
Other Eval Config Knobs
Finally, apart from the Generator, the following knobs can also be included in your eval config dictionary. Each of these can also be a knob set generator, viz., List() for discrete knobs and Range() for continuous knobs.
For more details on the four user-given functions listed below, see the API: User-Provided Functions for Run Evals page.
For more details on the semantics of the online aggregation strategy arguments listed below, see the Online Aggregation for Evals page.
- batch_size (int)
Number of examples to process in one batch for GPU efficiency (if applicable).
- preprocess_fn (Callable)
User-given function to preprocess a batch of examples; an eval config’s RagSpec and PromptManager are input by the system.
- postprocess_fn (Callable, optional)
User-given function to postprocess a batch of examples and generations; a single cfg is passed as input by the system.
- compute_metrics_fn (Callable)
User-given evaluation function to compute eval metrics per batch.
- accumulate_metrics_fn (Callable, optional)
User-given evaluation function to aggregate algebraic eval metrics across batches. If this is not given, all metrics provided in eval_compute_metrics_fn will be assumed to be distributive by default.
- online_strategy_kwargs (dict[str, Any], optional)
Parameters for the evals online aggregation strategy. The dictionary must include the following keys:
"strategy_name" (str) - Must be "normal", "wilson", or "hoeffding".
"confidence_level" (float) - Confidence level for confidence intervals on metrics. Must be in [0,1]. Default is 0.95 (95%).
"use_fpc" (bool) - Whether to apply finite population correction. Default is True.