API: LangChain RAG Spec
=======================

RapidFire AI's core API for defining the stages of a RAG pipeline before the generator itself is a wrapper around the corresponding APIs of LangChain. In particular, this class specifies all of the following stages: data loading, chunking, embedding, indexing, retrieval, and reranking. Note that many of these stages are optional. Some of the arguments (knobs) here can also be :class:`List` valued or :class:`Range` valued, depending on their data types, as explained below. All of this forms the base set of knob combinations from which a config group can be produced. Also read :doc:`the Multi-Config Specification page`.

.. py:class:: RFLangChainRagSpec

   .. py:method:: __init__(document_loader: BaseLoader, text_splitter: TextSplitter, embedding_cls: type[Embeddings] = None, embedding_kwargs: dict[str, Any] = None, vector_store: VectorStore = None, retriever: BaseRetriever = None, search_type: str = "similarity", search_kwargs: dict = None, reranker_cls: type[BaseDocumentCompressor] = None, reranker_kwargs: dict[str, Any] = None, enable_gpu_search: bool = False, document_template: Callable[[Document], str] = None)

      Initialize the RAG specification with document loading, chunking, embedding, indexing, retrieval, and reranking configurations.

      :param document_loader: The loader for source documents from various sources (files, directories, databases, etc.). Must be a LangChain BaseLoader implementation.
      :type document_loader: BaseLoader
      :param text_splitter: The text splitter for chunking documents for RAG purposes. Controls chunk size, overlap, and splitting strategy. Must be a LangChain TextSplitter.
      :type text_splitter: TextSplitter
      :param embedding_cls: Optional embedding class to convert a chunk/query into a vector. Options include :class:`HuggingFaceEmbeddings`, :class:`OpenAIEmbeddings`, etc. Pass the class itself, not an instance.
      :type embedding_cls: type[Embeddings], optional
      :param embedding_kwargs: Dictionary containing all parameters needed to initialize the embedding class above. Required parameters vary by embedding class. For example, :class:`HuggingFaceEmbeddings` needs :code:`model_name`, :code:`model_kwargs`, and :code:`device`.
      :type embedding_kwargs: dict[str, Any], optional
      :param vector_store: Optional vector store for storing and possibly indexing over embedding vectors. If not provided, a default FAISS flat vector store will be created automatically. Must be a LangChain VectorStore implementation.
      :type vector_store: VectorStore, optional
      :param retriever: Optional custom retriever for chunk retrieval. If not provided, a default retriever will be created automatically over the FAISS vector store using the specified search configuration below. Must be a LangChain BaseRetriever implementation.
      :type retriever: BaseRetriever, optional
      :param search_type: The search algorithm type for retrieval. Must be one of the following three options. Default is :code:`"similarity"`.

         * :code:`"similarity"`: Standard cosine similarity search.
         * :code:`"similarity_score_threshold"`: Similarity search with a minimum score threshold (SST).
         * :code:`"mmr"`: Maximum Marginal Relevance (MMR) search for diversity.

      :type search_type: str
      :param search_kwargs: Additional parameters for search configuration. The keys can include the following:

         * :code:`"k"`: Number of documents to retrieve. Default is 5.
         * :code:`"filter"`: Optional filter criteria function for search results.
         * :code:`"score_threshold"`: Only for SST. Minimum similarity score threshold.
         * :code:`"fetch_k"`: Only for MMR. Number of documents to fetch before MMR reranking. Default is 20.
         * :code:`"lambda_mult"`: Only for MMR. Diversity parameter balancing relevance vs. diversity. Default is 0.5.

      :type search_kwargs: dict, optional
      :param reranker_cls: Optional reranker class for reordering retrieved chunks by relevance.
         Options include :class:`CrossEncoderReranker` from :code:`langchain.retrievers.document_compressors`. The instantiated reranker is applied to each query's results individually. Pass the class itself, not an instance.
      :type reranker_cls: type[BaseDocumentCompressor], optional
      :param reranker_kwargs: Dictionary containing all parameters needed to initialize the reranker class above. Required parameters vary by reranker class. For example, :class:`CrossEncoderReranker` needs :code:`model_name`, :code:`model_kwargs`, and :code:`top_n`.
      :type reranker_kwargs: dict[str, Any], optional
      :param enable_gpu_search: If :code:`True`, uses GPU-accelerated FAISS (IndexFlatL2 on GPU) with matrix multiplication for exact search. Otherwise, uses a CPU-based FAISS HNSW index (IndexHNSWFlat) for approximate search. GPU mode requires the :code:`faiss-gpu` package and a CUDA-compatible GPU. Default is :code:`False`.
      :type enable_gpu_search: bool, optional
      :param document_template: Optional function to format chunks for display or downstream processing. Should accept a single LangChain Document object and return a formatted string. If not provided, the default template format :code:`"metadata:\\ncontent"` is used. Multiple documents are separated by double newlines.
      :type document_template: Callable[[Document], str], optional

   .. py:method:: serialize_documents(batch_docs: list[list[Document]]) -> list[str]

      Serialize a batch of context document chunks into formatted strings for context injection.

      :param batch_docs: List of Document lists, where each inner list contains the Documents for one query.
      :type batch_docs: list[list[Document]]
      :return: List of formatted document chunk strings, one per query, with different document chunks separated by double newlines.
      :rtype: list[str]

   .. py:method:: get_context(batch_queries: list[str], use_reranker: bool = True, serialize: bool = True) -> list[str] | list[list[Document]]

      Convenience function to retrieve, and optionally also serialize, relevant context document chunks for a batch of queries. By default, if a reranker is provided in the RAG spec, it will be applied.

      :param batch_queries: List of query strings to retrieve context for.
      :type batch_queries: list[str]
      :param use_reranker: Whether to apply reranking if a reranker is provided. Default is :code:`True`. Set to :code:`False` to skip reranking.
      :type use_reranker: bool, optional
      :param serialize: Whether to serialize documents into strings. If :code:`False`, returns raw Document objects. Default is :code:`True`.
      :type serialize: bool, optional
      :return: List of formatted context strings (if :code:`serialize=True`) or list of Document lists (if :code:`serialize=False`), one per query.
      :rtype: list[str] | list[list[Document]]
      :raises ValueError: If a retriever is not configured in the RAG spec; the internal method :code:`build_index()` will fail.

.. seealso::

   - `DirectoryLoader API Reference `_
   - `HuggingFaceEmbeddings API Reference `_
   - `LangChain Text Splitters `_
   - `LangChain Embeddings `_
   - `LangChain Retrievers `_
   - `LangChain Vector Stores `_
   - `LangChain Document `_
   - `FAISS `_

**Example:**

.. code-block:: python

   # Based on the FiQA tutorial notebook
   rag_gpu = RFLangChainRagSpec(
       document_loader=DirectoryLoader(
           path="data/fiqa/",
           glob="corpus.jsonl",
           loader_cls=JSONLoader,
           loader_kwargs={
               "jq_schema": ".",
               "content_key": "text",
               "metadata_func": lambda record, metadata: {
                   "corpus_id": int(record.get("_id"))  # store the document id
               },
               "json_lines": True,
               "text_content": False,
           },
           sample_seed=42,
       ),
       # 2 chunking strategies with different chunk sizes
       text_splitter=List(
           [
               RecursiveCharacterTextSplitter.from_tiktoken_encoder(
                   encoding_name="gpt2", chunk_size=256, chunk_overlap=32
               ),
               RecursiveCharacterTextSplitter.from_tiktoken_encoder(
                   encoding_name="gpt2", chunk_size=128, chunk_overlap=32
               ),
           ]
       ),
       embedding_cls=HuggingFaceEmbeddings,
       embedding_kwargs={
           "model_name": "sentence-transformers/all-MiniLM-L6-v2",
           "model_kwargs": {"device": "cuda:0"},
           "encode_kwargs": {"normalize_embeddings": True, "batch_size": batch_size},
       },
       vector_store=None,  # uses FAISS by default
       search_type="similarity",
       search_kwargs={"k": 15},
       # 2 reranking strategies with different top-n values
       reranker_cls=CrossEncoderReranker,
       reranker_kwargs={
           "model_name": "cross-encoder/ms-marco-MiniLM-L6-v2",
           "model_kwargs": {"device": "cuda:0"},
           "top_n": List([2, 5]),
       },
       enable_gpu_search=True,  # GPU-based exact search instead of ANN index
   )

**Notes:**

Note that one :class:`RFLangChainRagSpec` object can have only one :code:`document_loader` to specify the base data. But you can specify a :class:`List` or :class:`Range` (when applicable) for all the other values in a multi-config specification. For instance, the example above showcases two text splitters and two rerankers with different hyperparameters.
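To make the chunk-formatting behavior concrete, here is a minimal, self-contained sketch of how a :code:`document_template` callable and the double-newline serialization of :code:`serialize_documents` fit together. :code:`SimpleDoc`, :code:`default_template`, :code:`id_template`, and :code:`serialize_batch` are hypothetical stand-ins written for illustration only; they are not part of the RapidFire AI or LangChain APIs, and the library's exact default :code:`"metadata:\\ncontent"` rendering may differ.

.. code-block:: python

   from dataclasses import dataclass, field
   from typing import Any, Callable


   @dataclass
   class SimpleDoc:
       # Hypothetical stand-in for LangChain's Document: the two
       # attributes a document_template callable relies on.
       page_content: str
       metadata: dict[str, Any] = field(default_factory=dict)


   def default_template(doc: SimpleDoc) -> str:
       # Mimics the documented "metadata:\ncontent" default format:
       # the metadata dict, a colon, a newline, then the chunk text.
       return f"{doc.metadata}:\n{doc.page_content}"


   def id_template(doc: SimpleDoc) -> str:
       # A custom template that surfaces only one metadata key.
       return f"[doc {doc.metadata.get('corpus_id')}] {doc.page_content}"


   def serialize_batch(
       batch_docs: list[list[SimpleDoc]],
       template: Callable[[SimpleDoc], str] = default_template,
   ) -> list[str]:
       # One formatted string per query; a query's chunks are joined
       # by double newlines, as serialize_documents() describes.
       return ["\n\n".join(template(d) for d in docs) for docs in batch_docs]


   docs = [[SimpleDoc("chunk A", {"corpus_id": 1}),
            SimpleDoc("chunk B", {"corpus_id": 2})]]
   print(serialize_batch(docs)[0])

Passing a function shaped like :code:`id_template` as :code:`document_template` would let each serialized chunk expose only the metadata fields your generator prompt actually needs.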