API: LangChain RAG Spec
=======================

RapidFire AI's core API for defining the stages of a RAG pipeline before the generator itself is a wrapper around the corresponding APIs of LangChain. In particular, this class specifies all of the following stages: data loading, chunking, embedding, indexing, retrieval, and reranking. Note that many of these stages are optional. Some of the arguments (knobs) here can also be :class:`List` valued or :class:`Range` valued, depending on their data types, as explained below. All of this forms the base set of knob combinations from which a config group can be produced. Also read :doc:`the Multi-Config Specification page`.

.. py:class:: RFLangChainRagSpec

   .. py:method:: __init__(document_loader: BaseLoader, text_splitter: TextSplitter, embedding_cls: type[Embeddings] = None, embedding_kwargs: dict[str, Any] = None, vector_store: VectorStore = None, retriever: BaseRetriever = None, search_type: str = "similarity", search_kwargs: dict = None, reranker_cls: type[BaseDocumentCompressor] = None, reranker_kwargs: dict[str, Any] = None, enable_gpu_search: bool = False, document_template: Callable[[Document], str] = None)

      Initialize the RAG specification with document loading, chunking, embedding, indexing, retrieval, and reranking configurations.

      :param document_loader: The loader for source documents from various sources (files, directories, databases, etc.). Must be a LangChain BaseLoader implementation.
      :type document_loader: BaseLoader
      :param text_splitter: The text splitter for chunking documents for RAG purposes. Controls chunk size, overlap, and splitting strategy. Must be a LangChain TextSplitter.
      :type text_splitter: TextSplitter
      :param embedding_cls: Optional embedding class to convert a chunk/query into a vector. Options include :class:`HuggingFaceEmbeddings`, :class:`OpenAIEmbeddings`, etc. Pass the class itself, not an instance.
      :type embedding_cls: type[Embeddings], optional
      :param embedding_kwargs: Dictionary containing all parameters needed to initialize the embedding class above. Required parameters vary by embedding class. For example, :class:`HuggingFaceEmbeddings` needs :code:`model_name`, :code:`model_kwargs`, and :code:`device`.
      :type embedding_kwargs: dict[str, Any], optional
      :param vector_store: Optional vector store for storing and possibly indexing over embedding vectors. If not provided, a default FAISS flat vector store will be created automatically. Must be a LangChain VectorStore implementation.
      :type vector_store: VectorStore, optional
      :param retriever: Optional custom retriever for chunk retrieval. If not provided, a default retriever will be created automatically over the FAISS vector store using the specified search configuration below. Must be a LangChain BaseRetriever implementation.
      :type retriever: BaseRetriever, optional
      :param search_type: The search algorithm type for retrieval. Must be one of the following three options. Default is :code:`"similarity"`.

         * :code:`"similarity"`: Standard cosine similarity search.
         * :code:`"similarity_score_threshold"`: Similarity search with a minimum score threshold (SST).
         * :code:`"mmr"`: Maximum Marginal Relevance (MMR) search for diversity.

      :type search_type: str
      :param search_kwargs: Additional parameters for search configuration. The keys can include the following:

         * :code:`"k"`: Number of documents to retrieve. Default is 5.
         * :code:`"filter"`: Optional filter criteria function for search results.
         * :code:`"score_threshold"`: Only for SST. Minimum similarity score threshold.
         * :code:`"fetch_k"`: Only for MMR. Number of documents to fetch before MMR reranking. Default is 20.
         * :code:`"lambda_mult"`: Only for MMR. Diversity parameter balancing relevance vs. diversity. Default is 0.5.

      :type search_kwargs: dict, optional
      :param reranker_cls: Optional reranker class for reordering retrieved chunks by relevance.
         Options include :class:`CrossEncoderReranker` from :code:`langchain.retrievers.document_compressors`. The instantiated reranker is applied to each query's results individually. Pass the class itself, not an instance.
      :type reranker_cls: type[BaseDocumentCompressor], optional
      :param reranker_kwargs: Dictionary containing all parameters needed to initialize the reranker class above. Required parameters vary by reranker class. For example, :class:`CrossEncoderReranker` needs :code:`model_name`, :code:`model_kwargs`, and :code:`top_n`.
      :type reranker_kwargs: dict[str, Any], optional
      :param enable_gpu_search: If :code:`True`, uses GPU-accelerated FAISS (IndexFlatL2 on GPU) with matrix multiplication for exact search. Otherwise, uses a CPU-based FAISS HNSW index (IndexHNSWFlat) for approximate search. GPU mode requires the :code:`faiss-gpu` package and a CUDA-compatible GPU. Default is :code:`False`.
      :type enable_gpu_search: bool, optional
      :param document_template: Optional function to format chunks for display or downstream processing. Should accept a single LangChain Document object and return a formatted string. If not provided, the default template format :code:`"metadata:\\ncontent"` is used. Multiple documents are separated by double newlines.
      :type document_template: Callable[[Document], str], optional

   .. py:method:: serialize_documents(batch_docs: list[list[Document]]) -> list[str]

      Serialize a batch of context document chunks into formatted strings for context injection.

      :param batch_docs: List of Document lists, where each inner list contains the Documents for one query.
      :type batch_docs: list[list[Document]]
      :return: List of formatted document chunk strings, one per query, with different document chunks separated by double newlines.
      :rtype: list[str]

   .. py:method:: get_context(batch_queries: list[str], use_reranker: bool = True, serialize: bool = True) -> list[str] | list[list[Document]]

      Convenience function to retrieve, and optionally also serialize, relevant context document chunks for a batch of queries. By default, if a reranker is provided in the RAG spec, it will be applied.

      :param batch_queries: List of query strings to retrieve context for.
      :type batch_queries: list[str]
      :param use_reranker: Whether to apply reranking if a reranker is provided. Default is :code:`True`. Set to :code:`False` to skip reranking.
      :type use_reranker: bool, optional
      :param serialize: Whether to serialize documents into strings. If :code:`False`, returns raw Document objects. Default is :code:`True`.
      :type serialize: bool, optional
      :return: List of formatted context strings (if :code:`serialize=True`) or list of Document lists (if :code:`serialize=False`), one per query.
      :rtype: list[str] | list[list[Document]]
      :raises ValueError: If a retriever is not configured in the RAG spec; the internal method :code:`build_index()` will fail.

.. seealso::

   - `DirectoryLoader API Reference `_
   - `HuggingFaceEmbeddings API Reference `_
   - `LangChain Text Splitters `_
   - `LangChain Embeddings `_
   - `LangChain Retrievers `_
   - `LangChain Vector Stores `_
   - `LangChain Document `_
   - `FAISS `_

**Example:**

.. code-block:: python

   # Based on the FiQA tutorial notebook
   rag_gpu = RFLangChainRagSpec(
       document_loader=DirectoryLoader(
           path="data/fiqa/",
           glob="corpus.jsonl",
           loader_cls=JSONLoader,
           loader_kwargs={
               "jq_schema": ".",
               "content_key": "text",
               "metadata_func": lambda record, metadata: {
                   "corpus_id": int(record.get("_id"))  # store the document id
               },
               "json_lines": True,
               "text_content": False,
           },
           sample_seed=42,
       ),
       # 2 chunking strategies with different chunk sizes
       text_splitter=List(
           [
               RecursiveCharacterTextSplitter.from_tiktoken_encoder(
                   encoding_name="gpt2", chunk_size=256, chunk_overlap=32
               ),
               RecursiveCharacterTextSplitter.from_tiktoken_encoder(
                   encoding_name="gpt2", chunk_size=128, chunk_overlap=32
               ),
           ]
       ),
       embedding_cls=HuggingFaceEmbeddings,
       embedding_kwargs={
           "model_name": "sentence-transformers/all-MiniLM-L6-v2",
           "model_kwargs": {"device": "cuda:0"},
           "encode_kwargs": {"normalize_embeddings": True, "batch_size": batch_size},
       },
       vector_store=None,  # uses FAISS by default
       search_type="similarity",
       search_kwargs={"k": 15},
       # 2 reranking strategies with different top-n values
       reranker_cls=CrossEncoderReranker,
       reranker_kwargs={
           "model_name": "cross-encoder/ms-marco-MiniLM-L6-v2",
           "model_kwargs": {"device": "cuda:0"},
           "top_n": List([2, 5]),
       },
       enable_gpu_search=True,  # GPU-based exact search instead of ANN index
   )

**Notes:**

Note that one :class:`RFLangChainRagSpec` object can have only one :code:`document_loader` to specify the base data. But you can specify a :class:`List` or :class:`Range` (when applicable) for all the other values in a multi-config specification. For instance, the example above showcases two text splitters and two rerankers with different hyperparameters.
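To make the chunk-formatting behavior concrete, here is a minimal, self-contained sketch of how a :code:`document_template` callable and the double-newline serialization of :code:`serialize_documents` fit together. :code:`SimpleDoc`, :code:`default_template`, :code:`id_template`, and :code:`serialize_batch` are hypothetical stand-ins written for illustration only; they are not part of the RapidFire AI or LangChain APIs, and the library's exact default :code:`"metadata:\\ncontent"` rendering may differ.

.. code-block:: python

   from dataclasses import dataclass, field
   from typing import Any, Callable


   @dataclass
   class SimpleDoc:
       # Hypothetical stand-in for LangChain's Document: the two
       # attributes a document_template callable relies on.
       page_content: str
       metadata: dict[str, Any] = field(default_factory=dict)


   def default_template(doc: SimpleDoc) -> str:
       # Mimics the documented "metadata:\ncontent" default format:
       # the metadata dict, a colon, a newline, then the chunk text.
       return f"{doc.metadata}:\n{doc.page_content}"


   def id_template(doc: SimpleDoc) -> str:
       # A custom template that surfaces only one metadata key.
       return f"[doc {doc.metadata.get('corpus_id')}] {doc.page_content}"


   def serialize_batch(
       batch_docs: list[list[SimpleDoc]],
       template: Callable[[SimpleDoc], str] = default_template,
   ) -> list[str]:
       # One formatted string per query; a query's chunks are joined
       # by double newlines, as serialize_documents() describes.
       return ["\n\n".join(template(d) for d in docs) for docs in batch_docs]


   docs = [[SimpleDoc("chunk A", {"corpus_id": 1}),
            SimpleDoc("chunk B", {"corpus_id": 2})]]
   print(serialize_batch(docs)[0])

Passing a function shaped like :code:`id_template` as :code:`document_template` would let each serialized chunk expose only the metadata fields your generator prompt actually needs.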