Install and Get Started: RAG and Context Engineering
To install RapidFire AI for RAG and context engineering on your local machine or remote/cloud instance, follow the steps below.
Note that if you only plan to use OpenAI APIs and not self-hosted models, you do not need GPUs on your machine. But you must provide a valid OpenAI API key via a config argument as shown in the tutorial notebooks.
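One common way to supply the key is via an environment variable that the notebook's config cell can then read. The exact config argument name is shown in the tutorial notebooks; the variable name below is the OpenAI client library's standard convention, used here as an assumption:

```shell
# Assumption: the key is read from the standard OPENAI_API_KEY environment
# variable (or passed through a config argument in the notebook).
# Replace the placeholder with your real key.
export OPENAI_API_KEY="<your-openai-api-key>"
```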
Note
An example notebook for RapidFire AI RAG fully on Google Colab is coming soon. Watch this space for updates.
Step 1: Install dependencies and package
Install the RapidFire AI OSS package from PyPI (it includes all dependencies) and verify that it is installed correctly.
Important
Requires Python 3.12+. Ensure that python3 resolves to Python 3.12 before creating the venv.
python3 --version # must be 3.12.x
python3 -m venv .venv
source .venv/bin/activate
pip install rapidfireai
rapidfireai --version
# Verify it prints the following:
# RapidFire AI 0.12.3
# Due to current issue: https://github.com/huggingface/xet-core/issues/527
pip uninstall -y hf-xet
The tutorial notebooks for RAG evals do not use any gated models from Hugging Face. If you want to access gated models, provide your Hugging Face account token. For more details on that, see Step 1 here.
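If you do need gated models, the standard Hugging Face token mechanisms apply (this is generic Hugging Face setup, independent of RapidFire AI):

```shell
# Option A: interactive login (stores the token locally):
#   huggingface-cli login
# Option B: export the token so huggingface_hub picks it up automatically.
# Replace the placeholder with your real token.
export HF_TOKEN="<your-hf-token>"
```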
Step 2: Initialize RapidFire AI
Run the following command to initialize rapidfireai to use the correct dependencies for RAG evals:
rapidfireai init --evals
# It will install specific dependencies and initialize rapidfireai for RAG evals
Note
You need to run init only once for a new venv or when switching GPU(s) on your machine. You do NOT need to run it after a reboot or for a new terminal tab.
Step 3: Open the tutorial notebooks
After completing Step 2, open one of the tutorial notebooks via Jupyter (explained further here: Example Use Case Tutorials), say, the FiQA RAG Q&A chatbot use case. You can find the files under the “tutorial_notebooks” folder in the directory where you initialized rapidfireai.
FiQA: RAG for Financial Opinion Q&A Chatbot: View on GitHub
GSM8K: Context Engineering for Math Reasoning: View on GitHub
SciFact: RAG for Scientific Claim Verification: View on GitHub
Quickstart Video (3.5min)
Full Usage Walkthrough Video (13.5min)
Step 4: Run the notebook cells
Run the cells one by one as shown in the above videos. Wait for a cell to finish before running the next.
Imports
Load datasets
Create named RF experiment
Define RF RAG spec that wraps LangChain classes
Define data preprocessing and post processing functions
Define eval metrics functions per batch and for accumulation
Define RF generator spec that wraps vLLM or OpenAI classes
Define rest of multi-config knob dictionary and generate config group
Launch multi-config evals; adjust num_shards as per desired concurrency (see Run Evals for details)
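As an illustration of the per-batch-plus-accumulation metric pattern mentioned above (a generic Python sketch with made-up function names, not the RapidFire AI API):

```python
# Illustration only: score each batch independently, then accumulate
# batch results into an overall metric. Exact-match accuracy is used
# as a stand-in metric here.
def batch_metric(predictions, references):
    """Score one batch: count exact matches and record the batch size."""
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return {"hits": hits, "count": len(predictions)}

def accumulate(batch_results):
    """Aggregate per-batch results into a running accuracy."""
    hits = sum(b["hits"] for b in batch_results)
    count = sum(b["count"] for b in batch_results)
    return {"accuracy": hits / count if count else 0.0}

batches = [
    batch_metric(["4", "7"], ["4", "8"]),  # 1 of 2 correct
    batch_metric(["12"], ["12"]),          # 1 of 1 correct
]
print(accumulate(batches))  # overall accuracy = 2/3
```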
Step 5: Monitor online aggregation of eval metrics on the in-notebook table

Step 6: Interactive Control (IC) Ops: Stop, Clone-Modify; check their results



Step 7: Inspect results, end experiment, and check logs
Run the cell to print some entries of the evals results. End the experiment after you are done with it.
You can then move on to another (named) experiment in the same session.
Run as many experiments as you like; each will have its metrics appear on its own table under the run_evals() cell.
All experiment artifacts (metrics files, logs, checkpoints, etc.) are persisted on your machine under the experiments path specified in the constructor.
When you are done overall, just close the notebook. RapidFire AI for evals does not maintain any running server processes.
Step 8: Venture Beyond!
After trying out the tutorial notebooks, explore the rest of this docs website, especially the API pages for RAG and context engineering. Play around more with IC Ops and/or run more experiments as you wish, including changing the prompt schemes, generator models and their knobs, chunking/reranking/retrieval knobs, and eval metrics definitions.
You are now up to speed! Enjoy the power of rapid AI customization with RapidFire AI!