SFT for Q&A Chatbot
===================

Please check out the tutorial notebooks at the links below. Right-click on a GitHub link to save the file locally.

SFT for customer support Q&A chatbot: `View on GitHub `__. Use this version if your GPU has >= 80 GB of HBM.

Lite version: `View on GitHub `__. Use this version if your GPU has < 80 GB of HBM; it uses smaller LLMs and finishes faster.

Task, Dataset, and Prompt
-------------------------

This tutorial demonstrates Supervised Fine-Tuning (SFT) for building a customer support Q&A chatbot. It uses the "Bitext customer support" dataset; `see its details on Hugging Face `__. We use a sample of 5,000 training examples and 200 evaluation examples to keep demo runtimes tractable. The prompt format consists of a system message that defines the assistant as a "helpful and friendly customer support" agent, followed by the user's instruction and the assistant's response. Sketches of the data sampling and prompt layout appear at the end of this page.

Model, Adapter, and Trainer Knobs
---------------------------------

We compare two base model architectures: Llama-3.1-8B-Instruct and Mistral-7B-Instruct-v0.3. The lite version uses only one: TinyLlama-1.1B-Chat-v1.0. There are two different LoRA adapter configurations: a low-capacity adapter (rank 16; 8 for lite) that targets only 2 modules, and a high-capacity adapter (rank 128; 32 for lite) that targets 4 modules. All other knobs are fixed across all configs. This yields 4 combinations in total, all launched with a simple grid search; sketches of the adapter configurations and the grid launch also appear below.
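
The notebooks handle data loading internally; the following is only a minimal sketch of how a 5,000/200 sample could be drawn with the Hugging Face ``datasets`` library. The dataset ID and the shuffle-then-slice scheme are assumptions, not the notebooks' exact code.

.. code-block:: python

    from datasets import load_dataset

    # Assumed Hugging Face ID for the Bitext customer support dataset;
    # check the notebook for the exact ID and sampling logic.
    raw = load_dataset(
        "bitext/Bitext-customer-support-llm-chatbot-training-dataset",
        split="train",
    )
    shuffled = raw.shuffle(seed=42)
    train_ds = shuffled.select(range(5_000))        # 5,000 training examples
    eval_ds = shuffled.select(range(5_000, 5_200))  # 200 evaluation examples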
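
Likewise, here is a minimal sketch of the prompt layout described above, written in chat-message form; the exact system-message wording and chat template the notebooks use may differ.

.. code-block:: python

    # Illustrative system message based on the description above; the
    # notebooks' exact wording may differ.
    SYSTEM_MESSAGE = "You are a helpful and friendly customer support assistant."

    def build_messages(instruction: str, response: str) -> list[dict]:
        """Assemble one training example as system/user/assistant messages."""
        return [
            {"role": "system", "content": SYSTEM_MESSAGE},
            {"role": "user", "content": instruction},
            {"role": "assistant", "content": response},
        ]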
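
The two adapter capacities can be expressed with the ``peft`` library's ``LoraConfig``. The target-module names below follow common Llama/Mistral naming conventions and are assumptions; the notebooks may target different modules.

.. code-block:: python

    from peft import LoraConfig

    # Low-capacity adapter: rank 16 (8 in the lite version), 2 target modules.
    low_capacity = LoraConfig(
        r=16,
        target_modules=["q_proj", "v_proj"],  # assumed module names
    )

    # High-capacity adapter: rank 128 (32 in the lite version), 4 target modules.
    high_capacity = LoraConfig(
        r=128,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    )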
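
Finally, a sketch of the 2 x 2 grid (base model x adapter capacity) that produces the 4 runs. ``launch_run`` is a hypothetical stand-in for the notebooks' actual training entry point.

.. code-block:: python

    from itertools import product

    base_models = [
        "meta-llama/Llama-3.1-8B-Instruct",
        "mistralai/Mistral-7B-Instruct-v0.3",
    ]
    adapter_ranks = {"low": 16, "high": 128}

    def launch_run(model: str, rank: int, tag: str) -> None:
        # Hypothetical stand-in for the notebooks' training entry point.
        print(f"launching {tag}: model={model}, lora_rank={rank}")

    # 2 base models x 2 adapter capacities = 4 runs in total.
    for model, (name, rank) in product(base_models, adapter_ranks.items()):
        launch_run(model=model, rank=rank, tag=f"{model.split('/')[-1]}-{name}")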