SFT for Q&A Chatbot
===================

Please check out the tutorial notebooks at the links below. Right-click on a GitHub link to save the file locally.

SFT for customer support Q&A chatbot: `View on GitHub `__. Use this version if your GPU has >= 80 GB of HBM.

Lite version: `View on GitHub `__. Use this version if your GPU has < 80 GB of HBM; it uses smaller LLMs and finishes faster.

Task, Dataset, and Prompt
-------------------------

This tutorial demonstrates Supervised Fine-Tuning (SFT) for building a customer support Q&A chatbot. It uses the "Bitext customer support" dataset; `see its details on Hugging Face `__. We use a sample of 5,000 training examples and 200 evaluation examples to keep demo runtimes tractable. The prompt format consists of a system message that defines the assistant as a "helpful and friendly customer support" agent, followed by the user's instruction and the assistant's response. Sketches of the data sampling and prompt layout appear at the end of this page.

Model, Adapter, and Trainer Knobs
---------------------------------

We compare two base model architectures: Llama-3.1-8B-Instruct and Mistral-7B-Instruct-v0.3. The lite version uses only one: TinyLlama-1.1B-Chat-v1.0. There are two different LoRA adapter configurations: a low-capacity adapter (rank 16; 8 for lite) that targets only 2 modules, and a high-capacity adapter (rank 128; 32 for lite) that targets 4 modules. All other knobs are fixed across all configs. This yields 4 combinations in total, all launched with a simple grid search; sketches of the adapter configurations and the grid launch also appear below.
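
The notebooks handle data loading internally; the following is only a minimal sketch of how a 5,000/200 sample could be drawn with the Hugging Face ``datasets`` library. The dataset ID and the shuffle-then-slice scheme are assumptions, not the notebooks' exact code.

.. code-block:: python

    from datasets import load_dataset

    # Assumed Hugging Face ID for the Bitext customer support dataset;
    # check the notebook for the exact ID and sampling logic.
    raw = load_dataset(
        "bitext/Bitext-customer-support-llm-chatbot-training-dataset",
        split="train",
    )
    shuffled = raw.shuffle(seed=42)
    train_ds = shuffled.select(range(5_000))        # 5,000 training examples
    eval_ds = shuffled.select(range(5_000, 5_200))  # 200 evaluation examples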
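
Likewise, here is a minimal sketch of the prompt layout described above, written in chat-message form; the exact system-message wording and chat template the notebooks use may differ.

.. code-block:: python

    # Illustrative system message based on the description above; the
    # notebooks' exact wording may differ.
    SYSTEM_MESSAGE = "You are a helpful and friendly customer support assistant."

    def build_messages(instruction: str, response: str) -> list[dict]:
        """Assemble one training example as system/user/assistant messages."""
        return [
            {"role": "system", "content": SYSTEM_MESSAGE},
            {"role": "user", "content": instruction},
            {"role": "assistant", "content": response},
        ]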
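
The two adapter capacities can be expressed with the ``peft`` library's ``LoraConfig``. The target-module names below follow common Llama/Mistral naming conventions and are assumptions; the notebooks may target different modules.

.. code-block:: python

    from peft import LoraConfig

    # Low-capacity adapter: rank 16 (8 in the lite version), 2 target modules.
    low_capacity = LoraConfig(
        r=16,
        target_modules=["q_proj", "v_proj"],  # assumed module names
    )

    # High-capacity adapter: rank 128 (32 in the lite version), 4 target modules.
    high_capacity = LoraConfig(
        r=128,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    )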
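
Finally, a sketch of the 2 x 2 grid (base model x adapter capacity) that produces the 4 runs. ``launch_run`` is a hypothetical stand-in for the notebooks' actual training entry point.

.. code-block:: python

    from itertools import product

    base_models = [
        "meta-llama/Llama-3.1-8B-Instruct",
        "mistralai/Mistral-7B-Instruct-v0.3",
    ]
    adapter_ranks = {"low": 16, "high": 128}

    def launch_run(model: str, rank: int, tag: str) -> None:
        # Hypothetical stand-in for the notebooks' training entry point.
        print(f"launching {tag}: model={model}, lora_rank={rank}")

    # 2 base models x 2 adapter capacities = 4 runs in total.
    for model, (name, rank) in product(base_models, adapter_ranks.items()):
        launch_run(model=model, rank=rank, tag=f"{model.split('/')[-1]}-{name}")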