API: LoRA and Model Configs
RapidFire AI’s core APIs for model and adapter specifications are thin wrappers around the corresponding APIs in Hugging Face’s transformers and PEFT (LoRA) libraries.
RFLoraConfig
This is a wrapper around LoraConfig in HF PEFT.
The full signature and list of arguments are available on this page.
The difference is that individual arguments (knobs) can be List valued or Range valued in an RFLoraConfig. That is how you specify a base set of knob combinations from which a config group can be produced. Also read the Multi-Config Specification page.
Examples:
# Singleton config
RFLoraConfig(
    r=128, lora_alpha=256, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], bias="none"
)

# 4 combinations
RFLoraConfig(
    r=List([16, 32]), lora_alpha=List([16, 32]), lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], bias="none"
)

# 2 combinations
RFLoraConfig(
    r=64, lora_alpha=128,
    target_modules=List([
        ["q_proj", "v_proj"],
        ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    ]),
    bias="none"
)
Notes:
In terms of impact on LLM behavior, the LoRA knobs that are usually experimented with are as follows:
r (rank): The most critical knob. Typically a power of 2 between 8 and 128. Higher rank means higher adapter learning capacity but a slightly higher GPU memory footprint and compute time.
lora_alpha: Controls adaptation strength. Typical values are 16, 32, and 64. Usually set to 2x r, but this can be varied too.
target_modules: Which layers to apply LoRA to. Common options:
  Query/Value only: ["q_proj", "v_proj"]
  Query/Key/Value: ["q_proj", "k_proj", "v_proj"]
  All linear layers: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
lora_dropout: Controls regularization. Typically between 0.0 and 0.05; often set to 0.0 unless there is overfitting.
init_lora_weights: Initialization strategy. Most use the default True, but some experiment with "gaussian" or newer methods such as "pissa".
Most other knobs, such as bias, use_rslora, and modules_to_save, can be left at their PEFT defaults unless you want to explore specific advanced variations, as sketched below.
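For instance, a hedged sketch of exploring such a variation might look like the following; use_rslora, modules_to_save, and init_lora_weights are standard peft.LoraConfig arguments that RFLoraConfig is assumed to pass through, and making init_lora_weights List valued is an illustrative assumption, not taken from the official examples.

RFLoraConfig(
    r=64, lora_alpha=128,
    target_modules=["q_proj", "v_proj"],
    use_rslora=True,                             # rank-stabilized LoRA scaling
    modules_to_save=["lm_head"],                 # also fully train (and save) the LM head
    init_lora_weights=List([True, "gaussian"]),  # 2 combinations: default vs. gaussian init
)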
A common approach is to start with ranks 16 and 32 (alpha 32 and 64, respectively) and target only the query/value projection modules. Then expand based on observed loss and eval metric behavior (overfitting or underfitting) and your time/compute constraints.
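A sketch of that starting point, following the List semantics of the 4-combination example above (two List valued knobs yield the cross product, i.e., four adapter combinations rather than two matched pairs):

RFLoraConfig(
    r=List([16, 32]), lora_alpha=List([32, 64]),  # 4 combinations
    lora_dropout=0.0,
    target_modules=["q_proj", "v_proj"], bias="none"
)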
RFModelConfig
This is a core class in the RapidFire AI API that abstracts multiple Hugging Face APIs under the hood to simplify and unify all model-related specifications. In particular, it unifies model loading, training configurations, and LoRA settings into one class.
It gives you flexibility to try out variations of LoRA adapter structures, training arguments for multiple control flows (SFT, DPO, and GRPO), formatting and metrics functions, and generation specifics.
Some of the arguments (knobs) here can also be List valued or Range valued depending on their data types, as explained below. All of this helps form the base set of knob combinations from which a config group can be produced. Also read the Multi-Config Specification page.
- class RFModelConfig
- Parameters:
  model_name (str) – Model identifier for use with Hugging Face’s AutoModel.from_pretrained(). Can be a Hugging Face model hub name (e.g., "Qwen/Qwen2.5-7B-Instruct") or a local path to a checkpoint.
  tokenizer (str, optional) – Hugging Face Tokenizer identifier, typically the same as the model_name string but can be different.
  tokenizer_kwargs (Dict[str, Any], optional) – Additional keyword arguments passed to the tokenizer’s from_pretrained() method (e.g., padding_side, truncation, and model_max_length).
  formatting_func (Callable | List of Callable, optional) – Custom user-given data preprocessing function for preparing a single example with system prompt, roles, etc. Can be a List for multi-config.
  compute_metrics (Callable | List of Callable, optional) – Custom user-given evaluation function passed to Hugging Face’s Trainer.compute_metrics for use during evaluation phases. Can be a List for multi-config.
  peft_config (RFLoraConfig | List of RFLoraConfig, optional) – RFLoraConfig as described above; a thin wrapper around peft.LoraConfig for LoRA fine-tuning. Can be a List for multi-config.
  training_args (RFSFTConfig | RFDPOConfig | RFGRPOConfig) – RF trainer configuration object specifying the training flow and its parameters. Also read the Trainer Configs page.
  model_type (str, optional) – Custom user-defined string to identify the model type inside your create_model_fn() given to run_fit(); default: "causal_lm".
  model_kwargs (Dict[str, Any], optional) – Additional parameters for model initialization, passed to AutoModel.from_pretrained() (e.g., torch_dtype, device_map, trust_remote_code).
  ref_model_name (str, optional) – For DPO and GRPO only; akin to model_name above but for the frozen reference model.
  ref_model_type (str, optional) – For DPO and GRPO only; akin to model_type above but for the frozen reference model.
  ref_model_kwargs (Dict[str, Any], optional) – For DPO and GRPO only; akin to model_kwargs above but for the frozen reference model.
  reward_funcs (Callable | [Callable] | List of Callable | List of [Callable], optional) – Reward functions for evaluating generated outputs during training (mainly for GRPO but can be used for DPO too). Can be a List for multi-config.
  generation_config (Dict[str, Any], optional) – Arguments for text generation passed to model.generate() (e.g., max_new_tokens, temperature, top_p).
See also
Hugging Face Transformers documentation
Hugging Face PEFT library documentation
RFSFTConfig, RFDPOConfig, RFGRPOConfig for training argument configurations
Examples:
# Based on the SFT tutorial notebook
RFModelConfig(
    model_name="meta-llama/Llama-3.1-8B-Instruct",
    peft_config=rfloraconfig1,
    training_args=rfsftconfig1,
    model_type="causal_lm",
    model_kwargs={"device_map": "auto", "torch_dtype": "auto", "use_cache": False},
    formatting_func=sample_formatting_function,
    compute_metrics=sample_compute_metrics,
    generation_config={
        "max_new_tokens": 256, "temperature": 0.6, "top_p": 0.9,
        "top_k": 40, "repetition_penalty": 1.18,
    }
)
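The SFT example above passes sample_formatting_function and sample_compute_metrics without showing their bodies. As a rough, hypothetical sketch of what such callables could look like (the field names and metric are purely illustrative, and the exact signatures your trainer expects may differ; check the Transformers/TRL docs):

def sample_formatting_function(example):
    # Format one raw dataset row into a single training text
    # (prompt plus response); "instruction"/"response" are assumed column names.
    return (
        "### Instruction:\n" + example["instruction"] + "\n"
        "### Response:\n" + example["response"]
    )

def sample_compute_metrics(eval_pred):
    # Hugging Face Trainer.compute_metrics-style callable: receives an
    # EvalPrediction and returns a dict of named metrics (toy metric here).
    predictions = eval_pred.predictions
    labels = eval_pred.label_ids
    return {"num_eval_examples": float(len(labels))}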
# Based on the GRPO tutorial notebook
RFModelConfig(
    model_name="Qwen/Qwen2.5-7B-Instruct",
    peft_config=rfloraconfig,
    training_args=rfgrpoconfig1,
    formatting_func=sample_formatting_function,
    reward_funcs=reward_funcs,
    model_kwargs={"load_in_4bit": True, "device_map": "auto", "torch_dtype": "auto", "use_cache": False},
    tokenizer_kwargs={"model_max_length": 2048, "padding_side": "left", "truncation": True}
)
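The GRPO example passes a reward_funcs list whose functions come from the tutorial notebook. As a hedged illustration only: a TRL GRPO-style reward function takes the batch of generated completions (plus any extra dataset columns via keyword arguments) and returns one float score per completion; verify the exact signature your trainer version expects.

def length_reward(completions, **kwargs):
    # Toy reward favoring shorter completions; assumes plain-text
    # (non-conversational) completions, i.e., a list of strings.
    return [-len(text) / 1000.0 for text in completions]

reward_funcs = [length_reward]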
Notes:
Note that one RFModelConfig object can have only one base model configuration and one training control flow arguments object. But you can specify a List of PEFT configs, formatting functions, eval metrics functions, and reward function lists as part of your multi-config specification.
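For example, a hedged sketch of the note above (rfloraconfig1, rfsftconfig1, and sample_formatting_function are assumed to be defined as in the earlier examples, and rfloraconfig2 is a hypothetical second LoRA config): one base model and one training-args object, but a List of two PEFT configs, so this single RFModelConfig contributes two knob combinations to the config group.

RFModelConfig(
    model_name="meta-llama/Llama-3.1-8B-Instruct",
    training_args=rfsftconfig1,                        # exactly one training control flow
    peft_config=List([rfloraconfig1, rfloraconfig2]),  # two LoRA variants to compare
    formatting_func=sample_formatting_function,
    model_type="causal_lm",
)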