API: LoRA and Model Configs
RapidFire AI’s core APIs for model and adapter specifications are all thin wrappers around the corresponding APIs of the Hugging Face libraries transformers and PEFT (LoRA).
RFLoraConfig
This is a wrapper around LoraConfig in HF PEFT. The full signature and list of arguments are available on this page. The difference here is that the individual arguments (knobs) can be List valued or Range valued in an RFLoraConfig. That is how you can specify the base set of knob combinations from which a config group can be produced. Also read the Multi-Config Specification page.
Example:

# Singleton config
RFLoraConfig(
    r=128, lora_alpha=256, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], bias="none"
)

# 4 combinations
RFLoraConfig(
    r=List([16, 32]), lora_alpha=List([16, 32]), lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], bias="none"
)

# 2 combinations
RFLoraConfig(
    r=64, lora_alpha=128,
    target_modules=List([
        ["q_proj", "v_proj"],
        ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    ]),
    bias="none"
)
Notes:
In terms of impact on LLM behavior, the LoRA knobs that are usually experimented with are as follows:
- r (rank): The most critical knob. Typically a power of 2 between 8 and 128. Higher rank means higher adapter learning capacity but slightly higher GPU memory footprint and compute time.
- lora_alpha: Controls adaptation strength. Typical values are 16, 32, and 64. Usually set to 2x r, but this can be varied too.
- target_modules: Which layers to apply LoRA to. Common options:
  - Query/Value only: ["q_proj", "v_proj"]
  - Query/Key/Value: ["q_proj", "k_proj", "v_proj"]
  - All linear layers: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
- lora_dropout: Controls regularization. Typically between 0.0 and 0.05; often set at 0.0 unless there is overfitting.
- init_lora_weights: Initialization strategy. Most use the default True, but some experiment with "gaussian" or newer methods such as "pissa".
Most other knobs, such as bias, use_rslora, and modules_to_save, can be left at their PEFT defaults unless you want to explore specific advanced variations.
A common combination is to start with ranks 16 and 32 (resp. alphas 32 and 64) and target only the query/value projection modules. Then expand based on observed loss and eval metric behavior (overfitting or underfitting) and your time/compute constraints.
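For instance, that starting point can be written as a single RFLoraConfig with List-valued knobs. This is a minimal sketch, assuming RFLoraConfig and List are imported as in the tutorial notebooks; the variable name starter_lora is only illustrative.

# Sketch of the starting point above: ranks 16 and 32 crossed with alphas
# 32 and 64 (4 combinations in total, including the two "alpha = 2x r"
# pairings), applied only to the query/value projections.
starter_lora = RFLoraConfig(
    r=List([16, 32]),
    lora_alpha=List([32, 64]),
    lora_dropout=0.0,
    target_modules=["q_proj", "v_proj"],
    bias="none",
)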
RFModelConfig
This is a core class in the RapidFire AI API that abstracts multiple Hugging Face APIs under the hood to simplify and unify all model-related specifications. In particular, it unifies model loading, training configurations, and LoRA settings into one class.
It gives you flexibility to try out variations of LoRA adapter structures, training arguments for multiple control flows (SFT, DPO, and GRPO), formatting and metrics functions, and generation specifics.
Some of the arguments (knobs) here can also be List valued or Range valued depending on their data types, as explained below. All this helps form the base set of knob combinations from which a config group can be produced. Also read the Multi-Config Specification page.
- class RFModelConfig
- Parameters:
  - model_name (str) – Model identifier for use with Hugging Face’s AutoModel.from_pretrained(). Can be a Hugging Face model hub name (e.g., "Qwen/Qwen2.5-7B-Instruct") or a local path to a checkpoint.
  - tokenizer (str, optional) – Hugging Face tokenizer identifier, typically the same as the model_name string but can be different.
  - tokenizer_kwargs (Dict[str, Any], optional) – Additional keyword arguments passed to the tokenizer’s from_pretrained() method (e.g., padding_side, truncation, and model_max_length).
  - formatting_func (Callable | List of Callable, optional) – Custom user-given data preprocessing function for preparing a single example with system prompt, roles, etc. Can be a List for multi-config.
  - compute_metrics (Callable | List of Callable, optional) – Custom user-given evaluation function passed to Hugging Face’s Trainer.compute_metrics for use during evaluation phases. Can be a List for multi-config.
  - peft_config (RFLoraConfig | List of RFLoraConfig, optional) – RFLoraConfig as described above; thin wrapper around peft.LoraConfig for LoRA fine-tuning. Can be a List for multi-config.
  - training_args (RFSFTConfig | RFDPOConfig | RFGRPOConfig) – RF trainer configuration object specifying the training flow and its parameters. Also read the Trainer Configs page.
  - model_type (str, optional) – Custom user-defined string to identify the model type inside your create_model_fn() given to run_fit(); default: "causal_lm".
  - model_kwargs (Dict[str, Any], optional) – Additional parameters for model initialization, passed to AutoModel.from_pretrained() (e.g., torch_dtype, device_map, trust_remote_code).
  - ref_model_name (str, optional) – For DPO and GRPO only; akin to model_name above but for the frozen reference model.
  - ref_model_type (str, optional) – For DPO and GRPO only; akin to model_type above but for the frozen reference model.
  - ref_model_kwargs (Dict[str, Any], optional) – For DPO and GRPO only; akin to model_kwargs above but for the frozen reference model.
  - reward_funcs (Callable | [Callable] | List of Callable | List of [Callable], optional) – Reward functions for evaluating generated outputs during training (mainly for GRPO but can be used for DPO too). Can be a List for multi-config.
  - generation_config (Dict[str, Any], optional) – Arguments for text generation passed to model.generate() (e.g., max_new_tokens, temperature, top_p).
See also
- Hugging Face Transformers documentation
- Hugging Face PEFT library documentation
- RFSFTConfig, RFDPOConfig, and RFGRPOConfig for training argument configurations
Examples:

# Based on the SFT tutorial notebook
RFModelConfig(
    model_name="meta-llama/Llama-3.1-8B-Instruct",
    peft_config=rfloraconfig1,
    training_args=rfsftconfig1,
    model_type="causal_lm",
    model_kwargs={"device_map": "auto", "torch_dtype": "auto", "use_cache": False},
    formatting_func=sample_formatting_function,
    compute_metrics=sample_compute_metrics,
    generation_config={
        "max_new_tokens": 256, "temperature": 0.6, "top_p": 0.9,
        "top_k": 40, "repetition_penalty": 1.18,
    },
)
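The sample_formatting_function and sample_compute_metrics callables referenced in the SFT example above are defined in the tutorial notebook. The minimal sketches below show what such user-given functions can look like; the dataset field names, prompt template, and token-accuracy metric are illustrative assumptions, not the notebook’s exact code.

import numpy as np

def sample_formatting_function(example):
    # Turn one raw dataset example into a single prompt/response string.
    # The "question" and "answer" field names are assumed for illustration.
    return (
        "<|system|>You are a helpful assistant.\n"
        f"<|user|>{example['question']}\n"
        f"<|assistant|>{example['answer']}"
    )

def sample_compute_metrics(eval_pred):
    # Standard Hugging Face Trainer.compute_metrics signature: receives an
    # EvalPrediction with .predictions (assumed here to be token logits)
    # and .label_ids. For a causal LM, logits at position i predict the
    # token at position i + 1, hence the shift below.
    logits, labels = eval_pred.predictions, eval_pred.label_ids
    preds = np.argmax(logits, axis=-1)[:, :-1]
    labels = labels[:, 1:]
    mask = labels != -100  # positions masked out of the loss
    accuracy = float((preds[mask] == labels[mask]).mean())
    return {"token_accuracy": accuracy}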
# Based on the GRPO tutorial notebook
RFModelConfig(
    model_name="Qwen/Qwen2.5-7B-Instruct",
    peft_config=rfloraconfig,
    training_args=rfgrpoconfig1,
    formatting_func=sample_formatting_function,
    reward_funcs=reward_funcs,
    model_kwargs={"load_in_4bit": True, "device_map": "auto", "torch_dtype": "auto", "use_cache": False},
    tokenizer_kwargs={"model_max_length": 2048, "padding_side": "left", "truncation": True},
)
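Similarly, reward_funcs in the GRPO example is a list of user-defined reward functions from the tutorial notebook. The sketch below assumes the TRL-style convention of scoring a batch of string completions and returning one float per completion; the length heuristic itself is only an illustration, not the tutorial’s actual reward.

def length_reward(completions, **kwargs):
    # Score each generated completion; here, mildly prefer concise answers
    # (an assumed heuristic for illustration only).
    return [1.0 if len(c.split()) <= 128 else 0.0 for c in completions]

# reward_funcs, as passed to RFModelConfig above, is a list of such callables.
reward_funcs = [length_reward]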
Notes:
Note that one RFModelConfig object can have only one base model configuration and one set of training control flow arguments. But you can specify a List of PEFT configs, formatting functions, eval metrics functions, and reward function lists as part of your multi-config specification.
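For example, a single RFModelConfig can carry two alternative LoRA structures at once. The sketch below assumes lora_qv and lora_all are RFLoraConfig objects (e.g., targeting query/value only vs. all linear layers) and rfsftconfig1 is the RFSFTConfig from the earlier example; wrapping them in List is what makes this a two-config group.

# One base model and one training flow, but two alternative LoRA adapters.
RFModelConfig(
    model_name="meta-llama/Llama-3.1-8B-Instruct",
    peft_config=List([lora_qv, lora_all]),   # 2 knob combinations
    training_args=rfsftconfig1,              # exactly one training flow
    formatting_func=sample_formatting_function,
)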