LoRA

2 sources - 10 claims

LoRA reduces the number of trainable parameters by freezing the backbone and training low-rank adapter matrices: each adapted transformer layer keeps its weights frozen and learns two low-rank matrices, B and A. In the reported experiments, adapters were inserted into the query, key, value, and output projection matrices, with LoRA rank 16 and LoRA alpha 32. Only the LoRA matrices and the classification head weights were trained; the backbone stayed frozen. This configuration trained about 13.7 million parameters, roughly 0.34% of the total model weights.

Among the compared PEFT methods, LoRA achieved the best aggregate metrics and the fastest training, although it also used the most memory. Its advantage over IA3 was attributed to the fact that LoRA modifies the weight matrices through low-rank updates, whereas IA3 only rescales activations.

In memory-constrained settings, the frozen backbone is stored in four-bit precision, QLoRA-style, while the adapters are trained in mixed precision. Communication cost is a separate concern: repeatedly transmitting LoRA adapter updates can remain expensive across many federated rounds.
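The low-rank update and the parameter arithmetic described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the configuration used in the sources: the `alpha / rank` scaling follows the common LoRA convention, and the backbone shape (32 layers, hidden size 4096, key/value width 1024), the 2-byte adapter precision, and the client and round counts are hypothetical values chosen only to make the numbers concrete. All function names are likewise illustrative.

```python
# Sketch only: shapes, precision, and federated settings are hypothetical,
# not taken from the sources. Rank and alpha are the values reported in the text.
RANK = 16
ALPHA = 32


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lora_forward(x, w_frozen, a, b, alpha=ALPHA, rank=RANK):
    """Return W x + (alpha / rank) * B (A x); only A and B would be trained."""
    scale = alpha / rank
    base = matvec(w_frozen, x)        # frozen backbone path
    update = matvec(b, matvec(a, x))  # low-rank path: down to `rank`, then back up
    return [y + scale * u for y, u in zip(base, update)]


def lora_param_count(d_in, d_out, rank=RANK):
    """Trainable parameters for one adapted matrix: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


def adapter_upload_bytes(n_params, bytes_per_param, rounds, clients):
    """Total bytes sent if every client uploads the full adapter once per round."""
    return n_params * bytes_per_param * rounds * clients


# Hypothetical backbone dimensions (assumption, see lead-in).
HIDDEN, KV_DIM, LAYERS = 4096, 1024, 32
per_layer = (
    lora_param_count(HIDDEN, HIDDEN)    # query projection
    + lora_param_count(HIDDEN, KV_DIM)  # key projection
    + lora_param_count(HIDDEN, KV_DIM)  # value projection
    + lora_param_count(HIDDEN, HIDDEN)  # output projection
)
total_adapter_params = per_layer * LAYERS
print("adapter parameters:", total_adapter_params)

# Hypothetical federated run: 10 clients, 100 rounds, 2 bytes per parameter.
traffic = adapter_upload_bytes(total_adapter_params, 2, rounds=100, clients=10)
print("upload traffic (GB):", traffic / 1e9)

# Tiny forward pass: with B set to zero the adapter leaves the frozen output unchanged.
x = [1, 2]
w = [[1, 0], [0, 1]]
print(lora_forward(x, w, a=[[1, 1]], b=[[0], [0]], alpha=2, rank=1))
```

With these assumed shapes the adapters alone come to 13,631,488 parameters, the same order of magnitude as the roughly 13.7 million reported, though the sources do not state the backbone dimensions or the size of the classification head. The traffic estimate shows why adapter updates can still add up: under the assumed settings, about 27 MB per upload grows to roughly 27 GB over the whole run.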