What is LoRA?
LoRA (Low-Rank Adaptation) addresses a core problem with AI model customization: full fine-tuning requires tens of GB of GPU memory and hours of training, and produces a full-sized copy of the model for every variant. LoRA, introduced by Microsoft researchers in 2021, instead trains only a low-rank decomposition of the weight updates, typically 0.1–1% of the original parameter count.
This makes LoRAs practical to train on a single consumer GPU in under an hour and cheap to share (hundreds of MB instead of many GB). They've become the de facto customization format for Stable Diffusion, with thousands of community-trained LoRAs for specific art styles, characters, and photographic looks.
How it works
Low-rank decomposition
Instead of updating the full d×k weight matrix W (millions of parameters), LoRA trains two much smaller matrices, A (r×k) and B (d×r), where the rank r is small, so that the update ΔW = B·A has the same shape as W but far fewer trainable parameters.
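As a rough sketch of the parameter savings, here is the decomposition in NumPy; the layer size (4096×4096, typical of a large attention projection) and the rank r = 8 are illustrative assumptions, not values from the text:

```python
import numpy as np

d, k, r = 4096, 4096, 8           # hypothetical layer shape and LoRA rank

A = np.random.randn(r, k) * 0.01  # trained "down" projection, r x k
B = np.zeros((d, r))              # trained "up" projection, d x r

delta_W = B @ A                   # same shape as the full weight matrix W
assert delta_W.shape == (d, k)

full_params = d * k               # parameters in W itself
lora_params = r * (d + k)         # trainable parameters in A and B
print(f"trainable fraction: {lora_params / full_params:.2%}")  # 0.39%
```

At rank 8 the trainable fraction lands at about 0.4%, squarely inside the 0.1–1% range quoted above; higher ranks trade more parameters for more expressive updates.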
Mergeable at inference
At inference time, the LoRA's weights can be merged into the base model (W' = W + s·B·A, where s is the LoRA strength) with no runtime overhead.
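A small NumPy sketch of why merging is free at inference: folding s·B·A into W once up front gives the same output as running the extra low-rank path on every forward pass. All shapes and the strength value are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r, s = 64, 64, 4, 0.8        # s is the LoRA strength

W = rng.standard_normal((d, k))    # base model weights
A = rng.standard_normal((r, k))    # LoRA down projection
B = rng.standard_normal((d, r))    # LoRA up projection
x = rng.standard_normal(k)         # an input activation

# Unmerged: base path plus a separate low-rank path, every forward pass.
y_unmerged = W @ x + s * (B @ (A @ x))

# Merged once: a single matmul per forward pass, no runtime overhead.
W_merged = W + s * (B @ A)
y_merged = W_merged @ x

assert np.allclose(y_unmerged, y_merged)
```

The merge is also reversible: subtracting s·B·A recovers the original W, which is why tools can load and unload LoRAs from a base model without keeping a second copy of the weights.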
Stackable
Multiple LoRAs can be composed at different strengths — e.g., 0.8 strength for an art-style LoRA and 0.5 for a character LoRA, producing that character in that style.
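Composition is just a sum of independent low-rank updates, each scaled by its own strength. A minimal sketch, with the variable names (`style_*`, `char_*`) and strengths taken from the example above as assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, r = 64, 64, 4

W = rng.standard_normal((d, k))    # base model weights

# Two independently trained LoRAs (random stand-ins here).
style_B, style_A = rng.standard_normal((d, r)), rng.standard_normal((r, k))
char_B, char_A = rng.standard_normal((d, r)), rng.standard_normal((r, k))

# Each LoRA contributes its own update at its own strength.
W_stacked = W + 0.8 * (style_B @ style_A) + 0.5 * (char_B @ char_A)
assert W_stacked.shape == W.shape
```

Because the updates simply add, strengths act as mixing knobs; in practice the LoRAs were trained separately, so their updates can interfere, and strengths usually need tuning by eye.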
Common use cases
- Training a model on a specific person's likeness from 5–20 photos
- Teaching a model a specific art style (e.g., 1980s anime, watercolor)
- Creating brand-consistent generations across a whole marketing campaign
- Fine-tuning LLMs on domain-specific data without catastrophic forgetting