Beginner’s Guide
This guide walks you through your first fine-tuning run with Backpropagate, from zero to a working Ollama model. No prior experience with LoRA, GGUF, or training pipelines is required.
1. What is fine-tuning?
Large language models (LLMs) like Qwen and Llama are trained on broad internet text. Fine-tuning teaches an existing model new behavior using your own data: customer support logs, code examples, domain-specific Q&A, or any conversational dataset. Instead of retraining billions of parameters from scratch, Backpropagate uses LoRA (Low-Rank Adaptation) to train a small set of adapter weights that modify the base model’s behavior. This is faster than full retraining, uses far less GPU memory, and produces an adapter you can export and run locally with Ollama.
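To make "a small set of adapter weights" concrete, here is a minimal sketch of the arithmetic behind LoRA for a single weight matrix. The hidden size of 4096 and the rank of 16 are illustrative assumptions, not Backpropagate defaults or properties of a specific model.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in one dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank, d_in) and B is (d_out, rank).

    The adapted layer computes W x + scale * B (A x); W stays frozen and
    only A and B are trained.
    """
    return rank * d_in + d_out * rank


hidden = 4096  # illustrative hidden size (assumption)
rank = 16      # illustrative LoRA rank (assumption)

full = full_param_count(hidden, hidden)
adapter = lora_param_count(hidden, hidden, rank)
print(f"full matrix: {full:,} parameters")
print(f"LoRA pair:   {adapter:,} parameters ({adapter / full:.2%} of full)")
```

Under these assumptions the adapter holds well under 1% of the parameters of the matrix it modifies, which is where the memory savings come from.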
2. Prerequisites
Before you start, make sure you have:
- Python 3.10 or newer: check with python --version
- A CUDA GPU with 8 GB+ VRAM: NVIDIA RTX 3060 or better. Check with nvidia-smi
- PyTorch 2.0+ with CUDA support: install from pytorch.org
- Ollama (optional): for running your exported model locally. Install from ollama.com
If you are on Windows, Backpropagate handles the common PyTorch/CUDA pitfalls automatically (multiprocessing crashes, xformers incompatibilities, dataloader issues).
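The prerequisite checks above can also be run from Python. This is a rough preflight sketch, not part of Backpropagate; the helper names python_ok and cuda_vram_gb are made up for illustration. It relies only on sys.version_info and PyTorch's torch.cuda functions, and it returns None when PyTorch or CUDA is missing.

```python
import sys


def python_ok(minimum=(3, 10)) -> bool:
    """True when the running interpreter meets the minimum (major, minor)."""
    return sys.version_info[:2] >= tuple(minimum)


def cuda_vram_gb():
    """Total VRAM of GPU 0 in GiB, or None if PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


print("Python 3.10+:", python_ok())
vram = cuda_vram_gb()
print("CUDA VRAM (GiB):", "not detected" if vram is None else f"{vram:.1f}")
```

Note that a card sold as 8 GB usually reports slightly less than 8 GiB here, so treat the number as a guide, not a hard pass/fail threshold.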
3. Installation
Install Backpropagate with the recommended extras. pipx is the recommended install path: it puts Backpropagate in its own isolated environment with PATH integration, so you don’t have to manage a virtualenv:
pipx install "backpropagate[standard]"
This gives you the core library plus Unsloth (2× faster training) and the Reflex web interface. Other install paths:
uv tool install "backpropagate[standard]" # uv's equivalent, faster install
pip install "backpropagate[standard]" # if you already manage a venv
If you only want the Python API with no extras:
pipx install backpropagate
Verify the install:
backprop info
This prints your Python version, GPU details, VRAM, and which optional features are available.
4. Prepare your dataset
Backpropagate accepts JSONL files with conversation data. The simplest format is OpenAI-style messages:
{"messages": [{"role": "user", "content": "What is LoRA?"}, {"role": "assistant", "content": "LoRA stands for Low-Rank Adaptation..."}]}
{"messages": [{"role": "user", "content": "How do I export to GGUF?"}, {"role": "assistant", "content": "Use trainer.export('gguf')..."}]}
Save this as my_data.jsonl. Each line is one conversation. Aim for at least 100 examples for a meaningful fine-tune, though 500+ is better.
Backpropagate also auto-detects ShareGPT, Alpaca, and ChatML formats, so use whatever you have. The repo ships an examples/quickstart.jsonl (5 ShareGPT examples) you can use to verify your install before bringing your own data.
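Before training, it can help to sanity-check that every line of your JSONL file parses and has the OpenAI-style shape shown above. The sketch below is an independent checker, not Backpropagate's own validation. The set of allowed roles and the helper names check_line and check_file are assumptions for illustration.

```python
import json

# Roles accepted by this sketch (assumption; adjust to your data).
ALLOWED_ROLES = {"system", "user", "assistant"}


def check_line(line: str) -> list[str]:
    """Return a list of problems found in one JSONL line (empty if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    messages = record.get("messages") if isinstance(record, dict) else None
    if not isinstance(messages, list) or not messages:
        return ['missing or empty "messages" list']
    problems = []
    for i, message in enumerate(messages):
        if not isinstance(message, dict):
            problems.append(f"message {i}: not an object")
            continue
        if message.get("role") not in ALLOWED_ROLES:
            problems.append(f"message {i}: unexpected role {message.get('role')!r}")
        content = message.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"message {i}: empty or non-string content")
    return problems


def check_file(path: str) -> dict[int, list[str]]:
    """Map 1-based line numbers to their problems, skipping blank lines."""
    report = {}
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if line.strip():
                problems = check_line(line)
                if problems:
                    report[number] = problems
    return report


good = '{"messages": [{"role": "user", "content": "What is LoRA?"}, {"role": "assistant", "content": "LoRA stands for Low-Rank Adaptation..."}]}'
print(check_line(good))
print(check_line('{"messages": []}'))
```

Calling check_file("my_data.jsonl") returns an empty dict when every line passes. This only covers the messages format; ShareGPT, Alpaca, and ChatML files would need their own checks.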
5. Train your first model
Three lines of Python:
from backpropagate import Trainer

trainer = Trainer("Qwen/Qwen2.5-7B-Instruct")
trainer.train("my_data.jsonl", steps=100)
trainer.save("./my-model")
What happens behind the scenes:
- The model downloads from HuggingFace (first run only, cached afterward)
- Backpropagate detects your GPU VRAM and picks a safe batch size
- LoRA adapters are applied to the model’s attention layers
- Training runs for 100 steps with cosine learning rate scheduling
- The trained adapter is saved to ./my-model
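The cosine learning rate scheduling mentioned in the steps above can be sketched as follows. This is the textbook cosine decay with no warmup, which is an assumption; Backpropagate's actual schedule may add warmup or a non-zero floor. The peak rate of 2e-4 is taken from the sample log output in this guide, and the function name cosine_lr is illustrative.

```python
import math


def cosine_lr(step: int, total_steps: int, peak_lr: float, min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr at step 0 to min_lr at total_steps (no warmup)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Learning rate at a few points of a 100-step run.
for step in (0, 25, 50, 75, 100):
    print(step, f"{cosine_lr(step, total_steps=100, peak_lr=2e-4):.2e}")
```

The rate starts at the peak, passes through half the peak at the midpoint, and reaches the floor on the final step, so the largest updates happen early and the run settles gently.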
You can also train from the command line:
backprop train --data my_data.jsonl --steps 100
Or use the web UI:
backprop ui
If you plan to share the UI on a public URL (backprop ui --share), you also need --auth user:password; see the troubleshooting page for the reasoning. Local-only backprop ui (no --share) needs no auth.
What you’ll see
A successful first run prints something like:
run_started run_id=8f3a2c1d-9e4b-4c5a-...
Trainer initialized: Qwen/Qwen2.5-7B-Instruct
  LoRA: r=256, alpha=512
  Batch: 2, LR: 0.0002
  Degradation knobs: oom_recovery=True, unsloth_fallback=True
Training: [####################] 100% loss=0.42 steps=100
Saved to ./output/lora
run_ended run_id=8f3a2c1d-... duration_seconds=412.3
After the run, your output directory has:
my-model/
├── adapter_config.json <- adapter metadata
├── adapter_model.safetensors <- the trained LoRA weights
└── tokenizer.json <- copied from the base model
To know it worked: adapter_model.safetensors should be a few hundred MB to ~1.5 GB on a 7B base (v1.3 default is rank 256 + all-linear; pass --lora-preset=fast for the v1.2.x rank-16 ~50–200 MB footprint), and backprop info should show no errors. If the loss decreased over the run (you’ll see logging lines every 10 steps), the model learned something.
If something went wrong, see the troubleshooting page — it’s keyed by what you actually saw in stderr.
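The file check described above can be scripted. This is a small sketch that looks only for the three files listed in this guide and reports the adapter size; the helper name inspect_adapter_dir is made up for illustration, and a real output directory may contain more files than these.

```python
from pathlib import Path

# Files this guide lists in the output directory.
EXPECTED_FILES = ("adapter_config.json", "adapter_model.safetensors", "tokenizer.json")


def inspect_adapter_dir(directory: str) -> dict:
    """Report which expected files are missing and the adapter size in MB."""
    root = Path(directory)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    weights = root / "adapter_model.safetensors"
    size_mb = weights.stat().st_size / 1024**2 if weights.is_file() else None
    return {"missing": missing, "adapter_size_mb": size_mb}


print(inspect_adapter_dir("./my-model"))
```

An empty "missing" list plus an adapter size in the range quoted above is a reasonable sign that the save step completed.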
6. Export and run with Ollama
Once training is done, export to GGUF and register with Ollama:
# Export to GGUF (quantized for fast local inference)
result = trainer.export("gguf", quantization="q4_k_m")

# Register with Ollama
from backpropagate import register_with_ollama
register_with_ollama(result.path, "my-finetuned-model")
Now run it:
ollama run my-finetuned-model
The q4_k_m quantization gives a good balance between file size and quality. For higher quality at larger file size, use q8_0. For the smallest file, use q2_k.
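If you ever need to register a GGUF file by hand, Ollama's own route is a Modelfile with a FROM line followed by ollama create. The sketch below shows that manual path under stated assumptions: the helper names and the example path ./my-model.gguf are hypothetical, and register_with_ollama may do more than this (for example, setting a chat template or parameters).

```python
import subprocess
from pathlib import Path


def build_modelfile(gguf_path: str) -> str:
    """Minimal Ollama Modelfile that points at a local GGUF file."""
    return f"FROM {Path(gguf_path).resolve()}\n"


def create_ollama_model(gguf_path: str, name: str, workdir: str = ".") -> None:
    """Write a Modelfile and run `ollama create NAME -f Modelfile`."""
    modelfile = Path(workdir) / "Modelfile"
    modelfile.write_text(build_modelfile(gguf_path), encoding="utf-8")
    subprocess.run(["ollama", "create", name, "-f", str(modelfile)], check=True)


# Preview the Modelfile for a hypothetical export path (assumption).
print(build_modelfile("./my-model.gguf"), end="")

# To actually register the model (requires Ollama to be installed):
# create_ollama_model("./my-model.gguf", "my-finetuned-model")
```

Prefer register_with_ollama or the --ollama flag for normal use; the manual path is mainly useful for debugging a registration that did not behave as expected.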
CLI equivalent for export:
backprop export ./my-model --format gguf --quantization q4_k_m --ollama --ollama-name my-finetuned-model
7. Next steps
Once you have a working fine-tune, here are ways to improve:
- More data: Fine-tuning quality scales with dataset size and diversity. 1,000+ high-quality examples produce noticeably better results than 100.
- Multi-run SLAO training: Prevents catastrophic forgetting during longer training by merging LoRA adapters between runs. Use trainer.multi_run() instead of trainer.train() for extended fine-tuning.
- Training presets: Use get_preset("balanced") or get_preset("quality") from backpropagate.config for research-backed hyperparameter combinations.
- Dataset quality tools: The backpropagate.datasets module offers deduplication, perplexity filtering, and curriculum learning to improve your training data before training.
- GPU monitoring: For long training runs, GPUMonitor watches temperature and VRAM, pausing training before your hardware hits dangerous levels.
- Experiment tracking: Install the [monitoring] extra to log training runs to Weights & Biases.
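To give a feel for what deduplication means in the list above, here is a standalone sketch of exact-match deduplication over OpenAI-style JSONL lines. It is not the backpropagate.datasets implementation, whose API is not shown in this guide. The normalization rule (lowercase, collapsed whitespace) and the helper names are assumptions.

```python
import hashlib
import json


def conversation_key(record: dict) -> str:
    """Hash of the case- and whitespace-normalized message contents."""
    parts = []
    for message in record.get("messages", []):
        text = " ".join(str(message.get("content", "")).lower().split())
        parts.append(f"{message.get('role')}:{text}")
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


def dedupe_jsonl(lines):
    """Yield the first occurrence of each distinct conversation, in order."""
    seen = set()
    for line in lines:
        if not line.strip():
            continue
        key = conversation_key(json.loads(line))
        if key not in seen:
            seen.add(key)
            yield line


sample = [
    '{"messages": [{"role": "user", "content": "What is LoRA?"}]}',
    '{"messages": [{"role": "user", "content": "what is  lora?"}]}',
    '{"messages": [{"role": "user", "content": "How do I export to GGUF?"}]}',
]
unique = list(dedupe_jsonl(sample))
print(len(sample), "->", len(unique))
```

This only catches duplicates that match after normalization. Near-duplicate detection, such as paraphrases, needs fuzzier techniques than a hash.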
For detailed coverage of each topic, see the Training, Export, and Reference pages.