Fine-tune LLMs in 3 lines.
Headless LLM fine-tuning with smart defaults. Automatic hyperparameter tuning, VRAM-aware batch sizing, multi-run SLAO (Smart Loss-Aware Ordering) training to prevent catastrophic forgetting, and one-click GGUF export for Ollama. First-class Windows and CUDA support.
Quickstart
pip install backpropagate[standard]
# Train in 3 lines
from backpropagate import Trainer
trainer = Trainer("unsloth/Qwen2.5-7B-Instruct-bnb-4bit")
trainer.train("my_data.jsonl", steps=100)
trainer.export("gguf", quantization="q4_k_m") # Ready for Ollama
Multi-run SLAO
# SLAO: prevents catastrophic forgetting across long runs
from backpropagate import MultiRunTrainer
trainer = MultiRunTrainer(
    model="unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    strategy="slao",        # smart loss-aware ordering
    checkpoint_every=500,
    max_runs=5,
)
trainer.train("my_data.jsonl")
Export to Ollama
# Export GGUF and register with local Ollama
trainer.export(
    format="gguf",
    quantization="q4_k_m",  # q2_k / q4_k_m / q8_0 / f16
    register_ollama=True,   # auto-creates Modelfile
    model_name="my-model",  # ollama run my-model
)
Fine-tuning without the friction
Built for developers who want results, not configuration.
Smart defaults
Automatically configures learning rate, batch size, gradient accumulation, and LoRA rank based on your hardware and dataset size. No hyperparameter guesswork.
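If you do want control, the auto-picked values can presumably be overridden with explicit arguments. A minimal sketch, assuming the constructor accepts keyword overrides such as learning_rate, batch_size, and lora_rank (illustrative names, not confirmed API):
# Hypothetical overrides: the argument names below are assumptions, not documented API
from backpropagate import Trainer
trainer = Trainer(
    "unsloth/Qwen2.5-7B-Instruct-bnb-4bit",
    learning_rate=2e-4,  # override the auto-selected learning rate
    batch_size=4,        # override the VRAM-derived batch size
    lora_rank=16,        # override the dataset-size-derived LoRA rank
)
trainer.train("my_data.jsonl", steps=100)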
VRAM-aware training
Automatic batch sizing and gradient checkpointing keep training stable on any GPU. Built-in VRAM monitoring warns before out-of-memory errors. Works on cards from 8GB of VRAM up through multi-GPU setups.
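As a rough illustration of the idea (not the library's internal logic), a VRAM-aware heuristic can query free GPU memory and derive a batch size from an assumed per-sample cost:
# Illustrative sketch only, not backpropagate's implementation
import torch

def pick_batch_size(per_sample_gib: float = 1.5, floor: int = 1, cap: int = 32) -> int:
    """Pick a batch size from currently free VRAM, assuming a fixed cost per sample."""
    free_bytes, _total = torch.cuda.mem_get_info()
    free_gib = free_bytes / 1024**3
    return max(floor, min(cap, int(free_gib // per_sample_gib)))

print(pick_batch_size())  # e.g. 5 when ~8 GiB of VRAM is free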
First-class Windows
Tested and optimized for Windows + CUDA. Avoids the common PyTorch/Unsloth pitfalls on Windows. If it runs on Linux, it runs on Windows too.
Modular installation
Install only the dependencies you need.
Get started
Install
# Recommended
pip install backpropagate[standard]
# Minimal core only
pip install backpropagate
# All extras
pip install backpropagate[full]
# Requires: Python 3.10+ · CUDA GPU (8GB+ VRAM)
Basic training
from backpropagate import Trainer
# Smart defaults — no config needed
trainer = Trainer("unsloth/Qwen2.5-7B-Instruct-bnb-4bit")
trainer.train("my_data.jsonl", steps=100)
trainer.save("./my-model")
Multi-run SLAO
from backpropagate import MultiRunTrainer
# Prevents catastrophic forgetting
trainer = MultiRunTrainer(
    model="unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    strategy="slao",
    checkpoint_every=500,
    max_runs=5,
)
trainer.train("my_data.jsonl") Export to Ollama
trainer.export(
    format="gguf",
    quantization="q4_k_m",
    register_ollama=True,
    model_name="my-finetuned-model",
)
# Then use it locally:
# ollama run my-finetuned-model
Production-ready by design
Built for CI/CD pipelines, automated workflows, and long training runs.
Headless by design
No UI required. Runs in CI/CD pipelines, SSH sessions, and automated workflows. Full Python API with structured logging. Callbacks for progress tracking and early stopping.
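A callback might look like the sketch below; the callbacks= keyword and the (step, metrics) signature are assumptions for illustration, not documented API:
# Hypothetical callback usage: callbacks= and the (step, metrics) signature are assumptions
from backpropagate import Trainer

def log_progress(step, metrics):
    # Forward metrics to your CI log or experiment tracker
    print(f"step {step}: {metrics}")

trainer = Trainer("unsloth/Qwen2.5-7B-Instruct-bnb-4bit")
trainer.train("my_data.jsonl", steps=100, callbacks=[log_progress])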
Multi-run SLAO
Smart Loss-Aware Ordering prevents catastrophic forgetting during extended fine-tuning campaigns. Checkpoint-and-resume keeps long runs recoverable after crashes.
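A resume call could look like the sketch below; the resume_from argument is a hypothetical name used only to illustrate the checkpoint-and-resume workflow:
# Hypothetical resume flow: the resume_from argument is an assumption
from backpropagate import MultiRunTrainer

trainer = MultiRunTrainer(
    model="unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    strategy="slao",
    checkpoint_every=500,
)
# After an interruption, continue from the most recent checkpoint
trainer.train("my_data.jsonl", resume_from="./checkpoints/latest")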
LoRA + QLoRA + Unsloth
Supports LoRA, QLoRA (4-bit), and Unsloth-accelerated training. Mix quantization levels per layer. Export to GGUF at any quantization: q2_k, q4_k_m, q8_0, or f16.
Quality scorecard
Ship Gate audit: 23/31 checks completed, 14 skipped, 100% pass rate.