Fine-tune LLMs in 3 lines.
Headless LLM fine-tuning with smart defaults: automatic hyperparameter tuning, VRAM-aware batch sizing, multi-run SLAO (Smart Loss-Aware Ordering) training to prevent catastrophic forgetting, and one-click GGUF export for Ollama. First-class Windows and CUDA support.
Quickstart
pip install backpropagate[standard]
# Train in 3 lines
from backpropagate import Trainer
trainer = Trainer("unsloth/Qwen2.5-7B-Instruct-bnb-4bit")
trainer.train("my_data.jsonl", steps=100)
trainer.export("gguf", quantization="q4_k_m") # Ready for Ollama
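The exact schema `Trainer.train()` expects from `my_data.jsonl` is not shown here; a chat-style "messages" layout is a common convention for instruction-tuning JSONL, so the sketch below writes and sanity-checks a file in that assumed shape.

```python
import json

# Hypothetical example rows; the "messages" schema is an assumption, not a
# documented backpropagate requirement — check the library docs for the
# layout Trainer.train() actually parses.
rows = [
    {"messages": [
        {"role": "user", "content": "What is 2 + 2?"},
        {"role": "assistant", "content": "2 + 2 = 4."},
    ]},
    {"messages": [
        {"role": "user", "content": "Name a prime number."},
        {"role": "assistant", "content": "7 is prime."},
    ]},
]

# JSONL = one JSON object per line.
with open("my_data.jsonl", "w", encoding="utf-8") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

# Sanity-check: every line parses back and carries the expected key.
with open("my_data.jsonl", encoding="utf-8") as f:
    parsed = [json.loads(line) for line in f]
assert all("messages" in row for row in parsed)
print(len(parsed))  # → 2
```

Whatever schema the library expects, validating each line parses as JSON before training catches malformed data early.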
Fine-tuning without the friction
Built for developers who want results, not configuration.
Smart defaults
Automatically configures learning rate, batch size, gradient accumulation, and LoRA rank based on your hardware and dataset size. No hyperparameter guesswork.
VRAM-aware training
Auto batch sizing and gradient checkpointing keep training stable on any GPU. Built-in VRAM monitoring warns before an OOM. Works on anything from a single 8GB card up to multi-GPU setups.
First-class Windows
Tested and optimized for Windows + CUDA. Avoids the common PyTorch/Unsloth pitfalls on Windows. If it runs on Linux, it runs on Windows too.
Modular installation
Install only the dependencies you need.
Get started
Install
# Recommended
pip install backpropagate[standard]
# Minimal core only
pip install backpropagate
# All extras
pip install backpropagate[full]
# Requires: Python 3.10+ · CUDA GPU (8GB+ VRAM)
Basic training
from backpropagate import Trainer
# Smart defaults — no config needed
trainer = Trainer("unsloth/Qwen2.5-7B-Instruct-bnb-4bit")
trainer.train("my_data.jsonl", steps=100)
trainer.save("./my-model")
Multi-run SLAO
from backpropagate.multi_run import MultiRunTrainer
runner = MultiRunTrainer(
model="unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
num_runs=5, steps_per_run=100,
merge_mode="slao",
)
result = runner.run("my_data.jsonl")
Export to Ollama
from backpropagate.export import export_gguf, register_with_ollama
result = export_gguf(model, tokenizer, "./output", quantization="q4_k_m")
register_with_ollama(result.path, model_name="my-model")
# ollama run my-model
Production-ready by design
Built for CI/CD pipelines, automated workflows, and long training runs.
Headless by design
No UI required. Runs in CI/CD pipelines, SSH sessions, and automated workflows. Full Python API with structured logging. Callbacks for progress tracking and early stopping.
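The callback interface is not specified in this overview, so here is a minimal sketch of what an early-stopping callback could look like; the `on_step` hook name and how callbacks attach to the trainer are assumptions, not the library's confirmed API.

```python
class EarlyStopping:
    """Hypothetical callback: request a stop when loss stops improving.

    The hook name (on_step) is an assumption — consult the backpropagate
    API for the real callback protocol.
    """

    def __init__(self, patience: int = 3):
        self.patience = patience      # tolerated non-improving steps
        self.best = float("inf")      # best loss seen so far
        self.bad_steps = 0            # consecutive steps without improvement

    def on_step(self, step: int, loss: float) -> bool:
        """Return False to request that training stop."""
        if loss < self.best:
            self.best = loss
            self.bad_steps = 0
        else:
            self.bad_steps += 1
        return self.bad_steps < self.patience

# Drive the callback with a synthetic loss curve that plateaus after step 2:
cb = EarlyStopping(patience=3)
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74]
stopped_at = next(i for i, l in enumerate(losses) if not cb.on_step(i, l))
print(stopped_at)  # → 5: the third consecutive non-improving step
```

Because the callback is plain Python with no UI dependency, the same object works identically in a CI job, an SSH session, or a cron-driven training script.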
Multi-run SLAO
Smart Loss-Aware Ordering prevents catastrophic forgetting during extended fine-tuning campaigns. Checkpoint-and-resume keeps long runs recoverable after crashes.
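SLAO's internals are not documented in this overview. One plausible reading of "loss-aware ordering" is that data chunks are scheduled by the current model's measured loss, so the least-learned material is revisited first; the sketch below illustrates only that scheduling idea and should not be taken as backpropagate's actual algorithm.

```python
def loss_aware_order(chunks, loss_fn):
    """Illustrative sketch only — not backpropagate's real SLAO logic.

    Scores each data chunk with a loss function and schedules the
    highest-loss (least-learned) chunks first for the next run, so earlier
    material keeps being revisited rather than forgotten.
    """
    scored = [(loss_fn(chunk), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored]

# Toy demo with fake per-chunk losses standing in for model evaluation:
fake_losses = {"run1_data": 0.3, "run2_data": 1.2, "run3_data": 0.7}
order = loss_aware_order(list(fake_losses), fake_losses.__getitem__)
print(order)  # → ['run2_data', 'run3_data', 'run1_data']
```

Prioritizing high-loss material is a standard replay heuristic against catastrophic forgetting: low-loss chunks are already well learned and need less repetition.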
LoRA + QLoRA + Unsloth
Supports LoRA, QLoRA (4-bit), and Unsloth-accelerated training. Mix quantization levels per layer. Export to GGUF at any quantization: q2_k, q4_k_m, q8_0, or f16.
Quality scorecard
Ship Gate audit: 23 of 35 checks run, 12 skipped, 100% of run checks passing.