kossisoroyce/timber: Ollama for classical ML models. AOT compiler that turns XGBoost, LightGBM, scikit-learn, CatBoost & ONNX models into native C99 inference code. One command to load, one command to serve. 336x faster than Python inference.


Ollama for classical ML models.

Timber compiles trained tree-based models (XGBoost, LightGBM, scikit-learn, CatBoost, ONNX) into optimized native C and serves them over a local HTTP API.

  • No Python runtime in the inference hot path
  • Native latency (microseconds)
  • One command to load, one command to serve
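The compilation idea itself is simple: a trained tree is just nested threshold comparisons, so it can be emitted as hard-coded branches instead of a generic tree walk. A minimal illustration in Python (Timber's actual output is C99, and the tree below is made up):

# Illustration only: what "compiling" a decision tree means.
# Feature indices, thresholds, and leaf values are baked into the code,
# so prediction is plain branching with no model object or interpreter.

def tree_0(x):
    if x[2] < 0.5:
        if x[7] < 1.25:
            return 0.13
        return -0.02
    return 0.41

def predict_margin(x):
    # A boosted ensemble sums per-tree outputs (one function per tree).
    return tree_0(x)  # + tree_1(x) + ... for the remaining trees

print(predict_margin([0.0] * 8))  # -> 0.13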

📚 Docs: https://kossisoroyce.github.io/timber/

Timber is built for teams that need fast, predictable, portable inference:

  • Fraud/risk teams running classical models in low-latency transaction paths
  • Edge/IoT teams deploying models to gateways and embedded devices
  • Regulated industries (finance, healthcare, automotive) needing deterministic artifacts and audit trails
  • Platform/infra teams replacing Python model-serving overhead with native binaries
Quickstart

pip install timber-compiler
# Load any supported model (auto-detected)
timber load model.json --name fraud-detector

# Serve it (Ollama-style workflow)
timber serve fraud-detector
curl http://localhost:11434/api/predict \
  -d '{"model": "fraud-detector", "input": [...]}'  # illustrative body; see the docs for the exact schema
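If you need a model file to try this with, here is a minimal sketch that produces a loadable model.json (assuming XGBoost and scikit-learn are installed; examples/quickstart_xgboost.py covers the same ground):

# Train a small XGBoost classifier and export it in JSON form for `timber load`.
from sklearn.datasets import load_breast_cancer
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
model = XGBClassifier(n_estimators=50, max_depth=4)
model.fit(X, y)
model.save_model("model.json")  # the .json extension selects XGBoost's JSON format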

Supported Model Formats

| Format | Framework | File types |
|---|---|---|
| XGBoost JSON | XGBoost | .json |
| LightGBM text | LightGBM | .txt, .model, .lgb |
| scikit-learn pickle | scikit-learn | .pkl, .pickle |
| ONNX ML opset (TreeEnsemble) | ONNX | .onnx |
| CatBoost JSON | CatBoost | .json |
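For reference, the export calls that typically produce these file types (a sketch assuming each framework is installed; ONNX exports usually come from a converter such as skl2onnx and are not shown):

# Produce one artifact per supported format from a toy dataset.
import pickle
import lightgbm as lgb
from catboost import CatBoostClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)

# LightGBM text format
lgb.train({"objective": "binary"}, lgb.Dataset(X, label=y), num_boost_round=10).save_model("model.txt")

# scikit-learn pickle (only unpickle artifacts you trust)
with open("model.pkl", "wb") as f:
    pickle.dump(RandomForestClassifier(n_estimators=10).fit(X, y), f)

# CatBoost JSON export (not the native binary format)
CatBoostClassifier(iterations=10, verbose=False).fit(X, y).save_model("cat_model.json", format="json")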

Benchmarks (Methodology + Reproducibility)

The 336× claim is measured against Python XGBoost single-sample inference.

  • Hardware: Apple M2 Pro, 16 GB RAM, macOS (recorded by script)
  • Model: XGBoost binary classifier, 50 trees, max depth 4, 30 features
  • Dataset: breast_cancer (sklearn)
  • Warmup: 1,000 iterations
  • Timed: 10,000 single-sample predictions
  • Metric: in-process latency (not HTTP/network round-trip)
  • Baseline: Python XGBoost (booster.predict)
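A sketch of the baseline side of that measurement, assuming a model.json like the one above (the actual harness lives in benchmarks/run_benchmarks.py):

# Time single-sample, in-process XGBoost predictions with warmup.
import time
import xgboost as xgb
from sklearn.datasets import load_breast_cancer

X, _ = load_breast_cancer(return_X_y=True)
booster = xgb.Booster(model_file="model.json")
row = xgb.DMatrix(X[:1])  # one sample, 30 features

for _ in range(1_000):  # warmup
    booster.predict(row)

start = time.perf_counter()
for _ in range(10_000):  # timed single-sample predictions
    booster.predict(row)
mean_us = (time.perf_counter() - start) / 10_000 * 1e6
print(f"mean in-process latency: {mean_us:.1f} µs")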

See benchmarks/ for:

  • run_benchmarks.py (Timber vs Python XGBoost + optional ONNX Runtime/Treelite/lleaves)
  • system_info.py (hardware/software metadata)
  • render_table.py (markdown table output)

Run:

python benchmarks/run_benchmarks.py --output benchmarks/results.json
python benchmarks/render_table.py --input benchmarks/results.json

Runtime Comparison

| Runtime | Runtime deps | Typical artifact size | Latency profile | Notes |
|---|---|---|---|---|
| Timber | None (generated C99) | ~48 KB (example model) | ~2 µs native call | Strong fit for edge/embedded and deterministic deployments |
| Python (xgboost/sklearn serving) | Python + framework stack | 50–200+ MB process footprint | 100s of µs to ms | Easy dev loop, high runtime overhead |
| ONNX Runtime | ONNX Runtime libs | MBs to 10s of MBs | usually low 100s of µs | Broad model ecosystem, larger runtime |
| Treelite Runtime | Treelite runtime + compiled artifact | MB-scale runtime + model lib | low latency when compiled | Great for GBDTs; separate compile/runtime flow |
| lleaves | Python package + LightGBM text model | Python runtime + compiled code | lower than pure Python | LightGBM-focused |

Limitations / Known Issues

  • ONNX support is currently focused on TreeEnsembleClassifier/Regressor operators.
  • CatBoost support expects JSON exports (not native binary formats).
  • scikit-learn parser supports major tree estimators and pipelines; uncommon/custom estimator wrappers may fail.
  • Pickle parsing follows Python pickle semantics — only load trusted artifacts.
  • XGBoost support is JSON-model based. Binary booster formats are not the primary input path.
  • Optional benchmark backends (ONNX Runtime, Treelite, lleaves) are skipped unless installed/configured.
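A simple complement to the pickle caveat (and to audit-trail requirements generally) is to pin every artifact to a recorded digest before loading it. A sketch, with the expected digest left as a placeholder:

# Refuse to load a model artifact whose SHA-256 doesn't match the recorded value.
import hashlib

EXPECTED_SHA256 = "..."  # record this when the artifact is produced

def verify_artifact(path: str) -> None:
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    if digest != EXPECTED_SHA256:
        raise ValueError(f"digest mismatch for {path}: got {digest}")

verify_artifact("model.pkl")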

API Endpoints (serve mode)

| Endpoint | Method | Description |
|---|---|---|
| /api/predict | POST | Run inference |
| /api/generate | POST | Alias for /api/predict (Ollama compatibility) |
| /api/models | GET | List loaded models |
| /api/model/:name | GET | Get model metadata |
| /api/health | GET | Health check |
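Assuming a server started with timber serve fraud-detector, a minimal client sketch (field names in the predict payload are illustrative; see the docs for the exact schema):

# Query the local Timber server over its HTTP API.
import requests

BASE = "http://localhost:11434"

print(requests.get(f"{BASE}/api/health").json())   # health check
print(requests.get(f"{BASE}/api/models").json())   # list loaded models

# Inference; the payload shape here is illustrative, not canonical.
resp = requests.post(f"{BASE}/api/predict",
                     json={"model": "fraud-detector", "input": [[0.1] * 30]})
print(resp.json())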

Roadmap

  • Improve framework/version compatibility coverage (including more edge-case model exports)
  • Broaden ONNX operator support beyond tree ensembles
  • Strengthen embedded deployment profiles (ARM Cortex-M / RISC-V presets)
  • Add richer benchmark matrices and public reproducibility reports
  • Expand safety/regulatory tooling around audit + MISRA-C workflows

End-to-end runnable examples live in examples/:

  • quickstart_xgboost.py
  • quickstart_lightgbm.py
  • quickstart_sklearn.py

They generate model files you can load immediately with timber load.

Timber includes a full technical paper: paper/timber_paper.pdf. To cite it:

@misc{royce2026timber,
  title        = {Timber: Compiling Classical Machine Learning Models to Native Inference Binaries},
  author       = {Kossiso Royce},
  year         = {2026},
  howpublished = {GitHub repository and technical paper},
  institution  = {Electricsheep Africa},
  url          = {https://github.com/kossisoroyce/timber}
}
Development (editable install + tests):

pip install -e ".[dev]"
pytest tests/ -v

License: Apache-2.0
