GitHub – hash-anu/snkv: SNKV, a key-value store using SQLite B-tree APIs

SNKV is a lightweight, ACID-compliant embedded key-value store built directly on SQLite’s B-Tree storage engine — without SQL.

The idea: bypass the SQL layer entirely and talk directly to SQLite’s storage engine. No SQL parser. No query planner. No virtual machine. Just a clean KV API on top of a proven, battle-tested storage core.

SQLite-grade reliability. KV-first design. Lower overhead for read-heavy and mixed key-value workloads.

Single-header integration — drop it in and go:

#define SNKV_IMPLEMENTATION
#include "snkv.h"

int main(void) { /* ... */ }

Use kvstore_open_v2 to control how the store is opened. Zero-initialise the
config and set only what you need — unset fields resolve to safe defaults.

KVStoreConfig cfg = {0};
cfg.journalMode = KVSTORE_JOURNAL_WAL;   /* WAL mode (default) */
cfg.syncLevel   = KVSTORE_SYNC_NORMAL;   /* survives process crash (default) */
cfg.cacheSize   = 4000;                  /* ~16 MB page cache (default 2000 ≈ 8 MB) */
cfg.pageSize    = 4096;                  /* DB page size, new DBs only (default 4096) */
cfg.busyTimeout = 5000;                  /* retry 5 s on SQLITE_BUSY (default 0) */
cfg.readOnly    = 0;                     /* read-write (default) */

KVStore *db;
kvstore_open_v2("mydb.db", &db, &cfg);

Field	Default	Options
`journalMode`	`KVSTORE_JOURNAL_WAL`	`KVSTORE_JOURNAL_DELETE`
`syncLevel`	`KVSTORE_SYNC_NORMAL`	`KVSTORE_SYNC_OFF`, `KVSTORE_SYNC_FULL`
`cacheSize`	2000 pages (~8 MB)	Any positive integer
`pageSize`	4096 bytes	Power of 2, 512–65536; new DBs only
`readOnly`	0	1 to open read-only
`busyTimeout`	0 (fail immediately)	Milliseconds; useful for multi-process use
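
The megabyte figures quoted for `cacheSize` follow from multiplying the page count by the page size. A small Python sketch of that arithmetic; the helper name `cache_bytes` is made up for illustration and is not part of SNKV:

```python
def cache_bytes(cache_pages: int, page_size: int = 4096) -> int:
    """Approximate page-cache footprint: number of pages times bytes per page."""
    return cache_pages * page_size


for pages in (2000, 4000):
    mb = cache_bytes(pages) / 1_000_000
    print(f"{pages} pages x 4096 B = {mb:.1f} MB")
```

With 4096-byte pages this gives roughly 8 MB for 2000 pages and 16 MB for 4000 pages, matching the comments in the config example. Changing `pageSize` scales the footprint accordingly.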

kvstore_open remains fully supported and uses all defaults except journalMode.

make              # builds libsnkv.a
make snkv.h       # generates single-header version
make examples     # builds examples
make run-examples # run all examples
make test         # run all tests (CI suite)
make clean

Windows (MSYS2 / MinGW64)

1. Install MSYS2.

2. Launch “MSYS2 MinGW 64-bit” from the Start menu (not the plain MSYS2 terminal).

3. Install the toolchain:

pacman -S --needed mingw-w64-x86_64-gcc make

4. Clone and build:

git clone https://github.com/hash-anu/snkv.git
cd snkv
make              # builds libsnkv.a
make snkv.h       # generates single-header
make examples     # builds .exe examples
make run-examples
make test

All commands must be run from the MSYS2 MinGW64 shell. Running mingw32-make from
a native cmd.exe or PowerShell window will not work — the Makefile relies on sh and
standard Unix tools that are only available inside the MSYS2 environment.

Available on PyPI — no compiler needed:

from snkv import KVStore

with KVStore("mydb.db") as db:
    db["hello"] = "world"
    print(db["hello"].decode())   # world

Full documentation — installation, API reference, examples, and thread-safety notes — is in
python/README.md.

10 GB Crash-Safety Stress Test

A production-scale kill-9 test is included but kept separate from the CI suite.
It writes unique deterministic key-value pairs into a 10 GB WAL-mode database,
forcibly kills the writer with SIGKILL during active writes, and verifies on
restart that every committed transaction is present with byte-exact values, no
partial transactions are visible, and the database has zero corruption.

make test-crash-10gb          # run full 5-cycle kill-9 + verify (Linux / macOS)

# individual modes
./tests/test_crash_10gb write  tests/crash_10gb.db   # continuous writer
./tests/test_crash_10gb verify tests/crash_10gb.db   # post-crash verifier
./tests/test_crash_10gb clean  tests/crash_10gb.db   # remove DB files

Requires ~11 GB of free disk space. The `run` mode is POSIX-only; `write` and `verify` work on all platforms.
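
A verifier can only check byte-exact values if it can recompute what each key should hold. One way to get that property, shown here as a hypothetical Python sketch and not as the scheme `test_crash_10gb` actually uses, is to derive every key and value from a sequence number. The names `make_pair` and `verify` and the in-memory `dict` standing in for the store are assumptions for illustration:

```python
from __future__ import annotations

import hashlib


def make_pair(seq: int, value_len: int = 64) -> tuple[bytes, bytes]:
    """Derive a unique key and a reproducible value from a sequence number."""
    key = f"key-{seq:012d}".encode()
    # SHAKE-256 yields a digest of any requested length, so the value is
    # fully determined by the key and can be regenerated by a verifier.
    value = hashlib.shake_256(key).digest(value_len)
    return key, value


def verify(stored: dict[bytes, bytes], committed_upto: int) -> bool:
    """Check that every pair below committed_upto is present and byte-exact."""
    return all(
        stored.get(key) == value
        for key, value in map(make_pair, range(committed_upto))
    )


# A plain dict stands in for the database in this sketch.
store = dict(make_pair(i) for i in range(1000))
print(verify(store, 1000))
```

Because nothing but the sequence number is needed to regenerate a pair, the verifier does not need a side log of what the killed writer wrote.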

Standard database path:

Application → SQL Parser → Query Planner → VDBE (VM) → B-Tree → Disk

SNKV path:

Application → KV API → B-Tree → Disk

By removing the layers you don’t need for key-value workloads, SNKV keeps the proven storage core and cuts the overhead.

Layer	SQLite	SNKV
SQL Parser	✅	❌
Query Planner	✅	❌
VDBE (VM)	✅	❌
B-Tree Engine	✅	✅
Pager / WAL	✅	✅

1M records, Linux, averaged across 3 runs.
Both SNKV and SQLite use identical settings: WAL mode, synchronous=NORMAL, 2000-page (8 MB) page cache, 4096-byte pages.

Benchmark source: SNKV · SQLite

SNKV vs SQLite (KV workloads)

The SQLite benchmark uses a WITHOUT ROWID table with a BLOB primary key, which is the fairest possible comparison: both engines use a single B-tree keyed on the same field. Both run with identical settings: WAL mode, synchronous=NORMAL, a 2000-page (8 MB) cache, and 4096-byte pages. This isolates the cost of the SQL layer for KV operations.

Note: SNKV and SQLite (WITHOUT ROWID) show the same peak RSS (~10.8 MB) because they share the same underlying pager and page cache infrastructure.
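
For readers who want to see what the SQLite side of this comparison looks like, here is a minimal sketch using Python's standard `sqlite3` module. It sets up a WITHOUT ROWID table with a BLOB primary key under the settings listed above. The table name `kv`, the column names, and the file name are assumptions; the actual benchmark harness may be organised differently.

```python
import os
import sqlite3
import tempfile

# Hypothetical file and table names; the real benchmark harness may differ.
path = os.path.join(tempfile.mkdtemp(), "baseline.db")
conn = sqlite3.connect(path)

# page_size must be set before the database file is first written.
conn.execute("PRAGMA page_size = 4096")
journal_mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
conn.execute("PRAGMA synchronous = NORMAL")
conn.execute("PRAGMA cache_size = 2000")  # positive value = number of pages

# One B-tree keyed directly on the BLOB key, with no separate rowid table.
conn.execute("CREATE TABLE kv (k BLOB PRIMARY KEY, v BLOB) WITHOUT ROWID")

with conn:  # commits on success, rolls back on exception
    conn.execute("INSERT INTO kv (k, v) VALUES (?, ?)", (b"hello", b"world"))

value = conn.execute("SELECT v FROM kv WHERE k = ?", (b"hello",)).fetchone()[0]
page_size = conn.execute("PRAGMA page_size").fetchone()[0]
print(journal_mode, page_size, value)
conn.close()
```

Every `execute` call here passes through SQLite's parser, planner, and VDBE before it reaches the B-tree, which is the overhead the benchmark is trying to measure.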

Benchmark	SQLite	SNKV	Notes
Sequential writes	140K ops/s	146K ops/s	SNKV 1.05x faster
Random reads	87K ops/s	139K ops/s	SNKV 1.6x faster
Sequential scan	1.61M ops/s	3.16M ops/s	SNKV 2x faster
Random updates	17K ops/s	24K ops/s	SNKV 1.4x faster
Random deletes	17K ops/s	20K ops/s	SNKV 1.2x faster
Exists checks	87K ops/s	149K ops/s	SNKV 1.7x faster
Mixed workload	35K ops/s	50K ops/s	SNKV 1.4x faster
Bulk insert	211K ops/s	240K ops/s	SNKV 1.1x faster

With identical storage configuration, SNKV wins across every benchmark. The gains come from two sources: bypassing the SQL layer (no parsing, no query planner, no VDBE) and a per-column-family cached read cursor that eliminates repeated cursor open/close overhead on the hot read path. The biggest wins are on read-heavy operations — random reads (+60%), exists checks (+70%), and sequential scan (+100%) — exactly where the cursor caching pays off most.
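
The ratios in the Notes column can be recomputed from the throughput figures. The Python sketch below does that using the numbers copied from the table, expressed in thousands of ops/s; the variable names are chosen for illustration only.

```python
# Throughput figures copied from the table above, in thousands of ops/s,
# stored as (SQLite, SNKV).
results = {
    "Sequential writes": (140, 146),
    "Random reads": (87, 139),
    "Sequential scan": (1610, 3160),
    "Random updates": (17, 24),
    "Random deletes": (17, 20),
    "Exists checks": (87, 149),
    "Mixed workload": (35, 50),
    "Bulk insert": (211, 240),
}

speedups = {name: snkv / sqlite for name, (sqlite, snkv) in results.items()}

for name, ratio in speedups.items():
    print(f"{name:<18} {ratio:.2f}x  ({(ratio - 1) * 100:+.0f}%)")
```

Because the published throughput values are already rounded, the recomputed ratios can differ slightly from the Notes column, for example about 1.04x instead of 1.05x for sequential writes and about +96% instead of +100% for sequential scan.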

Running your own LMDB / RocksDB comparison

If you want to benchmark SNKV against LMDB or RocksDB, the benchmark harnesses are here:

SNKV is a good fit if:

Your workload is read-heavy or mixed (reads + writes)
You’re running in a memory-constrained or embedded environment
You want a clean KV API without writing SQL strings, preparing statements, and binding parameters
You need single-header C integration with no external dependencies
You want predictable latency — no compaction stalls, no mmap tuning

Consider alternatives if:

You need maximum write/update/delete throughput → RocksDB (LSM-tree)
You need maximum read/scan speed and memory isn’t a constraint → LMDB (memory-mapped)
You already use SQL elsewhere and want to consolidate → SQLite directly

ACID Transactions — commit / rollback safety
WAL Mode — concurrent readers + single writer
Column Families — logical namespaces within a single database
Iterators — ordered key traversal
Thread Safe — built-in synchronization
Single-header — drop snkv.h into any C/C++ project
Zero memory leaks — verified with Valgrind
SSD-friendly — WAL appends sequentially, reducing random writes
Python Bindings — idiomatic Python 3.8+ API with dict-style access, context managers, typed exceptions, and prefix iterators — see python/README.md
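
The "concurrent readers + single writer" behaviour comes from SQLite's WAL mode. The sketch below demonstrates it with stock SQLite through Python's `sqlite3` module, not through SNKV's own API: a reader that has started a transaction keeps seeing its snapshot while a writer commits, and sees the new value once it starts a fresh read. The table name `kv` and the helper `read` are assumptions for illustration.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "wal_demo.db")


def read(conn: sqlite3.Connection, key: bytes) -> bytes:
    """Fetch one value; fetchall() runs the statement to completion."""
    rows = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchall()
    return rows[0][0]


# isolation_level=None puts the connections in autocommit mode, so
# transactions begin and end only where the code says so.
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("PRAGMA journal_mode = WAL")
writer.execute("CREATE TABLE kv (k BLOB PRIMARY KEY, v BLOB) WITHOUT ROWID")
writer.execute("INSERT INTO kv VALUES (?, ?)", (b"a", b"1"))

reader = sqlite3.connect(path, isolation_level=None)
reader.execute("BEGIN")
before = read(reader, b"a")   # the first read pins the reader's snapshot

# The writer commits while the reader's transaction is still open.
writer.execute("UPDATE kv SET v = ? WHERE k = ?", (b"2", b"a"))

during = read(reader, b"a")   # same transaction, so still the old snapshot
reader.execute("COMMIT")
after = read(reader, b"a")    # a new read sees the committed update

print(before, during, after)
reader.close()
writer.close()
```

In rollback-journal mode the same sequence would make the writer wait for the reader or fail with SQLITE_BUSY, which is why WAL is the default journal mode listed in the configuration table.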

Backup & Tooling Compatibility

Because SNKV uses SQLite’s file format and pager layer, backup tools that operate at the WAL or page level work out of the box:

✅ LiteFS — distributed SQLite replication works with SNKV databases
✅ SQLite Online Backup API — operates at the page level, fully compatible
✅ WAL-based backup tools — any tool consuming WAL files works correctly
✅ Rollback journal tools — journal mode is fully supported

Note: Tools that rely on SQLite’s schema layer — like the sqlite3 CLI or DB Browser for SQLite — won’t work. SNKV bypasses the schema layer entirely by design.

Internals & Documentation

I documented the SQLite internals explored while building this:

Minimalism wins — fewer layers, less overhead
Proven foundations — reuse battle-tested storage, don’t reinvent it
Predictable performance — no hidden query costs, no compaction stalls
Honest tradeoffs — SNKV is not the fastest at everything; it’s optimized for its target use case

🕒 **Posted on**: 1771938949
