GitHub – hash-anu/snkv: SNKV- key value store using sqlite b-tree APIs

🔥 Check out this awesome post from Hacker News 📖

📂 **Category**:

💡 **What You’ll Learn**:

Build
Memory Leaks
Tests
Peak Memory
GitHub Issues
GitHub Closed Issues
License


SNKV is a lightweight, ACID-compliant embedded key-value store built directly on SQLite’s B-Tree storage engine — without SQL.

The idea: bypass the SQL layer entirely and talk directly to SQLite’s storage engine. No SQL parser. No query planner. No virtual machine. Just a clean KV API on top of a proven, battle-tested storage core.

SQLite-grade reliability. KV-first design. Lower overhead for read-heavy and mixed key-value workloads.


Single-header integration — drop it in and go:

#define SNKV_IMPLEMENTATION
#include "snkv.h"

int main(void) 💬

Use kvstore_open_v2 to control how the store is opened. Zero-initialise the
config and set only what you need — unset fields resolve to safe defaults.

KVStoreConfig cfg = ⚡;
cfg.journalMode = KVSTORE_JOURNAL_WAL;   /* WAL mode (default) */
cfg.syncLevel   = KVSTORE_SYNC_NORMAL;   /* survives process crash (default) */
cfg.cacheSize   = 4000;                  /* ~16 MB page cache (default 2000 ≈ 8 MB) */
cfg.pageSize    = 4096;                  /* DB page size, new DBs only (default 4096) */
cfg.busyTimeout = 5000;                  /* retry 5 s on SQLITE_BUSY (default 0) */
cfg.readOnly    = 0;                     /* read-write (default) */

KVStore *db;
kvstore_open_v2("mydb.db", &db, &cfg);

Field Default Options
journalMode KVSTORE_JOURNAL_WAL KVSTORE_JOURNAL_DELETE
syncLevel KVSTORE_SYNC_NORMAL KVSTORE_SYNC_OFF, KVSTORE_SYNC_FULL
cacheSize 2000 pages (~8 MB) Any positive integer
pageSize 4096 bytes Power of 2, 512–65536; new DBs only
readOnly 0 1 to open read-only
busyTimeout 0 (fail immediately) Milliseconds; useful for multi-process use

kvstore_open remains fully supported and uses all defaults except journalMode.


make              # builds libsnkv.a
make snkv.h       # generates single-header version
make examples     # builds examples
make run-examples # run all examples
make test         # run all tests (CI suite)
make clean

Windows (MSYS2 / MinGW64)

1. Install MSYS2.

2. Launch “MSYS2 MinGW 64-bit” from the Start menu (not the plain MSYS2 terminal).

3. Install the toolchain:

pacman -S --needed mingw-w64-x86_64-gcc make

4. Clone and build:

git clone https://github.com/hash-anu/snkv.git
cd snkv
make              # builds libsnkv.a
make snkv.h       # generates single-header
make examples     # builds .exe examples
make run-examples
make test

All commands must be run from the MSYS2 MinGW64 shell. Running mingw32-make from
a native cmd.exe or PowerShell window will not work — the Makefile relies on sh and
standard Unix tools that are only available inside the MSYS2 environment.


Available on PyPI — no compiler needed:

from snkv import KVStore

with KVStore("mydb.db") as db:
    db["hello"] = "world"
    print(db["hello"].decode())   # world

Full documentation — installation, API reference, examples, and thread-safety notes — is in
python/README.md.

SNKV Python API Demo


10 GB Crash-Safety Stress Test

A production-scale kill-9 test is included but kept separate from the CI suite.
It writes unique deterministic key-value pairs into a 10 GB WAL-mode database,
forcibly kills the writer with SIGKILL during active writes, and verifies on
restart that every committed transaction is present with byte-exact values, no
partial transactions are visible, and the database has zero corruption.

make test-crash-10gb          # run full 5-cycle kill-9 + verify (Linux / macOS)

# individual modes
./tests/test_crash_10gb write  tests/crash_10gb.db   # continuous writer
./tests/test_crash_10gb verify tests/crash_10gb.db   # post-crash verifier
./tests/test_crash_10gb clean  tests/crash_10gb.db   # remove DB files

Requires ~11 GB free disk. run mode is POSIX-only; write and verify work on all platforms.


Standard database path:

Application → SQL Parser → Query Planner → VDBE (VM) → B-Tree → Disk

SNKV path:

Application → KV API → B-Tree → Disk

By removing the layers you don’t need for key-value workloads, SNKV keeps the proven storage core and cuts the overhead.

Layer SQLite SNKV
SQL Parser
Query Planner
VDBE (VM)
B-Tree Engine
Pager / WAL


1M records, Linux, averaged across 3 runs.
Both SNKV and SQLite use identical settings: WAL mode, synchronous=NORMAL, 2000-page (8 MB) page cache, 4096-byte pages.

Benchmark source: SNKV · SQLite

SNKV vs SQLite (KV workloads)

SQLite benchmark uses WITHOUT ROWID with a BLOB primary key — the fairest possible comparison, both using a single B-tree keyed on the same field. Both run with identical settings: WAL mode, synchronous=NORMAL, 2000-page (8 MB) cache, 4096-byte pages. This isolates the pure cost of the SQL layer for KV operations.

Note: Both SNKV and SQLite (WITHOUT ROWID) use identical peak RSS (~10.8 MB) since they share the same underlying pager and page cache infrastructure.

Benchmark SQLite SNKV Notes
Sequential writes 140K ops/s 146K ops/s SNKV 1.05x faster
Random reads 87K ops/s 139K ops/s SNKV 1.6x faster
Sequential scan 1.61M ops/s 3.16M ops/s SNKV 2x faster
Random updates 17K ops/s 24K ops/s SNKV 1.4x faster
Random deletes 17K ops/s 20K ops/s SNKV 1.2x faster
Exists checks 87K ops/s 149K ops/s SNKV 1.7x faster
Mixed workload 35K ops/s 50K ops/s SNKV 1.4x faster
Bulk insert 211K ops/s 240K ops/s SNKV 1.1x faster

With identical storage configuration, SNKV wins across every benchmark. The gains come from two sources: bypassing the SQL layer (no parsing, no query planner, no VDBE) and a per-column-family cached read cursor that eliminates repeated cursor open/close overhead on the hot read path. The biggest wins are on read-heavy operations — random reads (+60%), exists checks (+70%), and sequential scan (+100%) — exactly where the cursor caching pays off most.


Running your own LMDB / RocksDB comparison

If you want to benchmark SNKV against LMDB or RocksDB, the benchmark harnesses are here:


SNKV is a good fit if:

  • Your workload is read-heavy or mixed (reads + writes)
  • You’re running in a memory-constrained or embedded environment
  • You want a clean KV API without writing SQL strings, preparing statements, and binding parameters
  • You need single-header C integration with no external dependencies
  • You want predictable latency — no compaction stalls, no mmap tuning

Consider alternatives if:

  • You need maximum write/update/delete throughput → RocksDB (LSM-tree)
  • You need maximum read/scan speed and memory isn’t a constraint → LMDB (memory-mapped)
  • You already use SQL elsewhere and want to consolidate → SQLite directly

  • ACID Transactions — commit / rollback safety

  • WAL Mode — concurrent readers + single writer

  • Column Families — logical namespaces within a single database

  • Iterators — ordered key traversal

  • Thread Safe — built-in synchronization

  • Single-header — drop snkv.h into any C/C++ project

  • Zero memory leaks — verified with Valgrind

  • SSD-friendly — WAL appends sequentially, reducing random writes

  • Python Bindings — idiomatic Python 3.8+ API with dict-style access, context managers, typed exceptions, and prefix iterators — see python/README.md


Backup & Tooling Compatibility

Because SNKV uses SQLite’s file format and pager layer, backup tools that operate at the WAL or page level work out of the box:

  • LiteFS — distributed SQLite replication works with SNKV databases
  • SQLite Online Backup API — operates at the page level, fully compatible
  • WAL-based backup tools — any tool consuming WAL files works correctly
  • Rollback journal tools — journal mode is fully supported

Note: Tools that rely on SQLite’s schema layer — like the sqlite3 CLI or DB Browser for SQLite — won’t work. SNKV bypasses the schema layer entirely by design.


Internals & Documentation

I documented the SQLite internals explored while building this:


  • Minimalism wins — fewer layers, less overhead
  • Proven foundations — reuse battle-tested storage, don’t reinvent it
  • Predictable performance — no hidden query costs, no compaction stalls
  • Honest tradeoffs — SNKV is not the fastest at everything; it’s optimized for its target use case

Apache License 2.0 © 2025 Hash Anu

{💬|⚡|🔥} **What’s your take?**
Share your thoughts in the comments below!

#️⃣ **#GitHub #hashanusnkv #SNKV #key #store #sqlite #btree #APIs**

🕒 **Posted on**: 1771938949

🌟 **Want more?** Click here for more info! 🌟

By

Leave a Reply

Your email address will not be published. Required fields are marked *