zeroecco/holos: docker compose for kvm/qemu


Docker compose for KVM. Define multi-VM stacks in a single YAML file. No libvirt, no XML, no distributed control plane.

The primitive is a VM, not a container. Every workload instance gets its own kernel boundary, its own qcow2 overlay, and its own cloud-init seed.

Requires Linux + /dev/kvm. macOS builds run the offline subcommands
(validate, import, images) so you can author compose files on a
laptop, but up/run need a real KVM host.

The shortest path — one disposable VM, no compose file:

holos run alpine
# prints the exact `holos exec` and `holos down` commands for the new VM

The next-shortest path — a single-service stack you can curl. Save as
holos.yaml:

name: hello

services:
  web:
    image: ubuntu:noble
    ports:
      - "8080:80"
    cloud_init:
      packages:
        - nginx
      write_files:
        - path: /var/www/html/index.html
          content: "hello from holos\n"
      runcmd:
        - systemctl restart nginx

holos up
curl localhost:8080                 # → hello from holos
holos down

That’s it — a working VM with a real package install, a config file,
and a host port forward. For multi-service stacks (depends_on, named
volumes, healthchecks, replicas), see examples/ and the
Compose File reference below.
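A minimal two-service sketch of those features together (service names, images, and sizes here are illustrative, not prescriptive):

```yaml
name: demo-stack

services:
  db:
    image: ubuntu:noble
    volumes:
      - dbdata:/var/lib/postgresql   # named volume, survives `holos down`
    healthcheck:
      test: ["pg_isready", "-U", "postgres"]
      interval: 2s
      retries: 30
  web:
    image: ubuntu:noble
    replicas: 2                      # becomes web-0 and web-1
    depends_on: [db]                 # waits for db to be healthy
    ports:
      - "8080:80"                    # host port auto-incremented per replica

volumes:
  dbdata:
    size: 10G
```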

holos up [-f holos.yaml]             start all services
holos run [flags] <image> [-- cmd...]
                                     launch a one-off VM from an image (no compose file)
holos down [-f holos.yaml]           stop and remove all services
holos ps [-f holos.yaml]             list running projects (-f narrows to one)
holos start [-f holos.yaml] [svc]    start a stopped service or all services
holos stop [-f holos.yaml] [svc]     stop a service or all services
holos console [-f holos.yaml] <instance>
                                     attach serial console to an instance
holos exec [-f holos.yaml] <instance> [cmd...]
                                     ssh into an instance (waits for sshd; project's generated key)
holos logs [-f holos.yaml] <svc|instance>
                                     show logs for a service (all replicas) or one instance
holos validate [-f holos.yaml]       validate compose file
holos pull <image>                   pull a cloud image (e.g. alpine, ubuntu:noble)
holos images                         list available images
holos devices [--gpu]                list PCI devices and IOMMU groups
holos install [-f holos.yaml] [--system] [--enable]
                                     emit a systemd unit so the project survives reboot
holos uninstall [-f holos.yaml] [--system]
                                     remove the systemd unit written by `holos install`
holos import [vm...] [--all] [--xml file] [--connect uri] [-o file]
                                     convert virsh-defined VMs into a holos.yaml

For one-off VMs you don’t want to write a compose file for, holos run
is the analogue of docker run:

holos run ubuntu:noble                          # bare VM, default 1 vCPU / 512 MB
holos run --vcpu 4 --memory 4G ubuntu:noble     # bigger box
holos run -p 8080:80 --pkg nginx \
  --runcmd 'systemctl enable --now nginx' alpine
holos run -v ./code:/srv ubuntu:noble           # bind mount
holos run --device 0000:01:00.0 ubuntu:noble    # PCI passthrough
holos run alpine -- echo hello world            # trailing args become a runcmd

The synthesised compose file is persisted under
state_dir/runs/<name>/holos.yaml, and the project name is auto-derived
from the image (override with --name). All other CLI verbs work
against it the same way they do for hand-written projects:

holos exec -f ~/.local/state/holos/runs/<name>/holos.yaml vm-0
holos console -f ~/.local/state/holos/runs/<name>/holos.yaml vm-0
holos down <name>

holos run exits as soon as the VM is started — VMs are always
detached, just like holos up. There is no foreground/-it mode;
shell in via holos exec (the recommended path), or attach to the
serial console with holos console for boot/kernel logs.

The login user is inferred from the image (debian:* → debian,
alpine → alpine, etc.) so holos exec works out of the box.
Cloud-init takes ~30s on first boot to actually create the account,
so holos console may briefly show “Login incorrect” before the
autologin kicks in. Override with --user if you want a
different account.

Flags:

| Flag | Description |
| --- | --- |
| --name NAME | project name (default: derived from image + random suffix) |
| --vcpu N | vCPU count (default 1) |
| --memory SIZE | memory, e.g. 512M, 2G (default 512M) |
| -p HOST:GUEST, --port | publish a port (repeatable) |
| -v SRC:TGT[:ro], --volume | bind mount (repeatable) |
| --device PCI | PCI passthrough (repeatable, auto-enables UEFI) |
| --pkg PKG | cloud-init package (repeatable) |
| --runcmd CMD | first-boot shell command (repeatable) |
| --user USER | cloud-init user (default ubuntu) |
| --dockerfile PATH | use a Dockerfile instead of (or with) an image |
| --uefi | force OVMF boot |

The holos.yaml format is deliberately similar to docker-compose:

  • services – each service is a VM with its own image, resources, and cloud-init config
  • depends_on – services start in dependency order
  • ports – "host:guest" syntax, auto-incremented across replicas
  • volumes – "./source:/target:ro" for bind mounts, "name:/target" for top-level named volumes
  • replicas – run N instances of a service
  • cloud_init – packages, write_files, runcmd — standard cloud-init
  • stop_grace_period – how long to wait for ACPI shutdown before SIGTERM/SIGKILL (e.g. "30s", "2m"); defaults to 30s
  • healthcheck – test, interval, retries, start_period, timeout to gate dependents
  • top-level volumes block – declare named data volumes that persist across holos down

holos stop and holos down send QMP system_powerdown to the guest
(equivalent to pressing the power button), then wait up to
stop_grace_period for QEMU to exit on its own. If the guest doesn’t
halt in time — or QMP is unreachable — the runtime falls back to SIGTERM,
then SIGKILL, matching docker-compose semantics.

services:
  db:
    image: ubuntu:noble
    stop_grace_period: 60s    # flush DB buffers before hard stop

Top-level volumes: declares named data stores that live as qcow2 files
under state_dir/volumes/ and are symlinked into each
instance’s work directory. They survive holos down — tearing down a
project only removes the symlink, never the backing file.

name: demo
services:
  db:
    image: ubuntu:noble
    volumes:
      - pgdata:/var/lib/postgresql

volumes:
  pgdata:
    size: 20G

Volumes attach as virtio-blk devices with a stable serial=vol-<name>,
so inside the guest they appear as /dev/disk/by-id/virtio-vol-pgdata.
Cloud-init runs an idempotent mkfs.ext4 + /etc/fstab snippet on
first boot so there’s nothing to configure by hand.
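Conceptually, the generated first-boot logic is equivalent to a cloud-config fragment like the following. This is an illustration of the idea only — holos writes its own version automatically, and you never add this by hand:

```yaml
runcmd:
  # format only if the volume has no filesystem yet (idempotent)
  - blkid /dev/disk/by-id/virtio-vol-pgdata || mkfs.ext4 /dev/disk/by-id/virtio-vol-pgdata
  # add an fstab entry once, then mount everything
  - grep -q virtio-vol-pgdata /etc/fstab || echo '/dev/disk/by-id/virtio-vol-pgdata /var/lib/postgresql ext4 defaults 0 2' >> /etc/fstab
  - mount -a
```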

Healthchecks and depends_on

A service with a healthcheck blocks its dependents from starting until
the check passes. The probe runs via SSH (same key holos exec uses):

services:
  db:
    image: postgres-cloud.qcow2
    healthcheck:
      test: ["pg_isready", "-U", "postgres"]
      interval: 2s
      retries: 30
      start_period: 10s
      timeout: 3s
  api:
    image: api.qcow2
    depends_on: [db]     # waits for db to be healthy

test: accepts either a list (exec form) or a string (wrapped in
sh -c). Set HOLOS_HEALTH_BYPASS=1 to skip the actual probe — handy
for CI environments without in-guest SSHD.
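For example, the string form of the same probe (wrapped in sh -c) looks like this:

```yaml
services:
  db:
    image: postgres-cloud.qcow2
    healthcheck:
      test: pg_isready -U postgres   # string form, run via sh -c
      interval: 2s
      retries: 30
```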

Every holos up auto-generates a per-project SSH keypair under
state_dir/ssh/<project>/ and injects the public key via cloud-init.
A host port is allocated for each instance and forwarded to guest port
22, so you can:

holos exec web-0                 # interactive shell
holos exec db-0 -- pg_isready    # one-off command

-u overrides the login user. The default is the service’s
explicit cloud_init.user, falling back to the image’s conventional
cloud user (debian for debian:*, alpine for alpine:*,
fedora, arch, ubuntu), and finally ubuntu for local images.
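To pin the login user explicitly instead of relying on inference, set it in the service’s cloud-init block (the user name here is illustrative):

```yaml
services:
  web:
    image: debian
    cloud_init:
      user: ops        # `holos exec` will log in as ops
```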

On a fresh VM holos exec waits up to 60s for sshd to be ready
before attempting the handshake, masking the brief window where
cloud-init regenerates host keys and bounces sshd. Use -w 0 to
opt out, or -w 5m to wait longer for slow first boots.

Emit a systemd unit so a project comes back up after the host reboots:

holos install --enable           # per-user, no sudo needed
holos install --system --enable  # host-wide, before any login
holos install --dry-run          # print the unit and exit

User units land under ~/.config/systemd/user/holos-<project>.service;
system units under /etc/systemd/system/. holos uninstall reverses
it (and is idempotent — safe to call twice).

Every service can reach every other service by name. Under the hood:

  • Each VM gets two NICs: user-mode (for host port forwarding) and socket multicast (for inter-VM L2)
  • Static IPs are assigned automatically on the internal 10.10.0.0/24 segment
  • /etc/hosts is populated via cloud-init so db, web-0, web-1 all resolve
  • No libvirt. No bridge configuration. No root required for inter-VM networking.
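The populated /etc/hosts ends up looking roughly like this (the addresses are illustrative; holos assigns them automatically from 10.10.0.0/24):

```text
10.10.0.2   db db-0
10.10.0.3   web-0
10.10.0.4   web-1
```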

Pass physical GPUs (or any PCI device) directly to a VM via VFIO:

services:
  ml:
    image: ubuntu:noble
    vm:
      vcpu: 8
      memory_mb: 16384
    devices:
      - pci: "01:00.0"       # GPU
      - pci: "01:00.1"       # GPU audio
    ports:
      - "8888:8888"

What holos handles:

  • UEFI boot is enabled automatically when devices are present (OVMF firmware)
  • kernel-irqchip=on is set on the machine for NVIDIA compatibility
  • Per-instance OVMF_VARS copy so each VM has its own EFI variable store
  • Optional rom_file for custom VBIOS ROMs
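A sketch combining a device entry with a custom ROM (the PCI address and ROM path are illustrative, and the exact placement of rom_file under the device entry is an assumption based on the device-list syntax above):

```yaml
services:
  ml:
    image: ubuntu:noble
    devices:
      - pci: "01:00.0"
        rom_file: ./patched-vbios.rom   # optional custom VBIOS
```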

What you handle (host setup):

  • Enable IOMMU in BIOS and kernel (intel_iommu=on or amd_iommu=on)
  • Bind the GPU to vfio-pci driver
  • Run holos devices --gpu to find PCI addresses and IOMMU groups

Use pre-built cloud images instead of building your own:

services:
  web:
    image: alpine           # auto-pulled and cached
  api:
    image: ubuntu:noble     # specific tag
  db:
    image: debian           # defaults to debian:12

Available: alpine, arch, debian, ubuntu, fedora. Run holos images to see all tags.

Use a Dockerfile to provision a VM. RUN, COPY, ENV, and WORKDIR instructions are converted into a shell script that runs via cloud-init:

services:
  api:
    dockerfile: ./Dockerfile
    ports:
      - "3000:3000"
FROM ubuntu:noble

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y nodejs npm
COPY server.js /opt/app/
WORKDIR /opt/app
RUN npm init -y && npm install express

When image is omitted, the base image is taken from the Dockerfile’s FROM line. The Dockerfile’s instructions run before any cloud_init.runcmd entries.

Supported: FROM, RUN, COPY, ENV, WORKDIR. Unsupported instructions (CMD, ENTRYPOINT, EXPOSE, etc.) are silently skipped. COPY sources are resolved relative to the Dockerfile’s directory and must be files, not directories — use volumes for directory mounts.

Pass arbitrary flags straight to qemu-system-x86_64 with extra_args:

services:
  gpu:
    image: ubuntu:noble
    vm:
      vcpu: 4
      memory_mb: 8192
      extra_args:
        - "-device"
        - "virtio-gpu-pci"
        - "-display"
        - "egl-headless"

Arguments are appended after all holos-managed flags. No validation — you own it.

| Field | Default |
| --- | --- |
| replicas | 1 |
| vm.vcpu | 1 |
| vm.memory_mb | 512 |
| vm.machine | q35 |
| vm.cpu_model | host |
| cloud_init.user | ubuntu |
| image_format | inferred from extension |
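Spelled out, a service that sets none of these fields is equivalent to the following (per the defaults above; the service name and image are illustrative):

```yaml
services:
  web:
    image: ubuntu:noble
    replicas: 1
    vm:
      vcpu: 1
      memory_mb: 512
      machine: q35
      cpu_model: host
    cloud_init:
      user: ubuntu
```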

Already running VMs under libvirt? holos import reads libvirt domain
XML and emits an equivalent holos.yaml so you can move existing
workloads onto holos without retyping every field.

holos import web-prod db-prod                # via `virsh dumpxml`
holos import --all -o holos.yaml             # every defined domain
holos import --xml ./web.xml                 # offline, no virsh needed
holos import --connect qemu:///system api    # non-default libvirt URI

The mapping covers the fields holos has a direct equivalent for:

| libvirt | holos |
| --- | --- |
| `<vcpu>` | vm.vcpu |
| `<memory>` / `<currentMemory>` | vm.memory_mb |
| `<os>` `<type machine=...>` | vm.machine (collapsed) |
| `<cpu mode="host-passthrough">` | vm.cpu_model: host |
| OVMF `<loader>` (pflash) | vm.uefi: true |
| first `<disk>` | image: + image_format: |
| `<hostdev>` (PCI) | devices: |

Anything holos can’t translate cleanly — extra disks, bridged NICs,
USB passthrough, custom emulators — is reported as a warning on stderr
so you know what to revisit before holos up. Output goes to stdout
unless you pass -o, so it composes with shell redirection
(holos import vm > holos.yaml).

Pre-built binaries (Linux + macOS, amd64 + arm64) are attached to every
GitHub release:

TAG=v0.1.0
curl -L https://github.com/zeroecco/holos/releases/download/$TAG/holos_${TAG#v}_Linux_x86_64.tar.gz \
  | sudo tar -xz -C /usr/local/bin holos
holos version

Or build from source (see below).

Linux is the only runtime target — holos up needs /dev/kvm and
qemu-system-x86_64. macOS builds exist so the offline subcommands
(validate, import, images) work for compose-file authoring on
a laptop.

go build -o bin/holos ./cmd/holos

Releases are produced by GoReleaser on every
v* git tag (see .github/workflows/release.yml):

git tag -a v0.1.0 -m "v0.1.0"
git push origin v0.1.0

The workflow cross-compiles four targets, packages them with the
LICENSE/NOTICE/README.md, computes SHA-256 checksums, drafts
release notes from the commit log, and publishes a GitHub release.

To rehearse locally without publishing:

goreleaser release --snapshot --clean --skip=publish
ls dist/

Host requirements (mkosi is needed only if you build the base guest image yourself):

  • /dev/kvm
  • qemu-system-x86_64
  • qemu-img
  • One of cloud-localds, genisoimage, mkisofs, or xorriso
  • mkosi (only for building the base image)

kex_exchange_identification: read: Connection reset by peer

sshd accepted the TCP connection but closed it before the SSH
handshake completed. On a fresh VM this almost always means
cloud-init is still regenerating host keys and bouncing sshd —
the listener is briefly flapping, not broken.

holos exec waits up to 60s for sshd to be ready by default; if
you hit this immediately after holos run or holos up, give it
another 30s and retry. Use -w 5m to wait longer for slow first
boots, or -w 0 to disable the wait entirely.

If the error persists past two minutes, attach the serial console
(holos console <instance>) and look for cloud-init failures.

Console shows Login incorrect and Password: repeatedly

Same window as above. The serial-getty autologin retries the
configured user (e.g. debian) before cloud-init has actually
created the account, so the first few attempts fail. Watch the
console log for cloud-init … finished — after that line the
autologin succeeds and you land in a shell.

For interactive shell access the supported path is holos exec,
which uses the project’s auto-generated SSH key over a forwarded
port. The serial console is meant for boot/kernel diagnostics, not
day-to-day operation — cloud images don’t ship with a console
password and we don’t add one.

holos needs /dev/kvm and qemu-system-x86_64 to actually launch
VMs, and KVM is a Linux kernel feature. macOS builds exist so the
offline subcommands (validate, import, images, pull) work
for compose-file authoring on a laptop, but up and run only
work on Linux hosts.

Run holos against a remote KVM host via SSH, or use a Linux VM /
dev container for actual workload execution.

This is not Kubernetes. It does not try to solve:

  • Multi-host clustering
  • Live migration
  • Service meshes
  • Overlay networks
  • Scheduler, CRDs, or control plane quorum

The goal is to make KVM workable for single-host stacks without importing the operational shape of Kubernetes.

Licensed under the Apache License, Version 2.0. See
NOTICE for attribution.
