Upgrades & migrations — greatmemory docs

greatmemory is built to upgrade in place without disrupting a running service. This page covers what happens to your data on a version bump, how to roll out a new version with zero downtime, and how to roll back safely.

What happens on upgrade

When a new binary opens an existing database it runs any pending schema migrations automatically — you don't run a separate migrate step.

Stepwise and transactional. Migrations apply in order, each in its own transaction. A failure rolls that step back cleanly; the database is never left half-migrated.
Auto-snapshot first (SQLite). Before applying migrations to existing data, greatmemory writes a <db>.pre-v<N>.bak snapshot next to the database, so a bad upgrade can be undone by restoring that file.
Downgrade protection. If a database has a newer schema than the binary understands (for example after a rollback), greatmemory refuses to open it with a clear message instead of silently operating on — and corrupting — an unknown schema.
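To make the three behaviours above concrete, here is a minimal Python sketch of the same pattern using the standard sqlite3 module. It is an illustration, not greatmemory's actual migration code: the MIGRATIONS list, the table and column names, the use of PRAGMA user_version to track the schema version, and the reading of <N> as the target version are all assumptions.

```python
import shutil
import sqlite3
from pathlib import Path

# Hypothetical migrations for illustration; entry i upgrades the schema to version i + 1.
MIGRATIONS = [
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, body TEXT NOT NULL)",
    "ALTER TABLE memories ADD COLUMN created_at TEXT",
]


def open_and_migrate(db_path, migrations=MIGRATIONS):
    """Open db_path, snapshot existing data, and apply pending migrations in order."""
    path = Path(db_path)
    existed = path.exists()
    # isolation_level=None leaves transaction control to the explicit BEGIN/COMMIT below.
    conn = sqlite3.connect(path, isolation_level=None)
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    target = len(migrations)

    # Downgrade guard: refuse a schema newer than this code understands.
    if current > target:
        conn.close()
        raise RuntimeError(
            f"database schema v{current} is newer than supported v{target}"
        )

    # Snapshot only when there is existing data and something to apply.
    # Assumption: <N> in the snapshot name is the version being migrated to.
    if existed and current < target:
        conn.close()
        shutil.copy2(path, f"{path}.pre-v{target}.bak")
        conn = sqlite3.connect(path, isolation_level=None)

    for version in range(current + 1, target + 1):
        conn.execute("BEGIN")
        try:
            # The schema change and the version bump commit together or not at all.
            conn.execute(migrations[version - 1])
            conn.execute(f"PRAGMA user_version = {version}")
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            conn.close()
            raise
    return conn
```

Because each step bumps the version inside the same transaction as its schema change, a failure leaves the database at the last fully applied version, and the next start resumes from there.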

Readiness: the key to zero downtime

The server exposes two probes:

Endpoint	Meaning	Use for
`GET /v1/healthz`	process is up (liveness)	restart-on-crash checks
`GET /v1/readyz`	storage reachable and migrated (readiness)	load-balancer / rolling-deploy health checks

readyz returns 200 only once the new instance can actually serve, and 503 while it is starting up or migrating. Both probes are always unauthenticated, so they work even with GM_API_KEYS set.

Point your platform's health check at /v1/readyz. During a rolling deploy the orchestrator starts the new instance, waits for readyz to go green, shifts traffic, then drains and stops the old one — so requests are only ever sent to an instance that's ready.
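If your platform has no built-in health check, or you want to gate a deploy script on readiness, the same wait can be sketched in a few lines of Python. This is a minimal example using only the standard library; the function names, the timeouts, and the base URL you pass in are assumptions, and only the /v1/readyz path and its 200/503 semantics come from this page.

```python
import time
import urllib.error
import urllib.request


def probe_status(url, timeout=2.0):
    """Return the HTTP status of a GET to url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 503 while starting up or migrating
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, or timeout


def wait_until_ready(url, deadline_s=60.0, interval_s=1.0,
                     probe=probe_status, sleep=time.sleep, clock=time.monotonic):
    """Poll url until it returns 200; return True if that happens before the deadline."""
    end = clock() + deadline_s
    while True:
        if probe(url) == 200:
            return True
        if clock() >= end:
            return False
        sleep(interval_s)
```

A deploy script would call wait_until_ready with the new instance's address followed by /v1/readyz, and shift traffic only if it returns True. The probe, sleep, and clock parameters exist so the loop can be exercised without a network.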

Rolling upgrades by backend

Postgres (true zero-downtime)

Postgres deployments can run several instances against one database, so a standard rolling deploy works:

Roll out the new version one instance at a time (ECS/Cloud Run/Container Apps do this by default; set the health check to /v1/readyz).
The first new instance applies any pending migrations on startup; the others see an up-to-date schema.
Old and new instances briefly run side by side, which is safe as long as migrations are additive (see "The additive (expand/contract) rule").

SQLite (single-writer → brief restart)

A SQLite database is a single file with a single writer, so you run one instance. Upgrades are a fast restart rather than a true rolling swap: the old process stops, the new binary opens the file (taking the auto-snapshot and running migrations), and starts serving, typically in under a second because the model loads lazily. For most single-node deployments this is effectively seamless; if you need true zero-downtime, use the Postgres backend.

The additive (expand/contract) rule

So that an old and a new version can coexist during a rollout, schema changes are additive: new tables and new nullable/defaulted columns, never a rename or drop in the same release. A column that must change shape is handled across two releases — add the new shape and write both (expand), migrate readers, then remove the old shape later (contract). greatmemory's own migrations follow this rule; keep it in mind if you fork or extend the schema.
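As a small illustration of why additive changes let two versions coexist, the Python sketch below runs an "old" and a "new" writer against the same table before and after an expand step. The memories table, the source column, and the writer functions are hypothetical and are not taken from greatmemory's schema.

```python
import sqlite3


def old_writer(conn, body):
    """The previous release: knows only the original columns."""
    conn.execute("INSERT INTO memories (body) VALUES (?)", (body,))


def new_writer(conn, body, source):
    """The new release: also fills the column added by the expand step."""
    conn.execute(
        "INSERT INTO memories (body, source) VALUES (?, ?)", (body, source)
    )


def expand(conn):
    """Additive change: a new nullable column, so old writers keep working."""
    conn.execute("ALTER TABLE memories ADD COLUMN source TEXT")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

old_writer(conn, "written before the upgrade")
expand(conn)
old_writer(conn, "written by an old instance during the rollout")
new_writer(conn, "written by a new instance", "api")

rows = conn.execute("SELECT body, source FROM memories ORDER BY id").fetchall()
for body, source in rows:
    print(f"{source!r:>6}  {body}")
```

Rows written by the old writer simply carry NULL in the new column. Had the expand step renamed or dropped body instead, the old writer's INSERT would fail as soon as the migration ran, which is why that kind of change is deferred to a later contract release.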

Rolling back

Stop the new version.
Postgres: if the new version made only additive changes, the old binary keeps working against the migrated schema, so you can simply redeploy it. If a migration was not backward compatible, restore from a backup (see API keys & air-gapped use and the gmem backup/restore commands in the CLI reference).
SQLite: restore the <db>.pre-v<N>.bak snapshot that the upgrade wrote, or any gmem backup, then start the old binary. The downgrade guard ensures the old binary won't touch a database it doesn't understand.
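For the SQLite case, restoring the snapshot amounts to putting the .bak file back in place of the database while the server is stopped. The Python sketch below shows one cautious way to do that. It assumes the snapshot is a plain copy of the database file; the function name and the .failed-upgrade suffix are our own, and the handling of -wal and -shm sidecar files only matters if the database runs in SQLite's WAL mode, which this page does not specify.

```python
import shutil
from pathlib import Path


def restore_snapshot(db_path, snapshot_path):
    """Replace db_path with snapshot_path, keeping the migrated files aside."""
    db = Path(db_path)
    snapshot = Path(snapshot_path)
    if not snapshot.is_file():
        raise FileNotFoundError(f"snapshot not found: {snapshot}")

    # Keep the migrated database instead of deleting it (suffix is our own choice).
    aside = db.with_name(db.name + ".failed-upgrade")
    if db.exists():
        db.replace(aside)

    # Stale WAL/SHM files must not be paired with the restored main file.
    for suffix in ("-wal", "-shm"):
        sidecar = db.with_name(db.name + suffix)
        if sidecar.exists():
            sidecar.replace(aside.with_name(aside.name + suffix))

    # Copy rather than move, so the snapshot stays available for another attempt.
    shutil.copy2(snapshot, db)
    return db
```

Run it only after the new version has been stopped, then start the old binary against the restored file.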

Before a major upgrade — checklist

Take an explicit backup: gmem backup (SQLite) or your provider's managed snapshot / pg_dump (Postgres). See the CLI reference.
Confirm your health check targets /v1/readyz.
Pin the image to an explicit version tag rather than latest, so rollouts and rollbacks are deterministic.
Upgrade one environment (staging) first and watch gmem status / GET /v1/stats — counts should carry over and memory should stay flat.