API keys & air-gapped deployment

How greatmemory authenticates clients, how to create and rotate keys, and how to run everything with no internet access at all.

How verification works

Verification is entirely local: there is no license server, no phone-home, no token issuer. A key is a shared secret. The server loads GM_API_KEYS at startup and compares the key in each request's Authorization: Bearer <key> header against that list using a constant-time comparison, so response timing does not reveal how many leading bytes of a guessed key were correct.

client ── Authorization: Bearer gm_a1b2… ──► gmem server
                                              │ constant-time compare vs
                                              │ GM_API_KEYS (in memory)
                                              ▼
                                    200 … or 401 {"error":"unauthorized"}

  • If GM_API_KEYS is unset, auth is off. This is the local-first default: the server binds to loopback, so only processes on the same host can reach it.
  • If it is set, every route requires a valid key except GET /v1/healthz, which stays open so load balancers can probe.
  • The MCP endpoint (/mcp) sits behind the same middleware.

Because verification is a local string comparison, an air-gapped client and server behave exactly like an internet-connected pair.
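
The rules above can be sketched in a few lines of Python. This is an illustration only, not greatmemory's actual middleware: the function names parse_keys and is_authorized are hypothetical, and the real server may structure the check differently. It assumes the standard library's hmac.compare_digest as the constant-time primitive.

```python
import hmac

def parse_keys(raw):
    """Split a comma-separated GM_API_KEYS value into a list of keys."""
    if not raw:
        return []
    return [k.strip() for k in raw.split(",") if k.strip()]

def is_authorized(method, path, auth_header, keys):
    """Hypothetical check mirroring the documented rules."""
    # No keys configured: auth is off.
    if not keys:
        return True
    # The health probe stays open even when keys are set.
    if method == "GET" and path == "/v1/healthz":
        return True
    prefix = "Bearer "
    if not auth_header or not auth_header.startswith(prefix):
        return False
    presented = auth_header[len(prefix):].encode("utf-8")
    # Check every configured key, without stopping at the first match,
    # using a comparison whose duration does not depend on where the
    # first differing byte is.
    ok = False
    for key in keys:
        if hmac.compare_digest(presented, key.encode("utf-8")):
            ok = True
    return ok
```

A caller would answer 401 {"error":"unauthorized"} whenever is_authorized returns False. Nothing in the check touches the network, which is why it behaves identically on an air-gapped host.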

Creating keys

Key generation is up to you — any high-entropy string works. A common convention:

echo "gm_$(openssl rand -hex 32)"
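
If openssl is not available, an equivalent key can be produced with Python's standard library. This is a minimal sketch; the helper name new_api_key is hypothetical, and the gm_ prefix is only the convention shown above, not something the server is documented to require.

```python
import secrets

def new_api_key(prefix="gm_"):
    """Return the prefix followed by 32 random bytes as 64 hex characters."""
    return prefix + secrets.token_hex(32)

print(new_api_key())
```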

Configure one or more (comma-separated):

GM_API_KEYS=gm_abc...,gm_def... gmem serve

or in greatmemory.toml:

api_keys = ["gm_abc...", "gm_def..."]

Clients authenticate with the GM_API_KEY environment variable (CLI) or by sending the Bearer header directly (REST, SDKs, MCP over HTTP):

GM_API_KEY=gm_abc... gmem search "deploy checklist"

curl -H "Authorization: Bearer gm_abc..." \
  -H "Content-Type: application/json" \
  -X POST https://memory.example.internal/v1/search \
  -d '{"query":"deploy checklist"}'
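
The same request can be built from a script without an SDK. The sketch below uses only Python's standard library. The helper name build_search_request is hypothetical, the URL is the placeholder from the curl example, and the JSON Content-Type header is an assumption about what the server expects.

```python
import json
import urllib.request

def build_search_request(base_url, query, api_key):
    """Build a POST /v1/search request carrying the Bearer header."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/search",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed, not confirmed by the docs
        },
    )

# Usage (performs a real HTTP call, so it is left commented out):
# req = build_search_request("https://memory.example.internal",
#                            "deploy checklist", "gm_abc...")
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```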

Managing and rotating keys

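Because the server accepts several keys at once, rotation is a rolling operation: a client holding either the old or the new key is never rejected during the changeover, and the only interruption is each sub-second restart: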

  1. Add the new key to GM_API_KEYS (keep the old one) and restart.
  2. Move clients to the new key at their own pace.
  3. Remove the old key and restart. It is rejected from that moment.
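
As a rough illustration of the three steps, the helper below computes the GM_API_KEYS value to deploy at step 1 and at step 3. The function name rotation_plan is hypothetical and the script is not part of greatmemory; applying each value still means updating your environment or secret store and restarting the server.

```python
def rotation_plan(current, old_key, new_key):
    """Return the GM_API_KEYS strings for step 1 and step 3 of a rotation."""
    if old_key not in current:
        raise ValueError("old key is not currently configured")
    if new_key in current:
        raise ValueError("new key is already configured")
    overlap = current + [new_key]                  # step 1: both keys accepted
    final = [k for k in overlap if k != old_key]   # step 3: old key removed
    return ",".join(overlap), ",".join(final)

step1, step3 = rotation_plan(["gm_old", "gm_other"], "gm_old", "gm_new")
print("step 1:", step1)
print("step 3:", step3)
```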

Practical guidance:

  • One key per client or service. Revoking one client then never affects the others, and access logs stay attributable.
  • Keys are read at startup — changes require a restart (fast: the server starts in well under a second; the embedding model loads lazily).
  • Protect wherever the keys live: chmod 600 on env files, or your platform's secret store (Railway variables, AWS SSM/Secrets Manager, Azure Key Vault references, GCP Secret Manager).
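
For the env-file case, one way to make sure the file is never readable by other users is to create it with owner-only permissions from the start. This is a minimal sketch for POSIX systems; the helper name write_env_file and the file name in the usage line are hypothetical.

```python
import os

def write_env_file(path, keys):
    """Write GM_API_KEYS=<comma-separated keys> to path with mode 0600."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = os.open(path, flags, 0o600)      # a new file is created owner-only
    with os.fdopen(fd, "w") as f:
        os.fchmod(f.fileno(), 0o600)      # also tightens a pre-existing file
        f.write("GM_API_KEYS=" + ",".join(keys) + "\n")

# Usage: write_env_file("gmem.env", ["gm_abc...", "gm_def..."])
```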

Fully air-gapped operation

greatmemory needs the network for exactly one thing, exactly once: the embedding model (~100 MB) downloads on first ingest into GM_DATA_DIR/models/. For hosts with no internet access:

  1. On any connected machine, run the server and ingest one document (or run the test suite) so the model lands in GM_DATA_DIR/models/.
  2. Copy that models/ directory to the restricted host's data directory.
  3. Start the server. Embedding, search, and storage now run fully offline.
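
Step 2 can be scripted when the transfer goes through removable media or a staging directory. The sketch below assumes only what the steps state, namely that the model lives under models/ inside the data directory. The helper name copy_models is hypothetical, and the layout inside models/ is treated as opaque.

```python
import shutil
from pathlib import Path

def copy_models(src_data_dir, dest_data_dir):
    """Copy <src>/models into <dest>/models; return the file count under dest."""
    src = Path(src_data_dir) / "models"
    dest = Path(dest_data_dir) / "models"
    # Refuse to copy an empty directory: the model has not been downloaded yet.
    if not src.is_dir() or not any(src.iterdir()):
        raise FileNotFoundError(f"no downloaded model under {src}")
    shutil.copytree(src, dest, dirs_exist_ok=True)
    return sum(1 for p in dest.rglob("*") if p.is_file())

# Usage: copy_models("/path/to/connected/data", "/path/to/restricted/data")
```

Run it once from the connected machine onto the transfer medium, and once more from the medium into the restricted host's GM_DATA_DIR.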

Fact distillation is optional and also stays local if you point GM_LLM at an Ollama instance on the same network segment. With GM_LLM unset, storage and retrieval work and only fact extraction is off.

Nothing in the binary calls out: no telemetry, no update checks, no license validation. If your firewall logs show traffic from greatmemory, it is either the one-time model download described above or your configured embedder or LLM endpoint (when you chose a remote one).

Current limitations (v0.1)

Stated plainly so you can plan around them:

  • Keys are compared in memory as plain strings — they are not hashed at rest. Treat the env file/secret store as the security boundary.
  • Keys are all-or-nothing: any valid key can read and write every space. If you need hard tenant isolation today, run one instance per tenant (the 19 MB idle footprint makes that practical).
  • No built-in expiry or audit UI. Revocation = remove the key and restart. Hashed keys, per-key space scoping, and a management UI are on the roadmap.

Security checklist for a public-facing server

  • Set GM_API_KEYS before binding to anything other than loopback.
  • Terminate TLS at your load balancer or reverse proxy (the server speaks plain HTTP).
  • Restrict GM_CORS_ORIGINS to your real frontend origins.
  • Point health probes at /v1/healthz only. It needs no key and returns nothing but {"status":"ok"}.