Deployment

Azure

greatmemory is just a container. One image (~158 MB), one port (7437), and either one volume (/data) or a Postgres URL (GM_DB). Any Azure service that runs containers works — Container Apps is the recommended path, Container Instances the simpler one.

Environment variables

Variable          Purpose
GM_HOST           Bind address; the published image already sets 0.0.0.0
GM_PORT           Bind/container port (default 7437)
GM_DATA_DIR       Data directory; the image sets /data (SQLite db + embedding model cache)
GM_DB             Postgres URL (postgres://...) to use Azure Database for PostgreSQL instead of SQLite
GM_API_KEYS       Comma-separated bearer keys; required on any non-loopback bind
GM_CORS_ORIGINS   Exact allowed origins; replaces the permissive localhost default
GM_EMBEDDER       fastembed (default, local ONNX) | ollama | openai
GM_LLM            none (default) | ollama | openai; enables fact extraction
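
The GM_API_KEYS rule can be checked in a deploy script before anything reaches Azure. A minimal sketch for a POSIX shell: gm_preflight is a hypothetical helper, not part of greatmemory, and the loopback patterns are an assumption about what counts as a loopback bind.

```shell
# Hypothetical preflight check (not shipped with greatmemory): fail early if
# the bind address is not loopback and no API keys are configured.
gm_preflight() {
  # The published image sets GM_HOST=0.0.0.0, so assume that when unset.
  host="${GM_HOST:-0.0.0.0}"
  case "$host" in
    127.*|localhost|::1) return 0 ;;   # loopback: keys are optional
  esac
  if [ -z "${GM_API_KEYS:-}" ]; then
    echo "GM_API_KEYS must be set when binding to ${host}" >&2
    return 1
  fi
  return 0
}
```

Calling `gm_preflight || exit 1` ahead of the deployment commands stops a keyless public deployment before it is created.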

Path A: Azure Container Apps

Container Apps gives you TLS ingress, secrets, and scaling out of the box. Decide on storage first:

  • Azure Database for PostgreSQL Flexible Server (recommended): durable state lives in the database via GM_DB; replicas of the app stay stateless. pgvector must be allow-listed on the server first — add VECTOR to the azure.extensions server parameter (Server parameters blade or az postgres flexible-server parameter set --name azure.extensions --value VECTOR ...), then enable it in your database:
CREATE EXTENSION IF NOT EXISTS vector;

If the app role can't allow-list or create the extension, have a privileged role provision it and set GM_DB_ASSUME_PGVECTOR=1 — see Enterprise database (pgvector).

  • Azure Files volume at /data with SQLite: simpler, but cap the app at exactly one replica — a SQLite file must not be written by multiple instances. (Note that SQLite over SMB shares can be slow; Postgres is the better fit for anything beyond light use.)
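
For the Postgres option, the allow-list and extension steps can be scripted together. This is a sketch, assuming a psql client on the machine that runs it and a connection URL for a role that may create extensions; gm_enable_pgvector is a hypothetical helper name.

```shell
# Hypothetical helper: allow-list pgvector on the Flexible Server, then
# create the extension in the target database.
# Usage: gm_enable_pgvector <resource-group> <server-name> <admin-db-url>
gm_enable_pgvector() {
  rg="$1"; server="$2"; db_url="$3"

  # Server-level allow-list (the azure.extensions server parameter).
  az postgres flexible-server parameter set \
    --resource-group "$rg" \
    --server-name "$server" \
    --name azure.extensions \
    --value VECTOR || return 1

  # Database-level enablement; needs a role allowed to create extensions.
  psql "$db_url" -v ON_ERROR_STOP=1 \
    -c 'CREATE EXTENSION IF NOT EXISTS vector;'
}
```

The parameter value is the complete allow-list, so if the server already allow-lists other extensions, pass them together with VECTOR as one comma-separated value.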

Deployment sketch with Postgres:

az containerapp env create \
  --name gm-env --resource-group gm-rg --location <region>

az containerapp create \
  --name greatmemory \
  --resource-group gm-rg \
  --environment gm-env \
  --image <registry>.azurecr.io/greatmemory:latest \
  --target-port 7437 \
  --ingress external \
  --min-replicas 1 --max-replicas 1 \
  --cpu 1.0 --memory 2.0Gi \
  --secrets gm-db-url='postgres://gm:<password>@<server>.postgres.database.azure.com:5432/gm?sslmode=require' \
            gm-api-keys='<long-random-key>' \
  --env-vars GM_DB=secretref:gm-db-url \
             GM_API_KEYS=secretref:gm-api-keys \
             GM_CORS_ORIGINS='https://app.example.com'

Notes:

  • Ingress and TLS: external ingress terminates TLS for you on the Container Apps domain; the container keeps speaking plain HTTP on 7437. For private consumers, use --ingress internal inside a VNet instead.
  • Keep one warm replica (--min-replicas 1): the embedding model loads into memory at startup and, in the Postgres setup, is re-downloaded into the ephemeral /data on every cold start. Scale-to-zero saves idle cost but pays that download and load time on the first request after each idle period.
  • Health probes: point the readiness probe at /v1/readyz (200 only once storage is reachable and migrated) and liveness at /v1/healthz; both are always unauthenticated. See Upgrades & migrations.
  • Azure Files variant: create a storage definition on the environment (az containerapp env storage set with an Azure Files share), then mount it at /data in the app and omit GM_DB.
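
After a deployment or a new revision, the readiness endpoint described above can also be polled from a script. A minimal sketch, assuming curl is available; gm_wait_ready is a hypothetical helper, and the attempt count and delay are arbitrary defaults.

```shell
# Hypothetical helper: poll /v1/readyz until it answers successfully or the
# attempts run out.
# Usage: gm_wait_ready <base-url> [attempts] [delay-seconds]
gm_wait_ready() {
  base_url="$1"; attempts="${2:-30}"; delay="${3:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on an HTTP error status (>= 400).
    if curl -fsS -o /dev/null "${base_url}/v1/readyz"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after ${attempts} attempts" >&2
  return 1
}

# Example: look up the ingress FQDN, then wait for readiness.
#   fqdn="$(az containerapp show --name greatmemory --resource-group gm-rg \
#     --query properties.configuration.ingress.fqdn -o tsv)"
#   gm_wait_ready "https://${fqdn}"
```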

Path B: Azure Container Instances (simpler)

A single container group with an Azure Files share mounted at /data is the minimal stateful deployment:

az container create \
  --resource-group gm-rg \
  --name greatmemory \
  --image <registry>.azurecr.io/greatmemory:latest \
  --ports 7437 \
  --cpu 1 --memory 2 \
  --azure-file-volume-account-name <storage-account> \
  --azure-file-volume-account-key <key> \
  --azure-file-volume-share-name gmdata \
  --azure-file-volume-mount-path /data \
  --secure-environment-variables GM_API_KEYS='<long-random-key>' \
  --environment-variables GM_CORS_ORIGINS='https://app.example.com'

Container Instances has no built-in TLS termination — keep it on a private VNet, or put Application Gateway (or any reverse proxy) in front for a public, TLS-terminated endpoint. GM_API_KEYS is non-negotiable the moment the address is reachable by anything but you.

Storing the API key in Azure Key Vault

Create the vault (once) and the secret:

az keyvault create --name gm-kv --resource-group gm-rg
az keyvault secret set --vault-name gm-kv \
  --name gm-api-keys --value "gm_$(openssl rand -hex 32)"

Container Apps: reference the Key Vault secret instead of an inline value. The app needs a system-assigned identity that holds the Key Vault Secrets User role on the vault. Assign the identity, grant the role, then point the app secret at the vault:

az containerapp identity assign --name greatmemory --resource-group gm-rg --system-assigned

az containerapp secret set --name greatmemory --resource-group gm-rg \
  --secrets "gm-api-keys=keyvaultref:https://gm-kv.vault.azure.net/secrets/gm-api-keys,identityref:system"
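
The role grant is described but not shown in the commands above. It could look like the sketch below, run after the identity is assigned and before the secret reference is set. It assumes the vault uses Azure RBAC for authorization; gm_grant_kv_access is a hypothetical helper name.

```shell
# Hypothetical helper: give the app's system-assigned identity read access
# to secrets in the vault.
# Usage: gm_grant_kv_access <app-name> <resource-group> <vault-name>
gm_grant_kv_access() {
  app="$1"; rg="$2"; vault="$3"

  # Object ID of the app's system-assigned identity.
  principal_id="$(az containerapp identity show \
    --name "$app" --resource-group "$rg" \
    --query principalId -o tsv)" || return 1

  # Full resource ID of the vault, used as the role scope.
  vault_id="$(az keyvault show --name "$vault" --query id -o tsv)" || return 1

  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Key Vault Secrets User" \
    --scope "$vault_id"
}

# Example: gm_grant_kv_access greatmemory gm-rg gm-kv
```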

The env var mapping stays exactly as in Path A: GM_API_KEYS=secretref:gm-api-keys. Container Apps re-resolves the reference on each revision, so rotation = set a new Key Vault secret version (with old + new comma-separated during the transition) and create a new revision.
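
The first half of that rotation can be sketched as follows. It assumes the secret currently holds a single key and reuses the gm_ prefix from the example above; gm_rotate_api_key is a hypothetical helper name.

```shell
# Hypothetical helper: append a fresh key to the Key Vault secret and roll a
# new revision so the app accepts both old and new keys.
# Usage: gm_rotate_api_key <vault-name> <app-name> <resource-group>
gm_rotate_api_key() {
  vault="$1"; app="$2"; rg="$3"

  # Current value; assumed to be a single key, not an existing list.
  old_keys="$(az keyvault secret show --vault-name "$vault" \
    --name gm-api-keys --query value -o tsv)" || return 1
  new_key="gm_$(openssl rand -hex 32)"

  # New secret version with old + new, comma-separated.
  az keyvault secret set --vault-name "$vault" --name gm-api-keys \
    --value "${old_keys},${new_key}" --output none || return 1

  # A new revision re-resolves the Key Vault reference.
  az containerapp revision copy --name "$app" --resource-group "$rg" \
    --output none || return 1

  # Print the new key so it can be handed to clients.
  printf '%s\n' "$new_key"
}
```

Once every client uses the new key, set the secret to the new key alone and create one more revision to retire the old one.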

Container Instances — ACI has no Key Vault reference; fetch at deploy time instead of pasting the value:

--secure-environment-variables GM_API_KEYS="$(az keyvault secret show \
  --vault-name gm-kv --name gm-api-keys --query value -o tsv)"

Security checklist

  • GM_API_KEYS set (long random values, stored as Container Apps secrets / secure environment variables — ideally sourced from Key Vault)
  • TLS terminated by Container Apps ingress or Application Gateway; the container itself speaks plain HTTP
  • GM_CORS_ORIGINS set to your real origins
  • Database reachable only privately (VNet integration / private endpoint), sslmode=require in GM_DB
  • Durable state backed up: Azure Files share snapshots for the SQLite volume, or the Flexible Server's automated backups for Postgres
  • /v1/readyz for readiness probes and /v1/healthz for liveness; watch rss_bytes from /v1/stats, which should stay flat
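
Watching rss_bytes can be as simple as sampling /v1/stats on a timer. The sketch below makes two assumptions that this page does not confirm: that the response is JSON with rss_bytes as a plain integer field, and that the endpoint accepts the key as an Authorization: Bearer header. gm_rss_bytes is a hypothetical helper.

```shell
# Hypothetical helper: extract rss_bytes from a /v1/stats JSON body on stdin.
# Prints nothing if the field is absent.
gm_rss_bytes() {
  sed -n 's/.*"rss_bytes"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Example: sample once a minute; a steadily rising value suggests a leak.
#   while true; do
#     curl -fsS -H "Authorization: Bearer ${GM_KEY}" \
#       "https://<app-fqdn>/v1/stats" | gm_rss_bytes
#     sleep 60
#   done
```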