greatmemory is just a container. One image (~158 MB), one port (7437), and either one volume (/data) or a Postgres URL (GM_DB). Any Google Cloud service that runs containers works: Cloud Run with Cloud SQL is the recommended path, and a GCE VM with Docker Compose is the stateful-disk alternative.
Environment variables
| Variable | Purpose |
|---|---|
| GM_HOST | Bind address; the published image already sets 0.0.0.0 |
| GM_PORT | Bind/container port (default 7437) |
| GM_DATA_DIR | Data directory; the image sets /data (SQLite db + embedding model cache) |
| GM_DB | Postgres URL (postgres://...) to use Cloud SQL instead of SQLite |
| GM_API_KEYS | Comma-separated bearer keys; required on any non-loopback bind |
| GM_CORS_ORIGINS | Exact allowed origins; replaces the permissive localhost default |
| GM_EMBEDDER | fastembed (default, local ONNX), ollama, or openai |
| GM_LLM | none (default), ollama, or openai; setting a provider enables fact extraction |
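
The rules in the table can be checked before a deploy. The following is a minimal sketch, not part of greatmemory: check_env and is_loopback are hypothetical helper names, and treating a missing GM_HOST as 0.0.0.0 (the published image's value) is an assumption, as is accepting only the postgres:// prefix shown in the table.

```python
import ipaddress

VALID_EMBEDDERS = {"fastembed", "ollama", "openai"}
VALID_LLMS = {"none", "ollama", "openai"}


def is_loopback(host):
    """True for 'localhost' or any loopback IP literal (127.0.0.0/8, ::1)."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # any other hostname is treated as non-loopback


def check_env(env):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # Assumption: without GM_HOST we are in the published image (0.0.0.0).
    host = env.get("GM_HOST", "0.0.0.0")
    keys = [k for k in env.get("GM_API_KEYS", "").split(",") if k.strip()]
    if not is_loopback(host) and not keys:
        problems.append("GM_API_KEYS is required on a non-loopback bind")
    db = env.get("GM_DB")
    if db is not None and not db.startswith("postgres://"):
        problems.append("GM_DB should be a postgres:// URL")
    if env.get("GM_EMBEDDER", "fastembed") not in VALID_EMBEDDERS:
        problems.append("GM_EMBEDDER must be fastembed, ollama, or openai")
    if env.get("GM_LLM", "none") not in VALID_LLMS:
        problems.append("GM_LLM must be none, ollama, or openai")
    port = env.get("GM_PORT", "7437")
    if not port.isdigit() or not 0 < int(port) < 65536:
        problems.append("GM_PORT must be a TCP port number")
    return problems
```

Passing dict(os.environ) as env checks the current shell; an empty result only means these documented rules hold, not that the service will start.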
Path A: Cloud Run + Cloud SQL
Cloud Run's filesystem is ephemeral — anything written to /data disappears when the instance is recycled. So on Cloud Run, SQLite is not an option for durable state: use Cloud SQL for PostgreSQL with the pgvector extension via GM_DB. (The /data model cache still works; it just re-downloads on cold starts.)
- Create the database: a Cloud SQL for PostgreSQL instance (pgvector is supported on current versions). Enable the extension once in your database:
CREATE EXTENSION IF NOT EXISTS vector;
If your Cloud SQL role can't run CREATE EXTENSION (a DBA provisions it for you), have a privileged role create it once and set GM_DB_ASSUME_PGVECTOR=1; see Enterprise database (pgvector).
- Deploy:
gcloud run deploy greatmemory \
  --image <region>-docker.pkg.dev/<project>/<repo>/greatmemory:latest \
  --port 7437 \
  --cpu 1 --memory 2Gi \
  --min-instances 1 --max-instances 1 \
  --add-cloudsql-instances <project>:<region>:<instance> \
  --set-secrets GM_DB=gm-db-url:latest,GM_API_KEYS=gm-api-keys:latest \
  --set-env-vars GM_CORS_ORIGINS=https://app.example.com \
  --no-allow-unauthenticated  # or --allow-unauthenticated and rely on GM_API_KEYS
Notes:
- --min-instances 1 matters. The embedding model loads into the instance's memory (and downloads into the ephemeral /data on a fresh instance). With scale-to-zero, every cold start pays that load, so keep one instance warm.
- Connecting to Cloud SQL, two options:
  - --add-cloudsql-instances (unix socket): the Cloud SQL connector exposes a socket at /cloudsql/<project>:<region>:<instance>; your GM_DB URL must reference it as the host (Postgres URLs encode a socket directory as a URL-encoded host parameter). Simple IAM-controlled setup, no IPs to manage.
  - Private IP: give the Cloud SQL instance a private IP, attach the Cloud Run service to the same VPC (Direct VPC egress or a connector), and use an ordinary postgres://gm:pw@10.x.x.x:5432/gm URL in GM_DB. This is the more conventional URL shape and avoids socket-path encoding.
- Secrets: store the database URL and API keys in Secret Manager (--set-secrets), not plain env vars.
- TLS is handled by Cloud Run's HTTPS endpoint automatically; the container speaks plain HTTP on 7437.
- Health checks: point the startup/readiness probe at /v1/readyz (200 only once storage is reachable and migrated) and liveness at /v1/healthz; both are always unauthenticated. See Upgrades & migrations.
- Cloud Run can run multiple instances; with Postgres that is safe. Keep --max-instances 1 only if you have a reason to serialize writes; the engine itself is fine with several replicas on one database.
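
The two GM_DB URL shapes from the connection note can be built as follows. This is a sketch under assumptions: socket_gm_db_url and private_ip_gm_db_url are hypothetical helper names, and the socket form uses the libpq-style host query parameter; whether greatmemory's Postgres driver accepts exactly this form is not confirmed here, so verify it against a test instance.

```python
from urllib.parse import quote


def socket_gm_db_url(user, password, database, connection_name):
    """Build a URL that points at the Cloud SQL unix socket directory.

    connection_name is the <project>:<region>:<instance> triple. The socket
    directory is percent-encoded and passed as the `host` query parameter.
    """
    socket_dir = f"/cloudsql/{connection_name}"
    return (
        f"postgres://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@/{quote(database, safe='')}?host={quote(socket_dir, safe='')}"
    )


def private_ip_gm_db_url(user, password, database, ip, port=5432):
    """Build the conventional host:port URL for a private-IP instance."""
    return (
        f"postgres://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{ip}:{port}/{quote(database, safe='')}"
    )
```

With the placeholder values gm, pw, gm and my-project:us-central1:gm-pg, the socket helper returns postgres://gm:pw@/gm?host=%2Fcloudsql%2Fmy-project%3Aus-central1%3Agm-pg. Percent-encoding the password also keeps characters such as @ or / from breaking the URL in either form.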
Path B: GCE VM + Docker Compose (stateful disk)
If you'd rather have SQLite and a real disk, a small Compute Engine VM is the straightforward alternative:
- Create a VM (e2-medium is plenty) with a persistent disk; install Docker and the compose plugin.
- Run greatmemory with the data dir on the persistent disk:
# /opt/greatmemory/docker-compose.yml
services:
  greatmemory:
    image: greatmemory  # your pushed image tag (e.g. in Artifact Registry)
    restart: unless-stopped
    ports:
      - "7437:7437"
    volumes:
      - /mnt/disks/gm-data:/data
    environment:
      GM_API_KEYS: ${GM_API_KEYS}
      GM_CORS_ORIGINS: https://app.example.com
docker compose up -d
curl -s http://127.0.0.1:7437/v1/readyz
- Expose it: keep the VM private and reach it over the VPC/IAP, or put an HTTPS load balancer (or nginx/Caddy on the VM) in front for TLS. If you open port 7437 in a firewall rule, restrict the source ranges, and remember that GM_API_KEYS is mandatory.
- Backups: schedule persistent-disk snapshots; the whole SQLite store is one file (/mnt/disks/gm-data/greatmemory.db).
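
The curl check against /v1/readyz in the run step can be turned into a wait loop for boot scripts. A minimal sketch, assuming only what is stated above (200 means ready); http_status and wait_until_ready are hypothetical names, and the attempt count and delay are arbitrary defaults.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def wait_until_ready(url, attempts=30, delay=2.0,
                     fetch=http_status, sleep=time.sleep):
    """Poll url until it returns 200.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed. fetch and sleep are injectable so the loop can be
    exercised without a network.
    """
    for attempt in range(1, attempts + 1):
        if fetch(url) == 200:
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

On the VM this would be called as wait_until_ready("http://127.0.0.1:7437/v1/readyz") after docker compose up -d; a None result means the service never reported ready within the window.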
Storing the API key in Secret Manager
Create the secret once:
printf 'gm_%s' "$(openssl rand -hex 32)" | \
gcloud secrets create gm-api-keys --data-file=-
Cloud Run: the --set-secrets GM_API_KEYS=gm-api-keys:latest flag in
Path A exposes the secret as the GM_API_KEYS environment variable. The
service's runtime service account needs access:
gcloud secrets add-iam-policy-binding gm-api-keys \
  --member "serviceAccount:<runtime-sa>@<project>.iam.gserviceaccount.com" \
  --role roles/secretmanager.secretAccessor
Because the mapping pins :latest, rotating a key takes three steps: add a
new secret version that lists the old and new keys comma-separated, redeploy
a revision, then add a final version without the old key.
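
The two secret values used during a rotation can be computed ahead of time. This is an illustrative sketch: new_api_key mirrors the gm_ prefix plus 32 random bytes in hex from the create command above, and rotation_versions is a hypothetical helper that only builds strings; adding the versions and redeploying still happens through gcloud.

```python
import secrets


def new_api_key():
    """Generate a key shaped like the openssl command: 'gm_' + 64 hex chars."""
    return "gm_" + secrets.token_hex(32)


def rotation_versions(current_keys, old_key, new_key):
    """Return (transition_value, final_value) for GM_API_KEYS.

    transition_value keeps every current key and appends new_key, so clients
    holding either key keep working. final_value drops old_key.
    """
    keys = [k.strip() for k in current_keys.split(",") if k.strip()]
    if old_key not in keys:
        raise ValueError("old_key is not in the current GM_API_KEYS value")
    transition = keys + ([new_key] if new_key not in keys else [])
    final = [k for k in transition if k != old_key]
    return ",".join(transition), ",".join(final)
```

The first returned value goes into the transition secret version, the second into the final version once all clients have switched to the new key.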
GCE — fetch at boot with the VM's service account (same IAM binding):
umask 077
echo "GM_API_KEYS=$(gcloud secrets versions access latest \
--secret gm-api-keys)" > /opt/greatmemory/.env
Security checklist
- GM_API_KEYS set (long random values, in Secret Manager); never expose the port without it
- TLS terminated by Cloud Run / a load balancer; the container itself speaks plain HTTP
- GM_CORS_ORIGINS set to your real origins
- Cloud SQL reachable only via the connector or private IP; no public IP on the database
- Durable state backed up: Cloud SQL automated backups or persistent-disk snapshots
- /v1/readyz for probes; watch rss_bytes from /v1/stats, which should stay flat
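
"Stay flat" can be made concrete by comparing early and late samples of rss_bytes. The sketch below is one possible check, not a greatmemory feature: it assumes /v1/stats returns JSON with a top-level rss_bytes number (only the field name comes from this page), and the 5% tolerance is an arbitrary starting point.

```python
import json


def rss_from_stats(body):
    """Extract rss_bytes from a /v1/stats response body.

    Assumption: rss_bytes is a top-level field of a JSON object.
    """
    return int(json.loads(body)["rss_bytes"])


def rss_growth(samples):
    """Relative change from the mean of the first half to the mean of the
    second half of samples (0.10 means 10% growth)."""
    if len(samples) < 4:
        raise ValueError("need at least 4 samples")
    half = len(samples) // 2
    early = sum(samples[:half]) / half
    late = sum(samples[-half:]) / half
    return (late - early) / early


def is_flat(samples, tolerance=0.05):
    """True if memory grew by no more than tolerance across the window."""
    return rss_growth(samples) <= tolerance
```

Sampling after the embedding model has loaded avoids counting the expected startup jump as growth; a window that keeps failing is_flat under steady traffic is worth investigating.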