AWS

EC2 + Docker Compose, or ECS Fargate with RDS Postgres + pgvector.

greatmemory is just a container. One image (~158 MB), one port (7437), and either one volume (/data) or a Postgres URL (GM_DB). Any AWS service that runs containers works - the two paths below are the ones we recommend.

Environment variables

Variable          Purpose
GM_HOST           Bind address - the published image already sets 0.0.0.0
GM_PORT           Bind/container port (default 7437)
GM_DATA_DIR       Data directory - the image sets /data (SQLite db + embedding model cache)
GM_DB             Postgres URL (postgres://...) to use RDS instead of SQLite
GM_API_KEYS       Comma-separated bearer keys - required on any non-loopback bind
GM_CORS_ORIGINS   Exact allowed origins; replaces the permissive localhost default
GM_EMBEDDER       fastembed (default, local ONNX) | ollama | openai
GM_LLM            none (default) | ollama | openai - enables fact extraction
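
The rules in the table can be checked before a deploy. The following is a minimal operator-side sketch, not part of greatmemory itself: the function name preflight is hypothetical, and the fallback of 0.0.0.0 for an unset GM_HOST is an assumption based on what the published image sets.

```python
import ipaddress
import os

# Allowed values taken from the table above.
EMBEDDERS = {"fastembed", "ollama", "openai"}
LLMS = {"none", "ollama", "openai"}


def preflight(env: dict[str, str]) -> list[str]:
    """Return a list of configuration problems; an empty list means no issue found."""
    problems = []

    # Assumption: an unset GM_HOST behaves like the image default, 0.0.0.0.
    host = env.get("GM_HOST", "0.0.0.0")
    try:
        loopback = ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal: only "localhost" is treated as loopback here.
        loopback = host == "localhost"

    keys = [k.strip() for k in env.get("GM_API_KEYS", "").split(",") if k.strip()]
    if not loopback and not keys:
        problems.append("GM_API_KEYS is required on a non-loopback bind")

    if env.get("GM_EMBEDDER", "fastembed") not in EMBEDDERS:
        problems.append("GM_EMBEDDER must be fastembed, ollama, or openai")
    if env.get("GM_LLM", "none") not in LLMS:
        problems.append("GM_LLM must be none, ollama, or openai")
    return problems


if __name__ == "__main__":
    for problem in preflight(dict(os.environ)):
        print("config:", problem)
```

Running it in the same environment the container will receive (for example in a CI step) surfaces a missing key before the port is ever exposed.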

Path A: EC2 + Docker Compose

The simplest stateful deployment: one instance, one EBS-backed volume, SQLite.

  1. Launch an instance. A small general-purpose instance (2 vCPU / 4 GB RAM) is plenty; the embedding model working set is a few hundred MB. Use an EBS root volume (or attach a dedicated EBS volume) so /data survives reboots and instance stops.
  2. Install Docker (and the compose plugin) via your distro's packages.
  3. Run greatmemory:
# /opt/greatmemory/docker-compose.yml
services:
  greatmemory:
    image: greatmemory          # your pushed image tag (e.g. in ECR)
    restart: unless-stopped
    ports:
      - "7437:7437"
    volumes:
      - /var/lib/greatmemory:/data    # EBS-backed host path
    environment:
      GM_API_KEYS: ${GM_API_KEYS}     # from an env file with mode 0600
      GM_CORS_ORIGINS: https://app.example.com

Then start it and check readiness:

docker compose up -d
curl -s http://127.0.0.1:7437/v1/readyz
  4. Expose it - pick one:
    • Behind an ALB (recommended): keep the port mapping as 7437:7437 (the ALB connects to the instance's private IP, so a loopback-only mapping would be unreachable), register the instance in a target group on port 7437, terminate TLS at the ALB with an ACM certificate, and point the target group health check at /v1/readyz (200 only once storage is reachable and migrated; always unauthenticated - see Upgrades & migrations). The instance security group should then allow 7437 only from the ALB's security group.
    • Behind nginx/Caddy on the instance: change the port mapping to 127.0.0.1:7437:7437, terminate TLS locally, and proxy to 127.0.0.1:7437; open only 443 in the security group.
    • Direct on 7437: acceptable only inside a VPC/VPN. If you must open the port in a security group, restrict the source CIDR and treat GM_API_KEYS as mandatory - there is deliberately no unauthenticated mode worth running on a public address.
  5. Backups: SQLite is one file - snapshot the EBS volume, or run sqlite3 /var/lib/greatmemory/greatmemory.db ".backup ..." for a live-consistent copy. The model cache under /data/models never needs backing up.
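
For scheduled backups, the same live-consistent copy can be taken from Python, whose sqlite3 module exposes SQLite's online backup API as Connection.backup. This is a sketch only: the function name backup_sqlite and the destination path in the usage note are hypothetical, and the source path is the one from the backup step above.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(src: Path, dest: Path) -> None:
    """Copy a live SQLite database to dest using SQLite's online backup API."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # mode=rw opens an existing file only; a typo in src fails instead of
    # silently creating (and then "backing up") an empty database.
    source = sqlite3.connect(f"file:{src}?mode=rw", uri=True)
    target = sqlite3.connect(dest)
    try:
        source.backup(target)  # consistent snapshot even while the app writes
    finally:
        target.close()
        source.close()
```

A cron job could call backup_sqlite(Path("/var/lib/greatmemory/greatmemory.db"), Path("/var/backups/greatmemory/greatmemory.db")) and then upload the copy to wherever backups are kept; the destination here is only an example.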

Path B: ECS Fargate + RDS

Fargate tasks are ephemeral, so the durable state should live outside the task. Recommended: RDS for PostgreSQL with the pgvector extension, via GM_DB - then the task needs no persistent volume at all (the /data model cache simply re-downloads on a cold start). EFS mounted at /data with SQLite also works, but a managed Postgres is the better fit for tasks that come and go.

  1. Create the database: an RDS for PostgreSQL instance (pgvector is available on recent PostgreSQL versions). Enable the extension once:
CREATE EXTENSION IF NOT EXISTS vector;

If your RDS role can't run CREATE EXTENSION (a DBA provisions it for you), have a privileged role create it once and set GM_DB_ASSUME_PGVECTOR=1 - see Enterprise database (pgvector).

  2. Task definition sketch:
{
  "family": "greatmemory",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "1024",
  "memory": "2048",
  "containerDefinitions": [
    {
      "name": "greatmemory",
      "image": "<account>.dkr.ecr.<region>.amazonaws.com/greatmemory:latest",
      "portMappings": [{ "containerPort": 7437, "protocol": "tcp" }],
      "secrets": [
        { "name": "GM_DB",       "valueFrom": "<secretsmanager-arn-for-db-url>" },
        { "name": "GM_API_KEYS", "valueFrom": "<secretsmanager-arn-for-api-keys>" }
      ],
      "environment": [
        { "name": "GM_CORS_ORIGINS", "value": "https://app.example.com" }
      ],
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -sf http://127.0.0.1:7437/v1/readyz || exit 1"]
      }
    }
  ]
}
  3. Service: run the task in private subnets behind an ALB (TLS via ACM), target group port 7437, health check path /v1/readyz. Put GM_DB and GM_API_KEYS in Secrets Manager rather than plain environment values.
  4. If you prefer EFS + SQLite instead of RDS: add an EFS volume to the task definition, mount it at /data, omit GM_DB - and keep the service at exactly one task, since a SQLite file must not be written by multiple instances.
  5. Cold starts: with RDS, the first request after a new task starts triggers the one-time embedding model download into /data. Keep the service's desired count at ≥ 1 (no scale-to-zero) so the model stays warm.
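
A deploy script can wait for a new task or instance to report ready before sending it traffic. The sketch below polls /v1/readyz, which the steps above describe as unauthenticated and returning 200 only once storage is reachable and migrated. The function name wait_ready and its timeout defaults are illustrative assumptions, not greatmemory tooling.

```python
import time
import urllib.error
import urllib.request


def wait_ready(base_url: str, timeout_s: float = 120.0, interval_s: float = 2.0) -> bool:
    """Poll /v1/readyz until it returns 200; give up after timeout_s seconds."""
    url = base_url.rstrip("/") + "/v1/readyz"
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not listening yet, or a non-2xx response: keep polling
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For example, wait_ready("http://127.0.0.1:7437") after docker compose up -d on EC2, or against the task's private address on Fargate.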

Storing the API key in AWS Secrets Manager

Create the secret once:

aws secretsmanager create-secret \
  --name greatmemory/api-keys \
  --secret-string "gm_$(openssl rand -hex 32)"

ECS Fargate - reference it from the task definition (as in Path B); the task execution role needs secretsmanager:GetSecretValue on the secret's ARN:

"secrets": [
  { "name": "GM_API_KEYS",
    "valueFrom": "arn:aws:secretsmanager:<region>:<account>:secret:greatmemory/api-keys" }
]

EC2 - fetch it at boot into the 0600 env file Compose reads, using the instance role (no credentials on disk):

umask 077
echo "GM_API_KEYS=$(aws secretsmanager get-secret-value \
  --secret-id greatmemory/api-keys \
  --query SecretString --output text)" > /opt/greatmemory/.env

Rotation: write the new value with both keys comma-separated (gm_new...,gm_old...), redeploy, migrate clients, then drop the old key. SSM Parameter Store SecureString works the same way at lower cost (aws ssm get-parameter --with-decryption).
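
The rotation sequence is plain string handling on the comma-separated value. A small sketch, with hypothetical helper names, assuming keys keep the gm_ prefix plus 64 hex characters used in the create-secret example:

```python
import secrets


def new_key() -> str:
    """Generate a key shaped like the example above: gm_ followed by 64 hex chars."""
    return "gm_" + secrets.token_hex(32)


def rotation_value(new: str, old: str) -> str:
    """Value to store while both the new and the old key must be accepted."""
    return f"{new},{old}"


def retire(value: str, old: str) -> str:
    """Value to store once clients have migrated: everything except the old key."""
    keys = [k.strip() for k in value.split(",") if k.strip()]
    return ",".join(k for k in keys if k != old)
```

The output of rotation_value is what gets written to the secret before the redeploy, and the output of retire is what gets written after clients have switched to the new key.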

Security checklist

  • GM_API_KEYS set (long random values, stored in Secrets Manager / an 0600 env file) - never expose the port without it
  • TLS terminated by the ALB or a reverse proxy; the container itself speaks plain HTTP on a private network
  • GM_CORS_ORIGINS set to your real origins
  • Security groups: 7437 reachable only from the load balancer (or VPC-internal callers)
  • Durable state on EBS/EFS snapshots or RDS automated backups
  • /v1/readyz for health checks; watch rss_bytes from /v1/stats - it should stay flat
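
The last item can be automated with a periodic check. The sketch below makes several assumptions that are not confirmed by this page: that /v1/stats returns a JSON object with a top-level integer rss_bytes, that it accepts one of the GM_API_KEYS values as a bearer token, and that "flat" can be approximated as staying within 10% of the first sample. The function names are hypothetical.

```python
import json
import urllib.request


def fetch_rss(base_url: str, api_key: str) -> int:
    """Read rss_bytes from /v1/stats (assumed shape: {"rss_bytes": <int>, ...})."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/v1/stats",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return int(json.load(resp)["rss_bytes"])


def is_flat(samples: list[int], tolerance: float = 0.10) -> bool:
    """True if the latest sample is no more than tolerance above the first one."""
    if len(samples) < 2:
        return True
    return samples[-1] <= samples[0] * (1 + tolerance)
```

Collecting one fetch_rss sample every few minutes and alerting when is_flat returns False is one way to notice steady memory growth; the threshold should be tuned to the workload.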