Deployment
AWS
EC2 + Docker Compose, or ECS Fargate with RDS Postgres + pgvector.
greatmemory is just a container. One image (~158 MB), one port (7437), and either one volume (/data) or a Postgres URL (GM_DB). Any AWS service that runs containers works - the two paths below are the ones we recommend.
Environment variables
| Variable | Purpose |
|---|---|
| GM_HOST | Bind address - the published image already sets 0.0.0.0 |
| GM_PORT | Bind/container port (default 7437) |
| GM_DATA_DIR | Data directory - the image sets /data (SQLite db + embedding model cache) |
| GM_DB | Postgres URL (postgres://...) to use RDS instead of SQLite |
| GM_API_KEYS | Comma-separated bearer keys - required on any non-loopback bind |
| GM_CORS_ORIGINS | Exact allowed origins; replaces the permissive localhost default |
| GM_EMBEDDER | fastembed (default, local ONNX), ollama, or openai |
| GM_LLM | none (default), ollama, or openai - any value other than none enables fact extraction |
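As a rough illustration of how these variables interact, the sketch below checks an environment mapping before you deploy. check_env and is_loopback are hypothetical helpers, not part of greatmemory. Treating an unset GM_HOST as 0.0.0.0 (the image's setting) and accepting a postgresql:// scheme alongside postgres:// are assumptions.

```python
import ipaddress

EMBEDDERS = {"fastembed", "ollama", "openai"}
LLMS = {"none", "ollama", "openai"}


def is_loopback(host):
    """Return True for localhost or a loopback IP literal such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a hostname): treat it as non-loopback.
        return False


def check_env(env):
    """Return a list of problems found in env; an empty list means none were found."""
    problems = []

    # Assumption: an unset GM_HOST behaves like the published image (0.0.0.0).
    host = env.get("GM_HOST", "0.0.0.0")
    keys = [k.strip() for k in env.get("GM_API_KEYS", "").split(",") if k.strip()]
    if not keys and not is_loopback(host):
        problems.append("GM_API_KEYS is required on a non-loopback bind")

    try:
        port_ok = 1 <= int(env.get("GM_PORT", "7437")) <= 65535
    except ValueError:
        port_ok = False
    if not port_ok:
        problems.append("GM_PORT must be a port number")

    # Assumption: both postgres:// and postgresql:// schemes are acceptable.
    db = env.get("GM_DB")
    if db is not None and not db.startswith(("postgres://", "postgresql://")):
        problems.append("GM_DB should be a Postgres URL")

    if env.get("GM_EMBEDDER", "fastembed") not in EMBEDDERS:
        problems.append("GM_EMBEDDER must be one of: " + ", ".join(sorted(EMBEDDERS)))
    if env.get("GM_LLM", "none") not in LLMS:
        problems.append("GM_LLM must be one of: " + ", ".join(sorted(LLMS)))
    return problems
```

Calling check_env(dict(os.environ)) in a deploy script (after importing os) and refusing to continue on a non-empty result is one way to use it.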
Path A: EC2 + Docker Compose
The simplest stateful deployment: one instance, one EBS-backed volume, SQLite.
- Launch an instance. A small general-purpose instance (2 vCPU / 4 GB RAM) is plenty; the embedding model working set is a few hundred MB. Use an EBS root volume (or attach a dedicated EBS volume) so /data survives reboots and instance stops.
- Install Docker (and the compose plugin) via your distro's packages.
- Run greatmemory:
# /opt/greatmemory/docker-compose.yml
services:
  greatmemory:
    image: greatmemory # your pushed image tag (e.g. in ECR)
    restart: unless-stopped
    ports:
      - "7437:7437"
    volumes:
      - /var/lib/greatmemory:/data # EBS-backed host path
    environment:
      GM_API_KEYS: ${GM_API_KEYS} # from an env file with mode 0600
      GM_CORS_ORIGINS: https://app.example.com

docker compose up -d
curl -s http://127.0.0.1:7437/v1/readyz
- Expose it - pick one:
  - Behind an ALB (recommended): keep the port mapping as 7437:7437 so the ALB can reach the container on the instance's private address (a 127.0.0.1:7437:7437 mapping would not be reachable from the ALB), register the instance in a target group on port 7437, terminate TLS at the ALB with an ACM certificate, and point the target group health check at /v1/readyz (200 only once storage is reachable and migrated; always unauthenticated - see Upgrades & migrations). The instance security group should then allow 7437 only from the ALB's security group.
  - Behind nginx/Caddy on the instance: terminate TLS locally and proxy to 127.0.0.1:7437 (the port mapping can then be narrowed to 127.0.0.1:7437:7437); open only 443 in the security group.
  - Direct on 7437: acceptable only inside a VPC/VPN. If you must open the port in a security group, restrict the source CIDR and treat GM_API_KEYS as mandatory - there is deliberately no unauthenticated mode worth running on a public address.
- Backups: SQLite is one file - snapshot the EBS volume, or run sqlite3 /var/lib/greatmemory/greatmemory.db ".backup ..." for a live-consistent copy. The model cache under /data/models never needs backing up.
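If you would rather script the live-consistent copy than shell out to sqlite3, Python's standard sqlite3 module exposes SQLite's online backup mechanism through Connection.backup. The following is a minimal sketch, not a supported tool: backup_sqlite is a hypothetical helper, and the database path is assumed to be the one used in the backup step above.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(src, dest):
    """Copy a possibly in-use SQLite database at src to dest via the online backup API."""
    # Open the source read-only so a mistyped path fails instead of creating
    # an empty database, and so the backup cannot modify the original.
    source_uri = Path(src).resolve().as_uri() + "?mode=ro"
    source = sqlite3.connect(source_uri, uri=True)
    target = sqlite3.connect(dest)
    try:
        # Copies every page of the main database in one pass.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

For example, backup_sqlite("/var/lib/greatmemory/greatmemory.db", "/var/backups/greatmemory.db") would write a copy to a destination of your choosing (the destination path here is only an illustration).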
Path B: ECS Fargate + RDS
Fargate tasks are ephemeral, so the durable state should live outside the task. Recommended: RDS for PostgreSQL with the pgvector extension, via GM_DB - then the task needs no persistent volume at all (the /data model cache simply re-downloads on a cold start). EFS mounted at /data with SQLite also works, but a managed Postgres is the better fit for tasks that come and go.
- Create the database: an RDS for PostgreSQL instance (pgvector is available on recent PostgreSQL versions). Enable the extension once:
CREATE EXTENSION IF NOT EXISTS vector;
If your RDS role can't run CREATE EXTENSION (a DBA provisions it for you), have a privileged role create it once and set GM_DB_ASSUME_PGVECTOR=1 - see Enterprise database (pgvector).
- Task definition sketch:
{
  "family": "greatmemory",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "1024",
  "memory": "2048",
  "containerDefinitions": [
    {
      "name": "greatmemory",
      "image": "<account>.dkr.ecr.<region>.amazonaws.com/greatmemory:latest",
      "portMappings": [{ "containerPort": 7437, "protocol": "tcp" }],
      "secrets": [
        { "name": "GM_DB", "valueFrom": "<secretsmanager-arn-for-db-url>" },
        { "name": "GM_API_KEYS", "valueFrom": "<secretsmanager-arn-for-api-keys>" }
      ],
      "environment": [
        { "name": "GM_CORS_ORIGINS", "value": "https://app.example.com" }
      ],
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -sf http://127.0.0.1:7437/v1/readyz || exit 1"]
      }
    }
  ]
}
- Service: run the task in private subnets behind an ALB (TLS via ACM), target group port 7437, health check path /v1/readyz. Put GM_DB and GM_API_KEYS in Secrets Manager rather than plain environment values.
- If you prefer EFS + SQLite instead of RDS: add an EFS volume to the task definition, mount it at /data, omit GM_DB - and keep the service at exactly one task, since a SQLite file must not be written by multiple instances.
- Cold starts: with RDS, the first request after a new task starts triggers the one-time embedding model download into /data. Keep the service's desired count at ≥ 1 (no scale-to-zero) so the model stays warm.
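Because /v1/readyz returns 200 only once storage is reachable and migrated, a deploy script can poll it before sending traffic to a new task or instance. The sketch below shows one way to do that with the standard library. wait_ready and http_status are hypothetical helpers, and the attempt count and delay are arbitrary defaults to tune for your environment.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=2.0):
    """Return the HTTP status of a GET request, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def wait_ready(url, attempts=30, delay=2.0, probe=http_status, sleep=time.sleep):
    """Poll url until it answers 200; return False after the given number of attempts."""
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For example, wait_ready("http://127.0.0.1:7437/v1/readyz") blocks for up to about a minute with the defaults. The probe and sleep parameters exist only so the loop can be exercised without a network.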
Storing the API key in AWS Secrets Manager
Create the secret once:
aws secretsmanager create-secret \
  --name greatmemory/api-keys \
  --secret-string "gm_$(openssl rand -hex 32)"
ECS Fargate - reference it from the task definition (as in Path B); the
task execution role needs secretsmanager:GetSecretValue on the secret's
ARN:
"secrets": [
{ "name": "GM_API_KEYS",
"valueFrom": "arn:aws:secretsmanager:<region>:<account>:secret:greatmemory/api-keys" }
]
EC2 - fetch it at boot into the 0600 env file Compose reads, using the instance role (no credentials on disk):
umask 077
echo "GM_API_KEYS=$(aws secretsmanager get-secret-value \
  --secret-id greatmemory/api-keys \
  --query SecretString --output text)" > /opt/greatmemory/.env
Rotation: write the new value with both keys comma-separated
(gm_new...,gm_old...), redeploy, migrate clients, then drop the old key.
SSM Parameter Store SecureString works the same way at lower cost
(aws ssm get-parameter --with-decryption).
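The rotation step amounts to building a comma-separated value in which the new key sits alongside the existing ones. As a small sketch of that string handling, with hypothetical helper names (new_api_key, rotation_value) and the same gm_ plus 64 hex characters shape that the openssl command above produces:

```python
import secrets


def new_api_key():
    """Generate a key of the form gm_ followed by 64 hex characters (32 random bytes)."""
    return "gm_" + secrets.token_hex(32)


def rotation_value(new_key, current):
    """Return the GM_API_KEYS value for the overlap period: new key first, then existing keys."""
    existing = [k.strip() for k in current.split(",") if k.strip()]
    # Skip the new key if it is already present so it is not listed twice.
    return ",".join([new_key] + [k for k in existing if k != new_key])
```

You would store the result as the new secret value, redeploy, and, once clients have migrated, store the new key on its own.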
Security checklist
- GM_API_KEYS set (long random values, stored in Secrets Manager / an 0600 env file) - never expose the port without it
- TLS terminated by the ALB or a reverse proxy; the container itself speaks plain HTTP on a private network
- GM_CORS_ORIGINS set to your real origins
- Security groups: 7437 reachable only from the load balancer (or VPC-internal callers)
- Durable state on EBS/EFS snapshots or RDS automated backups
- /v1/readyz for health checks; watch rss_bytes from /v1/stats - it should stay flat
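One way to turn "it should stay flat" into an alert is to compare early and late rss_bytes samples. The sketch below assumes you already collect rss_bytes readings from /v1/stats at a regular interval; how you fetch them is left out because the response format is not described here. rss_is_flat is a hypothetical helper and the 10% tolerance is an arbitrary starting point.

```python
from statistics import median


def rss_is_flat(samples, tolerance=0.10):
    """Return True if the later half of the samples has not grown beyond tolerance."""
    if len(samples) < 4:
        # Too few readings to judge a trend either way.
        return True
    half = len(samples) // 2
    early = median(samples[:half])
    late = median(samples[half:])
    # Medians make the comparison robust to a single spike in either half.
    return late <= early * (1 + tolerance)
```

Readings such as 400, 405, 402 and 404 MB would count as flat, while a steady climb from 400 MB to 640 MB would not.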