Configuration reference

Every environment variable, greatmemory.toml key, CLI flag, and feature toggle in one place.

Every knob in one place: the environment variables, the greatmemory.toml file, the gmem CLI flags, and the feature toggles. greatmemory layers its configuration, and a higher-precedence layer overrides the ones below it (highest first):

CLI flags  >  GM_* environment variables  >  greatmemory.toml  >  built-in defaults

CLI flags - gmem serve --host --port --data-dir --config plus the feature toggles below; gmem mcp --data-dir --config. See the CLI reference.
Environment - every GM_* variable in the tables below.
greatmemory.toml - read from the current directory by default (skipped silently when absent); --config <path> selects an explicit file, which must exist. Unknown keys are an error, so typos fail fast.
Defaults - local-first: loopback bind, SQLite, local embeddings, no auth, no LLM.
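
As a rough illustration of that precedence chain, the shell sketch below resolves a single setting by taking the first non-empty value from the flag, the environment, the toml file, and the built-in default. The helpers first_set and resolve_port are hypothetical, and treating an empty string as "not set" is an assumption of this sketch rather than documented greatmemory behavior.

```shell
# Hypothetical helper: print the first non-empty argument.
# Arguments are ordered from highest to lowest precedence.
first_set() {
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Resolve the port: --port value, then GM_PORT, then the toml value, then 7437.
# $1 (flag value) and $2 (toml value) stand in for values parsed elsewhere.
resolve_port() {
  flag_port=$1
  toml_port=$2
  first_set "$flag_port" "${GM_PORT:-}" "$toml_port" 7437
}
```

With GM_PORT=8080 in the environment, `resolve_port 9090 7000` prints 9090 (the flag wins) and `resolve_port "" 7000` prints 8080 (the environment beats the toml value).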

Booleans accept 1/0, true/false, yes/no, on/off (case-insensitive); any other value fails startup. Comma-separated lists (GM_CORS_ORIGINS, GM_API_KEYS) are trimmed and drop empty segments. An unparseable GM_PORT or GM_EMBEDDER_DIM fails startup rather than being silently ignored.
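
The parsing rules above can be approximated in a few lines of shell. This is only a sketch of the documented behavior, not greatmemory's own parser: parse_bool and split_list are hypothetical names, and dropping whitespace-only segments is an assumption that follows from trimming before the empty check.

```shell
# Map an accepted boolean spelling (any case) to true/false.
# Anything else prints an error and returns non-zero, mirroring "fails startup".
parse_bool() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    1|true|yes|on)  echo true ;;
    0|false|no|off) echo false ;;
    *) echo "invalid boolean: $1" >&2; return 1 ;;
  esac
}

# Split a comma-separated list into one item per line:
# trim surrounding whitespace, then drop segments that end up empty.
split_list() {
  printf '%s\n' "$1" \
    | tr ',' '\n' \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}
```

For instance, `parse_bool YES` prints true, `parse_bool maybe` fails, and `split_list " gm_k1, ,gm_k2,, "` prints gm_k1 and gm_k2 on separate lines.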

Server environment variables

Variable	Default	Example	Purpose
`GM_HOST`	`127.0.0.1`	`0.0.0.0`	Bind address for `serve`
`GM_PORT`	`7437`	`8080`	Bind port for `serve`
`GM_DATA_DIR`	`./.greatmemory`	`/var/lib/greatmemory`	Directory for the SQLite db and embedding model cache (created on demand)
`GM_DB`	(→ `<data_dir>/greatmemory.db`)	`postgres://gm:pw@db:5432/gm`	Database selector: a `postgres://`/`postgresql://` URL selects Postgres + pgvector; `:memory:` an in-memory SQLite db; anything else a SQLite file path
`GM_CORS_ORIGINS`	(empty → any localhost)	`https://app.example.com,https://admin.example.com`	Comma-separated exact allowed CORS origins
`GM_API_KEYS`	(empty → no auth)	`gm_k1,gm_k2`	Comma-separated bearer keys for `/v1` and `/mcp`
`GM_EMBEDDER`	`fastembed`	`ollama`	Embedder kind: `fastembed` | `ollama` | `openai` | `fake`
`GM_EMBEDDER_URL`	(per kind)	`http://127.0.0.1:11434`	Base URL for HTTP embedders
`GM_EMBEDDER_API_KEY`	(unset)	`sk-...`	API key for the `openai` embedder (optional)
`GM_EMBEDDER_MODEL`	(unset)	`nomic-embed-text`	Model name (required for `ollama`/`openai`)
`GM_EMBEDDER_DIM`	(unset)	`768`	Embedding dimension (required for `ollama`/`openai`)
`GM_LLM`	`none`	`ollama`	LLM kind for fact extraction & reflection: `none` | `ollama` | `openai`
`GM_LLM_URL`	(per kind)	`https://api.example.com/v1`	Base URL (required for `openai`)
`GM_LLM_API_KEY`	(unset)	`sk-...`	API key for the `openai` LLM (optional)
`GM_LLM_MODEL`	(unset)	`llama3`	Model name (required for `ollama`/`openai`)
`GM_RERANK`	`none`	`lexical`	Second-stage reranker: `none` | `noop` | `lexical`
`GM_GRAPH_EXPAND_HOPS`	`0`	`1`	Knowledge-graph expansion hops during retrieval (`0` disables)
`GM_GRAPH_BACKEND`	`store`	`mysql://…`	Knowledge-graph backend: `store` reuses the main DB, or a separate `mysql://`/`postgres://`/sqlite path
`GM_CARDS_SEMANTIC_LINKING`	`false`	`true`	Embed memory cards on create and auto-link by cosine similarity instead of keyword/tag overlap
`GM_REFLECTION`	`true`	`false`	Prospective reflection (RMM): auto-summarize each ingested document into a memory card. Needs an LLM; a no-op without one
`GM_USEFULNESS`	`true`	`false`	Live-Evo usefulness reweighting: nudge frequently-retrieved memories up the ranking
`GM_TRUST`	`true`	`false`	Trust gating (InjecMEM): quarantine prompt-injection-looking content from the knowledge graph and downgrade its trust score
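
To make the GM_DB selector in the table above concrete, here is a small shell sketch of the documented rules. describe_db is a hypothetical helper, and matching the URL scheme case-sensitively is an assumption of the sketch.

```shell
# Describe which database the current GM_DB / GM_DATA_DIR values would select.
describe_db() {
  db=${GM_DB:-}
  data_dir=${GM_DATA_DIR:-./.greatmemory}
  case "$db" in
    "")                          echo "sqlite file: $data_dir/greatmemory.db" ;;
    postgres://*|postgresql://*) echo "postgres + pgvector" ;;
    :memory:)                    echo "sqlite in-memory" ;;
    *)                           echo "sqlite file: $db" ;;
  esac
}
```

With nothing set, describe_db reports the default file under ./.greatmemory; with GM_DB=:memory: it reports an in-memory SQLite database.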

Memory feature toggles

Three higher-level memory behaviors are on by default and can be turned off independently - via greatmemory.toml, a GM_* env var, or a gmem serve flag (the flag wins, then env, then toml). The research behind each is on the Research & roadmap page.

Feature	toml key	env var	`serve` flags
Prospective reflection (RMM)	`features.reflection`	`GM_REFLECTION`	`--enable-reflection` / `--disable-reflection`
Usefulness reweighting (Live-Evo)	`features.usefulness`	`GM_USEFULNESS`	`--enable-usefulness` / `--disable-usefulness`
Trust gating (InjecMEM)	`features.trust`	`GM_TRUST`	`--enable-trust` / `--disable-trust`

# All three off (ingest-only, no LLM cards, plain relevance ranking, no quarantine):
GM_REFLECTION=false GM_USEFULNESS=false GM_TRUST=false gmem serve
# Same, via flags (flags override env/toml; passing both --enable-X and --disable-X for the same feature is an error):
gmem serve --disable-reflection --disable-usefulness --disable-trust

Reflection - each ingested document is summarized by the LLM into one agentic memory card (title, summary, keywords, tags), auto-linked to related cards. Requires GM_LLM; a no-op without one. Off → ingest without generating cards (you can still create them via /v1/cards or gmem card create).
Usefulness - retrieval keeps a usefulness score per memory, raised when a memory is returned and decayed when it is not, then nudges frequently-used memories up the ranking. Off → purely relevance-ordered ranking.
Trust gating - content that looks like a prompt-injection attempt is stored (auditable) but quarantined from the knowledge graph: no facts/edges extracted, trust score downgraded. Off → no quarantine; use only when every source is trusted.

Other toggles

Behavior	How to enable	Notes
Reranking	`GM_RERANK=lexical`	Off by default (`none`); reorders fused candidates before truncation
Graph expansion	`GM_GRAPH_EXPAND_HOPS=1`	Off by default (`0`); walks the graph to pull in related facts
Semantic card linking	`GM_CARDS_SEMANTIC_LINKING=true`	Off by default; links cards by embedding cosine similarity
Separate graph backend	`GM_GRAPH_BACKEND=mysql://…`	Default `store` reuses the main DB

Hosted model providers

Any OpenAI-compatible endpoint works as the LLM (GM_LLM=openai), the embedder (GM_EMBEDDER=openai), or both. Point GM_LLM_URL or GM_EMBEDDER_URL at the provider's full versioned base URL (for OpenAI itself, https://api.openai.com/v1). greatmemory appends only /chat/completions or /embeddings and sends Authorization: Bearer with the matching GM_*_API_KEY. Because the version path is part of the base URL, the major clouds slot in without code changes even though their version paths differ. The big three have dedicated walkthroughs:

Provider	Auth	LLM	Embeddings	Guide
Google Vertex AI	OAuth token (`gcloud`)	yes	yes	Vertex AI
Amazon Bedrock	Bedrock API key	yes	no (use `fastembed`)	Bedrock
Azure OpenAI	resource API key	yes	yes	Azure OpenAI
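
Before pointing greatmemory at a provider, it can help to check by hand that the base URL really is OpenAI-compatible. The sketch below builds the request URL as described above (base URL plus /chat/completions or /embeddings) and defines a probe for the chat endpoint. endpoint_url and probe_llm are hypothetical helpers, stripping a trailing slash is an assumption of the sketch, and the request body is a generic OpenAI-style chat request, not necessarily what greatmemory sends.

```shell
# Join a versioned base URL and an endpoint name.
# Assumption: one trailing slash on the base is tolerated and removed.
endpoint_url() {
  base=${1%/}
  printf '%s/%s\n' "$base" "$2"
}

# Send a one-message chat request using the same variables greatmemory reads.
# Defined only; call probe_llm manually once GM_LLM_URL, GM_LLM_API_KEY,
# and GM_LLM_MODEL are set in your shell.
probe_llm() {
  curl -sS "$(endpoint_url "$GM_LLM_URL" chat/completions)" \
    -H "Authorization: Bearer $GM_LLM_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"$GM_LLM_MODEL\", \"messages\": [{\"role\": \"user\", \"content\": \"ping\"}]}"
}
```

A JSON response containing a choices array suggests the endpoint will work as GM_LLM=openai; an HTML error page or a 404 usually means the base URL is missing its version path.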

Gemini API (Google AI Studio)

The simplest Google path - one API key from aistudio.google.com, no Google Cloud project (for the project-based platform, see Vertex AI). Base URL https://generativelanguage.googleapis.com/v1beta/openai:

# Fact extraction + prospective reflection via Gemini
GM_LLM=openai \
GM_LLM_URL=https://generativelanguage.googleapis.com/v1beta/openai \
GM_LLM_API_KEY=$GEMINI_API_KEY \
GM_LLM_MODEL=gemini-3.5-flash \
gmem serve

# ...and/or Gemini embeddings (add these to the LLM vars above, or run them on their own):
GM_EMBEDDER=openai \
GM_EMBEDDER_URL=https://generativelanguage.googleapis.com/v1beta/openai \
GM_EMBEDDER_API_KEY=$GEMINI_API_KEY \
GM_EMBEDDER_MODEL=gemini-embedding-001 \
GM_EMBEDDER_DIM=3072 \
gmem serve

Chat models: gemini-3.5-flash (latest GA flash - a good default), gemini-3.1-flash-lite (cheapest), or gemini-2.5-flash / gemini-2.5-pro (current as of June 2026; check the model list for newer GA models).
Embedding model: gemini-embedding-001, default output 3072 dims - set GM_EMBEDDER_DIM=3072 (the dimension must match exactly or ingestion fails).

Embeddings note (all providers). greatmemory requires the returned vector length to equal GM_EMBEDDER_DIM exactly and does not request a reduced dimension, so pick a model whose default output matches the dim you set. Changing the embedding model needs a fresh data dir (existing vectors won't match a new model).

CLI client variables

The client commands (add, search, facts, status) talk to a running server and use two variables of their own:

Variable	Default	Purpose
`GM_URL`	`http://127.0.0.1:7437`	Server base URL
`GM_API_KEY`	(unset)	Bearer token sent when the server has `GM_API_KEYS` configured

greatmemory.toml

All keys optional; this example shows everything, including the feature toggles:

host = "127.0.0.1"
port = 7437
data_dir = "./.greatmemory"
# db = "postgres://gm:password@localhost:5432/greatmemory"
cors_origins = ["http://localhost:3000"]
api_keys = ["gm_change_me"]

[embedder]
kind = "fastembed"          # fastembed | ollama | openai | fake
# base_url = "http://127.0.0.1:11434"
# api_key = "sk-..."
# model = "nomic-embed-text"
# dim = 768

[llm]
kind = "none"               # none | ollama | openai
# base_url = "http://127.0.0.1:11434"
# api_key = "sk-..."
# model = "llama3"

[rerank]
kind = "none"               # none | noop | lexical

[graph]
backend = "store"           # store | mysql://… | postgres://… | sqlite path | :memory:

[cards]
semantic_linking = false

[features]
reflection = true           # prospective reflection (RMM); needs an LLM
usefulness = true           # Live-Evo usefulness reweighting
trust = true                # trust gating (InjecMEM)

graph_expand_hops = 0       # graph expansion during retrieval (0 = off)

Auth & CORS

Set GM_API_KEYS (or api_keys in toml) to one or more bearer keys. When the list is non-empty, the /mcp endpoint and every /v1 route except /v1/healthz require Authorization: Bearer <key>. By default (empty GM_CORS_ORIGINS) the server allows any localhost origin; setting an explicit comma-separated list replaces that policy entirely, so the permissive localhost default no longer applies. See API keys & air-gapped use for rotation and offline operation.

GM_API_KEYS=gm_prod_key GM_CORS_ORIGINS=https://app.example.com gmem serve
GM_API_KEY=gm_prod_key gmem status   # client side
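
The auth rule described in this section can be summarized as a small decision function. This is a sketch only: requires_auth is a hypothetical helper, the exact path matching is an assumption, and paths outside /v1 and /mcp are treated as unauthenticated here only because the text above does not mention them.

```shell
# Print "yes" if a request to the given path would need a bearer key,
# based on whether GM_API_KEYS is non-empty.
requires_auth() {
  path=$1
  keys=${GM_API_KEYS:-}
  # No keys configured: auth is off everywhere.
  [ -n "$keys" ] || { echo no; return; }
  case "$path" in
    /v1/healthz)       echo no ;;   # health check stays open
    /v1/*|/mcp|/mcp/*) echo yes ;;
    *)                 echo no ;;   # assumption: not covered by the docs
  esac
}
```

With GM_API_KEYS=gm_prod_key, `requires_auth /v1/cards` prints yes and `requires_auth /v1/healthz` prints no; with GM_API_KEYS empty, every path prints no.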