MCP server

greatmemory speaks the Model Context Protocol (MCP), so any MCP-capable agent gets persistent long-term memory as a set of five tools. The MCP server is the same binary and the same engine as the HTTP API, so there is nothing separate to install.

Transports

stdio (recommended for local agents)

gmem mcp

Runs the MCP server on stdin/stdout. This mode does not need gmem serve running: it opens the database directly from the configured data dir (GM_DATA_DIR, default ./.greatmemory). Logs go to stderr so stdout stays clean for MCP framing.

Flags: --config <path> (alternate greatmemory.toml) and --data-dir <path>. Because agents may launch the process from any working directory, prefer an absolute GM_DATA_DIR (or --data-dir) so every client sees the same memories.
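
As a minimal sketch of the absolute-path advice above, the helper below builds the launch command for the stdio server. The function name gmem_stdio_command is hypothetical; only the gmem mcp subcommand and the --data-dir flag come from this page.

```python
from pathlib import Path


def gmem_stdio_command(data_dir: str) -> list[str]:
    """Build the argv for `gmem mcp` with an absolute data dir (illustrative helper)."""
    # Resolve the path now, so the result does not depend on whichever
    # working directory the agent later launches the process from.
    absolute = Path(data_dir).expanduser().resolve()
    return ["gmem", "mcp", "--data-dir", str(absolute)]


if __name__ == "__main__":
    print(gmem_stdio_command("./.greatmemory"))
```

The resulting list can be passed to subprocess.Popen with stdin and stdout set to subprocess.PIPE; stderr is best left unpiped or sent to a log file, since that is where the server writes its logs.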

Streamable HTTP

A running gmem serve also exposes MCP at POST /mcp (http://127.0.0.1:7437/mcp), in stateless JSON mode, behind the same bearer auth as /v1 when GM_API_KEYS is set. Use this when you want several clients sharing one live instance.

In v0.1 the HTTP transport keeps its loopback Host validation (DNS-rebinding protection), so remote /mcp deployments are out of scope. Use stdio or a local HTTP connection instead.
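
A hedged sketch of what a local HTTP client might send: it builds, but does not send, a JSON-RPC tools/call request for the /mcp endpoint. The helper name build_mcp_request is hypothetical. Two assumptions are not confirmed by this page: that a stateless server accepts tools/call without a prior initialize exchange, and that the bearer key travels in a standard Authorization: Bearer header.

```python
import json
import urllib.request
from typing import Optional

MCP_URL = "http://127.0.0.1:7437/mcp"  # loopback endpoint named in this section


def build_mcp_request(tool: str, arguments: dict,
                      api_key: Optional[str] = None,
                      request_id: int = 1) -> urllib.request.Request:
    """Build (without sending) a JSON-RPC tools/call POST for the /mcp endpoint."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    headers = {
        "Content-Type": "application/json",
        # Streamable HTTP clients advertise both response types.
        "Accept": "application/json, text/event-stream",
    }
    if api_key is not None:
        # Assumption: only needed when GM_API_KEYS is set on the server.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would perform the call against a running gmem serve.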

The five tools

Every tool that takes a space parameter defaults it to "default". Spaces let different projects, users, or agents keep separate memories.

remember

Store a piece of information in long-term memory. Returns the new memory's id.

Param    Type    Required  Default
content  string  yes
space    string  no        "default"

{"name": "remember", "arguments": {"content": "Deploys go through the release branch.", "space": "myproject"}}

Returns:

{"id": "0196a7c2-...", "space": "myproject"}
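
Clients that talk raw JSON-RPC receive tool results wrapped in an MCP result envelope. The sketch below unwraps one, assuming greatmemory returns its JSON as a single text content item (the usual MCP shape, but not confirmed by this page). The helper name tool_payload and the sample response are illustrative.

```python
import json


def tool_payload(response: dict) -> dict:
    """Extract the JSON payload from a JSON-RPC tools/call response."""
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "MCP error"))
    result = response["result"]
    if result.get("isError"):
        raise RuntimeError("tool reported an error")
    # Assumption: the first text content item carries the JSON document.
    text = next(item["text"] for item in result["content"]
                if item.get("type") == "text")
    return json.loads(text)


# Hypothetical response, carrying the `remember` result shown above.
sample = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [
        {"type": "text", "text": '{"id": "0196a7c2-...", "space": "myproject"}'},
    ]},
}
memory_id = tool_payload(sample)["id"]  # keep this id if you may need `forget` later
```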

recall

Search long-term memory; returns the most relevant chunks and known facts as JSON.

Param  Type    Required  Default
query  string  yes
space  string  no        "default"
k      int     no        8 (max chunks returned)

{"name": "recall", "arguments": {"query": "how do we deploy?", "space": "myproject", "k": 5}}

Returns:

{
  "chunks": [{"chunk_id": "...", "doc_id": "...", "text": "Deploys go through the release branch.", "score": 0.021}],
  "facts": []
}
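
One way to consume a recall result, sketched under the assumption that a higher score means a more relevant chunk (this page does not state the score's direction). The helper name top_texts and the sample values are made up for illustration.

```python
def top_texts(recall_result: dict, min_score: float = 0.0, limit: int = 3) -> list[str]:
    """Return the text of the best-scoring chunks from a decoded recall result."""
    chunks = [c for c in recall_result.get("chunks", []) if c["score"] >= min_score]
    # Assumption: larger score = more relevant, so sort descending.
    chunks.sort(key=lambda c: c["score"], reverse=True)
    return [c["text"] for c in chunks[:limit]]


# Illustrative payload in the shape shown above; ids and scores are invented.
example = {
    "chunks": [
        {"chunk_id": "c1", "doc_id": "d1", "text": "Hotfixes skip staging.", "score": 0.012},
        {"chunk_id": "c2", "doc_id": "d2", "text": "Deploys go through the release branch.", "score": 0.021},
    ],
    "facts": [],
}
best = top_texts(example, limit=1)
```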

get_context

Build a ready-to-use context block for a query: known facts first, then relevant memories, within a token budget. Returns plain text, not JSON — made to be pasted straight into a prompt. Usually the best single call before answering.

Param       Type    Required  Default
query       string  yes
space       string  no        "default"
max_tokens  int     no        2000

{"name": "get_context", "arguments": {"query": "deployment process", "max_tokens": 1000}}
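
Because get_context returns plain text, using it can be as simple as string assembly. The sketch below shows one possible prompt layout; the function name build_prompt and the section labels are illustrative choices, not something greatmemory prescribes.

```python
def build_prompt(context: str, question: str) -> str:
    """Prepend a get_context block to the user's question (illustrative layout)."""
    context = context.strip()
    if not context:
        # Nothing relevant was remembered: send the question unchanged.
        return question
    return f"Relevant memory:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("Deploys go through the release branch.", "How do we deploy?")
```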

get_profile

Summarize everything known in a space: active facts grouped by predicate, as JSON. Call it once at session start for standing context.

Param  Type    Required  Default
space  string  no        "default"

Returns:

{"facts": {"deploy_branch": [{"subject": "myproject", "object": "release", "confidence": 0.9, "fact_id": "..."}]}}
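
A small sketch of turning the grouped profile into one line per fact, for example to seed a system prompt at session start. The helper name profile_lines and the "subject predicate object" rendering are assumptions made for illustration.

```python
def profile_lines(profile: dict, min_confidence: float = 0.0) -> list[str]:
    """Flatten a decoded get_profile result into 'subject predicate object' lines."""
    lines = []
    # Predicates are sorted so the output is stable between calls.
    for predicate, facts in sorted(profile.get("facts", {}).items()):
        for fact in facts:
            if fact["confidence"] >= min_confidence:
                lines.append(f'{fact["subject"]} {predicate} {fact["object"]}')
    return lines


# The result shown above, as a Python dict.
profile = {"facts": {"deploy_branch": [
    {"subject": "myproject", "object": "release", "confidence": 0.9, "fact_id": "..."},
]}}
lines = profile_lines(profile)
```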

forget

Delete a memory by id. Facts already extracted from it are kept but lose their link to it.

Param      Type    Required
memory_id  string  yes (an id returned by remember)

Returns:

{"deleted": "<id>"}

Space conventions

  • space is a plain namespace string. Use one space per project (local multi-project use) or one space per user (multi-user servers) so memories don't bleed across contexts.
  • A sensible rhythm for an agent: get_profile once at session start, get_context per question, remember whenever something durable is learned, forget only on explicit request.
  • Don't store transient chit-chat — greatmemory's fact extractor also ignores it.
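
The rhythm described above can be sketched as a thin session wrapper. Everything here is illustrative: MemorySession is a hypothetical class, and call_tool stands for whatever function your MCP client exposes to invoke a tool and return its decoded result.

```python
from typing import Any, Callable

# Hypothetical transport signature: (tool name, arguments) -> decoded result.
ToolCall = Callable[[str, dict], Any]


class MemorySession:
    """One agent session bound to a single space (illustrative sketch)."""

    def __init__(self, call_tool: ToolCall, space: str) -> None:
        self.call_tool = call_tool
        self.space = space
        # Once at session start: standing context for the whole session.
        self.profile = call_tool("get_profile", {"space": space})

    def context_for(self, question: str, max_tokens: int = 2000) -> str:
        # Per question: a ready-to-paste context block.
        return self.call_tool("get_context", {
            "query": question, "space": self.space, "max_tokens": max_tokens,
        })

    def learn(self, content: str) -> str:
        # Whenever something durable is learned; returns the new memory id.
        return self.call_tool("remember", {"content": content, "space": self.space})["id"]

    def forget(self, memory_id: str) -> None:
        # Only on explicit request from the user.
        self.call_tool("forget", {"memory_id": memory_id})
```

Binding the space once in the constructor keeps every call in the same namespace, which is the simplest way to avoid memories bleeding across projects or users.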

Client setup guides