MCP server

greatmemory speaks the Model Context Protocol (MCP), so any MCP-capable agent gets persistent long-term memory as a set of five tools. The MCP server is the same binary and the same engine as the HTTP API, so there is nothing separate to install.

Transports

stdio (recommended for local agents)

gmem mcp

Runs the MCP server on stdin/stdout. This mode does not need gmem serve running: it opens the database directly from the configured data dir (GM_DATA_DIR, default ./.greatmemory). Logs go to stderr so stdout stays clean for MCP framing.

Flags: --config <path> (alternate greatmemory.toml) and --data-dir <path>. Because agents may launch the process from any working directory, prefer an absolute GM_DATA_DIR (or --data-dir) so every client sees the same memories.
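As a rough illustration of the absolute-path advice, the sketch below builds a launch entry for gmem mcp with the data directory resolved up front. The "mcpServers" layout and the "greatmemory" key are assumptions about what a client's config file might look like, not something greatmemory defines; only the gmem mcp command and the --data-dir flag come from this page.

```python
import json
from pathlib import Path


def stdio_server_entry(data_dir: str) -> dict:
    """Build a launch entry for `gmem mcp` with an absolute data directory."""
    # Resolve the path once, here, so the working directory the agent
    # happens to spawn the process from no longer matters.
    absolute = Path(data_dir).expanduser().resolve()
    return {
        "command": "gmem",
        "args": ["mcp", "--data-dir", str(absolute)],
    }


if __name__ == "__main__":
    # "mcpServers" and "greatmemory" are hypothetical names; check the
    # config format your MCP client actually expects.
    config = {"mcpServers": {"greatmemory": stdio_server_entry("./.greatmemory")}}
    print(json.dumps(config, indent=2))
```

Generating the entry this way means two clients configured from different directories still point at the same database.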

Streamable HTTP

A running gmem serve also exposes MCP at POST /mcp (http://127.0.0.1:7437/mcp), in stateless JSON mode, behind the same bearer auth as /v1 when GM_API_KEYS is set. Use this when you want several clients sharing one live instance.

In v0.1 the HTTP transport keeps its loopback Host validation (DNS-rebinding protection), so remote /mcp deployments are out of scope — use stdio, or a local HTTP connection.
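For a local HTTP connection, a tool call is an ordinary JSON-RPC tools/call message posted to /mcp. The sketch below only builds the request and does not send it. It assumes the bearer token is one of the keys configured in GM_API_KEYS, and it does not cover whether the server expects an initialize exchange first; build_tool_call is a hypothetical helper, not part of greatmemory.

```python
import json
import urllib.request

MCP_URL = "http://127.0.0.1:7437/mcp"  # loopback address given above


def build_tool_call(name, arguments, api_key=None, request_id=1):
    """Build (but do not send) a JSON-RPC tools/call request for POST /mcp."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    headers = {
        "Content-Type": "application/json",
        # Streamable HTTP clients advertise both response types.
        "Accept": "application/json, text/event-stream",
    }
    if api_key:
        # Assumption: the token is one of the values in GM_API_KEYS.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it to a running gmem serve on the same machine.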

The five tools

Every tool that takes a space argument defaults it to "default". Spaces let different projects, users, or agents keep separate memories.

remember

Store a piece of information in long-term memory. Returns the new memory's id.

Param	Type	Required	Default
`content`	string	yes	—
`space`	string	no	`"default"`

{"name": "remember", "arguments": {"content": "Deploys go through the release branch.", "space": "myproject"}}

Returns:

{"id": "0196a7c2-...", "space": "myproject"}

recall

Search long-term memory; returns the most relevant chunks and known facts as JSON.

Param	Type	Required	Default
`query`	string	yes	—
`space`	string	no	`"default"`
`k`	int	no	`8` (max chunks returned)

{"name": "recall", "arguments": {"query": "how do we deploy?", "space": "myproject", "k": 5}}

Returns:

{
  "chunks": [{"chunk_id": "...", "doc_id": "...", "text": "Deploys go through the release branch.", "score": 0.021}],
  "facts": []
}
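A client will usually want just the chunk texts out of that result. The helper below is a minimal sketch of one way to do it; top_chunks and its min_score cutoff are hypothetical, and it assumes a higher score means a more relevant chunk, which this page does not state.

```python
import json


def top_chunks(recall_json: str, min_score: float = 0.0) -> list[str]:
    """Return chunk texts from a recall result, highest score first."""
    result = json.loads(recall_json)
    # Assumption: higher score = more relevant. Drop anything below the cutoff.
    chunks = [c for c in result.get("chunks", []) if c["score"] >= min_score]
    chunks.sort(key=lambda c: c["score"], reverse=True)
    return [c["text"] for c in chunks]
```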

get_context

Build a ready-to-use context block for a query: known facts first, then relevant memories, within a token budget. Returns plain text, not JSON — made to be pasted straight into a prompt. Usually the best single call before answering.

Param	Type	Required	Default
`query`	string	yes	—
`space`	string	no	`"default"`
`max_tokens`	int	no	`2000`

{"name": "get_context", "arguments": {"query": "deployment process", "max_tokens": 1000}}

get_profile

Summarize everything known in a space: active facts grouped by predicate, as JSON. Call it once at session start for standing context.

Param	Type	Required	Default
`space`	string	no	`"default"`

Returns:

{"facts": {"deploy_branch": [{"subject": "myproject", "object": "release", "confidence": 0.9, "fact_id": "..."}]}}
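To use the profile as standing context, an agent might flatten it into one line per fact. This is only a sketch based on the field names in the example above; profile_lines and the line format are assumptions, not something greatmemory provides.

```python
def profile_lines(profile: dict) -> list[str]:
    """Flatten a get_profile result into one readable line per fact."""
    lines = []
    # Facts arrive grouped by predicate; sort the groups for stable output.
    for predicate, facts in sorted(profile.get("facts", {}).items()):
        for fact in facts:
            lines.append(
                f"{fact['subject']} {predicate} {fact['object']} "
                f"(confidence {fact['confidence']})"
            )
    return lines
```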

forget

Delete a memory by id. Facts already extracted from it are kept but lose their link to it.

Param	Type	Required
`memory_id`	string	yes (an id returned by `remember`)

Returns:

{"deleted": "<id>"}

Space conventions

space is a plain namespace string. Use one space per project (local multi-project use) or one space per user (multi-user servers) so memories don't bleed across contexts.
A sensible rhythm for an agent: get_profile once at session start, get_context per question, remember whenever something durable is learned, forget only on explicit request.
Don't store transient chit-chat — greatmemory's fact extractor also ignores it.
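The rhythm described above can be sketched as a thin wrapper around whatever function your MCP client uses to invoke a tool. MemorySession and the call_tool callable are hypothetical; only the tool names and argument names come from this page.

```python
from typing import Callable

# Hypothetical signature: (tool name, arguments) -> result text.
ToolCall = Callable[[str, dict], str]


class MemorySession:
    """One agent session bound to a single space."""

    def __init__(self, call_tool: ToolCall, space: str = "default"):
        self.call_tool = call_tool
        self.space = space
        # Once per session: standing context for the whole conversation.
        self.profile = call_tool("get_profile", {"space": space})

    def context_for(self, question: str, max_tokens: int = 2000) -> str:
        # Per question: a prompt-ready block within the token budget.
        return self.call_tool(
            "get_context",
            {"query": question, "space": self.space, "max_tokens": max_tokens},
        )

    def learn(self, content: str) -> str:
        # Only for durable information, not transient chat.
        return self.call_tool("remember", {"content": content, "space": self.space})

    def forget(self, memory_id: str) -> str:
        # Only on explicit request; takes an id returned by remember.
        return self.call_tool("forget", {"memory_id": memory_id})
```

Binding the space in the constructor keeps every call in the same namespace, so memories from one project or user cannot leak into another by accident.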
