greatmemory speaks the Model Context Protocol, so any MCP-capable agent gets persistent long-term memory as a set of five tools. The MCP server is the same binary and the same engine as the HTTP API — no separate install.
Transports
stdio (recommended for local agents)
gmem mcp
Runs the MCP server on stdin/stdout. This mode does not need gmem serve running: it opens the database directly from the configured data dir (GM_DATA_DIR, default ./.greatmemory). Logs go to stderr so stdout stays clean for MCP framing.
Flags: --config <path> (alternate greatmemory.toml) and --data-dir <path>. Because agents may launch the process from any working directory, prefer an absolute GM_DATA_DIR (or --data-dir) so every client sees the same memories.
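The advice above about absolute paths can be applied on the client side before the process is spawned. A minimal Python sketch: the stdio_launch_spec helper and the command/args/env dict layout are hypothetical, and only gmem mcp, --data-dir and GM_DATA_DIR come from the text above.

```python
import os


def stdio_launch_spec(data_dir: str, use_flag: bool = True) -> dict:
    """Build a launch spec for `gmem mcp` with an absolute data dir.

    Hypothetical helper: the returned dict shape is an assumption,
    not a greatmemory or MCP-client format.
    """
    # Resolve relative paths (and ~) once, so the working directory the
    # agent happens to launch from no longer matters.
    abs_dir = os.path.abspath(os.path.expanduser(data_dir))
    args = ["mcp"]
    env = {}
    if use_flag:
        args += ["--data-dir", abs_dir]
    else:
        env["GM_DATA_DIR"] = abs_dir
    return {"command": "gmem", "args": args, "env": env}
```

A client could then start the server with something like subprocess.Popen([spec["command"], *spec["args"]], env={**os.environ, **spec["env"]}, stdin=subprocess.PIPE, stdout=subprocess.PIPE).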
Streamable HTTP
A running gmem serve also exposes MCP at POST /mcp (http://127.0.0.1:7437/mcp), in stateless JSON mode, behind the same bearer auth as /v1 when GM_API_KEYS is set. Use this when you want several clients sharing one live instance.
In v0.1 the HTTP transport keeps its loopback Host validation (DNS-rebinding protection), so remote /mcp deployments are out of scope: use stdio, or a local HTTP connection.
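For clients that do not ship an MCP library, a tool call over the HTTP transport is an ordinary JSON-RPC POST. The sketch below only builds the request and does not send it. The tools/call envelope follows the MCP specification. The exact bearer header form, and whether the server accepts a tool call without a prior initialize request in stateless mode, are assumptions. build_tool_call is a hypothetical helper name.

```python
import json
import urllib.request

MCP_URL = "http://127.0.0.1:7437/mcp"  # default local endpoint from the text


def build_tool_call(name, arguments, api_key=None, request_id=1):
    """Build (but do not send) an MCP tools/call request for gmem serve."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    headers = {
        "Content-Type": "application/json",
        # Streamable HTTP clients advertise both response types.
        "Accept": "application/json, text/event-stream",
    }
    if api_key:
        # Assumption: the usual "Authorization: Bearer <key>" form.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it to a running gmem serve. Leave api_key unset when GM_API_KEYS is not configured.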
The five tools
Every space parameter defaults to "default". Spaces let different projects, users, or agents keep separate memories.
remember
Store a piece of information in long-term memory. Returns the new memory's id.
| Param | Type | Required | Default |
|---|---|---|---|
| content | string | yes | — |
| space | string | no | "default" |
{"name": "remember", "arguments": {"content": "Deploys go through the release branch.", "space": "myproject"}}
Returns:
{"id": "0196a7c2-...", "space": "myproject"}
recall
Search long-term memory; returns the most relevant chunks and known facts as JSON.
| Param | Type | Required | Default |
|---|---|---|---|
| query | string | yes | — |
| space | string | no | "default" |
| k | int | no | 8 (max chunks returned) |
{"name": "recall", "arguments": {"query": "how do we deploy?", "space": "myproject", "k": 5}}
Returns:
{
  "chunks": [{"chunk_id": "...", "doc_id": "...", "text": "Deploys go through the release branch.", "score": 0.021}],
  "facts": []
}
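A client usually wants just the chunk texts from this result. A small Python sketch, assuming the result arrives as a JSON string shaped like the example above and that a higher score means a more relevant chunk (the text does not define the score scale, so the min_score threshold is purely illustrative). top_chunk_texts is a hypothetical helper.

```python
import json


def top_chunk_texts(recall_result: str, min_score: float = 0.0) -> list[str]:
    """Return chunk texts from a recall result, best score first."""
    data = json.loads(recall_result)
    # Assumption: higher score = more relevant.
    chunks = [c for c in data.get("chunks", []) if c["score"] >= min_score]
    chunks.sort(key=lambda c: c["score"], reverse=True)
    return [c["text"] for c in chunks]
```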
get_context
Build a ready-to-use context block for a query: known facts first, then relevant memories, within a token budget. Returns plain text, not JSON — made to be pasted straight into a prompt. Usually the best single call before answering.
| Param | Type | Required | Default |
|---|---|---|---|
| query | string | yes | — |
| space | string | no | "default" |
| max_tokens | int | no | 2000 |
{"name": "get_context", "arguments": {"query": "deployment process", "max_tokens": 1000}}
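Since get_context returns plain text, using it can be as simple as placing it ahead of the question. A minimal sketch: build_prompt and the prompt wording are illustrative choices, not part of greatmemory.

```python
def build_prompt(context_text: str, question: str) -> str:
    """Place the text returned by get_context ahead of the user's question."""
    context_text = context_text.strip()
    if not context_text:
        # Nothing relevant came back: ask the question on its own.
        return question
    return (
        "Relevant long-term memory:\n"
        f"{context_text}\n\n"
        f"Question: {question}"
    )
```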
get_profile
Summarize everything known in a space: active facts grouped by predicate, as JSON. Call it once at session start for standing context.
| Param | Type | Required | Default |
|---|---|---|---|
| space | string | no | "default" |
Returns:
{"facts": {"deploy_branch": [{"subject": "myproject", "object": "release", "confidence": 0.9, "fact_id": "..."}]}}
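To turn the grouped facts into standing context, a client might flatten them into one line per fact. A sketch assuming the result is a JSON string shaped like the example above. profile_lines and the "subject predicate object" line format are hypothetical.

```python
import json


def profile_lines(profile_result: str, min_confidence: float = 0.0) -> list[str]:
    """Flatten a get_profile result into 'subject predicate object' lines."""
    data = json.loads(profile_result)
    lines = []
    # Sort predicates so the output is stable between calls.
    for predicate, facts in sorted(data.get("facts", {}).items()):
        for fact in facts:
            if fact["confidence"] >= min_confidence:
                lines.append(f'{fact["subject"]} {predicate} {fact["object"]}')
    return lines
```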
forget
Delete a memory by id. Facts already extracted from it are kept but lose their link to it.
| Param | Type | Required |
|---|---|---|
| memory_id | string | yes (an id returned by remember) |
Returns:
{"deleted": "<id>"}
Space conventions
- space is a plain namespace string. Use one space per project (local multi-project use) or one space per user (multi-user servers) so memories don't bleed across contexts.
- A sensible rhythm for an agent: get_profile once at session start, get_context per question, remember whenever something durable is learned, forget only on explicit request.
- Don't store transient chit-chat; greatmemory's fact extractor also ignores it.
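The rhythm described above can be sketched as a thin wrapper around whatever function the agent uses to invoke MCP tools. MemorySession and the call_tool signature are hypothetical. Only the tool names and their parameters come from this page.

```python
from typing import Callable, Optional

# Hypothetical signature: (tool name, arguments) -> result text.
ToolCaller = Callable[[str, dict], str]


class MemorySession:
    """One agent session against a single space."""

    def __init__(self, call_tool: ToolCaller, space: str = "default"):
        self.call_tool = call_tool
        self.space = space
        # Once at session start: standing context.
        self.profile = call_tool("get_profile", {"space": space})

    def context_for(self, question: str, max_tokens: int = 2000) -> str:
        # Per question: a ready-to-paste context block.
        return self.call_tool(
            "get_context",
            {"query": question, "space": self.space, "max_tokens": max_tokens},
        )

    def learned(self, content: str) -> str:
        # Only for durable information, not transient chit-chat.
        return self.call_tool("remember", {"content": content, "space": self.space})

    def forget(self, memory_id: str, user_confirmed: bool) -> Optional[str]:
        # Delete only on explicit request.
        if not user_confirmed:
            return None
        return self.call_tool("forget", {"memory_id": memory_id})
```

Injecting call_tool keeps the wrapper independent of the transport, so the same session logic works over stdio or a local HTTP connection.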