Amazon Bedrock
Use Amazon Bedrock for fact extraction and reflection via the OpenAI-compatible Chat Completions endpoint and a Bedrock API key.
Use Amazon Bedrock as greatmemory's LLM for fact extraction and prospective
reflection. Bedrock exposes an OpenAI-compatible Chat Completions endpoint, so
greatmemory talks to it through the openai provider kind with no code changes.
Embeddings: Bedrock's OpenAI-compatible endpoint provides chat only - there is no
/embeddings route (Titan/Cohere embeddings are reachable only via Bedrock's native InvokeModel API, which is not OpenAI-shaped). So use Bedrock for the LLM, and keep greatmemory's local fastembed embedder (the default) - or another OpenAI-compatible embedder - for vectors.
How it fits
greatmemory POSTs to {GM_LLM_URL}/chat/completions with an
Authorization: Bearer token. GM_LLM_URL is the full versioned base -
greatmemory appends only /chat/completions.
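To check the endpoint, key, and model before starting greatmemory, the same request shape can be sent by hand. This is a minimal sketch, not greatmemory's own code: chat_completions_url and bedrock_smoke_test are hypothetical helper names, the trailing-slash trimming is an added convenience, and the request body is the standard OpenAI Chat Completions shape with a one-message prompt.

```shell
# Hypothetical helper: form the request URL by appending /chat/completions
# to the versioned base. A trailing slash on the base is trimmed first.
chat_completions_url() {
  printf '%s/chat/completions\n' "${1%/}"
}

# Hypothetical smoke test: send one small chat request with the bearer token.
# Needs curl, network access, and GM_LLM_URL, GM_LLM_API_KEY, GM_LLM_MODEL set.
# It is only defined here; nothing is sent until you call it.
bedrock_smoke_test() {
  curl -sS --fail "$(chat_completions_url "$GM_LLM_URL")" \
    -H "Authorization: Bearer $GM_LLM_API_KEY" \
    -H 'Content-Type: application/json' \
    -d "{\"model\":\"$GM_LLM_MODEL\",\"messages\":[{\"role\":\"user\",\"content\":\"Reply with OK\"}]}"
}
```

With the variables from the Configuration section exported, running bedrock_smoke_test should print a JSON response if the three values line up; with --fail, an HTTP error status makes curl exit non-zero instead.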
Prerequisites
- An AWS account with Amazon Bedrock enabled in a supported region (e.g. us-west-2) and model access granted for the model you intend to call.
- A Bedrock API key (see below).
Authentication
Bedrock's OpenAI-compatible endpoint authenticates with a Bedrock API key passed as a bearer token - no AWS SigV4 signing required. Create one from the Bedrock console (or CLI) and export it:
export AWS_BEARER_TOKEN_BEDROCK=<your-bedrock-api-key>
- Long-term keys last until a configured expiry - the practical choice for a long-running server's static Authorization header.
- Short-term keys last up to 12 hours and inherit the generating principal's permissions - better for production, but you must refresh them.
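A small preflight check can catch an unset key before the server starts. The sketch below is an assumption, not part of greatmemory: require_bedrock_key is a hypothetical helper, and it only checks that the variable is non-empty, not that the key is still valid or unexpired.

```shell
# Hypothetical preflight: fail early when the Bedrock API key is not exported.
# This does not contact AWS, so an expired key still passes this check.
require_bedrock_key() {
  if [ -z "${AWS_BEARER_TOKEN_BEDROCK:-}" ]; then
    echo "AWS_BEARER_TOKEN_BEDROCK is not set" >&2
    return 1
  fi
}
```

In a launch script it could run just before the server command, for example as require_bedrock_key || exit 1.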
Configuration
REGION=us-west-2
GM_LLM=openai \
GM_LLM_URL=https://bedrock-runtime.$REGION.amazonaws.com/openai/v1 \
GM_LLM_API_KEY=$AWS_BEARER_TOKEN_BEDROCK \
GM_LLM_MODEL=openai.gpt-oss-20b-1:0 \
gmem serve
The region is required in the host; there is no region-less endpoint.
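Since the host always carries the region, it can help to build the base URL in one place and refuse an empty region. A minimal sketch, assuming the host pattern shown above; bedrock_openai_base is a hypothetical helper name.

```shell
# Hypothetical helper: build the OpenAI-compatible base URL for one region.
# An empty region is rejected because there is no region-less endpoint.
bedrock_openai_base() {
  if [ -z "${1:-}" ]; then
    echo "bedrock_openai_base: region is required" >&2
    return 1
  fi
  printf 'https://bedrock-runtime.%s.amazonaws.com/openai/v1\n' "$1"
}
```

For example, GM_LLM_URL=$(bedrock_openai_base "$REGION") produces the same value as the configuration above when REGION=us-west-2.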
Models
The OpenAI-compatible Chat Completions route accepts the OpenAI open-weight models hosted on Bedrock. These are current as of June 2026; check the Bedrock model catalog for the latest.
| Role | model value | Notes |
|---|---|---|
| Chat (default) | openai.gpt-oss-20b-1:0 | Smaller, lower latency - cheap default |
| Chat (larger) | openai.gpt-oss-120b-1:0 | Higher quality |
Anthropic Claude and other Bedrock models are served through Bedrock's native Converse/Invoke APIs rather than this OpenAI Chat Completions route, so use the
openai.gpt-oss-* IDs here. greatmemory's fact-extraction prompt asks for strict JSON, which these models follow; OpenAI-style response_format JSON mode is not documented for them, so rely on the prompt. The LLM remains optional: without it you still get full hybrid search, just no extracted facts.
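A launch script can guard against a model ID from another family being set by mistake. This is a sketch under the assumption that only the openai.gpt-oss-* IDs listed on this page are wanted; is_gpt_oss_model is a hypothetical helper and does not query Bedrock for what is actually available.

```shell
# Hypothetical guard: accept only model IDs in the openai.gpt-oss-* family,
# the IDs this page lists for the OpenAI-compatible route.
is_gpt_oss_model() {
  case "$1" in
    openai.gpt-oss-*) return 0 ;;
    *) return 1 ;;
  esac
}
```

It only matches on the ID prefix, so a mistyped size or version suffix still passes and will surface as an error from Bedrock instead.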
Ingest large AWS sources
Bedrock config controls the LLM. Source ingestion is separate: read from S3 or AWS
databases, extract the text you want agents to recall, then POST it to
greatmemory. Adds are asynchronous, so a backfill process can stream documents or
rows into /v1/memories.
S3 documents
Use aws s3 sync for prefixes of many files, or aws s3 cp for one object. The AWS
CLI reference describes sync as recursively copying new and updated files between
S3 and a local directory.
export GM_URL=http://127.0.0.1:7437
export SPACE=aws-knowledge-base
mkdir -p /tmp/gm-s3
aws s3 sync s3://my-company-docs/policies /tmp/gm-s3
# POST each text-like file as one memory, prefixed with its path under the sync directory.
find /tmp/gm-s3 -type f \( -name '*.md' -o -name '*.txt' -o -name '*.json' -o -name '*.csv' \) -print0 |
  while IFS= read -r -d '' file; do
    jq -n --rawfile content "$file" \
      --arg space "$SPACE" \
      --arg source "s3:${file#/tmp/gm-s3/}" \
      '{space:$space, content:("SOURCE: " + $source + "\n\n" + $content)}' |
      curl -sS "$GM_URL/v1/memories" \
        -H 'Content-Type: application/json' \
        -d @- >/dev/null
  done
For PDF, DOCX, and other binary document formats, extract text first and include the S3 key in the content prefix so retrieval can point back to the source object.
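When text is extracted from a local copy, the full S3 location can be rebuilt from the sync source and the local path. A minimal sketch; s3_uri_for is a hypothetical helper, and it assumes the file was downloaded with aws s3 sync from the given source into the given directory, so relative paths match.

```shell
# Hypothetical helper: map a synced local file back to its S3 URI.
#   $1 = S3 source passed to `aws s3 sync` (e.g. s3://my-company-docs/policies)
#   $2 = local sync directory (e.g. /tmp/gm-s3)
#   $3 = local file path under that directory
# The body runs in a subshell so its variables do not leak into the caller.
s3_uri_for() (
  src="${1%/}"
  root="${2%/}"
  rel="${3#"$root"/}"
  printf '%s/%s\n' "$src" "$rel"
)
```

The result can be used as the SOURCE prefix in the content field, in place of the shorter relative label, so a retrieved memory names the bucket as well as the key.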
RDS or Aurora PostgreSQL rows
Use psql against Amazon RDS for PostgreSQL or Aurora PostgreSQL. AWS documents
connecting to RDS PostgreSQL with psql by providing the DB endpoint, port,
credentials, and database name.
export GM_URL=http://127.0.0.1:7437
export SPACE=aws-rds
export PGHOST=mydb.abc123.us-west-2.rds.amazonaws.com
export PGPORT=5432
export PGDATABASE=appdb
export PGUSER=readonly
# Emit one JSON object per row (-A unaligned, -t tuples only), then POST each row.
psql -At -c "
  select json_build_object(
    'account_id', account_id,
    'title', title,
    'notes', notes,
    'updated_at', updated_at
  )
  from account_notes
  where updated_at > now() - interval '180 days';
" |
  while IFS= read -r row; do
    account=$(jq -r '.account_id' <<<"$row")
    jq -n \
      --arg space "$SPACE" \
      --arg account "$account" \
      --argjson row "$row" \
      '{space:$space, content:("RDS account note for " + $account + ":\n" + ($row|tojson))}' |
      curl -sS "$GM_URL/v1/memories" \
        -H 'Content-Type: application/json' \
        -d @- >/dev/null
  done
DynamoDB items
For DynamoDB, use scan for a controlled backfill or query for a partitioned
job. AWS documents Scan as reading every item in a table or secondary index, so
prefer a projection and segment the job for large tables.
export GM_URL=http://127.0.0.1:7437
export SPACE=aws-dynamodb
# Scan with a projection, flatten DynamoDB's typed attributes, then POST each item.
aws dynamodb scan \
  --table-name CustomerCases \
  --projection-expression 'customerId, caseId, summary, updatedAt' \
  --output json |
  jq -c '.Items[] | {
    customerId: .customerId.S,
    caseId: .caseId.S,
    summary: .summary.S,
    updatedAt: .updatedAt.S
  }' |
  while IFS= read -r item; do
    customer=$(jq -r '.customerId' <<<"$item")
    jq -n \
      --arg space "$SPACE" \
      --arg customer "$customer" \
      --argjson item "$item" \
      '{space:$space, content:("DynamoDB case memory for " + $customer + ":\n" + ($item|tojson))}' |
      curl -sS "$GM_URL/v1/memories" \
        -H 'Content-Type: application/json' \
        -d @- >/dev/null
  done
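For large tables, the scan can be split with the AWS CLI's --total-segments and --segment options so several workers each read one slice. The sketch below is one possible shape, not a tuned job: scan_segment and scan_all_segments are hypothetical helpers, and AWS_CLI is an assumed override variable (defaulting to aws) that allows a dry run with AWS_CLI=echo.

```shell
# Hypothetical wrapper: scan one segment of a parallel scan.
#   $1 = table name, $2 = total segments, $3 = this worker's segment (0-based)
# AWS_CLI is an assumed override hook; set AWS_CLI=echo to print the command.
scan_segment() {
  "${AWS_CLI:-aws}" dynamodb scan \
    --table-name "$1" \
    --total-segments "$2" \
    --segment "$3" \
    --projection-expression 'customerId, caseId, summary, updatedAt' \
    --output json
}

# Hypothetical driver: run every segment in the background, writing one
# output file per segment into an existing directory, then wait for all.
#   $1 = table name, $2 = total segments, $3 = output directory
scan_all_segments() (
  table=$1
  total=$2
  outdir=$3
  i=0
  while [ "$i" -lt "$total" ]; do
    scan_segment "$table" "$total" "$i" >"$outdir/segment-$i.json" &
    i=$((i + 1))
  done
  wait
)
```

Each segment file has the same Items shape as a plain scan, so the jq and curl pipeline above can be run over the files one at a time. The segment count is a tuning choice; start small to avoid consuming too much of the table's read capacity.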
Managed ETL with AWS Glue
For scheduled imports from S3, RDS/Aurora, DynamoDB, or cataloged data lake tables, run the extraction in AWS Glue and write returned memory ids to a DynamoDB, RDS, or S3 manifest. See Cloud ETL & data management for the Glue diagram, Python job shape, update strategy, and delete-by-manifest cleanup flow.
Embeddings
Bedrock does not offer an OpenAI-compatible embeddings endpoint. Leave greatmemory on
its default local embedder (fastembed, no configuration needed), or point
GM_EMBEDDER at any OpenAI-compatible embeddings API (for example
Vertex AI or Azure OpenAI). See the
Configuration reference for embedder options.