Amazon Bedrock

Use Amazon Bedrock for fact extraction and reflection via the OpenAI-compatible Chat Completions endpoint and a Bedrock API key.

Use Amazon Bedrock as greatmemory's LLM for fact extraction and prospective reflection. Bedrock exposes an OpenAI-compatible Chat Completions endpoint, so greatmemory talks to it through the openai provider kind with no code changes.

Embeddings: Bedrock's OpenAI-compatible endpoint provides chat only - there is no /embeddings route (Titan/Cohere embeddings are reachable only via Bedrock's native InvokeModel API, which is not OpenAI-shaped). So use Bedrock for the LLM, and keep greatmemory's local fastembed embedder (the default) - or another OpenAI-compatible embedder - for vectors.

How it fits

greatmemory POSTs to {GM_LLM_URL}/chat/completions with an Authorization: Bearer token. GM_LLM_URL is the full versioned base - greatmemory appends only /chat/completions.
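As a rough illustration of the request shape described above, the sketch below joins the base URL with the route and sends one minimal request. The helper names chat_completions_url and bedrock_smoke_test are hypothetical, the message body is an assumed minimal payload, and stripping a trailing slash is this sketch's own choice rather than documented greatmemory behavior.

```shell
# Hypothetical helper: build the URL that receives the POST, given the base.
chat_completions_url() {
  base=${1%/}                       # drop one trailing slash, if present
  printf '%s/chat/completions\n' "$base"
}

# Hypothetical smoke test: send one minimal Chat Completions request.
# Assumes GM_LLM_URL, GM_LLM_API_KEY and GM_LLM_MODEL are already set.
bedrock_smoke_test() {
  curl -sS --fail "$(chat_completions_url "$GM_LLM_URL")" \
    -H "Authorization: Bearer $GM_LLM_API_KEY" \
    -H 'Content-Type: application/json' \
    -d "{\"model\":\"$GM_LLM_MODEL\",\"messages\":[{\"role\":\"user\",\"content\":\"ping\"}]}"
}
```

Running bedrock_smoke_test before starting the server is one way to check the key, region, and model ID together; a non-zero exit means the endpoint rejected the request.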

Prerequisites

  • An AWS account with Amazon Bedrock enabled in a supported region (e.g. us-west-2) and model access granted for the model you intend to call.
  • A Bedrock API key (see below).

Authentication

Bedrock's OpenAI-compatible endpoint authenticates with a Bedrock API key passed as a bearer token - no AWS SigV4 signing required. Create one from the Bedrock console (or CLI) and export it:

export AWS_BEARER_TOKEN_BEDROCK=<your-bedrock-api-key>

  • Long-term keys last until a configured expiry - the practical choice for a long-running server's static Authorization header.
  • Short-term keys last up to 12 hours and inherit the generating principal's permissions - better for production, but you must refresh them.
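Because short-term keys expire, a wrapper script may want to check the key before launching the server. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the caller records the issue time itself, and the 30-minute safety margin is an arbitrary choice. A key can also expire sooner than 12 hours, so treat this as a lower bound on refresh frequency.

```shell
# Hypothetical preflight: fail when the Bedrock API key is not exported.
require_bedrock_key() {
  if [ -z "${AWS_BEARER_TOKEN_BEDROCK:-}" ]; then
    echo "AWS_BEARER_TOKEN_BEDROCK is not set" >&2
    return 1
  fi
}

# Returns 0 (true) when a short-term key issued at $1 (epoch seconds)
# should be refreshed at time $2 (epoch seconds).
key_needs_refresh() {
  issued=$1
  now=$2
  max_age=$((12 * 3600))            # short-term keys last up to 12 hours
  margin=$((30 * 60))               # assumption: refresh 30 minutes early
  [ $((now - issued)) -ge $((max_age - margin)) ]
}
```

A supervisor could call key_needs_refresh "$ISSUED_AT" "$(date +%s)" on a timer (ISSUED_AT being a variable it maintains) and restart the server with a fresh key when it returns true.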

Configuration

REGION=us-west-2
GM_LLM=openai \
GM_LLM_URL=https://bedrock-runtime.$REGION.amazonaws.com/openai/v1 \
GM_LLM_API_KEY=$AWS_BEARER_TOKEN_BEDROCK \
GM_LLM_MODEL=openai.gpt-oss-20b-1:0 \
gmem serve

The region is required in the host; there is no region-less endpoint.
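Since the host always embeds the region, a small guard can catch an empty REGION before it produces a malformed URL. This sketch only formats the URL pattern shown in the configuration; the function name bedrock_openai_base is hypothetical and it does not check that the region actually offers Bedrock.

```shell
# Hypothetical helper: print the OpenAI-compatible base URL for a region,
# or fail when no region is given.
bedrock_openai_base() {
  region=${1:-}
  if [ -z "$region" ]; then
    echo "a region is required, for example us-west-2" >&2
    return 1
  fi
  printf 'https://bedrock-runtime.%s.amazonaws.com/openai/v1\n' "$region"
}
```

For example, GM_LLM_URL=$(bedrock_openai_base "$REGION") yields the same value as the inline form above when REGION is set.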

Models

The OpenAI-compatible Chat Completions route accepts the OpenAI open-weight models hosted on Bedrock. These are current as of June 2026; check the Bedrock model catalog for the latest.

| Role | Model value | Notes |
| --- | --- | --- |
| Chat (default) | openai.gpt-oss-20b-1:0 | Smaller, lower latency - cheap default |
| Chat (larger) | openai.gpt-oss-120b-1:0 | Higher quality |

Anthropic Claude and other Bedrock models are served through Bedrock's native Converse/Invoke APIs rather than this OpenAI Chat Completions route, so use the openai.gpt-oss-* IDs here. greatmemory's fact-extraction prompt asks for strict JSON, which these models follow. OpenAI-style response_format JSON mode is not documented for them, so rely on the prompt. The LLM remains optional: without it you still get full hybrid search, just no extracted facts.
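When evaluating a model for this role, it can help to check that a reply really is strict JSON, since nothing on the API side enforces it. The sketch below is one way to do that with jq; the function name is_single_json is hypothetical and this is not how greatmemory itself parses replies.

```shell
# Succeeds only when stdin holds exactly one well-formed JSON value.
# -s slurps every parsed value into an array; -e turns a false result into
# a non-zero exit status. Prose or code fences around the JSON cause a
# parse error, which also yields a non-zero status.
is_single_json() {
  jq -e -s 'length == 1' >/dev/null 2>&1
}
```

Piping a saved model reply into is_single_json gives a quick pass/fail signal when comparing the 20b and 120b models on your own prompts.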

Ingest large AWS sources

Bedrock config controls the LLM. Source ingestion is separate: read from S3 or AWS databases, extract the text you want agents to recall, then POST it to greatmemory. Adds are asynchronous, so a backfill process can stream documents or rows into /v1/memories.
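The ingestion step can be factored into one small helper shared by every source. This is a sketch, not part of greatmemory: gm_payload and gm_add are hypothetical names, the "SOURCE:" prefix is only a convention for pointing back to the origin, and the default URL assumes a local server on port 7437.

```shell
# Build the JSON body for one memory: $1 = space, $2 = source label,
# $3 = text content. jq handles all quoting and escaping.
gm_payload() {
  jq -n --arg space "$1" --arg source "$2" --arg content "$3" \
    '{space: $space, content: ("SOURCE: " + $source + "\n\n" + $content)}'
}

# POST one memory to greatmemory; same arguments as gm_payload.
# --fail makes curl exit non-zero on an HTTP error status.
gm_add() {
  gm_payload "$@" |
  curl -sS --fail "${GM_URL:-http://127.0.0.1:7437}/v1/memories" \
    -H 'Content-Type: application/json' \
    -d @- >/dev/null
}
```

A backfill loop then reduces to calls such as gm_add "$SPACE" "s3:policies/leave.md" "$text", where the label and text come from whichever source is being read.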

S3 documents

Use aws s3 sync for prefixes containing many files, or aws s3 cp for a single object. The AWS CLI documentation describes sync as recursively copying new and updated files between S3 and a local directory.

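export GM_URL=http://127.0.0.1:7437
export SPACE=aws-knowledge-base

# Mirror the S3 prefix into a local staging directory.
mkdir -p /tmp/gm-s3
aws s3 sync s3://my-company-docs/policies /tmp/gm-s3

# POST each text file as one memory, labeled with its path relative to the
# synced prefix. --rawfile needs jq 1.6 or newer; -print0 with read -d ''
# keeps unusual file names intact.
find /tmp/gm-s3 -type f \( -name '*.md' -o -name '*.txt' -o -name '*.json' -o -name '*.csv' \) -print0 |
while IFS= read -r -d '' file; do
  jq -n --rawfile content "$file" \
    --arg space "$SPACE" \
    --arg source "s3:${file#/tmp/gm-s3/}" \
    '{space:$space, content:("SOURCE: " + $source + "\n\n" + $content)}' |
  # --fail surfaces HTTP errors instead of silently discarding them.
  curl -sS --fail "$GM_URL/v1/memories" \
    -H 'Content-Type: application/json' \
    -d @- >/dev/null
done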

For PDF, DOCX, and other binary document formats, extract text first and include the S3 key in the content prefix so retrieval can point back to the source object.
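For binary formats, the extraction tool can feed its text output straight into the payload builder. The sketch below assumes pdftotext from poppler-utils as the extractor, which is one option among many; text_to_payload and the example key are hypothetical.

```shell
# Wrap text read from stdin as an add payload, prefixing the S3 key so
# retrieval can point back to the source object.
# $1 = space, $2 = bucket/key of the original object.
text_to_payload() {
  jq -Rs --arg space "$1" --arg key "$2" \
    '{space: $space, content: ("SOURCE: s3://" + $key + "\n\n" + .)}'
}

# Hypothetical usage, assuming pdftotext is installed and the PDF was
# already copied locally:
#   pdftotext -layout /tmp/gm-s3/handbook.pdf - |
#     text_to_payload "$SPACE" "my-company-docs/policies/handbook.pdf" |
#     curl -sS --fail "$GM_URL/v1/memories" \
#       -H 'Content-Type: application/json' -d @- >/dev/null
```

Here jq -Rs reads the whole extracted text as a single raw string, so page breaks and quotes in the document are escaped correctly.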

RDS or Aurora PostgreSQL rows

Use psql against Amazon RDS for PostgreSQL or Aurora PostgreSQL. Per the AWS documentation, psql connects to RDS PostgreSQL given the DB endpoint, port, credentials, and database name.

export GM_URL=http://127.0.0.1:7437
export SPACE=aws-rds
export PGHOST=mydb.abc123.us-west-2.rds.amazonaws.com
export PGPORT=5432
export PGDATABASE=appdb
export PGUSER=readonly
# Supply the password via PGPASSWORD or ~/.pgpass.

# -A (unaligned) and -t (tuples only) make psql print one JSON object per line.
psql -At -c "
  select json_build_object(
    'account_id', account_id,
    'title', title,
    'notes', notes,
    'updated_at', updated_at
  )
  from account_notes
  where updated_at > now() - interval '180 days';
" |
while IFS= read -r row; do
  # Pull the account id out of the row for the human-readable prefix.
  account=$(jq -r '.account_id' <<<"$row")
  jq -n \
    --arg space "$SPACE" \
    --arg account "$account" \
    --argjson row "$row" \
    '{space:$space, content:("RDS account note for " + $account + ":\n" + ($row|tojson))}' |
  curl -sS --fail "$GM_URL/v1/memories" \
    -H 'Content-Type: application/json' \
    -d @- >/dev/null
done

DynamoDB items

For DynamoDB, use scan for a controlled backfill or query for a partitioned job. The AWS documentation describes Scan as reading every item in a table or secondary index, so on large tables request only the attributes you need with a projection expression and split the job into segments.

export GM_URL=http://127.0.0.1:7437
export SPACE=aws-dynamodb

# summary is aliased as #summary because DynamoDB rejects reserved words
# used bare in expressions.
aws dynamodb scan \
  --table-name CustomerCases \
  --projection-expression 'customerId, caseId, #summary, updatedAt' \
  --expression-attribute-names '{"#summary":"summary"}' \
  --output json |
# Unwrap DynamoDB's typed values ({"S": ...}); assumes all four are strings.
jq -c '.Items[] | {
  customerId: .customerId.S,
  caseId: .caseId.S,
  summary: .summary.S,
  updatedAt: .updatedAt.S
}' |
while IFS= read -r item; do
  customer=$(jq -r '.customerId' <<<"$item")
  jq -n \
    --arg space "$SPACE" \
    --arg customer "$customer" \
    --argjson item "$item" \
    '{space:$space, content:("DynamoDB case memory for " + $customer + ":\n" + ($item|tojson))}' |
  curl -sS --fail "$GM_URL/v1/memories" \
    -H 'Content-Type: application/json' \
    -d @- >/dev/null
done
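For large tables, the scan can be split with the --segment and --total-segments options of aws dynamodb scan, so that several workers each read a disjoint slice. The following is a sketch only: scan_segments and scan_one_segment are hypothetical helpers, the segment count is an arbitrary choice, and the projection is omitted for brevity.

```shell
# Print one "segment total" pair per line for a scan split into $1 segments.
scan_segments() {
  total=$1
  i=0
  while [ "$i" -lt "$total" ]; do
    printf '%s %s\n' "$i" "$total"
    i=$((i + 1))
  done
}

# Hypothetical worker: scan one segment of CustomerCases and emit its items
# as JSON lines. $1 = segment index, $2 = total number of segments.
scan_one_segment() {
  aws dynamodb scan \
    --table-name CustomerCases \
    --segment "$1" --total-segments "$2" \
    --output json |
  jq -c '.Items[]'
}

# Hypothetical usage: run 4 segments in parallel, one output file each.
#   scan_segments 4 | {
#     while read -r seg total; do
#       scan_one_segment "$seg" "$total" > "segment-$seg.jsonl" &
#     done
#     wait
#   }
```

Each resulting file can then be fed through the same per-item POST loop, which keeps the read side parallel while leaving ingestion rate under your control.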

Managed ETL with AWS Glue

For scheduled imports from S3, RDS/Aurora, DynamoDB, or cataloged data lake tables, run the extraction in AWS Glue and write returned memory ids to a DynamoDB, RDS, or S3 manifest. See Cloud ETL & data management for the Glue diagram, Python job shape, update strategy, and delete-by-manifest cleanup flow.

Embeddings

Bedrock does not offer an OpenAI-compatible embeddings endpoint. Leave greatmemory on its default local embedder (fastembed, no configuration needed), or point GM_EMBEDDER at any OpenAI-compatible embeddings API (for example Vertex AI or Azure OpenAI). See the Configuration reference for embedder options.