greatmemory

Integrations

Azure OpenAI

Use Azure OpenAI (Azure AI Foundry) as greatmemory's LLM and embedder via the OpenAI-compatible v1 API and deployment names.

Azure OpenAI (Azure AI Foundry)

Use Azure OpenAI - Microsoft's managed model platform in Azure AI Foundry, the Azure counterpart to Vertex AI and Bedrock - as greatmemory's LLM and embedder. Azure's v1 API is OpenAI-compatible, so greatmemory talks to it through the openai provider kind with no code changes.

How it fits

greatmemory POSTs to {GM_LLM_URL}/chat/completions and {GM_EMBEDDER_URL}/embeddings with an Authorization: Bearer token. GM_*_URL is the full versioned base; Azure's v1 base already ends in /openai/v1, and - on the GA v1 API - no api-version query parameter is needed.

Prerequisites

  • An Azure OpenAI resource (*.openai.azure.com) or Azure AI Foundry resource (*.services.ai.azure.com).
  • A deployment of each model you want to use. On Azure the model field is the deployment name you choose at deploy time - not the base model id. Create a chat deployment (and an embeddings deployment if you want Azure embeddings).
  • The resource's API key (from the resource's Keys and Endpoint blade).

Authentication

Pass the resource API key as the bearer token:

export AZURE_OPENAI_KEY=<your-resource-api-key>

The v1 API accepts Authorization: Bearer <key> (this is exactly what the OpenAI SDKs send against the /openai/v1/ base). Microsoft Entra ID tokens are also supported for chat - but note Entra ID is not supported on the embeddings route, so the API key is the reliable credential for embeddings.

Configuration

Replace RESOURCE with your resource name and the model values with your deployment names:

RESOURCE=my-aoai-resource
BASE=https://$RESOURCE.openai.azure.com/openai/v1
# Azure AI Foundry resource? Use: https://$RESOURCE.services.ai.azure.com/openai/v1

# LLM: fact extraction + prospective reflection
GM_LLM=openai \
GM_LLM_URL=$BASE \
GM_LLM_API_KEY=$AZURE_OPENAI_KEY \
GM_LLM_MODEL=my-gpt-4.1-mini-deployment \
gmem serve

Add Azure embeddings (a text-embedding-3-small deployment returns 1536 dimensions, so set GM_EMBEDDER_DIM=1536):

GM_EMBEDDER=openai \
GM_EMBEDDER_URL=$BASE \
GM_EMBEDDER_API_KEY=$AZURE_OPENAI_KEY \
GM_EMBEDDER_MODEL=my-embedding-3-small-deployment \
GM_EMBEDDER_DIM=1536

Models

You deploy a base model under a name of your choosing, then pass that deployment name as model. Current GA choices (June 2026 - see the Azure model catalog):

RoleBase model to deployNotes
Chat (cheap, default)gpt-4.1-mini / gpt-4.1-nano / gpt-4o-miniSupport JSON / structured output
Chat (reasoning)o4-mini / gpt-5-nanoGA reasoning minis
Embeddingstext-embedding-3-smallDefault 1536 dims
Embeddings (large)text-embedding-3-largeDefault 3072 dims

"response_format": {"type":"json_object"} is supported on Azure chat deployments, which suits greatmemory's fact extraction.

Ingest large Azure sources

Azure OpenAI config controls the LLM/embedder. For source data, extract text from Azure Storage or Azure databases and POST it to greatmemory. Adds return quickly (202 Accepted), so batch jobs can stream rows/documents without waiting for every embedding to finish.

Blob Storage documents

Use Azure CLI's az storage blob download-batch to recursively download a container or prefix. Microsoft documents download-batch as recursively downloading blobs from a container.

export GM_URL=http://127.0.0.1:7437
export SPACE=azure-knowledge-base
export ACCOUNT=myaccount
export CONTAINER=company-docs

mkdir -p /tmp/gm-blob
az storage blob download-batch \
  --account-name "$ACCOUNT" \
  --source "$CONTAINER" \
  --destination /tmp/gm-blob \
  --pattern 'policies/*' \
  --auth-mode login

find /tmp/gm-blob -type f \( -name '*.md' -o -name '*.txt' -o -name '*.json' -o -name '*.csv' \) -print0 |
while IFS= read -r -d '' file; do
  jq -n --rawfile content "$file" \
    --arg space "$SPACE" \
    --arg source "blob:${file#/tmp/gm-blob/}" \
    '{space:$space, content:("SOURCE: " + $source + "\n\n" + $content)}' |
  curl -sS "$GM_URL/v1/memories" \
    -H 'Content-Type: application/json' \
    -d @- >/dev/null
done

For PDF, DOCX, and PowerPoint files, run an extraction step first and POST the plain text. Keeping the blob path in the memory body makes later citations auditable.

Azure SQL Database rows

Use sqlcmd for Azure SQL Database or SQL Managed Instance. Microsoft documents sqlcmd as the command-line utility for running T-SQL statements and scripts.

export GM_URL=http://127.0.0.1:7437
export SPACE=azure-sql
export SQLSERVER=myserver.database.windows.net
export SQLDB=appdb
export SQLUSER=readonly

sqlcmd -S "$SQLSERVER" -d "$SQLDB" -U "$SQLUSER" -P "$SQLPASSWORD" \
  -h -1 -W -s '|' -Q "
    SET NOCOUNT ON;
    SELECT account_id, title, notes, CONVERT(varchar(33), updated_at, 126)
    FROM dbo.account_notes
    WHERE updated_at > DATEADD(day, -180, SYSUTCDATETIME());
  " |
while IFS='|' read -r account title notes updated_at; do
  [ -z "$account" ] && continue
  jq -n \
    --arg space "$SPACE" \
    --arg account "$account" \
    --arg title "$title" \
    --arg notes "$notes" \
    --arg updated_at "$updated_at" \
    '{space:$space, content:("Azure SQL note for account " + $account + "\nTitle: " + $title + "\nUpdated: " + $updated_at + "\n\n" + $notes)}' |
  curl -sS "$GM_URL/v1/memories" \
    -H 'Content-Type: application/json' \
    -d @- >/dev/null
done

Cosmos DB documents

For semi-structured application state in Cosmos DB for NoSQL, use the Azure Cosmos DB Python SDK to query items and POST selected fields. Microsoft documents the SDK for CRUD operations and querying items in containers.

import json
import os
import requests
from azure.cosmos import CosmosClient

GM_URL = os.getenv("GM_URL", "http://127.0.0.1:7437")
SPACE = "azure-cosmos"

client = CosmosClient(os.environ["COSMOS_ENDPOINT"], credential=os.environ["COSMOS_KEY"])
container = client.get_database_client("app").get_container_client("cases")

for item in container.query_items(
    query="SELECT c.id, c.customerId, c.summary, c.updatedAt FROM c WHERE c.type = 'case'",
    enable_cross_partition_query=True,
):
    content = "Cosmos DB case document:\n" + json.dumps(item, ensure_ascii=False)
    requests.post(f"{GM_URL}/v1/memories", json={"space": SPACE, "content": content}, timeout=10).raise_for_status()

Managed ETL with Azure Data Factory

For scheduled imports and governed data movement, orchestrate Blob Storage, Azure SQL, and Cosmos DB extraction with Azure Data Factory. Use an Azure Function or Web activity to post normalized text into greatmemory, then save returned memory ids in an Azure SQL manifest table. See Cloud ETL & data management for the pipeline diagram, function shape, and delete-by-manifest cleanup flow.

Notes & caveats

  • model is the deployment name, not the base model id. A 400 usually means the deployment name is wrong; a 404 usually means the base URL is missing /openai/v1.
  • Embedding dimension must match exactly. greatmemory validates the returned vector length against GM_EMBEDDER_DIM and does not request a reduced dimension, so set the dim to the model default (1536 for text-embedding-3-small, 3072 for -large).
  • Switching embedders needs a fresh data dir - existing vectors won't match a new model.
  • Both *.openai.azure.com and *.services.ai.azure.com use the identical /openai/v1 path and request shapes - pick whichever matches the resource you provisioned.

See the Configuration reference for the full GM_* variable table.