Integrations
Azure OpenAI
Use Azure OpenAI (Azure AI Foundry) as greatmemory's LLM and embedder via the OpenAI-compatible v1 API and deployment names.
Azure OpenAI (Azure AI Foundry)
Use Azure OpenAI - Microsoft's managed model platform in Azure AI Foundry, the
Azure counterpart to Vertex AI and Bedrock - as greatmemory's LLM and embedder.
Azure's v1 API is OpenAI-compatible, so greatmemory talks to it through the
openai provider kind with no code changes.
How it fits
greatmemory POSTs to {GM_LLM_URL}/chat/completions and
{GM_EMBEDDER_URL}/embeddings with an Authorization: Bearer token. GM_*_URL is
the full versioned base; Azure's v1 base already ends in /openai/v1, and - on
the GA v1 API - no api-version query parameter is needed.
Prerequisites
- An Azure OpenAI resource (
*.openai.azure.com) or Azure AI Foundry resource (*.services.ai.azure.com). - A deployment of each model you want to use. On Azure the
modelfield is the deployment name you choose at deploy time - not the base model id. Create a chat deployment (and an embeddings deployment if you want Azure embeddings). - The resource's API key (from the resource's Keys and Endpoint blade).
Authentication
Pass the resource API key as the bearer token:
export AZURE_OPENAI_KEY=<your-resource-api-key>
The v1 API accepts Authorization: Bearer <key> (this is exactly what the OpenAI
SDKs send against the /openai/v1/ base). Microsoft Entra ID tokens are also
supported for chat - but note Entra ID is not supported on the embeddings route,
so the API key is the reliable credential for embeddings.
Configuration
Replace RESOURCE with your resource name and the model values with your
deployment names:
RESOURCE=my-aoai-resource
BASE=https://$RESOURCE.openai.azure.com/openai/v1
# Azure AI Foundry resource? Use: https://$RESOURCE.services.ai.azure.com/openai/v1
# LLM: fact extraction + prospective reflection
GM_LLM=openai \
GM_LLM_URL=$BASE \
GM_LLM_API_KEY=$AZURE_OPENAI_KEY \
GM_LLM_MODEL=my-gpt-4.1-mini-deployment \
gmem serve
Add Azure embeddings (a text-embedding-3-small deployment returns 1536
dimensions, so set GM_EMBEDDER_DIM=1536):
GM_EMBEDDER=openai \
GM_EMBEDDER_URL=$BASE \
GM_EMBEDDER_API_KEY=$AZURE_OPENAI_KEY \
GM_EMBEDDER_MODEL=my-embedding-3-small-deployment \
GM_EMBEDDER_DIM=1536
Models
You deploy a base model under a name of your choosing, then pass that deployment
name as model. Current GA choices (June 2026 - see the
Azure model catalog):
| Role | Base model to deploy | Notes |
|---|---|---|
| Chat (cheap, default) | gpt-4.1-mini / gpt-4.1-nano / gpt-4o-mini | Support JSON / structured output |
| Chat (reasoning) | o4-mini / gpt-5-nano | GA reasoning minis |
| Embeddings | text-embedding-3-small | Default 1536 dims |
| Embeddings (large) | text-embedding-3-large | Default 3072 dims |
"response_format": {"type":"json_object"} is supported on Azure chat deployments,
which suits greatmemory's fact extraction.
Ingest large Azure sources
Azure OpenAI config controls the LLM/embedder. For source data, extract text from
Azure Storage or Azure databases and POST it to greatmemory. Adds return quickly
(202 Accepted), so batch jobs can stream rows/documents without waiting for every
embedding to finish.
Blob Storage documents
Use Azure CLI's az storage blob download-batch to recursively download a
container or prefix. Microsoft documents download-batch as recursively
downloading blobs from a container.
export GM_URL=http://127.0.0.1:7437
export SPACE=azure-knowledge-base
export ACCOUNT=myaccount
export CONTAINER=company-docs
mkdir -p /tmp/gm-blob
az storage blob download-batch \
--account-name "$ACCOUNT" \
--source "$CONTAINER" \
--destination /tmp/gm-blob \
--pattern 'policies/*' \
--auth-mode login
find /tmp/gm-blob -type f \( -name '*.md' -o -name '*.txt' -o -name '*.json' -o -name '*.csv' \) -print0 |
while IFS= read -r -d '' file; do
jq -n --rawfile content "$file" \
--arg space "$SPACE" \
--arg source "blob:${file#/tmp/gm-blob/}" \
'{space:$space, content:("SOURCE: " + $source + "\n\n" + $content)}' |
curl -sS "$GM_URL/v1/memories" \
-H 'Content-Type: application/json' \
-d @- >/dev/null
done
For PDF, DOCX, and PowerPoint files, run an extraction step first and POST the plain text. Keeping the blob path in the memory body makes later citations auditable.
Azure SQL Database rows
Use sqlcmd for Azure SQL Database or SQL Managed Instance. Microsoft documents
sqlcmd as the command-line utility for running T-SQL statements and scripts.
export GM_URL=http://127.0.0.1:7437
export SPACE=azure-sql
export SQLSERVER=myserver.database.windows.net
export SQLDB=appdb
export SQLUSER=readonly
sqlcmd -S "$SQLSERVER" -d "$SQLDB" -U "$SQLUSER" -P "$SQLPASSWORD" \
-h -1 -W -s '|' -Q "
SET NOCOUNT ON;
SELECT account_id, title, notes, CONVERT(varchar(33), updated_at, 126)
FROM dbo.account_notes
WHERE updated_at > DATEADD(day, -180, SYSUTCDATETIME());
" |
while IFS='|' read -r account title notes updated_at; do
[ -z "$account" ] && continue
jq -n \
--arg space "$SPACE" \
--arg account "$account" \
--arg title "$title" \
--arg notes "$notes" \
--arg updated_at "$updated_at" \
'{space:$space, content:("Azure SQL note for account " + $account + "\nTitle: " + $title + "\nUpdated: " + $updated_at + "\n\n" + $notes)}' |
curl -sS "$GM_URL/v1/memories" \
-H 'Content-Type: application/json' \
-d @- >/dev/null
done
Cosmos DB documents
For semi-structured application state in Cosmos DB for NoSQL, use the Azure Cosmos DB Python SDK to query items and POST selected fields. Microsoft documents the SDK for CRUD operations and querying items in containers.
import json
import os
import requests
from azure.cosmos import CosmosClient
GM_URL = os.getenv("GM_URL", "http://127.0.0.1:7437")
SPACE = "azure-cosmos"
client = CosmosClient(os.environ["COSMOS_ENDPOINT"], credential=os.environ["COSMOS_KEY"])
container = client.get_database_client("app").get_container_client("cases")
for item in container.query_items(
query="SELECT c.id, c.customerId, c.summary, c.updatedAt FROM c WHERE c.type = 'case'",
enable_cross_partition_query=True,
):
content = "Cosmos DB case document:\n" + json.dumps(item, ensure_ascii=False)
requests.post(f"{GM_URL}/v1/memories", json={"space": SPACE, "content": content}, timeout=10).raise_for_status()
Managed ETL with Azure Data Factory
For scheduled imports and governed data movement, orchestrate Blob Storage, Azure SQL, and Cosmos DB extraction with Azure Data Factory. Use an Azure Function or Web activity to post normalized text into greatmemory, then save returned memory ids in an Azure SQL manifest table. See Cloud ETL & data management for the pipeline diagram, function shape, and delete-by-manifest cleanup flow.
Notes & caveats
modelis the deployment name, not the base model id. A400usually means the deployment name is wrong; a404usually means the base URL is missing/openai/v1.- Embedding dimension must match exactly. greatmemory validates the returned
vector length against
GM_EMBEDDER_DIMand does not request a reduced dimension, so set the dim to the model default (1536 fortext-embedding-3-small, 3072 for-large). - Switching embedders needs a fresh data dir - existing vectors won't match a new model.
- Both
*.openai.azure.comand*.services.ai.azure.comuse the identical/openai/v1path and request shapes - pick whichever matches the resource you provisioned.
See the Configuration reference for the full GM_* variable
table.