GoodMem

Create a new LLM

Creates a new LLM configuration for text generation services. An LLM record represents a connection to a language model API service (such as OpenAI or vLLM) and includes all the configuration needed to use it for text generation.

DUPLICATE DETECTION: Returns HTTP 409 Conflict (ALREADY_EXISTS) if another LLM exists with identical {ownerId, providerType, endpointUrl, apiPath, modelIdentifier, credentialsFingerprint} after URL canonicalization. Uniqueness is enforced per-owner. Credentials are hashed (SHA-256) for uniqueness while remaining encrypted. The apiPath field defaults to '/chat/completions' if omitted. Requires CREATE_LLM_OWN permission (or CREATE_LLM_ANY for admin users).
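The duplicate-detection rule can be sketched in Python. This is an illustrative approximation only: the exact URL canonicalization and fingerprint encoding are server-side details of GoodMem, and the function names below are hypothetical.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit

def canonicalize_url(url: str) -> str:
    """Lowercase scheme/host and strip a trailing slash (illustrative canonicalization)."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path.rstrip("/"), parts.query, parts.fragment))

def uniqueness_key(owner_id, provider_type, endpoint_url, api_path, model, secret):
    """Build the per-owner duplicate-detection tuple described above.

    Credentials are reduced to a SHA-256 fingerprint so uniqueness can be
    checked without comparing (or exposing) the plaintext secret.
    """
    fingerprint = hashlib.sha256(secret.encode()).hexdigest()
    return (owner_id, provider_type, canonicalize_url(endpoint_url),
            api_path or "/chat/completions", model, fingerprint)

# Two requests that differ only in URL casing, trailing slash, and an
# omitted-vs-explicit apiPath collapse to the same key -> 409 Conflict.
a = uniqueness_key("owner-1", "OPENAI", "HTTPS://api.openai.com/v1/",
                   None, "gpt-4-turbo-preview", "sk-demo")
b = uniqueness_key("owner-1", "OPENAI", "https://api.openai.com/v1",
                   "/chat/completions", "gpt-4-turbo-preview", "sk-demo")
assert a == b
```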

POST
/v1/llms
x-api-key: <token>

In: header

Request Body

LLM configuration details

displayName: string

User-facing name of the LLM

Default: "GPT-4 Turbo"
Length: 1 <= length <= 255
description?: string | null

Description of the LLM

Default: "OpenAI's GPT-4 Turbo model for chat completions"
providerType: LLMProviderType

Type of LLM provider

Default: "OPENAI"
Value in: "OPENAI" | "LITELLM_PROXY" | "OPEN_ROUTER" | "VLLM" | "OLLAMA" | "LLAMA_CPP" | "CUSTOM_OPENAI_COMPATIBLE"
endpointUrl: string

API endpoint base URL (an OpenAI-compatible base, typically ending in /v1)

Default: "https://api.openai.com/v1"
apiPath?: string | null

API path for the chat/completions request (defaults to /chat/completions if not provided)

Default: "/chat/completions"
modelIdentifier: string

Model identifier

Default: "gpt-4-turbo-preview"
supportedModalities?: array<Modality> | null

Supported content modalities (defaults to TEXT if not provided)

Default: ["TEXT"]
credentials?: ApiKey (CREDENTIAL_KIND_API_KEY) | GcpAdc (CREDENTIAL_KIND_GCP_ADC) | None (omit this field)

Structured credential payload describing how to authenticate with the provider. Omit for deployments that do not require credentials.

Default: {"kind":"CREDENTIAL_KIND_API_KEY","apiKey":{"inlineSecret":"sk-llm-demo-key"}}

ApiKey variant (CREDENTIAL_KIND_API_KEY): structured credential payload describing how GoodMem should authenticate with an upstream provider.

kind: string

Credential strategy — fixed to CREDENTIAL_KIND_API_KEY for this variant

Default: "CREDENTIAL_KIND_API_KEY"
apiKey: ApiKeyAuth

Configuration when kind is CREDENTIAL_KIND_API_KEY

Default: {"inlineSecret":"sk-1234567890abcdef","secretRef":{"uri":"vault://path/to/secret"},"headerName":"Authorization","prefix":"Bearer "}
labels?: object | null

Optional annotations to aid operators (e.g., "owner=vertex")

Properties: <= 20
Default: {}

GcpAdc variant (CREDENTIAL_KIND_GCP_ADC): structured credential payload describing how GoodMem should authenticate with an upstream provider.

kind: string

Credential strategy — fixed to CREDENTIAL_KIND_GCP_ADC for this variant

Default: "CREDENTIAL_KIND_GCP_ADC"
gcpAdc: GcpAdcAuth

Configuration when kind is CREDENTIAL_KIND_GCP_ADC

Default: {"scopes":["https://www.googleapis.com/auth/cloud-platform"],"quotaProjectId":"my-quota-project"}
labels?: object | null

Optional annotations to aid operators (e.g., "owner=vertex")

Properties: <= 20
Default: {}

labels?: object | null

User-defined labels for categorization

Default: {"environment":"production","team":"ai"}
Properties: <= 20

version?: string | null

Version information

Default: "1.0.0"
monitoringEndpoint?: string | null

Monitoring endpoint URL

Default: "https://monitoring.example.com/llms/status"
capabilities?: LLMCapabilities | null

LLM capabilities defining supported features and modes. Optional: the server infers capabilities from the model identifier if not provided.

Default: {"supportsChat":true,"supportsCompletion":true,"supportsFunctionCalling":true,"supportsSystemMessages":true,"supportsStreaming":true,"supportsSamplingParameters":true}
defaultSamplingParams?: LLMSamplingParams | null

Default sampling parameters for generation requests

Default: {"maxTokens":2048,"temperature":0.7,"topP":0.9,"topK":50,"frequencyPenalty":0,"presencePenalty":0,"stopSequences":["\n\n","END"]}
maxContextLength?: integer | null

Maximum context window size in tokens

Default: 32768
Format: int32
clientConfig?: object | null

Provider-specific client configuration as a flexible JSON structure

Default: {}

ownerId?: string | null

Optional owner ID. If not provided, it is derived from the authentication context. Requires the CREATE_LLM_ANY permission if specified.

Format: uuid
llmId?: string | null

Optional client-provided UUID for idempotent creation. If not provided, the server generates a new UUID. Returns ALREADY_EXISTS if the ID is already in use.

Format: uuid
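The idempotent-creation behavior of llmId can be sketched as follows. The helper below is hypothetical; only the payload field names come from the schema above.

```python
import uuid

def build_create_llm_request(display_name, endpoint_url, model, llm_id=None):
    """Minimal payload for POST /v1/llms with a caller-fixed llmId for safe retries."""
    return {
        "displayName": display_name,
        "providerType": "OPENAI",
        "endpointUrl": endpoint_url,
        "modelIdentifier": model,
        # Reusing the same UUID on retry either creates the LLM once or
        # returns ALREADY_EXISTS, so a retried request never creates duplicates.
        "llmId": llm_id or str(uuid.uuid4()),
    }

req = build_create_llm_request("GPT-4 Turbo", "https://api.openai.com/v1",
                               "gpt-4-turbo-preview",
                               llm_id="550e8400-e29b-41d4-a716-446655440000")
```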

Response Body

curl -X POST "http://localhost:8080/v1/llms" \
  -H "Content-Type: application/json" \
  -H "x-api-key: <token>" \
  -d '{
    "displayName": "GPT-4 Turbo",
    "description": "OpenAI'\''s GPT-4 Turbo model for chat completions",
    "providerType": "OPENAI",
    "endpointUrl": "https://api.openai.com/v1",
    "apiPath": "/chat/completions",
    "modelIdentifier": "gpt-4-turbo-preview",
    "supportedModalities": ["TEXT"],
    "credentials": {
      "kind": "CREDENTIAL_KIND_API_KEY",
      "apiKey": {
        "inlineSecret": "sk-your-api-key-here",
        "secretRef": {
          "uri": "vault://path/to/secret"
        },
        "headerName": "Authorization",
        "prefix": "Bearer "
      }
    },
    "labels": {
      "environment": "production",
      "team": "ai"
    },
    "version": "1.0.0",
    "monitoringEndpoint": "https://monitoring.example.com/llms/status",
    "capabilities": {
      "supportsChat": true,
      "supportsCompletion": true,
      "supportsFunctionCalling": true,
      "supportsSystemMessages": true,
      "supportsStreaming": true,
      "supportsSamplingParameters": true
    },
    "defaultSamplingParams": {
      "maxTokens": 2048,
      "temperature": 0.7,
      "topP": 0.9,
      "topK": 50,
      "frequencyPenalty": 0,
      "presencePenalty": 0,
      "stopSequences": ["\n\n", "END"]
    },
    "maxContextLength": 32768,
    "ownerId": "550e8400-e29b-41d4-a716-446655440000",
    "llmId": "550e8400-e29b-41d4-a716-446655440000"
  }'
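The same request can be issued from Python with only the standard library. This is a client-side sketch, assuming a local GoodMem server on port 8080 as in the curl example; `build_request` and `create_llm` are hypothetical helper names.

```python
import json
import urllib.request

def build_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Assemble the POST /v1/llms request with the documented x-api-key header."""
    return urllib.request.Request(
        f"{base_url}/v1/llms",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )

def create_llm(base_url: str, api_key: str, payload: dict) -> dict:
    """Send the request; urllib raises HTTPError for 409 Conflict and other errors."""
    with urllib.request.urlopen(build_request(base_url, api_key, payload)) as resp:
        return json.loads(resp.read())

# Example (requires a running server):
# result = create_llm("http://localhost:8080", "<token>",
#                     {"displayName": "GPT-4 Turbo", "providerType": "OPENAI",
#                      "endpointUrl": "https://api.openai.com/v1",
#                      "modelIdentifier": "gpt-4-turbo-preview"})
```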
{
  "llm": {
    "llmId": "550e8400-e29b-41d4-a716-446655440000",
    "displayName": "GPT-4 Turbo",
    "description": "OpenAI's GPT-4 Turbo model for chat completions",
    "providerType": "OPENAI",
    "endpointUrl": "https://api.openai.com/v1",
    "apiPath": "/chat/completions",
    "modelIdentifier": "gpt-4-turbo-preview",
    "supportedModalities": [
      "TEXT"
    ],
    "credentials": {
      "kind": "CREDENTIAL_KIND_API_KEY",
      "apiKey": {
        "inlineSecret": "sk-1234567890abcdef",
        "secretRef": {
          "uri": "vault://path/to/secret"
        },
        "headerName": "Authorization",
        "prefix": "Bearer "
      }
    },
    "labels": {
      "environment": "production",
      "team": "ai"
    },
    "version": "1.0.0",
    "monitoringEndpoint": "https://monitoring.example.com/llms/status",
    "capabilities": {
      "supportsChat": true,
      "supportsCompletion": true,
      "supportsFunctionCalling": true,
      "supportsSystemMessages": true,
      "supportsStreaming": true,
      "supportsSamplingParameters": true
    },
    "defaultSamplingParams": {
      "maxTokens": 2048,
      "temperature": 0.7,
      "topP": 0.9,
      "topK": 50,
      "frequencyPenalty": 0,
      "presencePenalty": 0,
      "stopSequences": [
        "\n\n",
        "END"
      ]
    },
    "maxContextLength": 32768,
    "clientConfig": {
      "property1": {},
      "property2": {}
    },
    "ownerId": "550e8400-e29b-41d4-a716-446655440000",
    "createdAt": 1617293472000,
    "updatedAt": 1617293472000,
    "createdById": "550e8400-e29b-41d4-a716-446655440000",
    "updatedById": "550e8400-e29b-41d4-a716-446655440000"
  },
  "statuses": [
    {
      "code": "PARTIAL_RESULTS",
      "message": "Some embedders were unavailable, returning partial results"
    }
  ]
}
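The createdAt and updatedAt fields in the response are Unix epoch timestamps in milliseconds. A small Python sketch for converting them (the helper name is hypothetical):

```python
from datetime import datetime, timezone

def to_datetime(epoch_ms: int) -> datetime:
    """Convert an epoch-milliseconds timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)

print(to_datetime(1617293472000).isoformat())  # 2021-04-01T16:11:12+00:00
```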