GoodMem REST API Reference: LLMs

Create a new LLM

Creates a new LLM configuration for text generation. An LLM represents a connection to a language model API service (OpenAI, vLLM, etc.) and includes all of the configuration needed to use it for text generation.

DUPLICATE DETECTION: Returns HTTP 409 Conflict (ALREADY_EXISTS) if another LLM exists with identical {ownerId, providerType, endpointUrl, apiPath, modelIdentifier, credentialsFingerprint} after URL canonicalization. Uniqueness is enforced per-owner. Credentials are hashed (SHA-256) for uniqueness while remaining encrypted. The apiPath field defaults to '/chat/completions' if omitted. Requires CREATE_LLM_OWN permission (or CREATE_LLM_ANY for admin users).
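The Python sketch below illustrates how the uniqueness tuple above can be assembled. It is an approximation for illustration only: the server's exact URL canonicalization rules are not documented here, so canonicalize_url is an assumption; only the tuple fields, the apiPath default, and the SHA-256 credential hashing come from the description above.

import hashlib
from urllib.parse import urlsplit, urlunsplit

def canonicalize_url(url: str) -> str:
    # Assumed canonicalization: lowercase scheme and host, strip trailing slash.
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))

def credentials_fingerprint(secret: str) -> str:
    # Credentials are hashed (SHA-256) so duplicates can be detected
    # while the stored secret itself remains encrypted.
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()

def uniqueness_key(owner_id, provider_type, endpoint_url, api_path, model_identifier, secret):
    return (
        owner_id,
        provider_type,
        canonicalize_url(endpoint_url),
        api_path or "/chat/completions",  # documented default
        model_identifier,
        credentials_fingerprint(secret),
    )

Two create requests that produce equal tuples refer to the same LLM, which is why resubmitting an identical configuration returns HTTP 409.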

POST /v1/llms

x-api-key: <token>

In: header

Request Body

LLM configuration details

displayName: string

User-facing name of the LLM

Length: 1 <= length <= 255
description?: string | null

Description of the LLM

providerType: LLMProviderType

Type of LLM provider

Value in"OPENAI" | "LITELLM_PROXY" | "OPEN_ROUTER" | "VLLM" | "OLLAMA" | "LLAMA_CPP" | "CUSTOM_OPENAI_COMPATIBLE"
endpointUrl: string

API endpoint base URL (OpenAI-compatible base, typically ends with /v1)

apiPath?: string | null

API path for the chat/completions request (defaults to /chat/completions if not provided). For example, an endpointUrl of https://api.openai.com/v1 with the default apiPath sends requests to https://api.openai.com/v1/chat/completions.

modelIdentifier: string

Model identifier to request from the provider (for example, gpt-4-turbo-preview)

supportedModalities?: array<Modality> | null

Supported content modalities (defaults to TEXT if not provided)

credentials?: EndpointAuthentication

Structured credential payload describing how to authenticate with the provider. Omit for deployments that do not require credentials.

labels?: object | null

User-defined labels for categorization

Properties: <= 20


version?: string | null

Version information

monitoringEndpoint?: string | null

Monitoring endpoint URL

capabilities?: LLMCapabilities

LLM capabilities defining supported features and modes. Optional; if omitted, the server infers capabilities from the model identifier.

defaultSamplingParams?: LLMSamplingParams

Default sampling parameters for generation requests

maxContextLength?: integer | null

Maximum context window size in tokens

Format: int32

clientConfig?: object | null

Provider-specific client configuration as flexible JSON structure


ownerId?: string | null

Optional owner ID. If not provided, derived from the authentication context. Requires CREATE_LLM_ANY permission if specified.

llmId?: string | null

Optional client-provided UUID for idempotent creation. If not provided, server generates a new UUID. Returns ALREADY_EXISTS if ID is already in use.
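Because llmId is client-assignable, creation can be made retry-safe: generate the UUID once, send it with every retry, and treat ALREADY_EXISTS as success. A minimal Python sketch using only fields from the schema above:

import uuid

# Generate the ID once, outside any retry loop, so a replayed request
# targets the same llmId and cannot create a second LLM.
llm_id = str(uuid.uuid4())

body = {
    "displayName": "GPT-4 Turbo",
    "providerType": "OPENAI",
    "endpointUrl": "https://api.openai.com/v1",
    "modelIdentifier": "gpt-4-turbo-preview",
    "llmId": llm_id,
}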

Response Body

curl -X POST "http://localhost:8080/v1/llms" \
  -H "Content-Type: application/json" \
  -H "x-api-key: <token>" \
  -d '{
    "displayName": "GPT-4 Turbo",
    "description": "OpenAI'\''s GPT-4 Turbo model for chat completions",
    "providerType": "OPENAI",
    "endpointUrl": "https://api.openai.com/v1",
    "apiPath": "/chat/completions",
    "modelIdentifier": "gpt-4-turbo-preview",
    "supportedModalities": ["TEXT"],
    "credentials": {
      "kind": "CREDENTIAL_KIND_API_KEY",
      "apiKey": {
        "inlineSecret": "sk-your-api-key-here"
      }
    },
    "capabilities": {
      "supportsChat": true,
      "supportsCompletion": true,
      "supportsFunctionCalling": true,
      "supportsSystemMessages": true,
      "supportsStreaming": true,
      "supportsSamplingParameters": true
    },
    "defaultSamplingParams": {
      "maxTokens": 2048,
      "temperature": 0.7,
      "topP": 0.9
    },
    "maxContextLength": 32768,
    "labels": {
      "environment": "production",
      "team": "ai"
    }
  }'
{
  "llm": {
    "llmId": "550e8400-e29b-41d4-a716-446655440000",
    "displayName": "GPT-4 Turbo",
    "description": "OpenAI's GPT-4 Turbo model for chat completions",
    "providerType": "OPENAI",
    "endpointUrl": "https://api.openai.com/v1",
    "apiPath": "/chat/completions",
    "modelIdentifier": "gpt-4-turbo-preview",
    "supportedModalities": [
      "TEXT"
    ],
    "labels": "{\"environment\": \"production\", \"team\": \"ai\"}",
    "version": "1.0.0",
    "monitoringEndpoint": "https://monitoring.example.com/llms/status",
    "capabilities": {
      "supportsChat": "true",
      "supportsCompletion": "true",
      "supportsFunctionCalling": "true",
      "supportsSystemMessages": "true",
      "supportsStreaming": "true",
      "supportsSamplingParameters": "true"
    },
    "defaultSamplingParams": {
      "maxTokens": "2048",
      "temperature": "0.7",
      "topP": "0.9",
      "topK": "50",
      "frequencyPenalty": "0.0",
      "presencePenalty": "0.0",
      "stopSequences": "[\"\\n\\n\", \"END\"]"
    },
    "maxContextLength": "32768",
    "clientConfig": {
      "property1": {},
      "property2": {}
    },
    "ownerId": "550e8400-e29b-41d4-a716-446655440000",
    "createdAt": "1617293472000",
    "updatedAt": "1617293472000",
    "createdById": "550e8400-e29b-41d4-a716-446655440000",
    "updatedById": "550e8400-e29b-41d4-a716-446655440000"
  },
  "statuses": [
    {
      "code": "PARTIAL_RESULTS",
      "message": "Some embedders were unavailable, returning partial results"
    }
  ]
}
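For comparison with the curl example, here is the same request from Python using the third-party requests library (pip install requests). The base URL and both keys are placeholders carried over from the example above; the 409 branch corresponds to the duplicate detection described at the top of this page.

import requests

resp = requests.post(
    "http://localhost:8080/v1/llms",
    headers={"x-api-key": "<token>"},  # placeholder API token
    json={
        "displayName": "GPT-4 Turbo",
        "providerType": "OPENAI",
        "endpointUrl": "https://api.openai.com/v1",
        "modelIdentifier": "gpt-4-turbo-preview",
        "credentials": {
            "kind": "CREDENTIAL_KIND_API_KEY",
            "apiKey": {"inlineSecret": "sk-your-api-key-here"},
        },
    },
)

if resp.status_code == 409:
    # ALREADY_EXISTS: an LLM with the same uniqueness tuple is already registered.
    print("Duplicate LLM:", resp.json())
else:
    resp.raise_for_status()
    print("Created LLM:", resp.json()["llm"]["llmId"])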