Create a new LLM
Creates a new LLM configuration for text generation services. An LLM represents a connection to a language model API service (such as OpenAI or vLLM) and holds all the configuration needed to use that service for text generation.
DUPLICATE DETECTION: Returns HTTP 409 Conflict (ALREADY_EXISTS) if another LLM already exists with an identical {ownerId, providerType, endpointUrl, apiPath, modelIdentifier, credentialsFingerprint} tuple. The comparison is made after URL canonicalization, and uniqueness is enforced per owner. Credentials are hashed (SHA-256) for the uniqueness check only; the stored credentials remain encrypted. If omitted, apiPath defaults to '/chat/completions'. Requires the CREATE_LLM_OWN permission (or CREATE_LLM_ANY for admin users).
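The duplicate check described above can be approximated on the client side, for example to spot a likely conflict before sending the request. The following is a minimal sketch, not the server's algorithm: the canonicalization rules (lowercasing the host and stripping a trailing slash) and the fingerprint input (the credentials object serialized as sorted JSON) are assumptions, and the helper names `canonicalize_url`, `credentials_fingerprint`, and `uniqueness_key` are hypothetical.

```python
import hashlib
import json
from urllib.parse import urlsplit, urlunsplit

# Documented default for apiPath when the field is omitted.
DEFAULT_API_PATH = "/chat/completions"


def canonicalize_url(url):
    """Assumed canonical form: lowercase scheme and host, no trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def credentials_fingerprint(credentials):
    """SHA-256 hex digest of the credentials payload (input format is assumed)."""
    if credentials is None:
        return ""
    payload = json.dumps(credentials, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def uniqueness_key(owner_id, config):
    """Build the tuple of fields that the documentation lists for duplicate detection."""
    return (
        owner_id,
        config["providerType"],
        canonicalize_url(config["endpointUrl"]),
        config.get("apiPath") or DEFAULT_API_PATH,
        config["modelIdentifier"],
        credentials_fingerprint(config.get("credentials")),
    )
```

Under these assumptions, two configurations that differ only in URL casing, a trailing slash, or an omitted apiPath produce the same key, while the same configuration under a different owner does not.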
In: header
LLM configuration details
User-facing name of the LLM
1 <= length <= 255
Description of the LLM
Type of LLM provider
"OPENAI" | "LITELLM_PROXY" | "OPEN_ROUTER" | "VLLM" | "OLLAMA" | "LLAMA_CPP" | "CUSTOM_OPENAI_COMPATIBLE"
API endpoint base URL (OpenAI-compatible base, typically ends with /v1)
API path for chat/completions request (defaults to /chat/completions if not provided)
Model identifier
Supported content modalities (defaults to TEXT if not provided)
Structured credential payload describing how to authenticate with the provider. Omit for deployments that do not require credentials.
User-defined labels for categorization
properties <= 20
Empty Object
Version information
Monitoring endpoint URL
LLM capabilities defining supported features and modes. Optional - server infers capabilities from model identifier if not provided.
Default sampling parameters for generation requests
Maximum context window size in tokens
int32
Provider-specific client configuration as flexible JSON structure
Empty Object
Optional owner ID. If not provided, derived from the authentication context. Requires CREATE_LLM_ANY permission if specified.
Optional client-provided UUID for idempotent creation. If not provided, server generates a new UUID. Returns ALREADY_EXISTS if ID is already in use.
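A client-provided UUID makes retries safe: if the same ID is sent on every attempt, a request that succeeded but whose response was lost cannot create a second LLM. The sketch below illustrates the idea under stated assumptions. The request field name `llmId` is inferred from the response example and is not confirmed for the request body; `with_client_id` and `create_with_retry` are hypothetical helpers; and `send` stands in for whatever HTTP call the client actually makes.

```python
import uuid


def with_client_id(config, llm_id=None):
    """Return a copy of the request body carrying a client-chosen UUID.

    The field name "llmId" is an assumption based on the response example.
    """
    body = dict(config)
    body["llmId"] = llm_id or str(uuid.uuid4())
    return body


def create_with_retry(send, body, attempts=3):
    """Call send(body) up to `attempts` times, reusing the same body and ID.

    `send` is a caller-supplied function that performs the POST and raises
    ConnectionError on transport failures.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return send(body)
        except ConnectionError as exc:
            last_error = exc  # retry with the identical body and ID
    raise last_error
```

Because the ID is generated once, before the first attempt, an ALREADY_EXISTS response on a later attempt may simply mean that an earlier attempt went through; a client would need to fetch the LLM by that ID to confirm.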
Response Body
curl -X POST "http://localhost:8080/v1/llms" \
  -H "Content-Type: application/json" \
  -d '{
    "displayName": "GPT-4 Turbo",
    "description": "OpenAI'\''s GPT-4 Turbo model for chat completions",
    "providerType": "OPENAI",
    "endpointUrl": "https://api.openai.com/v1",
    "apiPath": "/chat/completions",
    "modelIdentifier": "gpt-4-turbo-preview",
    "supportedModalities": [
      "TEXT"
    ],
    "credentials": {
      "kind": "CREDENTIAL_KIND_API_KEY",
      "apiKey": {
        "inlineSecret": "sk-your-api-key-here"
      }
    },
    "capabilities": {
      "supportsChat": "true",
      "supportsCompletion": "true",
      "supportsFunctionCalling": "true",
      "supportsSystemMessages": "true",
      "supportsStreaming": "true",
      "supportsSamplingParameters": "true"
    },
    "defaultSamplingParams": {
      "maxTokens": "2048",
      "temperature": "0.7",
      "topP": "0.9"
    },
    "maxContextLength": "32768",
    "labels": {
      "environment": "production",
      "team": "ai"
    }
  }'
{
  "llm": {
    "llmId": "550e8400-e29b-41d4-a716-446655440000",
    "displayName": "GPT-4 Turbo",
    "description": "OpenAI's GPT-4 Turbo model for chat completions",
    "providerType": "OPENAI",
    "endpointUrl": "https://api.openai.com/v1",
    "apiPath": "/chat/completions",
    "modelIdentifier": "gpt-4-turbo-preview",
    "supportedModalities": [
      "TEXT"
    ],
    "labels": "{\"environment\": \"production\", \"team\": \"ai\"}",
    "version": "1.0.0",
    "monitoringEndpoint": "https://monitoring.example.com/llms/status",
    "capabilities": {
      "supportsChat": "true",
      "supportsCompletion": "true",
      "supportsFunctionCalling": "true",
      "supportsSystemMessages": "true",
      "supportsStreaming": "true",
      "supportsSamplingParameters": "true"
    },
    "defaultSamplingParams": {
      "maxTokens": "2048",
      "temperature": "0.7",
      "topP": "0.9",
      "topK": "50",
      "frequencyPenalty": "0.0",
      "presencePenalty": "0.0",
      "stopSequences": "[\"\\n\\n\", \"END\"]"
    },
    "maxContextLength": "32768",
    "clientConfig": {
      "property1": {},
      "property2": {}
    },
    "ownerId": "550e8400-e29b-41d4-a716-446655440000",
    "createdAt": "1617293472000",
    "updatedAt": "1617293472000",
    "createdById": "550e8400-e29b-41d4-a716-446655440000",
    "updatedById": "550e8400-e29b-41d4-a716-446655440000"
  },
  "statuses": [
    {
      "code": "PARTIAL_RESULTS",
      "message": "Some embedders were unavailable, returning partial results"
    }
  ]
}
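In the example response, several values appear as JSON strings: the capability flags, maxContextLength, the timestamps, and labels (a JSON-encoded object). Whether the live API serializes them this way or the example is an artifact of the documentation generator is not stated, so a client may want to accept both forms. The sketch below is one way to do that; `normalize_llm` and its helpers are hypothetical names, and reading createdAt/updatedAt as epoch milliseconds is an assumption based on the magnitude of the sample value.

```python
import json
from datetime import datetime, timezone


def as_bool(value):
    """Accept a real boolean or the strings "true"/"false"."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


def as_json(value):
    """Decode a JSON-encoded string; pass through already-decoded values."""
    return json.loads(value) if isinstance(value, str) else value


def as_datetime(epoch_ms):
    """Interpret the value as milliseconds since the Unix epoch (assumed)."""
    return datetime.fromtimestamp(int(epoch_ms) / 1000, tz=timezone.utc)


def normalize_llm(llm):
    """Return a copy of the "llm" object with typed values."""
    out = dict(llm)
    if "labels" in out:
        out["labels"] = as_json(out["labels"])
    if "maxContextLength" in out:
        out["maxContextLength"] = int(out["maxContextLength"])
    if "capabilities" in out:
        out["capabilities"] = {k: as_bool(v) for k, v in out["capabilities"].items()}
    for key in ("createdAt", "updatedAt"):
        if key in out:
            out[key] = as_datetime(out[key])
    return out


# Values copied from the example response above.
sample = {
    "labels": "{\"environment\": \"production\", \"team\": \"ai\"}",
    "maxContextLength": "32768",
    "capabilities": {"supportsChat": "true", "supportsStreaming": "true"},
    "createdAt": "1617293472000",
}
normalized = normalize_llm(sample)
```

The helpers pass through values that are already typed, so the same code keeps working if the server returns native booleans, numbers, or objects.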