LLM Service API Reference
Services
LLMService
Service for managing LLM configurations in the GoodMem system.
Authentication: gRPC metadata authorization: Bearer <api-key>
Global errors: All RPCs may return DEADLINE_EXCEEDED, CANCELLED, UNAVAILABLE, RESOURCE_EXHAUSTED, INTERNAL.
Permissions model:
- *_OWN: operate on caller-owned LLMs
- *_ANY: operate on any user's LLMs (requires elevated role)
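As an illustrative sketch of this ownership model (the helper name and signature are hypothetical, not GoodMem's actual implementation):

```python
def authorize(caller_id: str, resource_owner_id: str, granted: set[str],
              own_perm: str, any_perm: str) -> bool:
    """*_ANY permissions apply to any user's resources; *_OWN only to
    resources the caller owns. Hypothetical helper for illustration."""
    if any_perm in granted:
        return True
    return own_perm in granted and caller_id == resource_owner_id

# A caller holding only READ_LLM_OWN can read their own LLM, not another user's:
assert authorize("u1", "u1", {"READ_LLM_OWN"}, "READ_LLM_OWN", "READ_LLM_ANY")
assert not authorize("u1", "u2", {"READ_LLM_OWN"}, "READ_LLM_OWN", "READ_LLM_ANY")
assert authorize("u1", "u2", {"READ_LLM_ANY"}, "READ_LLM_OWN", "READ_LLM_ANY")
```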
CreateLLM
Creates a new LLM configuration.
| Type | |
|---|---|
| Request | goodmem.v1.CreateLLMRequest |
| Response | goodmem.v1.CreateLLMResponse |
Auth: gRPC metadata authorization: Bearer <api-key>
Permissions Required: CREATE_LLM_OWN or CREATE_LLM_ANY
Summary:
- Owner defaults to the authenticated user unless owner_id is provided (requires *_ANY if it differs from the caller)
- ALREADY_EXISTS: another LLM exists with identical {owner_id, provider_type, endpoint_url, api_path, model_identifier, credentials_fingerprint} after URL canonicalization: trailing slash stripped; host comparison case-insensitive; default ports removed; path and model_identifier comparisons case-sensitive; credentials_fingerprint is the SHA-256 hash of the normalized credentials
Side Effects:
- Persists the LLM; encrypts credentials; sets audit fields
Error Codes:
- UNAUTHENTICATED: missing/invalid auth
- PERMISSION_DENIED: lacks CREATE_LLM_*
- INVALID_ARGUMENT: bad URL(s); empty/invalid fields; UNSPECIFIED enums; labels exceed limits/charset; unsupported modality; invalid sampling parameters
- ALREADY_EXISTS: matching LLM as defined above
- INTERNAL: unexpected server error
Idempotency: Non-idempotent; clients SHOULD NOT blindly retry on unknown failures.
Examples:
grpcurl -plaintext \
-H 'authorization: Bearer gm_xxx' \
-d '{
"display_name": "GPT-4 Turbo",
"provider_type": "LLM_PROVIDER_TYPE_OPENAI",
"endpoint_url": "https://api.openai.com",
"model_identifier": "gpt-4-turbo-preview",
"capabilities": {"supports_chat": true, "supports_streaming": true},
"credentials": "sk-***",
"labels": {"env":"prod","team":"ai"}
}' \
localhost:8080 goodmem.v1.LLMService/CreateLLM
Note: bytes fields in JSON must be base64.
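The duplicate-detection key described above can be sketched as follows. This is an illustrative Python sketch of the documented canonicalization rules, not the server's actual code; function names are hypothetical.

```python
import hashlib
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def canonicalize_url(url: str) -> str:
    """Apply the documented rules: strip trailing slash, compare host
    case-insensitively, drop default ports; path stays case-sensitive."""
    parts = urlsplit(url)
    host = parts.hostname.lower()
    port = parts.port
    netloc = host if port in (None, DEFAULT_PORTS.get(parts.scheme)) else f"{host}:{port}"
    path = parts.path.rstrip("/")
    return f"{parts.scheme}://{netloc}{path}"

def credentials_fingerprint(normalized_credentials: bytes) -> str:
    # SHA-256 hash over the normalized credential payload
    return hashlib.sha256(normalized_credentials).hexdigest()

# Two spellings of the same endpoint canonicalize identically:
assert canonicalize_url("https://API.OpenAI.com:443/") == canonicalize_url("https://api.openai.com")
```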
GetLLM
Retrieves details of a specific LLM.
| Type | |
|---|---|
| Request | goodmem.v1.GetLLMRequest |
| Response | goodmem.v1.LLM |
Auth: gRPC metadata authorization: Bearer <api-key>
Permissions Required: READ_LLM_OWN or READ_LLM_ANY
Side Effects: None
Error Codes:
- UNAUTHENTICATED: missing/invalid auth
- PERMISSION_DENIED: lacks READ_LLM_*
- INVALID_ARGUMENT: invalid LLM ID format
- NOT_FOUND: LLM does not exist
- INTERNAL: unexpected server error
Idempotency: Read-only; safe to retry; results may change over time.
Examples:
grpcurl -plaintext \
-H 'authorization: Bearer gm_xxx' \
-d '{ "llm_id": "BASE64_UUID_BYTES_HERE" }' \
localhost:8080 goodmem.v1.LLMService/GetLLM
Note: bytes fields in JSON must be base64.
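The BASE64_UUID_BYTES_HERE placeholder is the base64 encoding of the UUID's 16 raw bytes, not of its textual form. A small sketch (the helper name is illustrative):

```python
import base64
import uuid

def uuid_to_bytes_field(u: str) -> str:
    """Encode a UUID's 16 raw bytes as the base64 string that
    grpcurl expects for proto bytes fields in JSON."""
    return base64.b64encode(uuid.UUID(u).bytes).decode("ascii")

assert uuid_to_bytes_field("123e4567-e89b-12d3-a456-426614174000") == "Ej5FZ+ibEtOkVkJmFBdAAA=="
```

Note that base64-encoding the 36-character UUID string itself would produce a 48-byte value and be rejected as an invalid ID.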
ListLLMs
Lists LLMs accessible to the authenticated user.
| Type | |
|---|---|
| Request | goodmem.v1.ListLLMsRequest |
| Response | goodmem.v1.ListLLMsResponse |
Auth: gRPC metadata authorization: Bearer <api-key>
Permissions Required: LIST_LLM_OWN or LIST_LLM_ANY
Request Parameters:
- owner_id (optional, bytes UUID): filter by owner. If the caller has LIST_LLM_ANY and owner_id is omitted, all LLMs visible to the role are returned; otherwise only caller-owned LLMs are returned
- provider_type (optional): UNSPECIFIED is ignored
- label_selectors (optional): AND of exact key=value matches (case-sensitive)
Note: bytes fields in JSON must be base64.
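The AND semantics of label_selectors can be expressed as a one-liner; this is an illustrative sketch, not server code:

```python
def matches(labels: dict[str, str], selectors: dict[str, str]) -> bool:
    """label_selectors is a conjunction: every key=value pair must
    match exactly, case-sensitively. Empty selectors match everything."""
    return all(labels.get(k) == v for k, v in selectors.items())

llm_labels = {"env": "prod", "team": "ai"}
assert matches(llm_labels, {"env": "prod"})
assert not matches(llm_labels, {"env": "Prod"})                  # case-sensitive
assert not matches(llm_labels, {"env": "prod", "region": "us"})  # AND semantics
```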
Side Effects: None
Error Codes:
- UNAUTHENTICATED: missing/invalid auth
- PERMISSION_DENIED: lacks LIST_LLM_*
- INVALID_ARGUMENT: invalid filters or parameters
- INTERNAL: unexpected server error
Idempotency: Read-only; safe to retry; results may change over time.
Examples:
grpcurl -plaintext \
-H 'authorization: Bearer gm_xxx' \
-d '{ "provider_type":"LLM_PROVIDER_TYPE_OPENAI", "label_selectors":{"env":"prod"} }' \
localhost:8080 goodmem.v1.LLMService/ListLLMs
UpdateLLM
Updates mutable properties of an LLM.
| Type | |
|---|---|
| Request | goodmem.v1.UpdateLLMRequest |
| Response | goodmem.v1.LLM |
Auth: gRPC metadata authorization: Bearer <api-key>
Permissions Required: UPDATE_LLM_OWN or UPDATE_LLM_ANY
Side Effects:
- Persists changes; updates updated_at and updated_by_id
Error Codes:
- UNAUTHENTICATED: missing/invalid auth
- PERMISSION_DENIED: lacks UPDATE_LLM_*
- INVALID_ARGUMENT: invalid fields or formats
- NOT_FOUND: LLM does not exist
- INTERNAL: unexpected server error
Idempotency: Idempotent with identical input; safe to retry.
Examples:
grpcurl -plaintext \
-H 'authorization: Bearer gm_xxx' \
-d '{
"llm_id": "BASE64_UUID_BYTES_HERE",
"replace_labels": { "items": {} }
}' \
localhost:8080 goodmem.v1.LLMService/UpdateLLM
Note: bytes fields in JSON must be base64.
DeleteLLM
Permanently deletes an LLM configuration.
| Type | |
|---|---|
| Request | goodmem.v1.DeleteLLMRequest |
| Response | google.protobuf.Empty |
Auth: gRPC metadata authorization: Bearer <api-key>
Permissions Required: DELETE_LLM_OWN or DELETE_LLM_ANY
Side Effects:
- Removes the LLM record; securely deletes stored credentials
- Does not cancel in-flight generations; previously generated content is unaffected
Error Codes:
- UNAUTHENTICATED: missing/invalid auth
- PERMISSION_DENIED: lacks DELETE_LLM_*
- INVALID_ARGUMENT: invalid LLM ID format
- NOT_FOUND: LLM does not exist
- INTERNAL: unexpected server error
Idempotency: Safe to retry; may return NOT_FOUND if already deleted or never existed.
Examples:
grpcurl -plaintext \
-H 'authorization: Bearer gm_xxx' \
-d '{ "llm_id": "BASE64_UUID_BYTES_HERE" }' \
localhost:8080 goodmem.v1.LLMService/DeleteLLM
Note: bytes fields in JSON must be base64.
Messages
LLMCapabilities
Capabilities and features supported by an LLM service.
Defines the interface features and generation modes available from the model endpoint. These capabilities determine which API methods and request formats can be used.
Multi-modal capabilities:
- Vision support is derived from IMAGE in the supported_modalities field
- Audio support is derived from AUDIO in the supported_modalities field
Capability interactions:
- supports_chat and supports_completion may be independent; some providers expose only one mode
- supports_function_calling requires supports_chat in most implementations
- supports_system_messages enhances both chat and completion modes when available
- supports_streaming applies to both chat and completion generation modes
- supports_sampling_parameters controls whether stochastic generation knobs (e.g., temperature, top_p) are available; deterministic-only models will disable this
See also: goodmem.v1.Modality
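A client might sanity-check capability combinations against the interaction notes above before creating an LLM. This is an illustrative sketch, not part of the API:

```python
def capability_warnings(caps: dict[str, bool]) -> list[str]:
    """Flag unusual LLMCapabilities combinations per the interaction
    notes (hypothetical client-side helper)."""
    warnings = []
    if caps.get("supports_function_calling") and not caps.get("supports_chat"):
        warnings.append("function calling usually requires chat support")
    if not caps.get("supports_chat") and not caps.get("supports_completion"):
        warnings.append("no generation mode enabled")
    return warnings

assert capability_warnings({"supports_chat": True, "supports_streaming": True}) == []
```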
| Field | Type | Description |
|---|---|---|
| supports_chat | bool | Supports conversational/chat completion format with message roles |
| supports_completion | bool | Supports raw text completion with prompt continuation |
| supports_function_calling | bool | Supports function/tool calling with structured responses |
| supports_system_messages | bool | Supports system prompts to define model behavior and context |
| supports_streaming | bool | Supports real-time token streaming during generation |
| supports_sampling_parameters | bool | Supports stochastic sampling controls (e.g., temperature, top_p) for non-deterministic output |
LLMSamplingParams
Sampling and generation parameters for controlling LLM text output.
These parameters fine-tune the generation behavior, creativity, and output constraints of the language model. Different providers may support different subsets of parameters.
Parameter interactions:
- temperature and top_p work together; lower values increase determinism
- top_k is primarily used by local/open-source models (HuggingFace, Ollama)
- frequency_penalty and presence_penalty help reduce repetitive output
- stop_sequences provide precise generation termination control
Provider compatibility:
- OpenAI: supports all except top_k
- Local models (vLLM, Ollama): typically support all parameters
- Custom providers: parameter support varies by implementation
| Field | Type | Description |
|---|---|---|
| max_tokens | int32 | OPTIONAL maximum tokens to generate; >0 if set; provider-dependent limits apply |
| temperature | float | OPTIONAL sampling temperature 0.0-2.0; 0.0=deterministic, 2.0=highly random; default varies by provider |
| top_p | float | OPTIONAL nucleus sampling threshold 0.0-1.0; smaller values focus on higher probability tokens |
| top_k | int32 | OPTIONAL top-k sampling limit; >0 if set; primarily for local/open-source models |
| frequency_penalty | float | OPTIONAL frequency penalty -2.0 to 2.0; positive values reduce repetition based on frequency |
| presence_penalty | float | OPTIONAL presence penalty -2.0 to 2.0; positive values encourage topic diversity |
| stop_sequences | string | OPTIONAL generation stop sequences; ≤10 sequences; each ≤100 chars; generation halts on exact match |
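The ranges in the table above could be checked client-side before sending a request. This is an illustrative sketch; the server performs its own validation:

```python
def validate_sampling_params(p: dict) -> list[str]:
    """Check LLMSamplingParams against the documented ranges;
    returns a list of violations (hypothetical helper)."""
    errors = []
    if "max_tokens" in p and p["max_tokens"] <= 0:
        errors.append("max_tokens must be > 0")
    if "temperature" in p and not (0.0 <= p["temperature"] <= 2.0):
        errors.append("temperature must be in [0.0, 2.0]")
    if "top_p" in p and not (0.0 <= p["top_p"] <= 1.0):
        errors.append("top_p must be in [0.0, 1.0]")
    if "top_k" in p and p["top_k"] <= 0:
        errors.append("top_k must be > 0")
    for pen in ("frequency_penalty", "presence_penalty"):
        if pen in p and not (-2.0 <= p[pen] <= 2.0):
            errors.append(f"{pen} must be in [-2.0, 2.0]")
    seqs = p.get("stop_sequences", [])
    if len(seqs) > 10 or any(len(s) > 100 for s in seqs):
        errors.append("stop_sequences: at most 10 sequences of at most 100 chars each")
    return errors

assert validate_sampling_params({"temperature": 0.7, "top_p": 0.9}) == []
assert validate_sampling_params({"temperature": 3.0}) == ["temperature must be in [0.0, 2.0]"]
```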
LLM
Represents a connection to a Large Language Model service for text generation.
LLMs provide generative AI capabilities for chat completion, text completion, and function calling by interfacing with both hosted services (OpenAI, API gateways) and self-hosted models (vLLM, Ollama). Each configuration includes connection details, model parameters, generation capabilities, and access credentials.
Security:
credentials is INPUT_ONLY and is omitted from all responses.
Immutability:
- provider_type is IMMUTABLE after creation.
- owner_id is set at creation and cannot be modified.
Notes:
- All timestamps are UTC (google.protobuf.Timestamp).
- Complex configuration objects (capabilities, default_sampling_params, client_config) are stored as JSONB.
See also: LLMProviderType, LLMCapabilities, LLMSamplingParams, goodmem.v1.Modality
| Field | Type | Description |
|---|---|---|
| llm_id | bytes | OUTPUT_ONLY UUID (16 bytes); immutable primary identifier |
| display_name | string | REQUIRED on create; ≤255 chars; leading/trailing whitespace trimmed; cannot be empty |
| description | string | OPTIONAL |
| provider_type | goodmem.v1.LLMProviderType | REQUIRED on create; IMMUTABLE thereafter |
| endpoint_url | string | REQUIRED HTTP(S) URL; server strips trailing slash; host comparison case-insensitive; default ports removed |
| api_path | string | OPTIONAL; if empty on create, defaults to "/v1/chat/completions" |
| model_identifier | string | REQUIRED on create; non-empty after trimming |
| supported_modalities | goodmem.v1.Modality | OUTPUT semantics: server-stored set; defaults to TEXT if omitted at create. See: goodmem.v1.Modality |
| credentials | ...dmem.v1.EndpointAuthentication | INPUT_ONLY; optional when provider allows anonymous access; never returned in responses |
| labels | goodmem.v1.LLM.LabelsEntry | ≤20 entries; keys/values ≤255 chars; keys [a-z0-9._-], case-sensitive; merge overwrites on exact key match |
| version | string | OPTIONAL |
| monitoring_endpoint | string | OPTIONAL HTTP(S) URL for health/metrics |
| capabilities | goodmem.v1.LLMCapabilities | REQUIRED on create; defines supported generation modes and features |
| default_sampling_params | goodmem.v1.LLMSamplingParams | OPTIONAL default parameters for generation requests |
| max_context_length | int32 | OPTIONAL maximum context window size in tokens; >0 if set |
| client_config | google.protobuf.Struct | OPTIONAL provider-specific configuration as a flexible JSON structure |
| owner_id | bytes | OUTPUT_ONLY owner UUID (16 bytes); set at create; not updatable (standard audit field) |
| created_at | google.protobuf.Timestamp | OUTPUT_ONLY |
| updated_at | google.protobuf.Timestamp | OUTPUT_ONLY |
| created_by_id | bytes | OUTPUT_ONLY |
| updated_by_id | bytes | OUTPUT_ONLY |
LLM.LabelsEntry
| Field | Type | Description |
|---|---|---|
| key | string | |
| value | string | |
CreateLLMRequest
| Field | Type | Description |
|---|---|---|
| llm_id | bytes | Optional: client-provided UUID (16 bytes); server generates if omitted; returns ALREADY_EXISTS if the ID exists |
| display_name | string | Required: user-facing name (≤255 chars; leading/trailing whitespace trimmed; cannot be empty) |
| description | string | Optional: description of the LLM's purpose |
| provider_type | goodmem.v1.LLMProviderType | Required: provider type; LLM_PROVIDER_TYPE_UNSPECIFIED → INVALID_ARGUMENT |
| endpoint_url | string | Required: HTTP(S) URL; server strips trailing slash |
| api_path | string | Optional: API path; if empty, defaults to "/v1/chat/completions" |
| model_identifier | string | Required: model identifier string (non-empty after trimming) |
| supported_modalities | goodmem.v1.Modality | Optional: supported modalities; empty defaults to TEXT only. See: goodmem.v1.Modality |
| credentials | ...dmem.v1.EndpointAuthentication | Structured credential payload describing upstream authentication; omit when the provider does not require credentials |
| labels | ...1.CreateLLMRequest.LabelsEntry | Optional: labels (≤20 entries; keys/values ≤255 chars; keys [a-z0-9._-], case-sensitive) |
| version | string | Optional: version information for the model/service |
| monitoring_endpoint | string | Optional: HTTP(S) URL for health/metrics |
| capabilities | goodmem.v1.LLMCapabilities | Required: LLM capabilities defining supported features and modes |
| default_sampling_params | goodmem.v1.LLMSamplingParams | Optional: default sampling parameters for generation requests |
| max_context_length | int32 | Optional: maximum context window size in tokens (>0 if set) |
| client_config | google.protobuf.Struct | Optional: provider-specific client configuration as a flexible JSON structure |
| owner_id | bytes | Optional: owner ID (16-byte UUID); if omitted → authenticated user; requires CREATE_LLM_ANY if different from caller |
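The label constraints stated above (≤20 entries; keys/values ≤255 chars; keys limited to [a-z0-9._-]) can be sketched as a client-side check; this is illustrative, not the server's validation code:

```python
import re

KEY_RE = re.compile(r"[a-z0-9._-]+")

def validate_labels(labels: dict[str, str]) -> bool:
    """Check the documented label constraints: at most 20 entries,
    keys/values at most 255 chars, keys restricted to [a-z0-9._-]."""
    if len(labels) > 20:
        return False
    return all(
        KEY_RE.fullmatch(k) and len(k) <= 255 and len(v) <= 255
        for k, v in labels.items()
    )

assert validate_labels({"env": "prod", "team": "ai"})
assert not validate_labels({"Env": "prod"})  # uppercase key rejected
```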
CreateLLMRequest.LabelsEntry
| Field | Type | Description |
|---|---|---|
| key | string | |
| value | string | |
CreateLLMResponse
Response message for the CreateLLM RPC.
Contains the newly created LLM configuration and any informational statuses generated
during the creation process, such as results from capability inference.
| Field | Type | Description |
|---|---|---|
| llm | goodmem.v1.LLM | The created `LLM` configuration. |
| statuses | goodmem.v1.GoodMemStatus | Optional: a list of statuses detailing the results of server-side operations, such as capability inference. See `goodmem.v1.GoodMemStatus`. |
GetLLMRequest
| Field | Type | Description |
|---|---|---|
| llm_id | bytes | Required: LLM ID (16 bytes UUID) |
ListLLMsRequest
| Field | Type | Description |
|---|---|---|
| owner_id | bytes | Optional: filter by owner (16 bytes UUID) |
| provider_type | goodmem.v1.LLMProviderType | Optional: filter by provider type; LLM_PROVIDER_TYPE_UNSPECIFIED is ignored |
| label_selectors | ...LMsRequest.LabelSelectorsEntry | Optional: conjunction (AND) of exact key=value matches |
ListLLMsRequest.LabelSelectorsEntry
| Field | Type | Description |
|---|---|---|
| key | string | |
| value | string | |
ListLLMsResponse
Response message for the ListLLMs RPC.
Contains a list of LLM configurations that match the request filters and are accessible by the authenticated user.
| Field | Type | Description |
|---|---|---|
| llms | goodmem.v1.LLM | List of `LLM`s matching filters and permissions. |
UpdateLLMRequest
| Field | Type | Description |
|---|---|---|
| llm_id | bytes | Required: ID of the LLM to update (16 bytes UUID) |
| display_name | string | Update display name (≤255 chars; cannot be empty). This and all following fields are optional; omitted fields are unchanged |
| description | string | Update description |
| endpoint_url | string | Update endpoint URL (must be a valid HTTP/HTTPS URL) |
| api_path | string | Update API path |
| model_identifier | string | Update model identifier (cannot be empty) |
| supported_modalities | goodmem.v1.Modality | Not supported in v1: proto3 cannot distinguish an omitted repeated field from an empty one, so this field is ignored on update. (If/when field presence is added, an empty set would clear all modalities.) See: goodmem.v1.Modality |
| credentials | ...dmem.v1.EndpointAuthentication | Update credentials; replaces any existing payload when set |
| version | string | Update version information |
| monitoring_endpoint | string | Update monitoring endpoint URL |
| capabilities | goodmem.v1.LLMCapabilities | Update LLM capabilities (replaces the entire capability set; clients MUST send all flags) |
| default_sampling_params | goodmem.v1.LLMSamplingParams | Update default sampling parameters |
| max_context_length | int32 | Update maximum context window size in tokens |
| client_config | google.protobuf.Struct | Update provider-specific client configuration (replaces the entire config; no merging) |
| replace_labels | goodmem.v1.StringMap | Replace all existing labels with this set; an empty StringMap clears all labels. See: goodmem.v1.StringMap |
| merge_labels | goodmem.v1.StringMap | Merge with existing labels (upsert with overwrite); labels not mentioned are preserved. See: goodmem.v1.StringMap |
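The difference between replace_labels and merge_labels can be sketched with plain dict operations (illustrative; not the server's code):

```python
def replace_labels(existing: dict[str, str], replacement: dict[str, str]) -> dict[str, str]:
    # replace_labels: discard everything and keep only the new set;
    # an empty map clears all labels
    return dict(replacement)

def merge_labels(existing: dict[str, str], updates: dict[str, str]) -> dict[str, str]:
    # merge_labels: upsert with overwrite; keys not mentioned are preserved
    merged = dict(existing)
    merged.update(updates)
    return merged

current = {"env": "prod", "team": "ai"}
assert merge_labels(current, {"env": "staging"}) == {"env": "staging", "team": "ai"}
assert replace_labels(current, {}) == {}
```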
DeleteLLMRequest
| Field | Type | Description |
|---|---|---|
| llm_id | bytes | Required: ID of the LLM to delete (16 bytes UUID) |
Enums
LLMProviderType
LLM provider type for text generation services.
Organized by deployment model with distinct value ranges:
- 0: Invalid type
- 1-99: General providers (10-19 currently assigned to cloud/SaaS providers and API gateways; the remainder reserved for future use)
- 100-199: Self-hosted and local deployment solutions
- 999: Generic compatibility fallback
| Name | Value | Description |
|---|---|---|
| LLM_PROVIDER_TYPE_UNSPECIFIED | 0 | Invalid provider type; `INVALID_ARGUMENT` on writes |
| LLM_PROVIDER_TYPE_OPENAI | 10 | OpenAI API service (api.openai.com); start of the SaaS/gateway range (10-19) |
| LLM_PROVIDER_TYPE_LITELLM_PROXY | 11 | LiteLLM proxy for unified model access |
| LLM_PROVIDER_TYPE_OPEN_ROUTER | 12 | OpenRouter API gateway service |
| LLM_PROVIDER_TYPE_VLLM | 100 | vLLM inference server for high-performance serving; start of the local/self-hosted range (100-199) |
| LLM_PROVIDER_TYPE_OLLAMA | 101 | Ollama local model runner |
| LLM_PROVIDER_TYPE_LLAMA_CPP | 102 | llama-cpp-python server implementations |
| LLM_PROVIDER_TYPE_CUSTOM_OPENAI_COMPATIBLE | 999 | Any OpenAI-compatible API endpoint (generic compatibility fallback, 999) |
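The value ranges above can be mapped to deployment categories mechanically; a hypothetical helper illustrating the scheme:

```python
def provider_category(value: int) -> str:
    """Map an LLMProviderType numeric value to its documented
    range category (illustrative helper, not part of the API)."""
    if value == 0:
        return "unspecified"
    if 10 <= value <= 19:
        return "saas_or_gateway"
    if 100 <= value <= 199:
        return "self_hosted"
    if value == 999:
        return "openai_compatible_fallback"
    return "reserved"

assert provider_category(10) == "saas_or_gateway"   # LLM_PROVIDER_TYPE_OPENAI
assert provider_category(101) == "self_hosted"      # LLM_PROVIDER_TYPE_OLLAMA
```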