goodmem llm create
Create a new LLM

Synopsis

Create a new LLM in the GoodMem service with the specified configuration.

goodmem llm create [flags]

Examples

# Create an OpenAI GPT-4 LLM with a client-provided ID
goodmem llm create \
--id "123e4567-e89b-12d3-a456-426614174000" \
--display-name "My GPT-4" \
--provider-type OPENAI \
--endpoint-url "https://api.openai.com/v1" \
--model-identifier "gpt-4o" \
--cred-api-key "sk-..." \
--supports-chat

# Create an OpenAI GPT-4 LLM (server-generated ID)
goodmem llm create \
--display-name "My GPT-4" \
--provider-type OPENAI \
--endpoint-url "https://api.openai.com/v1" \
--model-identifier "gpt-4o" \
--cred-api-key "sk-..." \
--supports-chat \
--supports-streaming \
--supports-function-calling \
--sampling-max-tokens 4096 \
--sampling-temperature 0.7

# Create a LiteLLM proxy LLM using a bearer token
# (pass the raw token; GoodMem sends it as "Authorization: Bearer <key>")
goodmem llm create \
--display-name "LiteLLM Claude" \
--provider-type LITELLM_PROXY \
--endpoint-url "https://llm-proxy.internal/v1" \
--model-identifier "anthropic/claude-3-opus" \
--cred-api-key "Bearer token" \
--supports-chat \
--supports-system-messages \
--sampling-max-tokens 2048

# Create a Vertex AI LLM using ADC credentials
goodmem llm create \
--display-name "Vertex GPT" \
--provider-type OPENAI \
--endpoint-url "https://us-central1-aiplatform.googleapis.com" \
--model-identifier "text-bison" \
--cred-gcp \
--cred-gcp-scope https://www.googleapis.com/auth/cloud-platform \
--cred-gcp-quota my-billing-project \
--supports-chat

# Create a local VLLM LLM
goodmem llm create \
--display-name "Local Llama" \
--provider-type VLLM \
--endpoint-url "http://localhost:8000" \
--model-identifier "llama3-70b" \
--supports-chat \
--supports-completion
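
# Create a local Ollama LLM (a hedged sketch: the endpoint assumes Ollama's
# default port 11434, and "llama3" is a placeholder model identifier)
goodmem llm create \
--display-name "Local Ollama Llama 3" \
--provider-type OLLAMA \
--endpoint-url "http://localhost:11434" \
--model-identifier "llama3" \
--supports-chat \
--supports-streaming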

Options

      --api-path string                       API path (defaults to /v1/chat/completions)
      --client-config string                  Provider-specific client configuration as JSON string
      --cred-api-key string                   Inline API key stored by GoodMem (sends Authorization: Bearer <key>)
      --cred-gcp                              Use Google Application Default Credentials
      --cred-gcp-quota string                 Quota project for Google ADC requests
      --cred-gcp-scope strings                Additional Google ADC OAuth scope (can be specified multiple times)
      --description string                   Description of the LLM
      --display-name string                   Display name for the LLM (required)
      --endpoint-url string                   Endpoint URL for the LLM service (required)
  -h, --help                                  help for create
      --id string                             Optional: Client-provided UUID for the LLM (16 bytes). Server generates if omitted.
  -l, --label strings                         Labels in key=value format (can be specified multiple times)
      --max-context-length int32              Maximum context length in tokens
      --modalities strings                    Supported modalities (TEXT, IMAGE, AUDIO, VIDEO) (default [TEXT])
      --model-identifier string               Model identifier (required)
      --monitoring-endpoint string            Monitoring endpoint URL
      --no-supports-chat                      LLM does not support chat/conversation mode
      --no-supports-completion                LLM does not support text completion mode
      --no-supports-function-calling          LLM does not support function calling
      --no-supports-sampling-parameters       LLM does not support sampling parameters
      --no-supports-streaming                 LLM does not support streaming responses
      --no-supports-system-messages           LLM does not support system messages
      --owner string                          Owner ID for the LLM (requires admin permissions)
      --provider-type string                  Provider type (OPENAI, LITELLM_PROXY, OPEN_ROUTER, VLLM, OLLAMA, LLAMA_CPP, CUSTOM_OPENAI_COMPATIBLE) (required)
      --sampling-frequency-penalty float32    Frequency penalty (-2.0 to 2.0)
      --sampling-max-tokens int32             Maximum number of tokens to generate
      --sampling-presence-penalty float32     Presence penalty (-2.0 to 2.0)
      --sampling-stop-sequences strings       Stop sequences (can be specified multiple times)
      --sampling-temperature float32          Sampling temperature (0.0 to 2.0)
      --sampling-top-k int32                  Top-k sampling parameter
      --sampling-top-p float32                Top-p sampling parameter (0.0 to 1.0)
      --supports-chat                         LLM supports chat/conversation mode
      --supports-completion                   LLM supports text completion mode
      --supports-function-calling             LLM supports function calling
      --supports-sampling-parameters          LLM supports sampling parameters (temperature, top_p, etc.)
      --supports-streaming                    LLM supports streaming responses
      --supports-system-messages              LLM supports system messages
      --version string                        Version of the LLM
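
The repeatable flags above compose with the capability toggles. A hedged sketch (every name, URL, key, and label value below is a placeholder; repeating --modalities is an assumption based on its strings type, mirroring the documented behavior of --label):

goodmem llm create \
--display-name "Multimodal GPT" \
--provider-type OPENAI \
--endpoint-url "https://api.openai.com/v1" \
--model-identifier "gpt-4o" \
--cred-api-key "sk-..." \
--modalities TEXT --modalities IMAGE \
--label team=ml --label env=prod \
--supports-chat \
--supports-streaming \
--no-supports-completion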

Options inherited from parent commands

      --api-key string   API key for authentication (can also be set via GOODMEM_API_KEY environment variable)
      --server string    GoodMem server address (gRPC API) (default "https://localhost:9090")
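
The API key can also come from the environment instead of the flag. A minimal sketch, assuming a reachable GoodMem server; the host, key, and model details are placeholders:

export GOODMEM_API_KEY="<your-api-key>"
goodmem llm create \
--server "https://goodmem.example.com:9090" \
--display-name "Staging Llama" \
--provider-type VLLM \
--endpoint-url "http://vllm.staging.internal:8000" \
--model-identifier "llama3-70b" \
--supports-chat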

SEE ALSO

- goodmem llm - Manage GoodMem LLMs