ChatPostProcessor

Comprehensive guide to GoodMem's built-in post processor for conversational AI applications

The ChatPostProcessor is GoodMem's built-in post processor designed for conversational AI applications. It handles the complete pipeline from raw vector search results to contextual responses, including reranking, filtering, chronological sorting, and AI-generated summaries.

How It Works

The ChatPostProcessor follows a specific processing pipeline:

  1. Reranking (optional) - Uses a reranker model to improve result ordering
  2. Relevance filtering - Removes results below the configured threshold (only when reranking is enabled and a threshold is provided)
  3. Chronological sorting (optional) - Sorts by memory creation time
  4. Result streaming - Sends results back to your application
  5. AI summarization (optional) - Generates contextual abstracts using LLM

Configuration Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| reranker_id | UUID string | none | Reranker model to use for result reordering. Validated before processing starts. |
| relevance_threshold | number | unset | Optional minimum relevance score (only applies when reranking is enabled) |
| chronological_resort | boolean | true | Whether to sort results by creation time |
| llm_id | UUID string | none | LLM model to use for generating summaries. Validated before processing starts. |
| llm_temp | number | 0.3 | Temperature setting for LLM generation (valid range: 0.0-2.0) |
| max_results | integer | 10 | Maximum number of results to return (must be positive) |
| sys_prompt | string | (see below) | System prompt template for LLM using Pebble syntax |
| prompt | string | (see below) | User prompt template for LLM using Pebble syntax |
| gen_token_budget | integer | 512 | Token budget for LLM generation (must be positive) |
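
For reference, here is a config that sets every parameter explicitly. The UUIDs are placeholders, relevance_threshold has no default so the 0.5 shown is purely illustrative, and sys_prompt / prompt are omitted so the built-in templates apply; the remaining values are the documented defaults.

{
  "postProcessor": {
    "name": "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
    "config": {
      "reranker_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
      "relevance_threshold": 0.5,
      "chronological_resort": true,
      "llm_id": "72eaec11-c698-4262-970b-83aa957f9e02",
      "llm_temp": 0.3,
      "max_results": 10,
      "gen_token_budget": 512
    }
  }
}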

LLM Prompt Templates & Variables

The system and user prompts use the Pebble engine. Each render receives four variables:

| Variable | Type | Description |
|---|---|---|
| userQuery | string | Original RetrieveMemoryRequest.message. |
| dataSection | string | Pre-built chronology block that numbers each entry ([1], [2], …). |
| context | list<string> | Prior conversation turns provided in the request (may be empty). |
| results | list<TemplateRetrievalItem> | Structured access to every filtered retrieval result. |

TemplateRetrievalItem fields:

| Field | Type | Notes |
|---|---|---|
| index | integer | 1-based position matching the [n] labels in dataSection. |
| score | double | Retrieval or reranker score. |
| chunkText | string | Raw chunk text used in prompts. |
| chunkSequenceNumber | integer (nullable) | Position inside the parent memory. |
| memoryId / chunkId | UUID | Identifiers for linking back to GoodMem objects. |
| chunkTimestamp | Instant | Timestamp used for chronological ordering (UTC). |
| relativeTimeDescription | string | Human-readable offset (e.g., 2 hours ago). |
| memoryCreatedAt | Instant (nullable) | Creation time of the parent memory when present. |
| memoryMetadata | map | Flattened metadata map; nested structs remain nested maps/lists. |

Default templates: The built-in prompts combine dataSection with a loop over results to produce numbered citations. Override sys_prompt / prompt to customize the behavior while reusing these variables. For example:

{% for item in results %}
[{{ item.index }}] {{ item.chunkText }} — priority {{ item.memoryMetadata.tags.priority | default('n/a') }}
{% endfor %}

The data section heading reads “Relevant data from most recent to oldest …”. When chronological_resort=false, the results follow the upstream retrieval/reranker order even though the heading stays the same. Keep chronological_resort enabled (the default) if you need the actual ordering to match the heading.
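
For illustration only, a rendered dataSection might look something like the sketch below. The exact per-entry layout is produced internally and may differ; this sketch only combines the documented ingredients (numbered [n] labels, scores, relative timestamps, and chunk text).

Relevant data from most recent to oldest ...
[1] (score 0.91, 2 hours ago) GoodMem is an API service for creating stateful AI applications. The cost is $788.45 per month.
[2] (score 0.74, 3 days ago) ...older chunk text...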

Parameter Name Mapping

Important: Parameter names differ between GET query parameters and POST JSON configuration.

| GET Query Parameter | POST JSON Config | Description |
|---|---|---|
| pp_reranker_id | reranker_id | Reranker model UUID |
| pp_llm_id | llm_id | LLM model UUID |
| pp_relevance_threshold | relevance_threshold | Relevance threshold |
| pp_llm_temp | llm_temp | LLM temperature |
| pp_max_results | max_results | Maximum results |
| pp_chronological_resort | chronological_resort | Chronological sorting |

Common Mistake: Using camelCase like llmId in POST JSON config. Always use snake_case like llm_id to match the internal configuration keys.

Relevance Threshold Tips: Leave relevance_threshold unset to accept all reranked items. Different reranker providers produce scores that peak anywhere between ~0.05 and ~0.9, so set a cutoff only after inspecting your provider's score distribution. Remember that max_results is enforced after filtering: with relevance_threshold=0.7 and max_results=5, if only three of ten reranked items score at or above 0.7, you get three results, not five.

Usage Examples

Basic Usage with LLM Summary

GET Request:

curl -X GET "https://api.goodmem.ai/v1/memories:retrieve?message=What+is+GoodMem+and+how+much+does+it+cost?&spaceIds=f77a8555-0232-4c01-a33e-4f0ca072905e&pp_llm_id=72eaec11-c698-4262-970b-83aa957f9e02" \
  -H "x-api-key: <your-api-key>" \
  -H "Accept: text/event-stream"

POST Request (equivalent):

curl -X POST "https://api.goodmem.ai/v1/memories:retrieve" \
  -H "x-api-key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -H "Accept: application/x-ndjson" \
  -d '{
  "message": "What is GoodMem and how much does it cost?",
  "spaceKeys": [
    {
      "spaceId": "f77a8555-0232-4c01-a33e-4f0ca072905e"
    }
  ],
  "postProcessor": {
    "name": "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
    "config": {
      "llm_id": "72eaec11-c698-4262-970b-83aa957f9e02"
    }
  }
}'
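
When consuming the POST stream programmatically, each NDJSON line is a standalone JSON event. Below is a minimal Python sketch, assuming the requests library and the same endpoint, headers, and body as the curl example above:

import json

import requests  # third-party HTTP client, assumed available

response = requests.post(
    "https://api.goodmem.ai/v1/memories:retrieve",
    headers={
        "x-api-key": "<your-api-key>",
        "Content-Type": "application/json",
        "Accept": "application/x-ndjson",
    },
    json={
        "message": "What is GoodMem and how much does it cost?",
        "spaceKeys": [{"spaceId": "f77a8555-0232-4c01-a33e-4f0ca072905e"}],
        "postProcessor": {
            "name": "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
            "config": {"llm_id": "72eaec11-c698-4262-970b-83aa957f9e02"},
        },
    },
    stream=True,
)

# Each non-empty line of the NDJSON stream is one JSON event.
for line in response.iter_lines():
    if line:
        event = json.loads(line)
        print(event)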

Advanced Configuration with Reranking

curl -X POST "https://api.goodmem.ai/v1/memories:retrieve" \
  -H "x-api-key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
  "message": "What features does GoodMem offer?",
  "spaceKeys": [
    {
      "spaceId": "f77a8555-0232-4c01-a33e-4f0ca072905e"
    }
  ],
  "postProcessor": {
    "name": "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
    "config": {
      "reranker_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
      "relevance_threshold": 0.7,
      "llm_id": "72eaec11-c698-4262-970b-83aa957f9e02",
      "llm_temp": 0.2,
      "max_results": 5,
      "chronological_resort": false
    }
  }
}'

Response Format

The ChatPostProcessor returns streaming responses in the following order:

1. Retrieved Items

Individual memory chunks with relevance scores:

{
  "retrievedItem": {
    "chunk": {
      "chunkId": "70dfe7f7-49fa-4898-bd2c-18870d1cac29",
      "memoryId": "bba608d7-1ac0-4eb1-b1ad-f93278bd0293",
      "chunkText": "GoodMem is an API service for creating stateful AI applications. The cost is $788.45 per month.",
      "vectorStatus": "COMPLETED",
      "createdAt": 1758010477847
    },
    "memoryIndex": 9,
    "relevanceScore": -0.6942077279090881
  }
}

2. Result Set Boundary

Marks the end of retrieved items:

{
  "resultSetBoundary": {
    "resultSetId": "504721f1-d5fc-4e84-872b-fdcc5ee2bade",
    "kind": "END",
    "stageName": ""
  }
}

3. Abstract Reply

AI-generated summary (only if LLM is configured):

{
  "abstractReply": {
    "text": "According to the retrieved data, the cost of GoodMem is $788.45 per month. This information is consistently mentioned across multiple recent data points, indicating a reliable answer to your query.",
    "relevanceScore": 0.0,
    "resultSetId": "504721f1-d5fc-4e84-872b-fdcc5ee2bade"
  }
}

Expected Behavior: The abstractReply appears only once at the end of the stream, after all retrieved items and the result set boundary. If you're not seeing an abstract reply, check your LLM configuration.
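
A sketch of dispatching on these event types in a client, with field names taken from the examples above (error handling omitted):

import json

def handle_event(line: bytes) -> None:
    event = json.loads(line)
    if "retrievedItem" in event:
        item = event["retrievedItem"]
        print(f"[{item['relevanceScore']:.3f}] {item['chunk']['chunkText']}")
    elif "resultSetBoundary" in event:
        # Marks the end of the retrieved items for this result set.
        print("-- end of result set --")
    elif "abstractReply" in event:
        # Arrives at most once, after the boundary, when an LLM is configured.
        print("Summary:", event["abstractReply"]["text"])
    else:
        # Status updates and other event types (see the next section).
        print("Other event:", event)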

Feature Status Messages

The ChatPostProcessor streams status updates alongside retrieval events:

  • FEATURE_DISABLED when no llm_id is supplied (see below)
  • INVALID_ARGUMENT when configuration includes unknown keys or out-of-range numeric values
  • NOT_FOUND warnings when the supplied LLM or reranker UUID fails early validation

Abstract Reply Generation Disabled

When no llm_id is configured, you'll receive a FEATURE_DISABLED status message:

{
  "status": {
    "code": "FEATURE_DISABLED",
    "message": "Abstract reply generation disabled: no LLM configured. Add 'llm_id' parameter to enable AI-generated summaries.",
    "details": {
      "feature": "summarization",
      "required_param": "llm_id"
    }
  }
}

This helps clarify why abstract replies aren't appearing in your response stream.

Common Issues & Troubleshooting

Missing Abstract Reply

Symptoms: You receive retrieved items but no abstractReply in the response.

Possible Causes:

  1. No LLM configured: Ensure llm_id is provided in your configuration
  2. Invalid LLM ID: Verify the LLM exists using the GET /v1/llms endpoint
  3. Parameter naming: Use llm_id (not llmId) in POST JSON config
  4. Aggressive filtering: A high relevance_threshold or very small max_results can leave too few items for the prompt to summarize meaningfully.
  5. LLM failure: Check server logs for LLM-related errors

Solution:

# First, verify your LLM exists
curl -X GET "https://api.goodmem.ai/v1/llms" \
  -H "x-api-key: <your-api-key>"

# Then use the correct LLM ID in your config
{
  "postProcessor": {
    "name": "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
    "config": {
      "llm_id": "72eaec11-c698-4262-970b-83aa957f9e02"
    }
  }
}

Parameter Naming Confusion

Wrong (common mistake):

{
  "config": {
    "llmId": "72eaec11-c698-4262-970b-83aa957f9e02",
    "rerankerId": "a1b2c3d4-e5f6-7890-1234-567890abcdef"
  }
}

Correct:

{
  "config": {
    "llm_id": "72eaec11-c698-4262-970b-83aa957f9e02",
    "reranker_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef"
  }
}

GET vs POST Differences

The GET endpoint is hardwired to use only the ChatPostProcessor with simplified query parameters. The POST endpoint is more powerful - it requires explicit naming of the post processor but allows you to use any post processor registered in your GoodMem instance:

GET: Hardwired to ChatPostProcessor with pp_ query parameters

?pp_llm_id=<llm-id>&pp_relevance_threshold=0.7

POST: Explicit post processor selection with full configuration

{
  "postProcessor": {
    "name": "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
    "config": {
      "llm_id": "<llm-id>",
      "relevance_threshold": 0.7
    }
  }
}

This means if you have custom post processors registered in your GoodMem instance, you can only access them via the POST endpoint by specifying their factory class name.

Custom Prompt Templates

The ChatPostProcessor uses the Pebble template engine to generate LLM prompts. You can customize both the system prompt and the user prompt templates.

Default System Prompt

You are an AI assistant helping to synthesize retrieved memory content to answer a specific user query.

Your task is to analyze the retrieved data and provide insights that directly address the user's question.

IMPORTANT GUIDELINES:
- Preserve specific details, numbers, dates, and facts that are relevant to answering the query
- Do NOT generalize or abstract away important specifics (e.g., "$788.45" should stay "$788.45", not "a lot of money")
- Focus on information that would help the user make decisions or understand the answer to their query
- If the retrieved data doesn't contain relevant information, simply state that the available data doesn't address the query
- Do NOT offer suggestions, recommendations, or volunteer additional help beyond what's in the retrieved data
- Do NOT suggest alternative sources or next steps - stick strictly to what's in the memory data
- Keep the response under 4 sentences but prioritize completeness and accuracy over brevity
- Use a conversational but informative tone

Default User Prompt

{% if context|length > 0 -%}
Previous conversation context:
{%- for contextItem in context %}
- {{ contextItem }}
{%- endfor %}

{% endif -%}
User's Query: "{{ userQuery }}"

Retrieved Memory Data:
{{ dataSection }}

Based on the retrieved memory data above, provide a targeted response that addresses the user's query.
Include all relevant specific details, numbers, and facts that would help answer their question.

Template Variables

When customizing prompts, these variables are available:

| Variable | Type | Description |
|---|---|---|
| userQuery | string | The user's original query/message |
| dataSection | string | Formatted retrieved memory chunks with timestamps and scores |
| context | array | Previous conversation context items (if provided in the request) |

Important: Always include {{ userQuery }} and {{ dataSection }} in your custom prompts. These provide the essential context the LLM needs to generate relevant responses. The {{ context }} variable is optional but recommended for conversational applications.

Custom Prompt Example

curl -X POST "https://api.goodmem.ai/v1/memories:retrieve" \
  -H "x-api-key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
  "message": "What are the key features?",
  "spaceKeys": [{"spaceId": "f77a8555-0232-4c01-a33e-4f0ca072905e"}],
  "postProcessor": {
    "name": "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
    "config": {
      "llm_id": "72eaec11-c698-4262-970b-83aa957f9e02",
      "sys_prompt": "You are a technical documentation assistant. Provide clear, structured answers based only on the retrieved data.",
      "prompt": "Query: {{ userQuery }}\n\nRetrieved Data:\n{{ dataSection }}\n\nAnswer in bullet points with specific details:"
    }
  }
}'

Error Codes

The ChatPostProcessor performs comprehensive validation and returns specific error codes:

| Error Code | When It Occurs | Example Scenario | Solution |
|---|---|---|---|
| INVALID_ARGUMENT | Invalid parameter names, values, or ranges | Using llmId instead of llm_id, or a temperature outside the 0.0-2.0 range | Check that parameter names use snake_case and values are within valid ranges |
| NOT_FOUND | Referenced LLM or reranker doesn't exist | Providing a UUID that doesn't match any configured LLM/reranker | Verify the LLM/reranker exists using the management APIs |
| FEATURE_DISABLED | Feature unavailable due to missing configuration | No llm_id provided, so abstract reply generation is disabled | Add the required configuration parameters to enable the feature |
| RERANKING_FAILED | Reranker processing error | Network timeout or reranker service failure | Check reranker service status and retry |
| SUMMARIZATION_FAILED | LLM processing error | LLM service failure or token limit exceeded | Check LLM service status and token budget configuration |
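
How a client reacts to these codes is up to you; the sketch below is illustrative rather than prescribed by GoodMem, and assumes status events carry the code and message fields shown earlier:

# Illustrative policy: the codes come from the table above.
RETRYABLE = {"RERANKING_FAILED", "SUMMARIZATION_FAILED"}

def handle_status(status: dict) -> None:
    code = status.get("code")
    if code in RETRYABLE:
        # Transient service failures: log and consider retrying the request.
        print(f"Transient failure ({code}): {status.get('message')}")
    elif code == "FEATURE_DISABLED":
        # Expected when llm_id is omitted; informational, not an error.
        print("Feature disabled:", status.get("message"))
    else:
        # INVALID_ARGUMENT / NOT_FOUND indicate a request bug; fix the config.
        raise ValueError(f"{code}: {status.get('message')}")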

Performance Considerations

  • Reranking: Adds processing time but improves result quality
  • LLM Generation: Increases latency but provides valuable summaries
  • Token Budget: Higher budgets allow longer summaries but take more time
  • Result Limits: Use max_results to control processing time and response size

For high-throughput scenarios, consider using the processor without LLM generation for faster responses, then adding summarization only when needed.
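
For example, a low-latency configuration might skip the LLM entirely and rely on reranking alone (the UUID is a placeholder):

{
  "postProcessor": {
    "name": "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
    "config": {
      "reranker_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
      "max_results": 5
    }
  }
}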

Next Steps