ChatPostProcessor
Comprehensive guide to GoodMem's built-in post processor for conversational AI applications
The ChatPostProcessor is GoodMem's built-in post processor designed for conversational AI applications. It handles the complete pipeline from raw vector search results to contextual responses, including reranking, filtering, chronological sorting, and AI-generated summaries.
How It Works
The ChatPostProcessor follows a specific processing pipeline:
- Reranking (optional) - Uses a reranker model to improve result ordering
- Relevance filtering - Removes results below the configured threshold (only when reranking is enabled and a threshold is provided)
- Chronological sorting (optional) - Sorts by memory creation time
- Result streaming - Sends results back to your application
- AI summarization (optional) - Generates contextual abstracts using LLM
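The ordering of these steps can be sketched as pseudocode. This is a simplified illustration only, not GoodMem's actual implementation: the helper name and dict keys (`score`, `created_at`) are hypothetical stand-ins.

```python
# Simplified sketch of the ChatPostProcessor pipeline order; the real
# logic runs inside the GoodMem server. Keys are illustrative only.
def chat_post_process(results, config):
    if config.get("reranker_id"):
        # 1. Reranking: reorder by reranker score, best first.
        results = sorted(results, key=lambda r: r["score"], reverse=True)
        # 2. Relevance filtering: only applies when reranking is
        #    enabled and a threshold was supplied.
        threshold = config.get("relevance_threshold")
        if threshold is not None:
            results = [r for r in results if r["score"] >= threshold]
    # 3. Chronological sorting (on by default): most recent first,
    #    matching the "most recent to oldest" data-section heading.
    if config.get("chronological_resort", True):
        results = sorted(results, key=lambda r: r["created_at"], reverse=True)
    # 4. Truncate to max_results and stream the results back;
    #    5. summarization would follow only if an llm_id is configured.
    return results[: config.get("max_results", 10)]
```

Note how `max_results` is applied after filtering, so an aggressive threshold can yield fewer items than requested.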
Configuration Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| reranker_id | UUID string | none | Reranker model to use for result reordering. Validated before processing starts. |
| relevance_threshold | number | unset | Optional minimum relevance score (only applies when reranking is enabled) |
| chronological_resort | boolean | true | Whether to sort results by creation time |
| llm_id | UUID string | none | LLM model to use for generating summaries. Validated before processing starts. |
| llm_temp | number | 0.3 | Temperature setting for LLM generation (valid range: 0.0-2.0) |
| max_results | integer | 10 | Maximum number of results to return (must be positive) |
| sys_prompt | string | (see below) | System prompt template for LLM using Pebble syntax |
| prompt | string | (see below) | User prompt template for LLM using Pebble syntax |
| gen_token_budget | integer | 512 | Token budget for LLM generation (must be positive) |
LLM Prompt Templates & Variables
The system and user prompts use the Pebble engine. Each render receives four variables:
| Variable | Type | Description |
|---|---|---|
| userQuery | string | Original RetrieveMemoryRequest.message. |
| dataSection | string | Pre-built chronology block that numbers each entry ([1], [2], …). |
| context | list<string> | Prior conversation turns provided in the request (may be empty). |
| results | list<TemplateRetrievalItem> | Structured access to every filtered retrieval result. |
TemplateRetrievalItem fields:
| Field | Type | Notes |
|---|---|---|
| index | integer | 1-based position matching the [n] labels in dataSection. |
| score | double | Retrieval or reranker score. |
| chunkText | string | Raw chunk text used in prompts. |
| chunkSequenceNumber | integer (nullable) | Position inside the parent memory. |
| memoryId / chunkId | UUID | Identifiers for linking back to GoodMem objects. |
| chunkTimestamp | Instant | Timestamp used for chronological ordering (UTC). |
| relativeTimeDescription | string | Human-readable offset (e.g., 2 hours ago). |
| memoryCreatedAt | Instant (nullable) | Creation time of the parent memory when present. |
| memoryMetadata | map | Flattened metadata map; nested structs remain nested maps/lists. |
Default templates
The built-in prompts combine dataSection with a loop over results to produce numbered citations. Override sys_prompt / prompt to customize the behavior while reusing these variables.
{% for item in results %}
[{{ item.index }}] {{ item.chunkText }} — priority {{ item.memoryMetadata.tags.priority | default('n/a') }}
{% endfor %}

The data section heading reads “Relevant data from most recent to oldest …”. When chronological_resort=false, the order mirrors the upstream retrieval/reranker order even though the heading stays the same. Keep chronological resort enabled (the default) to maintain that guarantee.
Parameter Name Mapping
Important: Parameter names differ between GET query parameters and POST JSON configuration.
| GET Query Parameter | POST JSON Config | Description |
|---|---|---|
| pp_reranker_id | reranker_id | Reranker model UUID |
| pp_llm_id | llm_id | LLM model UUID |
| pp_relevance_threshold | relevance_threshold | Relevance threshold |
| pp_llm_temp | llm_temp | LLM temperature |
| pp_max_results | max_results | Maximum results |
| pp_chronological_resort | chronological_resort | Chronological sorting |
Common Mistake: Using camelCase like llmId in POST JSON config. Always use snake_case like llm_id to match the internal configuration keys.
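A small client-side helper can guard against both pitfalls by normalizing GET-style `pp_` names and camelCase slips into the snake_case keys the POST config expects. This is a sketch of a hypothetical convenience function, not part of any GoodMem SDK:

```python
import re

# Normalize a parameter name to the snake_case key the POST JSON
# config expects: strip the GET-style "pp_" prefix, then convert
# camelCase slips such as "llmId" to "llm_id".
def to_post_config_key(name: str) -> str:
    if name.startswith("pp_"):
        name = name[len("pp_"):]
    return re.sub(r"(?<=[a-z0-9])([A-Z])",
                  lambda m: "_" + m.group(1).lower(), name)
```

For example, `to_post_config_key("pp_llm_id")` and `to_post_config_key("llmId")` both yield `llm_id`.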
Relevance Threshold Tips: Leave relevance_threshold unset to accept all reranked items. Different reranker providers produce scores that peak anywhere between ~0.05 and ~0.9, so set a cutoff only after inspecting the provider's distribution. Remember that max_results is enforced after filtering—aggressive thresholds can return fewer items than requested even when more were available.
Usage Examples
Basic Usage with LLM Summary
GET Request:
curl -X GET "https://api.goodmem.ai/v1/memories:retrieve?message=What+is+GoodMem+and+how+much+does+it+cost?&spaceIds=f77a8555-0232-4c01-a33e-4f0ca072905e&pp_llm_id=72eaec11-c698-4262-970b-83aa957f9e02" \
-H "x-api-key: <your-api-key>" \
-H "Accept: text/event-stream"

POST Request (equivalent):
curl -X POST "https://api.goodmem.ai/v1/memories:retrieve" \
-H "x-api-key: <your-api-key>" \
-H "Content-Type: application/json" \
-H "Accept: application/x-ndjson" \
-d '{
"message": "What is GoodMem and how much does it cost?",
"spaceKeys": [
{
"spaceId": "f77a8555-0232-4c01-a33e-4f0ca072905e"
}
],
"postProcessor": {
"name": "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
"config": {
"llm_id": "72eaec11-c698-4262-970b-83aa957f9e02"
}
}
}'

Advanced Configuration with Reranking
curl -X POST "https://api.goodmem.ai/v1/memories:retrieve" \
-H "x-api-key: <your-api-key>" \
-H "Content-Type: application/json" \
-d '{
"message": "What features does GoodMem offer?",
"spaceKeys": [
{
"spaceId": "f77a8555-0232-4c01-a33e-4f0ca072905e"
}
],
"postProcessor": {
"name": "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
"config": {
"reranker_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
"relevance_threshold": 0.7,
"llm_id": "72eaec11-c698-4262-970b-83aa957f9e02",
"llm_temp": 0.2,
"max_results": 5,
"chronological_resort": false
}
}
}'

Response Format
The ChatPostProcessor returns streaming responses in the following order:
1. Retrieved Items
Individual memory chunks with relevance scores:
{
"retrievedItem": {
"chunk": {
"chunkId": "70dfe7f7-49fa-4898-bd2c-18870d1cac29",
"memoryId": "bba608d7-1ac0-4eb1-b1ad-f93278bd0293",
"chunkText": "GoodMem is an API service for creating stateful AI applications. The cost is $788.45 per month.",
"vectorStatus": "COMPLETED",
"createdAt": 1758010477847
},
"memoryIndex": 9,
"relevanceScore": -0.6942077279090881
}
}

2. Result Set Boundary
Marks the end of retrieved items:
{
"resultSetBoundary": {
"resultSetId": "504721f1-d5fc-4e84-872b-fdcc5ee2bade",
"kind": "END",
"stageName": ""
}
}

3. Abstract Reply
AI-generated summary (only if LLM is configured):
{
"abstractReply": {
"text": "According to the retrieved data, the cost of GoodMem is $788.45 per month. This information is consistently mentioned across multiple recent data points, indicating a reliable answer to your query.",
"relevanceScore": 0.0,
"resultSetId": "504721f1-d5fc-4e84-872b-fdcc5ee2bade"
}
}

Expected Behavior: The abstractReply appears only once at the end of the stream, after all retrieved items and the result set boundary. If you're not seeing an abstract reply, check your LLM configuration.
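A client consuming the POST (NDJSON) stream can dispatch on these three event types. A minimal sketch, assuming one JSON object per line; the event keys match the response examples above, while the collection logic is illustrative:

```python
import json

# Dispatch each NDJSON line from the retrieval stream on its event
# type. The keys (retrievedItem, resultSetBoundary, abstractReply)
# match the documented response format; the handling is a sketch.
def handle_stream(lines):
    chunks, abstract = [], None
    for line in lines:
        event = json.loads(line)
        if "retrievedItem" in event:
            chunks.append(event["retrievedItem"]["chunk"]["chunkText"])
        elif "resultSetBoundary" in event:
            pass  # kind == "END" marks the end of retrieved items
        elif "abstractReply" in event:
            abstract = event["abstractReply"]["text"]
    return chunks, abstract
```

Because the abstractReply arrives last, `abstract` stays `None` until the stream is fully consumed (or when no LLM is configured).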
Feature Status Messages
The ChatPostProcessor streams status updates alongside retrieval events:
- FEATURE_DISABLED when no llm_id is supplied (see below)
- INVALID_ARGUMENT when configuration includes unknown keys or out-of-range numeric values
- NOT_FOUND warnings when the supplied LLM or reranker UUID fails early validation
Abstract Reply Generation Disabled
When no llm_id is configured, you'll receive a FEATURE_DISABLED status message:
{
"status": {
"code": "FEATURE_DISABLED",
"message": "Abstract reply generation disabled: no LLM configured. Add 'llm_id' parameter to enable AI-generated summaries.",
"details": {
"feature": "summarization",
"required_param": "llm_id"
}
}
}

This helps clarify why abstract replies aren't appearing in your response stream.
Common Issues & Troubleshooting
Missing Abstract Reply
Symptoms: You receive retrieved items but no abstractReply in the response.
Possible Causes:
- No LLM configured: Ensure llm_id is provided in your configuration
- Invalid LLM ID: Verify the LLM exists using the GET /v1/llms endpoint
- Parameter naming: Use llm_id (not llmId) in POST JSON config
- Aggressive filtering: High relevance_threshold or very small max_results can leave too few items for the prompt to summarize meaningfully
- LLM failure: Check server logs for LLM-related errors
Solution:
# First, verify your LLM exists
curl -X GET "https://api.goodmem.ai/v1/llms" \
-H "x-api-key: <your-api-key>"
# Then use the correct LLM ID in your config
{
"postProcessor": {
"name": "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
"config": {
"llm_id": "72eaec11-c698-4262-970b-83aa957f9e02"
}
}
}

Parameter Naming Confusion
Wrong (common mistake):
{
"config": {
"llmId": "72eaec11-c698-4262-970b-83aa957f9e02",
"rerankerId": "a1b2c3d4-e5f6-7890-1234-567890abcdef"
}
}

Correct:
{
"config": {
"llm_id": "72eaec11-c698-4262-970b-83aa957f9e02",
"reranker_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef"
}
}

GET vs POST Differences
The GET endpoint is hardwired to use only the ChatPostProcessor with simplified query parameters. The POST endpoint is more powerful: it requires explicit naming of the post processor but allows you to use any post processor registered in your GoodMem instance.
GET: Hardwired to ChatPostProcessor with pp_ query parameters
?pp_llm_id=<llm-id>&pp_relevance_threshold=0.7

POST: Explicit post processor selection with full configuration
{
"postProcessor": {
"name": "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
"config": {
"llm_id": "<llm-id>",
"relevance_threshold": 0.7
}
}
}

This means if you have custom post processors registered in your GoodMem instance, you can only access them via the POST endpoint by specifying their factory class name.
Custom Prompt Templates
The ChatPostProcessor uses Pebble template engine for generating LLM prompts. You can customize both the system prompt and user prompt templates.
Default System Prompt
You are an AI assistant helping to synthesize retrieved memory content to answer a specific user query.
Your task is to analyze the retrieved data and provide insights that directly address the user's question.
IMPORTANT GUIDELINES:
- Preserve specific details, numbers, dates, and facts that are relevant to answering the query
- Do NOT generalize or abstract away important specifics (e.g., "$788.45" should stay "$788.45", not "a lot of money")
- Focus on information that would help the user make decisions or understand the answer to their query
- If the retrieved data doesn't contain relevant information, simply state that the available data doesn't address the query
- Do NOT offer suggestions, recommendations, or volunteer additional help beyond what's in the retrieved data
- Do NOT suggest alternative sources or next steps - stick strictly to what's in the memory data
- Keep the response under 4 sentences but prioritize completeness and accuracy over brevity
- Use a conversational but informative tone

Default User Prompt
{% if context|length > 0 -%}
Previous conversation context:
{%- for contextItem in context %}
- {{ contextItem }}
{%- endfor %}
{% endif -%}
User's Query: "{{ userQuery }}"
Retrieved Memory Data:
{{ dataSection }}
Based on the retrieved memory data above, provide a targeted response that addresses the user's query.
Include all relevant specific details, numbers, and facts that would help answer their question.

Template Variables
When customizing prompts, these variables are available:
| Variable | Type | Description |
|---|---|---|
| userQuery | string | The original user's query/message |
| dataSection | string | Formatted retrieved memory chunks with timestamps and scores |
| context | array | Previous conversation context items (if provided in request) |
Important: Always include {{ userQuery }} and {{ dataSection }} in your custom prompts. These provide the essential context the LLM needs to generate relevant responses. The {{ context }} variable is optional but recommended for conversational applications.
Custom Prompt Example
curl -X POST "https://api.goodmem.ai/v1/memories:retrieve" \
-H "x-api-key: <your-api-key>" \
-H "Content-Type: application/json" \
-d '{
"message": "What are the key features?",
"spaceKeys": [{"spaceId": "f77a8555-0232-4c01-a33e-4f0ca072905e"}],
"postProcessor": {
"name": "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
"config": {
"llm_id": "72eaec11-c698-4262-970b-83aa957f9e02",
"sys_prompt": "You are a technical documentation assistant. Provide clear, structured answers based only on the retrieved data.",
"prompt": "Query: {{ userQuery }}\n\nRetrieved Data:\n{{ dataSection }}\n\nAnswer in bullet points with specific details:"
}
}
}'

Error Codes
The ChatPostProcessor performs comprehensive validation and returns specific error codes:
| Error Code | When It Occurs | Example Scenario | Solution |
|---|---|---|---|
| INVALID_ARGUMENT | Invalid parameter names, values, or ranges | Using llmId instead of llm_id, or temperature outside 0.0-2.0 range | Check parameter names use snake_case and values are within valid ranges |
| NOT_FOUND | Referenced LLM or reranker doesn't exist | Providing a UUID that doesn't match any configured LLM/reranker | Verify the LLM/reranker exists using the management APIs |
| FEATURE_DISABLED | Feature unavailable due to missing configuration | No llm_id provided, so abstract reply generation is disabled | Add required configuration parameters to enable the feature |
| RERANKING_FAILED | Reranker processing error | Network timeout or reranker service failure | Check reranker service status and retry |
| SUMMARIZATION_FAILED | LLM processing error | LLM service failure or token limit exceeded | Check LLM service status and token budget configuration |
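Client-side, these codes suggest different recovery strategies: configuration errors call for a corrected request, while the `*_FAILED` codes are transient and may warrant a retry. A sketch of that split, following the table above (the categorization and function name are illustrative, not a GoodMem API):

```python
# Map ChatPostProcessor error codes to a client-side recovery strategy,
# following the error-code table: transient service failures can be
# retried, configuration errors need a fixed request, and
# FEATURE_DISABLED is purely informational.
RETRYABLE = {"RERANKING_FAILED", "SUMMARIZATION_FAILED"}
CONFIG_ERRORS = {"INVALID_ARGUMENT", "NOT_FOUND"}

def recovery_action(code: str) -> str:
    if code in RETRYABLE:
        return "retry"
    if code in CONFIG_ERRORS:
        return "fix_request"
    if code == "FEATURE_DISABLED":
        return "informational"
    return "unknown"
```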
Performance Considerations
- Reranking: Adds processing time but improves result quality
- LLM Generation: Increases latency but provides valuable summaries
- Token Budget: Higher budgets allow longer summaries but take more time
- Result Limits: Use max_results to control processing time and response size
For high-throughput scenarios, consider using the processor without LLM generation for faster responses, then adding summarization only when needed.
Next Steps
- Memory Retrieval API - Complete API reference
- LLM Management - Managing LLM configurations
- Reranker Management - Configuring rerankers