Filter Expressions Reference
Detailed reference for the GoodMem metadata filter language including syntax, types, operators, and functions.
Filter Expressions Reference
GoodMem filter expressions are compiled to PostgreSQL predicates that operate on the JSON metadata column attached to each memory. This page describes the language precisely so you can author reliable filters and understand the translation rules that run on the server.
The implementation is defined by Filter.g4 (ANTLR grammar) and the SQL emitters in com.goodmem.retrieval.sql.emit. Everything documented here matches the behaviour in those sources.
Expression Model
- A filter expression must evaluate to a boolean. The server rejects expressions that end in a non-boolean value.
- Expressions are case-insensitive for keywords and function names. Identifiers are matched case-insensitively by the parser but produce lower-case function dispatch.
- Whitespace and
--line comments are ignored. - Identifiers (function names) follow
[A-Za-z_][A-Za-z0-9_]*. - String literals use single quotes with doubled single quotes for escaping (
'O''Brien'). - Numeric literals may be integers or decimals with optional scientific notation (for example
42,3.14,6.02e23). - Boolean literals are
TRUEandFALSE;NULLrepresents the absence of a value. - Array literals are JSON-style:
[expr, expr, ...]. The result is stored asjsonb. - Any expression may apply array subscripting like
value[0]. Indexes are zero-based and must evaluate to a non-negative integer; they use PostgreSQLjsonb_array_elementat runtime.
Operator Precedence
From lowest to highest precedence:
| Precedence | Operators | Notes |
|---|---|---|
| 1 | OR | Left-associative |
| 2 | AND | Left-associative |
| 3 | NOT | Unary |
| 4 | Comparisons (=, ==, !=, <>, <, <=, >, >=, [NOT] LIKE, [NOT] ILIKE, IS [NOT] NULL, IN (...), value [NOT] IN array, CONTAINS, OVERLAPS) | Comparison and membership tests |
| 5 | ` | |
| 6 | +, - | Addition and subtraction |
| 7 | *, /, % | Multiplication, division, modulo |
| 8 | Unary +, Unary - | Arithmetic negation or no-op |
| 9 | CAST(... AS type) | Type conversion |
| 10 | Literals, function calls, parenthesised expressions, JSON path extraction | Highest precedence |
Parentheses can always be used to override precedence.
Accessing Metadata
GoodMem stores metadata as JSON. The filter language offers two entry points into that JSON:
| Function | Return | Notes |
|---|---|---|
val('<jsonpath>') | jsonb | Extracts the first match for the supplied PostgreSQL JSONPath. The path must be a string literal. When used inside a CAST, the underlying implementation issues JSON_VALUE with NULL ON EMPTY NULL ON ERROR, giving you typed scalars when the path exists and NULL otherwise. Outside a cast, comparisons operate on jsonb values. |
exists('<jsonpath>') | boolean | Uses jsonb_path_exists to test whether a path produces at least one result. The argument must be a string literal JSONPath. |
JSONPath specifics
- Paths follow PostgreSQL JSONPath syntax (
$.field,$.array[0],$.foo[*].bar). - Array slices, wildcards, negative indexes, and predicates are supported exactly as PostgreSQL does—whatever
jsonb_path_query_firstaccepts is valid here. - Missing paths yield
NULLfromvalandFALSEfromexists.
Array Accessors
After any expression that returns a JSON array you may append [index] to fetch jsonb_array_element(array_expression, index). The index expression is cast to integer. Negative indexes are not supported by PostgreSQL and will produce a runtime error if used.
Array Literals
[ ] produce JSON arrays. Mixed element types are allowed; they are emitted via jsonb_build_array.
Type System and Casting
CAST(expr AS type) converts values using the TypeSpec rules enforced in TypeSpec.java. The following targets are accepted (aliases shown in parentheses):
| Type | Notes |
|---|---|
numeric[(precision[, scale])] (decimal) | Precision optional; scale optional when precision supplied. |
integer (int) | No parameters allowed. |
bigint (int8) | No parameters allowed. |
double precision | Synonyms: double. |
real (float4) | Single-precision float. |
boolean (bool) | |
text | Unlimited-length string. |
varchar(length) (character varying) | Length is mandatory; only one argument permitted. |
date | Calendar date. |
timestamp | Timestamp without time zone. |
timestamptz (timestamp with time zone) | |
uuid | Standard 128-bit UUID. |
inet | Returned as text; useful when metadata stores IP addresses. |
Notes:
- Unsupported type names or illegal parameter counts raise
INVALID_ARGUMENTduring semantic validation. - Casting a
val()call usesJSON_VALUEunder the hood. Invalid conversions (for example, trying to read a string asINTEGER) returnNULLinstead of throwing, mirroringNULL ON ERRORsemantics. - Casting any other expression relies on PostgreSQL’s implicit casts via jOOQ.
Operators
Comparison and Logical Operators
| Form | Result | Details |
|---|---|---|
left = right, left == right | boolean | Uses PostgreSQL equality; automatically coerces JSON operands when one side is a scalar and the other is JSON (for array subscripts). |
left != right, left <> right | boolean | Inequality. |
left < right, <=, >, >= | boolean | Numeric or lexicographic comparison. Operands are cast to numeric/text by PostgreSQL depending on context. |
expr IS NULL, expr IS NOT NULL | boolean | Tests for NULL result. |
expr IN (value, ...), expr NOT IN (value, ...) | boolean | Values inside the list are coerced to the operand’s SQL type before comparison. |
scalar [NOT] IN array_expr | boolean | Treats array_expr as JSON and checks whether it contains the scalar. NOT IN is written NOT (scalar IN array_expr) or scalar NOT IN array_expr. |
left [NOT] LIKE pattern | boolean | Case-sensitive pattern match. % matches any sequence of characters; _ matches a single character. NOT LIKE negates the result. |
left [NOT] ILIKE pattern | boolean | Case-insensitive variant of LIKE. Supports NOT ILIKE. |
array_expr CONTAINS other_array | boolean | True when array_expr @> other_array (superset test after both sides are coerced to JSONB). |
array_expr OVERLAPS other_array | boolean | True when arrays share at least one element (EXISTS join over jsonb_array_elements). |
left AND right | boolean | Short-circuit semantics provided by PostgreSQL evaluation order. |
left OR right | boolean | |
NOT expr | boolean | Unary negation. |
Arithmetic and Concatenation
| Form | Return Type | Notes |
|---|---|---|
left + right, -, *, /, % | numeric | Both operands are treated as numeric; results are emitted as PostgreSQL numeric. Casting may be required if you expect integers. Division follows PostgreSQL semantics (precision depends on operands). |
text_left || text_right | text | String concatenation. Operands are cast to text. |
Temporal Keywords and Functions
| Construct | Return | Arguments | Details |
|---|---|---|---|
CURRENT_DATE | date | none | PostgreSQL current_date. |
CURRENT_TIME | time with time zone | optional precision integer | Emits current_time or current_time(n). |
CURRENT_TIMESTAMP | timestamp with time zone | optional precision integer | No arguments or a single integer precision. |
now() | timestamp with time zone | none | Alias for PostgreSQL now(). |
EXTRACT(field FROM expr) | numeric | field from list below; expr must be timestamp, timestamptz, time, or interval. Only the following field values are accepted: CENTURY, DAY, DECADE, DOW, DOY, EPOCH, HOUR, ISODOW, ISOYEAR, MICROSECONDS, MILLENNIUM, MILLISECOND, MILLISECONDS, MINUTE, MONTH, QUARTER, SECOND, TIMEZONE, TIMEZONE_HOUR, TIMEZONE_MINUTE, WEEK, YEAR. | |
date_trunc('precision', value) | same as input | precision string literal; value timestamp/time expression | Calls PostgreSQL date_trunc. An optional third argument zone string literal is supported for timestamptz values. |
to_timestamp(number) | timestamp with time zone | double precision seconds | Numeric overload uses PostgreSQL epoch conversion. |
to_timestamp(text, format) | timestamp with time zone | Both arguments text | Format string follows PostgreSQL patterns. |
String-Oriented Functions
All string functions coerce their arguments to text before calling the underlying PostgreSQL function.
| Function | Return | Arguments | Behaviour |
|---|---|---|---|
char_length(value) | integer | expression | Standard character length. The optional USING CHARACTERS clause is accepted and treated as characters. |
substring(value FROM start [FOR length]) | text | start/length numeric expressions | Uses PostgreSQL substring. Omitting FROM returns the entire string cast to text. |
position(needle IN haystack) | integer | text expressions | 1-based position, or 0 if not found. |
trim(value) | text | expression | Removes leading and trailing whitespace. |
trim([LEADING|TRAILING|BOTH] [trim_char] FROM value) | text | optional direction and trim character | Mirrors PostgreSQL trim. |
upper(value) | text | expression | Uppercase conversion. |
lower(value) | text | expression | Lowercase conversion. |
octet_length(value) | integer | expression | Bytes of UTF‑8 representation. |
replace(source, from, to) | text | three text expressions | Direct call to replace. |
regexp_replace(source, pattern, replacement[, flags]) | text | pattern and replacement text | PostgreSQL regular expression replace; flags optional. |
string_agg(value, delimiter) | text | both text expressions | Aggregates within the current row context. Typically used after grouping in subqueries that feed filters. |
Numeric and Math Functions
Arguments are coerced to numeric or double precision as noted.
| Function | Return | Arguments | Notes |
|---|---|---|---|
abs(value) | numeric | 1 | Absolute value. |
sign(value) | numeric | 1 | Sign of the number. |
ceil(value) / ceiling(value) | numeric | 1 | Ceiling. |
floor(value) | numeric | 1 | Floor. |
round(value[, scale]) | numeric | 1 or 2 | Optional scale integer. |
trunc(value[, scale]) | numeric | 1 or 2 | Optional scale integer. |
mod(dividend, divisor) | numeric | 2 | Remainder with PostgreSQL mod. |
power(base, exponent) | numeric | 2 | |
sqrt(value) | double precision | 1 | |
exp(value) | double precision | 1 | Natural exponential. |
ln(value) | double precision | 1 | Natural logarithm. |
log10(value) | double precision | 1 | Base-10 logarithm. |
log(value) | double precision | 1 | Natural log when a single argument is provided. |
log(base, value) | numeric | 2 | Explicit base logarithm. |
cbrt(value) | double precision | 1 | Cube root. |
radians(value) | double precision | 1 | Degrees to radians. |
degrees(value) | double precision | 1 | Radians to degrees. |
greatest(value, ...) | numeric | ≥2 | All arguments coerced to numeric. |
least(value, ...) | numeric | ≥2 | All arguments coerced to numeric. |
width_bucket(expr, min, max, count) | integer | 4 | Computes bucket index after coercing operands to numeric. Inputs that are NaN are wrapped with NULLIF(..., 'NaN'::numeric) to avoid PostgreSQL errors. |
isfinite(value) | boolean | 1 | Returns TRUE if the coerced double precision value is not NaN or ±Infinity. |
isnan(value) | boolean | 1 | Returns TRUE when the coerced double precision value is NaN. |
pi() | double precision | 0 | Constant π. |
Array Utility Functions
| Function | Return | Arguments | Notes |
|---|---|---|---|
array_length(value) | integer | expression producing a JSON array | Calls jsonb_array_length. Non-array input raises an error at runtime. |
Practical Tips
-
Always
CASTaval()expression before numeric comparison to avoid implicit string ordering. Example:CAST(val('$.retentionPolicy.archiveAfterMonths') AS INTEGER) >= 0 -
Use
existswhen checking for the presence of a complex structure instead of comparing againstNULL.exists('$.owners[*] ? (@.role == "Process Owner")') -
When composing arrays for membership tests, build them explicitly so the intent is clear:
'priority-handling' IN val('$.tags') AND val('$.operationalContext.priorityLevels') OVERLAPS ['P1', 'P0'] -
string_agg,greatest, and similar helpers are executed in the same SELECT statement as the filter. They are intended for views or subqueries that pre-compute aggregates before applying the filter. -
Comparisons with
NULLmust useIS [NOT] NULL; SQL operators like=do not treatNULLas a value. -
JSON arrays returned from
val()arejsonb. If you need scalar comparisons against array elements, use array subscripts (val('$.tags')[0]) orCONTAINS/OVERLAPSoperators.
This reference mirrors the exact capabilities of the current GoodMem filter compiler. If you encounter a construct that should work but does not, confirm it is covered here before filing a bug—it likely requires a server-side enhancement.