GoodMem
Reference

Filter Expressions Reference

Detailed reference for the GoodMem metadata filter language including syntax, types, operators, and functions.

Filter Expressions Reference

GoodMem filter expressions are compiled to PostgreSQL predicates that operate on the JSON metadata column attached to each memory. This page describes the language precisely so you can author reliable filters and understand the translation rules that run on the server.

The implementation is defined by Filter.g4 (ANTLR grammar) and the SQL emitters in com.goodmem.retrieval.sql.emit. Everything documented here matches the behaviour in those sources.

Expression Model

  • A filter expression must evaluate to a boolean. The server rejects expressions that end in a non-boolean value.
  • Expressions are case-insensitive for keywords and function names. Identifiers are matched case-insensitively by the parser but produce lower-case function dispatch.
  • Whitespace and -- line comments are ignored.
  • Identifiers (function names) follow [A-Za-z_][A-Za-z0-9_]*.
  • String literals use single quotes with doubled single quotes for escaping ('O''Brien').
  • Numeric literals may be integers or decimals with optional scientific notation (for example 42, 3.14, 6.02e23).
  • Boolean literals are TRUE and FALSE; NULL represents the absence of a value.
  • Array literals are JSON-style: [expr, expr, ...]. The result is stored as jsonb.
  • Any expression may apply array subscripting like value[0]. Indexes are zero-based and must evaluate to a non-negative integer; they use PostgreSQL jsonb_array_element at runtime.

Operator Precedence

From lowest to highest precedence:

PrecedenceOperatorsNotes
1ORLeft-associative
2ANDLeft-associative
3NOTUnary
4Comparisons (=, ==, !=, <>, <, <=, >, >=, [NOT] LIKE, [NOT] ILIKE, IS [NOT] NULL, IN (...), value [NOT] IN array, CONTAINS, OVERLAPS)Comparison and membership tests
5`
6+, -Addition and subtraction
7*, /, %Multiplication, division, modulo
8Unary +, Unary -Arithmetic negation or no-op
9CAST(... AS type)Type conversion
10Literals, function calls, parenthesised expressions, JSON path extractionHighest precedence

Parentheses can always be used to override precedence.

Accessing Metadata

GoodMem stores metadata as JSON. The filter language offers two entry points into that JSON:

FunctionReturnNotes
val('<jsonpath>')jsonbExtracts the first match for the supplied PostgreSQL JSONPath. The path must be a string literal. When used inside a CAST, the underlying implementation issues JSON_VALUE with NULL ON EMPTY NULL ON ERROR, giving you typed scalars when the path exists and NULL otherwise. Outside a cast, comparisons operate on jsonb values.
exists('<jsonpath>')booleanUses jsonb_path_exists to test whether a path produces at least one result. The argument must be a string literal JSONPath.

JSONPath specifics

  • Paths follow PostgreSQL JSONPath syntax ($.field, $.array[0], $.foo[*].bar).
  • Array slices, wildcards, negative indexes, and predicates are supported exactly as PostgreSQL does—whatever jsonb_path_query_first accepts is valid here.
  • Missing paths yield NULL from val and FALSE from exists.

Array Accessors

After any expression that returns a JSON array you may append [index] to fetch jsonb_array_element(array_expression, index). The index expression is cast to integer. Negative indexes are not supported by PostgreSQL and will produce a runtime error if used.

Array Literals

[ ] produce JSON arrays. Mixed element types are allowed; they are emitted via jsonb_build_array.

Type System and Casting

CAST(expr AS type) converts values using the TypeSpec rules enforced in TypeSpec.java. The following targets are accepted (aliases shown in parentheses):

TypeNotes
numeric[(precision[, scale])] (decimal)Precision optional; scale optional when precision supplied.
integer (int)No parameters allowed.
bigint (int8)No parameters allowed.
double precisionSynonyms: double.
real (float4)Single-precision float.
boolean (bool)
textUnlimited-length string.
varchar(length) (character varying)Length is mandatory; only one argument permitted.
dateCalendar date.
timestampTimestamp without time zone.
timestamptz (timestamp with time zone)
uuidStandard 128-bit UUID.
inetReturned as text; useful when metadata stores IP addresses.

Notes:

  • Unsupported type names or illegal parameter counts raise INVALID_ARGUMENT during semantic validation.
  • Casting a val() call uses JSON_VALUE under the hood. Invalid conversions (for example, trying to read a string as INTEGER) return NULL instead of throwing, mirroring NULL ON ERROR semantics.
  • Casting any other expression relies on PostgreSQL’s implicit casts via jOOQ.

Operators

Comparison and Logical Operators

FormResultDetails
left = right, left == rightbooleanUses PostgreSQL equality; automatically coerces JSON operands when one side is a scalar and the other is JSON (for array subscripts).
left != right, left <> rightbooleanInequality.
left < right, <=, >, >=booleanNumeric or lexicographic comparison. Operands are cast to numeric/text by PostgreSQL depending on context.
expr IS NULL, expr IS NOT NULLbooleanTests for NULL result.
expr IN (value, ...), expr NOT IN (value, ...)booleanValues inside the list are coerced to the operand’s SQL type before comparison.
scalar [NOT] IN array_exprbooleanTreats array_expr as JSON and checks whether it contains the scalar. NOT IN is written NOT (scalar IN array_expr) or scalar NOT IN array_expr.
left [NOT] LIKE patternbooleanCase-sensitive pattern match. % matches any sequence of characters; _ matches a single character. NOT LIKE negates the result.
left [NOT] ILIKE patternbooleanCase-insensitive variant of LIKE. Supports NOT ILIKE.
array_expr CONTAINS other_arraybooleanTrue when array_expr @> other_array (superset test after both sides are coerced to JSONB).
array_expr OVERLAPS other_arraybooleanTrue when arrays share at least one element (EXISTS join over jsonb_array_elements).
left AND rightbooleanShort-circuit semantics provided by PostgreSQL evaluation order.
left OR rightboolean
NOT exprbooleanUnary negation.

Arithmetic and Concatenation

FormReturn TypeNotes
left + right, -, *, /, %numericBoth operands are treated as numeric; results are emitted as PostgreSQL numeric. Casting may be required if you expect integers. Division follows PostgreSQL semantics (precision depends on operands).
text_left || text_righttextString concatenation. Operands are cast to text.

Temporal Keywords and Functions

ConstructReturnArgumentsDetails
CURRENT_DATEdatenonePostgreSQL current_date.
CURRENT_TIMEtime with time zoneoptional precision integerEmits current_time or current_time(n).
CURRENT_TIMESTAMPtimestamp with time zoneoptional precision integerNo arguments or a single integer precision.
now()timestamp with time zonenoneAlias for PostgreSQL now().
EXTRACT(field FROM expr)numericfield from list below; expr must be timestamp, timestamptz, time, or interval. Only the following field values are accepted: CENTURY, DAY, DECADE, DOW, DOY, EPOCH, HOUR, ISODOW, ISOYEAR, MICROSECONDS, MILLENNIUM, MILLISECOND, MILLISECONDS, MINUTE, MONTH, QUARTER, SECOND, TIMEZONE, TIMEZONE_HOUR, TIMEZONE_MINUTE, WEEK, YEAR.
date_trunc('precision', value)same as inputprecision string literal; value timestamp/time expressionCalls PostgreSQL date_trunc. An optional third argument zone string literal is supported for timestamptz values.
to_timestamp(number)timestamp with time zonedouble precision secondsNumeric overload uses PostgreSQL epoch conversion.
to_timestamp(text, format)timestamp with time zoneBoth arguments textFormat string follows PostgreSQL patterns.

String-Oriented Functions

All string functions coerce their arguments to text before calling the underlying PostgreSQL function.

FunctionReturnArgumentsBehaviour
char_length(value)integerexpressionStandard character length. The optional USING CHARACTERS clause is accepted and treated as characters.
substring(value FROM start [FOR length])textstart/length numeric expressionsUses PostgreSQL substring. Omitting FROM returns the entire string cast to text.
position(needle IN haystack)integertext expressions1-based position, or 0 if not found.
trim(value)textexpressionRemoves leading and trailing whitespace.
trim([LEADING|TRAILING|BOTH] [trim_char] FROM value)textoptional direction and trim characterMirrors PostgreSQL trim.
upper(value)textexpressionUppercase conversion.
lower(value)textexpressionLowercase conversion.
octet_length(value)integerexpressionBytes of UTF‑8 representation.
replace(source, from, to)textthree text expressionsDirect call to replace.
regexp_replace(source, pattern, replacement[, flags])textpattern and replacement textPostgreSQL regular expression replace; flags optional.
string_agg(value, delimiter)textboth text expressionsAggregates within the current row context. Typically used after grouping in subqueries that feed filters.

Numeric and Math Functions

Arguments are coerced to numeric or double precision as noted.

FunctionReturnArgumentsNotes
abs(value)numeric1Absolute value.
sign(value)numeric1Sign of the number.
ceil(value) / ceiling(value)numeric1Ceiling.
floor(value)numeric1Floor.
round(value[, scale])numeric1 or 2Optional scale integer.
trunc(value[, scale])numeric1 or 2Optional scale integer.
mod(dividend, divisor)numeric2Remainder with PostgreSQL mod.
power(base, exponent)numeric2
sqrt(value)double precision1
exp(value)double precision1Natural exponential.
ln(value)double precision1Natural logarithm.
log10(value)double precision1Base-10 logarithm.
log(value)double precision1Natural log when a single argument is provided.
log(base, value)numeric2Explicit base logarithm.
cbrt(value)double precision1Cube root.
radians(value)double precision1Degrees to radians.
degrees(value)double precision1Radians to degrees.
greatest(value, ...)numeric≥2All arguments coerced to numeric.
least(value, ...)numeric≥2All arguments coerced to numeric.
width_bucket(expr, min, max, count)integer4Computes bucket index after coercing operands to numeric. Inputs that are NaN are wrapped with NULLIF(..., 'NaN'::numeric) to avoid PostgreSQL errors.
isfinite(value)boolean1Returns TRUE if the coerced double precision value is not NaN or ±Infinity.
isnan(value)boolean1Returns TRUE when the coerced double precision value is NaN.
pi()double precision0Constant π.

Array Utility Functions

FunctionReturnArgumentsNotes
array_length(value)integerexpression producing a JSON arrayCalls jsonb_array_length. Non-array input raises an error at runtime.

Practical Tips

  • Always CAST a val() expression before numeric comparison to avoid implicit string ordering. Example:

    CAST(val('$.retentionPolicy.archiveAfterMonths') AS INTEGER) >= 0
  • Use exists when checking for the presence of a complex structure instead of comparing against NULL.

    exists('$.owners[*] ? (@.role == "Process Owner")')
  • When composing arrays for membership tests, build them explicitly so the intent is clear:

    'priority-handling' IN val('$.tags')
    AND val('$.operationalContext.priorityLevels') OVERLAPS ['P1', 'P0']
  • string_agg, greatest, and similar helpers are executed in the same SELECT statement as the filter. They are intended for views or subqueries that pre-compute aggregates before applying the filter.

  • Comparisons with NULL must use IS [NOT] NULL; SQL operators like = do not treat NULL as a value.

  • JSON arrays returned from val() are jsonb. If you need scalar comparisons against array elements, use array subscripts (val('$.tags')[0]) or CONTAINS/OVERLAPS operators.

This reference mirrors the exact capabilities of the current GoodMem filter compiler. If you encounter a construct that should work but does not, confirm it is covered here before filing a bug—it likely requires a server-side enhancement.