Aggregate functions are the most powerful feature of the Watch Script DSL. They allow you to move beyond single-transaction checks and detect behavioral patterns that unfold across multiple transactions over time. This is what separates FinWatch from basic rule engines.
Why Aggregates Matter
Consider a common fraud pattern: structuring (also called “smurfing”). A bad actor deliberately breaks a large transfer into multiple smaller ones to stay under reporting thresholds. Each individual transaction looks perfectly normal — it’s only the aggregate behavior that’s suspicious.
A rule like when amount > 10000 would never catch this. The whole point of structuring is that each transaction is under the threshold.
But with aggregates, you can write:
rule StructuringDetection {
description "Detect multiple deposits that may be structured to evade reporting thresholds."
when amount < 10000
and count(when source == $current.source, "PT24H") >= 3
and sum(when source == $current.source, "PT24H") > 25000
then review
score 0.8
reason "Possible structuring: multiple sub-threshold deposits exceeding $25,000 in 24 hours"
}
This rule says: "If the current transaction is under $10,000, but the same source account has made at least 3 transactions in the last 24 hours that total more than $25,000, flag it."
This is impossible with single-transaction checks. With aggregates, it’s a few lines of clear, declarative logic.
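To make the structuring logic concrete, here is a minimal Python sketch of the same two aggregate conditions evaluated over an in-memory transaction list. The helper names (`in_window`, `is_structuring`) are illustrative, not part of the FinWatch engine:

```python
from datetime import datetime, timedelta

def in_window(txns, source, now, window):
    """Return historical transactions from `source` inside the lookback window."""
    cutoff = now - window
    return [t for t in txns if t["source"] == source and t["timestamp"] >= cutoff]

def is_structuring(current, history, now):
    """Mirrors StructuringDetection: sub-threshold amount, at least 3
    same-source transactions in 24h, totaling more than 25,000."""
    recent = in_window(history, current["source"], now, timedelta(hours=24))
    return (current["amount"] < 10_000
            and len(recent) >= 3
            and sum(t["amount"] for t in recent) > 25_000)
```

Three $9,000 deposits in a day each pass a `amount > 10000` check individually, but together they trip the count and sum thresholds.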
The Anatomy of an Aggregate Call
Let’s break down a complete aggregate expression token by token:
count(when destination == $current.destination, "PT24H") > 10
│     │    │           │  │                     │        │ │
│     │    │           │  │                     │        │ └─ Threshold value
│     │    │           │  │                     │        └─── Comparison operator
│     │    │           │  │                     └──────────── Time window (ISO 8601)
│     │    │           │  └────────────────────────────────── Dynamic reference to current txn
│     │    │           └───────────────────────────────────── Comparison operator (inside filter)
│     │    └───────────────────────────────────────────────── Field to filter on
│     └────────────────────────────────────────────────────── Filter keyword
└──────────────────────────────────────────────────────────── Aggregate function name
The structure is:
<function>(<filter_condition>, "<time_window>") <operator> <threshold>
Each component:
- Function name (count, sum, avg, max, min) — What metric to compute.
- Filter condition (when destination == $current.destination) — Which historical transactions to include. Uses the same syntax as the rule’s when clause, but applied to historical data.
- Time window ("PT24H") — How far back in time to look.
- Comparison (> 10) — The threshold to compare the computed metric against.
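The decomposition above can be sketched as a small parser. This assumes a simplified subset of the grammar (single equality filter, numeric threshold); the real Watch Script parser may accept more:

```python
import re

# Assumed, simplified grammar:
#   <fn>(when <field> == <value>, "<window>") <op> <threshold>
AGG_RE = re.compile(
    r'(?P<fn>count|sum|avg|max|min)\s*'
    r'\(\s*when\s+(?P<field>\w+)\s*==\s*(?P<value>\S+)\s*,\s*'
    r'"(?P<window>[^"]+)"\s*\)\s*'
    r'(?P<op>>=|<=|>|<|==)\s*(?P<threshold>[\d.]+)'
)

def parse_aggregate(expr):
    """Split one aggregate expression into its named components."""
    m = AGG_RE.match(expr.strip())
    if not m:
        raise ValueError(f"not an aggregate expression: {expr!r}")
    return m.groupdict()
```

Running it on the example expression yields each labeled token as a dictionary entry.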
The when Filter Inside Aggregates
The filter condition inside an aggregate function determines which historical transactions are included in the computation. Without a filter, the aggregate would count or sum all transactions in the time window — which is rarely useful.
Using $current for Self-Referencing Filters
The $current variable is the key to making aggregates useful. It refers to the transaction currently being evaluated, allowing you to ask questions like “how many transactions from this same source” or “what’s the total amount to this same destination.”
// Count transactions from the SAME source account
count(when source == $current.source, "PT24H")
// Sum amounts going to the SAME destination
sum(when destination == $current.destination, "PT24H")
At runtime, $current.source is resolved to the actual value of the source field in the current transaction. If the current transaction has source: "acct_alice", then the filter becomes source == "acct_alice" and DuckDB queries:
SELECT COUNT(*) FROM transactions
WHERE source = 'acct_alice'
AND timestamp >= '2026-04-17T14:30:00' -- now minus 24 hours
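The resolution step can be sketched in a few lines of Python. The helper names are hypothetical, and a real engine would use bound query parameters rather than string interpolation:

```python
from datetime import datetime, timedelta

def resolve_filter(field, value, current_txn):
    """Replace a $current.<field> reference with the concrete value
    from the transaction being evaluated."""
    if isinstance(value, str) and value.startswith("$current."):
        value = current_txn[value[len("$current."):]]
    return field, value

def render_count_query(field, value, now, window):
    # Illustrative string rendering only; real engines should bind parameters
    # to avoid SQL injection via transaction field values.
    cutoff = (now - window).isoformat()
    return (f"SELECT COUNT(*) FROM transactions "
            f"WHERE {field} = '{value}' AND timestamp >= '{cutoff}'")
```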
Filter Field Selection
The filter condition always references a field in the historical transactions table. The most common filter fields are:
| Filter Pattern | What It Answers |
|---|---|
| source == $current.source | "How many transactions has this source account made?" |
| destination == $current.destination | "How many transactions has this destination received?" |
| source == $current.source (in sum) | "What's the total outflow from this account?" |
| destination == $current.destination (in sum) | "What's the total inflow to this account?" |
Time Windows Explained
Every aggregate function requires a time window in ISO 8601 duration format. The time window defines the lookback period — how far into the past the aggregate should scan.
| Format | Duration | Use Case |
|---|---|---|
| "PT30S" | 30 seconds | Rapid-fire card testing detection |
| "PT1M" | 1 minute | Burst detection |
| "PT5M" | 5 minutes | Short-burst velocity checks |
| "PT15M" | 15 minutes | BIN attack detection |
| "PT1H" | 1 hour | Hourly velocity limits |
| "PT24H" | 24 hours | Daily limits and patterns |
| "P1D" | 1 day | Same as PT24H |
| "P7D" | 7 days | Weekly pattern analysis |
| "P30D" | 30 days | Monthly behavioral baselines |
How the Parser Handles Durations
The interpreter uses a minimal ISO 8601 parser that supports:
- PT<n>H — hours
- PT<n>M — minutes
- PT<n>S — seconds
- P<n>D — days (converted to n * 24 hours)
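A minimal parser for this subset might look like the following sketch. It deliberately rejects calendar designators (months, years) that the formats above don't use:

```python
import re
from datetime import timedelta

# Minimal subset: PT<n>H, PT<n>M, PT<n>S, P<n>D (and combinations).
# Calendar designators like P1M (months) or P1Y (years) are unsupported.
DURATION_RE = re.compile(r'^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$')

def parse_duration(s):
    """Parse a supported ISO 8601 duration string into a timedelta."""
    m = DURATION_RE.match(s)
    if not m or not any(m.groups()):
        raise ValueError(f"unsupported ISO 8601 duration: {s!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in m.groups())
    # P<n>D is treated as exactly n * 24 hours.
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)
```

Note the T separator matters: PT1M is one minute, while P1M (one month) is rejected by this grammar.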
Choosing the Right Window Size
The time window you choose has a direct impact on both detection accuracy and performance:
- Too small: You’ll miss patterns that span a longer period. A 1-hour window won’t catch structuring that happens over an entire day.
- Too large: You’ll get false positives from legitimate historical activity, and the query will be slower because DuckDB has to scan more data.
Guidelines:
| Pattern | Recommended Window |
|---|---|
| Card testing (micro-transactions) | "PT5M" to "PT15M" |
| Rapid velocity (burst spending) | "PT1H" |
| Daily structuring | "PT24H" |
| Weekly behavioral anomaly | "P7D" |
| Monthly baseline deviation | "P30D" |
count() — Frequency Detection
count() returns the number of transactions that match the filter condition within the time window.
Signature
count(<filter_condition>, "<time_window>") <operator> <threshold>
Return Value
An integer — the number of matching historical transactions.
Use Cases
- Detecting an unusually high number of transactions (velocity abuse)
- Identifying rapid-fire micro-transactions (card testing)
- Counting failed attempts before a successful transaction
Example: High Frequency to Same Destination
rule HighFrequencyDestination {
description "Unusually frequent payments to the same destination may require scrutiny."
when count(when destination == $current.destination, "PT24H") > 10
and amount > 100
then review
score 0.5
reason "High frequency of transactions to same destination in 24 hours"
}
What this detects: If more than 10 transactions have been sent to the same destination in the last 24 hours, and the current transaction is over $100, flag it for review. This catches automated fraud where a compromised account is used to funnel money to a single destination.
The SQL generated internally:
SELECT COUNT(*) FROM transactions
WHERE destination = 'acct_bob_456'
AND timestamp >= '2026-04-17T14:30:00'
Example: Rapid Small Burst
rule RapidSmallBurst {
description "Detects rapid small transactions that may indicate card testing."
when count(when source == $current.source, "PT5M") > 5
and amount < 10
then block
score 0.9
reason "Rapid burst of micro-transactions detected — possible card testing"
}
What this detects: More than 5 transactions from the same source in 5 minutes, each under $10. This is the classic signature of card testing — where a fraudster uses stolen card numbers to make small test purchases before attempting larger ones.
sum() — Volume Detection
sum() returns the total amount of all transactions that match the filter condition within the time window.
Signature
sum(<filter_condition>, "<time_window>") <operator> <threshold>
Return Value
A float — the sum of the amount field for all matching transactions.
Use Cases
- Detecting structuring (aggregate amount exceeds reporting threshold)
- Monitoring account draining (total outflow exceeds a limit)
- Enforcing daily/weekly transaction volume limits
Example: Source Account High Outflow
rule SourceHighOutflow {
description "Source account has high outflow volume in 24h."
when sum(when source == $current.source, "PT24H") > 5000
then review
score 0.5
reason "High cumulative outflow from source in 24 hours"
}
What this detects: The total amount sent from this source account in the last 24 hours exceeds $5,000. This catches scenarios where an account is being drained through many small transactions.
Example: Destination High Inflow
rule DestinationHighInflow {
description "Destination account receiving unusually high inflow volume."
when sum(when destination == $current.destination, "PT24H") > 50000
then review
score 0.6
reason "Unusually high inflow to destination in 24 hours"
}
What this detects: A single destination account is receiving more than $50,000 in the last 24 hours. This catches money mule accounts — intermediary accounts used to receive and re-distribute stolen funds.
avg() — Behavioral Deviation
avg() returns the average amount of all transactions that match the filter condition within the time window.
Signature
avg(<filter_condition>, "<time_window>") <operator> <threshold>
Return Value
A float — the mean of the amount field for all matching transactions.
Use Cases
- Detecting transactions that are significantly larger than a user’s typical behavior
- Identifying sudden changes in spending patterns
- Baseline comparison for anomaly detection
Example: Transaction Far Exceeds Average
rule UnusualAmountForSource {
description "Transaction amount significantly exceeds the source's 30-day average."
when avg(when source == $current.source, "P30D") < 500
and amount > 5000
then review
score 0.6
reason "Transaction amount far exceeds source's 30-day average spending pattern"
}
What this detects: If the source account’s average transaction over the last 30 days is under $500, but the current transaction is over $5,000, flag it. This is a behavioral anomaly — the account is doing something dramatically different from its established pattern, which could indicate account takeover.
Note: This rule has two conditions connected by and. The avg() aggregate computes the historical average, and the amount > 5000 check ensures we only flag genuinely large transactions (not just any transaction from a low-activity account).
max() and min() — Detecting Extremes
max() returns the largest amount and min() returns the smallest amount among matching transactions.
Signatures
max(<filter_condition>, "<time_window>") <operator> <threshold>
min(<filter_condition>, "<time_window>") <operator> <threshold>
Use Cases for max()
- Detecting if the current transaction is a new peak for this account
- Identifying escalating transaction amounts (a common fraud progression pattern)
Use Cases for min()
- Detecting card testing alongside normal transactions (presence of micro-transactions)
- Identifying accounts that have recently started making unusually small transactions
Example: Escalating Transaction Amounts
rule EscalatingAmounts {
description "Transaction exceeds the historical maximum for this source."
when amount > 10000
and max(when source == $current.source, "P30D") < 5000
then review
score 0.7
reason "Transaction amount exceeds historical maximum for this source account"
}
What this detects: The current transaction ($10,000+) is more than double the largest transaction this source account has made in the last 30 days (under $5,000). This pattern is common in account takeover — the fraudster quickly escalates to maximize theft.
Combining Aggregates with Simple Conditions
The most effective rules combine cheap simple checks with expensive aggregate checks. This follows the “gate and probe” pattern:
- Gate: A cheap simple condition that filters out the vast majority of transactions.
- Probe: An expensive aggregate that deeply analyzes the remaining transactions.
The Gate-and-Probe Pattern
rule HighFrequencyDestination {
description "Unusually frequent payments to the same destination may require scrutiny."
// GATE: Only consider transactions over $100.
// This eliminates micro-transactions and reduces aggregate query load.
when amount > 100
// PROBE: Now run the expensive aggregate check.
and count(when destination == $current.destination, "PT24H") > 10
then review
score 0.5
reason "High frequency of transactions to same destination in 24 hours"
}
Because of short-circuit evaluation, if amount <= 100, the count() function is never called. In a system processing thousands of transactions per second, this can eliminate 80-90% of aggregate queries.
Combining Multiple Aggregates
You can use multiple aggregate functions in the same rule:
rule StructuringDetection {
description "Detect potential structuring activity."
when amount < 10000
and count(when source == $current.source, "PT24H") >= 3
and sum(when source == $current.source, "PT24H") > 25000
then review
score 0.8
reason "Possible structuring: multiple sub-threshold deposits totaling over $25,000"
}
This rule uses both count() and sum() to detect structuring: at least 3 transactions (frequency), totaling over $25,000 (volume), each under $10,000 (sub-threshold).
Aggregate functions are the primary performance bottleneck in FinWatch because they execute SQL queries against DuckDB. Understanding how they work internally helps you write faster rules.
How Aggregates Translate to SQL
Each aggregate function is compiled to a SQL query. The engine uses a mapping:
| Function | SQL Expression |
|---|---|
| count() | COUNT(*) |
| sum() | COALESCE(SUM(amount), 0) |
| avg() | COALESCE(AVG(amount), 0) |
| max() | COALESCE(MAX(amount), 0) |
| min() | COALESCE(MIN(amount), 0) |
The full query template:
WITH filtered_txns AS (
SELECT * FROM transactions
WHERE <filter_field> = <filter_value>
AND timestamp >= <now - time_window>
)
SELECT CAST(
CASE WHEN COUNT(*) = 0 THEN 0
ELSE <metric_expression>
END AS DOUBLE
) as metric_result
FROM filtered_txns
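A sketch of rendering this template for any of the five metrics, using the mapping table above (the function name and string interpolation are illustrative; a real engine would bind parameters):

```python
# Mapping from DSL function names to SQL metric expressions.
METRIC_SQL = {
    "count": "COUNT(*)",
    "sum":   "COALESCE(SUM(amount), 0)",
    "avg":   "COALESCE(AVG(amount), 0)",
    "max":   "COALESCE(MAX(amount), 0)",
    "min":   "COALESCE(MIN(amount), 0)",
}

def build_aggregate_query(fn, filter_field, filter_value, cutoff_iso):
    """Render the CTE template for one aggregate (string rendering is
    for illustration only; use bound parameters in production)."""
    metric = METRIC_SQL[fn]
    return (
        "WITH filtered_txns AS (\n"
        "  SELECT * FROM transactions\n"
        f"  WHERE {filter_field} = '{filter_value}'\n"
        f"    AND timestamp >= '{cutoff_iso}'\n"
        ")\n"
        "SELECT CAST(\n"
        "  CASE WHEN COUNT(*) = 0 THEN 0\n"
        f"  ELSE {metric}\n"
        "  END AS DOUBLE\n"
        ") as metric_result\n"
        "FROM filtered_txns"
    )
```

The CASE WHEN COUNT(*) = 0 guard matters for avg(), max(), and min(): without it, an empty window would yield NULL instead of 0.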
Batch Aggregate Context
FinWatch optimizes aggregate evaluation by pre-computing all aggregate values needed for a transaction in a single batch before rule evaluation begins. This is the BuildAggContext() function. It:
- Scans all active rules for aggregate conditions.
- Groups them by unique (metric, time_window, filter_field, filter_value) tuples.
- Executes one SQL query per unique tuple.
- Caches the results in a map[string]float64.
During rule evaluation, aggregate lookups are simple map reads — not SQL queries. This means if 10 rules all check count(when source == $current.source, "PT24H"), the SQL query only runs once.
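BuildAggContext() itself is Go, but the deduplication idea can be sketched in a few lines of Python (the key format and function names here are assumptions for illustration):

```python
def build_agg_context(agg_specs, run_query):
    """Deduplicate aggregate specs and run one query per unique tuple.

    agg_specs: iterable of (metric, time_window, filter_field, filter_value)
    run_query: callable that executes the SQL and returns a float
    """
    cache = {}
    for spec in agg_specs:
        key = "|".join(spec)
        if key not in cache:  # only the first occurrence hits the database
            cache[key] = run_query(*spec)
    return cache
```

During evaluation, each rule's aggregate lookup becomes a plain dictionary read against the precomputed context.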
Optimization Tips
- Use the smallest time window that works. "PT1H" scans far less data than "P30D".
- Gate aggregates with simple checks. Put amount > X before count(...) in your and chain.
- Avoid redundant aggregates. If two rules check the same metric with the same filter and time window, the engine only queries once — but it’s still good practice to be aware of this.
- Monitor query times. FinWatch logs aggregate query execution times. Watch for queries that take more than 100ms — they may indicate that your time windows are too large or your transaction table is growing beyond DuckDB’s comfortable working set.
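If you instrument your own deployment, a timing wrapper along these lines is one way to surface slow aggregates. The threshold and logging hook are assumptions, not FinWatch APIs:

```python
import time

SLOW_QUERY_MS = 100  # assumed threshold from the guideline above

def timed_query(run_query, *args, log=print):
    """Run an aggregate query and log a warning if it exceeds the threshold."""
    start = time.perf_counter()
    result = run_query(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    if elapsed_ms > SLOW_QUERY_MS:
        log(f"slow aggregate query ({elapsed_ms:.1f} ms): args={args}")
    return result
```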
Next Steps