

This guide gives you a deep mental model of how FinWatch works internally. Understanding the architecture will help you make informed decisions about configuration, rule design, and production deployment.

High-Level Architecture

FinWatch is a single binary that embeds all of its dependencies. There is no external database to manage, no message queue to configure, and no separate rule engine to deploy. Everything runs in one process.

Why DuckDB?

FinWatch chose DuckDB as its embedded analytical database for several compelling reasons:
  1. Columnar Storage: DuckDB stores data in a columnar format, which is significantly more efficient for analytical queries (aggregations, filtering, scanning) than row-based databases like SQLite or PostgreSQL. When a rule asks “sum all amounts where source equals X in the last 24 hours,” DuckDB only reads the amount, source, and timestamp columns — not the entire row.
  2. Vectorized Execution: DuckDB processes data in batches (vectors) rather than row-by-row. This means aggregate functions like COUNT, SUM, and AVG execute at near-native speed, leveraging modern CPU architectures (SIMD instructions, cache-friendly access patterns).
  3. Zero External Dependencies: DuckDB is an in-process database. There is no separate server to install, configure, or manage. It compiles into the FinWatch binary and runs inside the same process. This dramatically simplifies deployment — especially for an embeddable product that runs on the customer’s server.
  4. SQL Compatibility: DuckDB supports a rich SQL dialect, which means FinWatch can translate aggregate functions from the DSL into standard SQL queries. This makes the interpreter straightforward to implement and debug.
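To make point 4 concrete, a DSL aggregate such as "sum all amounts where source equals X in the last 24 hours" can be lowered to a plain DuckDB SQL string. The sketch below is illustrative only — the function name, table name, and column names are assumptions, not FinWatch's actual translator or schema:

```go
package main

import "fmt"

// buildAggregateSQL sketches how a DSL aggregate clause could be lowered to
// DuckDB SQL. The transactions table and its columns are assumed for
// illustration; this is not FinWatch's real code.
func buildAggregateSQL(fn, column, source string, windowHours int) string {
	return fmt.Sprintf(
		"SELECT %s(%s) FROM transactions "+
			"WHERE source = '%s' AND created_at >= now() - INTERVAL %d HOUR",
		fn, column, source, windowHours)
}

func main() {
	fmt.Println(buildAggregateSQL("SUM", "amount", "acct_123", 24))
}
```

Because the query touches only the amount, source, and created_at columns, DuckDB's columnar layout means the rest of each row is never read.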

Trade-offs

Every architectural choice has trade-offs. DuckDB’s are:
  • Single-Writer Concurrency: DuckDB allows multiple concurrent reads but only a single writer at a time. FinWatch handles this with a mutex lock (dbMutex) to serialize write operations. In practice, this is not a bottleneck because transaction ingestion is I/O-bound, not CPU-bound.
  • Local Storage: Data lives on the local filesystem as .db files in the finwatch_agent/ directory. This means the data is tied to the server. If the server is lost, the local data is lost. However, the source of truth for historical data is the Blnk PostgreSQL database (synced via the watermark pattern), and the source of truth for rules is the Git repository. FinWatch can be fully reconstructed from these external sources.
  • Memory Usage: DuckDB’s performance comes from keeping data in memory. As your transaction volume grows, so does DuckDB’s memory footprint. FinWatch provides a configurable memory_limit (default: 2GiB) and an auto-scaling feature to manage this. See the Production Deployment Guide for details.
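The single-writer constraint can be satisfied with a plain mutex around the write path, as the first bullet describes. A minimal sketch of the pattern (the store type and ingest method are illustrative stand-ins, not FinWatch's actual code):

```go
package main

import (
	"fmt"
	"sync"
)

// store sketches the single-writer pattern: reads may run concurrently, but
// every write acquires dbMutex first, matching DuckDB's
// multiple-reader / single-writer model.
type store struct {
	dbMutex sync.Mutex
	rows    int // stands in for the embedded database
}

// ingest serializes writes; only one goroutine touches the database at a time.
func (s *store) ingest() {
	s.dbMutex.Lock()
	defer s.dbMutex.Unlock()
	s.rows++ // the real code would INSERT into DuckDB here
}

func main() {
	s := &store{}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.ingest()
		}()
	}
	wg.Wait()
	fmt.Println(s.rows) // all 100 writes applied, none lost
}
```

Because ingestion is I/O-bound, serializing writes this way costs little in practice.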

Database Files

FinWatch creates two DuckDB databases:
  • finwatch.db (finwatch_agent/blnk.db): Stores the transactions table — all ingested transaction data.
  • instructions.db (finwatch_agent/instructions.db): Stores compiled rules as “instructions” — the JSON representation of parsed .ws files.
A temporary directory (finwatch_agent/duckdb_temp) is also created for DuckDB’s spill-to-disk operations when queries exceed the configured memory limit.

Connection Configuration

DuckDB is initialized with the following pragmas:
SET access_mode = 'READ_WRITE';
SET threads = 1;
SET memory_limit = '2GiB';
SET checkpoint_threshold = '64MiB';
  • threads = 1: Limits DuckDB to a single thread. This simplifies concurrency management and is sufficient for the single-writer model.
  • memory_limit = '2GiB': The default upper bound on memory. Configurable via the FINWATCH_MEMORY_LIMIT environment variable.
  • checkpoint_threshold = '64MiB': Controls how frequently DuckDB writes its in-memory data to disk. A lower value means more frequent writes (safer but slower); a higher value means less frequent writes (faster but more data at risk during a crash).

Transaction Lifecycle

In One Line

Ingest → Store → Evaluate → Decide → Alert
Ingestion: A transaction enters FinWatch through an API (either directly or via webhook).
Storage: The transaction is stored and made available for analysis.
Evaluation Trigger: The system asynchronously picks up the transaction and prepares everything needed to assess it:
  • Loads active risk rules
  • Prepares any required historical or aggregated context
Rule Execution: Each rule is evaluated against the transaction:
  • Checks transaction attributes (e.g., amount, source)
  • Uses historical patterns (e.g., frequency, past behaviour)
  • Applies logic and time-based conditions
If a rule matches, it produces a risk signal.
Risk Decision: All risk signals are combined into a single outcome:
  • A risk score is computed
  • A verdict is assigned (e.g., allow, alert, review, block)
  • A risk level is determined (very low → high)
  • A reason is generated
Alerting: If the transaction is risky:
  • An anomaly is sent in real-time to the monitoring system
  • Includes key details (transaction info, risk score, reason, verdict)
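The Risk Decision step can be pictured as a small consolidation function. Everything below — the signal shape, the weights, and the score thresholds — is an invented illustration, not FinWatch's actual scoring:

```go
package main

import "fmt"

// signal is one rule match; weight is a hypothetical per-rule severity.
type signal struct {
	rule   string
	weight int
}

// consolidate combines all matched signals into a single score and verdict.
// The thresholds here are assumptions made for illustration only.
func consolidate(signals []signal) (score int, verdict string) {
	for _, s := range signals {
		score += s.weight
	}
	switch {
	case score == 0:
		verdict = "allow"
	case score < 40:
		verdict = "alert"
	case score < 70:
		verdict = "review"
	default:
		verdict = "block"
	}
	return score, verdict
}

func main() {
	score, verdict := consolidate([]signal{
		{rule: "high_amount", weight: 30},
		{rule: "new_source", weight: 15},
	})
	fmt.Println(score, verdict) // 45 review
}
```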

Rule Compilation Pipeline

When you create or modify a .ws file, FinWatch detects the change and compiles the rule through a multi-stage pipeline:

Stage 1: Lexing. The Lexer reads the raw .ws text character by character and produces a stream of Token objects. Each token represents a fundamental language element: a keyword (rule, when, then), an operator (==, >), a literal (10000, "USD"), or a delimiter ({, }).

Stage 2: Parsing. The Parser consumes the token stream and builds an Abstract Syntax Tree (AST). The AST is a hierarchical representation of the rule’s structure. At the top is a RuleStatement containing a name, a description, a when expression (which can be a nested tree of logical and comparison expressions), and a then action expression.

Stage 3: AST to JSON. The astToRule() function converts the AST into a Rule struct — a flat, JSON-serializable representation that the interpreter can evaluate efficiently. Logical expressions are flattened into a list of conditions. The JSON rule is stored in the instructions database.

Stage 4: Interpretation. At evaluation time, the interpreter reads the JSON rule and evaluates each condition against the transaction data. This separation of parsing (compile-time) and evaluation (runtime) means that rules are only parsed once, even if they are evaluated millions of times.

Why this pipeline exists: The pipeline separates concerns. The DSL provides a human-friendly authoring experience. The JSON intermediate format provides a machine-friendly evaluation target. This means you can write rules in the expressive .ws syntax, while the engine evaluates them in a format optimized for speed.
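To make Stages 3 and 4 concrete, here is a minimal sketch of evaluating flattened conditions against a transaction. The condition field names, operators, and the map-based transaction model are assumptions for illustration — not FinWatch's actual Rule struct:

```go
package main

import "fmt"

// condition is a hypothetical flattened form of one comparison
// from a rule's when-expression, as it might appear in the JSON rule.
type condition struct {
	Field string  `json:"field"`
	Op    string  `json:"op"`
	Value float64 `json:"value"`
}

// matches evaluates every condition (implicit AND) against a transaction,
// modelled here as a simple field-to-value map.
func matches(conds []condition, txn map[string]float64) bool {
	for _, c := range conds {
		v, ok := txn[c.Field]
		if !ok {
			return false
		}
		switch c.Op {
		case ">":
			if !(v > c.Value) {
				return false
			}
		case "==":
			if v != c.Value {
				return false
			}
		default:
			return false // unknown operator: fail closed
		}
	}
	return true
}

func main() {
	// Roughly: when amount > 10000
	conds := []condition{{Field: "amount", Op: ">", Value: 10000}}
	fmt.Println(matches(conds, map[string]float64{"amount": 12500})) // true
}
```

Note that no parsing happens here — the rule was compiled to conditions once, and only this cheap loop runs per transaction.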

Data Synchronization

FinWatch can synchronize data from your Blnk PostgreSQL database into its local DuckDB using the watermark sync pattern. This is essential for aggregate functions — if a rule needs to count “transactions from this account in the last 24 hours,” the local DuckDB must contain that historical data.

How It Works

  1. FinWatch connects to the Blnk PostgreSQL using the BLNK_DSN connection string.
  2. It maintains a sync_watermark table in DuckDB that tracks the last synchronized position (a combination of last_sync_timestamp and last_record_id).
  3. On each sync cycle, it queries PostgreSQL for records created after the watermark.
  4. New records are inserted into the local DuckDB tables.
  5. The watermark is updated.
This approach ensures:
  • No duplicates: Records are only synced once.
  • No gaps: All records after the watermark are eventually synced.
  • Efficient incremental updates: Only new records are transferred, not the entire dataset.
The sync handles four entity types: transactions, identities, balances, and ledgers.
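The watermark comparison behind steps 2–4 can be sketched as follows. The record and watermark shapes are illustrative assumptions (the real sync keys on last_sync_timestamp and last_record_id, as described above):

```go
package main

import "fmt"

type record struct {
	ID        string
	CreatedAt int64 // unix seconds; stands in for a real timestamp
}

type watermark struct {
	LastSyncTimestamp int64
	LastRecordID      string
}

// after reports whether r was created after the watermark. Ties on the
// timestamp are broken by record ID, so a batch of records sharing one
// timestamp is neither duplicated nor skipped.
func after(r record, w watermark) bool {
	if r.CreatedAt != w.LastSyncTimestamp {
		return r.CreatedAt > w.LastSyncTimestamp
	}
	return r.ID > w.LastRecordID
}

// advance processes one ordered batch: it returns the records past the
// watermark and the updated watermark.
func advance(batch []record, w watermark) ([]record, watermark) {
	var fresh []record
	for _, r := range batch {
		if after(r, w) {
			fresh = append(fresh, r)
			w = watermark{LastSyncTimestamp: r.CreatedAt, LastRecordID: r.ID}
		}
	}
	return fresh, w
}

func main() {
	w := watermark{LastSyncTimestamp: 100, LastRecordID: "txn_02"}
	batch := []record{
		{ID: "txn_02", CreatedAt: 100}, // already synced: skipped
		{ID: "txn_03", CreatedAt: 100}, // same timestamp, new ID: synced
		{ID: "txn_04", CreatedAt: 101}, // newer: synced
	}
	fresh, w := advance(batch, w)
	fmt.Println(len(fresh), w.LastRecordID) // 2 txn_04
}
```

The composite key is what delivers the "no duplicates, no gaps" guarantee: a timestamp alone could re-sync or miss records created in the same instant.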
For the full technical specification, see the Watermark Sync Documentation.

Anomaly Reporting

FinWatch communicates with the Blnk Cloud dashboard through a WebSocket tunnel. This is a persistent, bidirectional connection that enables real-time anomaly reporting. When a transaction triggers one or more rules and the risk consolidator determines that the result warrants attention, an AnomalyMessage is sent through the tunnel. The message contains all the context a fraud analyst needs: the transaction ID, the risk score, the verdict, the reason, and the transaction’s metadata.

The WebSocket tunnel is initialized at startup and automatically reconnects if the connection is dropped. If the tunnel is unavailable, anomaly messages are logged locally but not sent — FinWatch does not block transaction processing due to a reporting failure.
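A sketch of the reporting fallback described above. The anomalyMessage field names and the tunnel interface are assumptions, not FinWatch's wire format; only the behaviour — send when connected, log locally otherwise, never block — follows the doc:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// anomalyMessage sketches the payload sent over the WebSocket tunnel.
// Field names are illustrative.
type anomalyMessage struct {
	TransactionID string            `json:"transaction_id"`
	RiskScore     int               `json:"risk_score"`
	Verdict       string            `json:"verdict"`
	Reason        string            `json:"reason"`
	Metadata      map[string]string `json:"metadata"`
}

// tunnel is a stand-in for the WebSocket connection to Blnk Cloud.
type tunnel interface {
	Connected() bool
	Send([]byte) error
}

// report sends the anomaly if the tunnel is up; otherwise it logs locally.
// Either way it returns quickly, so transaction processing never blocks
// on a reporting failure.
func report(t tunnel, m anomalyMessage) (sent bool) {
	payload, err := json.Marshal(m)
	if err != nil {
		log.Printf("dropping anomaly %s: %v", m.TransactionID, err)
		return false
	}
	if t == nil || !t.Connected() {
		log.Printf("tunnel down, logged locally: %s", payload)
		return false
	}
	if err := t.Send(payload); err != nil {
		log.Printf("send failed, logged locally: %s", payload)
		return false
	}
	return true
}

// okTunnel is a trivially connected tunnel used for demonstration.
type okTunnel struct{}

func (okTunnel) Connected() bool     { return true }
func (okTunnel) Send(b []byte) error { return nil }

func main() {
	down := report(nil, anomalyMessage{TransactionID: "txn_01", RiskScore: 82, Verdict: "block"})
	up := report(okTunnel{}, anomalyMessage{TransactionID: "txn_01", RiskScore: 82, Verdict: "block"})
	fmt.Println(down, up) // false true
}
```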

Next Steps

Now that you understand the architecture: