This guide gives you a deep mental model of how FinWatch works internally. Understanding the architecture will help you make informed decisions about configuration, rule design, and production deployment.
High-Level Architecture
FinWatch is a single binary that embeds all of its dependencies. There is no external database to manage, no message queue to configure, and no separate rule engine to deploy. Everything runs in one process.

Why DuckDB?
FinWatch chose DuckDB as its embedded analytical database for several compelling reasons:

- Columnar Storage: DuckDB stores data in a columnar format, which is significantly more efficient for analytical queries (aggregations, filtering, scanning) than row-based databases like SQLite or PostgreSQL. When a rule asks “sum all amounts where source equals X in the last 24 hours,” DuckDB only reads the `amount`, `source`, and `timestamp` columns — not the entire row.
- Vectorized Execution: DuckDB processes data in batches (vectors) rather than row-by-row. This means aggregate functions like `COUNT`, `SUM`, and `AVG` execute at near-native speed, leveraging modern CPU architectures (SIMD instructions, cache-friendly access patterns).
- Zero External Dependencies: DuckDB is an in-process database. There is no separate server to install, configure, or manage. It compiles into the FinWatch binary and runs inside the same process. This dramatically simplifies deployment — especially for an embeddable product that runs on the customer’s server.
- SQL Compatibility: DuckDB supports a rich SQL dialect, which means FinWatch can translate aggregate functions from the DSL into standard SQL queries. This makes the interpreter straightforward to implement and debug.
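For example, the 24-hour aggregate described above might translate into a query along these lines. The table and column names follow the examples in this guide; the literal source value is illustrative:

```sql
-- Only the amount, source, and timestamp columns are scanned,
-- not the full transaction rows.
SELECT SUM(amount)
FROM transactions
WHERE source = 'acc_123'
  AND timestamp >= now() - INTERVAL '24 hours';
```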
Trade-offs
Every architectural choice has trade-offs. DuckDB’s are:

- Single-Writer Concurrency: DuckDB allows multiple concurrent reads but only a single writer at a time. FinWatch handles this with a mutex lock (`dbMutex`) to serialize write operations. In practice, this is not a bottleneck because transaction ingestion is I/O-bound, not CPU-bound.
- Local Storage: Data lives on the local filesystem as `.db` files in the `finwatch_agent/` directory. This means the data is tied to the server. If the server is lost, the local data is lost. However, the source of truth for historical data is the Blnk PostgreSQL database (synced via the watermark pattern), and the source of truth for rules is the Git repository. FinWatch can be fully reconstructed from these external sources.
- Memory Usage: DuckDB’s performance comes from keeping data in memory. As your transaction volume grows, so does DuckDB’s memory footprint. FinWatch provides a configurable `memory_limit` (default: `2GiB`) and an auto-scaling feature to manage this. See the Production Deployment Guide for details.
Database Files
FinWatch creates two DuckDB databases:

| File | Location | Purpose |
|---|---|---|
| `finwatch.db` | `finwatch_agent/finwatch.db` | Stores the `transactions` table — all ingested transaction data. |
| `instructions.db` | `finwatch_agent/instructions.db` | Stores compiled rules as “instructions” — the JSON representation of parsed `.ws` files. |
A temporary directory (`finwatch_agent/duckdb_temp`) is also created for DuckDB’s spill-to-disk operations when queries exceed the configured memory limit.
Connection Configuration
DuckDB is initialized with the following pragmas:

- `threads = 1`: Limits DuckDB to a single thread. This simplifies concurrency management and is sufficient for the single-writer model.
- `memory_limit = '2GiB'`: The default upper bound on memory. Configurable via the `FINWATCH_MEMORY_LIMIT` environment variable.
- `checkpoint_threshold = '64MiB'`: Controls how frequently DuckDB writes its in-memory data to disk. A lower value means more frequent writes (safer but slower); a higher value means less frequent writes (faster but more data at risk during a crash).
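Expressed as DuckDB statements, the same configuration looks like this (a sketch of what FinWatch applies at startup, not a verbatim excerpt of its source):

```sql
-- Single-threaded execution: sufficient for the single-writer model.
SET threads = 1;
-- Default memory ceiling; overridden by FINWATCH_MEMORY_LIMIT.
SET memory_limit = '2GiB';
-- How much WAL data accumulates before a checkpoint to disk.
SET checkpoint_threshold = '64MiB';
```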
Transaction Lifecycle
In One Line
Ingest → Store → Evaluate → Decide → Alert
Evaluate:
- Loads active risk rules
- Prepares any required historical or aggregated context
- Uses historical patterns (e.g., frequency, past behaviour)
- Applies logic and time-based conditions. If a rule matches, it produces a risk signal.

Decide:
- A risk score is computed
- A verdict is assigned (e.g., allow, alert, review, block)
- A risk level is determined (very low → high)
- A reason is generated

Alert:
- An anomaly is sent in real time to the monitoring system
- Includes key details (transaction info, risk score, reason, verdict)
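The Evaluate and Decide stages above can be sketched in a few lines. The rule shapes, signal weights, and verdict thresholds below are invented for illustration and are not FinWatch’s actual scoring model:

```python
# Toy sketch of Evaluate -> Decide. All thresholds and weights are
# illustrative assumptions, not FinWatch's real consolidation logic.

def evaluate(txn, rules):
    """Evaluate: run each rule's condition; matches produce risk signals."""
    return [rule["signal"] for rule in rules if rule["condition"](txn)]

def decide(signals):
    """Decide: consolidate signals into a score, verdict, and risk level."""
    score = min(sum(signals), 100)
    if score >= 80:
        verdict, level = "block", "high"
    elif score >= 50:
        verdict, level = "review", "medium"
    elif score >= 20:
        verdict, level = "alert", "low"
    else:
        verdict, level = "allow", "very low"
    return score, verdict, level

rules = [
    {"condition": lambda t: t["amount"] > 10_000,   "signal": 60},
    {"condition": lambda t: t["currency"] != "USD", "signal": 30},
]
txn = {"amount": 25_000, "currency": "EUR"}

score, verdict, level = decide(evaluate(txn, rules))
# Both rules match, so score is 90 and the verdict is "block".
```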
Rule Compilation Pipeline
When you create or modify a `.ws` file, FinWatch detects the change and compiles the rule through a multi-stage pipeline:
Stage 1: Lexing. The Lexer reads the raw `.ws` text character by character and produces a stream of Token objects. Each token represents a fundamental language element: a keyword (`rule`, `when`, `then`), an operator (`==`, `>`), a literal (`10000`, `"USD"`), or a delimiter (`{`, `}`).
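Stage 1 can be illustrated with a toy tokenizer. The token categories mirror the ones described above, but the regular expressions and the sample rule are simplified assumptions about the `.ws` grammar, not FinWatch’s actual lexer:

```python
import re

# Token kinds mirror the description above; patterns are simplified.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:rule|when|then)\b"),
    ("NUMBER",  r"\d+(?:\.\d+)?"),
    ("STRING",  r'"[^"]*"'),
    ("OP",      r"==|>=|<=|>|<"),
    ("DELIM",   r"[{}]"),
    ("IDENT",   r"[A-Za-z_][A-Za-z0-9_.]*"),
    ("SKIP",    r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def lex(source: str):
    """Yield (kind, text) tokens, skipping whitespace."""
    for match in MASTER.finditer(source):
        if match.lastgroup != "SKIP":
            yield (match.lastgroup, match.group())

tokens = list(lex('rule { when amount > 10000 then "alert" }'))
# The keyword `rule` lexes as a KEYWORD and 10000 as a NUMBER.
```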
Stage 2: Parsing. The Parser consumes the token stream and builds an Abstract Syntax Tree (AST). The AST is a hierarchical representation of the rule’s structure. At the top is a RuleStatement containing a name, description, a when expression (which can be a nested tree of logical and comparison expressions), and a then action expression.
Stage 3: AST to JSON. The astToRule() function converts the AST into a Rule struct — a flat, JSON-serializable representation that the interpreter can evaluate efficiently. Logical expressions are flattened into a list of conditions. The JSON rule is stored in the instructions database.
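As a sketch, the flattened representation a rule compiles into might look something like this. The schema shown is an assumption for illustration, not the actual instruction format:

```json
{
  "name": "large_transfer",
  "description": "Flag unusually large transfers",
  "conditions": [
    { "field": "amount",   "operator": ">",  "value": 10000 },
    { "field": "currency", "operator": "==", "value": "USD" }
  ],
  "logic": "and",
  "action": "alert"
}
```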
Stage 4: Interpretation. At evaluation time, the interpreter reads the JSON rule and evaluates each condition against the transaction data. This separation of parsing (compile-time) and evaluation (runtime) means that rules are only parsed once, even if they are evaluated millions of times.
Why this pipeline exists: The pipeline separates concerns. The DSL provides a human-friendly authoring experience. The JSON intermediate format provides a machine-friendly evaluation target. This means you can write rules in the expressive .ws syntax, while the engine evaluates them in a format optimized for speed.
Data Synchronization
FinWatch can synchronize data from your Blnk PostgreSQL database into its local DuckDB using the watermark sync pattern. This is essential for aggregate functions — if a rule needs to count “transactions from this account in the last 24 hours,” the local DuckDB must contain that historical data.

How It Works
- FinWatch connects to the Blnk PostgreSQL database using the `BLNK_DSN` connection string.
- It maintains a `sync_watermark` table in DuckDB that tracks the last synchronized position (a combination of `last_sync_timestamp` and `last_record_id`).
- On each sync cycle, it queries PostgreSQL for records created after the watermark.
- New records are inserted into the local DuckDB tables.
- The watermark is updated.
This pattern guarantees:

- No duplicates: Records are only synced once.
- No gaps: All records after the watermark are eventually synced.
- Efficient incremental updates: Only new records are transferred, not the entire dataset.
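The sync cycle might issue queries along these lines. The `transactions` schema and the `:parameter` placeholders are illustrative assumptions, not the exact queries FinWatch runs:

```sql
-- Fetch only records created after the current watermark.
-- A row-value comparison keeps ordering stable when timestamps tie.
SELECT id, amount, source, created_at
FROM transactions
WHERE (created_at, id) > (:last_sync_timestamp, :last_record_id)
ORDER BY created_at, id;

-- After inserting the batch into DuckDB, advance the watermark.
UPDATE sync_watermark
SET last_sync_timestamp = :batch_max_created_at,
    last_record_id      = :batch_max_id;
```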
For the full technical specification, see the Watermark Sync Documentation.
Anomaly Reporting
FinWatch communicates with the Blnk Cloud dashboard through a WebSocket tunnel. This is a persistent, bidirectional connection that enables real-time anomaly reporting. When a transaction triggers one or more rules and the risk consolidator determines that the result warrants attention, an `AnomalyMessage` is sent through the tunnel. The message contains all the context a fraud analyst needs: the transaction ID, the risk score, the verdict, the reason, and the transaction’s metadata.
The WebSocket tunnel is initialized at startup and automatically reconnects if the connection is dropped. If the tunnel is unavailable, anomaly messages are logged locally but not sent — FinWatch does not block transaction processing due to a reporting failure.
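For illustration, an `AnomalyMessage` carrying that context might look roughly like this. Every field name here is an assumption based on the context the message is described as containing, not the actual wire format:

```json
{
  "transaction_id": "txn_8f2a91",
  "risk_score": 90,
  "risk_level": "high",
  "verdict": "block",
  "reason": "amount exceeds threshold in rule large_transfer",
  "metadata": {
    "amount": 25000,
    "currency": "EUR",
    "source": "acc_123"
  }
}
```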
Next Steps
Now that you understand the architecture:

- Writing Your First Rule — Apply this knowledge to build your first rule step by step.
- Aggregate Functions Guide — Understand how aggregate functions translate to DuckDB SQL queries.
- Production Deployment — Configure memory limits, monitoring, and backups for a production environment.
- Integration Guide — Connect FinWatch to your application via the API or Blnk webhooks.