Orderbook Testing Architecture

Executive Summary

This document defines a comprehensive testing architecture for the Universal Orderbook system. The goal is to achieve Correctness by Construction - moving beyond "testing" (checking if it works 99% of the time) to Verification (proving it cannot fail).

The testing framework is organized into 6 Phases plus 3 Correctness Layers:

Phase	Name	Purpose	Tools
1	Oracle Testing	Compare fast engine against simple reference	Python + Rust
2	Property-Based Testing	Fuzz invariants with random inputs	proptest
3	Deterministic Simulation	Replay verification, hash comparison	Journal + SHA-256
4	Performance Testing	Latency p99, jitter, flamegraphs	criterion
5	Chaos Engineering	Crash recovery, flood resilience	kill -9, load gen
6	Adversarial Testing	Malicious actors, exploit attempts	Evil bots

Layer	Name	Purpose
A	Type-State Correctness	Make invalid states unrepresentable
B	Financial Safety	Double-entry ledger, liability bounds
C	Formal Verification	Mathematical proofs of properties

Testing Architecture Overview

Phase 1: Oracle Testing (Reference Model)

Concept

The Oracle Pattern compares your optimized Rust engine against a "golden source" - a simple, obviously-correct reference implementation.

What to Compare

Property	Slow Engine	Fast Engine	Must Match
Best Bid Price	`max(bids)`	`BTreeMap.last()`	✓ Exact
Best Ask Price	`min(asks)`	`BTreeMap.first()`	✓ Exact
Total Volume	Sum all orders	Cached counter	✓ Exact
Order Count	`len(orders)`	Index size	✓ Exact
Match Results	List of fills	List of fills	✓ Exact
Order State	Dict lookup	Index lookup	✓ Exact

Reference Engine Design

Phase 2: Property-Based Testing (Invariants)

The Laws of Physics

These invariants must never be violated, regardless of input sequence:

Invariant Test Matrix

Invariant	Trigger Scenario	Detection Method
Crossed Book	Aggressive order crosses spread	Assert `best_bid < best_ask` post-op
Conservation	Any add/cancel/match	Track running totals, compare
ID Uniqueness	Rapid order submission	HashSet membership check
Price-Time Priority	Same-price orders	Verify fill order = insertion order
Probability Bounds	Extreme prices	Clamp check in type constructor
Self-Trade	Same user crossing	Check user_id before match

Fuzzer Strategy

Phase 3: Deterministic Simulation (Replay Testing)

The Time Machine

If you feed the exact same inputs in the same order, you must get the exact same state - down to the byte.

Sources of Non-Determinism

Source	Problem	Solution
`HashMap` iteration	Random order	Use `BTreeMap`
`SystemTime::now()`	Wall clock in logic	Pass timestamp as input
Thread scheduling	Race conditions	Single-threaded core
Floating point	CPU-specific rounding	Use fixed-point integers
Random numbers	Different seeds	Seed from sequence number
Compiler optimization	Instruction reordering	Pin compiler version

State Hashing Strategy

Cross-Machine Determinism Test

Phase 4: Performance Testing (Benchmarking)

Latency Targets

Metric	Target	Critical	Measurement Point
p50 Latency	< 10µs	< 50µs	Order → Ack
p99 Latency	< 50µs	< 200µs	Order → Ack
p99.9 Latency	< 100µs	< 500µs	Order → Ack
Tick-to-Trade	< 20µs	< 100µs	Receive → Fill Event
Jitter (σ)	< 5µs	< 20µs	Std dev of latency

Benchmark Architecture

Benchmark Scenarios

Phase 5: Chaos Engineering

Failure Scenarios

Power Plug Test Protocol

Flood Test Protocol

Phase 6: Adversarial Testing

Evil Bot Taxonomy

Agent-Based Simulation

Correctness Layer A: Type-State Pattern

Making Invalid States Unrepresentable

The Type-State pattern uses Rust's type system to ensure that invalid state transitions are compile-time errors, not runtime bugs.

Type-State Implementation

Market State Enforcement

What Type-State Prevents

Bug	Without Type-State	With Type-State
Cancel filled order	Runtime check, might miss	Won't compile
Match in suspended market	Boolean flag check	Type mismatch error
Double settlement	Flag check	Method doesn't exist
Order on closed market	State validation	Type prevents

Correctness Layer B: Financial Safety

Double-Entry Ledger

Every operation must balance. No money is ever created or destroyed.

Worst-Case Liability Bounding

Settlement Idempotency

Correctness Layer C: Formal Verification

Proof Hierarchy

What Kani Proves

Property	Kani Assertion	Failure Mode
No overflow	`price * quantity` safe	Integer overflow
No panic	`unwrap()` always succeeds	None unwrap on None
No OOB	Array index in bounds	Index out of bounds
Termination	Loops terminate	Infinite loop

Overflow Analysis Points

Bet Type & Market Correctness Matrix

Market Types

Settlement Test Matrix

Market Type	Outcome	Settlement Logic	Test Scenario
Binary	Yes wins	Pay all Yes backers	Full payout
Binary	No wins	Pay all No backers	Full payout
1X2	Home wins	Pay Home backers	One of three
1X2	Void	Refund all	Event cancelled
Asian -0.5	Win by 1+	Full win	Clear outcome
Asian -0.5	Draw	Full loss	Push scenario
Asian -0.25	Win by 1	Half win	Split stake
Over 2.5	3 goals	Over wins	Threshold crossed
Over 2.5	2 goals	Under wins	Below threshold

Void/Push Decision Tree

Price-Time Priority Verification

The Fairness Guarantee

If Order A arrives before Order B at the same price, then Order A is always matched first — under all cancel/modify scenarios.

Modify Semantics

Modification	Loses Priority?	Rationale
Price change (any)	✅ Yes	New price = new queue position
Quantity increase	✅ Yes	Advantage over earlier orders
Quantity decrease	❌ No	No advantage gained
Metadata only	❌ No	No market impact

Priority Fuzzing Test

Auction State Correctness

Market States and Allowed Operations

Auction Invariants

Invariant	Description	Test
No matching in auction	Orders rest, no fills	Submit crossing orders, verify no fills
Uncrossing at auction end	Single price clears	Verify all fills at equilibrium price
Order priority preserved	Earlier orders first	Check fill order
Indicative price correct	Maximizes volume	Compare with calculated equilibrium

Test Directory Structure

CI/CD Pipeline Integration

Operational Safeguards

Kill Switch Correctness

Config Drift Detection

Liveness Proofs

Beyond safety (nothing bad happens), we must prove liveness (good things eventually happen):

Property	Definition	Test Method
Order eventually matches or cancels	No order stuck forever	Timeout simulation
Market eventually settles	No perpetual open markets	State machine analysis
No deadlocks	System always makes progress	Model checking (TLA+)
No livelocks	Not spinning without progress	Kani loop analysis

Complete Test Checklist

Matching Engine

Test	Type	Status	Priority
Oracle comparison	Phase 1	🔲	P0
Property fuzzing	Phase 2	✅ Exists	P0
Price-time priority proofs	Phase 2	🔲	P0
Modify semantics invariants	Phase 2	🔲	P1
Auction state enforcement	Phase 2	🔲	P1
Replay determinism	Phase 3	🔲	P0
Cross-machine determinism	Phase 3	🔲	P1
Latency benchmarks	Phase 4	✅ Exists	P1
Crash recovery	Phase 5	✅ Exists	P0
Flood testing	Phase 5	🔲	P1
Adversarial bots	Phase 6	🔲	P2

Financial Safety

Test	Type	Status	Priority
Double-entry ledger invariant	Layer B	🔲	P0
Worst-case liability bounds	Layer B	🔲	P0
Settlement idempotency	Layer B	🔲	P0
Insurance fund drain test	Chaos	🔲	P1

Formal Verification

Test	Type	Status	Priority
Type-state transitions	Layer A	🔲	P1
Kani overflow proofs	Layer C	🔲	P0
Kani panic-freedom	Layer C	🔲	P0
Liveness (no deadlock)	Layer C	🔲	P2

Market/Bet Coverage

Market Type	Direct Match	Synthetic	Settlement	Void/Push
Binary	✅	N/A	✅	🔲
1X2	✅	✅	🔲	🔲
Asian Handicap	✅	N/A	🔲	🔲
Over/Under	🔲	N/A	🔲	🔲
Multi-Runner	🔲	🔲	🔲	🔲
Outright	🔲	🔲	🔲	🔲

Summary: The Testing Hierarchy

Appendix: Tool Reference

Tool	Purpose	Command
cargo test	Unit & integration tests	`cargo test`
proptest	Property-based fuzzing	Built into test suite
criterion	Micro-benchmarks	`cargo bench`
cargo flamegraph	CPU profiling	`cargo flamegraph --bin test_ui_server`
Kani	Formal verification	`cargo kani --harness <name>`
cargo fuzz	Coverage-guided fuzzing	`cargo +nightly fuzz run <target>`

Document Version: 1.0 Last Updated: 2025-12-27

Executive Summary​

Testing Architecture Overview​

Phase 1: Oracle Testing (Reference Model)​

Concept​

What to Compare​

Reference Engine Design​

Phase 2: Property-Based Testing (Invariants)​

The Laws of Physics​

Invariant Test Matrix​

Fuzzer Strategy​

Phase 3: Deterministic Simulation (Replay Testing)​

The Time Machine​

Sources of Non-Determinism​

State Hashing Strategy​

Cross-Machine Determinism Test​

Phase 4: Performance Testing (Benchmarking)​

Latency Targets​

Benchmark Architecture​

Benchmark Scenarios​

Phase 5: Chaos Engineering​

Failure Scenarios​

Power Plug Test Protocol​

Flood Test Protocol​

Phase 6: Adversarial Testing​

Evil Bot Taxonomy​

Agent-Based Simulation​

Correctness Layer A: Type-State Pattern​

Making Invalid States Unrepresentable​

Type-State Implementation​

Market State Enforcement​

What Type-State Prevents​

Correctness Layer B: Financial Safety​

Double-Entry Ledger​

Worst-Case Liability Bounding​

Settlement Idempotency​

Correctness Layer C: Formal Verification​

Proof Hierarchy​

What Kani Proves​

Overflow Analysis Points​

Bet Type & Market Correctness Matrix​

Market Types​

Settlement Test Matrix​

Void/Push Decision Tree​

Price-Time Priority Verification​

The Fairness Guarantee​

Modify Semantics​

Priority Fuzzing Test​

Auction State Correctness​

Market States and Allowed Operations​

Auction Invariants​

Test Directory Structure​

CI/CD Pipeline Integration​

Operational Safeguards​

Kill Switch Correctness​

Config Drift Detection​

Liveness Proofs​

Complete Test Checklist​

Matching Engine​

Financial Safety​

Formal Verification​

Market/Bet Coverage​

Summary: The Testing Hierarchy​

Appendix: Tool Reference​