Skip to main content

Orderbook Testing Architecture

Executive Summary

This document defines a comprehensive testing architecture for the Universal Orderbook system. The goal is to achieve Correctness by Construction - moving beyond "testing" (checking if it works 99% of the time) to Verification (proving it cannot fail).

The testing framework is organized into 6 Phases plus 3 Correctness Layers:

PhaseNamePurposeTools
1Oracle TestingCompare fast engine against simple referencePython + Rust
2Property-Based TestingFuzz invariants with random inputsproptest
3Deterministic SimulationReplay verification, hash comparisonJournal + SHA-256
4Performance TestingLatency p99, jitter, flamegraphscriterion
5Chaos EngineeringCrash recovery, flood resiliencekill -9, load gen
6Adversarial TestingMalicious actors, exploit attemptsEvil bots
LayerNamePurpose
AType-State CorrectnessMake invalid states unrepresentable
BFinancial SafetyDouble-entry ledger, liability bounds
CFormal VerificationMathematical proofs of properties

Testing Architecture Overview


Phase 1: Oracle Testing (Reference Model)

Concept

The Oracle Pattern compares your optimized Rust engine against a "golden source" - a simple, obviously-correct reference implementation.

What to Compare

PropertySlow EngineFast EngineMust Match
Best Bid Pricemax(bids)BTreeMap.last()✓ Exact
Best Ask Pricemin(asks)BTreeMap.first()✓ Exact
Total VolumeSum all ordersCached counter✓ Exact
Order Countlen(orders)Index size✓ Exact
Match ResultsList of fillsList of fills✓ Exact
Order StateDict lookupIndex lookup✓ Exact

Reference Engine Design


Phase 2: Property-Based Testing (Invariants)

The Laws of Physics

These invariants must never be violated, regardless of input sequence:

Invariant Test Matrix

InvariantTrigger ScenarioDetection Method
Crossed BookAggressive order crosses spreadAssert best_bid < best_ask post-op
ConservationAny add/cancel/matchTrack running totals, compare
ID UniquenessRapid order submissionHashSet membership check
Price-Time PrioritySame-price ordersVerify fill order = insertion order
Probability BoundsExtreme pricesClamp check in type constructor
Self-TradeSame user crossingCheck user_id before match

Fuzzer Strategy


Phase 3: Deterministic Simulation (Replay Testing)

The Time Machine

If you feed the exact same inputs in the same order, you must get the exact same state - down to the byte.

Sources of Non-Determinism

SourceProblemSolution
HashMap iterationRandom orderUse BTreeMap
SystemTime::now()Wall clock in logicPass timestamp as input
Thread schedulingRace conditionsSingle-threaded core
Floating pointCPU-specific roundingUse fixed-point integers
Random numbersDifferent seedsSeed from sequence number
Compiler optimizationInstruction reorderingPin compiler version

State Hashing Strategy

Cross-Machine Determinism Test


Phase 4: Performance Testing (Benchmarking)

Latency Targets

MetricTargetCriticalMeasurement Point
p50 Latency< 10µs< 50µsOrder → Ack
p99 Latency< 50µs< 200µsOrder → Ack
p99.9 Latency< 100µs< 500µsOrder → Ack
Tick-to-Trade< 20µs< 100µsReceive → Fill Event
Jitter (σ)< 5µs< 20µsStd dev of latency

Benchmark Architecture

Benchmark Scenarios


Phase 5: Chaos Engineering

Failure Scenarios

Power Plug Test Protocol

Flood Test Protocol


Phase 6: Adversarial Testing

Evil Bot Taxonomy

Agent-Based Simulation


Correctness Layer A: Type-State Pattern

Making Invalid States Unrepresentable

The Type-State pattern uses Rust's type system to ensure that invalid state transitions are compile-time errors, not runtime bugs.

Type-State Implementation

Market State Enforcement

What Type-State Prevents

BugWithout Type-StateWith Type-State
Cancel filled orderRuntime check, might missWon't compile
Match in suspended marketBoolean flag checkType mismatch error
Double settlementFlag checkMethod doesn't exist
Order on closed marketState validationType prevents

Correctness Layer B: Financial Safety

Double-Entry Ledger

Every operation must balance. No money is ever created or destroyed.

Worst-Case Liability Bounding

Settlement Idempotency


Correctness Layer C: Formal Verification

Proof Hierarchy

What Kani Proves

PropertyKani AssertionFailure Mode
No overflowprice * quantity safeInteger overflow
No panicunwrap() always succeedsNone unwrap on None
No OOBArray index in boundsIndex out of bounds
TerminationLoops terminateInfinite loop

Overflow Analysis Points


Bet Type & Market Correctness Matrix

Market Types

Settlement Test Matrix

Market TypeOutcomeSettlement LogicTest Scenario
BinaryYes winsPay all Yes backersFull payout
BinaryNo winsPay all No backersFull payout
1X2Home winsPay Home backersOne of three
1X2VoidRefund allEvent cancelled
Asian -0.5Win by 1+Full winClear outcome
Asian -0.5DrawFull lossPush scenario
Asian -0.25Win by 1Half winSplit stake
Over 2.53 goalsOver winsThreshold crossed
Over 2.52 goalsUnder winsBelow threshold

Void/Push Decision Tree


Price-Time Priority Verification

The Fairness Guarantee

If Order A arrives before Order B at the same price, then Order A is always matched first — under all cancel/modify scenarios.

Modify Semantics

ModificationLoses Priority?Rationale
Price change (any)✅ YesNew price = new queue position
Quantity increase✅ YesAdvantage over earlier orders
Quantity decrease❌ NoNo advantage gained
Metadata only❌ NoNo market impact

Priority Fuzzing Test


Auction State Correctness

Market States and Allowed Operations

Auction Invariants

InvariantDescriptionTest
No matching in auctionOrders rest, no fillsSubmit crossing orders, verify no fills
Uncrossing at auction endSingle price clearsVerify all fills at equilibrium price
Order priority preservedEarlier orders firstCheck fill order
Indicative price correctMaximizes volumeCompare with calculated equilibrium

Test Directory Structure


CI/CD Pipeline Integration


Operational Safeguards

Kill Switch Correctness

Config Drift Detection


Liveness Proofs

Beyond safety (nothing bad happens), we must prove liveness (good things eventually happen):

PropertyDefinitionTest Method
Order eventually matches or cancelsNo order stuck foreverTimeout simulation
Market eventually settlesNo perpetual open marketsState machine analysis
No deadlocksSystem always makes progressModel checking (TLA+)
No livelocksNot spinning without progressKani loop analysis

Complete Test Checklist

Matching Engine

TestTypeStatusPriority
Oracle comparisonPhase 1🔲P0
Property fuzzingPhase 2✅ ExistsP0
Price-time priority proofsPhase 2🔲P0
Modify semantics invariantsPhase 2🔲P1
Auction state enforcementPhase 2🔲P1
Replay determinismPhase 3🔲P0
Cross-machine determinismPhase 3🔲P1
Latency benchmarksPhase 4✅ ExistsP1
Crash recoveryPhase 5✅ ExistsP0
Flood testingPhase 5🔲P1
Adversarial botsPhase 6🔲P2

Financial Safety

TestTypeStatusPriority
Double-entry ledger invariantLayer B🔲P0
Worst-case liability boundsLayer B🔲P0
Settlement idempotencyLayer B🔲P0
Insurance fund drain testChaos🔲P1

Formal Verification

TestTypeStatusPriority
Type-state transitionsLayer A🔲P1
Kani overflow proofsLayer C🔲P0
Kani panic-freedomLayer C🔲P0
Liveness (no deadlock)Layer C🔲P2

Market/Bet Coverage

Market TypeDirect MatchSyntheticSettlementVoid/Push
BinaryN/A🔲
1X2🔲🔲
Asian HandicapN/A🔲🔲
Over/Under🔲N/A🔲🔲
Multi-Runner🔲🔲🔲🔲
Outright🔲🔲🔲🔲

Summary: The Testing Hierarchy


Appendix: Tool Reference

ToolPurposeCommand
cargo testUnit & integration testscargo test
proptestProperty-based fuzzingBuilt into test suite
criterionMicro-benchmarkscargo bench
cargo flamegraphCPU profilingcargo flamegraph --bin test_ui_server
KaniFormal verificationcargo kani --harness <name>
cargo fuzzCoverage-guided fuzzingcargo +nightly fuzz run <target>

Document Version: 1.0 Last Updated: 2025-12-27