SuperSkin Services: Data & API Strategy
Version: 2.0
Date: January 2026
Status: ✅ IMPLEMENTED
Focus: 6 production services with sport-aware ML predictions
Executive Summary
This document details the data sources, APIs, and implementation approach for the 6 SuperSkin services that are now fully implemented. The architecture features:
- Grok (xAI) as the primary LLM with native X Search and Web Search
- Sport-aware ML models for football, cricket, and tennis
- Graceful fallback to LLM intelligence for unsupported sports (basketball, esports, etc.)
- Multi-source data from Oracle Platform (real-time) and Forsyt Data Machine (historical)
Key Architecture: ML models augment Grok's intelligence for supported sports. For unsupported sports, Grok provides its own analysis using odds-implied probabilities and its knowledge base.
What's Built (Current State)
Core Infrastructure
| Component | Location | Status | Purpose |
|---|---|---|---|
| Oddspapi Integration | oracle-platform/ | ✅ BUILT | Real-time odds from 300+ bookmakers |
| Redis Pub/Sub | SuperSkin services | ✅ BUILT | Event distribution & caching |
| TimescaleDB | SuperSkin infra | ✅ BUILT | Time-series data for charts |
| PostgreSQL | Supabase (IceCrystal) | ✅ BUILT | Historical match data warehouse |
SuperSkin Services (All Implemented)
| # | Service | Port | Status | Purpose |
|---|---|---|---|---|
| 1 | Price Feed Aggregator | 3100 | ✅ BUILT | Normalized odds aggregation |
| 2 | Cash Out Calculator | 3101 | ✅ BUILT | Real-time cash-out quotes |
| 3 | AI Value Detection | 3102 | ✅ BUILT | Shin/Poisson/ELO value signals |
| 4 | AI Chat Assistant | 3103 | ✅ BUILT | Grok-powered RAG assistant |
| 5 | Trading Charts | 3104 | ✅ BUILT | OHLC candlestick data |
| 6 | ML Prediction Service | 3105 | ✅ BUILT | Sport-specific ML inference |
ML Training Pipeline (Offline)
| Component | Status | Purpose |
|---|---|---|
| Data Loader | ✅ BUILT | Load football/cricket/tennis from Supabase |
| Feature Engineering | ✅ BUILT | Sport-specific feature creation |
| Model Training | ✅ BUILT | XGBoost, Random Forest, Neural Network |
| Model Registry | ✅ BUILT | Auto-discovery of sport-specific models |
6 Services Architecture
| # | Service | Input | Output | ML Support |
|---|---|---|---|---|
| 1 | Price Feed Aggregator | Oddspapi odds | Normalized prices → Redis | N/A |
| 2 | Cash Out Calculator | Positions + prices | Cash out quotes | N/A |
| 3 | AI Value Detection | Prices + historical | Value signals | Shin/Poisson/ELO |
| 4 | AI Chat Assistant | User queries | Analysis + predictions | ✅ Football/Cricket/Tennis |
| 5 | Trading Charts | Price history | OHLC endpoints | N/A |
| 6 | ML Prediction Service | Match features | Probabilities | ✅ Football/Cricket/Tennis |
Sport-Aware ML Architecture
Service Implementation Details
1. Price Feed Aggregator Service
Purpose: Normalize odds from multiple bookmakers into a unified format
What it consumes:
Oddspapi WebSocket → odds:updated:soccer channel
Redis Cache → odds:soccer:{matchId}
What it produces:
Redis Pub/Sub → prices:normalized:{sport}:{matchId}
Redis Cache → prices:best:{matchId} (best back/lay per outcome)
Data Structure (Normalized Price):
interface NormalizedPrice {
  matchId: string;
  sport: 'soccer' | 'cricket' | 'tennis';
  timestamp: Date;
  outcomes: {
    [outcomeId: string]: {
      name: string; // "Home", "Draw", "Away"
      bestBack: { odds: number; bookmaker: string; limit: number };
      bestLay: { odds: number; bookmaker: string; limit: number };
      consensus: number; // Weighted average
      allPrices: { bookmaker: string; back: number; lay: number }[];
    };
  };
  margin: number; // Calculated overround
  sharpness: number; // How close to Pinnacle
}
Implementation Steps:
- Subscribe to odds:updated:* from Redis Pub/Sub
- For each update, fetch full odds from cache
- Apply normalization (consensus calculation already exists in odds-aggregation.service.ts)
- Publish to prices:normalized:* channel
- Cache best prices with 5s TTL
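The normalization step above can be illustrated with a minimal sketch. It is not the logic in odds-aggregation.service.ts; the `Quote` class, the helper names, and the sample bookmakers are hypothetical. It picks the best back price per outcome and derives the overround (the `margin` field) from those prices.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    """One bookmaker's back price for a single outcome (hypothetical shape)."""
    bookmaker: str
    back: float  # decimal odds


def best_back(quotes: list[Quote]) -> Quote:
    """Highest back odds are the best price for the bettor."""
    return max(quotes, key=lambda q: q.back)


def overround(odds_per_outcome: list[float]) -> float:
    """Sum of implied probabilities minus 1; 0.0 would be a margin-free book."""
    return sum(1.0 / odds for odds in odds_per_outcome) - 1.0


# Sample 1X2 market with two made-up bookmakers.
home = [Quote("BookA", 2.10), Quote("BookB", 2.05)]
draw = [Quote("BookA", 3.40), Quote("BookB", 3.50)]
away = [Quote("BookA", 3.60), Quote("BookB", 3.75)]

best = [best_back(quotes) for quotes in (home, draw, away)]
margin = overround([q.back for q in best])
```

Taking the best price per outcome across books usually gives a lower margin than any single bookmaker offers, which is why the aggregated feed is useful as a reference.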
2. AI Value Detection Service
Purpose: Identify bets where our odds offer positive expected value
What it consumes:
Redis Pub/Sub → prices:normalized:{sport}:{matchId}
Internal odds → From matching engine (Forsyt order book)
What it produces:
Redis Pub/Sub → value:detected:{matchId}
WebSocket → value:alert (to subscribed users)
Value Detection Algorithm:
// Calculate fair probability from sharp books (Pinnacle-weighted)
// Outline only: the steps are listed as comments and the body is not shown here.
function calculateFairOdds(normalizedPrice: NormalizedPrice): number {
  const pinnacleWeight = 0.6;
  const betfairWeight = 0.25;
  const consensusWeight = 0.15;
  // Remove margins, calculate implied probabilities
  // Weight by sharpness
  // Return fair decimal odds
}

// Calculate edge: how far our odds sit above the fair odds, in percent.
// calculateConfidence and DataQuality are assumed to be defined elsewhere
// in the service (see Confidence Scoring).
function calculateEdge(
  matchId: string,
  outcome: string,
  ourOdds: number,
  fairOdds: number,
  dataQuality: DataQuality
): ValueAlert {
  const edgePercent = ((ourOdds - fairOdds) / fairOdds) * 100;
  return {
    matchId,
    outcome,
    ourOdds,
    fairOdds,
    edge: edgePercent,
    confidence: calculateConfidence(dataQuality),
    recommendation: edgePercent > 3 ? 'STRONG_VALUE' :
                    edgePercent > 1 ? 'VALUE' : 'NO_VALUE'
  };
}
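The fair-odds outline above leaves the margin removal and weighting unspecified. The sketch below is one possible reading, not the service's implementation: it uses simple proportional normalization to strip the margin (the Shin method mentioned elsewhere in this document would be an alternative) and reuses the 0.6 / 0.25 / 0.15 weights from the outline. The function names and sample odds are hypothetical.

```python
def remove_margin(odds: list[float]) -> list[float]:
    """Proportional normalization: scale implied probabilities so they sum to 1."""
    implied = [1.0 / o for o in odds]
    total = sum(implied)
    return [p / total for p in implied]


def fair_odds(probs_by_source: dict[str, list[float]],
              weights: dict[str, float]) -> list[float]:
    """Blend margin-free probabilities across sources, then convert to odds."""
    # Only sources that are both weighted and present contribute.
    used = {s: w for s, w in weights.items() if s in probs_by_source}
    if not used:
        raise ValueError("no weighted source available")
    total_weight = sum(used.values())
    n_outcomes = len(probs_by_source[next(iter(used))])
    blended = [
        sum(w * probs_by_source[s][i] for s, w in used.items()) / total_weight
        for i in range(n_outcomes)
    ]
    return [1.0 / p for p in blended]


WEIGHTS = {"pinnacle": 0.6, "betfair": 0.25, "consensus": 0.15}

# Made-up home/draw/away prices for one match.
probs = {
    "pinnacle": remove_margin([1.95, 3.60, 4.20]),
    "betfair": remove_margin([1.98, 3.65, 4.30]),
    "consensus": remove_margin([1.90, 3.50, 4.00]),
}
fair = fair_odds(probs, WEIGHTS)
```

Because weights are renormalized over the sources that are present, the blend still works when one feed (for example Betfair) is missing for a match.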
Confidence Scoring:
- Number of bookmakers (more = higher confidence)
- Pinnacle included (yes = +20% confidence)
- Time to event (closer = lower confidence due to volatility)
- Historical accuracy of this market type
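As an illustration of how the four factors above might combine into one score, here is a small sketch. The document only lists the factors and the +20% Pinnacle boost; the saturation at 20 bookmakers, the 24-hour time window, the multiplicative combination, and treating the boost as additive are all assumptions made for this example.

```python
def confidence(num_bookmakers: int, has_pinnacle: bool,
               hours_to_event: float, market_accuracy: float) -> float:
    """Combine the four listed factors into a score clamped to [0, 1]."""
    # Assumption: coverage stops improving beyond 20 bookmakers.
    coverage = min(num_bookmakers, 20) / 20
    # Assumption: full weight 24h or more before the event,
    # falling linearly to 0.5 at the start time (closer = lower confidence).
    hours = min(max(hours_to_event, 0.0), 24.0)
    time_factor = 0.5 + 0.5 * hours / 24.0
    score = coverage * time_factor * market_accuracy
    if has_pinnacle:
        score += 0.20  # the +20% boost, treated here as additive
    return max(0.0, min(1.0, score))


far_out = confidence(10, True, 24, 0.8)
near_start = confidence(10, True, 0, 0.8)
```

With the same inputs, the score for an event starting now is lower than for one a day away, matching the volatility note in the list.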
3. AI Chat Assistant Service (Port 3103)
Purpose: Natural language betting assistance with Grok-powered intelligence
What it consumes:
User context → Balance, positions, bet history (from Supabase)
Market data → All live matches, odds (from Redis cache)
Value detection → Current value opportunities (from AI Value Detection)
ML predictions → Sport-specific predictions (from ML Prediction Service)
What it produces:
Chat responses with:
- Match analysis with ML predictions (when available)
- Value bet recommendations
- Cash out suggestions
- Real-time X Search for news/injuries
- Web Search for additional context
LLM Provider Architecture:
| Priority | Provider | Model | Features | Cost |
|---|---|---|---|---|
| 1 | Grok (xAI) | grok-3 | X Search, Web Search, Function Calling | $5/1M tokens |
| 2 | Groq | llama-3.3-70b | Fast inference, Function Calling | FREE (30 req/min) |
| 3 | Ollama | llama3:8b | Local, Unlimited | FREE |
| 4 | OpenAI | gpt-4o-mini | Reliable backup | $0.15/1M tokens |
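The priority order in the table implies a fallback chain: try Grok first and move down the list when a provider fails. A minimal sketch of that pattern follows. The function name and the stand-in provider callables are hypothetical; real callables would wrap each vendor's client and would catch narrower error types.

```python
from typing import Callable

# A provider is a (name, callable) pair; the callable maps a prompt to a reply.
Provider = tuple[str, Callable[[str], str]]


def complete_with_fallback(prompt: str, providers: list[Provider]) -> tuple[str, str]:
    """Try providers in priority order and return (provider_name, reply)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose for this sketch
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in callables that simulate a failing primary and a working fallback.
def fake_grok(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


def fake_groq(prompt: str) -> str:
    return f"[groq] {prompt}"


name, reply = complete_with_fallback(
    "Analyze Liverpool vs Chelsea",
    [("grok", fake_grok), ("groq", fake_groq)],
)
```

Returning the provider name alongside the reply lets the caller note when an answer came from a fallback that lacks X Search or Web Search.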
Grok-Specific Features:
Sport-Aware Response Flow:
User: "Analyze Lakers vs Celtics tonight"
  ↓
Grok identifies: Basketball (unsupported sport)
  ↓
Grok calls: get_market_prices(match="Lakers vs Celtics")
  ↓
Grok calls: x_search("Lakers Celtics injury news")
  ↓
Grok calls: web_search("Lakers Celtics betting preview")
  ↓
Response: Analysis using odds-implied probabilities + X/Web context
          (No ML predictions - basketball not supported)

User: "Analyze Liverpool vs Chelsea"
  ↓
Grok identifies: Football (supported sport)
  ↓
Grok calls: get_ml_predictions(sport="football", match="Liverpool vs Chelsea")
  ↓
Grok calls: get_value_signals(match="Liverpool vs Chelsea")
  ↓
Response: Analysis with ML predictions (72% confidence) + value signals
4. Trading Charts Backend
Purpose: Store and serve historical price data for charting
What it consumes:
Redis Pub/Sub → prices:normalized:{sport}:{matchId}
Every price update gets stored
What it produces:
REST API endpoints:
- GET /charts/{matchId}/ohlc?interval=1m&from=&to=
- GET /charts/{matchId}/depth (order book depth history)
- GET /charts/{matchId}/trades (matched order history)
Data Storage (TimescaleDB recommended):
-- Price ticks table (hypertable)
CREATE TABLE price_ticks (
  time TIMESTAMPTZ NOT NULL,
  match_id TEXT NOT NULL,
  outcome_id TEXT NOT NULL,
  best_back DECIMAL(10,3),
  best_lay DECIMAL(10,3),
  consensus DECIMAL(10,3),
  volume DECIMAL(15,2)
);
SELECT create_hypertable('price_ticks', 'time');

-- OHLC continuous aggregates
CREATE MATERIALIZED VIEW ohlc_1m
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 minute', time) AS bucket,
  match_id,
  outcome_id,
  first(best_back, time) AS open,
  max(best_back) AS high,
  min(best_back) AS low,
  last(best_back, time) AS close,
  sum(volume) AS volume
FROM price_ticks
GROUP BY bucket, match_id, outcome_id;
Alternative: PostgreSQL with pg_partman (if TimescaleDB not available)
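To make the semantics of the 1-minute aggregate concrete, the sketch below reproduces the same bucketing in plain Python for a single match and outcome: open is the first price in the bucket, close the last, high and low the extremes, and volume the sum. It is an illustration only; the function name and the sample ticks are hypothetical, and in the service this work is done by the database.

```python
from datetime import datetime, timezone


def ohlc_1m(ticks):
    """Group (timestamp, price, volume) ticks into 1-minute OHLC bars."""
    bars = {}
    # Sort by time so that 'open' and 'close' mean first and last in the bucket.
    for ts, price, volume in sorted(ticks, key=lambda t: t[0]):
        bucket = ts.replace(second=0, microsecond=0)  # like time_bucket('1 minute', ts)
        bar = bars.get(bucket)
        if bar is None:
            bars[bucket] = {"open": price, "high": price, "low": price,
                            "close": price, "volume": volume}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price
            bar["volume"] += volume
    return bars


def at(minute: int, second: int) -> datetime:
    """Helper for building sample timestamps on an arbitrary day."""
    return datetime(2026, 1, 10, 12, minute, second, tzinfo=timezone.utc)


# Deliberately out of order to show that arrival order does not matter.
ticks = [
    (at(0, 5), 2.00, 10.0),
    (at(0, 40), 2.10, 5.0),
    (at(0, 20), 1.95, 7.0),
    (at(1, 10), 2.05, 3.0),
]
bars = ohlc_1m(ticks)
```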
5. Cash Out Calculator Service
Purpose: Calculate real-time cash out quotes for open positions
What it consumes:
User positions → From database (stake, odds, outcome)
Live prices → From Redis cache (current best back/lay)
What it produces:
REST API:
- GET /cashout/{positionId}/quote
- GET /cashout/user/{userId}/all
WebSocket:
- cashout:quote:updated (when prices change)
Cash Out Calculation (already documented in codebase):
// Assumed shape of an open position; outcomeId is the key into
// NormalizedPrice.outcomes for the selection that was backed.
interface Position {
  id: string;
  outcomeId: string;
  stake: number;
  odds: number; // decimal odds at which the back bet was placed
}

interface CashOutQuote {
  positionId: string;
  originalStake: number;
  originalOdds: number;
  currentOdds: number;
  cashOutValue: number;
  profitLoss: number;
  returnPercent: number;
}

function calculateCashOut(position: Position, currentPrice: NormalizedPrice): CashOutQuote {
  // For BACK positions: cash out by laying at the current price.
  // Formula: (Original Odds / Current Lay Odds) × Stake
  const currentLayOdds = currentPrice.outcomes[position.outcomeId].bestLay.odds;
  // If the lay odds have shortened below the original odds (the outcome is
  // now more likely to win), the cash out is profitable.
  const cashOutValue = (position.odds / currentLayOdds) * position.stake;
  return {
    positionId: position.id,
    originalStake: position.stake,
    originalOdds: position.odds,
    currentOdds: currentLayOdds,
    cashOutValue,
    profitLoss: cashOutValue - position.stake,
    returnPercent: ((cashOutValue - position.stake) / position.stake) * 100
  };
}
Cash Out Options:
| Option | Description | Implementation |
|---|---|---|
| Full Cash Out | Close 100% of position | Single calculation |
| Partial Cash Out | Close X% (25/50/75%) | Scale proportionally |
| Auto Cash Out | Trigger at target value | Background job monitors |
| Cash Out Lock | Guarantee minimum value | Early exit protection |
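A short worked example of the formula and the options in the table, written as a sketch rather than the service's code. The function names are hypothetical; the formula is the one stated above, with partial cash out scaled proportionally and auto cash out reduced to a threshold check.

```python
def cash_out_value(stake: float, original_odds: float,
                   current_lay_odds: float, fraction: float = 1.0) -> float:
    """(Original Odds / Current Lay Odds) x Stake, scaled for partial cash out."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if current_lay_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    return (original_odds / current_lay_odds) * stake * fraction


def should_auto_cash_out(current_value: float, target_value: float) -> bool:
    """Auto cash out fires once the quote reaches the user's target."""
    return current_value >= target_value


# Backed at 3.0 for a stake of 100; the price has since shortened to 2.0.
full = cash_out_value(100, 3.0, 2.0)
half = cash_out_value(100, 3.0, 2.0, fraction=0.5)
# Same bet, but the price has drifted out to 4.0: cashing out locks in a loss.
drifted = cash_out_value(100, 2.0, 4.0)
trigger = should_auto_cash_out(full, target_value=140)
```

In the first case the quote is 150, a profit of 50 on the 100 stake; a 50% partial cash out releases 75 and leaves half the position running.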
6. ML Prediction Service (Port 3105)
Purpose: Serve sport-specific ML models for match outcome predictions
What it consumes:
Match features → From Supabase (historical data)
Real-time odds → From Price Feed Aggregator
Model files → ONNX models from training pipeline
What it produces:
REST API:
- POST /predict (match features → probabilities)
- GET /models (list available models)
- GET /health (service health + model status)
Sport-Aware Model Registry:
import os

import onnxruntime


# Model discovery and loading
class ModelRegistry:
    def __init__(self):
        self.models = {}
        self._discover_models()

    def _discover_models(self):
        """Auto-discover models by sport"""
        for sport in ['football', 'cricket', 'tennis']:
            model_path = f"models/{sport}_ensemble.onnx"
            if os.path.exists(model_path):
                self.models[sport] = onnxruntime.InferenceSession(model_path)

    def predict(self, sport: str, features: dict) -> dict:
        if sport not in self.models:
            return {"error": f"No model for {sport}", "supported": list(self.models.keys())}
        # Run inference. `features` must map each ONNX input name to a NumPy array.
        session = self.models[sport]
        outputs = session.run(None, features)
        # Convert NumPy arrays to plain lists so the result is JSON-serializable.
        probabilities = [o.tolist() if hasattr(o, "tolist") else o for o in outputs]
        return {"probabilities": probabilities, "sport": sport}
Supported Sports:
| Sport | Model Status | Features Used | Accuracy |
|---|---|---|---|
| Football | ✅ Trained | Odds, ELO, Form, H2H | ~68% |
| Cricket | ✅ Trained | Odds, Venue, Toss, Form | ~65% |
| Tennis | ✅ Trained | Odds, Rankings, Surface, H2H | ~70% |
| Basketball | ❌ Not trained | - | - |
| Esports | ❌ Not trained | - | - |
Graceful Degradation:
When a sport is not supported, the service returns a structured response that allows Grok to provide analysis using its own intelligence:
{
"sport": "basketball",
"model_available": false,
"supported_sports": ["football", "cricket", "tennis"],
"fallback_suggestion": "Use odds-implied probabilities and Grok analysis"
}
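On the consuming side, a caller can branch on `model_available` and fall back to odds-implied probabilities, as the response suggests. The sketch below shows one way to do that. The function name, the `source` labels, and the sample odds are hypothetical, and it assumes a successful prediction carries a `probabilities` field as in the registry code above.

```python
def resolve_probabilities(ml_response: dict, decimal_odds: dict[str, float]) -> dict:
    """Use ML probabilities when present, else margin-free odds-implied ones."""
    if ml_response.get("model_available", True) and "probabilities" in ml_response:
        return {"source": "ml_model", "probabilities": ml_response["probabilities"]}
    # Fallback: implied probability is 1 / decimal odds, rescaled to sum to 1
    # so the bookmaker margin is removed proportionally.
    implied = {name: 1.0 / odds for name, odds in decimal_odds.items()}
    total = sum(implied.values())
    return {
        "source": "odds_implied",
        "probabilities": {name: p / total for name, p in implied.items()},
    }


unsupported = {
    "sport": "basketball",
    "model_available": False,
    "supported_sports": ["football", "cricket", "tennis"],
    "fallback_suggestion": "Use odds-implied probabilities and Grok analysis",
}
result = resolve_probabilities(unsupported, {"Lakers": 1.80, "Celtics": 2.10})
```

Tagging the result with its source lets the chat response state clearly whether a figure came from a trained model or from market prices.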
Existing API Integration (Already Built)
Oddspapi (PRIMARY DATA SOURCE)
Already integrated in oracle-platform:
- REST API: odds-aggregation.service.ts - Fetches odds from 40+ bookmakers
- WebSocket: oddspapi-websocket.service.ts - Real-time streaming
- Caching: 5 minute TTL for odds, 5 second TTL for live prices
Endpoints Available:
| Endpoint | Purpose | Already Using? |
|---|---|---|
| GET /v4/odds | Fetch odds for fixture | ✅ Yes |
| GET /v4/bookmakers | List bookmakers | ✅ Yes |
| GET /v4/fixtures | List fixtures | ✅ Yes |
| WebSocket /v4/ws | Real-time updates | ✅ Yes |
Bookmaker Sharpness (for Value Detection):
| Bookmaker | Sharpness Score | Use For |
|---|---|---|
| Pinnacle | 1.0 (reference) | Fair value baseline |
| Betfair Exchange | 0.95 | Market consensus |
| Bet365 | 0.7 | Popular market |
| William Hill | 0.65 | Soft book comparison |
Data Flow Architecture
Current Flow (Already Working)
CURRENT DATA FLOW (BUILT)

Oddspapi WS ──▶ oddspapi-websocket.service ──▶ Redis Cache
                            │                       │
                      Redis Pub/Sub ◀───────────────┘
                            ▼
        redis-match-subscriber.service (backend)
                            ▼
                        Socket.IO ──▶ Frontend
New Services Integration
NEW SERVICES INTEGRATION

Redis Pub/Sub (odds:updated:*)
       │
┌──────┴──────┬──────────────┬──────────────┐
│             │              │              │
Price Feed    Charts         Cash Out       Value Detection
Aggregator    Backend        Calculator     Service
│             │              │              │
prices:*      TimescaleDB    cashout:*      value:*
                      │
              AI Chat Assistant
Implementation Status (All Complete)
Build Order (Completed)
Phase 1 (Complete):
┌──────────────────────────┐
│ 1. Price Feed Aggregator │ ✅ Port 3100
└────────────┬─────────────┘
             ↓
Phase 2 (Complete - Parallel):
┌──────────────────┐  ┌──────────────────┐  ┌───────────────────┐
│ 2. Cash Out Calc │  │ 3. AI Value Det  │  │ 4. Charts Backend │
│ ✅ Port 3101     │  │ ✅ Port 3102     │  │ ✅ Port 3104      │
└──────────────────┘  └──────────────────┘  └───────────────────┘
             ↓
Phase 3 (Complete):
┌──────────────────────────┐  ┌──────────────────────────┐
│ 5. AI Chat Assistant     │  │ 6. ML Prediction Service │
│ ✅ Port 3103             │  │ ✅ Port 3105             │
└──────────────────────────┘  └──────────────────────────┘
Service Status
| Service | Port | Status | Location |
|---|---|---|---|
| Price Feed Aggregator | 3100 | ✅ COMPLETE | superskin/services/price-feed-aggregator/ |
| Cash Out Calculator | 3101 | ✅ COMPLETE | superskin/services/cash-out-calculator/ |
| AI Value Detection | 3102 | ✅ COMPLETE | superskin/services/ai-value-detection/ |
| AI Chat Assistant | 3103 | ✅ COMPLETE | superskin/services/ai-chat-assistant/ |
| Trading Charts | 3104 | ✅ COMPLETE | superskin/services/trading-charts/ |
| ML Prediction Service | 3105 | ✅ COMPLETE | superskin/services/ml-prediction-service/ |
Current Infrastructure
Deployed Components
| Component | Status | Configuration |
|---|---|---|
| Redis | ✅ Running | Port 6380 |
| TimescaleDB | ✅ Running | Port 5433 |
| PostgreSQL (Supabase) | ✅ Running | IceCrystal project |
| Grok API | ✅ Configured | Primary LLM |
| Groq API | ✅ Configured | Fallback LLM |
Environment Variables (Configured)
# LLM Providers
XAI_API_KEY=... # Grok (primary)
GROQ_API_KEY=... # Groq (fallback)
OPENAI_API_KEY=... # OpenAI (backup)
# Infrastructure
REDIS_URL=redis://localhost:6380
TIMESCALE_URL=postgres://localhost:5433/superskin
SUPABASE_URL=https://hchdnajxifkorhoqqozq.supabase.co
SUPABASE_SERVICE_KEY=...
# Oracle Platform
ODDSPAPI_API_KEY=...
Actual File Structure
superskin/
├── services/
│   ├── price-feed-aggregator/     # Port 3100 - TypeScript/Express
│   │   ├── src/
│   │   │   ├── routes/
│   │   │   ├── services/
│   │   │   └── index.ts
│   │   └── package.json
│   │
│   ├── cash-out-calculator/       # Port 3101 - TypeScript/Express
│   │   ├── src/
│   │   │   ├── routes/
│   │   │   ├── services/
│   │   │   └── index.ts
│   │   └── package.json
│   │
│   ├── ai-value-detection/        # Port 3102 - Python/FastAPI
│   │   ├── app/
│   │   │   ├── routers/
│   │   │   ├── services/
│   │   │   └── main.py
│   │   └── requirements.txt
│   │
│   ├── ai-chat-assistant/         # Port 3103 - Python/FastAPI
│   │   ├── app/
│   │   │   ├── routers/
│   │   │   ├── services/
│   │   │   ├── llm/               # Grok, Groq, Ollama, OpenAI
│   │   │   ├── tools/             # Function calling tools
│   │   │   └── main.py
│   │   └── requirements.txt
│   │
│   ├── trading-charts/            # Port 3104 - TypeScript/Express
│   │   ├── src/
│   │   │   ├── routes/
│   │   │   ├── services/
│   │   │   └── index.ts
│   │   └── package.json
│   │
│   └── ml-prediction-service/     # Port 3105 - Python/FastAPI
│       ├── app/
│       │   ├── routers/
│       │   ├── services/
│       │   ├── models/            # ONNX model files
│       │   └── main.py
│       └── requirements.txt
│
├── ml-training/                   # Offline training pipeline
│   ├── data_loader.py
│   ├── feature_engineering.py
│   ├── train_models.py
│   └── export_onnx.py
│
└── docker-compose.yml             # Full stack deployment
Current Focus: ML Training Pipeline
The 6 services are complete. Current work focuses on:
- Training sport-specific models for football, cricket, tennis
- Expanding data sources via Forsyt Data Machine
- Improving Grok integration with better prompts and tools
- Adding more sports to ML pipeline (basketball, esports)
This document reflects the current state of SuperSkin services as of January 2026.