Skip to content

shubhamgupta27695/financial-trade-analytics-engine

Repository files navigation

Financial Trade Analytics Engine 🏦

A production-grade SQL analytics project simulating a real trading firm's data warehouse. Built to demonstrate expert-level SQL skills across P&L analysis, performance ranking, rolling returns, and anomaly detection — using concepts directly applicable to financial services and fintech environments.


Project Overview

This project replicates the kind of analytics work done inside trading desks at firms translating raw trade data into actionable business intelligence using pure SQL.

The project is built on a 4-table schema simulating a trading firm's data warehouse, with 60+ realistic trades across Equities, FX, and Commodities desks spanning 6 months (Jan–Jun 2024).


Schema Design

traders          → trader master data (desk, region, seniority)
instruments      → tradeable assets (stocks, forex, commodities, ETFs)
trades           → core fact table — all executed transactions
daily_positions  → end-of-day position snapshots

Entity Relationship:

  • Each trade belongs to one trader and one instrument
  • Daily positions snapshot each trader's net exposure per instrument per day
  • Schema follows a star model — trades as the central fact table

Modules

Module 1 — Schema & Sample Data

  • 4 normalized tables with referential integrity
  • 10 traders across APAC, EMEA, Americas regions
  • 10 instruments: stocks (AAPL, MSFT, JPM), forex (EUR/USD, GBP/USD), commodities (Gold, Crude Oil), ETFs
  • 60 trades across 6 months with realistic P&L, commissions, and 3 seeded anomalies
  • 30 daily position snapshots with unrealized P&L

Module 2 — P&L Analysis

  • Total P&L per trader with win/loss breakdown
  • Win rate calculation per trader
  • P&L by instrument type (Stock vs Forex vs Commodity)
  • Monthly P&L trend with cumulative running total
  • Best and worst trade per trader using ROW_NUMBER()
  • Desk contribution percentage using SUM() OVER()
  • Trader P&L vs desk average using AVG() OVER(PARTITION BY)
  • Profit factor per trader (gross profit / gross loss) with performance rating
  • Bonus: Full P&L dashboard combining all metrics in a single query

Module 3 — Trader Performance Rankings

  • Firm-wide ranking: RANK vs DENSE_RANK vs ROW_NUMBER comparison
  • Within-desk ranking using PARTITION BY
  • NTILE(4) quartile segmentation: Elite / Strong / Average / Underperformer
  • Monthly rank tracker — how each trader's rank changed over time
  • Win rate ranking with minimum trade filter (HAVING clause)
  • Rank movement detection using LAG() and LEAD() — "Moved Up / Moved Down"
  • NTILE segmentation within each desk separately
  • Composite score ranking: weighted KPI index (50% P&L + 30% win rate + 20% volume)
  • Bonus: Full ranking dashboard with percentile rank

Module 4 — Rolling Returns & Moving Averages

  • Daily P&L per trader (foundation for all rolling windows)
  • Cumulative P&L over time using UNBOUNDED PRECEDING
  • 3-trade and 5-trade rolling average using ROWS BETWEEN N PRECEDING
  • Monthly 2-month and 3-month moving averages with MoM % change
  • Firm-wide moving averages with peak and trough tracking
  • Rolling volatility using STDDEV() as a window function
  • Sharpe-style risk-adjusted ratio (return / volatility)
  • Performance acceleration — detecting improving vs declining traders
  • Bonus: Full rolling analytics dashboard

Module 5 — Anomaly Detection

  • Mean ± 2 standard deviation threshold flagging
  • Z-score calculation for every trade (global baseline)
  • Per-trader z-score — each trader judged against their own history
  • Large notional value detection (price × quantity outliers)
  • Consecutive loss streak detection using LAG() and running counters
  • Price anomaly detection — off-market / fat finger trade identification
  • Multi-dimensional composite anomaly score (P&L + quantity + notional)
  • Daily risk summary report — aggregated anomaly counts per trading day
  • Bonus: Full anomaly dashboard with CRITICAL / HIGH / MEDIUM / NORMAL risk levels

SQL Concepts Demonstrated

Category Concepts
Window Functions RANK, DENSE_RANK, ROW_NUMBER, NTILE, PARTITION BY
Aggregate Windows SUM OVER, AVG OVER, MAX OVER, MIN OVER, STDDEV OVER
Navigation Functions LAG, LEAD, FIRST_VALUE, LAST_VALUE
Frame Clauses ROWS BETWEEN UNBOUNDED PRECEDING, N PRECEDING AND CURRENT ROW
CTEs Single CTE, chained CTEs, multiple CTEs in one query
Aggregations GROUP BY, HAVING, CASE WHEN inside aggregates
Statistical Z-score, standard deviation, mean, composite scoring
Data Quality NULLIF, ABS, ROUND, division-by-zero protection
Date Functions DATE_FORMAT, date-based GROUP BY, time period aggregation
Joins INNER JOIN, CROSS JOIN for stats broadcasting

Key Business Insights from the Data

  • Top performer: Michael Chen (Equities, Americas) — highest net P&L and profit factor
  • Most consistent: Sarah Collins (FX, EMEA) — lowest volatility relative to returns
  • Anomalies detected: Trade IDs 51, 55, 60 — flagged as CRITICAL by composite anomaly score
    • Trade 51: Unusual quantity (1,000 shares vs avg ~250)
    • Trade 55: Extreme notional value ($384,000+ on GBP/USD)
    • Trade 60: Large loss on Gold — z-score > 2 vs trader history
  • Best month: June 2024 — driven by David Kim's GBP/USD position
  • Desk breakdown: FX desk contributed ~45% of firm-wide P&L

How to Run

Option 1 — DB Fiddle (no install)

  1. Go to db-fiddle.com
  2. Select MySQL 8 from the database dropdown
  3. Paste 01_schema_and_data.sql into the left (Schema) panel
  4. Paste any query from modules 2–5 into the right (Query) panel
  5. Click Run

Option 2 — Local MySQL / MS SQL Server

-- MySQL
SOURCE /path/to/01_schema_and_data.sql;

-- MS SQL Server (SSMS)
-- Open file and execute — schema is MSSQL compatible with GO statements intact

Files

├── 01_schema_and_data.sql          -- Tables + 60 rows of sample data
├── 02_pnl_analysis.sql             -- 8 P&L queries + bonus dashboard
├── 03_trader_rankings.sql          -- 8 ranking queries + bonus dashboard
├── 04_rolling_returns.sql          -- 8 rolling/moving average queries + bonus
├── 05_anomaly_detection.sql        -- 8 anomaly detection queries + bonus dashboard
└── README.md

About

Built by Shubham Gupta

This project reflects the type of analytics work I've delivered in real financial services and fintech environments — from trade reporting at StoneX to spend analytics at American Express.

📧 Connect on LinkedIn


Tags

SQL MySQL MS SQL Server Financial Analytics Window Functions CTEs Anomaly Detection Trading Analytics Fintech Data Analytics Portfolio Project

About

Expert SQL analytics project — P&L, rankings, rolling returns & anomaly detection for a simulated trading firm

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors