A production-grade SQL analytics project simulating a real trading firm's data warehouse. Built to demonstrate expert-level SQL skills across P&L analysis, performance ranking, rolling returns, and anomaly detection — using concepts directly applicable to financial services and fintech environments.
This project replicates the kind of analytics work done inside trading desks at firms translating raw trade data into actionable business intelligence using pure SQL.
The project is built on a 4-table schema simulating a trading firm's data warehouse, with 60+ realistic trades across Equities, FX, and Commodities desks spanning 6 months (Jan–Jun 2024).
traders → trader master data (desk, region, seniority)
instruments → tradeable assets (stocks, forex, commodities, ETFs)
trades → core fact table — all executed transactions
daily_positions → end-of-day position snapshots
Entity Relationship:
- Each trade belongs to one trader and one instrument
- Daily positions snapshot each trader's net exposure per instrument per day
- Schema follows a star model — trades as the central fact table
- 4 normalized tables with referential integrity
- 10 traders across APAC, EMEA, Americas regions
- 10 instruments: stocks (AAPL, MSFT, JPM), forex (EUR/USD, GBP/USD), commodities (Gold, Crude Oil), ETFs
- 60 trades across 6 months with realistic P&L, commissions, and 3 seeded anomalies
- 30 daily position snapshots with unrealized P&L
- Total P&L per trader with win/loss breakdown
- Win rate calculation per trader
- P&L by instrument type (Stock vs Forex vs Commodity)
- Monthly P&L trend with cumulative running total
- Best and worst trade per trader using ROW_NUMBER()
- Desk contribution percentage using SUM() OVER()
- Trader P&L vs desk average using AVG() OVER(PARTITION BY)
- Profit factor per trader (gross profit / gross loss) with performance rating
- Bonus: Full P&L dashboard combining all metrics in a single query
- Firm-wide ranking: RANK vs DENSE_RANK vs ROW_NUMBER comparison
- Within-desk ranking using PARTITION BY
- NTILE(4) quartile segmentation: Elite / Strong / Average / Underperformer
- Monthly rank tracker — how each trader's rank changed over time
- Win rate ranking with minimum trade filter (HAVING clause)
- Rank movement detection using LAG() and LEAD() — "Moved Up / Moved Down"
- NTILE segmentation within each desk separately
- Composite score ranking: weighted KPI index (50% P&L + 30% win rate + 20% volume)
- Bonus: Full ranking dashboard with percentile rank
- Daily P&L per trader (foundation for all rolling windows)
- Cumulative P&L over time using UNBOUNDED PRECEDING
- 3-trade and 5-trade rolling average using ROWS BETWEEN N PRECEDING
- Monthly 2-month and 3-month moving averages with MoM % change
- Firm-wide moving averages with peak and trough tracking
- Rolling volatility using STDDEV() as a window function
- Sharpe-style risk-adjusted ratio (return / volatility)
- Performance acceleration — detecting improving vs declining traders
- Bonus: Full rolling analytics dashboard
- Mean ± 2 standard deviation threshold flagging
- Z-score calculation for every trade (global baseline)
- Per-trader z-score — each trader judged against their own history
- Large notional value detection (price × quantity outliers)
- Consecutive loss streak detection using LAG() and running counters
- Price anomaly detection — off-market / fat finger trade identification
- Multi-dimensional composite anomaly score (P&L + quantity + notional)
- Daily risk summary report — aggregated anomaly counts per trading day
- Bonus: Full anomaly dashboard with CRITICAL / HIGH / MEDIUM / NORMAL risk levels
| Category | Concepts |
|---|---|
| Window Functions | RANK, DENSE_RANK, ROW_NUMBER, NTILE, PARTITION BY |
| Aggregate Windows | SUM OVER, AVG OVER, MAX OVER, MIN OVER, STDDEV OVER |
| Navigation Functions | LAG, LEAD, FIRST_VALUE, LAST_VALUE |
| Frame Clauses | ROWS BETWEEN UNBOUNDED PRECEDING, N PRECEDING AND CURRENT ROW |
| CTEs | Single CTE, chained CTEs, multiple CTEs in one query |
| Aggregations | GROUP BY, HAVING, CASE WHEN inside aggregates |
| Statistical | Z-score, standard deviation, mean, composite scoring |
| Data Quality | NULLIF, ABS, ROUND, division-by-zero protection |
| Date Functions | DATE_FORMAT, date-based GROUP BY, time period aggregation |
| Joins | INNER JOIN, CROSS JOIN for stats broadcasting |
- Top performer: Michael Chen (Equities, Americas) — highest net P&L and profit factor
- Most consistent: Sarah Collins (FX, EMEA) — lowest volatility relative to returns
- Anomalies detected: Trade IDs 51, 55, 60 — flagged as CRITICAL by composite anomaly score
- Trade 51: Unusual quantity (1,000 shares vs avg ~250)
- Trade 55: Extreme notional value ($384,000+ on GBP/USD)
- Trade 60: Large loss on Gold — z-score > 2 vs trader history
- Best month: June 2024 — driven by David Kim's GBP/USD position
- Desk breakdown: FX desk contributed ~45% of firm-wide P&L
- Go to db-fiddle.com
- Select MySQL 8 from the database dropdown
- Paste
01_schema_and_data.sqlinto the left (Schema) panel - Paste any query from modules 2–5 into the right (Query) panel
- Click Run
-- MySQL
SOURCE /path/to/01_schema_and_data.sql;
-- MS SQL Server (SSMS)
-- Open file and execute — schema is MSSQL compatible with GO statements intact├── 01_schema_and_data.sql -- Tables + 60 rows of sample data
├── 02_pnl_analysis.sql -- 8 P&L queries + bonus dashboard
├── 03_trader_rankings.sql -- 8 ranking queries + bonus dashboard
├── 04_rolling_returns.sql -- 8 rolling/moving average queries + bonus
├── 05_anomaly_detection.sql -- 8 anomaly detection queries + bonus dashboard
└── README.md
Built by Shubham Gupta
This project reflects the type of analytics work I've delivered in real financial services and fintech environments — from trade reporting at StoneX to spend analytics at American Express.
📧 Connect on LinkedIn
SQL MySQL MS SQL Server Financial Analytics Window Functions CTEs Anomaly Detection Trading Analytics Fintech Data Analytics Portfolio Project