Formula 1 Competitive Performance Analytics

Written by:

Scenario

As part of a strategic initiative, Formula 1 seeks to deepen competitive analysis and improve storytelling for fans, broadcasters, and partners. By standardizing official timing, telemetry, and weather data into analytics-ready datasets, we aim to provide a 360-degree view of a Grand Prix weekend, covering car pace, tire behavior, pit-stop execution, qualifying performance, and strategy outcomes.

Objectives

  • Track sustainable race pace by tire compound and stint length across the grid
  • Quantify tire degradation and effective pit-lane time loss to explain strategy choices
  • Evaluate undercut/overcut and the advantages of pitting under SC/VSC conditions
  • Benchmark teammates like-for-like to separate driver execution from car pace
  • Highlight qualifying execution via sector-based “ideal lap” vs actual best lap
  • Deliver concise visuals and a one-page insights brief suitable for broadcast and digital

This initiative supports F1’s broader goals of enriching fan understanding, enabling data-driven storytelling, and enhancing the transparency and competitiveness of the sport.
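
As an illustration of the sector-based "ideal lap" objective, the following is a minimal pandas sketch. The column names (`driver`, `s1`, `s2`, `s3`, `lap_time`), the helper `ideal_lap_gap`, and the sample values are assumptions made up for this example, not the project's actual schema or data.

```python
import pandas as pd


def ideal_lap_gap(laps: pd.DataFrame) -> pd.DataFrame:
    """Per driver: best actual lap, ideal lap (sum of best sectors), and the gap.

    Assumes one row per timed lap with sector and lap times in seconds.
    """
    sectors = ["s1", "s2", "s3"]
    # Drop laps with a missing sector or lap time (e.g. aborted runs).
    valid = laps.dropna(subset=sectors + ["lap_time"])
    best_sectors = valid.groupby("driver")[sectors].min()
    out = pd.DataFrame({
        "best_lap": valid.groupby("driver")["lap_time"].min(),
        "ideal_lap": best_sectors.sum(axis=1),
    })
    out["gap_to_ideal"] = out["best_lap"] - out["ideal_lap"]
    return out.sort_values("gap_to_ideal")


# Made-up qualifying laps for two hypothetical drivers.
laps = pd.DataFrame({
    "driver": ["AAA", "AAA", "BBB", "BBB"],
    "s1": [28.1, 28.0, 28.3, 28.2],
    "s2": [35.4, 35.6, 35.5, 35.5],
    "s3": [26.9, 26.8, 27.0, 26.9],
})
laps["lap_time"] = laps[["s1", "s2", "s3"]].sum(axis=1)

result = ideal_lap_gap(laps)
print(result)
```

A gap near zero means the driver strung their best sectors together on one lap; a larger gap suggests time was available but not realized in a single run.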

Data Structure

  • Architecture: Landing (raw) → Staging (clean, dedupe, units) → Warehouse (star schema) → BI (Power BI semantic model).
  • Facts & grain:
    • F_Orders = 1 row per order–model–customer–order_date;
    • F_Deliveries = 1 row per delivered aircraft–delivery_date;
    • F_Backlog_Snapshot = 1 row per model–customer–month_end;
    • F_MarketTraffic_Snapshot = 1 row per region/country–year.
  • Role-playing dates: D_Date is referenced via order_date_key, delivery_date_key, and snapshot_date_key.
  • Conformed dimensions: D_Date, D_Model (family, variant, seats, range_nm, MTOW, cabin_key), D_Customer (name, group, status, geo_key), D_Geography (country, region), D_Engine (maker, model, thrust), D_Cabin (cabin_type). Optional bridges: Model↔Engine and Customer↔Region for many-to-many relationships.
  • Keys, SCD, quality: Integer surrogate keys + preserved business keys; SCD2 on Customer (and Model if needed); FK constraints; standard codes (IATA/ICAO, ISO-country); alias mapping tables; incremental loads by date_key, late-arriving handling, unit normalization.
  • Metrics & semantics: Measures for Orders, Deliveries, Backlog Units, YoY/MoM, CAGR, Fleet Growth; Date intelligence (DAX) from D_Date; yearly partitions; data dictionary + lineage; refresh cadence daily/weekly.

Business Questions Solved

  • Who sustains the fastest median race pace by tire compound on green-flag laps?
  • What are the per-stint degradation slopes (s/lap) by driver and compound?
  • Undercut vs overcut: in battles where the cars are within ±2 s of each other, which strategy gains more time after 3 laps?
  • What is the effective pit-lane time loss, and which teams execute most consistently?
  • How much time is saved by pitting under SC/VSC vs green-flag stops?
  • In qualifying, how much time did each driver leave on the table versus the ideal lap (sum of best sectors)?
  • Teammate like-for-like delta: average gap in the first 10 laps of a stint on the same compound.
  • DRS impact: main-straight average speed with DRS on vs off on representative laps.
  • Track evolution: median sector changes FP1 → FP2 → FP3 → Q to inform run sequencing.
  • Does stop count (1/2/3) correlate with finishing position for this track archetype?
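
The per-stint degradation question above can be approached with a simple linear fit of lap time against tire age. The sketch below is one possible approach, not necessarily the method used in this project. It assumes a tidy table of green-flag laps with hypothetical columns `driver`, `stint`, `compound`, `tyre_life`, and `lap_time_s`, and it does not correct for fuel burn, so the fitted slope blends tire wear with the car getting lighter.

```python
import numpy as np
import pandas as pd


def degradation_slopes(laps: pd.DataFrame, min_laps: int = 5) -> pd.DataFrame:
    """Fit lap_time_s ~ tyre_life per (driver, stint, compound).

    Returns one row per stint with the slope in seconds per lap.
    Stints shorter than min_laps are skipped as too noisy to fit.
    """
    rows = []
    for (driver, stint, compound), g in laps.groupby(["driver", "stint", "compound"]):
        g = g.dropna(subset=["tyre_life", "lap_time_s"])
        if len(g) < min_laps:
            continue
        x = g["tyre_life"].to_numpy(dtype=float)
        y = g["lap_time_s"].to_numpy(dtype=float)
        # Degree-1 polyfit returns [slope, intercept].
        slope, intercept = np.polyfit(x, y, 1)
        rows.append({
            "driver": driver,
            "stint": stint,
            "compound": compound,
            "laps_used": len(g),
            "slope_s_per_lap": slope,
        })
    return pd.DataFrame(rows)


# Synthetic stints: lap time grows linearly with tire age (made-up numbers).
life = np.arange(1, 7)
laps = pd.concat([
    pd.DataFrame({"driver": "AAA", "stint": 1, "compound": "MEDIUM",
                  "tyre_life": life, "lap_time_s": 90.0 + 0.08 * life}),
    pd.DataFrame({"driver": "BBB", "stint": 1, "compound": "HARD",
                  "tyre_life": life, "lap_time_s": 90.5 + 0.03 * life}),
], ignore_index=True)

slopes = degradation_slopes(laps)
print(slopes)
```

Grouping the result by `compound` and taking the median slope gives a grid-wide view; comparing slopes between teammates on the same compound keeps the comparison like-for-like.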

Tools & Technologies Used

  • Python (3.10+), VS Code, Jupyter Notebooks
  • FastF1 → community library interfacing with timing/telemetry; on-disk cache
  • pandas, NumPy → data cleaning, joins, feature engineering
  • Matplotlib, Plotly → broadcast-ready charts and explanatory visuals
  • PyArrow/Parquet → fast, reproducible columnar storage
  • Tableau → dashboards and interactive data visualization
  • Git/GitHub → versioning and collaboration
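
To show how the FastF1 and pandas pieces of this stack might fit together, here is a hedged sketch. `load_race_laps` and `tidy_green_flag_laps` are hypothetical helper names, the cache directory is an arbitrary choice, and treating a `TrackStatus` value of exactly `"1"` as a green-flag lap is an assumption about FastF1's lap data that should be checked against its documentation. The loader needs FastF1 installed and network access on first use; the cleaning step works on any DataFrame with the same columns.

```python
import os

import pandas as pd


def load_race_laps(year: int, gp: str, cache_dir: str = "f1_cache") -> pd.DataFrame:
    """Load race laps through FastF1 with an on-disk cache (hypothetical helper)."""
    import fastf1  # imported lazily so the cleaning step below runs without it

    os.makedirs(cache_dir, exist_ok=True)
    fastf1.Cache.enable_cache(cache_dir)
    session = fastf1.get_session(year, gp, "R")
    session.load()
    return session.laps


def tidy_green_flag_laps(laps: pd.DataFrame) -> pd.DataFrame:
    """Keep timed green-flag laps without a pit entry or exit, in seconds."""
    out = pd.DataFrame({
        "driver": laps["Driver"],
        "lap_number": laps["LapNumber"],
        "stint": laps["Stint"],
        "compound": laps["Compound"],
        "tyre_life": laps["TyreLife"],
        "lap_time_s": laps["LapTime"].dt.total_seconds(),
    })
    # Assumption: "1" means the track was clear for the whole lap.
    green = laps["TrackStatus"].astype(str) == "1"
    # In-laps and out-laps carry a pit timestamp; exclude them from pace analysis.
    on_track = laps["PitInTime"].isna() & laps["PitOutTime"].isna()
    keep = green & on_track & out["lap_time_s"].notna()
    return out[keep].reset_index(drop=True)


# Small made-up frame using the same column names as the FastF1 lap data.
no_pit = pd.Series([pd.NaT] * 3, dtype="timedelta64[ns]")
sample = pd.DataFrame({
    "Driver": ["AAA"] * 3,
    "LapNumber": [1, 2, 3],
    "Stint": [1, 1, 1],
    "Compound": ["SOFT"] * 3,
    "TyreLife": [1, 2, 3],
    "LapTime": pd.to_timedelta([91.5, 90.2, 95.0], unit="s"),
    "TrackStatus": ["1", "1", "4"],
    "PitInTime": no_pit,
    "PitOutTime": no_pit,
})

clean = tidy_green_flag_laps(sample)
print(clean)
```

With real data, the call would look like `tidy_green_flag_laps(load_race_laps(year, gp))`, and the tidy frame could then be written to Parquet with `DataFrame.to_parquet` for reuse across notebooks.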

Business Impact & Insights

  • Enhanced broadcast storytelling: Clear visuals and metrics (degradation curves, pit-loss distributions, ideal-lap deltas) explain strategy calls in real time and deepen fan engagement.
  • Transparent competitive analysis: Standardized pace and strategy KPIs enable apples-to-apples comparisons across drivers, teams, and sessions.
  • Faster decision support: Pre-quantified break-even points for undercut/overcut and SC/VSC windows reduce reaction time for live analysis segments and digital products.
  • Driver & team benchmarking: Like-for-like comparisons separate execution from car pace, informing data-driven narratives and season-long story arcs.
  • Operational efficiency: Cached ingestion and Parquet workflows accelerate iteration, enabling consistent, race-over-race insights and scalable content for F1’s digital platforms.

View Code & Dashboard
