SUPER SELECTOR Algorithm

Statistical Unified Player Evaluation and Ranking SELECTOR

Our predicted XI algorithm combines multiple data sources to generate optimal team compositions:

Scoring Components

  • Base Classification (0-30 pts) - Role-based scoring for position fit
  • Performance Tags - Phase-specific tags (PP_ELITE, DEATH_SPECIALIST, etc.)
  • Derived Metrics - boundary%, consistency_index, death_dot_pct
  • Price Tier Bonus (5-15%) - Investment level consideration
  • Variety Optimization - LHB/RHB balance, spin/pace mix

Hard Constraints

  • C1: Captain cannot be Impact Player
  • C2: Maximum 4 overseas players
  • C3: Minimum 20 overs bowling coverage
  • C4: At least 1 wicketkeeper
  • C5: At least 1 spinner
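
The five hard constraints can be checked mechanically before any scoring is applied. The following is a minimal sketch, not the SUPER SELECTOR implementation: the `Player` fields, the `violated_constraints` helper, and the decision to evaluate C2-C5 on the starting XI only are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Player:
    # Hypothetical fields; the real schema is not described in this document.
    name: str
    overseas: bool = False
    keeper: bool = False
    spinner: bool = False
    overs: int = 0  # overs the player is expected to bowl


def violated_constraints(xi, captain, impact_player):
    """Return the IDs of the hard constraints (C1-C5) that a selection breaks."""
    violations = []
    if captain == impact_player:                  # C1: captain cannot be Impact Player
        violations.append("C1")
    if sum(p.overseas for p in xi) > 4:           # C2: maximum 4 overseas players
        violations.append("C2")
    if sum(p.overs for p in xi) < 20:             # C3: minimum 20 overs of bowling
        violations.append("C3")
    if not any(p.keeper for p in xi):             # C4: at least 1 wicketkeeper
        violations.append("C4")
    if not any(p.spinner for p in xi):            # C5: at least 1 spinner
        violations.append("C5")
    return violations


xi = (
    [Player(f"bat{i}") for i in range(4)]
    + [Player("keeper", keeper=True)]
    + [Player("ar1", overseas=True, overs=4)]
    + [Player("spin1", spinner=True, overs=4)]
    + [Player(f"pace{i}", overseas=(i < 2), overs=4) for i in range(4)]
)
print(violated_constraints(xi, captain="bat0", impact_player="bench1"))  # -> []
```

Returning the list of violated IDs, rather than a single boolean, makes it easier to report why a candidate XI was rejected.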

PFF-Inspired Grading

Process-over-outcome evaluation, adapted from Pro Football Focus

Pro Football Focus revolutionized NFL analytics by grading every play. We adapt their methodology:

Key Concepts

  • Ball-by-Ball Grading - Evaluate each delivery on a -2 to +2 scale
  • Process Over Outcome - Good decisions that fail still get positive grades
  • Context Adjustments - Situation, opposition, and conditions matter
  • WAR (Wins Above Replacement) - Aggregate value metric
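
As an illustration of how per-delivery grades might roll up into a player-level figure, the sketch below averages grades on the -2 to +2 scale. The `mean_grades` helper and the sample grades are hypothetical; how a grade is actually assigned to a delivery (the process-over-outcome judgment and the context adjustments) is not specified here and is not modeled.

```python
from collections import defaultdict

MIN_GRADE, MAX_GRADE = -2.0, 2.0


def mean_grades(graded_balls):
    """graded_balls: iterable of (player, grade) pairs. Returns {player: mean grade}."""
    totals = defaultdict(lambda: [0.0, 0])  # player -> [sum of grades, balls graded]
    for player, grade in graded_balls:
        if not MIN_GRADE <= grade <= MAX_GRADE:
            raise ValueError(f"grade {grade} outside [{MIN_GRADE}, {MAX_GRADE}]")
        totals[player][0] += grade
        totals[player][1] += 1
    return {player: s / n for player, (s, n) in totals.items()}


# Hypothetical grades: a sound shot that is caught can still score above zero,
# because the grade reflects the decision rather than the result.
sample = [("A", 1.0), ("A", -0.5), ("A", 2.0), ("B", 0.0), ("B", -1.0)]
print(mean_grades(sample))
```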

Cricket Applications

  • [P0] Batter grading vs bowling type and phase
  • [P0] Bowler grading by delivery outcome quality
  • [P1] Fielding impact assessment
  • [P2] Captain decision grading
PFF Research Document (Internal)

KenPom Efficiency Metrics

Tempo-free statistics from college basketball analytics

Ken Pomeroy's basketball ratings remove pace from the equation. We apply similar concepts:

Key Concepts

  • Adjusted Efficiency - Opponent-normalized performance
  • Tempo-Free Stats - Per-possession (or per-ball) metrics
  • Four Factors - Decompose performance into components
  • Strength of Schedule - Quality of opposition faced

Cricket Applications

  • [P0] Four Factors: Boundary%, Dot Ball%, Extras, Bowling Changes
  • [P0] Venue Park Factors: Adjust for pitch and ground size
  • [P1] Opposition Strength Index: Weight by opponent quality
  • [P1] Adjusted Strike Rate: SR normalized by context
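
The tempo-free idea translates to cricket by expressing everything per ball faced rather than per innings. A minimal sketch follows, assuming the input is a list of runs scored off the bat on each legal delivery and that every 4 or 6 counts as a boundary; the `per_ball_rates` name is hypothetical, and extras and bowling changes are not modeled.

```python
def per_ball_rates(runs_off_bat):
    """Per-ball batting rates from a list of runs scored on each legal delivery."""
    balls = len(runs_off_bat)
    if balls == 0:
        raise ValueError("no deliveries supplied")
    boundaries = sum(1 for r in runs_off_bat if r in (4, 6))
    dots = sum(1 for r in runs_off_bat if r == 0)
    return {
        "boundary_pct": 100 * boundaries / balls,
        "dot_pct": 100 * dots / balls,
        "strike_rate": 100 * sum(runs_off_bat) / balls,
    }


print(per_ball_rates([0, 1, 4, 0, 6, 2, 0, 1, 4, 0]))
# -> {'boundary_pct': 30.0, 'dot_pct': 40.0, 'strike_rate': 180.0}
```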
KenPom Research Document (Internal)

Player Clustering (K-Means V2)

Archetype-based player classification

We use K-means clustering to identify natural player archetypes:

Batter Archetypes

  • EXPLOSIVE_OPENER - High SR, PP aggression, boundary-heavy
  • PLAYMAKER - Consistent scoring, adaptable approach
  • ANCHOR - Low dot%, innings builder, lower SR
  • MIDDLE_ORDER - Middle overs specialist, rotation focus
  • FINISHER - Death overs specialist, high SR at end

Bowler Archetypes

  • WORKHORSE - Consistent economy, regular overs
  • NEW_BALL_SPECIALIST - PP wickets, swing/seam
  • DEATH_SPECIALIST - Low death economy, yorker execution
  • WICKET_TAKER - High wickets, aggressive approach
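
To show the mechanics of K-means, here is a small pure-Python sketch of Lloyd's algorithm run on two hypothetical batter features (strike rate, dot ball %). The feature choice, the sample values, and seeding the centroids with the first k points are assumptions made for a deterministic illustration; a production pipeline would normally standardize features and use a library implementation with better initialization.

```python
import math


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm. Centroids are seeded with the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so the run has converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical (strike_rate, dot_pct) rows: three aggressive batters, three accumulators.
batters = [(175, 30), (168, 33), (180, 28), (125, 38), (130, 36), (120, 40)]
labels, centroids = kmeans(batters, k=2)
print(labels)  # -> [0, 0, 0, 1, 1, 1]
```

The algorithm only returns numeric cluster labels. Attaching names such as ANCHOR or FINISHER to a cluster is a separate interpretive step based on the centroid values.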
Creative Archetype Descriptions (Internal)

CricPom: Novel Composite Metrics

KenPom-for-Cricket adjusted rating system

CricPom adapts college basketball's KenPom methodology to T20 cricket, producing opponent-adjusted, venue-neutral player ratings that account for tournament quality and conditions similarity.

Core Adjusted Metrics

  • AdjBRR (Adjusted Batting Run Rate) - Batting run rate adjusted for bowling quality faced, venue park factor, and match context. Formula: AdjBRR = raw_RR × (league_avg_bowling / opponent_bowling_quality) × venue_factor
  • AdjBE (Adjusted Bowling Economy) - Economy rate adjusted for batting quality faced and venue. Lower is better. Formula: AdjBE = raw_econ × (league_avg_batting / opponent_batting_quality) × venue_factor
  • CEM (Composite Efficiency Metric) - All-rounder evaluation combining AdjBRR and AdjBE into a single efficiency score. Formula: CEM = w₁·AdjBRR_percentile + w₂·(1 - AdjBE_percentile)
  • OSI (Opponent Strength Index) - Weighted average quality of opponents faced, used as the adjustment denominator in all CricPom ratings
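
The three formulas above translate directly into code. This is a sketch under stated assumptions rather than the TKT-190 implementation: the function names are hypothetical, percentiles are taken as fractions in [0, 1], the default weights of 0.5 are placeholders, and the example treats the quality inputs as economy/run-rate style figures, which the formulas do not specify.

```python
def adj_brr(raw_rr, league_avg_bowling, opponent_bowling_quality, venue_factor=1.0):
    """AdjBRR = raw_RR x (league_avg_bowling / opponent_bowling_quality) x venue_factor."""
    return raw_rr * (league_avg_bowling / opponent_bowling_quality) * venue_factor


def adj_be(raw_econ, league_avg_batting, opponent_batting_quality, venue_factor=1.0):
    """AdjBE = raw_econ x (league_avg_batting / opponent_batting_quality) x venue_factor."""
    return raw_econ * (league_avg_batting / opponent_batting_quality) * venue_factor


def cem(adj_brr_percentile, adj_be_percentile, w1=0.5, w2=0.5):
    """CEM = w1 * AdjBRR_percentile + w2 * (1 - AdjBE_percentile).

    AdjBE is lower-is-better, so its percentile is inverted before weighting.
    """
    return w1 * adj_brr_percentile + w2 * (1 - adj_be_percentile)


# Hypothetical inputs for one all-rounder.
print(adj_brr(9.0, league_avg_bowling=8.5, opponent_bowling_quality=8.0, venue_factor=0.95))
print(adj_be(8.0, league_avg_batting=8.5, opponent_batting_quality=9.0))
print(cem(adj_brr_percentile=0.8, adj_be_percentile=0.3))
```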

5-Factor Tournament Quality Engine

CricPom consumes the Tournament Quality Weighting system (see below) to weight data from 426 T20 tournaments. Each tournament's data receives a composite weight computed as:

W = (∏ fᵢ^wᵢ)^(1/Σwᵢ) - weighted geometric mean of the 5 factors fᵢ, with weights wᵢ

  • PQI (25%) - Player Quality Index: average career quality of participants
  • Competitiveness (20%) - Match balance and outcome distributions
  • Recency (20%) - Exponential decay favoring recent data
  • Conditions Similarity (15%) - How closely conditions match IPL 2023-2025
  • Sample Confidence (20%) - Statistical reliability from match volume
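
A weighted geometric mean with the five listed weights can be sketched as follows. The factor keys, the `tournament_weight` name, and the assumption that each factor is already scaled to the range (0, 1] are illustrative; how the individual factors are computed is not covered here.

```python
import math

# Weights as listed above; they sum to 1.0.
FACTOR_WEIGHTS = {
    "pqi": 0.25,
    "competitiveness": 0.20,
    "recency": 0.20,
    "conditions_similarity": 0.15,
    "sample_confidence": 0.20,
}


def tournament_weight(factors, weights=FACTOR_WEIGHTS):
    """Weighted geometric mean: (prod f_i ** w_i) ** (1 / sum(w_i))."""
    if any(factors[name] < 0 for name in weights):
        raise ValueError("factors must be non-negative")
    if any(factors[name] == 0 for name in weights):
        return 0.0  # a zero factor zeroes the whole product
    # Work in log space to avoid underflow on very small factors.
    log_sum = sum(w * math.log(factors[name]) for name, w in weights.items())
    return math.exp(log_sum / sum(weights.values()))


# Hypothetical factor values for one tournament.
example = {
    "pqi": 0.9,
    "competitiveness": 0.8,
    "recency": 0.5,
    "conditions_similarity": 0.7,
    "sample_confidence": 1.0,
}
print(round(tournament_weight(example), 3))
```

Compared with a weighted arithmetic mean, the geometric form penalizes a single weak factor more heavily, so a tournament cannot offset stale data with a strong player pool alone.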

How It Differs from Raw Stats

Aspect | Raw Stats | CricPom Adjusted
Opposition quality | Ignored | OSI-adjusted
Venue effects | Ignored | Park factor adjusted
Tournament relevance | All equal | 5-factor weighted
Recency | All equal | Exponential decay
All-rounder evaluation | Separate batting/bowling | Unified CEM score

Status: Tournament weights computed (TKT-187). CricPom foundation metrics (AdjBRR, AdjBE, CEM) implemented (TKT-190). Groundwork research complete.

KenPom Research Foundation (Internal)

Tournament Quality Weighting

Jose Mourinho's 5-Factor Composite Weight System

Not all T20 data is equal. Our weighting system quantifies tournament quality across 426 tournaments and 9,357 matches:

5-Factor Composite Weight

  • Player Quality Index (PQI) - Average career quality of tournament participants
  • Competitiveness Index (CI) - Match outcome balance, margin distributions
  • Recency Decay - Exponential decay weighting recent tournaments higher
  • Conditions Similarity - How closely tournament conditions match IPL 2023-2025 (Founder Decision #6)
  • Sample Size Confidence - Statistical reliability based on matches played

IPL 2023+ Baseline (Founder-Locked)

All conditions comparisons use IPL 2023-2025 as the baseline, not all-time IPL averages. The data shows that the move from the 2021-22 era to the 2023-25 era produced the largest era-to-era jump in IPL history: Run Rate +1.00, Boundary% +3.2, Six% +1.65. The 2008-2022 run rate averaged 7.86 against 8.98 in the modern IPL, so an all-time baseline would be diluted by a fundamentally different era.

Tournament Tiers (Provisional)

  • Tier 1A: IPL - Baseline (1.0x weight)
  • Tier 1B: PSL, SA20, The Hundred, MLC, BBL, CPL - Major franchise leagues (0.70-0.85)
  • Tier 1C: ILT20, LPL, Super Smash, Vitality Blast - Established leagues, lower overlap (0.50-0.70)
  • Tier 2: T20 World Cup, Asia Cup - High-quality international (0.60-0.80)
  • Tier 3: SMAT - Domestic Indian T20 (0.40-0.50)

Status: Plan approved by Founder. IPL 2023+ baseline locked. Implementation via TKT-183 (8-12 days).

Tournament Weighting Plan (Internal)

Dual-Scope Analytics Framework

All-Time vs Since-2023 view architecture (TKT-181)

Every analytical view now exists in two scopes to balance historical context with current-form accuracy:

Architecture

  • _alltime views - Full IPL history (2008-2025, ~1,169 matches). Used for career records, historical comparisons
  • _since2023 views - Current analytical window (2023-2025, ~219 matches). Used for all predictive outputs, tags, archetypes
  • 80 dual-scope views - 40 pairs covering batting, bowling, phase, venue, matchup, and Film Room tactical analysis
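
One way to keep each pair of views in sync is to generate both from a single query template that differs only in its season filter. The sketch below builds the SQL text only; the `deliveries` table, the `season` and `runs_off_bat` columns, and the `dual_scope_sql` helper are hypothetical names, not the project's actual schema.

```python
# Scope suffix -> WHERE clause inserted into the shared template.
SCOPES = {
    "alltime": "",
    "since2023": "WHERE season >= 2023",
}


def dual_scope_sql(view_name, select_template):
    """Return {scope: CREATE VIEW statement} for one _alltime/_since2023 pair.

    select_template must contain a {where} placeholder ahead of any GROUP BY,
    so the season filter is applied before aggregation.
    """
    return {
        scope: f"CREATE OR REPLACE VIEW {view_name}_{scope} AS\n"
        + select_template.format(where=where).strip()
        + ";"
        for scope, where in SCOPES.items()
    }


BATTING_TEMPLATE = """
SELECT batter, SUM(runs_off_bat) AS runs, COUNT(*) AS balls
FROM deliveries
{where}
GROUP BY batter
"""

sql = dual_scope_sql("batting_summary", BATTING_TEMPLATE)
print(sql["since2023"])
```

Filtering inside the template, rather than selecting from the all-time view afterwards, matters for aggregated views: career totals cannot be split back into seasons once they have been summed.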

Why 2023+? The Data Evidence (Founder-Approved)

DuckDB analysis of 1,169 IPL matches reveals a structural break at 2023, the largest single-era shift in IPL history:

Era | Matches | Run Rate | Boundary% | Six% | Dot%
2008-2012 | 322 | 7.60 | 15.2 | 4.04 | 36.1
2013-2017 | 314 | 7.90 | 16.1 | 4.63 | 34.8
2018-2020 | 180 | 8.17 | 16.9 | 5.55 | 33.8
2021-2022 | 134 | 7.98 | 16.5 | 5.41 | 35.2
2023-2025 | 219 | 8.98 | 19.7 | 7.06 | 31.6

2023+ vs 2008-2022 deltas: Run Rate +14.2%, Boundary% +23.1%, Six-hitting +49.6%, Dot Ball% -10.0%
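
The deltas can be approximated from the era table by taking a match-weighted average of the four pre-2023 eras and comparing it with 2023-2025. Weighting by matches is an assumption of this sketch; the published figures presumably come from ball-level data, so small differences are expected. Under that assumption the sketch reproduces the Run Rate, Boundary% and Dot Ball% deltas to one decimal place and lands within about 0.2 points of the six-hitting figure.

```python
# Era -> (matches, run rate, boundary %, six %, dot %), copied from the table above.
ERAS = {
    "2008-2012": (322, 7.60, 15.2, 4.04, 36.1),
    "2013-2017": (314, 7.90, 16.1, 4.63, 34.8),
    "2018-2020": (180, 8.17, 16.9, 5.55, 33.8),
    "2021-2022": (134, 7.98, 16.5, 5.41, 35.2),
    "2023-2025": (219, 8.98, 19.7, 7.06, 31.6),
}
METRICS = ("run_rate", "boundary_pct", "six_pct", "dot_pct")


def weighted_mean(rows, idx):
    """Match-weighted mean of column idx (column 0 holds the match count)."""
    total_matches = sum(row[0] for row in rows)
    return sum(row[0] * row[idx] for row in rows) / total_matches


def pct_deltas(eras, modern="2023-2025"):
    """Percentage change of the modern era against the pooled earlier eras."""
    earlier = [row for era, row in eras.items() if era != modern]
    deltas = {}
    for idx, metric in enumerate(METRICS, start=1):
        baseline = weighted_mean(earlier, idx)
        deltas[metric] = 100 * (eras[modern][idx] / baseline - 1)
    return deltas


for metric, delta in pct_deltas(ERAS).items():
    print(f"{metric}: {delta:+.1f}%")
```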

What Caused the Break

  • Impact Player rule (2023) - 12 effective players per side, inflating batting depth and scoring rates
  • 2022 mega auction reset - team compositions fundamentally reshuffled, making pre-2023 team context obsolete
  • Evolved batting intent - six-hitting up 49.6%, with batters attacking from ball one in the modern IPL
  • Sample adequacy - 219 matches provide a sufficient sample for statistical reliability across all analytical views
TKT-181 Review Document (Internal)

Insight Confidence Framework

Editorial confidence scoring for analytical insights (TKT-094)

Every insight published in the magazine needs a confidence assessment. This framework scores each analytical claim on a 0-100 scale with letter grades:

Scoring Breakdown

  • Sample Size (40%) - Capped at 300 balls/innings. HIGH (≥300), MEDIUM (≥100), LOW (<100)
  • Consistency (25%) - Metric stability across sub-samples (first half vs second half of career)
  • Recency (20%) - How recent the underlying data is (1.0 = all 2025, 0.5 = mix)
  • Cross-Validation (15%) - Does the insight hold across conditions (home/away, bat/field)?

Grade Boundaries

  • A (≥85) - Publish with confidence
  • B (≥70) - Publish with minor caveats
  • C (≥55) - Add sample size caveat and limitations
  • D (<55) - Do not publish as standalone insight
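
The weights and grade boundaries combine into a simple scoring routine. The sketch below assumes each component is expressed on a 0-1 scale and that the sample-size component scales linearly up to the 300 cap; both are assumptions, as are the function names, and the TKT-094 implementation may differ.

```python
CONFIDENCE_WEIGHTS = {
    "sample": 0.40,
    "consistency": 0.25,
    "recency": 0.20,
    "cross_validation": 0.15,
}
SAMPLE_CAP = 300


def sample_band(n):
    """HIGH (>=300), MEDIUM (>=100), LOW (<100)."""
    if n >= 300:
        return "HIGH"
    return "MEDIUM" if n >= 100 else "LOW"


def confidence_score(n, consistency, recency, cross_validation):
    """0-100 score; the three rate inputs are expected in [0, 1]."""
    parts = {
        "sample": min(n, SAMPLE_CAP) / SAMPLE_CAP,  # linear up to the cap (assumed)
        "consistency": consistency,
        "recency": recency,
        "cross_validation": cross_validation,
    }
    return 100 * sum(CONFIDENCE_WEIGHTS[name] * value for name, value in parts.items())


def grade(score):
    """Letter grade from the boundaries listed above."""
    if score >= 85:
        return "A"
    if score >= 70:
        return "B"
    return "C" if score >= 55 else "D"


score = confidence_score(n=300, consistency=0.8, recency=1.0, cross_validation=0.6)
print(round(score, 1), grade(score), sample_band(300))  # -> 89.0 A HIGH
```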

Status: Framework implemented. Bridges with existing Confidence Intervals (TKT-145). Used editorially for all stat pack claims.


Data Foundation

Ball-by-ball analysis from Cricsheet

Data Sources

  • Cricsheet Ball-by-Ball - 219 IPL matches (2023-2025)
  • IPL 2026 Auction Data - Squad compositions and prices
  • Historical Records - Team vs team, venue performance

Derived Metrics (115+ Views)

  • 80 dual-scope views (_alltime + _since2023 pairs)
  • Batter consistency index, boundary percentage, dot ball percentage
  • Partnership synergy scores, pressure sequences
  • Phase-specific performance (PP, middle, death)
  • Bowling type matchups, handedness analysis
  • 13 Film Room tactical views (entry points, wicket clusters, bowling changes)

Validation

  • Every insight reviewed by cricket domain expert (Andy Flower)
  • 8-step quality assurance process
  • Founder sign-off on all key outputs