About Our Data
Big Hit Cricket is built on a ball-by-ball T20 database covering roughly 9,600 matches and 2.2 million deliveries across 18 competitions. This page explains where that data comes from, what decisions were made, what is not included, and why our numbers sometimes differ from other cricket statistics sources.
Where does the data come from?
Every delivery in Big Hit Cricket's database originates from Cricsheet — a publicly available ball-by-ball T20 dataset maintained by Stephen Rushe and released under the Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0) licence. Cricsheet is, by some margin, the most comprehensive free ball-by-ball cricket dataset available and is used by data teams at international cricket boards as well as independent researchers worldwide.
We do not scrape other statistics websites, estimate missing values, or infer data from scorecards. If a delivery is in our database, it is because the ball-by-ball record exists in Cricsheet. This means our numbers are reproducible — anyone with access to the same Cricsheet export will arrive at the same figures.
Ball-by-ball data provided by Cricsheet (Stephen Rushe), licensed under CC BY-SA 4.0.
Our ingestion pipeline parses every Cricsheet JSON file and tags each delivery with match context (competition, venue, innings, over number, batting position), phase label (powerplay / middle overs / death), dismissal type, and extras classification. The processed data is stored in a PostgreSQL database on Supabase and refreshed automatically by a scheduled job that runs every six hours and picks up any new match files Cricsheet has published, so matches typically appear within hours of ending.
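As a rough illustration of the tagging step, the sketch below flattens a match in the Cricsheet JSON layout into one row per delivery. The `SAMPLE_MATCH` data, the `flatten_match` function and the output field names are hypothetical and are not the actual pipeline. Batting position and phase labels are left out for brevity.

```python
from typing import Iterator

# Tiny hand-written sample in the shape of a Cricsheet JSON match file.
# The values are illustrative, not real match data.
SAMPLE_MATCH = {
    "info": {"event": {"name": "Example League"}, "venue": "Example Ground"},
    "innings": [
        {
            "team": "Team A",
            "overs": [
                {
                    "over": 0,
                    "deliveries": [
                        {
                            "batter": "Batter One",
                            "bowler": "Bowler One",
                            "runs": {"batter": 4, "extras": 0, "total": 4},
                        },
                        {
                            "batter": "Batter One",
                            "bowler": "Bowler One",
                            "runs": {"batter": 0, "extras": 1, "total": 1},
                            "extras": {"wides": 1},
                        },
                        {
                            "batter": "Batter One",
                            "bowler": "Bowler One",
                            "runs": {"batter": 0, "extras": 0, "total": 0},
                            "wickets": [{"kind": "bowled", "player_out": "Batter One"}],
                        },
                    ],
                }
            ],
        }
    ],
}


def flatten_match(match: dict) -> Iterator[dict]:
    """Yield one flat row per delivery, tagged with match context."""
    info = match.get("info", {})
    competition = info.get("event", {}).get("name")
    venue = info.get("venue")
    for innings_no, innings in enumerate(match.get("innings", []), start=1):
        for over in innings.get("overs", []):
            over_no = over["over"] + 1  # Cricsheet JSON numbers overs from 0
            for delivery in over.get("deliveries", []):
                extras = delivery.get("extras", {})
                wickets = delivery.get("wickets", [])
                yield {
                    "competition": competition,
                    "venue": venue,
                    "innings": innings_no,
                    "over": over_no,
                    "batter": delivery["batter"],
                    "bowler": delivery["bowler"],
                    "runs_batter": delivery["runs"]["batter"],
                    "runs_total": delivery["runs"]["total"],
                    # First extras key if any (e.g. "wides"), else None
                    "extras_type": next(iter(extras), None),
                    # Kind of the first wicket on this ball, else None
                    "dismissal": wickets[0]["kind"] if wickets else None,
                }


rows = list(flatten_match(SAMPLE_MATCH))
```

Each row can then be written to a table keyed by match, innings, over and ball. The real schema may differ.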
Competition coverage
The database includes 18 competitions. Coverage varies by league depending on how comprehensively Cricsheet has digitised older seasons. The table below shows the current state for each competition, including the earliest season we consider reliable enough to include in leaderboards and the matchup engine.
| Competition | Coverage | In database from | Notes |
|---|---|---|---|
| Indian Premier League | 100% | 2008 | 1,169/1,169 matches |
| The Hundred | 100% | 2021 | Complete from inception |
| International League T20 | 100% | 2023 | Complete from inception |
| SA20 | 100% | 2023 | Complete from inception |
| Major League Cricket | 100% | 2023 | Complete from inception |
| Lanka Premier League | 100% | 2020 | Complete from inception |
| Mzansi Super League | 100% | 2018 | Complete from inception |
| Vitality T20 Blast | 99.45% | 2014 | Near-complete — minor gaps in early seasons |
| Pakistan Super League | 99.37% | 2016 | Near-complete |
| Caribbean Premier League | 97.84% | 2013 | Near-complete |
| Big Bash League | 97.60% | 2011 | Near-complete |
| Bangladesh Premier League | Partial | 2012 | In database, coverage varies by season |
| Nepal Premier League | Partial | 2023 | Included where available |
| CSA T20 Challenge | 71.36% | — | Excluded from matchup engine |
| Syed Mushtaq Ali Trophy | 57.68% | — | Excluded from matchup engine |
| Super Smash (men's) | 46.40% | — | Excluded from matchup engine |
| Men's T20 Internationals | Broad | 2005 | Cricsheet coverage, not full global set |
For the matchup engine specifically, we apply a stricter rule: a competition must have 95%+ coverage before its records are used in head-to-head match data. CSA T20 Challenge, Syed Mushtaq Ali Trophy and Super Smash are currently below this threshold and are excluded. Using patchy data in a matchup context — where you're asking "who has the better record against this specific bowler?" — would produce misleading results. We prefer to show fewer records than inaccurate ones.
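The threshold rule can be sketched as a simple filter. The coverage figures below are copied from the table. The function and variable names are hypothetical, and competitions listed as "Partial" or "Broad" are left out because the table gives no percentage for them.

```python
MATCHUP_MIN_COVERAGE = 95.0  # percent, per the rule described above

# Coverage percentages copied from the table above.
coverage_pct = {
    "Indian Premier League": 100.0,
    "Vitality T20 Blast": 99.45,
    "Pakistan Super League": 99.37,
    "Caribbean Premier League": 97.84,
    "Big Bash League": 97.60,
    "CSA T20 Challenge": 71.36,
    "Syed Mushtaq Ali Trophy": 57.68,
    "Super Smash (men's)": 46.40,
}


def matchup_eligible(coverage: dict, threshold: float = MATCHUP_MIN_COVERAGE) -> list:
    """Return the competitions whose coverage meets the threshold, sorted by name."""
    return sorted(name for name, pct in coverage.items() if pct >= threshold)


eligible = matchup_eligible(coverage_pct)
excluded = sorted(set(coverage_pct) - set(eligible))
```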
What is not included — and why our totals differ from Cricinfo
Cricinfo (ESPNCricinfo) tracks every T20 match played anywhere in the world, including domestic competitions that are not in Cricsheet. The most significant gap for most users will be competitions that exist outside the set of leagues Cricsheet maintains: the Afghanistan domestic T20 circuit (Shpageeza Cricket League, Afghanistan Premier League), parts of the Bangladesh domestic scene, some South Asian domestic leagues, and miscellaneous associate nation T20s.
This is a deliberate limit of scope rather than a flaw in our methodology. We only include data we can verify at the delivery level. A career total built from scorecards alone cannot be checked ball by ball, so it is not of the same quality as one built from delivery data. We chose not to mix the two.
Rashid Khan is one of the most active T20 cricketers of the last decade. On Big Hit Cricket, he shows approximately 393 T20 bowling appearances. Cricinfo records around 514, a gap of roughly 120 appearances.
That difference is almost entirely explained by the Afghanistan domestic circuit. Rashid has played extensively in the Shpageeza Cricket League and the Afghanistan Premier League — competitions that are not currently in Cricsheet. Those leagues alone account for a large portion of his career T20 output. He has also appeared in a small number of other associate-nation competitions that fall outside our dataset.
For the competitions we do share with Cricinfo (IPL, PSL, BBL, T20 Blast, CPL, The Hundred), Rashid's figures on Big Hit Cricket will be nearly identical. The gap is purely about scope, not data quality.
The same logic applies to any player with a significant domestic T20 history in leagues that fall outside our 18 competitions or are only partially covered within them. Indian domestic players (Syed Mushtaq Ali Trophy), South African domestic players (CSA T20), Pakistani domestic players (National T20 Cup), and players from associate nations will typically show lower career totals on Big Hit Cricket than on Cricinfo. Players who play primarily in IPL, PSL, BBL, T20 Blast, CPL and The Hundred will have totals very close to Cricinfo's records.
How phases are defined
Every delivery in the database is tagged with one of three phase labels based on the over in which it was bowled:
Powerplay: Field restrictions apply. Highest dismissal rate for openers.
Middle overs: Wicket-taking phase for spinners. Strike rate typically dips.
Death overs: Acceleration phase. Where games are won and lost in T20 cricket.
Phase splits use the over number as the boundary, not the ball count or match situation. A delivery bowled in over 7 is always a middle-overs delivery, regardless of whether it follows an extended powerplay due to rain interruption. This keeps the definitions consistent across competitions that have different powerplay rules (some domestic leagues use 5-over powerplays rather than 6).
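A minimal sketch of over-based phase tagging follows. The powerplay boundary (overs 1 to 6) is implied by the statement that over 7 is always a middle-overs delivery. The start of the death phase is not stated on this page, so the value used here (over 16) is an assumption, as are the function and constant names.

```python
POWERPLAY_LAST_OVER = 6  # implied by the text: over 7 is always middle overs
DEATH_FIRST_OVER = 16    # ASSUMPTION: this boundary is not stated on the page


def phase_for_over(over_number: int) -> str:
    """Map a 1-indexed over number to a phase label.

    Only the over number is used, so rain-adjusted or 5-over powerplays
    do not change the label.
    """
    if over_number < 1:
        raise ValueError("over_number is 1-indexed")
    if over_number <= POWERPLAY_LAST_OVER:
        return "powerplay"
    if over_number < DEATH_FIRST_OVER:
        return "middle"
    return "death"
```

Because the label depends on nothing but the over number, the same delivery always lands in the same phase regardless of competition rules.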
Minimum ball thresholds apply to phase-specific leaderboards to prevent small-sample outliers. The defaults are: all phases = 100 balls; middle overs = 60 balls; powerplay and death = 30 balls. These can be adjusted in the Stat Builder.
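The default thresholds can be expressed as a small lookup. The numbers are the defaults quoted above. The `qualifies` function and its `override` parameter are hypothetical stand-ins for whatever the Stat Builder does internally.

```python
from typing import Optional

# Default minimum balls per leaderboard, as listed above.
MIN_BALLS = {"all": 100, "middle": 60, "powerplay": 30, "death": 30}


def qualifies(balls: int, phase: str = "all", override: Optional[int] = None) -> bool:
    """Return True if a player's sample is large enough for the leaderboard.

    `override` models a user-adjusted threshold; when it is None the
    default for the phase applies.
    """
    minimum = MIN_BALLS[phase] if override is None else override
    return balls >= minimum
```

For example, `qualifies(59, "middle")` is `False` and `qualifies(60, "middle")` is `True`.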
How we handle incomplete records and edge cases
Super overs (the tie-breaking additional over) are excluded from all statistics. A super over is a separate innings under its own rules and including it in a bowler's T20 economy rate or a batter's T20 strike rate would distort career figures. Cricinfo excludes super over data from career records by default; we do the same.
Innings where a batter retired hurt are included in batting totals but not counted as a dismissal when calculating averages, consistent with how Cricinfo treats them. Absent batter innings (where a player did not bat) are excluded entirely.
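The retired-hurt rule amounts to counting the runs but not the dismissal. The sketch below assumes a hypothetical innings record with `runs` and `how_out` fields, where `how_out` is `None` for a not-out innings. Innings where the player did not bat are assumed to have been dropped before this step.

```python
from typing import Optional

# Dismissal labels that do not count as being out when averaging.
NON_DISMISSALS = {"retired hurt"}


def batting_average(innings: list) -> Optional[float]:
    """Runs per dismissal; None if the batter was never dismissed."""
    total_runs = sum(i["runs"] for i in innings)
    dismissals = sum(
        1
        for i in innings
        if i["how_out"] is not None and i["how_out"] not in NON_DISMISSALS
    )
    return total_runs / dismissals if dismissals else None


sample_innings = [
    {"runs": 40, "how_out": "caught"},
    {"runs": 25, "how_out": "retired hurt"},  # runs count, dismissal does not
    {"runs": 10, "how_out": None},            # not out
    {"runs": 15, "how_out": "bowled"},
]
average = batting_average(sample_innings)  # 90 runs / 2 dismissals
```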
Extras (no-balls, wides, byes, leg-byes) are tracked separately and excluded from individual batting and bowling averages where appropriate. A wide is included in a bowler's economy rate (it costs a run) but not in balls faced for the batter. This matches standard T20 conventions.
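The sketch below shows one way to apply these rules per delivery. The page only spells out the wide case. The treatment of no-balls, byes and leg-byes here follows standard scoring practice and is an assumption about this site, and the field and function names are hypothetical.

```python
from typing import Optional


def bowler_runs_conceded(d: dict) -> int:
    """Runs charged to the bowler: off the bat, plus wides and no-balls.
    Byes and leg-byes are not charged to the bowler (assumed convention)."""
    return d["runs_batter"] + d.get("wides", 0) + d.get("noballs", 0)


def is_legal_ball(d: dict) -> bool:
    """Wides and no-balls do not count towards the bowler's over."""
    return d.get("wides", 0) == 0 and d.get("noballs", 0) == 0


def counts_as_ball_faced(d: dict) -> bool:
    """A wide is not a ball faced; a no-ball is (assumed convention)."""
    return d.get("wides", 0) == 0


def economy_rate(deliveries: list) -> Optional[float]:
    """Runs conceded per six legal balls; None if no legal balls were bowled."""
    legal = sum(1 for d in deliveries if is_legal_ball(d))
    if legal == 0:
        return None
    return sum(bowler_runs_conceded(d) for d in deliveries) * 6 / legal


# One illustrative over: six legal balls plus a wide and a no-ball.
sample_over = [
    {"runs_batter": 1},
    {"runs_batter": 0, "wides": 1},
    {"runs_batter": 4},
    {"runs_batter": 0, "legbyes": 1},
    {"runs_batter": 0},
    {"runs_batter": 6, "noballs": 1},
    {"runs_batter": 0},
    {"runs_batter": 2},
]
over_economy = economy_rate(sample_over)
balls_faced = sum(1 for d in sample_over if counts_as_ball_faced(d))
```

In this over the bowler is charged 15 runs from six legal balls, while the batter is credited with seven balls faced because only the wide is excluded.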
Cricsheet uses a player registry with unique IDs. Older match files (pre-2019) used plain name strings without registry IDs, which can cause the same player to appear under multiple name variations. Our ingestion pipeline normalises these where the name is unambiguous. In a small number of cases involving less-prominent players with common names, some career records may be split.
Matches abandoned before both teams had batted are included in the database but only the completed innings contributes to player records. No innings is 'estimated' for the team that did not bat.
The database is updated automatically every six hours via a scheduled ingestion pipeline that pulls new match files from Cricsheet. IPL matches, for example, typically appear in the database within a few hours of the final ball. The last-updated timestamp in the site header reflects the most recently ingested match.
Metrics you won't find on Cricinfo
Because we work at the delivery level, we can calculate metrics that scorecard-based aggregators cannot. These are core to what makes Big Hit Cricket useful beyond a standard stats lookup:
Strike rate on deliveries that did not result in a four or six. Measures a batter's ability to score between the ropes — essential for understanding whether high headline SRs are reliant on a few big shots.
How many balls a batter faces, on average, before hitting a six. Fewer balls = more destructive. Useful for comparing big-hitters across different team roles and batting positions.
The ratio of a batter's death-over strike rate to their powerplay strike rate. Values above 1.0 indicate a batter who gets faster as the innings goes on — the defining trait of T20 finishers.
A composite score combining economy rate, dot ball percentage and wicket-taking rate into a single 0–100 figure. Designed to capture overall threat rather than any single dimension of bowling performance.
Economy rate with boundary deliveries removed. Isolates a bowler's control between the ropes — a bowler who concedes 8 runs per over but half of those are sixes is more dangerous than one who leaks at 7.
Every batting and bowling metric can be filtered to powerplay, middle overs or death overs individually. This means a batter's powerplay SR and death SR are tracked separately, not blended into a single figure.
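Three of the batting metrics described above can be sketched directly from delivery rows. This is an illustration under stated assumptions, not the site's implementation: the function names are hypothetical, any four or six off the bat is treated as a boundary, wides are assumed to have been removed already, and the phase boundaries (powerplay up to over 6, death from over 16) are assumptions. The composite bowling score is not attempted because its weighting is not given here.

```python
from typing import Optional

# Each ball is (1-indexed over number, runs off the bat).
sample_balls = [
    (1, 4), (1, 0), (2, 1), (3, 6), (4, 1), (5, 0),
    (17, 6), (17, 2), (18, 4), (19, 6), (20, 1), (20, 0),
]


def strike_rate(balls: list) -> Optional[float]:
    """Runs per 100 balls; None for an empty sample."""
    if not balls:
        return None
    return 100 * sum(runs for _, runs in balls) / len(balls)


def non_boundary_strike_rate(balls: list) -> Optional[float]:
    """Strike rate over deliveries that were not hit for four or six."""
    return strike_rate([b for b in balls if b[1] not in (4, 6)])


def balls_per_six(balls: list) -> Optional[float]:
    """Average balls faced per six; None if no sixes were hit."""
    sixes = sum(1 for _, runs in balls if runs == 6)
    return len(balls) / sixes if sixes else None


def death_to_powerplay_ratio(balls: list) -> Optional[float]:
    """Death-over strike rate divided by powerplay strike rate."""
    powerplay = strike_rate([b for b in balls if b[0] <= 6])   # assumed boundary
    death = strike_rate([b for b in balls if b[0] >= 16])      # assumed boundary
    if not powerplay or death is None:
        return None
    return death / powerplay
```

On the sample above, the batter hits three sixes in 12 balls (one six every 4 balls) and the death-to-powerplay ratio is above 1.0, the pattern the text associates with finishers.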
The full set of metrics is available in the Stat Builder, with column tooltips explaining each one. The Form Guide leaderboards also surface the most useful batting and bowling metrics with definitions.
Found an error?
If you spot a figure that looks wrong — a player's innings count significantly off, a match missing from a career record, or a metric that doesn't add up — please let us know. The most likely explanations are: (a) the match is from a competition outside our 18, (b) there is a player name normalisation issue in Cricsheet's registry, or (c) there is a genuine ingestion bug. We want to know about (c).
Contact: contact.bighitcricket@gmail.com
Explore the data
Every tool on Big Hit Cricket is built directly from the delivery-level data described above. The numbers you see in the Stat Builder are the same numbers that power the matchup engine, the venue profiles and the form guide — there is no separate editorial layer or manual override.