1. Pi-Ratings Match Model
Each team carries four ratings: home attack, home defense, away attack, and away defense. After every fixture we update them iteratively from the goal difference and the current rating gap. The ratings converge in roughly 8 league weeks; for the first 2 weeks of each season we cold-start from prior-season carry-over. Lambda (expected goals) for each side of a fixture is derived from two pairings, home attack vs. away defense and away attack vs. home defense, then converted into match-result (1X2), over/under, and BTTS (both teams to score) probabilities via the Poisson distribution. Because every market is derived from the same pair of lambdas, the 1X2 pick and the over/under pick do not contradict each other.
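The Poisson step can be sketched as follows. This is a minimal illustration, not the production code: it assumes home and away goals are independent Poisson variables, takes the two lambdas as given inputs, and uses hypothetical names (`market_probabilities`, `max_goals`, `line`). The scoreline grid is truncated at `max_goals` and renormalised.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the expected count is lam."""
    return exp(-lam) * lam ** k / factorial(k)


def market_probabilities(lam_home: float, lam_away: float,
                         max_goals: int = 10, line: float = 2.5) -> dict:
    """Derive 1X2, over/under and BTTS from one shared scoreline grid.

    Assumption: home and away goals are independent Poisson variables.
    """
    probs = {"home": 0.0, "draw": 0.0, "away": 0.0, "over": 0.0, "btts": 0.0}
    total = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            total += p
            if h > a:
                probs["home"] += p
            elif h == a:
                probs["draw"] += p
            else:
                probs["away"] += p
            if h + a > line:
                probs["over"] += p
            if h > 0 and a > 0:
                probs["btts"] += p
    # Renormalise to absorb the small probability mass beyond max_goals.
    probs = {key: value / total for key, value in probs.items()}
    probs["under"] = 1.0 - probs["over"]
    return probs


if __name__ == "__main__":
    # Illustrative lambdas only, not real model output.
    print(market_probabilities(1.6, 1.1))
```

Since all markets are summed from the same grid of scoreline probabilities, the home, draw, and away values always total 1 and the over/under split is consistent with them.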
2. Tier Thresholds
Every prediction is sorted into one of three tiers by confidence: Elite ≥ 60%, Solid ≥ 45% and < 60%, Normal < 45%. Tiers are not weighted differently in the model; they exist to help users gauge expected accuracy. Our forward-tracking page reports hit-rate per tier so the relationship between confidence and reality is publicly auditable.
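The thresholds translate into a simple lookup. The sketch below assumes confidence is expressed as a fraction between 0 and 1; the function name `tier` is hypothetical.

```python
def tier(confidence: float) -> str:
    """Map a confidence value (assumed to be a fraction in [0, 1]) to a tier."""
    if confidence >= 0.60:
        return "Elite"
    if confidence >= 0.45:
        return "Solid"
    return "Normal"


if __name__ == "__main__":
    for c in (0.72, 0.60, 0.51, 0.45, 0.38):
        print(c, tier(c))
```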
3. Strict On-Pitch Goal Metric (v3)
Traditional Win/Draw/Loss columns credit the full result to any player who appeared, even for 1 minute. Our on-pitch metric isolates the minutes a player was physically on the pitch and counts only goals scored/conceded during those exact minutes. A player who played the first 30 minutes of a 3-0 win in which all goals came in the second half registers 0 goals_for_on_pitch, because the goals happened after he came off. The v3 pipeline operates at minute granularity using minute-state arrays (winning / drawing / losing per pitch-minute), producing the winning-minute rate that powers the Player Impact tables.
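A minimal sketch of the idea, under stated assumptions: the `Stint` type, the function names, and the boundary rule (a goal in minute m counts if minute_on < m <= minute_off) are illustrative choices, and stoppage time is ignored. The winning-minute rate here is taken to be the share of a player's pitch-minutes that ended with his team ahead, which is one plausible reading of the minute-state arrays.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stint:
    minute_on: int   # 0 for a starter
    minute_off: int  # minute of substitution, or the last minute played


def on_pitch_goals(stint: Stint, goals_for: List[int],
                   goals_against: List[int]) -> Tuple[int, int]:
    """Count team goals scored and conceded while the player was on the pitch."""
    def on_pitch(minute: int) -> bool:
        return stint.minute_on < minute <= stint.minute_off

    return (sum(on_pitch(m) for m in goals_for),
            sum(on_pitch(m) for m in goals_against))


def winning_minute_rate(stint: Stint, goals_for: List[int],
                        goals_against: List[int]) -> float:
    """Share of the player's pitch-minutes that ended with his team ahead."""
    minutes = range(stint.minute_on + 1, stint.minute_off + 1)
    if not minutes:
        return 0.0
    winning = 0
    for m in minutes:
        scored = sum(g <= m for g in goals_for)
        conceded = sum(g <= m for g in goals_against)
        if scored > conceded:
            winning += 1
    return winning / len(minutes)


if __name__ == "__main__":
    # The 3-0 win from the text, with hypothetical second-half goal minutes.
    goals_for, goals_against = [52, 67, 88], []
    starter = Stint(minute_on=0, minute_off=30)
    substitute = Stint(minute_on=60, minute_off=90)
    print(on_pitch_goals(starter, goals_for, goals_against))          # (0, 0)
    print(on_pitch_goals(substitute, goals_for, goals_against))       # (2, 0)
    print(winning_minute_rate(starter, goals_for, goals_against))     # 0.0
    print(winning_minute_rate(substitute, goals_for, goals_against))  # 1.0
```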
4. Compound (player_id, team_id) Key
When a player transfers mid-season they appear as two separate rows in our Player Impact dataset, one row per (player_id, team_id) pair. Their pre-transfer minutes stay attributed to the old club, and their post-transfer minutes count for the new club. This is the only way to honestly compare a player's impact at two clubs without double-counting or arbitrary attribution. MIN_MATCHES = 5 applies per row, so transient short-stay rows are filtered out.
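The grouping can be sketched like this. The input shape (one `(player_id, team_id, minutes)` tuple per appearance) and the function name `build_rows` are assumptions for illustration, as is reading MIN_MATCHES as "keep rows with at least 5 matches".

```python
from collections import defaultdict

MIN_MATCHES = 5  # threshold stated in the text, applied per (player_id, team_id) row


def build_rows(appearances):
    """Aggregate appearances into one row per (player_id, team_id) pair.

    `appearances` is an iterable of (player_id, team_id, minutes) tuples,
    one per match played (a hypothetical input shape).
    """
    rows = defaultdict(lambda: {"matches": 0, "minutes": 0})
    for player_id, team_id, minutes in appearances:
        row = rows[(player_id, team_id)]
        row["matches"] += 1
        row["minutes"] += minutes
    # Drop short-stay rows that fall below the per-row threshold.
    return {key: row for key, row in rows.items() if row["matches"] >= MIN_MATCHES}


if __name__ == "__main__":
    # Hypothetical player 7: six matches at club 1, then two at club 2.
    data = [(7, 1, 90)] * 6 + [(7, 2, 45)] * 2
    print(build_rows(data))  # only the (7, 1) row survives
```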
5. Coverage
26 league competitions: 5 European top tiers (Premier League, La Liga, Serie A, Bundesliga, Ligue 1), 5 second tiers (Championship, La Liga 2, Serie B, 2. Bundesliga, Ligue 2), Eredivisie, Primeira Liga, Süper Lig, Premiership, Pro League, Saudi Pro League, MLS, Brazilian Série A, Argentine Liga Profesional, plus continental cups (UEFA Champions League, Europa League, Conference League) and tournament windows (World Cup 2026, Euro 2024 history, Copa America, AFCON, Asian Cup, friendlies). For national teams, fixtures involving U17/U18/U19/U20/U21/U23/Women/Olympic teams are filtered out; only senior A-team fixtures count.
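One way to express the national-team filter is a name-based check. This is only a sketch: matching on team names is an assumption (a feed may expose the category differently), and the pattern and function names are hypothetical. Variants such as "U-20" would need an extended pattern.

```python
import re

# Labels listed in the text; name-based matching is an assumption.
EXCLUDED_LABELS = re.compile(
    r"\b(?:U(?:17|18|19|20|21|23)|Women|Olympic)\b", re.IGNORECASE
)


def is_senior_a_team(team_name: str) -> bool:
    """True when the name carries none of the excluded labels."""
    return EXCLUDED_LABELS.search(team_name) is None


def keep_fixture(home_name: str, away_name: str) -> bool:
    """Keep a national-team fixture only if both sides are senior A teams."""
    return is_senior_a_team(home_name) and is_senior_a_team(away_name)


if __name__ == "__main__":
    for name in ("Brazil", "Brazil U20", "England Women", "Uruguay"):
        print(name, is_senior_a_team(name))
```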
6. Data Freshness
Daily build runs at 06:00 UTC via GitHub Actions: fetches new fixtures from API-Football, runs Pi-Ratings on completed results, computes predictions for upcoming windows (30 days for league play, 60 days for World Cup qualifiers + Nations League, 90 days for tournament hubs like World Cup 2026 / Euro 2024 / Copa America), regenerates static pages, and deploys to Cloudflare Pages. Forward-tracking ledger appends results — we never silently overwrite a prediction after the fact.
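The three prediction windows amount to a lookup by competition type. In the sketch below the keys (`league`, `qualifier`, `tournament`) and the function name are hypothetical labels; only the day counts come from the text.

```python
from datetime import date, timedelta

# Day counts from the text; the dictionary keys are hypothetical labels.
PREDICTION_WINDOW_DAYS = {
    "league": 30,      # league play
    "qualifier": 60,   # World Cup qualifiers and Nations League
    "tournament": 90,  # tournament hubs
}


def in_prediction_window(kind: str, kickoff: date, today: date) -> bool:
    """True if an upcoming fixture falls inside the window for its competition type."""
    horizon = today + timedelta(days=PREDICTION_WINDOW_DAYS[kind])
    return today <= kickoff <= horizon


if __name__ == "__main__":
    today = date(2025, 1, 1)  # illustrative build date
    print(in_prediction_window("league", date(2025, 1, 31), today))     # True
    print(in_prediction_window("league", date(2025, 2, 1), today))      # False
    print(in_prediction_window("tournament", date(2025, 3, 1), today))  # True
```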
7. What We Don't Do
We don't backtest claims (forward-tracking only). We don't accept paid placement on any prediction tier. We don't run referral links to third-party commercial operators. We don't market our content to the predictions-as-product industry; the material is for analytical interest only. We don't generate match commentary with LLMs at request time; the editorial sentences on each match page are built from templates filled with real model output. We don't hide losing weeks: every published prediction stays on its match page after the result is known.
8. Open Standards & Citation
Site data is in JSON/JSON-LD with full schema.org markup (SportsEvent, NewsArticle, BreadcrumbList, FAQPage, Dataset). hreflang alternates link every page across 5 locales (en/tr/ar/es/zh). Our /llms.txt declares the canonical content index for AI agents. The strict on-pitch goal metric is described in plain English so the algorithm is reproducible by anyone with access to a minute-by-minute substitution + goal feed.