Pi-Ratings vs Elo vs xG: Three Models, One Football Match

Tournaments · May 28, 2026 · 4 min read

Goalence Editorial

Three Lenses, One Match

Want to forecast a football match? There are three statistical traditions to draw on: Elo (the chess-rating veteran), Pi-Ratings (the home/away specialist), and xG (the shot-quality measure). Each one captures part of the truth. None of them captures all of it.

Goalence uses Pi-Ratings + Poisson. But every forecast we publish is shaped by all three traditions. Let's compare them.

Elo: The Chess Method, Borrowed

Elo was invented for chess in the 1960s. A single number per player. Win, your number goes up; lose, it falls. The size of the movement depends on whether the result was expected: beating someone 200 points above you is worth more than beating someone 200 points below.
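To make the "expected result" idea concrete, here is a minimal sketch of the standard logistic Elo update. The K-factor of 20 and the ratings of 1500 and 1700 are illustrative assumptions, not values Goalence uses.

```python
def elo_expected(rating_a: float, rating_b: float) -> float:
    """Expected score for A against B under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return both new ratings. score_a is 1 for a win, 0.5 for a draw, 0 for a loss.

    k is an assumed K-factor; real systems tune it.
    """
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Beating a side rated 200 points higher moves the rating more
# than beating a side rated 200 points lower.
underdog_gain = elo_update(1500, 1700, 1.0)[0] - 1500
favourite_gain = elo_update(1700, 1500, 1.0)[0] - 1700
print(round(underdog_gain, 2), round(favourite_gain, 2))
```

The update is zero-sum: whatever one side gains, the other loses.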

For football, Elo assigns every team one rating. Strength: simple, robust, well-understood. Weakness: in its basic form it ignores football's large home-field advantage. Real Madrid at the Bernabéu and Real Madrid in Atlético's away changing room aren't really the same team. But a single Elo number cannot tell you that.

Pi-Ratings: One Team, Two Numbers

Pi-Ratings (Constantinou & Fenton, 2013) takes the Elo idea and splits every team into a home rating and an away rating. Two numbers per team. Real Madrid at home gets one rating; Real Madrid away gets another. After every match, both ratings update based on the actual result vs the pre-match expectation.

So where Elo says "one team, one number", Pi-Ratings says "one team, two numbers".
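A rough sketch of what a two-number update can look like, loosely following the formulation published by Constantinou & Fenton. The learning rates `lam` and `gamma`, the constants `b` and `c`, and the names `PiTeam` and `pi_update` are assumptions for illustration; this is not Goalence's production code.

```python
import math
from dataclasses import dataclass


@dataclass
class PiTeam:
    home: float = 0.0  # rating when playing at home
    away: float = 0.0  # rating when playing away


def expected_goal_diff(rating: float, b: float = 10.0, c: float = 3.0) -> float:
    """Map a rating to an expected goal difference against an average side."""
    magnitude = b ** (abs(rating) / c) - 1.0
    return math.copysign(magnitude, rating)


def pi_update(home_team: PiTeam, away_team: PiTeam, goals_home: int, goals_away: int,
              lam: float = 0.035, gamma: float = 0.7, c: float = 3.0) -> float:
    """Update both teams in place and return the pre-match predicted goal difference."""
    predicted = expected_goal_diff(home_team.home) - expected_goal_diff(away_team.away)
    observed = goals_home - goals_away
    error = abs(observed - predicted)
    step = c * math.log10(1.0 + error)  # large errors get diminishing weight
    sign = 1.0 if observed > predicted else -1.0

    d_home = sign * step * lam   # home side's home rating moves with the surprise
    d_away = -sign * step * lam  # away side's away rating moves the opposite way

    home_team.home += d_home
    home_team.away += d_home * gamma  # partial carry-over to the other venue
    away_team.away += d_away
    away_team.home += d_away * gamma
    return predicted


madrid, atletico = PiTeam(), PiTeam()
pi_update(madrid, atletico, goals_home=2, goals_away=0)
print(madrid, atletico)
```

The `gamma` carry-over is the design choice worth noting: a home result mostly teaches you about home form, but it still says something about the team away.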

Strength: captures the home-field heterogeneity Elo misses. La Liga's home advantage averages 0.35 goals per match; Pi-Ratings makes that gap explicit.

Weakness: cross-league comparison is harder. A Pi-Rating of 1.5 in the Premier League means something different from 1.5 in MLS. Goalence handles this with league-specific calibration.

xG: The Shot-Quality Lens

xG (Expected Goals) is the youngest of the three. It entered the mainstream around 2015. Instead of looking at results, it looks at chance quality. Every shot gets a probability based on its location, angle, body part, and match state. A 0.30 xG shot is one most forwards would convert about 30% of the time.

Strength: measures the process, not the final score alone. A team that generates 2.5 xG but loses 0-1 is unlucky; xG flags this.
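How unlucky is "2.5 xG, zero goals"? A small sketch, assuming each shot is an independent chance with its own conversion probability. The shot values below are made up so that they sum to 2.5.

```python
import math


def total_xg(shots: list[float]) -> float:
    """Team xG is the sum of the per-shot probabilities."""
    return sum(shots)


def prob_no_goal(shots: list[float]) -> float:
    """Chance that every shot misses, treating shots as independent."""
    return math.prod(1.0 - p for p in shots)


# Hypothetical shot list for one team in one match.
shots = [0.30, 0.45, 0.10, 0.65, 0.25, 0.35, 0.40]
print(round(total_xg(shots), 2), round(prob_no_goal(shots), 4))
```

Under these assumptions the blank comes out at roughly 3.5%, which is the kind of gap between process and scoreline that xG is built to expose.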

Weakness: it doesn't directly say who wins. You still need a method to convert xG into a scoreline probability. That's where Poisson comes in.
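As a sketch of that conversion step: given an expected goal count (lambda) for each side, two independent Poisson distributions give a probability for every scoreline, and summing over scorelines gives home win, draw, and away win. The lambdas of 1.6 and 1.1 and the 10-goal cap are illustrative assumptions; independence between the two sides is a simplification.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probs(lam_home: float, lam_away: float, max_goals: int = 10):
    """Return (home win, draw, away win) from independent Poisson goal counts."""
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    total = home + draw + away  # renormalise the tiny mass cut off by max_goals
    return home / total, draw / total, away / total


p_home, p_draw, p_away = outcome_probs(1.6, 1.1)
print(round(p_home, 3), round(p_draw, 3), round(p_away, 3))
```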

Goalence's Choice

At the core, we use Pi-Ratings + Poisson because the combination handles home-field correctly, updates after every match, and converts cleanly into outcome probabilities. xG shapes the lambda values, but the rating engine is Pi-Ratings.

Is this the best system? Honest answer: no system is.

We publish live, forward-tracked hit rates on our Stats page: every prediction is logged before kickoff and scored against the real result. We reset the counter on 24 May 2026, so the early sample is small and honest rather than impressive. The results so far are comparable to ensemble approaches, but what matters most is the forward-tracking itself.

None of these models predicts the future perfectly. They make the unpredictable a little less so.

Which One Is Right?

Before a match, Pi-Ratings says 50% for the home side, Elo says 55%, xG-to-result says 52%. A five-point gap between them. Which is correct?

The answer: none of them is definitively right. All three are estimates. What matters is long-run calibration.

After 100 matches: if Pi-Ratings marked those matches at 50% and the actual home-win rate was 50%, it's calibrated. If Elo said 55% but the real rate was 50%, Elo was too optimistic in those games.
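The check described above can be sketched in a few lines: group forecasts into probability bins, then compare the average forecast in each bin with the share of matches the home side actually won. The 100-match sample below is synthetic and mirrors the example in the text; it is not Goalence data.

```python
from collections import defaultdict


def calibration_table(forecasts: list[float], outcomes: list[int], n_bins: int = 10):
    """Return (mean forecast, observed rate, count) for each non-empty bin."""
    bins = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # keep p == 1.0 in the top bin
        bins[idx].append((p, y))
    rows = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_p = sum(p for p, _ in pairs) / len(pairs)
        rate = sum(y for _, y in pairs) / len(pairs)
        rows.append((mean_p, rate, len(pairs)))
    return rows


# Synthetic sample: 100 matches, 50 home wins.
outcomes = [1] * 50 + [0] * 50
pi_rows = calibration_table([0.50] * 100, outcomes)
elo_rows = calibration_table([0.55] * 100, outcomes)
print(pi_rows, elo_rows)
```

A calibrated model has mean forecast and observed rate close together in every bin; here the 0.55 forecasts sit about five points above the observed 0.50.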

Goalence's dataset logs every forecast. In a year's time, we will be able to say clearly which system has been better calibrated.

Frequently asked questions

Why doesn't Goalence use Elo?

Elo treats every team as a single rating, ignoring home/away asymmetry. In a league like La Liga where home advantage averages 0.35 goals, Elo loses meaningful signal. Pi-Ratings handles this natively.

Is xG more accurate than Pi-Ratings?

Different question, different answer. xG measures process (chance quality); Pi-Ratings measures team strength updated by results. They're complementary — Goalence uses both: xG to inform lambda values, Pi-Ratings to derive base strengths.

What is forward-tracking?

Every Goalence prediction is logged before kickoff and compared against the actual result after full time. No back-fill, no 'what we would have predicted'. The accuracy figure on /methodology reflects exactly that.
