Pi-Ratings vs Elo vs xG: Three Models, One Football Match
Tournaments·May 28, 2026·4 min read·Goalence Editorial

Three statistical lenses for the same 90 minutes. Each picks up a different part of the truth — Goalence picks one, but here's why all three matter.

The Three Pillars of Football Modelling

If you want to predict a football match, there are three statistical traditions to choose from: Elo (the chess-rating veteran), Pi-Ratings (the home/away specialist), and xG (the shot-quality measure). Each one captures part of the truth; none captures all of it.

Goalence uses Pi-Ratings + Poisson at the core, but every prediction we ship is influenced by all three traditions. Here's how they compare.

Elo: The Chess Method, Borrowed

Elo was invented for chess in the 1960s. A single number per player. Win, your number goes up; lose, it goes down. The amount of movement depends on whether the result was expected — beating someone 200 points above you is worth more than beating someone 200 points below.
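To make the "expected vs unexpected" idea concrete, here is a minimal sketch of the standard logistic Elo update. The 400-point scale is the conventional Elo constant; the K-factor of 20 and the example ratings are illustrative assumptions, not values taken from Goalence.

```python
def elo_expected(rating_a: float, rating_b: float) -> float:
    """Expected score (between 0 and 1) for A against B on the usual 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return both new ratings. score_a is 1.0 for a win, 0.5 for a draw, 0.0 for a loss.

    k is an assumed K-factor; real systems tune it.
    """
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Beating a side rated 200 points higher moves the rating more
# than beating a side rated 200 points lower.
underdog_gain = elo_update(1500, 1700, 1.0)[0] - 1500
favourite_gain = elo_update(1700, 1500, 1.0)[0] - 1700
print(round(underdog_gain, 2), round(favourite_gain, 2))
```

The update is zero-sum: whatever one side gains, the other loses.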

For football, basic Elo treats every team as one rating. Strength: simple, robust, well-understood. Weakness: a single number cannot express how much a given team depends on playing at home, and football's home-field advantage is substantial. Football adaptations often add a fixed home bonus, but that bonus is the same for every team. Real Madrid at the Bernabéu and Real Madrid in Atlético's away dressing room aren't really the same team, and one Elo number can't tell you that.

Pi-Ratings: One Team, Two Numbers

Pi-Ratings (Constantinou & Fenton, 2013) takes the Elo idea and splits each team into a home rating and an away rating. Two numbers per team. Real Madrid at home gets one rating; Real Madrid away gets another. After every match, both ratings update based on the actual result vs the pre-match expectation.
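The two-numbers-per-team idea can be sketched as follows. This is a simplified illustration in the spirit of the published formulation, not Goalence's production code: the class and function names are hypothetical, and the constants (`b`, `c`, the learning rate `lam`, and the cross-venue factor `gamma`) are assumed values chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class PiRating:
    home: float = 0.0  # strength when playing at home
    away: float = 0.0  # strength when playing away


def expected_goal_diff(rating: float, b: float = 10.0, c: float = 3.0) -> float:
    """Map a rating to an expected goal difference against an average side."""
    return math.copysign(b ** (abs(rating) / c) - 1.0, rating)


def update(home_team: PiRating, away_team: PiRating, goals_home: int, goals_away: int,
           lam: float = 0.035, gamma: float = 0.7, c: float = 3.0) -> None:
    """Update both teams in place after one match (all constants are assumptions)."""
    predicted = expected_goal_diff(home_team.home) - expected_goal_diff(away_team.away)
    error = (goals_home - goals_away) - predicted
    # Log-dampened error: a 5-0 thrashing counts for more than a 1-0 win,
    # but not five times more.
    psi = math.copysign(c * math.log10(1.0 + abs(error)), error)
    d_home, d_away = psi * lam, -psi * lam
    home_team.home += d_home
    home_team.away += gamma * d_home   # home result partly informs the away rating
    away_team.away += d_away
    away_team.home += gamma * d_away   # and vice versa for the visitors


hosts, visitors = PiRating(), PiRating()
update(hosts, visitors, goals_home=2, goals_away=0)
print(hosts, visitors)
```

The venue that was actually played moves the most; the other rating follows at a reduced rate, which is how both numbers stay current even though each match is only played at one ground.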

Strength: captures home-field heterogeneity that Elo misses. La Liga's home advantage averages 0.35 goals — Pi-Ratings makes that gap explicit.

Weakness: harder to compare across leagues. A Pi-Rating of 1.5 in the Premier League means something different from 1.5 in MLS. Goalence handles this with league-specific calibration.

xG: The Shot-Quality Lens

xG (Expected Goals) is the youngest of the three; it became mainstream around 2015. Instead of looking at results, xG looks at the quality of chances. Every shot gets a scoring probability based on its location, angle, body part, and game state. A 0.30 xG shot is one that an average finisher would be expected to score about 30% of the time.

Strength: captures process, not just outcome. A team that creates 2.5 xG but loses 0-1 is unlucky; xG flags this.
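How unlucky is "2.5 xG and no goals"? A rough sketch: sum the per-shot probabilities to get match xG, and multiply the miss probabilities to get the chance of drawing a blank. The shot values below are made up for illustration, and treating shots as independent is a simplifying assumption.

```python
from math import prod


def match_xg(shot_probs: list[float]) -> float:
    """Match xG is the sum of the individual shot probabilities."""
    return sum(shot_probs)


def prob_no_goals(shot_probs: list[float]) -> float:
    """Chance that every shot misses, assuming shots are independent."""
    return prod(1.0 - p for p in shot_probs)


# Hypothetical shot list that adds up to 2.5 xG.
shots = [0.30, 0.45, 0.10, 0.05, 0.60, 0.25, 0.35, 0.40]
print(round(match_xg(shots), 2), round(prob_no_goals(shots), 3))
```

With these invented shots, the team fails to score only about 4% of the time, which is the kind of gap between process and outcome that xG is designed to surface.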

Weakness: doesn't directly tell you who wins. You still need a way to turn xG into a scoreline probability — and that's where Poisson comes in.
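A minimal sketch of that conversion step: given an expected-goals rate (lambda) for each side, an independent Poisson model yields a probability for every scoreline, which can be summed into home win, draw, and away win. The lambda values here are hypothetical, and independence between the two teams' goal counts is an assumption of this simple version.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the expected number is lam."""
    return exp(-lam) * lam ** k / factorial(k)


def outcome_probs(lam_home: float, lam_away: float, max_goals: int = 10):
    """Sum scoreline probabilities into (home win, draw, away win).

    Scorelines above max_goals per side are ignored; for typical football
    lambdas the truncated mass is negligible.
    """
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away


# Hypothetical lambdas: the home side expected to score 1.6, the visitors 1.1.
p_home, p_draw, p_away = outcome_probs(1.6, 1.1)
print(round(p_home, 3), round(p_draw, 3), round(p_away, 3))
```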

Goalence's Choice

We use Pi-Ratings + Poisson at the core because the combination handles home-field correctly, updates after every match, and converts cleanly into outcome probabilities. xG informs the lambda values, but the rating engine is Pi-Ratings.

Is this the best system? Honest answer: no system is. Our forward-tracking accuracy sits around 58% on resolved matches — better than pure Elo (~54%) and pure xG-to-result (~55%) in the same window, comparable to ensemble approaches. The forward-tracking is what matters: we log every prediction before kickoff and compare against the actual result.
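The forward-tracking rule can be expressed in a few lines. This is a sketch under assumed names: the `LoggedPrediction` record and its fields are hypothetical, not Goalence's actual schema. The point is that a prediction only counts if it was logged before kickoff and the match has a result.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class LoggedPrediction:
    logged_at: datetime            # when the prediction was written to the log
    kickoff: datetime              # scheduled kickoff time
    predicted: str                 # "H", "D" or "A"
    actual: Optional[str] = None   # filled in after the match; None if unresolved


def forward_accuracy(log: list[LoggedPrediction]) -> Optional[float]:
    """Share of correct calls among resolved matches predicted before kickoff."""
    eligible = [p for p in log if p.actual is not None and p.logged_at < p.kickoff]
    if not eligible:
        return None
    return sum(p.predicted == p.actual for p in eligible) / len(eligible)


kickoff = datetime(2026, 5, 1, 20, 0)
sample_log = [
    LoggedPrediction(kickoff - timedelta(hours=2), kickoff, "H", "H"),   # correct
    LoggedPrediction(kickoff - timedelta(hours=2), kickoff, "D", "A"),   # wrong
    LoggedPrediction(kickoff + timedelta(hours=1), kickoff, "A", "A"),   # logged late: excluded
    LoggedPrediction(kickoff - timedelta(hours=2), kickoff, "H", None),  # unresolved: excluded
]
print(forward_accuracy(sample_log))
```

Excluding late entries is what keeps the figure honest: a "prediction" written after the whistle cannot inflate the score.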

None of these models predict the future perfectly. They make the unpredictable a little less so.

Tags

Pi-Ratings, Elo, xG, methodology, prediction models

Frequently asked questions

Why doesn't Goalence use Elo?

Elo treats every team as a single rating, ignoring home/away asymmetry. In a league like La Liga where home advantage averages 0.35 goals, Elo loses meaningful signal. Pi-Ratings handles this natively.

Is xG more accurate than Pi-Ratings?

Different question, different answer. xG measures process (chance quality); Pi-Ratings measures team strength updated by results. They're complementary — Goalence uses both: xG to inform lambda values, Pi-Ratings to derive base strengths.

What is forward-tracking?

Every Goalence prediction is logged before kickoff and compared against the actual result after full time. No back-fill, no 'what we would have predicted'. The accuracy figure on /methodology reflects exactly that.
