Reflections on 1,200-Game Objective Ref/VAR Study; Additions, Updates and More Shocking Findings

It gets worse. (Free read.)

Paul Tomkins

Oct 04, 2023

Referee: Cheers mate!

VAR: Thank you mate!

Referee: Well done boys! Good process!

It really is too grim; too incompetent; too unprofessional; too depressing.

And with no attempt whatsoever to remedy the situation and help Liverpool out of the hole into which their incompetence had shoved the Reds. Well done boys! Good process!

(Reminiscent of the scene on the Titanic. “No iceberg ahead!” - “Well done. Good process!”)

Anyway.

Another thing rarely mentioned is that Diogo Jota’s red card would also not have happened but for the wrongly disallowed goal, as the game would have been 1-0 to Liverpool, and thus unfolded from the centre-circle. (After which Liverpool could have won 1-0, or lost 3-1, or won 2-1, or lost 5-1; but everything that followed was a direct consequence of Liverpool, already harshly down to 10 men, having the redress of a good goal chalked off. The removal of the goal only added pressure to Liverpool; more of a mountain to climb, and maybe Jota doesn’t even come on if the Reds are still 1-0 up, but a midfielder or defender instead. But that’s impossible to prove, other than to say that the decision ruined the Reds’ chances of a fair result, courtesy of some drunk-sounding blokes shouting over each other and calling everyone ‘mate’ and ‘Oli’, to the point where it’s almost an outtake from ‘Airplane!’.)

Anyway.

This is the third and hopefully final instalment analysing the data I spent months collecting, collating and despairing about.

If you want to hear me talking about a bit of the study, I did so on Anfield Index Media Matters. (I don’t really do podcasts anymore, for various reasons, but I made an exception to get this noise out of my head.)

The two previous versions – the full study (brought forward in light of the Spurs fiasco) and a version that’s a quarter of the length – can be accessed for free below, before I sum up and round off and blow up.

Homer Sapiens: Refs, VARs, and Proving Clear and Obvious Biases and Fallibilities (Declassified Data Dossier)

Paul Tomkins

October 1, 2023

(Image from this This Is Anfield article.) *Declassified by me. Eight full years of refereeing data, four years of VAR. The four best teams, 1,200 games. The data is in. It’s never been clearer: referees (and even VARs) often don’t give decisions as to what the correct call

And that has all the context, albeit a few new things will be in this article.

Next, the shorter version:

[HIGHLIGHTS] Refs, VARs, and Proving Clear and Obvious Biases and Fallibilities TL;DR Version

Paul Tomkins

October 2, 2023

(Apparently the VARs drew the lines on the image above. They must have used invisible ink pixels.) Abridged! The trouble with shorter articles with lots of data covering a lot of different aspects of analysis is that you don’t get to discuss the nuances, and you get hammered for not explaining more, or people misunderstand the data or graphs.

If you have any issue with the shorter versions and would like more clarification, it may be in the original 16,000-word study.

The Trilogy: End Game

So, having done my final refereeing study, there are still a few things to clarify, and one bit of a bombshell to safely detonate before I step away from the explosive topic.

Let's start with a fact that I’ve been using for a while (courtesy of Andrew Beasley’s database), but which I’ve put into a new context below:

Game-minutes since last 2nd-yellow to a Liverpool player: 21 (PL)*
Game-minutes since last 2nd-yellow to a Liverpool opponent: 27,090 (PL)**

* Excluding around 10 minutes of stoppage/added time
** Excluding around 1,800 minutes of stoppage/added time

That's over TWENTY-SEVEN THOUSAND MINUTES of Liverpool matches since Sadio Mané was sent off for Southampton in October 2015. (Or nearly 30,000 with added time.)
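As a sanity check on those minute counts, here is a minimal sketch that converts game-minutes into matches. It assumes a flat 90 minutes of regulation time per game, and the variable names are mine, not anything from Andrew Beasley's database.

```python
REGULATION_MINUTES = 90  # assumption: every game counted as a flat 90 minutes

def minutes_to_games(minutes: int) -> float:
    """Convert regulation game-minutes into a number of full matches."""
    return minutes / REGULATION_MINUTES

opponent_minutes = 27_090   # regulation minutes quoted above
stoppage_minutes = 1_800    # "around 1,800" from the footnote

games = minutes_to_games(opponent_minutes)
total_minutes = opponent_minutes + stoppage_minutes
stoppage_per_game = stoppage_minutes / games

print(games)                        # number of full matches in the run
print(total_minutes)                # regulation plus stoppage minutes
print(round(stoppage_per_game, 1))  # implied average stoppage per game
```

The implied stoppage allowance works out at roughly six minutes a game, which is why the footnote's 1,800 minutes takes the run to "nearly 30,000".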

The odds of this happening randomly, without any thumb on the scales, are astronomical, given that 444 opposition players have been booked in that time (fewer than the opposition players booked against rivals, mind).

Are you telling me that not once, in 301 games and almost 30,000 minutes of football, did a single one of those 444 players do anything to merit a 2nd yellow?

How many times did a ref, in contrast to Simon Hooper with Jota, stop himself and think “okay, one more warning”?

And you're asking me, with this being just one example, to not ask questions of such inexplicable data?

Of Premier League teams since 2015, Swansea City at 76 games have the longest run without an opponent ever receiving a second yellow. Crystal Palace, Spurs and West Ham are now into double figures for opponent 2nd-yellows in that time, between 11 and 14 each; while Arsenal and Chelsea have nine apiece, and the Manchester clubs five and six.

It does show that title-chasing teams (City and Liverpool) are treated more harshly when it comes to 2nd-yellows to opponents (another example of referees being too scared to influence things?); but City still have 5x as many, in addition to many, many more penalties.

Liverpool are treated more harshly than other clubs in almost every single officiating metric. Fewer penalties. Fewer yellow cards to opponents. Fewer VAR interventions. And so on.

And when weighted for xG for and against, it's even more stark.

Very Abnormal (and No, I Don’t Just Mean Me)

The above is one of many stats I've been using to show how radically differently Liverpool are officiated to other clubs.

And that's before getting onto the more recent and alarming trends from officials since they seemed to circle the wagons in regard to the blowup with Jürgen Klopp in April.

As I will show in this roundup and clarification piece, the 12 league games since have seen “abnormal” refereeing patterns.

A key finding of my study was that, objectively, the four best/most normal refs treated Liverpool fairly, overall, in the 100-or-so games they did.

The problem was the almost 200 games when the refs were anyone but Michael Oliver, Anthony Taylor, Andre Marriner and Kevin Friend.

And the biggest problem was the c.100 games done by the refs who, as a whole, did Liverpool the least frequently, due to being rookies, about to retire, or just not very good.

I’ll explain more later, but let’s see if the following scatterplot has anything outlying on it:

We have a cluster of all the clubs with all the refs (88 combinations), and if you focus on those who have done 18 or fewer Liverpool games, you get the average above, in the bottom right corner.

Basically, it’s like the solar system complete with moons … and Liverpool are in Andromeda.

Remember, I’m not just randomly grouping refs together.

Those are all the ones with between 1 and 18 games. In almost 100 games, those refs treated Liverpool essentially like they were in the relegation zone. But more on this later.

Remember, where the officials can tip the scale with their thumb is in the tighter calls. Remember the pressure officials are under at Anfield, where they often try too hard to show they’re not afraid of the Kop.

I've referred to circumstantial evidence before, but the way it works is in terms of accumulation.

Any single incident can be written off. A few can be written off. But it reaches a tipping point, where it becomes a case to answer.

Patterns in data are less easily dismissed, once past the point of randomness. A few random patterns are natural; a cluster of odd patterns stops being random.

For every super-outlying piece of data, like almost 30,000 playing minutes since a second yellow for a Liverpool opponent, there is the multiplication of unlikeliness that follows when factoring in the next series of odd data.

But my work can be dismissed by bad-faith accusations, and rival fans who think Liverpool fans always think they're the victims; plus Liverpool fans or club-beat writers in big media positions who know they can’t make a big deal as they try too hard to look impartial (a bit like Manc refs at Anfield and Scouse refs at Old Trafford and the Etihad). Any time you even mention that there may be something iffy, you’re treated like a leper.

The fact that all fans moan about their clubs’ decisions means no actual analysis is ever taken seriously. (And the fact that ‘neutrals’ when it comes to Liverpool, as with a couple of other big clubs, are often anti-Liverpool.)

There’s an instant “you’re crazy” tag given to anyone who looks into officiating patterns, as if everyone is too scared to go there.

It’s why I keep asking people to get the data themselves from places like Transfermarkt and do the double-checking, by going back through the seasons and viewing the expanded view of each referee for each club’s games.

(Interestingly, the Transfer Price Index I co-created with Graeme Riley in 2010 to track Premier League transfer inflation was called a nonsense at the time by some journalists, including one who does a lot of work now on VAR analysis. I've since seen many people and organisations copy our TPI model, or as recently as this year, present the idea as their own, 13 years on.)

It's not about overarching nefarious conspiracies (an equally stupid accusation used against people pointing out serious discrepancies, to discredit them, given that so many big conspiracy theories are often wild and insane), but often about all the ways any 50/50 can be given against the target without anyone noticing until they're all added up (which no one ever does).

For any discussion of iffy refereeing data, we need to establish the reflexive reactions people have, due to their own biases, anti-biases and possibly low IQs.

People even have black-or-white thinking on conspiracies. Either they’re espoused only by whackos; or anything and everything is real, including the moon being made of cheese. In truth, there are so many mid-level and low-level conspiracies; very few mega-conspiracies.

Mid-level and low-level conspiracies? Pettiness. Vendettas. Arse-covering. Biases. Nepotism. Cronyism. Jobs for mates. Favouritism. Hiding mistakes. Closing ranks to protect your own. Going rogue. ‘Phoning it in’ ... and so on.

None need be massive conspiracies. All can be corrupting, and result in a lack of integrity. All can be piled one upon the other, sometimes as a way to cover tracks.

All of these could occur within one single, small to medium-sized organisation, even if it never need be overtly stated (just often enforced by inference, hints, and punishment).

No nefarious grand plan, no grand conspiracy, just a constant drip of compromised positions.

Has it not been seen in the bad police forces for decades?

Not all police forces are toxic, and we clearly need the police, just as we need refs.

But there’s a legitimate need to weed out bent coppers and reform corrupt forces, and there’s a need to employ quality referees, who don’t let biases get in the way, and who don’t overreact with extreme anti-biases, which I’ll prove below.

(They can remain human and flawed, but there are levels beyond which it’s incompetence, favouritism, cronyism, etc., not least in the rare admittance of the truth by Mike Dean in saying he didn’t call Anthony Taylor over to the monitor as he’s a mate. And people think this is some rare exception; the one aberration, like the one single bent cop in the entire history of the world.)

In the many British true crime books I’ve read covering from the 1940s to the 1960s, no one ever believed the police could be corrupt. It’s quite startling, in how people simply refused to believe it was possible.

It was taken as 100% gospel that the British Bobby and the finest from Scotland Yard were as honest as Jesus Christ. Some may have been; but others weren’t. That kind of lack of scrutiny and blind trust led to astonishing corruption by the 1960s and beyond.

I’m not saying that PGMOL is that bad – just that blind trust of the integrity of English referees (who are, like the police 70 years ago, treated as weirdly saintly in terms of their integrity) is also incredibly naive.

We need to be able to trust referees, and they’re not helping.

Yes, incompetence is often mistaken for corruption, but it can often disguise it, too.

For officials to be working in the UAE 48 hours before a game is not necessarily gold-bars-in-the-5-star-hotel-bedroom corrupt, even if it's the home of the Manchester City owners (a club still facing 115 charges that include millions in illegal payments), and that region is sportswashing the hell out of European football; but it is corrupting their ability to do their job properly at the weekend.

And who benefits? Manchester City. That’s likely a coincidence in this case.

Oliver Kay wrote a good piece on the optics of the refs’ extracurricular activities, but when financial chicanery is already part of the charge book (and they’re being paid handsomely to ref in Saudi and the UAE), we have a right to be suspicious beyond optics.

Imagine the outrage if, two days before a big game involving Liverpool's rivals, the refs spent time at the Boston Red Sox, learning about officiating in other sports, then made the 10-hour flight back just in time for the game. They'd go nuts.

And rightly so. And that's with no accusations or charges of FSG making various secret payments that run into millions of pounds.

(As it is, refs from Merseyside are almost all exclusively barred from doing Liverpool games, even if they don’t support the Reds, while a raft of refs from Manchester do both Manchester teams; which may not mean much, but again, seems odd. And it may actually work against Manchester clubs, too.)

People will use confirmation bias to prove their point, and confirmation bias to disprove it, based on single incidents, or clusterfucks like at Spurs.

But I'm talking eight years of deep data (in this case focusing on what I call the Main Four, of Liverpool, City, Chelsea and Man United as the best four teams 2015-2023, and 1,200 games, separated into game difficulty, and home and away).

And while I wasn’t actively seeking to do so, it didn't need hindsight to see the probability of the following things happening in games this season when looking at a referee’s data; all have come to pass, with one exception.

It was reasonably likely that:

Stuart Attwell would give two tight Big Decisions to the Main Four home team (he did so in the Man United vs Forest game). Why? Of 25 Big Decisions he's given out in games involving the Main Four since 2015, a staggering 20 have gone to the home team, and around 80% have gone to the Main Four team.
A referee from Manchester would give Liverpool a penalty at Anfield, as they do so as a collective more than almost any individual referee (result: Chris Kavanagh did so for the obvious takeout of Mo Salah.) Same pattern applies to Scouse refs at Manchester stadia, where they are more generous than at Anfield. (Even Paul Tierney, after a dozen games at Anfield with only a penalty Against the Reds, finally relented and gave Liverpool two penalties in the spring, albeit both were stonewall and immediately followed his assistant attacking a Liverpool player.)
Simon Hooper would not make a Big Decision in a Main Four game (and he did not do so in the Man United vs Wolves game, and was dropped for the next game*.) In 18 games involving Main Four sides, he'd given just one Big Decision (the usual referee rate is one Big Decision, For or Against, every 2.7 games). So based on that, he was not going to award Wolves a penalty, and while accidental, as with Curtis Jones (albeit Jones got some of the ball, and his foot bounced up off the side of the ball and onto the top), you'd suspect that a keeper punching someone in the face when late to the ball is worthy of a three-match ban as well for excessive force and endangering an opponent, in addition to being a stonewall penalty.
Anthony Taylor would send off Rodri. A Mancunian ref at Manchester stadia is generally more likely to make Big Decisions Against them, unless the title is on the line (2019!). Taylor sending off Rodri was predictable in his data, where he has now given more Big Decisions Against City at the Etihad (6) than For (5). Away from home, however, his balance is 3:1 in City's favour, which is better than expected away, to go with worse than expected at home (a reversal of his Liverpool data: kinder than expected at Anfield, harsher away). Remember, Main Four clubs should have an overall ratio of 2:1 for, with it slightly better at home and slightly worse away.
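To put a rough number on how unusual Hooper's record is, here is a back-of-envelope sketch. The Poisson model is my assumption for illustration (it treats Big Decisions as independent events arriving at the average rate) and is not the method used in the study; only the 18 games, the one decision and the 2.7-game rate come from the figures above.

```python
import math

def poisson_cdf(k: int, mean: float) -> float:
    """P(X <= k) for a Poisson-distributed count with the given mean."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))

games = 18                # Hooper's Main Four games
games_per_decision = 2.7  # usual rate: one Big Decision every 2.7 games
observed = 1              # Big Decisions he actually gave

expected = games / games_per_decision           # decisions an average ref would give
p_at_most_observed = poisson_cdf(observed, expected)

print(round(expected, 2))
print(round(p_at_most_observed, 4))
```

Under those assumptions an average ref would have given six or seven Big Decisions, and the chance of giving one or none is in the region of 1 in 100. It is a crude model, but it shows why his inactivity stood out in the data.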

To take it back further, Mike Dean, the only Merseyside ref to do Liverpool in the past eight years (and even then, only from 2020 onwards, really, and only in minor games), never gave Liverpool a Big Decision at Anfield. He gave two Against.

While undoubtedly regularly called a Scouse wanker, he was a super-Homer to Manchester clubs in Manchester, where in 21 games he gave 10 Big Decisions For the Manchester clubs, and only two Against, at a massively skewed ratio of 5:1.

But in 34 away games (he was sent away more than to Manchester), Dean gave a combined nine Big Decisions Against the Manchester clubs, to compare with those two in Manchester. Away he gave ten Against United and City, to mean their ratio was minus away, when it should still have been positive. (He split his decisions For and Against almost equally involving the Manchester clubs away.)

Similarly, Anthony Taylor gives far more Big Decisions to Liverpool at Anfield than away.

I'll now call this Reverse-Polarity Bias, where Scouse and Manc refs are super-generous (bar someone like Paul Tierney to Liverpool) in the belly of the beast, and, away from the rival city's stadia, far more punitive than expected.

Are the PGMOL aware of how distorted Manc/Scouse refs’ data is home vs away? I expect not. But it’s clearly abnormal, and it’s clearly an indication of crowd-pleasing in hostile arenas, followed by some possible payback away from home.

The above trends may well be due to mere human failings – albeit in Tierney's case there is a long-running, barely contained seething between him and Jürgen Klopp.

But these failings cost teams points, maybe even league titles. It’s time to stop Manchester and Merseyside refs doing Manchester and Merseyside teams, albeit as so many refs are from Manchester and Merseyside, it’s a mess.

In Oliver Kay’s piece, he notes:

It has become increasingly clear that some officials regard VAR duty as arduous. Mike Dean, who retired from refereeing at the end of the 2021-22 season, spoke recently of “getting into the car on Friday and dreading Saturday” when he was in the offices at Stockley Park. “I was thinking, ‘I hope nothing happens,’” Dean told Simon Jordan’s Up Front podcast. “I used to be petrified sitting in the (VAR) chair.”

If Dean felt that way, why the hell was he doing that job?

As with his honesty about his dishonesty (protecting his mate Anthony), it’s hugely revealing.

Maybe it shows that it’s a tough job, but if he found it harder than refereeing to actually coldly look at many angles of footage, that’s alarming. He was experienced and not afraid of the limelight. Was it the constant fear of having to overturn a mate?

If so, that calls the whole process into question; as you would expect, with mates protecting mates, and other forms of cronyism (well intentioned or not). They’re a bunch of matey blokes who have each other’s backs (which is both noble and corrupt), and protecting each other is more important than an honest decision.

The Overreacting-Ref Reflex

So, let’s go back a few weeks.

Simon Hooper never gives Big Decisions in big games. The biggest big-game Big Decision bottler there is.

That is, until he gets demoted after the Wolves debacle at Old Trafford, doing Manchester United on the opening weekend, in a higher-profile game due to its timing.

Hooper had never given a Big Decision in a Liverpool game. But Liverpool at Spurs was up next for him, in terms of big games.

Now, this would seem to suggest that there was no way Hooper was going to fail to make a Big Decision, having been dropped for not making one.

Of course, Darren England, who had only made two non-handball subjective interventions in almost 50 games as a VAR in Main Four games, helped to stitch Hooper up.

As happens in football, it's the next club in the firing line that gets punished.

(Of course, Hooper then did Liverpool right after his demotion, as often happens. But only against Aston Villa at home, and Liverpool quickly took the lead and controlled the game. He then did Everton at home to Arsenal, where Gary Neville also questioned the very wonky VAR lines, and the weird camera angle.)

Result of the Onana-gate demotion?

Spurs vs Liverpool. Suddenly, red cards everywhere. Having been told at half-time that a legitimate Liverpool goal had been ruled out, his response was to give Diogo Jota as many yellow cards as he could manage in the next few minutes. This was a man under pressure to act.

While the 2nd was a clear yellow card, we've seen hundreds and hundreds of incidents where refs, knowing of a serious earlier error, show some leniency to partially even things up; and while it’s not correct (two wrongs do not make a right), it usually happens to the point where you can expect it.

(And he'd already been forced to send off Curtis Jones after Darren England primed him with the worst angle, when refs are supposed to be reviewing in real time. Again, Hooper, having been demoted over Onana-gate, was being psychologically forced to make that a red card. Primed with the worst possible still-frame, and under pressure to make Big Decisions, he did just that.)

We know this. We know this because it has happened to Liverpool opponents in the last 301 games, countless times. Go on, one more warning.

He also didn't know that the first yellow was not even a foul, but then again, that's another mistake.

Instead, Jota was off, and before that, the VARs did not look at the clear taking out of Joe Gomez in the box.

Again, far more of a foul, with heavier contact, than those Liverpool players are being sent off for right now.

So, Simon Hooper now has three Big Decisions in 20 Main Four games. (The other being a penalty to Man City away at Fulham last season.)

Officials often have a choice: the orange card, as it were. Don’t quite give a red. If they keep giving some clubs a yellow and some clubs a red for almost identical borderline calls, that’s a kind of corruption that can be hard to notice in the moment. It may not be conscious, but it happens.

The data suggests this happens against Liverpool, but more about what the opposition are allowed to get away with, which means they never get the second yellow, and less frequently are punished with a penalty.

A main new finding is just how skewed the officiating has been since the Klopp/Tierney/John Brooks blowup at Anfield in April, which itself came soon after an official who is joined at the hip with Tierney ELBOWED a Liverpool player.

(I still can’t believe more would not have been made of this if it had been Marcus Rashford, Raheem Sterling, Harry Kane or various other England stalwarts, and not the Scotland captain whose narkiness can annoy opposition fans, so, basically, fuck him. Roy Keane almost said as much, and can you imagine what would have happened if a linesman had elbowed him? That linesman’s nose would now be popping out of his own arse.)

I looked at 88 ref/club combinations since 2015, and plotted them on a scatterplot that was busy due to the number of data points.

Patterns

In April, the barely hidden simmering feud between Paul Tierney and Jürgen Klopp burst into a war of words that saw Klopp suspended after berating John Brooks, the 4th official, over not one but two fouls on Mo Salah right in front of them.

What's happened since?

A lot.

And it all started a couple of games after Constantine Hatzidakis elbowed Andy Robertson, which I still think is one of the most outrageous things ever seen in English football, given that the worst violence in the opposite direction at the top level was a push by Paolo Di Canio on a referee. (Yes, that's super-serious, but these things have to work both ways.)

Anyway, that was prior to this 12-game run, once Klopp was banned.

So we have a sample of 12 league games for Liverpool; or roughly one-third of a season's worth of games.

First, either Tierney or Brooks has been the ref or VAR in a THIRD of those games.

Yes, the two refs in the three-way fight, back to do the Reds as often as possible.

In total, HALF of the games had a ref or VAR from Manchester (albeit that's actually often not a bad thing if the game is at Anfield, as my data shows; but away from home it's usually worse than normal – ~~good job this weekend's ref and VAR away at Brighton are not both from Manchester~~. Doh.)

Saturday will take it to over half of the games having Manchester input.

And if we band together the various referees' data to form one super(bad)ref, then this run of 12 games would represent the worst ref/club combo out of the 40 (from 88 overall) who have done a minimum of 12 games.

The next worst is Jonathan Moss for Liverpool, from his 19 games.

The 'expected' rate of Big Decisions in these particular 12 games (home and away, and opposition strength factored in), would be:

1.93 For, 1.08 Against.

Or, roughly 2:1.

Instead, it's 3:5.

About 55% more For than expected, but roughly 4.6 times as many Against (some 360% more than expected).
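The comparison is simple division of actual by expected. A minimal sketch, using only the four figures quoted above (the function and variable names are mine):

```python
def multiple_of_expected(actual: float, expected: float) -> float:
    """How many times the expected count the actual count represents."""
    return actual / expected

expected_for, expected_against = 1.93, 1.08  # expected Big Decisions over the 12 games
actual_for, actual_against = 3, 5            # what was actually given

for_multiple = multiple_of_expected(actual_for, expected_for)
against_multiple = multiple_of_expected(actual_against, expected_against)

expected_ratio = expected_for / expected_against  # For per Against, expected
actual_ratio = actual_for / actual_against        # For per Against, actual

print(round(for_multiple, 2), round(against_multiple, 2))
print(round(expected_ratio, 2), round(actual_ratio, 2))
```

The For:Against ratio falls from an expected 1.79:1 or so (the "roughly 2:1" above) to 0.6:1, with almost all of the swing coming from the Against column.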

Again, small sample size. Random shit happens.

But I've shown that if you add up all the small sample sizes of all the refs who have done Liverpool fewer than 20 times, you get the Grand Canyon of missing Big Decisions for the Reds, and often, a cascade of Big Decisions Against.

This is my point. Add the data up.

Four red cards? How many were clear?

Mac Allister’s was overturned not by the VAR team that day who just happen to seem to despise Klopp (Tierney and the guy who elbowed Robertson), but by an independent panel.
Jota’s came from two yellows, when one wasn’t even a foul.
Virgil van Dijk’s was away, early, in a big game, and when refs (such as Anthony Taylor and the Vincent Kompany two-footer, or Paul Tierney and the Harry Kane leg-breaker on Robertson) don’t tend to give red cards early. Accidentally and fractionally clipping the man before he clearly played the ball, it was minimal contact; technically a foul, but a red card seems harsh, as the DOGSO wasn’t 100% clear.
Curtis Jones’ was given as a yellow card. The VAR, who failed in all his other duties, put up a priming image that was the most damning possible, to sway Simon Hooper, rather than let the ref see the replay at normal speed. Only the worst replay angles were shown. Jones’ foot goes into the side of the ball, low (as you can see below), then is spun and bounces over the ball. It looks bad, but it’s a harsh red card; another orange, as it were. Compare it to Harry Kane launching two seasons ago; in this case, Jones’ back leg is on the ground.

My point is that very few refs would give all four as sendings off.

I would wager that no single ref would give all four as red cards if they all happened in different circumstances.

The odds of all four being given as reds can be added to going almost 30,000 match-minutes without a second yellow card to a Liverpool opponent. To have four red cards in seven matches but no red cards for opponents for second yellows in 301 matches defies all logic and probability.

The refs making these massive calls, with at least two of them clearly wrong? (Mac Allister and Jota.)

Experienced, top-level?

Nah, Thomas Bramall, John Brooks and Simon Hooper.

The penalty Liverpool conceded to Aston Villa last season was 100% legitimate. Ibrahima Konaté was a fraction late. John Brooks was the ref, who Klopp had been banned for berating soon before.

But the worst challenge seen in the last 12 games?

This, from the same game. Brooks and the VAR Tony Harrington (another major name in refereeing), did not send off Tyrone Mings:

I mean, again, it defies belief. It’s not even borderline.

(And unlike Diogo Jota’s high boot against Spurs a little before then, it was done with force and momentum.)

Liverpool got three stonewall penalties in the 12-game run, that were hard to deny.

It just seems that it has to be stonewall or it will be #penaltypool or #LiVARpool, two strong social media forces that put pressure on referees when it comes to Liverpool and Big Decisions; another factor to consider.

(People are fully aware of how much pressure Twitter/X creates with campaigns and agitation and protests and smears, and yet no one seems to think about how it affects referees. When it comes to Liverpool, I would suggest badly, given the various outrage hashtags like #penaltypool and #LiVARpool. We live in a world where appeasing the unthinking social media mob is often more important than truth, facts and fairness.)

Never mind that Liverpool have by far the fewest positive VAR interventions.

(Don't let facts get in the way of a good story about how Liverpool are favoured, when the data says the exact opposite).

What Liverpool do have is the most incorrect on-field offsides given at both ends. But these are factual; or they are, if the VAR isn't dicking about.

So, the referees in the 12 games, as one super(bad)ref, rank as harsh as there is.

Let’s go back to the (busy) scatterplot from the other day.

If you were to group together the 12 games, the refereeing Balance of Decisions vs Expectations would be represented by the purple dot (let’s add yet another colour to the mix!). That is worse than any single ref who has done 12 games for any of the four clubs.

And they’re worse even than the terrible Jonathan Moss, who gave Spurs a Big Decision at Anfield in the pre-VAR days by asking if the video had anything.

(And the one single ref above the line, Bobby Madley, hasn't done the Reds since before Mo Salah joined!)

But here’s where it gets super-freaky.

New Galaxy

As I noted earlier, look at the average of all the refs who have done fewer than 19 games for the Reds, and the above constellation moves into a different galaxy.

Based on the norms of the Main Four, Liverpool should have had a balance of +8.49 Big Decisions from these 93 games.

Instead, it's -6, a harmful swing of nearly 15 missing Big Decisions For the Reds.

Then look at the 112 games done by the refs who have done Liverpool between 19 and 26 times.

The balance of Big Decisions should be +10.38. Instead, it's +1, with almost 10 missing Big Decisions For the Reds.
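The "missing" Big Decisions are just expected balance minus actual balance. A small sketch of that subtraction for the two groups of refs, using only the figures quoted above; the per-game rate is my own addition to make the two groups comparable, and the names are illustrative.

```python
# (expected balance, actual balance, games) for each group of refs
groups = {
    "refs with fewer than 19 Liverpool games": (8.49, -6, 93),
    "refs with 19-26 Liverpool games": (10.38, 1, 112),
}

shortfalls = {}
for label, (expected, actual, games) in groups.items():
    missing = expected - actual   # Big Decisions "missing" versus the norm
    per_game = missing / games    # same shortfall expressed per game
    shortfalls[label] = (round(missing, 2), round(per_game, 3))
    print(f"{label}: {missing:.2f} missing, {per_game:.3f} per game")
```

On a per-game basis the shortfall from the least-used refs is nearly double that of the mid-level group, which is the point about the small samples adding up.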

The refs who have done 28-36 games are the three best refs by the Objective Ref Rater, which merely compares every ref against the average from the 1,200 games for Big Decision Ratio (so, 1.68:1 is the norm for the Main Four), Homer rating, and Big Decision Frequency (active or inactive vs the normal decision rates).

For individual refs, more games means closer to the mean; but for the gaggle of refs who have each done Liverpool fewer than 19 times, their combined average gets worse by a margin that’s literally off the charts.

These refs are ‘generous’ to Liverpool, but they are also objectively the best referees, when compared to deviation from the averages across the whole 1,200-game sample (as discussed and explained in the main study):

Michael Oliver
Anthony Taylor
Andre Marriner

But what we can see, for a fact, is that:

Inexperienced, rookie or disillusioned old refs treat Liverpool very harshly;
Mid-level refs treat Liverpool very harshly, but not as harshly (in part as they include Kevin Friend, the 4th-best ref), to counter the likes of Jonathan Moss (weird), and Martin Atkinson, who massively changed his approach to Liverpool games in 2015 following Steven Gerrard’s book, which slated him, and the arrival of Klopp;
The most experienced, senior and well-respected refs are kinder than average to Liverpool in a way that undoes only a fraction of the damage.

Conclusions From Overall Study

Liverpool only have ‘fair and balanced’ refereeing from less than a quarter of the refs to do the club in the past eight years.
The ‘better’ the refs (judged either objectively via the method I used, or by number of games in charge), the better the Reds’ Balance of Big Decisions.
There appears to be an unusually bad run of Big Decisions for the Reds since the fallout between Klopp, Tierney and Brooks last season, and Tierney (as VAR) and Brooks (as ref) have been behind some shocking decisions.
Nearly 30,000 match-minutes since a second yellow for a Liverpool opponent.

The rest, below, are repeated from the previous article, for those who missed it:

Big Decisions (red cards, penalties, second yellows) change games.
When extrapolated, basically, a Positive Balance of Big Decisions in a match comes out at virtual title-guaranteeing form; no Big Decisions For or Against is likely to see a team finish around 4th; while facing a Negative Balance within games would mean mid-table at best.
- Negative Balance: 1.441 ppg – 54.8 points pro rata over a 38-game season
- No Big Decisions: 1.963 ppg – 74.6 points pro rata over a 38-game season
- Positive Balance: 2.536 ppg – 96.4 points pro rata over a 38-game season
So Big Decisions are huge. They are indeed Huge Decisions.
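The season totals above are just points per game multiplied by 38. A small sketch that reproduces them; the mapping of each ppg figure to a balance type is an assumption read off the surrounding text (lowest ppg for a Negative Balance, highest for a Positive one), and the names are illustrative:

```python
SEASON_GAMES = 38

def pro_rata_points(points_per_game, games=SEASON_GAMES):
    """Extrapolate a points-per-game rate to a full season."""
    return points_per_game * games

# Assumed mapping of the quoted ppg figures to balance types
ppg_by_balance = {
    "negative balance": 1.441,
    "no Big Decisions": 1.963,
    "positive balance": 2.536,
}

for balance, ppg in ppg_by_balance.items():
    print(f"{balance}: {pro_rata_points(ppg):.1f} points over {SEASON_GAMES} games")
```

The gap between the two extremes is roughly 41.6 points over a season.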

In the 1,200 Main Four matches covered since 2015, the pinkish bars below show the difference between one of those clubs having a positive balance of Big Decisions within a game; no decisions at all; and a negative balance, as 38-game pro rata extrapolations.

The best two referees via my Objective Ref Rater coefficient are Michael Oliver and Anthony Taylor; the two referees generally described, subjectively, as the best. This pair are very close to the expected norm, and I had no idea who would emerge as the top two. So it’s a good sign that the model is on the right track, even if no model can capture every aspect.

Taylor has made some terrible decisions in Liverpool games, but it’s fair to point out that this Mancunian treats Liverpool ‘okay’ (albeit mostly at Anfield, in contrast to many referees who seem like they have to prove their manliness by never giving decisions to Liverpool in front of the Kop).
Those ranked 3rd and 4th are Andre Marriner and Kevin Friend respectively. (Both recently retired.)
Several of the worst-ranking refs from the model now work as VARs or for the PGMOL, training the current referees.

Manchester City, with the best xG ‘GD’ by a reasonable distance, get the best Balance of Big Decisions. This is to be expected. I have no issue with them getting the most penalties and having the best Balance of Big Decisions.

Liverpool, with the 2nd-best xG ‘GD’ by a reasonable distance, get the worst Balance of Big Decisions, at roughly 14 fewer than expected. This makes zero sense. However, when Liverpool have the best refs, the picture flips.

If Liverpool only had the refs objectively ranked as the best, or ‘most normal’, the Reds’ figures would be less freakishly bad; the worst (weakest?) refs, who do more games than the better refs, are extra-bad for Liverpool, it seems.

A finding of real interest is that, in general, Liverpudlian refs (or those from Merseyside in general) are much more likely to give a Big Decision to Mancunian clubs in Manchester, but much less likely to give a Big Decision to Liverpool at Anfield (albeit the only Liverpudlian ref who has done Liverpool is Mike Dean).

Conversely, a Liverpudlian ref is much less likely to give an away Big Decision to a Manchester club, and a Mancunian referee is much less likely to give an away Big Decision to Liverpool.

So, on this part of the study, we can say that “rival” refs are overly generous when at the home of the “enemy”, but then extra harsh on those clubs in away games, when not feeling the pressure to try and look as unbiased as possible. Normality is flipped on its head.

Whatever the reason beyond my subjective theory above, the data involving Liverpudlian and Mancunian refs in games involving Liverpool, Man City and Man United suggests an inability to referee “normally”. These are the kinds of issues I’ve been concerned about for years, including the feuds officials have that they cannot disguise (unless really forced to).

Referees who have done c.100 Main Four games since 2015 see their data cluster more tightly together around ‘normal’, with no extreme outliers – but still quite a reasonable divergence, from c. +0.3 extra Big Decisions For per game for some club/ref combos and -0.3 for others. This is roughly half the levels reached by the positive and negative outliers.

As a general fact, the home/away split for all Premier League penalties 2015-2023 is: 461 home (56.77%), 351 away (43.23%). Home clubs have the advantage of their fans, a familiar pitch, less travel, etc., and you would expect some home advantage that is not indicative of a ref being a Homer. That normal split can be said to be c.57:43.
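The c.57:43 split follows directly from the two penalty counts. A quick check, using only the totals quoted above (variable names are illustrative):

```python
home_penalties = 461
away_penalties = 351
total_penalties = home_penalties + away_penalties

# Share of all penalties awarded to the home and away sides, in percent
home_share = 100 * home_penalties / total_penalties
away_share = 100 * away_penalties / total_penalties

print(f"Total: {total_penalties}")
print(f"Home: {home_share:.2f}%  Away: {away_share:.2f}%")
```

Any ref's own home/away split can then be held up against this baseline, rather than against a naive 50:50.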

A referee will make a Big Decision in a match involving a Main Four club every 2.74 games, with the ratio, as noted, 1.68:1 in favour of the Main Four club.
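To see what those two numbers imply per game, here is a rough sketch. It assumes a flat rate across all fixtures, ignoring home/away and opponent strength, which the Objective Ref Rater does account for, so it is only a ballpark and not the study's actual calculation. All names are illustrative:

```python
GAMES_PER_BIG_DECISION = 2.74   # one Big Decision every 2.74 games
FOR_TO_AGAINST_RATIO = 1.68     # 1.68 For per 1.0 Against

decisions_per_game = 1 / GAMES_PER_BIG_DECISION
for_share = FOR_TO_AGAINST_RATIO / (FOR_TO_AGAINST_RATIO + 1)

for_per_game = decisions_per_game * for_share
against_per_game = decisions_per_game * (1 - for_share)
net_per_game = for_per_game - against_per_game

def flat_expected_balance(games):
    """Naive expected balance, assuming every game is 'average'."""
    return net_per_game * games

print(f"Net per game: {net_per_game:.4f}")
print(f"Over 93 games: {flat_expected_balance(93):.2f}")
print(f"Over 112 games: {flat_expected_balance(112):.2f}")
```

On this flat assumption the net is about 0.09 Big Decisions per game in favour of the Main Four club, which lands in the same region as the expected balances quoted for the 93-game and 112-game groups.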

At home, Big Decision likelihood For a Main Four club doubles on a trendline, from playing the best teams (0.10 extra Big Decision) to worst teams (0.20).

However, at home, the Main Four between 2015 and 2023 averaged 2.2 Big Decisions for every 1.0 against. These are the strongest teams, so should be above the general league frequencies.

Away, Big Decision likelihood For a Main Four club also doubles on a trendline, from playing the best teams to worst teams; but it starts from a lower expectancy rate (just under 0.0), and rises to an expectancy rate very similar to that for playing the better teams at home (0.10). The trendlines for home and away Big Decisions for the Main Four are absolutely parallel.
These frequencies in easier/harder games are used in my Objective Ref Rater, and to create Expected Big Decisions, against which Actual Big Decisions can be compared.
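One way to picture Expected Big Decisions is as a straight line between the endpoints quoted above. The sketch below is a simplification under stated assumptions, not the actual Objective Ref Rater: it assumes a linear trend, treats the away starting point as exactly 0.0 (the text says "just under"), and uses a hypothetical 0-to-1 "opponent weakness" scale (0 for the best opponents, 1 for the worst).

```python
HOME_START = 0.10   # extra Big Decisions For vs the best teams, at home
AWAY_START = 0.0    # assumption: the text says "just under 0.0"
SLOPE = 0.10        # same rise for both, since the trendlines are parallel

def expected_extra_big_decisions(opponent_weakness, home):
    """Expected extra Big Decisions For in one game (linear sketch)."""
    start = HOME_START if home else AWAY_START
    return start + SLOPE * opponent_weakness

def balance_vs_expected(fixtures, actual_balance):
    """Actual balance minus the summed expectation for a list of fixtures.

    fixtures: list of (opponent_weakness, home) tuples (hypothetical inputs).
    """
    expected = sum(expected_extra_big_decisions(w, h) for w, h in fixtures)
    return actual_balance - expected

# Hypothetical run of four games, purely for illustration
example_fixtures = [(0.0, True), (1.0, True), (0.5, False), (1.0, False)]
print(balance_vs_expected(example_fixtures, actual_balance=0))
```

A negative result means the ref gave fewer Big Decisions than that set of fixtures would lead you to expect.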

Only five referees out of the 22 to officiate Liverpool games since 2015 have given Liverpool a positive Balance of Big Decisions vs expectations. These just happen to include the four ranked objectively as the most ‘normal’ referees (Oliver, Taylor, Marriner and Friend); plus Bobby Madley, who hasn’t done a game for the Reds since 2017 due to suspension for inappropriate behaviour.

For the other three Main Four clubs, at least ten referees have given the club a positive Balance of Big Decisions vs expectations. While the following scatterplot is busy (sorry!), it contains the 88 ref/club combinations.

Of the refs who are way below expected Big Decisions for each of the four clubs (88 ref/club combinations), no fewer than ten of the harshest 17 are “referee/Liverpool” combinations. (This has now gone to 11 of 17, if adding Simon Hooper’s data from Spurs; a small sample size, as some of these are, but it’s a lot of similarly bad small sample sizes that add up to getting on for 200, or 2/3rds, of games.)
NEW: this is a quadrant of the scatterplot, showing 16 red dots of Liverpool below the line at 0, that represents par for expected Big Decisions:

Any of those refs in isolation mean little; but add all those samples together and you get hundreds of games.
Without Oliver and Taylor, Liverpool’s Big Decisions Balance (in the remaining 200+ games) would be more akin to a team below mid-table.

A VAR will make a subjective Big Decision intervention (so, not including offsides, etc.) every 8.38 games, or roughly a third as often as a referee on the pitch. (Which makes sense, as the ref should be seeing the obvious things with no need for the VAR to do anything other than confirm.)
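The comparison between VAR and the on-field ref comes from dividing the two intervals. A quick check, using the 2.74-game figure for referees quoted elsewhere in this piece (variable names are illustrative):

```python
GAMES_PER_REF_BIG_DECISION = 2.74   # on-field referee
GAMES_PER_VAR_BIG_DECISION = 8.38   # subjective VAR intervention

# How many times longer the wait is for a VAR Big Decision
var_to_ref_ratio = GAMES_PER_VAR_BIG_DECISION / GAMES_PER_REF_BIG_DECISION

print(f"VAR interval is {var_to_ref_ratio:.2f}x the referee interval")
```

So a VAR Big Decision arrives about once for every three made by the ref on the pitch.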

Liverpool get by far the fewest subjective foul-based VAR Big Decisions of the Main Four, perhaps to counter the misleading #LiVARpool narrative. (They do get the most offside overturns, as incorrect offside decisions involving Liverpool by a lino are massively higher than for the other three clubs. Again, this is interesting given what should have happened at Spurs this weekend, as is why assistant referees are so eager to flag.)

Stuart Attwell is also by far the biggest Homer, with almost all of his decisions going to whoever is at home (20 out of 25): the Main Four team, or if an away game, the team at home against the Main Four side. Attwell is also the ref most likely to favour a Main Four team, albeit done mostly at home, naturally.

By some distance, Liverpool games feature the fewest Big Decisions.
Also, the fewest VAR overturns.
Also, the fewest yellow cards.
Plus, the fewest penalties.
(And no second yellow card for an opponent since 2015, when all other regular Premier League clubs have at least five, and Spurs are nearing 10.)
At times it seems like referees and VARs are totally passive during Liverpool matches; this season it’s been like they’ve not actually been on duty when it comes to obvious errors. (But at other times the refs and VARs are on overdrive. Four red cards?!)
Paul Tierney’s overall data is fairly normal, but in his case, the contrast between his record for Liverpool (as both a ref and a VAR) and the other Main Four clubs is what leads to my constant questioning of his suitability. For the other clubs his VAR decisions are almost all in their favour; and for Liverpool, 100% are against (three, all going to Manchester clubs), two of which were highly dodgy (even Gary Neville and Roy Keane called an apparent foul on David de Gea ludicrous).

Refs do not make anywhere near as many bookings at Anfield. At Anfield, away players are booked only 93.4% as often as they are at Old Trafford, 85.2% as often as they are at the Etihad, and 78% as often as at Stamford Bridge. This may partly explain why no opposition player ever gets a second yellow card (a Big Decision) when playing Liverpool, as they more rarely get the first. (Graphs below from last season.)

Teams’ win percentages can still be high with ungenerous refs, and low with more generous refs. But on average, Big Decisions change results by a big margin.

Rankings for Premier League penalties won per season since 2015: 1 Manchester City; 2 Leicester City; 3 Brentford; 4 Manchester United; 5 Crystal Palace; 6 Brighton; 7 Nottingham Forest; 8 Chelsea; 9 Fulham; 10 Liverpool. Dispels myth that smaller clubs don’t get Big Decisions. (Palace, like some other clubs, also beneficiaries of lots of opposition red cards, and are the biggest beneficiaries of 2nd-yellows.)

VAR Big Decisions tend to be consistent and steady within the timeframe of a match, with 2.62 overturns for every minute of the game (1-90); 236 subjective overturns in total as of early September 2023. (Involves some double-counting in Main Four head-to-heads.)
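The per-minute figure is the total number of subjective overturns spread evenly over the 90 minutes. A quick check (this ignores stoppage time, which is an assumption; variable names are illustrative):

```python
TOTAL_SUBJECTIVE_OVERTURNS = 236   # as of early September 2023
MATCH_MINUTES = 90                 # minutes 1-90, stoppage time ignored

overturns_per_minute = TOTAL_SUBJECTIVE_OVERTURNS / MATCH_MINUTES

print(f"{overturns_per_minute:.2f} overturns per match minute")
```

This is the flat average against which the minute-by-minute distribution can be compared.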

Subjective overturns have a natural distribution that makes for a lovely logical graph (below) – but only after the first 20 minutes; the first ten minutes see very little action, as if it’s deemed too early to intervene (just as a bad early tackle is more likely to be given as a yellow card by the on-field ref).
Only Chelsea have to wait longer than the Reds for a Subjective Player-To-Player Big Decision, at 45 minutes to Liverpool’s 43. Man City have to wait only 12 minutes for their first For, as do Man United.

The average time of Subjective Player-To-Player Big Decisions For for Chelsea and Liverpool is over the 60-minute mark. For the Manchester clubs, it’s under 50 minutes.

As such, the treatment by VAR of Chelsea and Liverpool is vaguely similar, but Chelsea do better. The treatment of those two clubs in contrast to the Manchester clubs is alarming. What is going on?
Again, Liverpool games involve fewer VAR subjective calls than the average of the other three Main Four teams.

Excluding offsides and handballs, so focusing purely on fouls and other physical player-to-player decisions (“Subjective Player-To-Player Big Decision”), Liverpool’s VAR balance is -2 (now -3 after the trip to Spurs). All the other clubs have a positive balance.

Anfield has seen just three VAR Subjective Player-To-Player Big Decisions For the Reds.

Penalties vary in relation to quality of teams involved; red cards do not.

One thing I showed a couple of years ago was that over a seven-year period and 600 Premier League penalties, foreign defenders were penalised more than expected (based on share of minutes played), and homegrown attackers won far more penalties than expected (again, based on share of minutes played). So it’s not helpful to have foreign strikers and foreign defenders if you want Big Decisions.
