Over the weekend, the so-called “video assistant referee” (or VAR) made a major mistake in the Liverpool-Tottenham match that denied Liverpool a goal, which proved to be the difference in the match. It was not the first error made by VAR: Sky Sports has documented 14 VAR errors in the Premier League over just the past 2 seasons for which an apology was later issued, as shown in the graphic below.
In my book The Edge, I made the case for a light touch with technology-assisted refereeing. Below I have excerpted that argument. At the bottom, THB Pro subscribers can download a PDF of the full chapter, which is titled Hacking the Athlete and the Games. Enjoy!
Part of the application of technology to refereeing games is about reducing the uncertainty in judgments, but another part is about legitimacy; that is, the general acceptance of referee judgments and the overall integrity of competition. Technology, it turns out, presents both challenges to legitimacy and the key to preserving it. Let me explain.
In general, people—especially the sports-viewing public—understand uncertainties just fine. Studies of public understanding of probabilities related to weather forecasts indicate that, when it comes to the weather, the public actually has an appreciation for probabilistic information, even when the information is provided with no mention of the possibility that the forecast might be wrong. It turns out that we each have enough experience with weather forecasts to be able to develop a sense of their accuracy.
Such research on the public understanding of science suggests that we should be cautious about assuming how the public might react to information in other contexts as well. Now let’s apply this thinking to sports. Consider that most sports broadcasts are accompanied by a wealth of statistical information, some of which is very sophisticated. The role of technology-assisted refereeing in sports is primarily about legitimacy—the better that referees understand what just happened on the field of play, the more likely they are to make the right calls, which in turn means that the players and the fans are more likely to regard the outcome of a competition as fair, and a system that delivers fair outcomes is more likely to be regarded by athletes and fans as legitimate.
Consider two shots on goal in the World Cup that occurred forty-four years apart. In the 1966 World Cup final, England was awarded a goal in extra time against West Germany when the ball may or may not have crossed the goal line after ricocheting straight down off the crossbar. Video evidence remains inconclusive (though one analysis conducted thirty years later by engineers at Oxford University found that the ball was six centimeters short of fully crossing the line). Nonetheless, the goal was given and became part of football lore. Uncertainty still lingers over whether the ball actually crossed the goal line, and a half century later, some Germany supporters still feel wronged.
In contrast, the persistence of uncertainty today is often not possible. In the 2010 World Cup, England’s Frank Lampard had an obvious goal disallowed, also against Germany, this time in the round of 16. Like its doppelganger generations before, Lampard’s shot ricocheted downward off the crossbar. In this case, however, the ball undoubtedly crossed the goal line, according to the video replays. Although Germans with long memories called it payback for 1966, the English felt aggrieved, and most neutral observers found the incident problematic: one of the most important events to occur on the pitch was missed by the officials but seen clearly by millions around the world. There was no hiding behind fuzzy black-and-white television imagery. The wrong decision was there for everyone to see.
It is this difference between what the referee can detect on the pitch and what television viewers can see at home—with ultra-slow-motion replays in high definition—that has led to the introduction of technological aids to assist referees. Equipped with this high-tech assistance, human referees are now faced with about the same level of uncertainty that remote spectators (the fans watching their TVs) must deal with. Referees’ uncertainty has been realigned with spectators’ uncertainty.
But sports have stopped short of trying to reduce uncertainty in every decision that referees make. It’s just not practical or affordable—or perhaps even necessary from a legitimacy standpoint—to do so. In soccer, goal-line technologies are used to detect whether balls cross the goal line, not to detect whether balls go out of bounds or who should have subsequent possession. In tennis, the Hawk-Eye system is used sparingly, with players granted a limited number of challenges to deploy it in a match. In professional cricket and basketball, video replay as an officiating aid is also limited to certain situations.
With the introduction of technology, uncertainty does not go away. During the 2013 French Open, Ukrainian tennis player Sergiy Stakhovsky received much attention (and a fine from officials) for taking a picture of a ball mark on the clay with his iPhone and then tweeting it to document a dispute over a ball that was ruled out. When Stakhovsky had done the same thing in Munich a few weeks earlier, several of his colleagues on the professional tennis tour tweeted back to dispute his complaint. The introduction of technology has not eliminated line call disputes, and social media has created new challenges for officials. One can only guess what John McEnroe, the famously controversial and colorful tennis star of the 1970s and 1980s, might have done with an iPhone and a Twitter account.
However, technology can create incentives that help human referees perform better. A 2011 study by Daniel Hamermesh and colleagues in the American Economic Review looked at more than 3.5 million pitches in Major League Baseball games from 2004 to 2008 to assess whether umpires displayed biases in how they ruled balls and strikes. The study found a small but significant bias in how umpires ruled depending on whether the umpire’s “race” was the same as or different from that of the pitcher or batter. When the umpire was being evaluated by a computerized system (or performed before an exceptionally large crowd or in an important game), however, the bias went away. The presence of the technology helped umpires overcome potential biases when they knew that they would be evaluated objectively.
Technology also creates new problems. In February 2016, Colorado State University was hosting Boise State in basketball. The teams had played a close game that went to overtime. With 0.8 seconds left, Boise State had the ball in its half with the score tied at 84. Boise State inbounded the ball to James Webb III, who took two quick steps and, before the buzzer, launched a one-handed shot as he was falling out of bounds. Miraculously, the ball went in. Boise State won, right?
Nope. The referees believed that the game clock had not started on time, so they went to the video replay. The video replay technology allowed the referees to compare when the clock started, when it should have started, and how long it took for Webb to get the shot off. The referees determined, based on the video replay, that Webb took about 1.2 seconds to get his shot off, more than the 0.8 seconds remaining. The shot was waved off, and the game went into a second overtime, from which CSU emerged victorious.
It did not take long for TV viewers and reporters to perform their own timing of the final play and discover that Webb’s shot was launched in only 0.6 seconds, not the 1.2 seconds claimed by the referees. What had happened? The software used by the replay technology had a bug: it counted elapsed time at twice the actual rate. The Mountain West Conference, in which both teams played, issued a statement explaining that Boise State should have won the game. Under NCAA rules, protests of such mistakes are not allowed, so Colorado State kept the victory. The conference announced that changes would be made to the rules governing the use of video replays and that the episode would be used as a future teaching tool. Technology doesn’t necessarily get rid of uncertainty; it may just move it from one setting to another.
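For readers who like to check the arithmetic, here is a minimal sketch of the clock error as described above. The function and variable names are my own illustrations, and the only inputs are the figures quoted in the text (0.6 seconds actual, 0.8 seconds remaining, a clock running twice as fast); it is not the replay system's actual software.

```python
# Figures quoted in the text; the names below are illustrative, not from any real system.
ACTUAL_SHOT_TIME = 0.6     # seconds, as timed by viewers and reporters
CLOCK_REMAINING = 0.8      # seconds left when the ball was inbounded
REPLAY_SPEED_FACTOR = 2.0  # the buggy replay clock ran twice as fast as real time


def measured_time(actual_seconds: float, speed_factor: float) -> float:
    """Elapsed time reported by a stopwatch that runs speed_factor times too fast."""
    return actual_seconds * speed_factor


def shot_counts(elapsed_seconds: float, remaining_seconds: float) -> bool:
    """A shot counts only if it is released before the remaining time expires."""
    return elapsed_seconds < remaining_seconds


buggy_reading = measured_time(ACTUAL_SHOT_TIME, REPLAY_SPEED_FACTOR)

print(f"Replay reading: {buggy_reading:.1f} s, counts: {shot_counts(buggy_reading, CLOCK_REMAINING)}")
print(f"Actual release: {ACTUAL_SHOT_TIME:.1f} s, counts: {shot_counts(ACTUAL_SHOT_TIME, CLOCK_REMAINING)}")
```

The same shot flips from good to no good purely because of the measuring instrument, which is the sense in which the uncertainty moved from the referee's eye to the software.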
The improved alignment of refereeing decisions and what the public observes on television through the introduction of technology is to be applauded. When such decisions fall out of alignment—whether less or more precise than what the public demands—sport faces a legitimacy crisis that then necessitates innovation.
For instance, sometimes we look to technology to do the impossible. The struggles that the NFL has faced in defining a “catch” illustrate this point. Here is an experiment that you can try right now. Pick up a ball (or any tossable object). Throw the ball to your nearest companion and ask him or her to catch it. Now answer this: Did he or she catch your throw? I bet that neither one of you will have any difficulty answering either yes or no. Understanding the catching of a ball doesn’t require nuance, subtlety, or litigious interpretation. That used to be the case in the NFL. But not anymore.
As recently as 1996, the NFL needed only 109 words to define what it meant to “catch” a football. (That is the number of words that appear in the prior paragraph.) By 2015, the NFL definition of a “catch” had swollen to almost 600 words, and it had become an endless point of controversy among fans, players, officials, and the media.
Defining a catch became difficult after the introduction of high-definition slow-motion instant replay. Today’s television technology allows viewers to have a far better look at the act of catching a ball than referees could ever hope for on the field. Using instant replay can help referees have the same view as everyone else, but it also forces the NFL to define more precisely what a catch actually is. It was clever when the folksy TV commentator and former coach John Madden used to explain that a player getting one knee down in bounds while making a catch was equivalent to having two feet come down in bounds. Madden wasn’t invoking any real rule, just helping the viewer to make sense of referee judgments. High-definition TV and instant replay have consigned folksy interpretations of the rules to the era of vacuum tubes and low-definition broadcasts.
Today, referees need to be able to address what happens when the player bobbles the ball, or the ball touches the ground while it is being firmly held, or when the ball is jarred loose and hits the ground just after a player crosses the goal line for a touchdown. Everyone watching can see when these things occur, often in excruciating detail. Then the definition of a “catch” gets tough. The act of catching a ball is not, it turns out, a discrete event like a ball crossing a goal line. Whatever a catch is, it is ultimately a judgment made by a referee. Judging if a ball was caught or not is more like judging if pass interference occurred (“pass interference” is called when the defender makes prohibited contact with a would-be receiver while a pass is in the air) than it is like determining if a ball crossed a goal line. The closer we look at the making of a catch, the more contingencies we see, which has led to more considerations being identified that referees must be aware of. Hence, the word count in NFL rules for what it means to catch a ball has grown more than fivefold, from 109 words to almost 600.
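As a quick check on how much the rule has grown, here is a small sketch using the two word counts quoted above. Treating "almost 600" as exactly 600 is my simplifying assumption, so the result is a slight overestimate.

```python
WORDS_1996 = 109  # length of the 1996 definition, per the text
WORDS_2015 = 600  # "almost 600" in the text; rounded up here as an assumption

# How many times longer the 2015 definition is than the 1996 one.
ratio = WORDS_2015 / WORDS_1996

# Percent increase is the growth relative to the starting length.
percent_increase = (WORDS_2015 - WORDS_1996) / WORDS_1996 * 100

print(f"{ratio:.1f}x as long, a {percent_increase:.0f}% increase")
# prints "5.5x as long, a 450% increase"
```

Being about 5.5 times as long is the same thing as an increase of roughly 450 percent, since the original 109 words count as the first 100 percent.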
The best way for the NFL to deal with challenges over the definition of catching the ball is to accept that it can’t be precisely defined. The NFL should simply go back to letting referees decide if a ball was caught or not, based on the sort of general guidance provided in the old 109-word definition. Such an approach is not all that radical, as the NFL routinely determines that some decisions are not reviewable, such as whether a field goal that passes over the upright is good or not. In that case, technology could certainly be used, but a decision has been made not to go there. And football has survived. Using technology in sport in a smart way means giving up the idea that its use can eliminate uncertainty—and that means that sometimes it is better to figure out how to live with uncertainties rather than try to completely eliminate them.
Let’s consider one more example that illustrates this last point. In baseball, pitches are judged to be strikes or balls by a human umpire. In 2015, FiveThirtyEight, a website owned by ESPN, published an evaluation of umpire accuracy in MLB. It found that the average MLB umpire is 86 percent accurate in judging balls and strikes. In a game that involves 280 pitches (for both teams), the umpire might make 40 mistakes. That is a lot of mistakes.
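The 40-mistakes figure follows directly from the two numbers above, as this short sketch shows. The game-to-game spread at the end is my own addition and rests on a simplifying assumption that each call is an independent coin flip with the same 14 percent error rate, which real umpiring surely violates.

```python
import math

ACCURACY = 0.86         # average umpire accuracy reported by FiveThirtyEight, per the text
PITCHES_PER_GAME = 280  # pitches in a game for both teams, per the text

error_rate = 1 - ACCURACY
expected_misses = error_rate * PITCHES_PER_GAME

# Assumption (not from the source): calls are independent with a constant error
# rate, so the number of misses per game is binomial with this standard deviation.
spread = math.sqrt(PITCHES_PER_GAME * error_rate * ACCURACY)

print(f"Expected missed calls per game: {expected_misses:.1f}")
print(f"Typical game-to-game variation: about {spread:.1f} calls")
```

Under that assumption the expected count comes to 39.2 missed calls, which rounds to the 40 cited, with a typical swing of about 6 calls in either direction from one game to the next.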
The technology exists to standardize the calling of balls and strikes in baseball, and the technology has been tested in the minor leagues. There are arguments for and against adopting the technology. An argument for adopting it is that it would standardize the definition of the strike zone and apply the standard impartially to each player. However, some observers argue that umpires and their judgment are a key part of the game. Wherever you come out on this issue, at its core is the notion of uncertainty and judgment. Is umpire error (14 percent on average) part of the game of baseball? Or is umpire error something to be reduced and, ideally, eliminated? In 2003, Arizona Diamondbacks pitcher Curt Schilling answered this question by taking a baseball bat to one of the cameras of an early computerized pitch-calling system. Your answer to this question gets to the core of what you think baseball is as a sport and the role that technology has in it.
Put me down on the side of keeping the imperfect umpires in place. Why? I can think of at least three reasons. First, for better or worse, human umpires are part of the tradition of baseball. Even if computers are more accurate, deploying them completely changes that tradition, which is an important norm of the sport. Second, umpire uncertainty is also essential to the game of baseball. Remove the umps and you have a different game. Finally, if baseball is, as Pittsburgh Pirates first baseman Willie Stargell once said, “a reflection of life,” then uncertainty in the application of the rules deserves a prominent place in the game.
Thanks for reading! Below, THB Pro subscribers can download a PDF of Chapter 7 of The Edge — Hacking the Athlete and the Games. Comments welcomed, as always.