Statistics Surge

Little known to fans or casual sports watchers, numbers drive the world of athletics. Data analytics are used to better understand the true value of a player’s skill and to help teams determine what they should focus on in a game or practice, and they may even direct the evolution of a sport.

Last year, Aaron Judge –– a world-renowned baseball player for the New York Yankees –– led MLB in a very meaningful statistic, wRC+, with a value of 207. This, among other achievements, helped him win the 2022 American League MVP award. Analysts, scouts, coaches, and managers use numbers like wRC+, which seem meaningless to most fans, to divide a good player from a great one, to separate the mediocre from the excellent, and to know who they want playing for them and who they definitely don’t want in next season’s lineup.

To a non-baseball fan, or even a casual fan, Judge’s number of 207 wRC+ means nothing: 207 out of what? 207 balls hit? 207 bases stolen? 207 pieces of gum chewed? The number seems arbitrary unless a deeper dive is taken into “sabermetrics,” the intricate new world of baseball data and statistics –– a world that is increasingly helping teams accurately determine the value of a player, as well as where and on what a team should focus its energy.

Popularized by media like the nonfiction book Moneyball by Michael Lewis, “sabermetrics” is used to determine the value of a player. In baseball, for example, instead of relying on simple statistics like “batting average,” “runs batted in” or “bases stolen” –– values that were considered golden numbers before the statistical revolution –– analysts center their energy on statistics that are supposed to more accurately reflect the talent of an individual player, rather than luck or the errors of others.

The aforementioned wRC+ is one of these new statistics. It stands for “Weighted Runs Created Plus.” Simply put, it adjusts for the external factors that help or harm a player’s ability to create runs, such as ballpark elevation and stadium dimensions. It even takes into account the time period in which an athlete played (stadiums in different eras had different regulations and thus different outfield wall distances, which unfairly inflated or deflated home run totals). Then, the metric scales the result so that the league average is 100. Thus, anything higher than 100 is considered above average, and each point above 100 is one percentage point better than league average –– putting Judge’s score of 207 in a new light.
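To make the idea of "league average equals 100" concrete, here is a minimal Python sketch of how an index-style statistic can be scaled. It is a simplified illustration, not the actual wRC+ formula (which is built on weighted values for each batting outcome), and the function name, the park factor convention, and all numbers are hypothetical.

```python
def plus_index(player_runs_per_pa: float,
               league_runs_per_pa: float,
               park_factor: float = 1.0) -> float:
    """Scale a park-adjusted rate so that league average equals 100.

    Simplified illustration only; NOT the real wRC+ calculation.
    park_factor > 1.0 is assumed to mean a hitter-friendly park.
    """
    if league_runs_per_pa <= 0 or park_factor <= 0:
        raise ValueError("league rate and park factor must be positive")
    # Remove the (assumed) boost or penalty from the home ballpark.
    adjusted_rate = player_runs_per_pa / park_factor
    # Express the adjusted rate as a percentage of league average.
    return 100 * adjusted_rate / league_runs_per_pa


# Hypothetical hitter who creates runs at twice the league rate.
neutral_park = plus_index(0.25, 0.125)
hitter_friendly_park = plus_index(0.25, 0.125, park_factor=1.05)
print(round(neutral_park), round(hitter_friendly_park))
```

In this sketch the same raw production earns a lower index in a hitter-friendly park, which is the kind of adjustment the paragraph above describes.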

wRC+ is an example of the intricate statistics that strive to eliminate the uncertainties of athletics and highlight the true value of an athlete. Moneyball provides countless examples of statisticians –– often Harvard or MIT engineering graduates –– who tirelessly work to boil the game down to the true skills of a player, rather than the errors of others or the stadium they happen to be playing in.

“Sabermetrics” and the new world of statistics provide another benefit for teams: the clarification of what they should focus their energy on. The obsession with numbers extends far beyond baseball, too. Many other sports have adopted the use of analytics to gain a competitive advantage, be it accurate player evaluation or knowledge about what the players should focus on in a game.

For example, in basketball, a major revelation came in the early-to-mid 2010s with the rise of the three-point shot. By analyzing shot type alongside scoring and team wins, coaches and analysts slowly realized that the extra point gained from making a three-point shot substantially increased scoring efficiency (and thus team wins) when teams took more of them at an accurate rate. Because of this, they started focusing more of their energy on training players for three-pointers (as well as valuing players who could accurately sink them).
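The arithmetic behind this realization can be sketched in a few lines of Python. The shooting percentages below are hypothetical, chosen only to show how a lower-percentage three-pointer can still be worth more per attempt than a mid-range two.

```python
def expected_points(make_rate: float, shot_value: int) -> float:
    """Average points scored per attempt for a given shot type."""
    return make_rate * shot_value


def break_even_three_rate(two_point_rate: float) -> float:
    """Three-point make rate needed to match a given two-point make rate."""
    return two_point_rate * 2 / 3


# Hypothetical make rates: 40% on mid-range twos, 36% on threes.
mid_range = expected_points(0.40, 2)
three_pointer = expected_points(0.36, 3)
print(f"mid-range: {mid_range:.2f} points per shot")
print(f"three-pointer: {three_pointer:.2f} points per shot")
print(f"break-even three-point rate vs. a 45% two: {break_even_three_rate(0.45):.0%}")
```

Under these assumed numbers, the three-pointer is made less often but yields more points per attempt, which is why the mid-range shot lost favor.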

Famously popularized by Stephen Curry and his “Splash Brother” Klay Thompson, the three-point shot has risen to basketball glory and glamor. Prior to this, the focus was on getting “easy” points by driving to the basket and taking layups and mid-range jump shots. In the 1990-1991 NBA season, teams attempted an average of only 7.1 three-pointers per game. Now, however, as teams take this to another level –– largely taking only three-point shots and layups, and eliminating the mid-range jump shot –– the number of three-pointers attempted has increased to a whopping 34.3 per game.

As a result of this increase, NBA scoring is at an all-time high. Teams are averaging 114.1 points per game this season, a major step up from 96.3 just 11 years ago. This sharp increase illuminates how a deeper dive into seemingly innocuous numbers (the essence of “sabermetrics”) can cause the methodology of a sport to quickly shift. 

The same can be seen in baseball. Teams look at data on where certain batters tend to hit the ball and, from this, decide whether it is worth shifting their fielders to the side where that batter tends to hit (note that it is considered a “shift” when three or more infielders are positioned on one side of second base). The purpose of this shift is to have a stronger chance of fielding a ball hit to a batter’s pull (i.e., strong) side, and it has proven wildly effective.

This efficacy is measured by the decrease in the league-wide batting average on ground balls in play (groundball BABIP), as “shifted” infielders can more easily field ground balls because they are positioned closer to where the balls are hit. During the 2017 season, teams shifted on about 15% of all plate appearances, leading to a .241 groundball BABIP. In the 2020 season, by contrast, teams shifted on a staggering 30% of all plate appearances, leading to just a .229 groundball BABIP. As this technique became more common, however, MLB implemented new rules to limit or completely ban it for the upcoming 2023 season.
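For readers curious how these averages are computed, the sketch below shows the standard BABIP formula alongside a groundball-only version. The function names and the sample counts are hypothetical; only the .241 and .229 targets come from the figures above.

```python
def babip(hits: int, home_runs: int, at_bats: int,
          strikeouts: int, sac_flies: int) -> float:
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


def groundball_babip(groundball_hits: int, groundballs_in_play: int) -> float:
    """Same idea, restricted to ground balls only."""
    if groundballs_in_play <= 0:
        raise ValueError("no ground balls in play")
    return groundball_hits / groundballs_in_play


# Hypothetical counts chosen to reproduce the league-wide rates quoted above.
before = groundball_babip(241, 1000)
after = groundball_babip(229, 1000)
print(f"{before:.3f} -> {after:.3f}: {241 - 229} fewer hits per 1,000 ground balls")
```

A drop of 12 points in groundball BABIP means roughly 12 fewer hits for every 1,000 ground balls put in play.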

A similar evolution of using statistics to focus on efficiency can be seen in technique-heavy sports like swimming or running. Comparing the stroke of a swimmer 25 or 30 years ago to swimmers now highlights the stark differences. In freestyle, for example, it used to be believed that a “straight-arm” catch was the most effective, as it pulled the most water back. Now, however, thanks to computer-analyzed data showing the velocity of a swimmer in the water, coaches know that the drag added by pulling with a straight arm eliminates any benefit added by the extra water pulled. Thus, high-level swimmers can be seen pulling with a 90-degree bend in the elbow, thought by analysts to strike the best balance between moving hydrodynamically through the water and pulling water behind you to propel you forward. 

As swimmers continue to evolve and improve, even more changes will be seen. For the longer distance freestyle, for example, Katie Ledecky –– seven-time Olympic gold medalist in the 200, 400, 800, and 1500 meter freestyle –– has perfected a new technique. This one, however, is specific to the kick. The kick rate of swimmers changes depending on the event length: the shorter the distance, the higher the kick rate. A 200m, for example, would be around a six-beat kick (i.e. six kicks per stroke cycle), and longer events, like the 400, 800, and 1500m, will see anywhere from a four-beat to a six-beat kick. Ledecky, however, consistently holds a two-beat kick or less during her events until the last 50, where she ramps up the rate. This allows her to maximize the use of her arm strength while not tiring herself out (the legs suck up far more oxygen than the arms). Only future data analysis and races will be able to tell if this is a skill particular to Ledecky or if this trend will spread to all high-level distance swimmers.

Analytics and the deep diving into numbers have become so popular in recent years that sports teams and leagues have started hiring people whose sole job is to look at data –– like the Oakland A’s did with Paul DePodesta back in the early 2000s. Exemplifying this is NFL data analyst Tom Bliss, who spends 90% of his work day importing NFL data and analyzing it in programming languages like Python and R.

“I write code to calculate metrics such as means or rates and visualize the metrics in a table or plot,” Bliss said. “Additionally, I use code to create predictive models including a team strength model to measure how valuable days of rest are in terms of a team winning or losing a game.”

Bliss further explains his role in making the NFL the best product it can possibly be, for players and fans.

“I help different departments within the office such as officiating, player health and safety, and football operations,” Bliss said. “Ultimately, I work with each group to make the game as competitive, safe, and well-officiated as possible to maximize our fans’ entertainment.”

Bliss outlined some of the ways that analytics are incorporated into the NFL. For example, a coaching staff may use historical data to determine when it is best to “go for it” on fourth down. Front offices can use data to evaluate a player’s skill, or how well a player fits with a certain team in free agency or the draft. Lastly, data can also be used to help athletes with workout routines, helping them avoid overworking a part of the body that is injured.

It is one thing to see the use of statistics grow nationally in large sports leagues like MLB or the NFL, and in highly competitive Olympic sports like swimming, but what about on a smaller scale? Does the same trend of the increasing importance of statistics hold true? Here at Paly, at least, it seems to.

Jeff LaMere, head coach of boys basketball, feels that statistics are an integral part of his coaching. LaMere spends hours outside of practice studying an opposing team’s statistics and understanding their playing tendencies in order to help Paly’s team gain a competitive edge over them. He also thinks data can be important for analyzing and improving his own team.

“I want to look at numbers like ‘effective field goal percentage’ versus ‘field goal percentage,’ so I want to know where I can value a three-point shot more than a two-point shot,” LaMere said.

LaMere refers to “field goal percentage” compared to “effective field goal percentage.” Essentially, field goal percentage is the percentage of total shots made out of shots taken, which is tracked both individually and as a team. Effective field goal percentage –– or eFG% –– adjusts field goal percentage by giving extra weight to made three-point shots because they are worth more points. The reason this stat is so useful is that it allows a team to see if they should focus on scoring inside more, or if they’re more efficient when launching shots from beyond the arc.
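The difference between the two percentages is easiest to see side by side. The sketch below uses the standard eFG% formula, (FGM + 0.5 × 3PM) / FGA, with two hypothetical box-score lines; the team labels and shot counts are made up for illustration.

```python
def field_goal_pct(made: int, attempted: int) -> float:
    """Plain field goal percentage: makes divided by attempts."""
    return made / attempted


def effective_field_goal_pct(made: int, threes_made: int, attempted: int) -> float:
    """eFG%: each made three counts as 1.5 makes, since it is worth 1.5x a two."""
    return (made + 0.5 * threes_made) / attempted


# Hypothetical teams, 100 shot attempts each.
# Team A: 40 makes, none of them threes. Team B: 38 makes, 12 of them threes.
team_a_fg, team_a_efg = field_goal_pct(40, 100), effective_field_goal_pct(40, 0, 100)
team_b_fg, team_b_efg = field_goal_pct(38, 100), effective_field_goal_pct(38, 12, 100)
print(f"Team A: FG% {team_a_fg:.1%}, eFG% {team_a_efg:.1%}")
print(f"Team B: FG% {team_b_fg:.1%}, eFG% {team_b_efg:.1%}")
```

In this made-up comparison, Team B shoots a lower raw percentage but scores more points on the same number of attempts, which eFG% captures and plain field goal percentage hides.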

Due to dynamics that differ game to game, no contest is the same. Sometimes plays run slowly, and other times they flow quickly one after the other. Thus, raw totals for each game can change drastically. Because of this, coaches can look at rate statistics that are not affected by the change in pace. LaMere explains how this plays into which statistics he decides to look at, rather than simply analyzing “the numbers.”

“[I set goals based on] offensive rebound percentage,” LaMere said. “[I say] let’s rebound the ball 40% of our misses as opposed to just getting eight offensive rebounds. Analyzing the game using those metrics versus pure numbers is more accurate.” 
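LaMere's point about rates versus raw counts can be illustrated with a short sketch. It uses the common definition of offensive rebound percentage, offensive rebounds divided by the rebounds available on a team's own misses (its offensive rebounds plus the opponent's defensive rebounds). The two games below are hypothetical.

```python
def offensive_rebound_pct(offensive_rebounds: int,
                          opponent_defensive_rebounds: int) -> float:
    """Share of a team's own missed shots that it rebounds itself."""
    available = offensive_rebounds + opponent_defensive_rebounds
    if available == 0:
        raise ValueError("no rebounds available")
    return offensive_rebounds / available


# Two hypothetical games with the same raw count of eight offensive rebounds.
slow_game = offensive_rebound_pct(8, 12)   # few possessions, few misses
fast_game = offensive_rebound_pct(8, 24)   # many possessions, many misses
print(f"slow game: {slow_game:.0%}, fast game: {fast_game:.0%}")
```

Eight offensive rebounds meets a 40% goal in the slower game but falls well short in the faster one, which is why a rate can be a fairer target than a fixed count.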

With the rise of sports data, technology for analyzing it is becoming increasingly accessible to coaches. For example, Coach LaMere uses software from the company Hudl, which takes uploaded game film and produces lines and lines of basic to advanced statistics for the coaching staff to use.

Paly swimming coach Danny Dye also uses statistics and numbers, although in a slightly different way. He uses swimmers’ past results and trends to understand what to expect from an opponent.

“Using those stats, you know how to prepare your athlete, and they know what goals to set based on those expectations,” Dye said.

He analyzes times from his own team and his competitors to figure out the best matchups for event placement or relay order, so that his athletes –– and his team –– can win.

“So you look at what their swimmers can do and then, similar to a chess game, you place your athletes into a lineup based on the other teams’ past times,” Dye said. “Then you attempt to score out the meet and get your team the win.”

By scoring out the meet, Dye means “playing” out different outcomes of races. For example, how many points will the team get if everyone places where they are seeded? Will the team win then? What if everyone places one spot below where they are seeded? Will the team still win then? Dye often goes through 10+ “meet scenarios” with swimmers in different events, different relay orders, and different potential placings in their individual races.
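The kind of scenario testing described above can be sketched in Python. This is only an illustration of the idea, not Dye's actual process: the points table, team names, swimmers, and seed times are all hypothetical, and a "bad day" is modeled simply by adding time to every home swimmer.

```python
from collections import defaultdict

# Hypothetical points awarded for 1st through 5th place in each event.
POINTS_BY_PLACE = [6, 4, 3, 2, 1]

# Hypothetical seed times in seconds: (swimmer, team, seed_time).
EVENTS = {
    "50 free": [("A1", "Home", 22.0), ("B1", "Away", 22.3),
                ("A2", "Home", 22.8), ("B2", "Away", 23.0)],
    "100 free": [("A3", "Home", 48.0), ("B3", "Away", 49.0),
                 ("B4", "Away", 49.2), ("A4", "Home", 50.0)],
}


def score_meet(events, home_team, home_slowdown=0.0):
    """Total points per team if races finish in order of (adjusted) seed time.

    home_slowdown adds seconds to each home swimmer to model a worse day.
    """
    totals = defaultdict(int)
    for entries in events.values():
        adjusted = [
            (time + home_slowdown if team == home_team else time, team)
            for _swimmer, team, time in entries
        ]
        adjusted.sort(key=lambda entry: entry[0])  # fastest time first
        for points, (_time, team) in zip(POINTS_BY_PLACE, adjusted):
            totals[team] += points
    return dict(totals)


print("as seeded:", score_meet(EVENTS, "Home"))
print("home half a second slower:", score_meet(EVENTS, "Home", home_slowdown=0.5))
```

With these made-up times, the home team wins if everyone swims to seed but loses if its swimmers are half a second slower, which is the sort of swing that running many scenarios is meant to expose.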

The use of intricate statistics has skyrocketed in recent years, with new technology making them easier to calculate, and prominent media showcasing them. Teams both nationally and here at Paly use statistics –– although in different ways –– to further their end goal: winning.

In sports, everything boils down to becoming faster, stronger, and more efficient. And statistics, especially as we head into an increasingly digital age, will be just one of the tools helping teams, athletes, and coaches get there.