Can 'moneyball' be used to find the best ever F1 driver?

May 6th 2021

Justin Moore's statistical model to find the best F1 driver ever caused outrage – now its new iteration might raise a few eyebrows too, writes Preston Lerner

Can statistical analysis truly prove who is the best F1 driver?

Grand Prix Photo

Who’s the best driver in Formula 1? You can consult Driver of the Day fan votes or the Autocourse Top Ten list. You can tally up races won and calculate winning percentages. You can chart pole positions, fastest laps and total overtakes. And at the end of the day, it still boils down to a popularity contest: Moss or Fangio. Prost or Senna. Hamilton or Verstappen.

Justin Moore, an American software engineer with a passion for Formula 1, believes he’s got a better idea. During the past two decades, the application of data analytics to sports has challenged conventional wisdom about how players are evaluated and prompted a revolution in the way baseball, basketball and, to a lesser degree, American football are played. So three years ago, Moore decided to apply a data-driven approach to rating F1 drivers.

Straight off, he ran into a roadblock. “Formula 1 is probably the sport that has the biggest gap between what data the teams have and what’s available to the casual fan and the intellectual community,” he says.

As it turns out, the data set he was able to get his hands on was limited primarily to qualifying and finishing positions.

So rather than create a complex statistical model, Moore developed a variation on the Elo rating system long used in the chess world (and now found on esports platforms as well). Although the maths are esoteric, the principle behind the Elo system is dead simple: every chess match is a zero-sum proposition, with the points awarded to one player subtracted from the other, and the size of the swap depends on the result of the game and the relative ratings of the players.
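For readers who want to see the zero-sum principle in action, here is a minimal sketch of the textbook Elo update in Python. It is not Moore's code: the 400-point scale and the K-factor of 32 are common chess conventions assumed here for illustration, and the function names are hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Expected score for player A against player B (standard Elo logistic curve)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k=32.0):
    """Return both new ratings after one game.

    score_a is 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    k (assumed value) controls how far a single result moves the ratings.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Zero-sum: whatever A gains, B loses, and vice versa.
    return rating_a + delta, rating_b - delta


# The favourite (1600) loses to the underdog (1400), so it gives up more
# points than it would have lost to an evenly matched opponent.
new_a, new_b = update(1600.0, 1400.0, score_a=0.0)
```

Because the same amount is added to one rating and taken from the other, the two ratings always sum to the same total before and after a game.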

“My model treats Formula 1 races and qualifying sessions as round-robin one-on-one tournaments,” Moore explains. “So Lewis Hamilton has a race against Max Verstappen, and he also has a race against Valtteri Bottas and Fernando Alonso and each and every other driver on the track.”
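The round-robin idea can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Moore's actual implementation: every pair of drivers in the finishing order is treated as one head-to-head "game" won by whoever finished ahead, all updates are computed from the pre-race ratings, and the small K-factor of 4 is an arbitrary choice made because each driver plays many games per race. The function names are hypothetical.

```python
from itertools import combinations


def expected_score(rating_a, rating_b):
    """Expected score for A against B (standard Elo logistic curve)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rate_race(ratings, finishing_order, k=4.0):
    """Treat one race as a round-robin of one-on-one duels.

    ratings: dict of driver -> rating before the race.
    finishing_order: list of drivers, winner first.
    Returns a new dict of ratings; the input dict is not modified.
    """
    deltas = {driver: 0.0 for driver in finishing_order}
    # combinations() preserves list order, so 'ahead' always finished
    # in front of 'behind'.
    for ahead, behind in combinations(finishing_order, 2):
        change = k * (1.0 - expected_score(ratings[ahead], ratings[behind]))
        deltas[ahead] += change
        deltas[behind] -= change
    return {d: r + deltas.get(d, 0.0) for d, r in ratings.items()}


# Three equally rated drivers; the order below is purely illustrative.
before = {"HAM": 1500.0, "VER": 1500.0, "BOT": 1500.0}
after = rate_race(before, ["VER", "HAM", "BOT"])
```

With equal starting ratings, the winner gains from both duels, the last-placed driver loses both, and the driver in the middle breaks even.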

David Coulthard after crashing in practice for the 2008 European Grand Prix in Valencia. Moore’s previous model said that Coulthard was better than Jim Clark…

Bryn Lennon/Getty Images

In 2018, Moore co-wrote a story that appeared on FiveThirtyEight – a website devoted to savvy data analysis, mostly in the political sphere but also applied to sports and culture – under the provocative title “Who’s the Best Formula 1 Driver of All Time?” All the usual suspects were at the top of the list – Ayrton Senna, Michael Schumacher, Hamilton, Sebastian Vettel, Juan Manuel Fangio and Alain Prost, from first to sixth.

Then things started to get wonky. David Coulthard (10th) ahead of Jim Clark (12th)? Riccardo Patrese (14th) in front of Jackie Stewart (20th)? Alonso 22nd? Stirling Moss 25th? Nigel Mansell 30th? No three-time World Driving Champion Jack Brabham? No two-time champ Emerson Fittipaldi? No Gilles Villeneuve, for God’s sake?

The howling was loud enough to be heard through the din of a grid full of four-rotor Mazda 787B Le Mans prototypes. “The internet loved to tell me how wrong I was,” Moore admits cheerfully. “And a lot of things the critics said were valid.”

Turns out there were two systemic flaws with the model. First, it was based on stretches of five consecutive years, so one dreadful season could torpedo a driver’s rating. Second, and far more problematic, cars and drivers were treated as a single unit. Thus, Rubens Barrichello was ranked 15th on the basis of his stint with a dominant Scuderia Ferrari while Alberto Ascari faded to 19th because he wasted most of the 1954 season while waiting for the Lancia D50 to show up.

Before looking at the current F1 season, Moore tweaked the model. The biggest change was that cars and drivers were reconsidered as separate entities. But how to calibrate their relative values? Good question. Moore repeatedly ran the model plugging in different figures to see which numbers produced results that most closely corresponded to what really happened. At the end of the day, he says, the sweet spot was roughly two-thirds car, one-third driver.
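The trial-and-error calibration described above amounts to a simple grid search, which can be sketched as follows. Everything here is assumed for illustration: the ratings and head-to-head results are made up, the combined strength is taken to be a weighted average of car and driver ratings, and the fit is scored with a Brier-style squared error. In a real model the ratings would be re-estimated for each candidate weight, and nothing in the source says Moore used this particular scoring rule.

```python
def expected_score(rating_a, rating_b):
    """Expected score for A against B (standard Elo logistic curve)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def brier_loss(car_weight, entries, duels):
    """Mean squared gap between predicted and actual head-to-head results."""
    strength = {
        name: car_weight * car + (1.0 - car_weight) * driver
        for name, (car, driver) in entries.items()
    }
    errors = [
        (1.0 - expected_score(strength[winner], strength[loser])) ** 2
        for winner, loser in duels
    ]
    return sum(errors) / len(errors)


# Made-up numbers, for illustration only: name -> (car rating, driver rating).
entries = {
    "A": (1700.0, 1500.0),
    "B": (1700.0, 1450.0),
    "C": (1500.0, 1650.0),
    "D": (1400.0, 1550.0),
}
# Made-up head-to-head results as (winner, loser) pairs.
duels = [("A", "B"), ("A", "C"), ("C", "B"), ("A", "D"),
         ("B", "D"), ("C", "D"), ("B", "C")]

# Try car weights from 0.00 to 1.00 in steps of 0.05 and keep the one
# whose predictions sit closest to what "really happened" in the toy data.
candidates = [i / 20 for i in range(21)]
best_weight = min(candidates, key=lambda w: brier_loss(w, entries, duels))
```

The weight that comes out of this toy data set means nothing in itself; the point is only the procedure of trying each split and keeping the one that best matches the results.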

Moore was coy about how the new model affects the historical ratings, though he intimated that Moss and Alonso will be hustling up the list. But he’s already written another article for FiveThirtyEight about how F1 appears to be shaping up this year.

According to Moore’s analysis, last year’s Mercedes-AMG F1 W11 was significantly stronger than Red Bull’s RB16. No surprise there. But Moore’s contrarian conclusion was that Verstappen was the best driver of 2020 even though he finished a distant third to Hamilton in the title hunt. Moore’s model sees 2021 as a toss-up between these two titans, with Verstappen holding a 51 percent to 49 percent edge in qualifying while Hamilton gets the same advantage in races. The first three events of the season would seem to confirm this forecast.

When factoring in driver and car reliability – which are treated as independent variables – Moore rates Sergio Perez as the best of the rest, with Daniel Ricciardo ahead of McLaren team-mate Lando Norris and Carlos Sainz Jr. likely to outperform the admittedly quicker Charles Leclerc at Ferrari. Pierre Gasly and George Russell net out as the most likely diamonds in the rough.

And what of Hamilton’s wingman at Mercedes? “One thing that sticks out in my mind is the degree to which the model isn’t impressed by Valtteri Bottas,” Moore says.

Love it or hate it, Moore’s analysis provides food for thought. To me, the biggest problem is that the data set is so crude. In baseball, the result of every pitch can be quantified from start to finish. Racing, alas, is opaque by design. The action we see on track is just the tip of the iceberg; the vast majority of the important stuff takes place behind closed doors.

Further complicating matters, there’s no perfect agreement about the qualities that ought to be prized most highly in a driver. How do you balance a car-control virtuoso like Ronnie Peterson against a set-up/development genius like Mario Andretti, or the racecraft of a John Watson against the single-lap pace of a René Arnoux? Can a driver save tyres? Eke out fuel consumption numbers? Avoid mayhem caused by others?

Pole or no pole, the new model isn’t kind to Valtteri Bottas

Grand Prix Photo

Still, I can’t help but wonder if the top F1 teams haven’t developed their own proprietary driver-rating programs. In addition to qualifying and finishing results, they’ve got access to the lap times of every car in every race. They can make allowances for tyre life, fuel load and track position. Trap speeds provide powerful clues about engine power and aerodynamic performance. Just imagine all the hard data and informed speculation they could load into a computer!

I wouldn’t be surprised if each team maintains a driver-rating list of its own to help make informed decisions when silly season rolls around. And who knows? Maybe conventional wisdom is right after all. Maybe the best drivers end up in the best teams precisely because they’re the best drivers, and not the other way around.
