"Unified Field Theory" for predicting WDC

Blog Zbod

Since this lot (me included) do tend to have statistical OCD, I thought I should post this.

Around the autumn of 2010, F1 Racing magazine had a short article about some (unnamed) physics boffin who had crunched the numbers on 40 years of F1 results, looking for a formula that would describe the factors relevant to winning the WDC (beyond simply the points awarded): a sort of "unified field" theory for analysing (or predicting) any given driver's chances of winning. This is the formula he came up with:

X = (11*(√p ÷ r)) +10w + 5s + (5f ÷ 2) - 20a

Where:
p = number of poles
r = number of retirements
w = number of wins
s = number of second or third place results
f = number of points-scoring results lower than 3rd (currently, 4th to 10th)
a = average race result


The article, I'm afraid, offered no explanation of how to apply it, which is unfortunate, because the formula itself has a couple of problems. Firstly, the first term is undefined if a driver has not had a DNF, since r = 0 means dividing by zero. Secondly, the fifth term (the average race result) could potentially be used to penalise a DNF as well, on top of the penalty already built into the first term.

The solution I applied to the first problem was to credit each driver with a single DNF at the outset, i.e. to divide by r + 1 rather than r. So mathematically, drivers with no retirements receive full credit for (11 times the square root of) the number of poles taken. Thereafter, the credit for poles is multiplied by the inverse of the number of retirements plus one. I.e., one DNF = (1+1=2) 1/2 credit, two DNFs = (1+2=3) 1/3 credit, and so on.

As to the second problem, the first term specifically penalises for DNFs. It is clear this was the creator's intent, but I cannot make the same claim regarding the fifth term. Therefore I elected not to include a "non-result" (a DNF) in the average of results, so a is averaged over races finished only.
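For anyone who wants to play along at home, here is a minimal Python sketch of the formula with my two tweaks applied (divide by r + 1, and leave DNFs out of the average). The function name, the argument names and the example driver are all made up for illustration; nothing here comes from the magazine article, and how to treat a driver with no finishes at all is my own assumption.

```python
import math

def wdc_score(poles, retirements, wins, seconds_thirds, lower_points, finishes):
    """Tweaked score: X = 11*sqrt(p)/(r+1) + 10w + 5s + 5f/2 - 20a.

    finishes is a list of finishing positions for races the driver
    actually finished; DNFs are left out (second tweak).
    """
    if not finishes:
        # Assumption: with no finishes there is no average to take.
        raise ValueError("need at least one finished race")
    # First tweak: r + 1 in the denominator avoids dividing by zero.
    pole_credit = 11 * math.sqrt(poles) / (retirements + 1)
    average_result = sum(finishes) / len(finishes)
    return (pole_credit + 10 * wins + 5 * seconds_thirds
            + 2.5 * lower_points - 20 * average_result)

# Hypothetical driver: 4 poles, 1 DNF, 2 wins, 3 other podiums,
# 2 lower points finishes.
print(wdc_score(4, 1, 2, 3, 2, finishes=[1, 1, 2, 3, 3, 4, 7]))  # -9.0
```

Note how heavily the average result weighs: an average finish of 3rd costs 60, which in this made-up case outweighs everything else put together.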

The formula isn't particularly byzantine, and I didn't run it against the past 40 years to vet it, but I presume the blokes at F1 Racing did.


Here is how the maths work out for the current season to date (with my tweaks):



For comparison, this is how the previous season worked out:



No slight intended to the drivers not included in these stats. I limited my selection to a handful of front runners (and Webber, to contrast with his teammate) to limit the workload.

I post it now because it is another tool for the statistically-obsessed among us to use to analyse the possible permutations for the remainder of the season. What scenarios would have to come to pass for anyone to catch Vettel? If there's sufficient interest, I might expand it to the entire field.
 
I for one will be interested to see how your calculations pan out...


You have Hamilton finishing 3rd in the championship this year, which would be his best result since 2008.
 
This sort of thing is okay if you want to look at historical results, but I'm struggling to understand how it can be used to predict a winner, given that you need to input historical results to get an answer.

Also, it ignores the human factor. For example, if you were to run the stats for Jenson Button for last season, he would probably be there or thereabouts as a title challenger. But as McLaren have produced a car which probably goes fastest in reverse, his chances of winning a race this season (let alone the title) are nil.
 
I think it can only be used for predictions within a season, FB, so Button's score for this year wouldn't include any results from 2012.

It seems strange to me that the formula doesn't contain championship points anywhere, since that's what really counts. I guess the test for it would be to calculate the values at the midpoint of a few seasons and see if it does a better job of predicting the champion than just using the championship standings at that point. I'm tempted to do it but I can't be bothered to work out all the stats right now.
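The comparison itself would be simple once the numbers are in hand. A rough Python sketch of what I mean, where the function names and the data layout are my own invention and the two "seasons" are placeholders rather than real results:

```python
def leader(table):
    """Return the driver with the highest value in a {driver: value} dict."""
    return max(table, key=table.get)

def backtest(seasons):
    """Count how often each mid-season leader went on to be champion.

    Each season is a dict with mid-season 'formula_scores', mid-season
    'points', and the eventual 'champion'.
    Returns (formula_hits, points_hits).
    """
    formula_hits = sum(leader(s["formula_scores"]) == s["champion"] for s in seasons)
    points_hits = sum(leader(s["points"]) == s["champion"] for s in seasons)
    return formula_hits, points_hits

# Placeholder data only, NOT real seasons.
seasons = [
    {"formula_scores": {"Driver A": 40.0, "Driver B": 55.0},
     "points": {"Driver A": 120, "Driver B": 110},
     "champion": "Driver B"},
    {"formula_scores": {"Driver A": 70.0, "Driver B": 30.0},
     "points": {"Driver A": 150, "Driver B": 90},
     "champion": "Driver A"},
]
print(backtest(seasons))
```

If the formula's hit count isn't clearly higher than the standings' over a decent number of real seasons, it isn't adding anything.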

For me the formula should contain terms like:
Current WDC points
No. Races Remaining
Points obtained from average race finishing position excluding DNFs
No. DNFs (or No. car failures + No. DNFs caused by driver error)
Upward/downward trend in car performance.
 
Current championship points would probably result in an over-fitted model?

I agree with sushifiesta - since the main potential use for this model is to spot the unexpected championship contender, comparing cars' pace in the first quarter of the season with that in the second might be a useful indicator to add - though difficult to come up with a consistent methodology for.
 