Filep Race Engineering
Specialist engineering consultancy

Current projects and news

Keeping you up to date with current involvements...

Python for timing and data analysis

ADAC GT MASTERS Nürburgring 2022

As the amount of race car data coming out of the data acquisition system keeps growing, I started some time ago to learn Python, having heard that it is better and faster than Excel for this kind of work.
I have to admit that I am surprised by the possibilities, frightened by how much I do not know, and happy that I decided to start this learning process.
One additional project, started in Excel some time ago, was to analyze timing data after the races. The goals: understand what happened; compare manufacturers, as most of the racing I do is controlled by some sort of balance of performance; compare cars of the same make to see which team is stronger with the same race material; and maybe discover some new insights.

At the last race weekend, the GT Masters race at the Nürburgring, we were quite surprised by some car performances, so I decided to apply the Python knowledge I had acquired to timing data analysis.
Some of the inspiration comes from Jasper, whom I follow on Medium, and from different aspects of data analysis learned in Udacity courses.

When looking at the plots below, some points have to be taken into account:
    - The data does not reflect details like traffic, technical issues, etc.
    - We never know if the setup was the best for the given conditions
    - Were all the drivers 100% fit?
    - The data has to be seen as the result of a team effort influenced by outside factors
Just keep in mind that we must not look only at the drivers when this kind of data is analyzed.


As a first step, the analysis is based on box plots, as I am trying to better understand how they can be used. Before the analysis, the data was cleaned by removing the first lap, all in/out laps and the laps falling beyond the maximum data point (the calculation is explained below).
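The cleaning steps can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the actual script used for the plots. The lap records are made up, the record layout (lap number, lap time, pit flag) is hypothetical, and the upper limit is assumed to be Q3 + 1.5 × IQR, the counterpart of the lower limit discussed further down.

```python
from statistics import quantiles

# Hypothetical lap records for one car: (lap number, lap time in s, pit flag).
# All values are made up for illustration only.
laps = [
    (1, 95.2, ""),      # first lap
    (2, 88.4, ""),
    (3, 88.1, ""),
    (4, 88.6, ""),
    (5, 88.3, ""),
    (6, 121.7, ""),     # e.g. a safety car lap
    (7, 88.9, ""),
    (8, 97.5, "in"),    # in lap
    (9, 110.3, "out"),  # out lap
    (10, 88.2, ""),
    (11, 88.5, ""),
    (12, 88.7, ""),
]


def clean_laps(laps):
    """Return the lap times left after the two cleaning steps."""
    # Step 1: drop the first lap and all in/out laps.
    times = [t for lap, t, pit in laps if lap != 1 and pit == ""]
    # Step 2: drop laps slower than the assumed upper limit Q3 + 1.5 * IQR.
    q1, _, q3 = quantiles(times, n=4, method="inclusive")
    upper_limit = q3 + 1.5 * (q3 - q1)
    return [t for t in times if t <= upper_limit]


clean = clean_laps(laps)
print(clean)
```

With these example values, the first lap, the in/out laps and the 121.7 s lap are removed, and eight laps remain for the box plot.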
For those who have even less experience with box plots: they are a way of presenting statistical insight into data, quite similar to a histogram. The graph below, with more explanation, can be found at https://www.r-bloggers.com/2012/06/whisker-of-boxplot.


Starting from the bottom, the whisker marks the minimum (best) lap time. According to boxplot theory, the lower limit is a calculated value with the formula Q1 - 1.5 × IQR, and the whisker is drawn at the lowest data point that still lies within this limit.
I have checked the data with Excel just to be sure that the lower whisker corresponds to the minimum lap time, and it does: no lap was fast enough to fall below the calculated limit. I still need to dig deeper into the theory of boxplots, though.
The lower edge of the box represents the first quartile (Q1). Between the minimum value and this value we have the fastest 25% of the laps. I have verified this value as well; for normally distributed data, theory gives 24.65% of the values in the range Min -> Q1.
The next line is within the box, representing the median. This is the value at the middle of the data set, with sorted lap times. For example, for a race with 51 laps, if the lap times are sorted in ascending order, the median will be the lap time at position 26, dividing the data set into 2 halves. Attention has to be paid, as this value is different from the average lap time.
Next comes Q3 (third quartile): considering the data from the minimum to Q3, 75% of the lap times fall into this range.
The upper whisker represents the highest data point, excluding any outliers – quite handy to be able to exclude full course yellow or safety car laps.
I think that boxplots are quite interesting, as they quickly give a visual insight into the performance of the competitors. For example, we can clearly see in what range the fastest laps and the “normal race laps” are driven. The shorter the plot, the more consistent the performance!
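To make the elements described above concrete, here is a small sketch that calculates them by hand with the Python standard library. The lap times are made up, and `boxplot_stats` is a hypothetical helper, not a library function. Note that there are several conventions for computing quartiles; the "inclusive" method used here is only one of them, so other tools may give slightly different Q1 and Q3 values.

```python
from statistics import mean, quantiles


def boxplot_stats(times):
    """Calculate the box plot elements for a list of lap times."""
    data = sorted(times)
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit = q1 - 1.5 * iqr
    high_limit = q3 + 1.5 * iqr
    # Whiskers sit on the most extreme data points inside the limits.
    inside = [t for t in data if low_limit <= t <= high_limit]
    return {
        "low_limit": low_limit,
        "whisker_low": inside[0],
        "q1": q1,
        "median": med,
        "q3": q3,
        "whisker_high": inside[-1],
        "outliers": [t for t in data if t < low_limit or t > high_limit],
        "mean": mean(data),
    }


# Made-up lap times in seconds, including one slow (e.g. safety car) lap.
lap_times = [88.1, 88.2, 88.3, 88.4, 88.5, 88.6, 88.7, 88.8, 88.9, 89.0, 104.0]

stats = boxplot_stats(lap_times)
for name, value in stats.items():
    print(name, value)
```

In this example the lower whisker equals the fastest lap, because no lap is quicker than the calculated limit Q1 - 1.5 × IQR. The slow lap is reported as an outlier, and it pulls the mean above the median, which is why the two values should not be confused.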



Let’s have a look at the race data!

We can see that Mercedes was strong in both races, that Audi and Lamborghini were quite slow, that BMW was strong in the second race for a small number of laps, and that Porsche was fast overall and very consistent: look at where their median sits compared to the Q1 of the other brands except Mercedes.
A big advantage of using Python is that we can quickly extract information based on different criteria, for example which is the best team per manufacturer and which is the best car.
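As a rough idea of what such an extraction can look like, the sketch below groups lap times by different criteria and ranks them by median lap time. Everything here is an assumption for illustration: the records are made up, the team names are placeholders, and "best" is taken to mean "lowest median lap time", which is only one possible criterion.

```python
from collections import defaultdict, namedtuple
from statistics import median

Lap = namedtuple("Lap", "make team car time")

# Made-up lap records; team names and times are placeholders.
records = [
    Lap("Audi", "Team A", 1, 88.4), Lap("Audi", "Team A", 1, 88.6),
    Lap("Audi", "Team A", 1, 88.5),
    Lap("Audi", "Team A", 2, 88.9), Lap("Audi", "Team A", 2, 89.1),
    Lap("Audi", "Team A", 2, 89.0),
    Lap("Audi", "Team B", 15, 88.7), Lap("Audi", "Team B", 15, 88.6),
    Lap("Audi", "Team B", 15, 88.8),
    Lap("Porsche", "Team C", 91, 88.0), Lap("Porsche", "Team C", 91, 88.2),
    Lap("Porsche", "Team C", 91, 88.1),
]


def median_by(records, key):
    """Median lap time per group, where key(record) names the group."""
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec.time)
    return {group: median(times) for group, times in groups.items()}


def best_team_per_make(records):
    """For each manufacturer, the team with the lowest median lap time."""
    best = {}
    team_medians = median_by(records, key=lambda r: (r.make, r.team))
    for (make, team), med in team_medians.items():
        if make not in best or med < best[make][1]:
            best[make] = (team, med)
    return best


best_teams = best_team_per_make(records)
car_medians = median_by(records, key=lambda r: r.car)
best_car = min(car_medians, key=car_medians.get)

print(best_teams)
print(best_car)
```

Changing the `key` function is all it takes to group by manufacturer, team, car or any other column.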
First, we have a look at the teams running Audis. The data is from the second race only!

Even though none of the Montaplast by Land Motorsport cars finished at the front, one of the team's cars set the fastest Audi lap time. Rutronic Racing was strong, with its fastest 25% of laps at a better pace than Land Motorsport's. The median lap times are close, with a slight advantage for Rutronic. The other two Audi teams, Car Collection Motorsport and Eastalent Racing Team, were somewhat off the pace. But, as I mentioned at the beginning, statistics alone do not tell the complete story: the Car Collection cars suffered some technical issues, whereas Race 1 was clearly dominated by them.

As I still haven’t found out how to color boxplots by team, here is a small insight into the race numbers:
Montaplast by Land Motorsport – #1, #28, #29
Rutronic Racing – #15, #27
Car Collection Motorsport – #33, #69
Eastalent Racing Team – #54
The overall picture now gets a little bit clearer! The best car of Montaplast by Land Motorsport was #1, followed by #28 and #29.
The best car of Rutronic Racing was #27, with #15 showing a similar pace to #28.
Car Collection is missing one car and the second one had some aerodynamic deformations that did not help performance, while #54 of Eastalent Racing is steadily improving.
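On colouring boxplots by team: one possible approach, assuming the plots are drawn with matplotlib (the text does not say which library was used), is to draw the boxes as filled patches and set each face colour from a car-to-team lookup. The car numbers and team names below come from the list above; the colours and the `plot_by_team` helper are arbitrary choices for this sketch.

```python
# Car number -> team, taken from the race number list above.
car_team = {
    1: "Montaplast by Land Motorsport",
    28: "Montaplast by Land Motorsport",
    29: "Montaplast by Land Motorsport",
    15: "Rutronic Racing",
    27: "Rutronic Racing",
    33: "Car Collection Motorsport",
    69: "Car Collection Motorsport",
    54: "Eastalent Racing Team",
}

# Team -> colour; the colour choice is arbitrary.
team_colour = {
    "Montaplast by Land Motorsport": "tab:blue",
    "Rutronic Racing": "tab:orange",
    "Car Collection Motorsport": "tab:green",
    "Eastalent Racing Team": "tab:red",
}


def box_colours(cars):
    """One colour per car, in the same order as the boxes are drawn."""
    return [team_colour[car_team[car]] for car in cars]


def plot_by_team(lap_times):
    """Box plot per car, coloured by team.

    lap_times: dict mapping car number -> list of lap times in seconds.
    matplotlib is imported here so the lookup above works without it.
    """
    import matplotlib.pyplot as plt

    cars = sorted(lap_times)
    fig, ax = plt.subplots()
    # patch_artist=True draws the boxes as patches that can be filled.
    bp = ax.boxplot([lap_times[car] for car in cars], patch_artist=True)
    for box, colour in zip(bp["boxes"], box_colours(cars)):
        box.set_facecolor(colour)
    ax.set_xticklabels([f"#{car}" for car in cars])
    ax.set_ylabel("Lap time [s]")
    return fig


print(box_colours([1, 15, 27, 28]))
```

Calling `plot_by_team` with a dictionary of lap times per car number returns a figure in which all cars of one team share a colour.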



ChassisSim

The long break we have all had gave me the chance to get used to and implement one of the new features of ChassisSim in my work: controlling ChassisSim from an Excel sheet, which makes it possible to run both single simulations and batch runs of pre-defined setup steps that the software works through while you are having a break or sleeping.

The big advantage is that you can think about the different scenarios you would like to have analyzed, put them all into the Excel sheet, export the run files to ChassisSim and let it do the job, without having to enter the setup, data file naming, etc. every time.
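The idea of pre-defining a batch of setup steps can be illustrated with a few lines of Python. This is only a generic sketch of building a run table: the parameter names, step values and run naming are hypothetical, and the output is plain CSV text, not the ChassisSim run file format or its Excel interface.

```python
import csv
import io
from itertools import product

# Hypothetical setup parameters and steps, for illustration only.
sweep = {
    "front_ride_height_mm": [60, 62, 64],
    "rear_ride_height_mm": [70, 72],
    "rear_wing_deg": [4, 6],
}


def build_runs(sweep):
    """One run per combination of setup steps, each with its own name."""
    names = list(sweep)
    runs = []
    for i, values in enumerate(product(*sweep.values()), start=1):
        run = {"run_name": f"run_{i:03d}"}
        run.update(zip(names, values))
        runs.append(run)
    return runs


runs = build_runs(sweep)

# Write the run table as CSV text, ready to paste into a spreadsheet.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["run_name", *sweep])
writer.writeheader()
writer.writerows(runs)
print(buffer.getvalue())
```

Three parameters with 3, 2 and 2 steps already give 12 runs, which shows how quickly the amount of data grows.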

But be prepared to generate a huge amount of data which has to be analysed...


This feature is called "ChassisSim Setup Service" and you could also benefit from it. I would set up the model for you, put everything together so that it is ready to run, and send you the package together with the Excel sheet. Alternatively, I could do the simulations for you and send you the report on sensitivities or setup directions. What I need is general vehicle data and some data recorded while the car was running in order to get a nice correlation.



Land Motorsport

After a quite successful 2018, when I was part of the team as a data / performance engineer in ADAC GT Masters / VLN / 24H Nürburgring and 24H Spa, this year I am happy to rejoin as race engineer in the ADAC GT Masters Championship, taking responsibility for car #28 with Christopher Haase and Max Hofer.