Title of master thesis: Impact of Training Data Volume on Neural Network Training and Accuracy
Overview
- Date: Starts 9 June 2023, 10:00; ends 9 June 2023, 11:00
- Location: Nexus, Physics building, campus Johanneberg
- Language: English
Abstract: This master thesis explores the impact of data volume on model training and accuracy in the context of neural networks. The study focuses on conducting experiments on two image-based networks performing classification tasks, namely ResNet50 and MobileNetV2, as well as a time sequence network aiming to identify and predict brake squeals on cars, with both the data and the architecture provided by Volvo Cars. The objective is to investigate the behaviour and accuracy of these networks as they are trained on progressively smaller subsets of the original dataset.
With this study, we aim to gain insight into how neural networks perform under different data availability scenarios. This type of information can be key in decision-making processes regarding data collection, model development, and output handling, particularly in situations where data volume is limited.
The research begins by establishing a baseline performance of the networks when trained on the entire dataset. Subsequently, various subsets of the original dataset are created by progressively reducing the volume of training data. The performance of the networks is then evaluated using these reduced datasets. This process allows for a comprehensive analysis of the effect of data volume on model training and accuracy.
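As an illustration of the subset construction described above, the following is a minimal sketch rather than the procedure used in the thesis. It assumes that the reduced datasets are nested (each smaller subset is drawn from the next larger one) and sampled uniformly at random; the function name `nested_subsets`, the fractions, and the seed are hypothetical choices made for this example.

```python
import random


def nested_subsets(indices, fractions, seed=0):
    """Return {fraction: list of sample indices}.

    Assumption for illustration: subsets are nested, so every smaller
    subset is contained in each larger one. No class balancing is done.
    """
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(indices)
    rng.shuffle(shuffled)          # one shuffle shared by all subsets

    subsets = {}
    for frac in sorted(fractions, reverse=True):
        size = max(1, round(frac * len(shuffled)))
        subsets[frac] = shuffled[:size]   # a prefix of the same shuffled order
    return subsets


# Hypothetical example: 1000 sample indices, four data volumes.
subsets = nested_subsets(range(1000), fractions=[1.0, 0.5, 0.25, 0.1])
for frac, idx in subsets.items():
    print(f"{frac:.0%} of the data: {len(idx)} samples")
```

Taking prefixes of one shuffled order keeps the subsets comparable with each other; repeating the procedure with different seeds would give independent draws for studying how the particular choice of subset influences the results.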
Throughout this process, statistical analyses will be carried out to verify the robustness of our results and to assess how the choice of subset influences them.
For the image-based networks, the experiments involve training ResNet50 and MobileNetV2 models on subsets of the ImageNet-1K dataset, which contains over 1.2 million training images across 1000 categories. The study examines how the reduction in training data volume affects the convergence of the models, as well as their accuracy in classifying images. Furthermore, the evolution of a network's output for a specific image will be analysed at different stages of training, aiming to evaluate how the network's confidence in a prediction evolves.
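One possible way to follow a network's confidence for a single image across training stages is sketched below. It assumes that confidence is measured as the softmax probability assigned to a chosen class, which the abstract does not specify; the function names and the logit values are hypothetical placeholders, not results from the thesis.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw network outputs."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_trajectory(logits_per_checkpoint, target_class):
    """Softmax probability of `target_class` at each saved checkpoint."""
    return [softmax(logits)[target_class] for logits in logits_per_checkpoint]


# Placeholder logits for one image and three classes at three checkpoints.
# These numbers are made up purely to show the call pattern.
example_logits = [
    [0.1, 0.0, -0.1],   # early in training
    [1.0, 0.2, -0.5],   # mid training
    [3.0, 0.1, -1.0],   # late in training
]
trajectory = confidence_trajectory(example_logits, target_class=0)
print([round(p, 3) for p in trajectory])
```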
Additionally, the thesis explores a time sequence network provided by Volvo Cars, which aims to classify signals from brake pads into squealing and non-squealing categories. As before, the experiments involve training the network on progressively smaller subsets of the original signal dataset, which consists of waveform signals. Furthermore, the network also performs regression in order to predict the shape of the brake pressure signal.
Password: 644429
Examiner: Giovanni Volpe
Supervisor: Tomas Björklund
Opponents: Daniel Olander & Hannes Johansson