Jenni, Raphael (2022) Better Code Representation for Machine Learning. Other thesis, OST Ostschweizer Fachhochschule.
This is the latest version of this item.
better-code-representation-for-machine-learning-main-ee9fcf44.pdf - Supplemental Material
Abstract
Machine learning is increasingly applied to source code, and several approaches based on paths or on BERT are available. This paper focuses on improving parts of the input vector by creating a more compact embedding. It also explores and discusses ways to reduce the amount of data fed into a model when working with code changes. The results show that the input data can be reduced into a latent space half the size of the input, representing differences and similarities between code paths very compactly while maintaining an accuracy of 99%. They also show that, with proper preprocessing, the amount of data fed into a code-change model can be reduced by around 84%.
| Item Type: | Thesis (Other) |
|---|---|
| Subjects: | Area of Application > Development Tools Technologies > Programming Languages |
| Divisions: | Master of Science in Engineering (MRU Software and Systems) |
| Date Deposited: | 07 Feb 2023 09:41 |
| Last Modified: | 06 Nov 2025 09:56 |
| URI: | https://eprints.ost.ch/id/eprint/1071 |
Available Versions of this Item
- Better Code Representation for Machine Learning. (deposited 19 Sep 2022 07:38)
- Better Code Representation for Machine Learning. (deposited 07 Feb 2023 09:41) [Currently Displayed]
