Formation and analysis of the efficiency of the dataset for teaching language models to recognize and analyze the source code of programs

Abstract

Kakutin D.Y., Dmitriev A.S.

Article received: 23.04.2022

This article describes how to build a training dataset for neural language models used in source-code analysis, specifically for finding exact matches or semantic correspondences between functions and methods in the source code of a programming language. It also identifies the key parameters the dataset must have for the neural network to be trained correctly.

Keywords: source code, machine learning, natural language processing, neural network, data analysis