Estimation of the dimensionality of the attribute space for multi-label classification
Abstract
Incoming article date: 06.05.2025
This study addresses the challenges of evaluating feature space dimensionality in the context of multi-label classification of cyber attacks. The research focuses on tabular data representations collected through a hardware-software simulation platform designed to emulate multi-label cyber attack scenarios. We investigate how multi-label dependencies, which arise when multiple attack types are executed concurrently on computer networks, influence both the informativeness of feature space assessments and classification accuracy. The Random Forest algorithm is employed as a representative model to quantify these effects. The practical relevance of this work lies in enhancing cyber attack detection and classification accuracy by explicitly accounting for multi-valued attribute dependencies. Experimental results demonstrate that incorporating such dependencies improves model performance, suggesting methodological refinements for security-focused machine learning pipelines.
Keywords: multi-label classification, attribute space, computer attacks, information security, classification of network traffic, attack detection, informative attributes, entropy
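
As a rough illustration of why label dependencies can change an informativeness estimate, the sketch below compares the entropy-based information gain of one attribute measured against each attack label separately and against the joint label combination. It is not the authors' procedure: the toy records, the two-label setup, and the names `entropy`, `information_gain`, and `flag` are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of hashable values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def information_gain(feature, target):
    """H(target) - H(target | feature) for two equally long discrete sequences."""
    n = len(feature)
    groups = {}
    for f, t in zip(feature, target):
        groups.setdefault(f, []).append(t)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(target) - conditional


# Hypothetical toy data: each record carries two binary attack labels (a, b),
# and all four label combinations are equally frequent.
labels = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25

# Hypothetical attribute: 1 when exactly one of the two attacks is active.
flag = [a ^ b for a, b in labels]

# Informativeness against each label on its own (labels treated independently).
per_label = [information_gain(flag, [y[i] for y in labels]) for i in range(2)]

# Informativeness against the joint label combination.
joint = information_gain(flag, labels)

print("per-label gain:", per_label)
print("joint gain:", joint)
```

In this constructed case the attribute carries no information about either label taken alone, yet it carries one bit about the label combination, so an attribute ranking that ignores label co-occurrence would discard it. Real traffic attributes are rarely this clean; the example only shows the mechanism.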