During the first step, a set of training data is provided with labels, or classes, from which the algorithm can learn. Labeling the training data is crucial. You can compare this with a person learning to label data correctly: that person also needs a first set of labeled sample data. Unlike a human, however, a data classification algorithm can learn the correct insights much faster, detect more subtle relationships, and process much larger amounts of data, which improves the quality of the detected relationships. Apart from adding a label, no other input is required within Trendskout to train the classification algorithm.
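To make the idea of learning from labeled samples concrete, here is a minimal sketch in Python. It is not Trendskout's implementation: it uses a deliberately simple nearest-centroid classifier, and the sample data and the function names `train_centroids` and `predict` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from math import dist


def train_centroids(samples, labels):
    """Learn one mean vector (centroid) per label from labeled samples."""
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Assign the label whose centroid lies closest to the new sample."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical labeled training data: (height_cm, weight_kg) with a class label.
samples = [(20, 4), (25, 5), (60, 30), (70, 35)]
labels = ["cat", "cat", "dog", "dog"]

centroids = train_centroids(samples, labels)
prediction = predict(centroids, (22, 4.5))  # an unlabeled sample to classify
```

The only supervision the sketch receives is the `labels` list, which mirrors the point above: the label is the one input the learner cannot derive by itself.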
Through “feature selection” and “feature extraction” algorithms, which are executed completely autonomously, Trendskout independently detects which set of variables (e.g. columns or pixels) must be used in the classification model.
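As an illustration of what automatic feature selection can look like, the sketch below ranks columns by how well they separate two classes and keeps the best ones. This is one simple filter-style technique, assumed here for illustration; the text does not say which selection algorithms Trendskout actually uses. The column names, data, and the functions `score_feature` and `select_features` are hypothetical.

```python
from statistics import mean, pstdev


def score_feature(values, labels):
    """Separation score: gap between the class means, scaled by overall spread."""
    spread = pstdev(values)
    if spread == 0:
        return 0.0  # a constant column carries no information
    positives = [v for v, y in zip(values, labels) if y == 1]
    negatives = [v for v, y in zip(values, labels) if y == 0]
    return abs(mean(positives) - mean(negatives)) / spread


def select_features(rows, labels, names, k):
    """Return the names of the k highest-scoring columns."""
    columns = list(zip(*rows))
    scores = {name: score_feature(col, labels) for name, col in zip(names, columns)}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical data set: one informative column, one noisy one, one constant.
names = ["signal", "noise", "constant"]
rows = [
    (1.0, 5.0, 3.0),
    (1.2, 1.0, 3.0),
    (0.9, 4.0, 3.0),
    (5.0, 4.5, 3.0),
    (5.3, 1.5, 3.0),
    (4.8, 5.5, 3.0),
]
labels = [0, 0, 0, 1, 1, 1]

selected = select_features(rows, labels, names, k=1)
```

Here the `signal` column differs strongly between the two classes and is the one retained, while the constant column scores zero and would never be selected.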
During training, Trendskout evaluates various algorithms, algorithm parameters and data transformations. For each combination, different quality scores are calculated by running the model on an unused part of the training data and testing whether the discovered relationships are correct. Based on the result of this evaluation, Trendskout independently tries new combinations of algorithms, parameters and data transformations. This part of the training step is called hypertuning. The entire process is started in Trendskout with a single click on the Run / Train button and runs without user intervention.
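The evaluate-and-retry loop described above can be sketched as follows. This is a simplified illustration, not Trendskout's actual procedure: it tunes a single parameter (the `k` of a k-nearest-neighbours classifier) with an exhaustive search, and scores each candidate by accuracy on a held-out part of the labeled data. The data and the functions `knn_predict`, `accuracy` and `tune` are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train, point, k):
    """Majority vote among the k training samples nearest to point."""
    nearest = sorted(train, key=lambda item: dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, validation, k):
    """Fraction of held-out samples that are classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in validation)
    return hits / len(validation)


def tune(train, validation, candidate_ks):
    """Score every candidate k on the held-out data and keep the best one."""
    scores = {k: accuracy(train, validation, k) for k in candidate_ks}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Hypothetical labeled data; (2, 2) is deliberately mislabeled noise.
train = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"), ((2, 2), "b"),
    ((6, 6), "b"), ((6, 7), "b"), ((7, 6), "b"),
]
# The "unused part" of the labeled data, kept aside for evaluation only.
validation = [((1.2, 1.3), "a"), ((2.1, 2.1), "a"), ((6.4, 6.6), "b")]

best_k, scores = tune(train, validation, candidate_ks=[1, 3, 5, 7])
```

With this data, `k=1` is misled by the mislabeled point and `k=7` always predicts the majority class, so the search settles on `k=3`. A real hypertuning step would also vary the algorithm and the data transformations, and would typically use a smarter search strategy than trying every candidate.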