Clustering

Clustering algorithms are designed to group unstructured data based on the features that data points share. In Trendskout, the resulting clusters can then be used to assign new data to the most relevant cluster. Trendskout uses a range of best-of-breed techniques for Deep Learning, Machine Learning and AI.

A typical example of data clustering is the creation of marketing personas, which define customer segments with similar profiles and needs. Clusters also serve a variety of other use cases; notable examples are fraud detection and predictive maintenance.

Clustering algorithms do not rely on human input to define new clusters and are therefore considered an unsupervised learning technique. Initial training of the algorithm is entirely optional. It is nevertheless possible to apply a clustering algorithm to an initial data set first and to add real-time data in a second stage. Our AI model then assigns this new data to the right cluster.

Practical business applications

- Marketing persona definition
- Detection of fraudulent activities or transactions
- Predictive maintenance planning


Powerful Cloud-AI

Out-of-the-box with an intuitive interface, designed for non-data scientists


Clustering in the AI Flow

1. Connect

2. Analysis

Clustering

Clustering is one of the available Trendskout AI Flow analysis functions.

3. Automate


How does this work technically?

The Clustering process

The Clustering process starts as soon as you click the Run / Train button in Trendskout. The system executes various clustering algorithms on the input data that is linked to the clustering analysis via drag & drop in the AI Flow. It tries various combinations of algorithms and parameters, a process known as hypertuning.

Two criteria are crucial during the clustering process. First, the data points within each detected group (cluster) must be as close to each other as possible. Second, the number of clusters should remain limited. If a clustering algorithm groups data that does not belong together, or finds many small groups, further searching via hypertuning & Auto ML is needed to get better results. You can visualize the trade-off between the number of clusters and the similarity within clusters in an “elbow curve”.
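
To make the elbow curve concrete, here is a minimal sketch in plain Python. It is not Trendskout's implementation: the basic k-means routine, the synthetic data, the number of restarts and all names (`kmeans_inertia`, `elbow`) are assumptions chosen for illustration. For each candidate number of clusters it records the within-cluster sum of squared distances, a common measure of how close the points in each cluster are to each other.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_inertia(points, k, seed, max_iter=50):
    """Run a basic k-means and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k random data points
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group.
        updated = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break  # converged
        centroids = updated
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


# Hypothetical data: three well-separated groups of 30 points each.
data_rng = random.Random(42)
blob_centres = [(0, 0), (10, 10), (0, 10)]
points = [
    (cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5))
    for cx, cy in blob_centres
    for _ in range(30)
]

# For each k, keep the best of several random restarts.
elbow = {
    k: min(kmeans_inertia(points, k, seed) for seed in range(10))
    for k in range(1, 7)
}
for k, inertia in elbow.items():
    print(f"k={k}: within-cluster sum of squares = {inertia:.1f}")
```

On data like this, the value typically drops sharply until the number of clusters matches the number of real groups and flattens afterwards. That bend is the "elbow": adding more clusters beyond it buys little extra similarity.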

The hypertuning process stops once an optimal result is achieved.
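
The search-and-stop idea can be sketched as follows. This is an illustration under stated assumptions, not a description of Trendskout's internals: the distance-threshold clustering, the silhouette coefficient as quality score, the `patience` stopping rule and every function name are hypothetical choices. The silhouette coefficient rewards tight, well-separated clusters, and this sketch scores singleton clusters as 0, so a result with many tiny groups ranks low.

```python
import math


def threshold_clusters(points, eps):
    """Label points so that neighbours closer than eps share a cluster."""
    labels = [None] * len(points)
    current = 0
    for start in range(len(points)):
        if labels[start] is not None:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # flood-fill over chains of close neighbours
            i = stack.pop()
            for j in range(len(points)):
                if labels[j] is None and math.dist(points[i], points[j]) <= eps:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        return -1.0  # undefined for one cluster; treated as worst case here
    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue
        # a: mean distance to own cluster, b: mean distance to nearest other
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(math.dist(points[i], points[j]) for j in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def tune_eps(points, candidates, patience=2):
    """Try each eps; stop after `patience` candidates without improvement."""
    best_eps, best_labels, best_score = None, None, float("-inf")
    stale = 0
    for eps in candidates:
        labels = threshold_clusters(points, eps)
        score = silhouette(points, labels)
        if score > best_score:
            best_eps, best_labels, best_score = eps, labels, score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_eps, best_labels, best_score


# Hypothetical data: three tight groups of four points.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (0, 10), (1, 10), (0, 11), (1, 11),
]
best_eps, best_labels, best_score = tune_eps(points, [0.5, 1.5, 5.0, 15.0])
print(best_eps, len(set(best_labels)), round(best_score, 2))
```

A threshold that is too small leaves every point in its own cluster, and one that is too large merges everything into a single group. The search keeps the setting in between that scores best.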


Use of the Clustering output

There are two general types of use cases for clustering. The first is where the clusters are interpreted as an advanced analysis of a data set and are used to make better decisions and gain better insight into certain processes.

The second is where the clusters, and the underlying clustering model, are used to assign new data points to a cluster. This is comparable to classification. The difference is that the training step is unsupervised and the labeling is based on automatically discovered clusters rather than on predefined labels.
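
As a minimal sketch of this second use, assume the trained model can be summarised by one centre point (centroid) per cluster, which holds for centroid-based algorithms such as k-means but not for every clustering algorithm. A new data point then receives the label of its nearest centroid. The centroid values, cluster names and the `assign_cluster` function below are hypothetical.

```python
import math


def assign_cluster(point, centroids):
    """Return the label of the centroid closest to the given point."""
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))


# Hypothetical centroids discovered during an earlier, unsupervised training run.
centroids = {
    "cluster_0": (0.5, 0.5),
    "cluster_1": (10.5, 10.5),
    "cluster_2": (0.5, 10.5),
}

# New data arriving after training is labelled without any human-defined classes.
new_points = [(1.0, 0.2), (9.8, 11.0), (0.1, 9.5)]
assignments = [assign_cluster(p, centroids) for p in new_points]
print(assignments)  # ['cluster_0', 'cluster_1', 'cluster_2']
```

The labels here are just cluster identifiers. Giving them a business meaning, such as a persona name, remains an interpretation step on top of the model.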


Clustering + Trendskout

The Trendskout automated machine learning platform contains various clustering algorithms that can be linked via a drag & drop interface to input and automation steps in an AI Flow. All data transformation, hypertuning, algorithm selection and GPU / TPU cloud computing are fully managed in the background.

This makes deploying clustering applications in your organization a lot more efficient, and it lets you experiment without worry.

Ready to discover all features during a live demo - with your data? Get in touch and we will be happy to show you the direct business value of artificial intelligence for your organisation.