Workflow Automation Templates
A library of ready-to-use workflow templates to accelerate your data journey

KMeans Clustering (Sklearn)
Group similar records using KMeans

Overview
This workflow uses scikit-learn's KMeans algorithm to group records with similar feature values into clusters.
Details
The workflow reads data from a CSV file and converts text features into numerical vectors using the Sklearn TF-IDF Vectorizer node. The transformed data is then clustered using the Sklearn KMeans node to identify distinct groups. The trained clustering model is saved using the Sklearn Model Save node, and the Sklearn Predict node assigns cluster labels to the data. Finally, the Print N Rows node displays a preview of the clustered output for verification.
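For readers who want to see roughly what these nodes correspond to in plain scikit-learn, the steps above can be sketched as follows. This is only an approximation under stated assumptions: the internals of the workflow nodes are not described here, and the inline sample data, the `text` column name, the choice of two clusters, and the `kmeans_model.joblib` file name are all hypothetical placeholders.

```python
import io
import os
import tempfile

import joblib
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for the CSV file; a real run would pass a file path.
CSV_TEXT = """id,text
1,cheap flights to paris
2,discount flights and hotel deals
3,python pandas dataframe tutorial
4,learn python machine learning
5,paris hotel booking offers
6,machine learning with python code
"""


def run_workflow(csv_source, text_column="text", n_clusters=2, preview_rows=5):
    """Approximate the read, vectorize, cluster, save, predict, preview steps."""
    # Read the CSV (path or file-like object).
    df = pd.read_csv(csv_source)

    # Convert the text column into TF-IDF feature vectors.
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform(df[text_column])

    # Fit KMeans; n_clusters is an assumed value, tune it for real data.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    model.fit(features)

    # Save the trained model (assumed here to use joblib and a temp folder).
    model_path = os.path.join(tempfile.mkdtemp(), "kmeans_model.joblib")
    joblib.dump(model, model_path)

    # Reload the saved model and assign a cluster label to every row.
    reloaded = joblib.load(model_path)
    df["cluster"] = reloaded.predict(features)

    # Preview the first rows of the clustered output.
    print(df.head(preview_rows))
    return df, model_path


clustered, saved_path = run_workflow(io.StringIO(CSV_TEXT))
```

Cluster numbers are arbitrary identifiers, so the same grouping may appear under different labels from run to run unless `random_state` is fixed. To score new records later, the fitted vectorizer would need to be saved alongside the model, since the model only accepts features produced by that same vectorizer.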