KMeans Clustering (Sklearn)

Group similar records using KMeans

Overview

This workflow demonstrates KMeans clustering with Scikit-learn: records are grouped into clusters according to how similar their features are.

Details

The workflow reads data from a CSV file and converts text features into numerical vectors using the Sklearn TF-IDF Vectorizer node. The transformed data is then clustered using the Sklearn KMeans node to identify distinct groups. The trained clustering model is saved using the Sklearn Model Save node, and the Sklearn Predict node assigns cluster labels to the data. Finally, the Print N Rows node displays a preview of the clustered output for verification.
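For readers who want to see what these nodes roughly correspond to in plain Scikit-learn, the steps can be sketched as below. This is a minimal illustration and not the code the workflow nodes actually run: the inline CSV sample, the column name "text", the file name kmeans_model.joblib, the helper run_workflow, and the parameter values (two clusters, a fixed random seed) are all assumptions made for the example.

```python
import csv
import io
import os
import tempfile

import joblib
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for the CSV input; the "text" column is an assumption.
CSV_DATA = """id,text
1,cheap flights to paris
2,discount flights and hotels in paris
3,python pandas dataframe tutorial
4,python tutorial for dataframe joins
"""


def run_workflow(csv_text, n_clusters=2, n_preview=4):
    # Read CSV: parse the rows into dictionaries keyed by column name.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    texts = [row["text"] for row in rows]

    # TF-IDF Vectorizer: turn the text column into sparse numeric vectors.
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform(texts)

    # KMeans: fit the clustering model on the vectorized data.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    model.fit(features)

    # Model Save: persist the fitted model, then reload it to show it is reusable.
    with tempfile.TemporaryDirectory() as tmp_dir:
        model_path = os.path.join(tmp_dir, "kmeans_model.joblib")
        joblib.dump(model, model_path)
        restored = joblib.load(model_path)

    # Predict: assign a cluster label to every record.
    labels = restored.predict(features)
    for row, label in zip(rows, labels):
        row["cluster"] = int(label)

    # Print N Rows: preview the first few clustered records.
    for row in rows[:n_preview]:
        print(row)
    return rows


clustered = run_workflow(CSV_DATA)
```

In this sketch the number of clusters is fixed up front, as KMeans requires; on real data that value would normally be chosen by inspecting the results for several candidate values.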