Online Retail Customer Segmentation
RFM-based (Recency, Frequency, Monetary) customer segmentation on online retail transactions using unsupervised learning. The project benchmarks KMeans against a Gaussian Mixture Model (GMM), then translates the clusters into business-friendly segments and retention actions.
Overview
The report processes 525,461 raw rows into 407,664 clean rows and segments 4,312 customers into 3 clusters. KMeans (k=3) is selected as the baseline with a silhouette score of 0.4117. Segment outcomes include Champions, Potential Loyalists, and At-Risk groups, each with targeted CRM recommendations.
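To make the silhouette figure easier to interpret, here is a minimal from-scratch sketch of how a mean silhouette score is computed. It is for illustration only: the project most likely relies on a library implementation, and the function name `silhouette_score_sketch`, the Euclidean metric, and the toy points are all assumptions, not taken from the notebook.

```python
import math


def silhouette_score_sketch(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance.

    Hypothetical helper for illustration; not the project's code.
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Average distance from point i to the given members, excluding i itself.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # singleton cluster: coefficient is 0 by convention
        a = mean_dist(i, own)  # cohesion: mean distance within own cluster
        b = min(  # separation: mean distance to the nearest other cluster
            mean_dist(i, members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Toy data (made up): two tight, well-separated groups.
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
good_score = silhouette_score_sketch(toy_points, [0, 0, 1, 1])
bad_score = silhouette_score_sketch(toy_points, [0, 1, 0, 1])
print(f"good labeling: {good_score:.3f}, bad labeling: {bad_score:.3f}")
```

Scores range from -1 to 1. Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest points assigned to the wrong cluster.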
Highlights
- RFM feature engineering with log-stabilized Frequency and Monetary values.
- Model comparison: KMeans baseline and GMM secondary benchmark.
- Business outputs: segment sizes, revenue profile, and activation/win-back actions.
- Visuals: customer counts by segment, average revenue by segment, PCA projection.
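The RFM feature step listed above can be sketched in plain Python as follows. This is a hedged illustration, not the notebook's code: the tuple layout, the `build_rfm` name, the snapshot date, the sample rows, and the choice of `log1p` as the log transform are all assumptions.

```python
import math
from collections import defaultdict
from datetime import date


def build_rfm(transactions, snapshot):
    """Aggregate line items into per-customer RFM features.

    transactions: iterable of (customer_id, invoice_id, invoice_date, amount).
    snapshot: reference date used to measure recency.
    Hypothetical helper; field names are illustrative.
    """
    last_purchase = {}
    invoices = defaultdict(set)
    spend = defaultdict(float)
    for customer, invoice, day, amount in transactions:
        last_purchase[customer] = max(day, last_purchase.get(customer, day))
        invoices[customer].add(invoice)  # frequency counts distinct invoices
        spend[customer] += amount

    rfm = {}
    for customer, last_day in last_purchase.items():
        frequency = len(invoices[customer])
        monetary = spend[customer]
        rfm[customer] = {
            "recency_days": (snapshot - last_day).days,
            "frequency": frequency,
            "monetary": monetary,
            # log1p compresses the long right tail; clamping at 0 is an
            # assumption that cleaned data has no net-negative customers.
            "log_frequency": math.log1p(frequency),
            "log_monetary": math.log1p(max(monetary, 0.0)),
        }
    return rfm


# Made-up sample rows, for illustration only.
sample_rows = [
    ("C1", "INV1", date(2011, 1, 5), 20.0),
    ("C1", "INV1", date(2011, 1, 5), 5.0),
    ("C1", "INV2", date(2011, 3, 1), 30.0),
    ("C2", "INV3", date(2011, 2, 10), 100.0),
]
rfm_table = build_rfm(sample_rows, snapshot=date(2011, 3, 11))
for customer_id, features in sorted(rfm_table.items()):
    print(customer_id, features)
```

Log-transforming Frequency and Monetary before clustering keeps a handful of very large buyers from dominating distance-based methods such as KMeans. The features would typically also be standardized before fitting, though the source does not state whether that is done here.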
Links
The report includes notebook and export references (for example, notebooks/01_customer_segmentation_starter.ipynb and data/processed/customer_segments.csv) to support downstream dashboards and campaign execution.