Customer Segments

Grouping customers into meaningful segments based on their spending habits is useful for businesses. It allows a business to test more effectively how a change might affect each group of customers.

This project used the Wholesale Customers Dataset from the UCI Machine Learning Repository, which contains the annual spending habits of 440 clients of a wholesale distributor.

Principal Component Analysis (PCA) was used to reduce the features down to just two dimensions. An unsupervised clustering algorithm was then applied to the reduced data, and the Silhouette Score was used to evaluate how best to place the clients into separate groups.
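As a rough illustration of the dimensionality reduction step, here is a minimal sketch using scikit-learn. It is not the project's actual code: the data below is synthetic, and both the choice of six feature columns and the names `spending` and `reduced` are assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the real data (assumption): 440 clients with
# 6 spending features, drawn from a skewed log-normal distribution.
rng = np.random.default_rng(0)
spending = rng.lognormal(mean=8.0, sigma=1.0, size=(440, 6))

# Project the features onto the two directions of greatest variance.
pca = PCA(n_components=2, random_state=0)
reduced = pca.fit_transform(spending)

print(reduced.shape)                  # one 2-D point per client
print(pca.explained_variance_ratio_)  # share of variance kept per component
```

The explained variance ratio gives a quick check on how much information survives the reduction to two dimensions.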

Using the Gaussian Mixture Model clustering algorithm, the best results were achieved when the customers were clustered into two separate groups, which gave a Silhouette Score of 0.429.
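The selection of the number of clusters can be sketched roughly as follows, again with scikit-learn. This is an illustrative sketch under stated assumptions, not the project's code: the two-dimensional `points` are synthetic, the candidate range of 2 to 6 clusters is hypothetical, and the scores it produces will not match the 0.429 reported above.

```python
import numpy as np
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

# Synthetic 2-D points standing in for the PCA-reduced data (assumption).
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=(-2.0, 0.0), scale=1.0, size=(220, 2)),
    rng.normal(loc=(2.0, 0.0), scale=1.0, size=(220, 2)),
])

# Fit a Gaussian Mixture Model for each candidate cluster count and
# record the mean silhouette score of the resulting assignments.
scores = {}
for k in range(2, 7):  # hypothetical candidate range
    gmm = GaussianMixture(n_components=k, random_state=0)
    labels = gmm.fit_predict(points)
    scores[k] = silhouette_score(points, labels)

# Keep the cluster count with the highest silhouette score.
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

The Silhouette Score ranges from -1 to 1, with higher values indicating that points sit closer to their own cluster than to neighboring ones, so the cluster count with the highest score is taken as the best grouping.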

Note: This project was completed as part of my Machine Learning Nanodegree at Udacity.

The full project write-up and source code can be accessed through the following links: