A weighted K-means clustering algorithm that groups people who live close to each other.
The data points are the people to be clustered. Each person's weight is the distance from their home to the destination. Distances between locations are computed with the Haversine formula, because locations are given as latitude/longitude coordinates.
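For readers unfamiliar with the Haversine distance, here is a minimal sketch of how it can be computed in R. The function name `haversine_km`, its argument order, and the Earth radius of 6371 km are assumptions made for illustration; the repository's `haversine_dist.R` may be written differently.

```r
# Hypothetical helper for illustration; not necessarily what haversine_dist.R does.
# Inputs are in decimal degrees; the result is in kilometres.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin guards against rounding pushing `a` slightly above 1
  2 * radius_km * asin(sqrt(pmin(1, a)))
}

# One degree of longitude along the equator is roughly 111 km
one_degree <- haversine_km(0, 0, 0, 1)
print(one_degree)
```

The function is vectorised, so it can compare many people against a single point in one call.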
Prerequisites:
- Download and install R.
- Clone the repository:
git clone https://github.com/erickmp07/cluster-people.git
To run the scripts:
cd cluster-people/codes
Then, start the R interactive terminal:
R
In the R session, load the scripts:
source("haversine_dist.R")
source("SSE.R")
source("weighted_kmeans.R")
source("print_result.R")
The print_result.R script will read the CSV file and print the result generated by the K-means clustering algorithm.
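To give a rough idea of what a weighted K-means loop can look like, here is a small sketch in R. It is not the code in `weighted_kmeans.R`. The function names, the choice of the first K points as initial centroids, and the use of the weights in a weighted mean of the coordinates are all assumptions made for illustration.

```r
# Hypothetical sketch; the repository's weighted_kmeans.R may work differently.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  a <- sin((lat2 - lat1) * to_rad / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin((lon2 - lon1) * to_rad / 2)^2
  2 * radius_km * asin(sqrt(pmin(1, a)))
}

weighted_kmeans_sketch <- function(lat, lon, w, k, max_iter = 100) {
  # Assumption: start from the first k points so the demo is deterministic
  c_lat <- lat[seq_len(k)]
  c_lon <- lon[seq_len(k)]
  cluster <- integer(length(lat))
  for (iter in seq_len(max_iter)) {
    # Distance matrix: one row per person, one column per centroid
    d <- matrix(
      vapply(seq_len(k),
             function(j) haversine_km(lat, lon, c_lat[j], c_lon[j]),
             numeric(length(lat))),
      nrow = length(lat)
    )
    # Assign each person to the nearest centroid
    new_cluster <- max.col(-d, ties.method = "first")
    if (all(new_cluster == cluster)) break
    cluster <- new_cluster
    # Assumption: centroids are weighted means of the raw coordinates,
    # which is only a reasonable approximation over small areas
    for (j in seq_len(k)) {
      idx <- cluster == j
      if (any(idx)) {
        c_lat[j] <- weighted.mean(lat[idx], w[idx])
        c_lon[j] <- weighted.mean(lon[idx], w[idx])
      }
    }
  }
  list(cluster = cluster,
       centers = data.frame(latitude = c_lat, longitude = c_lon))
}

# Made-up coordinates and weights: two obvious groups
res <- weighted_kmeans_sketch(
  lat = c(0, 10, 0.1, 10.1),
  lon = c(0, 10, 0.1, 10.1),
  w   = c(1, 1, 2, 2),
  k   = 2
)
print(res$cluster)
```

The sketch assumes non-negative weights that do not all equal zero within a cluster; otherwise `weighted.mean` returns `NaN`.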
To change the number of clusters, change the value of K in print_result.R.
To change the input data, change the CSV file.
NOTE: The CSV file should have the columns: name, longitude, latitude and distance.
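Before swapping in a new CSV file, it can help to check that the expected columns are present. The snippet below is a small sketch of such a check. The sample rows are made up for illustration and are not data from the repository, and the unit of the distance column is not specified by this README.

```r
# Made-up sample rows for illustration only
csv_text <- "name,longitude,latitude,distance
Alice,-46.63,-23.55,5.2
Bob,-46.70,-23.60,8.1"

# With a real file, replace `text = csv_text` with the file path
people <- read.csv(text = csv_text, stringsAsFactors = FALSE)

required <- c("name", "longitude", "latitude", "distance")
missing_cols <- setdiff(required, names(people))
if (length(missing_cols) > 0) {
  stop("CSV is missing columns: ", paste(missing_cols, collapse = ", "))
}
print(people)
```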
This project was developed with the following technologies:
- R
PRs and stars are always welcome.
To ask a question, please contact me.
Licensed under the MIT License.