Ch_19_Cryptocurrencies

Part 1: Preprocessing the Data for PCA / Part 2: Reducing Data Dimensions Using PCA / Part 3: Clustering Cryptocurrencies Using K-means / Part 4: Visualizing Cryptocurrencies Results

Please note: Crypto_Clustering_Saved.ipynb will not open in Jupyter Notebook but will open in Google Colab.

Data Analysis of Cryptocurrencies

Introduction:

In this report, we analyze cryptocurrency market data to build a classification system that Accountability Accounting can use for a new investment portfolio. We group the cryptocurrencies with unsupervised learning, using PCA to reduce the dimensionality of the data and the K-means algorithm to cluster it.

Part 1: Preprocessing the Data for PCA

We started by importing the dataset from CryptoCompare into the Google Colab notebook as a Pandas DataFrame named crypto_df. We then removed every row with at least one null value, kept only the cryptocurrencies that are being traded, and filtered the DataFrame so it only has rows where coins have been mined. Afterward, we dropped the 'IsTrading' and 'CoinName' columns from crypto_df because they are not used by the clustering algorithm.
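The cleaning steps above can be sketched roughly as follows. This is a minimal illustration on a hypothetical five-row stand-in for crypto_data.csv, not the notebook's actual code; the column name 'TotalCoinsMined' and the sample values are assumptions.

```python
import pandas as pd

# Hypothetical miniature stand-in for crypto_data.csv.
# 'TotalCoinsMined' is an assumed column name; the values are made up.
crypto_df = pd.DataFrame(
    {
        "CoinName": ["CoinA", "CoinB", "CoinC", "CoinD", "CoinE"],
        "Algorithm": ["Scrypt", "SHA-256", "X11", "Scrypt", "SHA-256"],
        "IsTrading": [True, True, False, True, True],
        "ProofType": ["PoW", "PoS", "PoW", "PoW/PoS", "PoW"],
        "TotalCoinsMined": [41.0, 0.0, 100.0, None, 2500.0],
        "TotalCoinSupply": [42.0, 500.0, 1000.0, 300.0, 21000.0],
    },
    index=["A", "B", "C", "D", "E"],
)

# Remove rows that have at least one null value.
crypto_df = crypto_df.dropna()

# Keep only the cryptocurrencies that are being traded.
crypto_df = crypto_df[crypto_df["IsTrading"]]

# Keep only rows where coins have been mined.
crypto_df = crypto_df[crypto_df["TotalCoinsMined"] > 0]

# Drop the columns that the clustering algorithm does not use.
crypto_df = crypto_df.drop(columns=["IsTrading", "CoinName"])

print(crypto_df)
```

In this toy data, only rows A and E survive all three filters.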

Next, we used the get_dummies() function to create dummy (one-hot) variables for the two text features, Algorithm and ProofType, and stored the resulting data in a new DataFrame named X. After that, we used the StandardScaler fit_transform() method to standardize the features in the X DataFrame.
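A minimal sketch of the encoding and scaling step, assuming a small made-up crypto_df in place of the cleaned dataset. The name X_scaled and passing dtype=float to get_dummies() are choices made for this illustration, not taken from the notebook.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical cleaned data; values are made up for illustration.
crypto_df = pd.DataFrame(
    {
        "Algorithm": ["Scrypt", "SHA-256", "X11", "Scrypt"],
        "ProofType": ["PoW", "PoS", "PoW", "PoW/PoS"],
        "TotalCoinsMined": [41.0, 1000.0, 250.0, 9000.0],
        "TotalCoinSupply": [42.0, 2000.0, 500.0, 21000.0],
    }
)

# One-hot encode the two text features; numeric columns pass through unchanged.
X = pd.get_dummies(crypto_df, columns=["Algorithm", "ProofType"], dtype=float)

# Standardize every feature to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

print(X.columns.tolist())
print(X_scaled.shape)
```

Standardizing matters here because the coin-supply columns are orders of magnitude larger than the 0/1 dummy columns and would otherwise dominate the principal components.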

Part 2: Reducing Data Dimensions Using PCA

We applied PCA to reduce the data to three principal components. We then created a new DataFrame named pcs_df with the columns PC 1, PC 2, and PC 3, using the index of the crypto_df DataFrame as its index.
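The reduction step might look like the sketch below. Random numbers stand in for the standardized feature matrix, and the index labels are hypothetical; only the column names PC 1, PC 2, and PC 3 and the name pcs_df come from the report.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Stand-in for the standardized features: 10 coins, 6 features (random values).
rng = np.random.default_rng(0)
X_scaled = rng.normal(size=(10, 6))

# Hypothetical index; in the report this is the index of crypto_df.
index = [f"COIN{i}" for i in range(10)]

# Project the data onto its first three principal components.
pca = PCA(n_components=3)
pcs = pca.fit_transform(X_scaled)

pcs_df = pd.DataFrame(pcs, columns=["PC 1", "PC 2", "PC 3"], index=index)

# Share of the total variance captured by each component.
print(pca.explained_variance_ratio_)
print(pcs_df.head())
```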

Part 3: Clustering Cryptocurrencies Using K-means

Using the pcs_df DataFrame, we plotted an elbow curve to find the best value for K. We drew the curve with Matplotlib because we could not locate or install hvPlot.pandas after nearly a dozen attempts. The best value for K was 4. We then ran the K-means algorithm on pcs_df to predict the K clusters for the cryptocurrency data.
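One way to build the elbow curve and run the final model is sketched below. Synthetic blobs from make_blobs stand in for pcs_df, and the range of K values, the n_init and random_state settings, and the 'Class' column name are assumptions made for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for pcs_df: 200 points around 4 centers in 3 dimensions.
points, _ = make_blobs(
    n_samples=200, centers=4, n_features=3, cluster_std=0.6, random_state=0
)
pcs_df = pd.DataFrame(points, columns=["PC 1", "PC 2", "PC 3"])

# Fit K-means for each candidate K and record the inertia
# (sum of squared distances from each point to its cluster center).
k_values = list(range(1, 11))
inertia = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    model.fit(pcs_df)
    inertia.append(model.inertia_)

# Plot inertia against K; the bend ("elbow") suggests a reasonable K.
fig, ax = plt.subplots()
ax.plot(k_values, inertia, marker="o")
ax.set_xlabel("Number of clusters (K)")
ax.set_ylabel("Inertia")
ax.set_title("Elbow curve")

# Run K-means with the chosen K and attach the predicted cluster to each row.
model = KMeans(n_clusters=4, n_init=10, random_state=0)
predictions = model.fit_predict(pcs_df)
clustered_df = pcs_df.assign(Class=predictions)
```

Inertia always falls as K grows, so the useful signal is where the drop flattens out rather than where the value is smallest.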

Part 4: Visualizing Cryptocurrencies Results

We created a scatter plot to visualize the clusters of cryptocurrencies in the space of the three principal components. The plot shows the relationship between PC 1, PC 2, and PC 3; the size of each data point is based on TotalCoinSupply, and its color is based on the cluster predicted by K-means.
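A plot along these lines could be drawn with Matplotlib's 3D axes, as in the sketch below. The report does not say which plotting library was used for this figure, so Matplotlib is an assumption, as are the made-up data, the 'Class' column name, and the rescaling of TotalCoinSupply to marker sizes between 20 and 300.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical clustered data; values are made up for illustration.
clustered_df = pd.DataFrame(
    {
        "PC 1": [-1.2, -0.9, 0.4, 0.6, 2.1, 1.8],
        "PC 2": [0.3, 0.5, -1.1, -0.8, 0.9, 1.2],
        "PC 3": [0.1, -0.2, 0.7, 0.5, -1.0, -0.6],
        "TotalCoinSupply": [42.0, 500.0, 2000.0, 21000.0, 1.0e6, 8.4e7],
        "Class": [0, 0, 1, 1, 2, 2],
    }
)

# Rescale supply linearly to marker areas in [20, 300] so every point stays visible.
supply = clustered_df["TotalCoinSupply"]
sizes = 20 + 280 * (supply - supply.min()) / (supply.max() - supply.min())

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(
    clustered_df["PC 1"],
    clustered_df["PC 2"],
    clustered_df["PC 3"],
    s=sizes,                  # marker size from TotalCoinSupply
    c=clustered_df["Class"],  # marker color from the K-means cluster
    cmap="viridis",
)
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
ax.set_zlabel("PC 3")
```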

Conclusion:

In this report, we analyzed cryptocurrency market data using two unsupervised learning techniques: PCA to reduce the data to three principal components and the K-means algorithm to cluster it. The best value for K was 4, and we created a scatter plot to visualize the resulting clusters across the three principal components. After cleaning, 532 actively trading cryptocurrencies remained for the model to process. This report provides a classification system that Accountability Accounting can use for a new investment portfolio.

Repository files:

.DS_Store
.gitignore
README.md
crypto_clustering.ipynb
crypto_clustering_resaved.ipynb
crypto_clustering_starter_code.ipynb
crypto_data.csv

radatu/Ch_19_Cryptocurrencies

Folders and files

Latest commit

History

Repository files navigation

Ch_19_Cryptocurrencies

Please note: Crypto_Clustering_Saved.ipynb will not open in Jupyter notebooks but will open in Google Colabs.

Data Analysis of Cryptocurrencies

Introduction:

Part 1: Preprocessing the Data for PCA

Part 2: Reducing Data Dimensions Using PCA

Part 3: Clustering Cryptocurrencies Using K-means

Part 4: Visualizing Cryptocurrencies Results

Conclusion:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages