Commit 7e56327

Update README.md
1 parent 94f6757 commit 7e56327

1 file changed: +1 -1 lines changed

README.md

@@ -9,7 +9,7 @@ The objective of this repository is to outline a set of instructions for impleme
### Overview
Principal Component Analysis (PCA) is a widely used statistical method for reducing dimensionality and visualizing data. It identifies the dominant patterns and correlations in a high-dimensional dataset by transforming the original variables into a new set of uncorrelated variables called principal components. Representing the data with only the top two or three principal components makes it possible to plot the data points and inspect their distribution and structure.

-
+<img src="PCA/figures/pcafig.JPG" width="800" height="400">

Consider a dataset of $n$ observations, each measured on $p$ features. PCA looks for a low-dimensional representation of the dataset that retains as much of its variation as possible. The underlying idea is that each of the $n$ observations lives in a $p$-dimensional space, but not all dimensions are equally informative. PCA seeks a small number of dimensions that capture the most interesting aspects of the data, where "interesting" is measured by how much the observations vary along each dimension.
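
One common way to write this objective down is sketched here, assuming every feature has been centered to mean zero. The symbols are introduced only for illustration and are not part of the original text: $x_{ij}$ is the value of feature $j$ for observation $i$, $\phi_{j1}$ is the weight of feature $j$ in the first direction, and $z_{i1}$ is the coordinate of observation $i$ along that direction.

$$
z_{i1} = \sum_{j=1}^{p} \phi_{j1}\, x_{ij},
\qquad
\max_{\phi_{11},\dots,\phi_{p1}} \; \frac{1}{n} \sum_{i=1}^{n} z_{i1}^{2}
\quad \text{subject to} \quad \sum_{j=1}^{p} \phi_{j1}^{2} = 1
$$

The unit-length constraint matters: without it, the variance could be made arbitrarily large just by scaling the weights.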
Each principal component is a linear combination of the original variables and corresponds to a specific direction in the data space. The first principal component is the linear combination of the original predictors that captures the most variance in the dataset, so it defines the direction along which the data vary the most. The larger the variance along the first component, the more information it carries. The second principal component is also a linear combination of the original predictors; among all directions uncorrelated with the first principal component, it is the one that captures the most of the remaining variance.
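
The two components described above can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions, not the repository's implementation: it finds each direction by power iteration on the sample covariance matrix and uses deflation to obtain the second one. All function names (`center`, `covariance`, `leading_direction`, `first_two_components`, `project`) are hypothetical, and a real project would more likely rely on a linear algebra library.

```python
import math
import random


def center(rows):
    """Subtract each column's mean so every feature has mean zero."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (p x p) of already-centered rows."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
            for a in range(p)]


def leading_direction(matrix, iters=500, seed=0):
    """Power iteration: unit vector with the largest variance, and that variance.

    Assumes the matrix is symmetric, non-zero, and that the random start
    is not orthogonal to the leading direction (true with probability 1).
    """
    p = len(matrix)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    for _ in range(iters):
        w = [sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(v[a] * matrix[a][b] * v[b]
                   for a in range(p) for b in range(p))
    return v, variance


def first_two_components(rows):
    """Return [(direction, variance), (direction, variance)] for PC1 and PC2."""
    cov = covariance(center(rows))
    p = len(cov)
    v1, var1 = leading_direction(cov)
    # Deflation: subtract the variance already explained along v1, so the
    # leading direction of what remains is orthogonal to v1.
    deflated = [[cov[a][b] - var1 * v1[a] * v1[b] for b in range(p)]
                for a in range(p)]
    v2, var2 = leading_direction(deflated)
    return [(v1, var1), (v2, var2)]


def project(rows, direction):
    """Scores: coordinate of each centered observation along a direction."""
    return [sum(x * d for x, d in zip(r, direction)) for r in center(rows)]
```

Calling `first_two_components(rows)` on a list of equal-length numeric rows returns the two directions with their variances, and `project(rows, v1)` gives the coordinates you would plot. The sign of each direction is arbitrary, so a result may come out flipped relative to another implementation.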
