Skip to content

kazarmy/GetAndCleanProject

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 

Repository files navigation

README for the run_analysis.R script

The run_analysis.R script takes as input the data from the UCI Machine Learning Repository's Human Activity Recognition Using Smartphones Data Set and produces as output a wide tidy data set with the average of the means and standard deviations of each measurement, for each activity and each subject. The tidy data set meets the principles of tidiness stated in [1] and [2], and it has a code book.

Regarding the "what columns are measurements on the mean and standard deviation" issue, I have decided to include only features with mean() or std() because "measurement" here appears to refer only to the smartphone sensor signals listed in Table 2 of [3].

Running the script

  1. Install the dplyr package if you have not already done so.
  2. Download run_analysis.R and the HAR data set zipfile into R's working directory.
  3. Execute the command source("run_analysis.R"). The script will print messages showing the steps taken.
  4. If all goes well, the tidy data set will be in tidy_data_2.txt in the UCI HAR Dataset subdirectory of the working directory.

Reading the tidy data set

After you have run the run_analysis.R script, you can read and view the tidy data set using the following code:

data <- read.table("UCI HAR Dataset/tidy_data_2.txt", header = TRUE, check.names = FALSE)
View(data)

Note that despite check.names = FALSE above, the columns are accessible from R. You just need to surround the column names with backticks, e.g. data$`avg-tBodyAcc-mean()-X` . The code book contains the rationale for the column names.

References

[1] Hadley Wickham (2014). "Tidy Data". Journal of Statistical Software, vol. 59, no. 10.

[2] David Hood (2015). "Tidy Data and the Assignment". https://class.coursera.org/getdata-030/forum/thread?thread_id=107 Accessed on 24 July 2015.

[3] Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra and Jorge L. Reyes-Ortiz (2013). "A Public Domain Dataset for Human Activity Recognition Using Smartphones". 21th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN 2013. Bruges, Belgium.

About

Getting and Cleaning Data (Coursera/JHU) course project

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages