First Steps to Data Analysis in R

These are my notes from the Coursera Data Analysis course.

Here are some basic R commands that should be useful for obtaining data and taking a first look at it in R. They apply mainly to steps 4, 5, and 6 of the 11 Steps to Data Analysis.

Load the data and just look at it


download.file('http://location.com', 'localfile.csv')
data <- read.csv('localfile.csv')
dim(data)
names(data)
quantile(data$column)
hist(data$column)
head(data)
summary(data)
str(data)
unique(data$column)
length(unique(data$column))
table(data$column)  # count of how many times each value appears in the column
table(data$column1, data$column2)

any(data$column < 100)
all(data$column > 100)

colSums(data)
colMeans(data, na.rm=TRUE)
rowMeans(data, na.rm=TRUE)
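
To try these commands without downloading anything, here is a minimal sketch on a small made-up data frame. The column names (height, weight, group) and their values are assumptions for illustration only and do not come from the course data. Note that colSums and colMeans stop with an error if the data frame has non-numeric columns, so the sketch selects the numeric columns first.

```r
# Hypothetical toy data standing in for read.csv('localfile.csv')
data <- data.frame(
  height = c(150, 160, 170, 180, NA),
  weight = c(50, 60, 70, 80, 90),
  group  = c("a", "b", "a", "b", "a")
)

dim(data)                 # number of rows, number of columns
names(data)               # column names
table(data$group)         # how many times each value appears

any_light <- any(data$weight < 100)   # TRUE if at least one value is below 100
all_heavy <- all(data$weight > 100)   # TRUE only if every value is above 100

# Keep only the numeric columns before calling colSums / colMeans
numeric_cols <- data[sapply(data, is.numeric)]
col_sums  <- colSums(numeric_cols, na.rm = TRUE)
col_means <- colMeans(numeric_cols, na.rm = TRUE)   # na.rm drops the NA in height
```

Without na.rm = TRUE, any column that contains an NA gets NA as its sum or mean.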

Look for missing values


is.na(data$column)
sum(is.na(data$column))
table(data$column, useNA="ifany")
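
As a rough illustration of what the missing-value commands return, here is a sketch on a made-up vector and data frame. The names x and df and their values are assumptions, not part of the course material. The last line uses colSums(is.na(...)), one common way to count missing values for every column at once.

```r
# Hypothetical column with two missing values
x <- c(1, NA, 3, NA, 5)

is.na(x)                               # logical vector, TRUE where a value is missing
n_missing <- sum(is.na(x))             # TRUE counts as 1, so this is the number of NAs
tab <- table(x, useNA = "ifany")       # adds an NA entry only when NAs are present

# Count missing values per column of a data frame
df <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6))
na_per_column <- colSums(is.na(df))
```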

For more information on any R command, type ? followed by the command name in the R console. For example, to learn more about the dim command, type ?dim
