First Steps to Data Analysis in R

This post is notes from the Coursera Data Analysis Course.

Here are some basic R commands that should be useful for obtaining data and looking at it in R. They apply mainly to steps 4, 5, and 6 of the 11 Steps to Data Analysis.

Load the data and just look at it


download.file('http://location.com', 'localfile.csv')
data <- read.csv('localfile.csv')
dim(data)
names(data)
quantile(data$column)
hist(data$column)
head(data)
summary(data)
str(data)
unique(data$column)
length(unique(data$column))
table(data$column)  # count of how many times each value appears in the column
table(data$column1, data$column2)
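
To see what these commands return without downloading anything, here is a small sketch on mtcars, a data set that ships with R. The choice of mtcars and of its mpg, cyl, and gear columns is an assumption for illustration only; substitute your own data frame and column names.

```r
# mtcars ships with base R (datasets package), so nothing needs downloading
data <- mtcars

dim(data)                    # number of rows and columns: 32 11
names(data)                  # column names
quantile(data$mpg)           # min, quartiles, and max of one numeric column

unique(data$cyl)             # the distinct values in a column
length(unique(data$cyl))     # how many distinct values there are: 3

# One-way table: how many rows have each value of cyl
cyl_counts <- table(data$cyl)
cyl_counts

# Two-way table: rows are cyl values, columns are gear values
cyl_by_gear <- table(data$cyl, data$gear)
cyl_by_gear
```

The two-way table is handy for spotting combinations that never occur, which show up as zeros.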

any(data$column < 100)
all(data$column > 100)
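
One thing worth knowing about any and all: a missing value can make the answer NA rather than TRUE or FALSE. A minimal sketch, using a made-up vector x in place of data$column:

```r
# Hypothetical values standing in for data$column; one is missing
x <- c(150, 250, NA)

any(x < 100)                # NA: the missing value might be below 100
all(x > 100)                # NA: the missing value might not be above 100

# na.rm = TRUE drops missing values before checking
any(x < 100, na.rm = TRUE)  # FALSE
all(x > 100, na.rm = TRUE)  # TRUE
```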

colSums(data)
colMeans(data, na.rm=TRUE)
rowMeans(data, na.rm=TRUE)
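
The column and row summaries expect numeric input, so they stop with an error if the data frame contains a text column. One way around this, sketched here on a made-up data frame df (the column names are hypothetical), is to keep only the numeric columns first.

```r
# Made-up data frame with one text column and one missing value
df <- data.frame(
  name   = c("a", "b", "c"),
  height = c(1.5, 2.5, NA),
  weight = c(10, 20, 30)
)

# sapply() applies is.numeric to each column and returns TRUE/FALSE per column
numeric_cols <- sapply(df, is.numeric)
numeric_df <- df[, numeric_cols, drop = FALSE]

col_totals <- colSums(numeric_df, na.rm = TRUE)   # height = 4, weight = 60
col_means  <- colMeans(numeric_df, na.rm = TRUE)  # height = 2, weight = 20
row_means  <- rowMeans(numeric_df, na.rm = TRUE)  # 5.75, 11.25, 30
```

Note that R is case sensitive: the functions are spelled colSums and colMeans, with a capital S and M.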

Look for missing values


is.na(data$column)
sum(is.na(data$column))
table(data$column, useNA="ifany")
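
To check every column at once rather than one at a time, is.na can be applied to the whole data frame. A small sketch on a made-up data frame df; complete.cases is a base R function that is not in the list above, included here as one possible next step:

```r
# Made-up data frame with missing values in both columns
df <- data.frame(
  age    = c(21, NA, 35, NA),
  income = c(50, 60, NA, 80)
)

# is.na() on a data frame gives a TRUE/FALSE matrix; colSums counts the TRUEs
missing_per_column <- colSums(is.na(df))   # age = 2, income = 1

# Share of missing values in each column
missing_share <- colMeans(is.na(df))       # age = 0.5, income = 0.25

# complete.cases() is TRUE for rows with no missing values
complete_rows <- df[complete.cases(df), ]  # keeps only the first row
```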

For more information on any R command, type ? followed by the command name in the R console. For example, to learn more about the dim command, type ?dim
