R Commands for Cleaning Data

This post contains notes from the Coursera Data Analysis Course. Here are some R commands that might be helpful for cleaning data. String Replacement: sub() replaces the first occurrence; gsub() replaces all occurrences. Quantitative Variables in Ranges: cut(data$col, seq(0,100, by=10)) breaks the data up by the range it falls into, in this example: whether the …
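
As a minimal sketch of the commands named in this excerpt, the snippet below contrasts sub() with gsub() and bins a numeric vector with cut(). The vectors raw_names and scores are hypothetical examples, not data from the course.

```r
# Hypothetical example strings; the names are illustrative only.
raw_names <- c("first_name_col", "last_name_col")

# sub() replaces only the first match in each string.
first_only <- sub("_", " ", raw_names)

# gsub() replaces every match in each string.
all_matches <- gsub("_", " ", raw_names)

# cut() assigns each value to the interval it falls into.
# With the defaults, intervals are open on the left and closed on the right.
scores <- c(5, 15, 42, 100)
score_ranges <- cut(scores, seq(0, 100, by = 10))

print(first_only)
print(all_matches)
print(table(score_ranges))
```

With these default settings a value of exactly 0 would fall outside every interval and become NA; passing include.lowest = TRUE to cut() is one way to keep it.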

R Graph Commands for Data Analysis

This post contains notes from the Coursera Data Analysis Course. Here are some basic R commands for creating graphs. Exploratory Graphs: boxplot, barchart, hist, plot, density. Final Graphs for a Report: final graphs need to look a little nicer. They must also have informative labels, a title, and possibly a legend. plot(data$column1, data$column2, …
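
A report-ready graph with labels, a title, and a legend might look roughly like the sketch below. The data frame, the group column, the colors, and every argument after the first two are assumptions made for illustration; they are not the post's own plot call.

```r
# Hypothetical data; the column names are placeholders.
set.seed(1)
data <- data.frame(
  column1 = 1:20,
  column2 = 1:20 + rnorm(20),
  group = rep(c("a", "b"), each = 10)
)

# One color per group (an illustrative choice).
group_colors <- c(a = "blue", b = "red")
point_colors <- unname(group_colors[as.character(data$group)])

# Scatterplot with axis labels and a title.
plot(data$column1, data$column2,
     col = point_colors, pch = 19,
     xlab = "Column 1 (units)", ylab = "Column 2 (units)",
     main = "Column 2 versus column 1")

# Legend explaining the colors.
legend("topleft", legend = names(group_colors),
       col = group_colors, pch = 19, title = "Group")
```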

9 problems with Real World Regression

This list comes from the Coursera Data Analysis Course. Linear and logistic regression are some of the most common techniques applied in data analysis. Here is a list of possible problems with regression in the real world. Confounders: variables that are correlated with both the outcome and other variables in the model. Complicated Interactions …

First Steps to Data Analysis in R

This post contains notes from the Coursera Data Analysis Course. Here are some basic R commands that should be useful for obtaining data and looking at it in R. Ideally these commands are useful for steps 4, 5, and 6 of the 11 Steps to Data Analysis. Load the data and just look at it: download.file('http://location.com', …
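
The "load the data and just look at it" step can be sketched without a network connection by reading a small inline CSV. The csv_text content and the particular inspection functions chosen here are assumptions for illustration, not the commands listed in the full post.

```r
# Hypothetical inline data standing in for a downloaded file.
csv_text <- "id,score
1,10
2,20
3,30"

# Read the CSV text into a data frame.
data <- read.csv(text = csv_text)

# Common first looks at a data frame.
print(dim(data))      # number of rows and columns
print(head(data))     # first few rows
print(summary(data))  # per-column summaries
str(data)             # column types and a preview of the values
```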

Levels of Data Analysis

The list is ordered according to the level of difficulty. Descriptive: just describe the data; common for census-type data. Exploratory: find relationships that were not clear beforehand; useful for defining future studies; remember that correlation does not imply causation. Inferential: use a small dataset to say something about a larger population; most common goal …

Data Analysis by Data Type

Data analysis is performed in many different fields and on many different types of data. Most fields call it something different. The following list comes straight from Jeff Leek's Data Analysis Coursera class. Name of Data Analysis by Data Type: Biostatistics for medical data; Data Science for data from web analytics; Machine Learning for data …

Data Analysis at Coursera

The Coursera Data Analysis course started yesterday. This course would be an excellent follow-up to the Computing for Data Analysis course. For a bit more about the course, check out this video explaining the content. The course consists of lectures, quizzes, and some data analysis assignments. There is still plenty of time to sign up and …