Although it is not specific to data science, LaTeX is probably familiar to anyone who writes a lot of documents with mathematical notation. LaTeX is a typesetting system commonly used for mathematics.
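For anyone who has not seen it before, here is a minimal sketch of what a LaTeX document looks like. The formula (a sample mean) is just an illustration I picked, and the `amsmath` package is a common choice rather than a requirement.

```latex
\documentclass{article}
\usepackage{amsmath} % common math package; optional for this small example

\begin{document}

% Inline math goes between dollar signs.
The sample mean of $n$ observations $x_1, \dots, x_n$ is

% Displayed, numbered equations use the equation environment.
\begin{equation}
  \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i .
\end{equation}

\end{document}
```

Compiling this with a standard tool such as `pdflatex` should produce a one-page PDF with the sentence and a numbered equation.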

There are now some nice resources to help you write and produce documents collaboratively on the web.

Another site, LaTeX Templates, is just what you would guess. It contains a bunch of sample templates to make creating documents a bit quicker and easier.

The Strata Conference 2013 starts today, February 26, 2013. Beginning Wednesday, many of the keynotes will be live-streamed, so you can still follow along even if you (like me) are not physically present. The talks are always very good, so if you have a chance, it is worth catching a few of the keynotes on the live stream.

A while back, James Kobielus wrote the article Data Scientist: Consider the Curriculum. It contains one of the best descriptions of a data science curriculum I have seen. The article also includes a list of algorithms and modeling techniques that a data scientist should know. Below is the list from the article.

linear algebra

basic statistics

linear and logistic regression

data mining

predictive modeling

cluster analysis

association rules

market basket analysis

decision trees

time-series analysis

forecasting

machine learning

Bayesian and Monte Carlo statistics

matrix operations

sampling

text analytics

summarization

classification

principal components analysis

experimental design

unsupervised learning

constrained optimization

The list looks almost overwhelming.
Do you think anything is missing from the list?

The very first issue of Big Data Journal is out. All the articles are freely available for download. The titles and authors of the articles look quite good. I will probably be posting more as I read through some of the articles.

… to establish the country’s leading data science training and research facilities at NYU.

Part of the announcement is an M.S. in Data Science. Applications for the initial class, starting Fall 2013, are now being accepted. The Center for Data Science also plans to offer Ph.D. degrees via the Mathematics, Statistics, and Computer Science departments. I am not sure if an official Ph.D. degree in Data Science is being planned.

I just found this site a couple of days ago. Quandl is a new startup that offers a search engine for datasets. The site has a lot of data (over 2 million datasets). The data can be sorted, filtered, graphed, combined, and finally downloaded in many different formats (Excel, JSON, R, CSV, XML). Most of the data is numerical and/or time series.

If you have been looking for some data to explore, Quandl may be a good place to look.