Data Mining Standard Processes

Two standard processes are commonly used to approach data mining problems.

CRISP-DM

The most common approach is the Cross-Industry Standard Process for Data Mining (CRISP-DM).

Steps of CRISP-DM

  1. Business Understanding
  2. Data Understanding
  3. Data Preparation
  4. Modeling
  5. Evaluation
  6. Deployment

The steps are mostly self-explanatory, but the CRISP-DM Wikipedia page describes each one in more detail.
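To make the ordering of the six steps concrete, here is a minimal Python sketch. The names Phase, next_phase, and walk are hypothetical and exist only for this illustration; they are not part of any CRISP-DM tooling. The sketch also assumes a simplified form of the feedback loop in the CRISP-DM model, where an unsatisfactory evaluation sends the project back to business understanding.

```python
from enum import Enum


class Phase(Enum):
    """The six CRISP-DM steps, numbered in the order listed above."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


def next_phase(current, evaluation_passed=True):
    """Return the phase after `current`, or None once deployment is done.

    Assumption for this sketch: a failed evaluation restarts the cycle
    at business understanding.
    """
    if current is Phase.EVALUATION and not evaluation_passed:
        return Phase.BUSINESS_UNDERSTANDING
    if current is Phase.DEPLOYMENT:
        return None
    return Phase(current.value + 1)


def walk(evaluation_outcomes):
    """Trace the phases visited, given a pass/fail flag per evaluation.

    If the outcomes run out, later evaluations are treated as passing,
    so the walk always terminates.
    """
    outcomes = iter(evaluation_outcomes)
    phase = Phase.BUSINESS_UNDERSTANDING
    visited = []
    while phase is not None:
        visited.append(phase)
        passed = next(outcomes, True) if phase is Phase.EVALUATION else True
        phase = next_phase(phase, passed)
    return visited
```

For example, walk([False, True]) visits the first five phases twice before reaching deployment, which is one way to picture a project that needed a second iteration.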

SEMMA

The second most popular process for data mining is SEMMA, an acronym formed from its five steps.

Steps of SEMMA

  1. Sample
  2. Explore
  3. Modify
  4. Model
  5. Assess

More details can be found on the SEMMA Wikipedia page.
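As a rough illustration of how the five SEMMA steps might map onto code, here is a small sketch that uses only the Python standard library. Everything in it is an assumption made for the example: the function names, the (x, y) row format, the choice of centering as the "modify" step, and a least-squares line as the "model". A real project would substitute its own techniques at each step.

```python
import random
import statistics


def sample(rows, n, seed=0):
    """Sample: draw n rows without replacement (seeded for repeatability)."""
    return random.Random(seed).sample(rows, n)


def explore(rows):
    """Explore: summarize the data; here, just the mean of each column."""
    return {
        "x_mean": statistics.mean(x for x, _ in rows),
        "y_mean": statistics.mean(y for _, y in rows),
    }


def modify(rows, x_mean):
    """Modify: transform variables; here, center x on the given mean."""
    return [(x - x_mean, y) for x, y in rows]


def model(rows):
    """Model: fit y = slope * x + intercept by ordinary least squares."""
    x_mean = statistics.mean(x for x, _ in rows)
    y_mean = statistics.mean(y for _, y in rows)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in rows)
    sxx = sum((x - x_mean) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def assess(rows, slope, intercept):
    """Assess: mean absolute error of the fitted line on the given rows."""
    return statistics.mean(abs(y - (slope * x + intercept)) for x, y in rows)


def run_semma(rows, n, seed=0):
    """Run the five steps in order, assessing on the rows left out of the sample."""
    train = sample(rows, n, seed)
    held_out = [row for row in rows if row not in train]
    summary = explore(train)
    slope, intercept = model(modify(train, summary["x_mean"]))
    error = assess(modify(held_out, summary["x_mean"]), slope, intercept)
    return slope, intercept, error
```

Calling run_semma on rows generated from an exact line, such as [(x, 2 * x + 1) for x in range(20)] with n=10, should recover a slope very close to 2 and an error very close to 0. The intercept is reported in the centered coordinates produced by the modify step.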

A Data Science Process?

Other than The Data Scientific Method (which is not a standard), I am not aware of any other process for data science.

Do you know of any processes for data science? Is anyone aware of a group working on standardizing a data science process?