Cross Industry Standard Process for Data Mining

CRISP-DM stands for Cross Industry Standard Process for Data Mining.[1] It is a data mining process model that describes the approaches expert data miners commonly use to tackle problems. Polls conducted in 2002, 2004, and 2007 found it to be the leading methodology used by data miners.[2][3][4]

Major phases

CRISP-DM breaks the process of data mining into six major phases:[5]

  • Business Understanding
  • Data Understanding
  • Data Preparation
  • Modeling
  • Evaluation
  • Deployment
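As a minimal sketch, the six phases listed above can be written down as an ordered enumeration. The names `Phase`, `label`, and `next_phase` are hypothetical and chosen only for illustration. Treating the list order as a strict sequence is also an assumption made here, not something the model prescribes.

```python
from enum import Enum


class Phase(Enum):
    """The six CRISP-DM phases, in the order the article lists them."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


def label(phase):
    """Turn an enum member into its human-readable phase name."""
    return phase.name.replace("_", " ").title()


def next_phase(phase):
    """Return the phase listed after `phase`, or None after the last one.

    Assumption: a simple linear walk through the list, for illustration only.
    """
    members = list(Phase)
    idx = members.index(phase)
    return members[idx + 1] if idx + 1 < len(members) else None


if __name__ == "__main__":
    # Print the phases in listed order.
    for phase in Phase:
        print(label(phase))
```

A project tracker could, for example, store the current `Phase` per project and call `next_phase` when a phase is signed off.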

History

CRISP-DM was conceived in 1996. In 1997 it got underway as a European Union project under the ESPRIT funding initiative. The project was led by four companies: SPSS, NCR Corporation, Daimler-Benz and OHRA.

This core consortium brought different experiences to the project. ISL was later acquired by and merged into SPSS Inc. The computer giant NCR Corporation produced the Teradata data warehouse and its own data mining software. Daimler-Benz had a significant data mining team. OHRA, an insurance company, was just starting to explore the potential use of data mining.

The first version of the methodology was released as CRISP-DM 1.0 in 1999.

CRISP-DM 2.0

In July 2006 the consortium announced that it would begin work towards a second version of CRISP-DM. On 26 September 2006, the CRISP-DM SIG met to discuss potential enhancements for CRISP-DM 2.0 and the subsequent roadmap. However, these efforts appear to have stalled: the SIG has not met, updated the CRISP-DM website, or communicated with members since early 2007. As of 22 June 2011, the website redirects to an IBM page about SPSS.

Advantages

  • Industry neutral
  • Tool neutral
  • Closely related to the Knowledge Discovery in Databases Process Model
  • Anchors the data mining process


Wikimedia Foundation. 2010.
