Cross Industry Standard Process for Data Mining

CRISP-DM stands for Cross Industry Standard Process for Data Mining^[1]. It is a data mining process model that describes the approaches expert data miners commonly use to tackle problems. Polls conducted in 2002, 2004, and 2007 showed that it was the leading methodology used by data miners.^[2] ^[3] ^[4]

Contents

1 Major phases

2 History

3 CRISP-DM 2.0

4 Advantages

5 References

6 External links

Major phases

CRISP-DM breaks the process of data mining into six major phases^[5]:

Business Understanding

Data Understanding

Data Preparation

Modeling

Evaluation

Deployment
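
As a rough illustration, the six phases listed above can be written down as an ordered enumeration. This is only a sketch: the names `Phase`, `next_phase`, and `phase_label` are hypothetical and not part of CRISP-DM, and treating the phases as a strictly linear sequence is a simplifying assumption made for the example.

```python
from enum import Enum


class Phase(Enum):
    """The six CRISP-DM phases, numbered in the order the article lists them."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


def next_phase(phase):
    """Return the phase listed after `phase`, or None after Deployment.

    Assumption for illustration: phases are walked in list order only.
    """
    members = list(Phase)
    index = members.index(phase)
    return members[index + 1] if index + 1 < len(members) else None


def phase_label(phase):
    """Turn an enum member into the human-readable phase name."""
    return phase.name.replace("_", " ").title()


if __name__ == "__main__":
    # Print the phases in order, one per line.
    for phase in Phase:
        print(phase.value, phase_label(phase))
```

A project tracker could, for instance, store the current `Phase` for each project and call `next_phase` when a phase is signed off.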

History

CRISP-DM was conceived in 1996. In 1997 it got underway as a European Union project under the ESPRIT funding initiative. The project was led by four companies: SPSS, NCR Corporation, Daimler-Benz and OHRA.

This core consortium brought different experiences to the project. ISL was later acquired by and merged into SPSS Inc. The computer giant NCR Corporation produced the Teradata data warehouse and its own data mining software. Daimler-Benz had a significant data mining team. OHRA, an insurance company, was just starting to explore the potential use of data mining.

The first version of the methodology was released as CRISP-DM 1.0 in 1999.

CRISP-DM 2.0

In July 2006 the consortium announced that it would begin work towards a second version of CRISP-DM. On 26 September 2006, the CRISP-DM SIG met to discuss potential enhancements for CRISP-DM 2.0 and the subsequent roadmap. However, these efforts appear to have stalled. The SIG has not met, updated the CRISP website, or communicated anything to members since early 2007. As of 22 June 2011, the website redirects to an IBM page about SPSS.

Advantages

Industry neutral

Tool neutral

Closely related to the Knowledge Discovery in Databases Process Model

Anchors the data mining process

References

^ Shearer C. The CRISP-DM model: the new blueprint for data mining. J Data Warehousing 2000;5:13–22.

^ Gregory Piatetsky-Shapiro (2002) KDnuggets Methodology Poll

^ Gregory Piatetsky-Shapiro (2004) KDnuggets Methodology Poll

^ Gregory Piatetsky-Shapiro (2007) KDnuggets Methodology Poll

^ Harper, Gavin; Stephen D. Pickett (August 2006). "Methods for mining HTS data". Drug Discovery Today 11 (15–16): 694–699. doi:10.1016/j.drudis.2006.06.006. PMID 16846796. http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6T64-4KDJSRH-4&_user=793840&_coverDate=08%2F31%2F2006&_rdoc=4&_fmt=full&_orig=browse&_srch=doc-info(%23toc%235020%232006%23999889984%23627946%23FLA%23display%23Volume)&_cdi=5020&_sort=d&_docanchor=&view=c&_ct=17&_acct=C000043460&_version=1&_urlVersion=0&_userid=793840&md5=f7f5b2376172e12b63177a32b03de111.

External links

CRoss Industry Standard Process for Data Mining Blog

Le site des dataminers: article published by Pascal BIZZARI, May 2009

The Data Mining Group (DMG): The DMG is an independent, vendor-led group that develops data mining standards, such as the Predictive Model Markup Language (PMML).
