Predictive modelling

Predictive modelling is the process of creating or choosing a model to best predict the probability of an outcome. In many cases the model is chosen on the basis of detection theory to estimate the probability of a signal given a set amount of input data, for example estimating how likely it is that a given email is spam.

Models can use one or more classifiers to determine the probability that a set of data belongs to a given class, say spam or 'ham'.
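The spam example can be sketched with a naive Bayes classifier, one of the models listed below. This is a minimal illustration only: the function names, the toy training messages and the use of add-one smoothing are assumptions made for the example, and the code requires at least one training message of each class.

```python
import math
from collections import Counter

def train(messages):
    """Count words per class from (text, label) pairs; labels are 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def spam_probability(text, word_counts, label_counts):
    """Estimate P(spam | text) with Bayes' rule and add-one smoothing."""
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    total_messages = sum(label_counts.values())
    log_scores = {}
    for label in ("spam", "ham"):
        words_in_class = sum(word_counts[label].values())
        # Start from the log prior of the class.
        score = math.log(label_counts[label] / total_messages)
        for word in text.lower().split():
            count = word_counts[label][word]  # 0 for unseen words
            score += math.log((count + 1) / (words_in_class + len(vocabulary)))
        log_scores[label] = score
    # Normalise the two log scores into a probability.
    highest = max(log_scores.values())
    weights = {label: math.exp(s - highest) for label, s in log_scores.items()}
    return weights["spam"] / (weights["spam"] + weights["ham"])

# Invented toy data, for illustration only.
TRAINING = [
    ("win free prize now", "spam"),
    ("free money win", "spam"),
    ("meeting tomorrow at noon", "ham"),
    ("lunch tomorrow", "ham"),
]

word_counts, label_counts = train(TRAINING)
print(spam_probability("win free prize", word_counts, label_counts))
print(spam_probability("meeting tomorrow", word_counts, label_counts))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.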

Models and classifiers

Many models exist for making predictions from input data.

Classification trees

Naive Bayes

See main article: Naive Bayes classifier

k-nearest neighbor algorithm

See main article: k-nearest neighbor algorithm

Majority classifier

Support vector machines

See main article: Support vector machine

Logistic regression

Logistic regression is a technique in which unknown values of a discrete variable are predicted from known values of one or more continuous and/or discrete variables. Logistic regression differs from ordinary least squares (OLS) regression in that the dependent variable is binary in nature. This procedure has many applications. In biostatistics, the researcher may be interested in modelling the probability of a patient being diagnosed with a certain type of cancer based on knowing, say, the incidence of that cancer in the patient's family. In business, the marketer may be interested in modelling the probability of an individual purchasing a product based on the price of that product. Both of these are examples of a simple, binary logistic model. The model is "simple" in that it has only one independent, or predictor, variable, and it is "binary" in that the dependent variable can take on only one of two values: cancer or no cancer, and purchase or no purchase.
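The marketing example can be sketched as a simple, binary logistic model with price as the single predictor. The data below are invented, and fitting by plain gradient descent is an assumption chosen to keep the example self-contained; statistical packages typically use other fitting procedures.

```python
import math

def sigmoid(z):
    """Map a real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_simple_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by gradient descent on the log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1

# Invented data: product price and whether the individual purchased (1) or not (0).
prices = [1, 2, 3, 4, 5, 6, 7, 8]
purchased = [1, 1, 1, 0, 1, 0, 0, 0]

b0, b1 = fit_simple_logistic(prices, purchased)

def purchase_probability(price):
    """Predicted probability of purchase at the given price."""
    return sigmoid(b0 + b1 * price)

print(b0, b1)
print(purchase_probability(2), purchase_probability(7))
```

A negative fitted slope b1 corresponds to the probability of purchase falling as the price rises.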

Uplift modelling

Uplift modelling is a technique for modelling the "change in probability" caused by an action. Typically this is a marketing action such as an offer to buy a product, to use a product more or to re-sign a contract. For example, in a retention campaign the aim is to predict the change in the probability that a customer will remain a customer if they are contacted. A model of the change in probability allows the retention campaign to be targeted at those customers for whom the change in probability will be beneficial. This allows the retention programme to avoid triggering unnecessary churn or customer attrition.
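The retention example can be illustrated by comparing retention rates between contacted and uncontacted customers within each customer segment. This is a deliberately crude sketch rather than a full uplift model: the segment names and records are invented, and it assumes that customers were assigned to the contacted and uncontacted groups at random.

```python
from collections import defaultdict

def uplift_by_segment(records):
    """Estimate the change in retention probability caused by contact, per segment.

    records: iterable of (segment, contacted, retained) tuples,
    where contacted and retained are booleans.
    """
    # For each segment and contact status, store [retained count, total count].
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, contacted, retained in records:
        cell = counts[segment][contacted]
        cell[0] += int(retained)
        cell[1] += 1
    uplift = {}
    for segment, groups in counts.items():
        treated, control = groups[True], groups[False]
        if treated[1] == 0 or control[1] == 0:
            continue  # no comparison is possible without both groups
        uplift[segment] = treated[0] / treated[1] - control[0] / control[1]
    return uplift

# Invented records, for illustration only.
records = (
    [("new", True, True)] * 8 + [("new", True, False)] * 2
    + [("new", False, True)] * 5 + [("new", False, False)] * 5
    + [("loyal", True, True)] * 6 + [("loyal", True, False)] * 4
    + [("loyal", False, True)] * 9 + [("loyal", False, False)] * 1
)

uplift = uplift_by_segment(records)
targets = [segment for segment, change in uplift.items() if change > 0]
print(uplift)
print(targets)
```

In this invented data, contact raises retention in one segment and lowers it in the other, so the campaign would be targeted only at the segment with positive uplift.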



Archaeology

Predictive modeling in archaeology has its foundations in Gordon Willey's mid-fifties work in the Virú Valley of Peru. [Willey, Gordon R. “Prehistoric Settlement Patterns in the Virú Valley, Peru”, Bulletin 155. Bureau of American Ethnology, 1953] Complete, intensive surveys were performed, and covariability between cultural remains and natural features such as slope and vegetation was then determined. Development of quantitative methods and a greater availability of applicable data led to growth of the discipline in the 1960s, and by the late 1980s substantial progress had been made by major land managers worldwide.

Generally, predictive modeling in archaeology seeks to establish statistically valid causal or covariable relationships between natural proxies (such as soil types, elevation, slope, vegetation, proximity to water, geology and geomorphology) and the presence of archaeological features. Through analysis of these quantifiable attributes from land that has undergone archaeological survey, the “archaeological sensitivity” of unsurveyed areas can sometimes be anticipated from the natural proxies in those areas. Large land managers in the United States, such as the Bureau of Land Management (BLM), the Department of Defense (DOD) [Heidelberg, Kurt, et al. “An Evaluation of the Archaeological Sample Survey Program at the Nevada Test and Training Range”, SRI Technical Report 02-16, 2002] [Jeffrey H. Altschul, Lynne Sebastian, and Kurt Heidelberg, “Predictive Modeling in the Military: Similar Goals, Divergent Paths”, Preservation Research Series 1, SRI Foundation, 2004], and numerous highway and parks agencies, have successfully employed this strategy. By using predictive modeling in their cultural resource management plans, they can make more informed decisions when planning activities that may require ground disturbance and so affect archaeological sites.

Health insurance

Predictive modeling in health insurance has several applications, most notably underwriting, capitation payment, and disease management.

Underwriting - Claims-based risk adjustment models are used to predict costs for individuals in a future year, based on their demographic and medical and/or claims history. Other predictive models, drawing on prescription histories and consumer data, have been developed to help identify better healthcare risks.

Capitation payment - Claims-based predictive models (typically called risk adjustment models) are used to compensate health plans that attract sicker-than-average individuals, especially in some public programs such as the Medicare Advantage program. This lessens the incentive for health plans to focus on attracting healthier individuals.

Disease management - Predictive modeling is used in several ways, including adjusting for the relative health of different cohorts of individuals in disease management return-on-investment studies, and identifying the individuals who are most in need of interventions and would benefit most from them.
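The capitation idea above can be illustrated with a small calculation in which each plan's payment is scaled by a relative risk score. This is a hypothetical sketch: the plan names, the predicted costs, the base rate and the formula (a plan's mean predicted cost divided by the overall mean) are all assumptions made for the example, not the rules of any actual program.

```python
def capitation_payments(predicted_costs, base_rate):
    """Scale a base per-member payment by each plan's relative risk score.

    predicted_costs: dict mapping plan name to a list of predicted
    annual costs, one per member (hypothetical model output).
    """
    all_costs = [cost for costs in predicted_costs.values() for cost in costs]
    overall_mean = sum(all_costs) / len(all_costs)
    payments = {}
    for plan, costs in predicted_costs.items():
        # A score above 1 means the plan's members are predicted to be
        # costlier (sicker) than average.
        risk_score = (sum(costs) / len(costs)) / overall_mean
        payments[plan] = base_rate * risk_score
    return payments

# Invented predicted costs, for illustration only.
predicted = {
    "plan_a": [1000, 2000, 3000],
    "plan_b": [5000, 6000, 7000],
}

payments = capitation_payments(predicted, base_rate=400)
print(payments)
```

Under this scheme a plan with sicker-than-average members receives more than the base rate per member, which is what reduces the incentive to attract only healthier individuals.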

The Society of Actuaries has published a study of the commercially available claims-based predictive (risk adjustment) models.

Customer relationship management

Predictive modelling is used extensively in analytical customer relationship management and data mining to produce customer-level models that describe the likelihood that a customer will take a particular action. The actions are usually related to sales, marketing and customer retention.

WellNet Healthcare, for example, is a privately held company founded in 1994 that designs, implements and administers employer-sponsored health benefits. It uses predictive-modeling technology licensed from Johns Hopkins University to identify high-risk members and work with them proactively through health-management programs, so that catastrophic and costly events are avoided.

For example, a large consumer organisation such as a mobile telecommunications operator will have a set of predictive models for product cross-sell, product deep-sell and churn. It is also now more common for such an organisation to have a model of savability using an uplift model. This predicts the likelihood that a customer can be saved at the end of a contract period (the change in churn probability), as opposed to the standard churn prediction model.

See also

* California Predictive Model
* Prediction interval
* Predictive analytics
* Uplift modelling
* Seymour Geisser


Wikimedia Foundation. 2010.
