Predictive Modeling

What is predictive modeling?


Predictive modeling uses statistics to predict the outcomes of events.

Most often the event to be predicted lies in the future, but predictive modeling can be applied to any event whose outcome is unknown, almost regardless of when it occurred.

In fact, a modeler may use several predictive models, or apply different functions within a single model, which offers more flexibility than a simple yes-or-no model.

In many cases the model is chosen on the basis of detection theory to estimate the probability of an outcome given a set amount of input data, for example, given an email, determining how likely it is to be spam.

Models can use one or more classifiers to determine the probability that a set of data belongs to another set. For example, a model might be used to determine whether an email is spam or non-spam.
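The spam example can be sketched with a small naive Bayes classifier. This is only an illustration under stated assumptions: the text does not name a particular classifier, and the function names (train_naive_bayes, spam_probability), the "ham" label for non-spam, and the four toy messages are all made up for the sketch.

```python
import math
from collections import Counter

def train_naive_bayes(messages):
    """Count words per label. messages is a list of (text, label) pairs,
    where label is 'spam' or 'ham' (non-spam)."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocabulary

def spam_probability(text, model):
    """Return the estimated probability that text is spam."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    log_scores = {}
    for label in ("spam", "ham"):
        # Start from the log prior: how common this label is overall.
        score = math.log(label_counts[label] / total_messages)
        words_in_label = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (words_in_label + len(vocabulary))
            )
        log_scores[label] = score
    # Convert the two log scores into a normalized probability.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights["spam"] / (weights["spam"] + weights["ham"])

# Hypothetical toy training data, invented for illustration only.
TOY_MESSAGES = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]

model = train_naive_bayes(TOY_MESSAGES)
p_spam_like = spam_probability("free money", model)
p_ham_like = spam_probability("monday meeting", model)
```

The output is a probability rather than a hard label, so the caller can choose the threshold at which a message is treated as spam.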

Depending on definitional boundaries, predictive modeling is synonymous with, or largely overlapping with, the field of machine learning, as it is more commonly referred to in academic or research and development contexts. When deployed commercially, predictive modeling is often referred to as predictive analytics.

Predictive modeling is often contrasted with causal modeling/analysis. In the former, one may be entirely satisfied to make use of indicators of, or proxies for, the outcome of interest. In the latter, one seeks to determine true cause-and-effect relationships. This distinction has given rise to scientific literature in the fields of research methods and mathematical and computational statistics, and to the common statement that "correlation does not imply causation".

A predictive model is a set of equations used to calculate a best guess for unknown values.


Nearly any statistical model can be used for prediction purposes.

Broadly speaking, there are two classes of predictive models: parametric and non-parametric. A third class, semi-parametric models, includes features of both.

Parametric models make "specific assumptions with regard to one or more of the population parameters that characterize the underlying distribution(s)".

Non-parametric models "typically involve fewer assumptions of structure and distributional form [than parametric models] but usually contain strong assumptions about independence".

In short, the two main classes are:

  1. Parametric: makes more assumptions (guesses) about the calculated value.
  2. Non-parametric: makes fewer or no assumptions (guesses) about the calculated value.
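The difference between the two classes can be sketched by fitting the same data both ways. The example below is an illustration, not a recommendation: it assumes a straight-line least-squares fit as the parametric model and k-nearest-neighbors averaging as the non-parametric one, and the function names and the five toy data points are invented for the sketch.

```python
def fit_line(xs, ys):
    """Parametric: assume y = intercept + slope * x and estimate the
    two parameters by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict_line(params, x):
    """Predict from the fitted line; the training data is no longer needed."""
    intercept, slope = params
    return intercept + slope * x

def predict_knn(xs, ys, x, k=3):
    """Non-parametric: no assumed functional form. Average the y values
    of the k training points whose x is closest to the query."""
    nearest = sorted(zip(xs, ys), key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Hypothetical toy data that roughly follows a straight line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

line_params = fit_line(xs, ys)
line_at_6 = predict_line(line_params, 6.0)   # extrapolates along the line
knn_at_6 = predict_knn(xs, ys, 6.0, k=3)     # averages the three nearest points
```

In this sketch the parametric model summarizes the data in two numbers and extrapolates beyond the observed range, which is only as good as the straight-line assumption. The nearest-neighbor model keeps all the data and makes no such assumption, but its prediction at x = 6 cannot exceed the values it has already seen.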

Use Cases of Predictive Modeling

Uplift modeling

Uplift modeling is a technique for modeling the change in probability caused by an action.

Typically this is a marketing action such as an offer to buy a product, to use a product more, or to re-sign a contract.

For example, in a retention campaign, you wish to predict the change in the probability that a customer will remain a customer if they are contacted.

A model of the change in probability allows the retention campaign to be targeted at those customers on whom the change in probability will be beneficial.

This allows the retention program to avoid triggering unnecessary churn or customer attrition, and to avoid wasting money contacting people who would act anyway.
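Put another way, the uplift for a group of customers is the probability that they stay when contacted minus the probability that they stay when not contacted. A minimal sketch of that calculation follows, assuming historical records in which some customers were contacted and others were not. The function names, the segment labels, and every number in the toy records are hypothetical, and a real uplift model would usually estimate the difference per customer rather than per segment.

```python
from collections import defaultdict

def estimate_uplift(records):
    """Estimate uplift per segment.

    records is an iterable of (segment, contacted, retained) tuples.
    Uplift = retention rate when contacted - retention rate when not.
    """
    # segment -> contacted flag -> [retained count, total count]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, contacted, retained in records:
        cell = counts[segment][contacted]
        cell[0] += int(retained)
        cell[1] += 1

    uplift = {}
    for segment, groups in counts.items():
        treated_retained, treated_total = groups[True]
        control_retained, control_total = groups[False]
        if treated_total == 0 or control_total == 0:
            continue  # both groups are needed to estimate a difference
        uplift[segment] = (
            treated_retained / treated_total - control_retained / control_total
        )
    return uplift

def make_records(segment, contacted, retained_count, total):
    """Helper that builds toy records: the first retained_count stay."""
    return [(segment, contacted, i < retained_count) for i in range(total)]

# Hypothetical campaign history, invented for illustration only.
TOY_RECORDS = (
    make_records("persuadable", True, 8, 10)
    + make_records("persuadable", False, 5, 10)
    + make_records("sure_thing", True, 9, 10)
    + make_records("sure_thing", False, 9, 10)
    + make_records("sleeping_dog", True, 4, 10)
    + make_records("sleeping_dog", False, 7, 10)
)

uplift_by_segment = estimate_uplift(TOY_RECORDS)
# Contact only the segments where contact is estimated to help.
targets = sorted(s for s, u in uplift_by_segment.items() if u > 0)
```

In the toy data, one segment stays more often when contacted, one is unaffected, and one stays less often when contacted. Targeting only positive uplift skips the last two, which is the saving and the churn avoidance described above.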


Archaeology

Predictive modeling in archaeology gets its foundations from Gordon Willey's mid-fifties work in the Virú Valley of Peru. Complete, intensive surveys were performed, and then the covariability between cultural remains and natural features such as slope and vegetation was determined.

The development of quantitative methods and greater availability of applicable data led to the growth of the discipline in the 1960s, and by the late 1980s substantial progress had been made by major land managers worldwide.

Generally, predictive modeling in archaeology establishes statistically valid causal or covariable relationships between natural proxies such as soil types, elevation, slope, vegetation, proximity to water, geology, and geomorphology, and the presence of archaeological features.

Through analysis of these quantifiable attributes from land that has undergone archaeological survey, sometimes the "archaeological sensitivity" of unsurveyed areas can be anticipated based on the natural proxies in those areas.
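As a rough illustration of this idea, the sketch below estimates, for each combination of natural attributes, the share of surveyed cells that contained a site, then looks up that estimate for unsurveyed cells. Everything specific here is an assumption made for the example: the two attributes (soil type and proximity to water), the function names, the add-one smoothing, and all of the toy survey counts. Real sensitivity models typically use more attributes and a fitted statistical model.

```python
from collections import defaultdict

def fit_sensitivity(surveyed_cells):
    """Estimate P(site present | attributes) from surveyed land.

    surveyed_cells is an iterable of (attributes, site_present) pairs,
    where attributes is a hashable tuple of natural proxies.
    """
    tallies = defaultdict(lambda: [0, 0])  # attributes -> [sites, cells]
    for attributes, site_present in surveyed_cells:
        tallies[attributes][0] += int(site_present)
        tallies[attributes][1] += 1
    # Add-one smoothing keeps estimates away from exactly 0 or 1
    # when only a few cells with those attributes were surveyed.
    return {
        attributes: (sites + 1) / (cells + 2)
        for attributes, (sites, cells) in tallies.items()
    }

def predict_sensitivity(model, attributes, default=0.5):
    """Look up the estimate for an unsurveyed cell. Attribute combinations
    never seen in the survey fall back to an uninformative default."""
    return model.get(attributes, default)

# Hypothetical survey results, invented for illustration only.
SURVEYED = (
    [(("loam", "near_water"), True)] * 6
    + [(("loam", "near_water"), False)] * 2
    + [(("clay", "far_from_water"), True)] * 1
    + [(("clay", "far_from_water"), False)] * 7
)

sensitivity_model = fit_sensitivity(SURVEYED)
high = predict_sensitivity(sensitivity_model, ("loam", "near_water"))
low = predict_sensitivity(sensitivity_model, ("clay", "far_from_water"))
unknown = predict_sensitivity(sensitivity_model, ("sand", "near_water"))
```

Unsurveyed cells that share attributes with site-rich surveyed cells receive a higher estimate, which is the sense in which sensitivity is anticipated from the natural proxies.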

Large land managers in the United States, such as the Bureau of Land Management (BLM), the Department of Defense (DOD), and numerous highway and parks agencies, have successfully employed this strategy.

By using predictive modeling in their cultural resource management plans, they are capable of making more informed decisions when planning for activities that have the potential to require ground disturbance and subsequently affect archaeological sites.

Published: 2022-05-04