Uncertain data

In computer science, uncertain data is data whose values are subject to uncertainty. Uncertain data is typically found in the area of sensor networks. When such data is represented in a database, some indication of the probability of the various values is needed.

There are three main models of uncertain data in databases. In attribute uncertainty, each uncertain attribute in a tuple is subject to its own independent probability distribution (Prabhakar, Sunil. "ORION: Managing Uncertain (Sensor) Data". http://mobisensors.cs.pitt.edu/files/papers/prabhakar.pdf). For example, if readings are taken of temperature and wind speed, each would be described by its own probability distribution, as knowing the reading for one measurement would not provide any information about the other.
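
As a minimal sketch of attribute uncertainty, the snippet below stores one discrete distribution per uncertain attribute and, because the attributes are assumed independent, obtains the probability of any combination of values by multiplying the per-attribute probabilities. The attribute values, the probabilities, and the helper name `joint_independent` are illustrative assumptions, not taken from any particular database system.

```python
from itertools import product

# Hypothetical discrete distributions for one tuple's uncertain attributes.
temperature = {20: 0.25, 21: 0.5, 22: 0.25}  # degrees Celsius -> probability
wind_speed = {5: 0.5, 10: 0.5}               # km/h -> probability


def joint_independent(dist_a, dist_b):
    """Joint distribution of two independent attributes: P(a, b) = P(a) * P(b)."""
    return {
        (a, b): pa * pb
        for (a, pa), (b, pb) in product(dist_a.items(), dist_b.items())
    }


joint = joint_independent(temperature, wind_speed)
print(joint[(21, 5)])  # probability of temperature 21 together with wind speed 5
```

Storing the two distributions separately is enough here: the joint table can always be rebuilt from them, which is exactly what independence means.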

In correlated uncertainty, multiple attributes may be described by a joint probability distribution. For example, if readings are taken of the position of an object, and the "x"- and "y"-coordinates stored, the probability of different values may depend on the distance from the recorded coordinates. As distance depends on both coordinates, it may be appropriate to use a joint distribution for these coordinates, as they are not independent.
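
The position example can be sketched as follows, assuming (purely for illustration) that the true position is equally likely to be any integer grid point within a fixed distance of the recorded coordinates. The function names `disc_joint` and `marginal` are hypothetical. The point of the sketch is that the joint distribution cannot be rebuilt from the separate distributions of x and y.

```python
import math


def disc_joint(x0, y0, radius):
    """Uniform joint distribution over integer grid points within
    `radius` (an integer) of the recorded position (x0, y0)."""
    points = [
        (x, y)
        for x in range(x0 - radius, x0 + radius + 1)
        for y in range(y0 - radius, y0 + radius + 1)
        if math.hypot(x - x0, y - y0) <= radius
    ]
    p = 1.0 / len(points)
    return {pt: p for pt in points}


def marginal(joint, axis):
    """Sum the joint distribution down to a single coordinate (0 = x, 1 = y)."""
    out = {}
    for pt, p in joint.items():
        out[pt[axis]] = out.get(pt[axis], 0.0) + p
    return out


joint = disc_joint(0, 0, 1)  # five points: the centre and its four neighbours
px = marginal(joint, 0)
py = marginal(joint, 1)

# The corner (1, 1) lies outside the disc, so its joint probability is zero,
# yet the product of the marginals px[1] * py[1] is positive.
print(joint.get((1, 1), 0.0), px[1] * py[1])
```

Because the product of the marginals assigns probability to positions the joint distribution rules out, the two coordinates have to be stored together under this assumption.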

In tuple uncertainty, all the attributes of a tuple are subject to a joint probability distribution. This covers the case of correlated uncertainty, but also includes the case where there is a probability that a tuple does not belong in the relevant relation, which is indicated by the probabilities not summing to one. For example, in a relation concerning female ducks, there may be uncertainty about the gender of a given duck.
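
A minimal sketch of tuple uncertainty, under the assumption that a tuple is stored as a set of alternative value combinations with one probability each: whatever probability mass is missing from the total is read as the chance that the tuple does not belong in the relation. The duck record, its attributes, and the helper name `absence_probability` are hypothetical.

```python
def absence_probability(alternatives, tolerance=1e-9):
    """Probability that the tuple is not in the relation at all,
    i.e. the mass left over when the alternatives sum to less than one."""
    total = sum(alternatives.values())
    if total > 1.0 + tolerance:
        raise ValueError("probabilities of the alternatives exceed one")
    return max(0.0, 1.0 - total)


# Hypothetical tuple in a relation of female ducks:
# (name, weight in kg) -> probability of that combination of values.
duck = {
    ("Daisy", 1.1): 0.5,
    ("Daisy", 1.2): 0.25,
}

# The remaining mass is the probability that this duck is not female
# and so does not belong in the relation.
p_absent = absence_probability(duck)
print(p_absent)
```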

References

* Habich; Volk; Clemens Utzny; Ralf Dittmann; Wolfgang Lehner. "Error-Aware Density-Based Clustering of Imprecise Measurement Values". Seventh IEEE International Conference on Data Mining Workshops (ICDM Workshops 2007). IEEE. Accessed 2008-08-01.


Wikimedia Foundation. 2010.
