Data stream mining

Data stream mining is the process of extracting knowledge structures from continuous, rapid data records. A data stream is an ordered sequence of instances that, in many applications, can be read only once or a small number of times using limited computing and storage capabilities. Examples of data streams include computer network traffic, phone conversations, ATM transactions, web searches, and sensor data. Data stream mining can be considered a subfield of data mining, machine learning, and knowledge discovery.
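The read-once, limited-memory constraint can be illustrated with a small sketch. It computes a running mean and variance in a single pass using Welford's method, which is one common choice and not something the article prescribes. The class name `RunningStats`, the function `sensor_stream`, and the sample readings are hypothetical and exist only for this illustration.

```python
class RunningStats:
    """Single-pass mean and variance (Welford's method) in constant memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        # Each instance is seen exactly once and is not stored afterwards.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance; undefined for fewer than two instances.
        return self._m2 / (self.n - 1) if self.n > 1 else float("nan")


def sensor_stream():
    """Hypothetical stand-in for an unbounded source such as sensor data."""
    for reading in [20.1, 19.8, 20.4, 21.0, 20.7]:
        yield reading


stats = RunningStats()
for value in sensor_stream():
    stats.update(value)

print(stats.n, stats.mean, stats.variance)
```

The state kept between instances is three numbers, regardless of how long the stream runs, which is the property that matters when the data cannot be revisited.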

In many data stream mining applications, the goal is to predict the class or value of new instances in the data stream given some knowledge about the class membership or values of previous instances in the data stream. Machine learning techniques can be used to learn this prediction task from labeled examples in an automated fashion. In many applications, the distribution underlying the instances or the rules underlying their labeling may change over time; that is, the goal of the prediction (the class or target value to be predicted) may change. This problem is referred to as concept drift.
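A toy sketch can make the effect of concept drift concrete. It assumes an artificial stream whose label switches abruptly from 0 to 1 halfway through, and compares a learner that remembers every label with one that keeps only a sliding window of recent labels. `MajorityLearner`, `drifting_labels`, `accuracy_after_drift`, the window size of 50, and the drift point are all hypothetical choices made for the illustration.

```python
from collections import Counter, deque


class MajorityLearner:
    """Predicts the most frequent label among the labels it remembers."""

    def __init__(self, window=None):
        # window=None remembers everything; an int keeps only recent labels.
        self.labels = deque(maxlen=window)

    def predict(self):
        if not self.labels:
            return 0  # arbitrary default before any label has been seen
        return Counter(self.labels).most_common(1)[0][0]

    def learn(self, label):
        self.labels.append(label)


def drifting_labels(n=1000, change_at=500):
    """Hypothetical stream: the label is 0 before the drift point, 1 after."""
    for t in range(n):
        yield 0 if t < change_at else 1


def accuracy_after_drift(learner, change_at=500):
    """Fraction of correct predictions made after the concept has changed."""
    correct = total = 0
    for t, label in enumerate(drifting_labels(change_at=change_at)):
        if t >= change_at:
            correct += learner.predict() == label
            total += 1
        learner.learn(label)  # the true label arrives after the prediction
    return correct / total


static_acc = accuracy_after_drift(MajorityLearner(window=None))
windowed_acc = accuracy_after_drift(MajorityLearner(window=50))
print(static_acc, windowed_acc)
```

In this setup the learner with unlimited memory keeps predicting the old concept, because the old labels outnumber the new ones for the rest of the stream, while the windowed learner recovers once the window has filled with post-drift labels. Real drift is rarely this clean, so the sketch shows the mechanism rather than a recommended method.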

Contents

1 Software for data stream mining

2 Events

3 Researchers working on data stream mining

4 Master References

5 Bibliographic References

6 Books

7 See also

8 External references

Software for data stream mining

RapidMiner: free open-source software for knowledge discovery, data mining, and machine learning. Used in combination with its data stream mining plugin (formerly the concept drift plugin), it also supports data stream mining, learning time-varying concepts, and tracking drifting concepts.

MOA (Massive Online Analysis): free open-source software specifically for mining data streams with concept drift. It contains a prequential evaluation method, the EDDM concept drift method, a reader for real datasets in ARFF format, and artificial stream generators such as SEA concepts, STAGGER, rotating hyperplane, random tree, and random radius based functions. MOA supports bi-directional interaction with Weka (machine learning).
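Prequential evaluation and a SEA-style generator can be sketched in a few lines of plain Python. This is an illustration of the two ideas, not MOA's implementation or API. The generator follows the SEA concepts as they are commonly described (three attributes drawn uniformly from [0, 10], with the label depending on whether the sum of the first two is below a threshold that changes from block to block); the thresholds used here and the omission of label noise are assumptions. `OnlineThreshold` is a hypothetical toy learner invented for the example.

```python
import random


def sea_stream(n, thresholds=(8.0, 9.0, 7.0, 9.5), seed=0):
    """SEA-style stream (sketch): label is 1 when x[0] + x[1] <= threshold.

    The threshold changes at the start of each block, which produces drift.
    """
    rng = random.Random(seed)
    block = n // len(thresholds)
    for t in range(n):
        theta = thresholds[min(t // block, len(thresholds) - 1)]
        x = [rng.uniform(0, 10) for _ in range(3)]  # x[2] is irrelevant
        yield x, int(x[0] + x[1] <= theta)


class OnlineThreshold:
    """Toy learner: keeps one estimated cut-off on x[0] + x[1]."""

    def __init__(self, start=10.0, rate=0.5):
        self.theta = start
        self.rate = rate

    def predict(self, x):
        return int(x[0] + x[1] <= self.theta)

    def learn(self, x, y):
        # Only on a mistake, move the cut-off part of the way to the instance.
        if self.predict(x) != y:
            self.theta += self.rate * (x[0] + x[1] - self.theta)


def prequential_accuracy(stream, model):
    """Test-then-train: each instance is used for testing before training."""
    correct = total = 0
    for x, y in stream:
        correct += model.predict(x) == y  # test first
        model.learn(x, y)                 # then train on the same instance
        total += 1
    return correct / total


acc = prequential_accuracy(sea_stream(4000), OnlineThreshold())
print(acc)
```

Because every instance is scored before the model sees its label, the resulting accuracy reflects performance on unseen data without a separate hold-out set, and it includes the errors made while the model adapts after each change of threshold.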

Events

International Workshop on Knowledge Discovery from Ubiquitous Data Streams held in conjunction with the 18th European Conference on Machine Learning (ECML) and the 11th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD) in Warsaw, Poland, in September 2007.

ACM Symposium on Applied Computing Data Streams Track held in conjunction with the 2007 ACM Symposium on Applied Computing (SAC-2007) in Seoul, Korea, in March 2007.

IEEE International Workshop on Mining Evolving and Streaming Data (IWMESD 2006) held in conjunction with the 2006 IEEE International Conference on Data Mining (ICDM-2006) in Hong Kong in December 2006.

Fourth International Workshop on Knowledge Discovery from Data Streams (IWKDDS) held in conjunction with the 17th European Conference on Machine Learning (ECML) and the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD) (ECML/PKDD-2006) in Berlin, Germany, in September 2006.

Researchers working on data stream mining

Carlo Zaniolo, University of California Los Angeles (UCLA), California, United States

João Gama, University of Porto, Portugal

Mohamed Medhat Gaber, University of Portsmouth, UK

Olfa Nasraoui, University of Louisville, USA

Hua-Fu Li, National Chiao-Tung University, Taiwan

Eyke Hüllermeier, University of Marburg, Germany

Marco Grawunder, University of Oldenburg, Germany

Latifur Khan, University of Texas at Dallas, USA

Master References

Gaber, M. M., Zaslavsky, A., and Krishnaswamy, S., Mining Data Streams: A Review, in ACM SIGMOD Record, Vol. 34, No. 1, June 2005, ISSN: 0163-5808

B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom, Models and Issues in Data Stream Systems, in Proceedings of PODS, 2002.

Mining Data Streams Bibliography, maintained by Mohamed Medhat Gaber

Bibliographic References

Crabtree I. and Soltysiak S. Identifying and Tracking Changing Interests. International Journal of Digital Libraries, Springer Verlag, vol. 2, pp. 38–53.

Klinkenberg, Ralf: Learning Drifting Concepts: Example Selection vs. Example Weighting. In Intelligent Data Analysis (IDA), Special Issue on Incremental Learning Systems Capable of Dealing with Concept Drift, Vol. 8, No. 3, pages 281–300, 2004.

Klinkenberg, Ralf: Using Labeled and Unlabeled Data to Learn Drifting Concepts. In Kubat, Miroslav and Morik, Katharina (editors), Workshop Notes of the IJCAI-01 Workshop on Learning from Temporal and Spatial Data, pages 16–24, IJCAI, Menlo Park, CA, USA, AAAI Press, 2001.

Klinkenberg, Ralf and Joachims, Thorsten: Detecting Concept Drift with Support Vector Machines. In Langley, Pat (editor), Proceedings of the Seventeenth International Conference on Machine Learning (ICML), pages 487–494, San Francisco, CA, USA, Morgan Kaufmann, 2000.

Klinkenberg, Ralf and Renz, Ingrid: Adaptive Information Filtering: Learning in the Presence of Concept Drifts. In Sahami, Mehran and Craven, Mark and Joachims, Thorsten and McCallum, Andrew (editors), Workshop Notes of the ICML/AAAI-98 Workshop on Learning for Text Categorization, pages 33–40, Menlo Park, CA, USA, AAAI Press, 1998.

Koychev I. Gradual Forgetting for Adaptation to Concept Drift. In Proceedings of ECAI 2000 Workshop Current Issues in Spatio-Temporal Reasoning. Berlin, Germany, 2000, pp. 101–106

Koychev I. and Schwab I., Adaptation to Drifting User’s Interests, Proc. of ECML 2000 Workshop: Machine Learning in New Information Age, Barcelona, Spain, 2000, pp. 39–45

Maloof, M.A. and Michalski, R.S. Learning Evolving Concepts Using Partial Memory Approach. Working Notes of the 1995 AAAI Fall Symposium on Active Learning, Boston, MA, pp. 70–73, 1995

Maloof M. and Michalski R. Selecting Examples for Partial Memory Learning. Machine Learning, 41(1), 2000, pp. 27–52.

Mitchell T., Caruana R., Freitag D., McDermott, J. and Zabowski D. Experience with a Learning Personal Assistant. Communications of the ACM 37(7), 1994, pp. 81–91.

Mohammad M. Masud, Jing Gao, Latifur Khan, Jiawei Han, Bhavani M. Thuraisingham: Integrating Novel Class Detection with Classification for Concept-Drifting Data Streams. ECML/PKDD (2) 2009: 79–94 (extended version will appear in TKDE journal).

Nasraoui O., Rojas C., and Cardona C., "A Framework for Mining Evolving Trends in Web Data Streams using Dynamic Learning and Retrospective Validation", Journal of Computer Networks, Special Issue on Web Dynamics, 50(10), 1425-1652, July 2006

Nasraoui O., Cerwinske J., Rojas C., and Gonzalez F., "Collaborative Filtering in Dynamic Usage Environments", in Proc. of CIKM 2006, Conference on Information and Knowledge Management, Arlington, VA, Nov. 2006

Schlimmer J. and Granger R. Incremental Learning from Noisy Data, Machine Learning, 1(3), 1986, pp. 317–357.

Scholz, Martin and Klinkenberg, Ralf: Boosting Classifiers for Drifting Concepts. In Intelligent Data Analysis (IDA), Special Issue on Knowledge Discovery from Data Streams, Vol. 11, No. 1, pages 3–28, March 2007.

Scholz, Martin and Klinkenberg, Ralf: An Ensemble Classifier for Drifting Concepts. In Gama, J. and Aguilar-Ruiz, J. S. (editors), Proceedings of the Second International Workshop on Knowledge Discovery in Data Streams, pages 53–64, Porto, Portugal, 2005.

Schwab I., Pohl W. and Koychev I. Learning to Recommend from Positive Evidence, Proceedings of Intelligent User Interfaces 2000, ACM Press, pp. 241–247.

Widmer G. Tracking Context Changes through Meta-Learning, Machine Learning 27, 1997, pp. 256–286.

Widmer G. and Kubat M. Learning in the presence of concept drift and hidden contexts. Machine Learning 23, 1996, pp. 69–101.

Books

Gama J., and Gaber M. M. (Eds), Learning from Data Streams: Processing Techniques in Sensor Networks, a book published by Springer Verlag, 2007.

Ganguly A., Gama J., Omitaomu O., Gaber M. M., Vatsavai R. R. (Eds), Knowledge Discovery from Sensor Data, a book published by CRC Press, 2008.

Gama J., Knowledge Discovery from Data Streams, a book published by CRC Press, 2010.

See also

Streaming algorithm

Stream processing

External references

IBM Spade - Stream Processing Application Declarative Engine

IBM Infosphere Streams

StreamIt - programming language and compilation infrastructure by MIT CSAIL

Categories:
Data mining
Business intelligence
