- Knowledge discovery
Knowledge discovery is a concept in computer science that describes the process of automatically searching large volumes of data for patterns that can be considered knowledge "about" the data. It is often described as "deriving" knowledge from the input data. The topic can be categorized according to 1) what kind of data is searched; and 2) in what form the result of the search is represented.

The best-known branch of knowledge discovery is data mining, also known as Knowledge Discovery in Databases (KDD). Like many other forms of knowledge discovery, it creates abstractions of the input data. The knowledge obtained through the process may itself become additional data that can be used for further analysis and discovery.

Another promising application of knowledge discovery is software modernization, which involves understanding existing software artifacts. This process is related to reverse engineering. The knowledge obtained from existing software is usually presented in the form of models that can be queried when necessary. An entity-relationship model is a frequent format for representing knowledge obtained from existing software. The Object Management Group (OMG) developed the Knowledge Discovery Metamodel (KDM) specification, which defines an ontology for software assets and their relationships for the purpose of performing knowledge discovery on existing code.

Knowledge discovery from existing software systems, also known as software mining, is closely related to data mining, since existing software artifacts contain enormous business value that is key to the evolution of software systems. Instead of mining individual data sets, software mining focuses on metadata, such as database schemas.

Input data for knowledge discovery
*Databases
**Relational data
**Database
**Document warehouse
**Data warehouse
*Software mining
*Text
**Concept mining
*Graphs
**Molecule mining
*Sequences
**Data stream mining
**Learning from time-varying data streams under concept drift
*Web

Output formats for discovered knowledge
*Data model
*Metadata
*Metamodels
*Ontology
*Knowledge representation
*Business rule
*Knowledge Discovery Metamodel (KDM)
*Business Process Modeling Notation (BPMN)
*Intermediate representation
*Resource Description Framework (RDF)
*Software metrics
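
As a minimal sketch of the software mining idea described in the introduction (mining metadata such as database schemas and presenting the result as a model that can be queried), the following Python example reads schema metadata from a SQLite database and builds a small entity-relationship style model. The table names, the `mine_schema` helper, and the dictionary layout are illustrative assumptions; this is not the KDM format or any OMG-defined API.

```python
import sqlite3


def mine_schema(conn):
    """Build a simple entity-relationship style model from SQLite metadata.

    Returns {table: {"columns": [...], "references": [...]}} (hypothetical layout).
    """
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        # PRAGMA foreign_key_list rows: (id, seq, referenced_table, from, to, ...)
        references = sorted(
            {row[2] for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')}
        )
        model[table] = {"columns": columns, "references": references}
    return model


# Hypothetical legacy schema, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        total REAL
    );
    """
)

model = mine_schema(conn)

# Example query against the model: which entities depend on "customer"?
dependents = [t for t, info in model.items() if "customer" in info["references"]]
print(dependents)
```

The model here is derived only from the schema, not from the rows stored in the tables, which mirrors the distinction drawn above between mining data sets and mining metadata.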
Wikimedia Foundation. 2010.