Knowledge discovery is a concept in computer science that describes the process of automatically searching large volumes of data for patterns that can be considered knowledge "about" the data. It is often described as "deriving" knowledge from the input data. The topic can be categorized according to 1) what kind of "data" is searched; and 2) in what form the result of the search is represented.

The best-known branch of knowledge discovery is data mining, also known as Knowledge Discovery in Databases (KDD). Like many other forms of knowledge discovery, it creates abstractions of the input data. The "knowledge" obtained through the process may become additional "data" that can be used for further analysis and discovery.
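
As an illustration of deriving an abstraction from input data, the following minimal sketch counts item pairs that frequently occur together in a set of records. The function name `frequent_pairs`, the `min_support` threshold, and the sample baskets are hypothetical and chosen only for this example; it is a simple co-occurrence count, not a full data mining algorithm.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that co-occur in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Count each unordered pair at most once per transaction.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical input data: each row is one shopping basket.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "eggs"],
]

# The discovered "knowledge": pairs bought together in at least 3 baskets.
patterns = frequent_pairs(baskets, min_support=3)
print(patterns)
```

The resulting dictionary is itself a small data set, so it could be stored and searched again in a later discovery step.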

Another promising application of knowledge discovery is software modernization, which involves understanding existing software artifacts. This process is related to the concept of reverse engineering. Usually the knowledge obtained from existing software is presented in the form of models against which specific queries can be made when necessary. An entity-relationship model is a frequent format for representing knowledge obtained from existing software. The Object Management Group (OMG) developed the Knowledge Discovery Metamodel (KDM) specification, which defines an ontology for software assets and their relationships for the purpose of performing knowledge discovery on existing code. Knowledge discovery from existing software systems, also known as software mining, is closely related to data mining, since existing software artifacts contain enormous business value that is key to the evolution of software systems. Instead of mining individual data sets, software mining focuses on metadata, such as database schemas.
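
The idea of mining metadata rather than individual data sets can be sketched with a small example that reads a database schema and builds a simple entity-relationship model from it. This is only an illustration under stated assumptions: it uses Python's standard `sqlite3` module and SQLite's `PRAGMA table_info` and `PRAGMA foreign_key_list` statements, the helper `mine_schema` and the sample tables are hypothetical, and the resulting dictionary is not the KDM format.

```python
import sqlite3


def mine_schema(conn):
    """Extract entities (tables), attributes (columns) and
    relationships (foreign keys) from an SQLite database schema."""
    model = {"entities": {}, "relationships": []}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        model["entities"][table] = [col[1] for col in columns]
        # foreign_key_list rows: (id, seq, table, from, to, ...)
        fks = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        for fk in fks:
            model["relationships"].append((table, fk[3], fk[2], fk[4]))
    return model


# Hypothetical existing system: two tables linked by a foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        total REAL
    );
""")

model = mine_schema(conn)

# A query against the recovered model: which entities refer to "customer"?
referrers = [src for src, _, dst, _ in model["relationships"] if dst == "customer"]
print(model)
print(referrers)
```

No rows of business data are read here; the model is recovered from the schema alone, and queries are then made against that model.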

Input data for knowledge discovery

**Relational data
**Document warehouse
**Data warehouse
*Software mining
*Text
**Concept mining
**Molecule mining
**Data stream mining
**Learning from time-varying data streams under concept drift

Output formats for discovered knowledge

*Data model
*Knowledge representation
*Business rule
*Knowledge Discovery Metamodel (KDM)
*Business Process Modeling Notation (BPMN)
*Intermediate representation
*Resource Description Framework (RDF)
*Software metrics

