Metadata repository
A metadata repository is a database created to gather, store, and distribute contextual information about business data; once documented, this information is known as metadata. It includes the meaning and content of the data, the policies that govern it, its technical attributes, the specifications that transform it, and the programs that manipulate it[1].
Definition
The metadata repository is responsible for physically storing and cataloging metadata. The metadata that is stored should be generic, integrated, current, and historical. Generic means that the meta model should store metadata in generic terms rather than in an application-specific way, so that if the database standard changes from one product to another, the physical meta model of the metadata repository does not need to change. Integrated means that all entities of the enterprise can view all metadata subject areas. The repository should also be designed so that both current and historical metadata can be accessed[2]. Metadata repositories used to be referred to as data dictionaries[3].
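The "current and historical" requirement can be illustrated with a minimal sketch. The class names, fields, and sample definitions below are hypothetical and are not taken from the cited sources; the sketch only assumes that each change is appended as a new version instead of overwriting the old one.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class MetadataVersion:
    """One version of an entry, valid from `valid_from` until superseded."""
    definition: str
    valid_from: date


@dataclass
class MetadataEntry:
    """A business data element described in generic, product-neutral terms."""
    name: str
    versions: list[MetadataVersion] = field(default_factory=list)

    def record(self, definition: str, valid_from: date) -> None:
        # Append instead of overwriting, so earlier versions stay available.
        self.versions.append(MetadataVersion(definition, valid_from))
        self.versions.sort(key=lambda v: v.valid_from)

    def current(self) -> Optional[MetadataVersion]:
        # The most recent version, or None if nothing has been recorded.
        return self.versions[-1] if self.versions else None

    def as_of(self, when: date) -> Optional[MetadataVersion]:
        # Latest version whose validity started on or before `when`.
        candidates = [v for v in self.versions if v.valid_from <= when]
        return candidates[-1] if candidates else None


# Hypothetical sample data.
entry = MetadataEntry("customer_id")
entry.record("Legacy 6-digit customer number", date(2001, 1, 1))
entry.record("Global 10-digit customer identifier", date(2008, 6, 1))
```

Here `entry.current()` returns the newest definition, while `entry.as_of(date(2005, 1, 1))` returns the definition that applied on that date.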
Repository vs. Registry
A metadata repository is similar to a metadata registry in that both store only metadata. They differ in that a repository provides response times suitable for browsing and reporting, while a registry provides response times suitable for service virtualization[4].
Reason for use
Each database management system (DBMS) and database tool has its own language for the metadata components it contains. Database applications already have their own repositories or registries, which are expected to provide all of the functionality needed to access the data stored within them. Vendors do not want other companies to be able to migrate data easily away from their products and into competitors' products, so they handle metadata in proprietary ways. CASE tools, DBMS dictionaries, ETL tools, data-cleansing tools, OLAP tools, and data mining tools all handle and store metadata differently. Only a metadata repository can be designed to store the metadata components from all of these tools[5].
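One way to picture a repository that accepts metadata from differently structured tools is a set of small adapters that translate each tool's vocabulary into one common record. The following is a sketch under stated assumptions: the export formats, field names, and adapter functions are invented for illustration and do not correspond to any real product.

```python
from typing import Callable

# Hypothetical exports: each tool describes the same column in its own vocabulary.
dbms_export = {"COLUMN_NAME": "CUST_ID", "DATA_TYPE": "NUMBER", "DATA_LENGTH": 10}
etl_export = {"field": "cust_id", "type": "integer", "size": 10}


def from_dbms(raw: dict) -> dict:
    """Translate the assumed DBMS dictionary format into the common form."""
    return {
        "name": raw["COLUMN_NAME"].lower(),
        "type": raw["DATA_TYPE"].lower(),
        "length": raw["DATA_LENGTH"],
    }


def from_etl(raw: dict) -> dict:
    """Translate the assumed ETL tool format into the common form."""
    return {"name": raw["field"], "type": raw["type"], "length": raw["size"]}


# One adapter per source tool; adding a tool means adding an entry here.
ADAPTERS: dict[str, Callable[[dict], dict]] = {"dbms": from_dbms, "etl": from_etl}


def load(repository: dict, source: str, raw: dict) -> None:
    """Normalize `raw` with the adapter for `source` and file it by element name."""
    record = ADAPTERS[source](raw)
    repository.setdefault(record["name"], {})[source] = record


repository: dict = {}
load(repository, "dbms", dbms_export)
load(repository, "etl", etl_export)
```

After both loads, `repository["cust_id"]` holds the DBMS view and the ETL view of the same element side by side, each in the common form.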
Design
Metadata repositories should store metadata in four classifications: ownership, descriptive characteristics, rules and policies, and physical characteristics. Ownership identifies the data owner and the application owner. Descriptive characteristics define the names, types, lengths, and definitions that describe business data or business processes. Rules and policies define security, data cleanliness, data timeliness, and relationships. Physical characteristics define the origin or source and the physical location[6]. Just as a logical data model is built when creating a database, a logical meta model can help identify the metadata requirements for business data[7]. A metadata repository can be centralized, decentralized, or distributed.
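The four classifications can be sketched as a simple record structure. The class and field names are illustrative assumptions derived from the description above, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ownership:
    data_owner: str
    application_owner: str


@dataclass
class DescriptiveCharacteristics:
    name: str
    data_type: str
    length: int
    definition: str


@dataclass
class RulesAndPolicies:
    security: str
    cleanliness: str
    timeliness: str
    relationships: list[str]


@dataclass
class PhysicalCharacteristics:
    source: str
    location: str


@dataclass
class MetadataRecord:
    """One metadata entry grouped into the four classifications."""
    ownership: Ownership
    descriptive: DescriptiveCharacteristics
    rules: RulesAndPolicies
    physical: PhysicalCharacteristics


# Hypothetical sample entry.
record = MetadataRecord(
    ownership=Ownership(data_owner="Sales", application_owner="CRM team"),
    descriptive=DescriptiveCharacteristics(
        name="customer_id",
        data_type="integer",
        length=10,
        definition="Unique identifier assigned to each customer",
    ),
    rules=RulesAndPolicies(
        security="internal use only",
        cleanliness="no duplicates allowed",
        timeliness="refreshed nightly",
        relationships=["order.customer_id"],
    ),
    physical=PhysicalCharacteristics(source="CRM system", location="crm_db.customer"),
)
```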
Centralized/Decentralized/Distributed
- A centralized metadata repository is the easiest to implement because there is only one database[8]. That single database stores metadata for all applications across the business. A centralized metadata repository has the same advantages and disadvantages as a centralized database: it is easier to manage because all the data is in one place, but bottlenecks may occur.
- A decentralized metadata repository stores metadata in multiple databases, separated by location, by department, or both. This makes management more involved than with a centralized repository, but the metadata can be broken down by individual department.
- A distributed metadata repository uses a decentralized approach, but unlike a decentralized repository, the metadata remains in its original application. An XML gateway is created[9] that acts as a directory for accessing the metadata within each application. The advantages and disadvantages of a distributed metadata repository mirror those of a distributed database.
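The distributed variant can be sketched as a directory that records only where each piece of metadata lives, while the metadata itself stays with its application. The XML layout, application names, and lookup functions below are assumptions made for illustration; the cited source does not specify the gateway format.

```python
import xml.etree.ElementTree as ET

# Hypothetical directory document: it lists locations, not the metadata itself.
DIRECTORY_XML = """
<directory>
  <element name="customer_id" application="crm"/>
  <element name="order_total" application="billing"/>
</directory>
"""

# Stand-ins for the applications, each of which keeps its own metadata.
APPLICATION_METADATA = {
    "crm": {"customer_id": {"type": "integer", "owner": "Sales"}},
    "billing": {"order_total": {"type": "decimal", "owner": "Finance"}},
}


def build_directory(xml_text: str) -> dict[str, str]:
    """Parse the directory document into {element name: application name}."""
    root = ET.fromstring(xml_text.strip())
    return {e.get("name"): e.get("application") for e in root.findall("element")}


def lookup(directory: dict[str, str], element_name: str) -> dict:
    """Route the request to the owning application and return its metadata."""
    application = directory[element_name]
    return APPLICATION_METADATA[application][element_name]


directory = build_directory(DIRECTORY_XML)
order_total_metadata = lookup(directory, "order_total")
```

In a centralized design the directory step disappears, because every lookup goes to the same database; in the distributed design every lookup passes through the directory first.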
Entity-Relationship/Object-Oriented
Metadata repositories can be designed using either an entity-relationship model or an object-oriented design.
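The difference between the two approaches can be shown by modeling the same small piece of metadata both ways. This is a sketch only: the table names, class names, and sample data are hypothetical, and an in-memory SQLite database stands in for whatever DBMS a real repository would use.

```python
import sqlite3
from dataclasses import dataclass, field

# Entity-relationship style: metadata held as rows in related tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE business_entity (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE attribute (
        id INTEGER PRIMARY KEY,
        entity_id INTEGER NOT NULL REFERENCES business_entity(id),
        name TEXT NOT NULL,
        data_type TEXT NOT NULL
    );
    INSERT INTO business_entity (id, name) VALUES (1, 'Customer');
    INSERT INTO attribute (entity_id, name, data_type)
        VALUES (1, 'customer_id', 'integer');
""")
rows = conn.execute(
    "SELECT e.name, a.name, a.data_type "
    "FROM attribute a JOIN business_entity e ON e.id = a.entity_id"
).fetchall()


# Object-oriented style: the same metadata held as objects that contain objects.
@dataclass
class Attribute:
    name: str
    data_type: str


@dataclass
class BusinessEntity:
    name: str
    attributes: list[Attribute] = field(default_factory=list)


customer = BusinessEntity("Customer", [Attribute("customer_id", "integer")])
```

In the relational version the link between an entity and its attributes is a foreign key resolved by a join; in the object version it is a direct reference from the entity to its attribute list.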
Metadata Repository Solutions
If you choose not to build your own metadata repository, the following vendors can provide one:
- ASG
- Logic Library
- BEA Systems
- CA
- MetaMatrix
- Troux Technologies
References
- [1] Moss, L. T., & Atre, S. (2003). Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications. Addison-Wesley Professional. p. 171.
- [2] Marco, D., & Jennings, M. (2004). Universal Metadata Models. Wiley. Chapter 2.
- [3] Moss, L. T., & Atre, S. (2003). Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications. Addison-Wesley Professional. p. 239.
- [4] Thompson, J. (9 November 2007). Q&A: What Is a Registry/Repository, and Who Should Consider One? p. 5. http://www.gartner.com/it/content/754400/754413/qa_what_is_a_registry.pdf
- [5] Marco, D. (2000). Building and Managing the Metadata Repository: A Full Lifecycle Guide. Wiley.
- [6] Moss, L. T., & Atre, S. (2003). Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications. Addison-Wesley Professional. p. 176.
- [7] Moss, L. T., & Atre, S. (2003). Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications. Addison-Wesley Professional. p. 185.
- [8] Moss, L. T., & Atre, S. (2003). Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications. Addison-Wesley Professional. p. 242.
- [9] Moss, L. T., & Atre, S. (2003). Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications. Addison-Wesley Professional. p. 246.
Wikimedia Foundation. 2010.