Enterprise Data Fabric

An Enterprise Data Fabric (EDF) is a distributed, operational data platform that sits between application infrastructures (such as J2EE or the .NET Framework) and back-end data sources. It offers data storage (caching), multiple APIs for data access, reliable data distribution, and real-time data analysis. All of these features are designed with scalability and performance in mind.

Why It is Relevant

EDF proponents suggest that traditional infrastructure tools such as databases, data warehouses, and enterprise messaging systems cannot meet the real-time needs of applications, and that an EDF is needed to fill the gap. In their view, these other architectures suffer from:

* High latency and lack of scalability under concurrent loads
* Lack of effective state management in distributed environments
* Expensive and inefficient data replication
* Lack of flexibility in supporting event-driven architectures as well as request/reply

Fundamental Tenets

1. It’s about operational data management: Unlike a data warehousing system, where terabytes (or petabytes) of data are consolidated from multiple databases for offline analysis, the EDF is a real-time data store optimized for the operational data subsets that real-time applications need. This is the “right now” data: the data accessed by many processes and applications. The EDF is a layer of abstraction in the middle tier that co-locates frequently used data with the application and works with back-end databases behind the scenes.

2. Distributed persistence via distributed caching: An EDF stores data using main-memory distributed caching, which can make data access many times faster than with a traditional disk-based DBMS. It harnesses the memory and disk of many clustered machines to co-locate data with consuming applications and to provide high data access rates and scalability. Highly concurrent main-memory data structures are used to avoid lock contention. Different policies can be applied to different data subsets in different locations, so that data management is shaped by the application rather than the other way around, and users are isolated from the characteristics of the underlying technology. Persistence becomes an attribute of all parts of the system rather than being concentrated in the database. High availability and consistency are not compromised: a configurable policy dictates the number of redundant in-memory copies to maintain, and failure detection built into the distribution system ensures data correctness. The in-memory data layer can be backed by a disk persistence layer that receives data synchronously or asynchronously, depending on the usage scenario.
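The redundancy policy described above can be illustrated with a minimal Python sketch. It is not taken from any EDF product: the class `PartitionedCache`, its methods, and the use of plain dictionaries as "nodes" are all assumptions made for illustration, and disk persistence and real failure detection are left out.

```python
import zlib


class PartitionedCache:
    """Toy in-memory cache spread across several 'nodes' (plain dicts).

    Hypothetical illustration only; not the API of any real data fabric.
    """

    def __init__(self, node_count, redundant_copies=1):
        if redundant_copies >= node_count:
            raise ValueError("need more nodes than redundant copies")
        self.nodes = [dict() for _ in range(node_count)]
        self.alive = [True] * node_count
        self.copies = 1 + redundant_copies  # primary plus backups

    def _owners(self, key):
        # Deterministic hash, so every caller picks the same owner nodes.
        start = zlib.crc32(repr(key).encode()) % len(self.nodes)
        return [(start + i) % len(self.nodes) for i in range(self.copies)]

    def put(self, key, value):
        # Write the entry to the primary and to each redundant copy.
        for n in self._owners(key):
            if self.alive[n]:
                self.nodes[n][key] = value

    def get(self, key):
        # Read from the first surviving owner that holds the entry.
        for n in self._owners(key):
            if self.alive[n] and key in self.nodes[n]:
                return self.nodes[n][key]
        return None

    def fail_node(self, n):
        # Simulate a crash: the node's memory contents are lost.
        self.alive[n] = False
        self.nodes[n].clear()


cache = PartitionedCache(node_count=4, redundant_copies=1)
cache.put("order:42", {"qty": 10})
primary = cache._owners("order:42")[0]
cache.fail_node(primary)
print(cache.get("order:42"))  # {'qty': 10}, served from the backup copy
```

With `redundant_copies=1` the entry survives the loss of its primary node; losing both owners would lose the entry, which is the gap that a disk persistence layer is meant to cover.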

3. Key database semantics are retained: Much like a database management system, an EDF allows distributed data to be managed with transactional integrity, queried, and recovered from disk. This sets it apart from simple distributed caching solutions, which cache serialized objects and simple key-value pairs in hash maps that can be replicated to cluster nodes. An EDF also supports multiple data models across multiple popular languages: data can be managed as objects, XML documents, or relational tables, and accessed through programmatic APIs (such as Java, C++, or C#) or query languages such as OQL, XPath, and SQL. Unlike a DBMS, where all updates are persisted and transactional (ACID), an EDF relaxes these constraints and lets applications control when, and for which kinds of data, full ACID characteristics are required.
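The idea of opting in to transactional behavior only where it is needed can be sketched as follows. The `Region` class and its `transaction()` and `query()` methods are hypothetical names chosen for this example; a Python predicate stands in for an OQL or SQL query, and only atomicity is modeled (no isolation or durability).

```python
from contextlib import contextmanager


class Region:
    """Toy data region: plain puts by default, transactions on request.

    Hypothetical illustration only; not the API of any real data fabric.
    """

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Non-transactional write: applied immediately.
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def query(self, predicate):
        # Stand-in for a query language: filter values with a predicate.
        return [v for v in self._data.values() if predicate(v)]

    @contextmanager
    def transaction(self):
        staged = {}
        # Writes made inside the block go to `staged`; they reach the
        # region only if the block finishes without raising.
        yield staged
        self._data.update(staged)


accounts = Region()
accounts.put("alice", {"owner": "alice", "balance": 100})
accounts.put("bob", {"owner": "bob", "balance": 20})

try:
    with accounts.transaction() as tx:
        tx["alice"] = {"owner": "alice", "balance": 50}
        raise RuntimeError("transfer failed midway")
except RuntimeError:
    pass

print(accounts.get("alice")["balance"])  # 100, the staged write was discarded
```

Ordinary `put` calls skip the staging step entirely, which mirrors the trade-off described above: the application pays for all-or-nothing updates only on the data that needs them.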

4. Active data management: Data in an EDF is dynamic: it changes rapidly and is updated by many processes in a distributed environment. In addition to the request-reply paradigm (à la databases), an EDF therefore supports an event-driven model in which applications are notified when events of interest are generated in the fabric. This model is realized through a combination of ad-hoc querying (request-reply) and continuous querying (event-driven). In the continuous query model, applications register queries representing complex patterns of interest. Unlike a database system, where queries are executed against resident data, an EDF continuously evaluates incoming data (or events) with a query engine that is aware of the interest expressed by hundreds of distributed client processes.
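The difference between ad-hoc and continuous querying can be shown in a few lines of Python. This is a single-process sketch under stated assumptions: `ContinuousQueryEngine`, `register`, and `select` are invented names, and queries are plain Python predicates rather than a real query language.

```python
class ContinuousQueryEngine:
    """Toy store that supports both ad-hoc and continuous queries.

    Hypothetical illustration only; not the API of any real data fabric.
    """

    def __init__(self):
        self._data = {}
        self._queries = []  # (predicate, callback) pairs

    def register(self, predicate, callback):
        # Continuous query: remember the interest for future updates.
        self._queries.append((predicate, callback))

    def select(self, predicate):
        # Ad-hoc (request-reply) query over the data currently held.
        return [v for v in self._data.values() if predicate(v)]

    def put(self, key, value):
        self._data[key] = value
        # Event-driven path: evaluate every registered query on the update.
        for predicate, callback in self._queries:
            if predicate(value):
                callback(key, value)


engine = ContinuousQueryEngine()
alerts = []
engine.register(
    lambda t: t["symbol"] == "ACME" and t["price"] > 100,
    lambda key, trade: alerts.append(key),
)
engine.put("t1", {"symbol": "ACME", "price": 95})
engine.put("t2", {"symbol": "ACME", "price": 101})
engine.put("t3", {"symbol": "XYZ", "price": 500})
print(alerts)  # ['t2']
```

The client never polls: the query is evaluated as each update arrives, and only matching updates trigger the callback.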

5. Messaging-like semantics for data distribution: When managing data across distributed applications, developers expect reliable, guaranteed publish-subscribe semantics, much like those offered by messaging systems. An EDF layers these messaging-like data distribution features on top of what looks, from a developer's data access and storage standpoint, like a database. The system knows which subscribers are active and provides different levels of message delivery guarantees to them. With traditional messaging, applications have to deal with piecemeal messages, message construction, embedding contextual information in messages, and managing data consistency across publishers and subscribers. An EDF enables a more intuitive approach, in which applications simply work with a data model (object or SQL) and subscribe to portions of it. When data publishers update business objects or relationships, subscribers are notified of the changes to the underlying distributed data fabric and can immediately access the relevant data from the fabric.
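As a rough sketch of subscribing to a portion of the data model, the following Python example queues change notifications per subscriber and lets the subscriber read the current value from the store. The `Fabric` class, the key-prefix convention for selecting a portion of the data, and the `drain` method are assumptions made for illustration; real products offer richer interest registration and delivery guarantees.

```python
from collections import deque


class Fabric:
    """Toy data store with per-subscriber notification queues.

    Hypothetical illustration only; not the API of any real data fabric.
    """

    def __init__(self):
        self._data = {}
        self._subs = {}  # subscriber name -> (key prefix, pending keys)

    def subscribe(self, name, prefix):
        # Register interest in every key that starts with `prefix`.
        self._subs[name] = (prefix, deque())

    def put(self, key, value):
        self._data[key] = value
        for prefix, pending in self._subs.values():
            if key.startswith(prefix):
                # Queue only the key; the data itself stays in the fabric.
                pending.append(key)

    def get(self, key):
        return self._data.get(key)

    def drain(self, name):
        # Hand over queued notifications in arrival order, then clear them.
        _, pending = self._subs[name]
        keys = list(pending)
        pending.clear()
        return keys


fabric = Fabric()
fabric.subscribe("risk-desk", "position/")
fabric.put("position/ACME", 300)
fabric.put("quote/ACME", 101.5)
fabric.put("position/XYZ", 250)

for key in fabric.drain("risk-desk"):
    print(key, fabric.get(key))
```

The publisher only updates the data model; it never builds a message. Because notifications wait in a queue until the subscriber drains them, a subscriber that is briefly busy does not miss changes, which is a simplified stand-in for the delivery guarantees described above.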

External Links

* [http://www.forrester.com/Research/Document/Excerpt/0,7211,35918,00.html Forrester Research: Information Fabric]
* [http://www.gemstone.com/products/gemfire/edf.php GemFire Enterprise Data Fabric]
* [http://www.oracle.com/technology/products/coherence/index.html Oracle Coherence Data Grid] (Formerly the [http://www.tangosol.com Tangosol Coherence Data Grid] )
* [http://www.ibm.com/developerworks/downloads/ws/wsdg/learn.html?S_TACT=105AGX10&S_CMP=LP/ IBM WebSphere DataGrid - Free Download]
* [http://www.alachisoft.com/ NCache Enterprise Data Grid - Free Download]
* [http://www.gigaspaces.com/ GigaSpaces Enterprise Data Fabric]
* [http://www.cacheonix.com/ Cacheonix Data Fabric]
* [http://www.hazelcast.com/ Hazelcast Data Fabric]
* [http://www.adobe.com/products/livecycle/dataservices/index.html Adobe LiveCycle Data Services ES]


Wikimedia Foundation. 2010.
