Membase
Developer(s): Couchbase (merged from NorthScale), Zynga, NHN
Stable release: 1.7.1 / July 26, 2011
Written in: C++, Erlang
Operating system: Cross-platform
Type: Distributed key/value database system
License: Apache License
Website: membase.com

Membase (pronunciation: mem-base) is an open source (Apache 2.0 license), distributed key-value database management system optimized for storing data behind interactive web applications. These applications must serve many concurrent users while creating, storing, retrieving, aggregating, manipulating and presenting data. To support these needs, membase is designed to provide simple, fast, easy-to-scale key-value data operations with low latency and high sustained throughput. It is designed to be clustered, from a single machine to very large scale deployments.
For those familiar with memcached, membase provides on-the-wire client protocol compatibility,[1] but is designed to add disk persistence (with hierarchical storage management), data replication, live cluster reconfiguration, rebalancing and multi-tenancy with data partitioning.
In the parlance of Eric Brewer’s CAP theorem, membase is a CP type system.
History
Membase was developed by several leaders of the memcached project, who had founded a company, NorthScale, expressly to meet the need for a key-value database that enjoyed all the simplicity, speed, and scalability of memcached, but also provided the storage, persistence and querying capabilities of a database. The original membase source code was contributed by NorthScale and project co-sponsors Zynga and NHN to a new project on membase.org in June 2010.
On February 8, 2011, the Membase project founders and Membase, Inc. announced a merger with CouchOne (a company with many of the principal players behind CouchDB), with an associated project merger. The merged project will be known as Couchbase.[2]
Design drivers
According to the Membase site and presentations, Membase design decisions are weighed against three non-negotiable requirements. By design, membase is simple, fast, and elastic.[3]
Membase intends to be extremely easy to manage and simple to develop against. Every node in a membase cluster is alike – clone a node, join it to the cluster and press the rebalance button to automatically rebalance data to it. Membase has wide language and application framework support due to its on-the-wire protocol compatibility with memcached; in fact, membase directly incorporates memcached “front end” source code, leveraging the memcached engine interface, which is intended to keep it compatible today and into the future.
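Because of this protocol compatibility, a client talks to membase the same way it talks to memcached. As a minimal sketch, assuming the memcached text protocol (the helper names `build_set`, `build_get` and `parse_get_response` are hypothetical, not part of any membase client library), the request framing looks like this:

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Frame a memcached text-protocol 'set' request."""
    # Command line: set <key> <flags> <exptime> <bytes>, then the data block.
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Frame a memcached text-protocol 'get' request for one key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(data: bytes):
    """Parse a single-key 'get' reply; return the value, or None on a miss."""
    if data == b"END\r\n":
        return None  # the server holds no item for this key
    header, _, rest = data.partition(b"\r\n")
    parts = header.split()
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError("unexpected reply: %r" % header)
    length = int(parts[3])  # VALUE <key> <flags> <bytes>
    return rest[:length]
```

The bytes produced by `build_set` and `build_get` could be written to a TCP socket connected to a server that speaks the memcached text protocol; the host and port to use depend on the deployment and are not specified here.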
Membase distributes data and data operation I/O across commodity servers (or VMs), replicates data for high availability, transparently caches data in main memory, and persists data using a multi-tier storage management model (planned to support solid-state drive and hard disk drive media). It is a consistently low-latency, high-throughput processor of data operations. It is multi-threaded, with low lock contention; it automatically de-duplicates writes and is internally asynchronous wherever possible.
Membase claims to scale with linear cost. Servers can be added to, or removed from, a running cluster with no application downtime. Employing commodity servers, virtual machines or cloud machine instances, data management resources can be dynamically matched to the needs of an application with little effort.
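The text does not describe how data is partitioned, so the following is only an illustrative sketch of one way a cluster can be resized without downtime. It assumes a fixed number of partitions, a CRC32 hash to map keys to partitions, and a partition-to-node map; `NUM_PARTITIONS`, `partition_for`, `build_map` and `rebalance` are all hypothetical names. The point it shows is that adding or removing a node moves only a fraction of the partitions, while keys never change partition.

```python
import zlib

NUM_PARTITIONS = 64  # hypothetical; fixed for the lifetime of the cluster


def partition_for(key: str) -> int:
    """Map a key to a partition; independent of how many nodes exist."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def build_map(nodes):
    """Initial partition map: partitions are dealt out round-robin."""
    return [nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)]


def rebalance(old_map, nodes):
    """Return a map over `nodes` that is even and moves few partitions."""
    quota, extra = divmod(len(old_map), len(nodes))
    # The first `extra` nodes may hold one partition more than the rest.
    limit = {n: quota + (1 if i < extra else 0) for i, n in enumerate(nodes)}
    load = {n: 0 for n in nodes}
    new_map = list(old_map)
    homeless = []
    for p, owner in enumerate(old_map):
        if owner in load and load[owner] < limit[owner]:
            load[owner] += 1        # partition stays where it is
        else:
            homeless.append(p)      # owner was removed or is over its limit
    for p in homeless:
        target = next(n for n in nodes if load[n] < limit[n])
        new_map[p] = target
        load[target] += 1
    return new_map
```

For example, growing a three-node map to four nodes with `rebalance(build_map(["a", "b", "c"]), ["a", "b", "c", "d"])` reassigns only the partitions needed to fill the new node; a client then locates a key with `new_map[partition_for(key)]`.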
Data model
Key Features (persistence, replication/failover, scalability/performance)
Persistence
- Asynchronously writes data to disk after acknowledging the write to the client. In version 1.7 and later, applications can ensure data is synced to more than one server, while disk writes are still asynchronous.
- Tunables to define item ages that affect when data is persisted.[4]
- Supports a working set larger than the memory quota of a "node" or "bucket".
- Tunables that control maximum memory use and how data migrates from main memory to disk.[5]
- Configurable “tap” interface: External systems can subscribe to filtered data streams – supporting, for example, full text search indexing, data analytics or archiving.[6]
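The asynchronous persistence described in this list, combined with the write de-duplication mentioned earlier, can be sketched as a write-behind queue. This is a simplified single-process model under stated assumptions, not membase's implementation: the class `WriteBehindStore`, its `flush` method and the dictionary standing in for the disk are all hypothetical.

```python
from collections import OrderedDict


class WriteBehindStore:
    """Toy model: acknowledge writes from memory, persist them later."""

    def __init__(self):
        self.memory = {}            # in-memory copy served to readers
        self.dirty = OrderedDict()  # key -> latest value not yet on disk
        self.disk = {}              # stand-in for persistent storage
        self.disk_writes = 0

    def set(self, key, value):
        self.memory[key] = value
        # A second write to the same key replaces the pending one, so only
        # the latest value is ever persisted (write de-duplication).
        self.dirty[key] = value
        return "STORED"             # acknowledged before any disk I/O

    def get(self, key):
        return self.memory.get(key)

    def flush(self):
        """Drain pending writes; a real system would do this in the background."""
        while self.dirty:
            key, value = self.dirty.popitem(last=False)
            self.disk[key] = value
            self.disk_writes += 1
```

Two `set` calls on the same key followed by one `flush` result in a single disk write, which is the effect de-duplication is meant to have under write-heavy workloads.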
Replication and failover
- Multi-model replication support: Peer-to-peer replication support with underlying architecture supporting master-slave replication.
- Configurable replication count: Balance resource utilization with availability requirements
- High-speed failover: Fast failover to replicated items on request.
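As a rough illustration of how a configurable replication count and failover fit together, the sketch below assumes each partition keeps an ordered list of nodes, with the active copy first and replicas after it. The function `fail_over` and this data layout are assumptions made for the example; the text does not specify how membase tracks replicas.

```python
def fail_over(partitions, failed):
    """Drop `failed` from every partition; the first survivor becomes active.

    `partitions` maps a partition id to a list of node names, active copy
    first, followed by its replicas.
    """
    result = {}
    for pid, chain in partitions.items():
        survivors = [node for node in chain if node != failed]
        if not survivors:
            # No replica was configured for this partition, or all are gone.
            raise RuntimeError(f"partition {pid} has no surviving copy")
        result[pid] = survivors
    return result
```

With one replica per partition, failing over a node promotes the replica of every partition that node was serving, while partitions it only replicated simply lose a copy until replication is restored.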
Scalability and performance
- Distributed object store: Easily store and retrieve large volumes of data from any application, using any language or application framework
- Dynamic cluster resizing and rebalancing: Effortlessly grow or shrink a membase cluster, adapting to changing data management requirements of an application
- Guaranteed data consistency: Never grapple with consistency issues in your application – no quorum reads required
- High sustained throughput
- Low, predictable latency. When operating out of memory, most operations occur in far less than 1 ms (assuming gigabit Ethernet).
Prominent users
See also
- Memcached
- MemcacheDB
- NoSQL
- CouchDB
References
- [1] http://code.google.com/p/memcached/wiki/NewProtocols
- [2] Couchbase Website
- [3] membase.org: Does the world really need another NoSQL Database?
- [4] membase.org wiki: membase Background Flush
- [5] membase.org wiki: Disk > Memory
- [6] Want to know what your memcached servers are doing? Tap them.
- [7] NorthScale Releases High-Performance NoSQL Database
- [8] Best Open Source Database: Its probably a NoSQL
Commercially supported distributions
External links
Categories:
- Open source database management systems
- Distributed computing architecture
- NoSQL
- Cross-platform software
- Structured storage
Wikimedia Foundation. 2010.