Citrusleaf database
- Developer(s): Citrusleaf, Inc.
- Stable release: 2.0.23 / September 1, 2010
- Written in: C
- Operating system: Linux
- Type: Distributed key/value database system
- License: Enterprise (perpetual or subscription based)
- Website: citrusleaf.net

The Citrusleaf database is an ACID-compliant, post-relational NoSQL database produced and marketed by Citrusleaf, Inc. It was originally developed to manage mission-critical data for applications on the real-time web. These applications must store 5 to 10 kilobytes of information on each of hundreds of millions of web users and compare it against potential ads to display, all with sub-millisecond response time. Citrusleaf takes advantage of the properties of solid-state drives (SSDs) to accomplish this. As of 2010, Citrusleaf has been deployed in production.
History
While at Yahoo! and Aggregate Knowledge, the founders of Citrusleaf encountered a problem: the volume and performance demands of real-time web applications caused traditional SQL databases to fail. There were several reasons. The first was the sheer volume of data: keeping track of 5 to 10 kilobytes of information for each of hundreds of millions of people produced a database with billions of objects. Retrieving and processing this information with sub-millisecond response time was impossible with traditional database approaches, which were designed with rotational disk storage in mind. The average seek time of rotating disk storage is about ten milliseconds, so a single random read from disk already exceeds a sub-millisecond budget.
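The figures above can be checked with a short back-of-envelope calculation. This is only a sketch: the user count of 500 million is an assumed value within "hundreds of millions", and the record size takes the upper end of the 5 to 10 kilobyte range.

```python
# Back-of-envelope estimate of data volume and rotating-disk limits.
# USERS is an assumption for illustration; the text only says "hundreds of millions".
USERS = 500_000_000
BYTES_PER_USER = 10 * 1000      # upper end of the 5-10 kilobyte range (decimal KB)
SEEK_TIME_MS = 10               # average seek time of a rotating disk, from the text
BUDGET_MS = 1                   # response-time target: under one millisecond

total_bytes = USERS * BYTES_PER_USER
total_terabytes = total_bytes / 10**12

# A disk that spends about 10 ms per seek can serve at most this many random reads per second.
reads_per_disk_per_second = 1000 // SEEK_TIME_MS

print(f"Total data: {total_terabytes} TB")
print(f"Random reads per disk per second: {reads_per_disk_per_second}")
print(f"One seek fits in the latency budget: {SEEK_TIME_MS <= BUDGET_MS}")
```

Under these assumptions the data set is several terabytes, and one seek alone costs ten times the whole latency budget, which is the gap that motivated a storage design not bound by seek time.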
Design Drivers
The answer lay in making use of solid-state drives (SSDs). In addition to performance, fault-tolerant design was a concern: the applications were mission-critical, so the solution also had to be available without interruption. In 2008 Brian Bulkowski created a key-value data store, and Srini Srinivasan joined him in 2009. Together they created the Citrusleaf database platform, an ACID-compliant, fast, scalable, fault-tolerant database engine[citation needed]. The system is capable of 100,000 transactions per second per node, with a response time of under one millisecond[citation needed]. To sustain these transaction loads without interruption as nodes join and leave the cluster, the authors created software solutions in the areas of distributed systems, real-time prioritization, and storage management across all kinds of storage.
Data model
Citrusleaf organizes all data into namespaces. A namespace is similar to a database instance in an RDBMS and controls policies such as replication count and storage location. Within a namespace, each data object is referenced by a table and a primary key, which can be a string, an integer, or binary data. A key is a unique reference to a piece of data: common keys include usernames and session identifiers.
Each data object is a collection of "bins" in Citrusleaf's parlance, which are similar to columns in SQL. The system is schema-less in that different data objects of the same table can use different bins. Each bin's value is typed. The supported types are strings, integers, blobs, and "reflection blobs", which are binary data produced by a language's own object serializer (such as a Java blob generated by Java's serializer). The use of typed values allows different languages to interoperate simply: a string set in Java will appear correctly through the Python client, even though Java and Python use different underlying character representations (Unicode vs UTF-8).
Some high-level operations (such as atomically adding to an integer) are supported, in the style of Redis, but the set of instructions is not very rich.
Citrusleaf's data model allows it to be considered as a document store, although it is more similar to a schema-less version of the row based schema typically used in relational systems.
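The data model described above can be pictured with a small in-memory sketch. It is not the Citrusleaf client API: the `Namespace` class and its `put`, `get`, and `add` methods are hypothetical names invented for illustration, and a lock stands in for whatever mechanism the server uses to make integer addition atomic.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict, Tuple, Union

BinValue = Union[str, int, bytes]   # typed bin values: string, integer, blob
Key = Union[str, int, bytes]        # primary keys: string, integer, binary data


@dataclass
class Namespace:
    """Hypothetical model of a namespace: a policy plus (table, key) -> bins."""
    name: str
    replication_count: int = 2
    records: Dict[Tuple[str, Key], Dict[str, BinValue]] = field(default_factory=dict)
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def put(self, table: str, key: Key, bins: Dict[str, BinValue]) -> None:
        """Write bins to a record; records in one table may use different bins."""
        for bin_name, value in bins.items():
            if not isinstance(value, (str, int, bytes)):
                raise TypeError(f"unsupported type for bin {bin_name!r}")
        with self._lock:
            self.records.setdefault((table, key), {}).update(bins)

    def get(self, table: str, key: Key) -> Dict[str, BinValue]:
        """Return a copy of the record's bins."""
        with self._lock:
            return dict(self.records[(table, key)])

    def add(self, table: str, key: Key, bin_name: str, delta: int) -> int:
        """Add delta to an integer bin as one indivisible step."""
        with self._lock:
            record = self.records.setdefault((table, key), {})
            current = record.get(bin_name, 0)
            if not isinstance(current, int):
                raise TypeError(f"bin {bin_name!r} is not an integer")
            record[bin_name] = current + delta
            return record[bin_name]


ns = Namespace("ads", replication_count=2)
ns.put("users", "alice", {"segment": "sports", "visits": 3})
ns.put("users", 42, {"cookie": b"\x00\x01"})   # same table, different bins
ns.add("users", "alice", "visits", 1)
print(ns.get("users", "alice"))
```

The two records live in the same table but carry different bins, which is the schema-less property; the `add` call changes the integer bin without a separate read and write by the caller.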
Replication and Failover
- Automatic failure detection and in-flight transaction rerouting for nonstop operation in the face of failure.
- Automatic Client failover: Clients track cluster membership for automatic load balancing and transaction re-try.
- Flexible replication policy: Set replication factors for individual data items.
- Randomized object replication allows smooth load balancing during failure recovery.
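One way to see why randomized replica placement smooths load during recovery is the sketch below. It uses rendezvous (highest-random-weight) hashing as a stand-in technique; the source does not say which algorithm Citrusleaf uses, and the node names, key names, and `replicas_for` function are assumptions made for illustration.

```python
import hashlib
from collections import Counter


def replicas_for(key, nodes, replication_factor):
    """Rank nodes by a per-key pseudo-random score and keep the top ones."""
    def score(node):
        digest = hashlib.sha256(f"{node}:{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")
    return sorted(nodes, key=score, reverse=True)[:replication_factor]


nodes = ["node-a", "node-b", "node-c", "node-d"]   # hypothetical cluster
keys = [f"user-{i}" for i in range(10_000)]        # hypothetical object keys

before = {k: replicas_for(k, nodes, 2) for k in keys}

# Simulate the failure of node-b and recompute placement on the survivors.
survivors = [n for n in nodes if n != "node-b"]
after = {k: replicas_for(k, survivors, 2) for k in keys}

# For objects whose first replica was on node-b, count which node takes over.
takeover = Counter(after[k][0] for k in keys if before[k][0] == "node-b")
print(takeover)
```

Because each object's replica order is effectively random, the objects that lived on the failed node are taken over by all of the surviving nodes rather than by a single neighbor, and objects that never touched the failed node keep their placement.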
Scalability and Performance
- Distributed object store: Easily store and retrieve large volumes of data through Citrusleaf clients for C, C#, PHP, Java, Python and Ruby.
- Automatic cluster resizing and rebalancing: A Citrusleaf cluster automatically grows or shrinks using zero-configuration networking.
- High sustained throughput of over 100,000 transactions per second per commodity node.
- Real-time performance: Low, predictable sub-millisecond latency from memory or flash storage.
Wikimedia Foundation. 2010.