Apache Cassandra DB


Apache Cassandra is an open-source distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple data centers, with asynchronous masterless replication allowing low-latency operations for all clients.

Cassandra achieves the highest throughput for the maximum number of nodes in all experiments

Apache Cassandra DB Services

Every node in the cluster has the same role. There is no single point of failure. Data is distributed across the cluster (so each node contains different data), but there is no master as every node can service any request.

Every node is working
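To illustrate how data can be spread across equal peers with no master, here is a minimal sketch of a token ring. It makes simplifying assumptions: MD5 stands in for the cluster's configured partitioner, each node owns a single token, and the node names are hypothetical. Real clusters typically assign many tokens per node.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key to a ring position (MD5 is a stand-in chosen for this sketch)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class Ring:
    def __init__(self, nodes):
        # Each node owns one token; sorting the tokens forms the ring.
        self._entries = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def owner(self, key: str) -> str:
        """Return the node with the first token at or after the key's token."""
        i = bisect.bisect_left(self._tokens, token(key))
        # Wrap around to the first node when the key hashes past the last token.
        return self._entries[i % len(self._entries)][1]


ring = Ring(["node-a", "node-b", "node-c"])
for k in ("user:1", "user:2", "user:3"):
    print(k, "->", ring.owner(k))
```

Because ownership is a pure function of the key and the token list, any node that knows the ring can work out where a row lives, which is why any node can accept any request.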

Replication strategies are configurable. Cassandra is designed as a distributed system for deploying large numbers of nodes across multiple data centers. Key features of its distributed architecture are tailored to multi-data-center deployment, redundancy, failover, and disaster recovery.

Fully functional replication
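As a rough illustration of configurable replication, the sketch below builds a CQL `CREATE KEYSPACE` statement that uses `NetworkTopologyStrategy` with a replica count per data center. The helper function, the keyspace name `shop`, and the data center names `dc_east` and `dc_west` are hypothetical; in a real cluster the data center names must match those the cluster reports.

```python
def create_keyspace_cql(name: str, replicas_per_dc: dict) -> str:
    """Build a CREATE KEYSPACE statement with per-data-center replica counts."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    # Sort for a stable, readable statement.
    parts += [f"'{dc}': {count}" for dc, count in sorted(replicas_per_dc.items())]
    options = ", ".join(parts)
    return "CREATE KEYSPACE IF NOT EXISTS " + name + " WITH replication = {" + options + "};"


statement = create_keyspace_cql("shop", {"dc_east": 3, "dc_west": 2})
print(statement)
```

The resulting statement would be run through a CQL shell or driver; this sketch only constructs the text and does not contact a cluster.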

Read and write throughput both increase linearly as new machines are added, with no downtime or interruption to applications.

No downtime, no interruptions

Data is automatically replicated to multiple nodes for fault tolerance. Replication across multiple data centers is fully supported, and failed nodes can be replaced with no downtime.

Easy to fix issues

Writes and reads offer a tunable level of consistency, ranging from "writes never fail" to "block for all replicas to be readable", with the quorum level in the middle.

Consistency is key
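The arithmetic behind tunable consistency can be sketched as follows. A quorum is a majority of the replicas, and a read is guaranteed to see the latest acknowledged write when the read and write replica counts together exceed the replication factor. The function names are illustrative, and only a few consistency levels are modeled.

```python
def quorum(replication_factor: int) -> int:
    """A quorum is a strict majority of the replicas."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond at a given consistency level."""
    table = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return table[level]


def reads_see_latest_write(read_level: str, write_level: str, rf: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    r = replicas_required(read_level, rf)
    w = replicas_required(write_level, rf)
    return r + w > rf


for read, write in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
    print(read, write, reads_see_latest_write(read, write, rf=3))
```

With a replication factor of 3, reading and writing at quorum touches 2 + 2 replicas, so at least one replica is in both sets; reading and writing at ONE gives no such overlap but has the lowest latency.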

Cassandra integrates with Hadoop, including MapReduce support. Apache Pig and Apache Hive are also supported.

Excellent support across Apache

Cassandra introduces CQL (Cassandra Query Language), a SQL-like alternative to the traditional RPC interface. Language drivers are available for Java (JDBC), Python (DBAPI2) and Node.js (Helenus).

A very nice language, easy to use and fully supported

Cassandra Data Model

Cassandra’s data model is a partitioned row store with tunable consistency. Rows are organized into tables; the first component of a table’s primary key is the partition key; within a partition, rows are clustered by the remaining columns of the key. Other columns may be indexed separately from the primary key.
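The partition and clustering behavior described above can be modeled in a few lines. This is an in-memory sketch, not how Cassandra stores data on disk; the `readings` table, its columns, and the `Table` class are hypothetical. The CQL in the comment shows the kind of schema being imitated.

```python
from collections import defaultdict

# Schema being modeled (CQL):
#   CREATE TABLE readings (
#       sensor_id text,
#       ts int,
#       value double,
#       PRIMARY KEY (sensor_id, ts)   -- sensor_id: partition key, ts: clustering column
#   );


class Table:
    def __init__(self):
        # partition key -> {clustering key: remaining columns}
        self._partitions = defaultdict(dict)

    def insert(self, partition_key, clustering_key, **columns):
        # Writing the same primary key again overwrites the row (an upsert).
        self._partitions[partition_key][clustering_key] = columns

    def select(self, partition_key):
        """Return one partition's rows ordered by clustering key."""
        part = self._partitions.get(partition_key, {})
        return [(ck, part[ck]) for ck in sorted(part)]


readings = Table()
readings.insert("sensor-1", 30, value=21.5)
readings.insert("sensor-1", 10, value=20.0)
readings.insert("sensor-2", 20, value=18.2)
print(readings.select("sensor-1"))
```

Queries that name the partition key touch a single partition and get rows back already ordered by the clustering column, which is why tables are usually designed around the queries they must serve.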

Tables may be created, dropped, and altered at runtime without blocking updates and queries.

Cassandra does not support joins or subqueries. Instead, Cassandra emphasizes denormalization through features like collections.
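To make the denormalization point concrete, the sketch below folds a relational-style child table into a set stored on the parent row, which is the shape a CQL collection column such as `set<text>` provides. The `users` table, its columns, and the sample data are hypothetical.

```python
# Denormalized target schema (CQL):
#   CREATE TABLE users (
#       user_id text PRIMARY KEY,
#       name text,
#       emails set<text>
#   );

# Normalized, relational-style input: fetching a user's emails would need a join.
users = {"u1": {"name": "Ada"}, "u2": {"name": "Grace"}}
user_emails = [
    ("u1", "ada@example.com"),
    ("u1", "ada@work.example"),
    ("u2", "grace@example.com"),
]


def denormalize(users, user_emails):
    """Fold the child rows into a set stored on each parent row."""
    rows = {uid: {**cols, "emails": set()} for uid, cols in users.items()}
    for uid, email in user_emails:
        rows[uid]["emails"].add(email)
    return rows


rows = denormalize(users, user_emails)
print(rows["u1"])
```

After this step a single-row lookup by `user_id` returns the name and all emails together, with no join required at read time.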

© Copyright 2012 - Acumen Consulting - St. Louis