Created 05-06-2016 05:38 AM
@hardikvdesai
Created 05-18-2016 05:43 PM
Give commit log an SSD
The simplest thing you can do that will yield a big performance boost is to give your commit log a dedicated SSD. Since Cassandra uses the commit log heavily, pointing the commitlog_directory setting in cassandra.yaml at a dedicated SSD, away from where you store SSTables (the data files), will give much better write performance.
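A quick way to sanity-check the separation is to compare the device IDs of the two directories. This is only a rough sketch: the paths below are hypothetical placeholders for your own commitlog_directory and data_file_directories values, and st_dev identifies a filesystem rather than a physical disk, so two partitions on the same drive would still look "separate".

```python
import os


def on_same_device(path_a: str, path_b: str) -> bool:
    """Return True when both paths live on the same filesystem/device ID."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Hypothetical paths: substitute the values from your own cassandra.yaml.
    commitlog_dir = "/var/lib/cassandra/commitlog"
    data_dir = "/var/lib/cassandra/data"
    if os.path.isdir(commitlog_dir) and os.path.isdir(data_dir):
        if on_same_device(commitlog_dir, data_dir):
            print("Commit log and data share a device; consider moving the commit log.")
        else:
            print("Commit log and data are on different devices.")
    else:
        print("Directories not found; adjust the paths for your install.")
```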
Heap space
Cassandra has a script that automatically allocates memory to each node. The script is very good in most use cases, but if you have lots of other tech running on the same machine, which is likely in HDP, you probably want to check how much memory is assigned to your Cassandra node. For Cassandra 2.2.x the recommendation is between 2 and 8 GB; for Cassandra 3+ you can extend the heap to 16 GB and boost performance. This brings up another interesting point: heap overallocation. Remember that Cassandra depends on GC for clearing up unused memtables and other data structures, so allocating too much memory will slow GC down and lengthen pauses.
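To get a feel for what the automatic sizing picks, here is a small sketch of the rule as I remember it from cassandra-env.sh in the 2.x/3.x line: max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8 GB)). Treat the formula as an assumption and check the script shipped with your version; the function name is made up for illustration.

```python
def default_max_heap_mb(system_memory_mb: int) -> int:
    """Approximate the automatic heap size (in MB) for a given amount of RAM.

    Assumed rule: max(min(RAM/2, 1024 MB), min(RAM/4, 8192 MB)).
    """
    half_capped = min(system_memory_mb // 2, 1024)
    quarter_capped = min(system_memory_mb // 4, 8192)
    return max(half_capped, quarter_capped)


if __name__ == "__main__":
    for ram_mb in (2048, 16384, 65536):
        print(f"{ram_mb} MB RAM -> {default_max_heap_mb(ram_mb)} MB heap")
```

Under this rule the heap never goes above 8 GB on its own, so anything larger (such as 16 GB on Cassandra 3+) has to be set by hand, and on a shared HDP box the computed value may be more than you can spare.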
Enable JNA
Ensure that you have the JNA (Java Native Access) library enabled in your cluster. It allows Java to call native C methods and gives it access to native memory, which Cassandra uses for off-heap storage of many of its data structures. Check the logs for the following two messages; the latter means JNA was able to get access to native memory:
JNA link failure, one or more native method will be unavailable.
CLibrary.java (line 121) JNA mlockall successful
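If you would rather not eyeball the logs, a few lines of Python can look for the two messages. This is a minimal sketch: it matches only the message text quoted above, the function name is mine, and the log path in the comment is an assumption about a typical install.

```python
from typing import Iterable


def jna_status(log_lines: Iterable[str]) -> str:
    """Return 'mlockall_ok', 'link_failure', or 'unknown'.

    The last matching message wins, so the result reflects the most
    recent startup recorded in the log.
    """
    status = "unknown"
    for line in log_lines:
        if "JNA mlockall successful" in line:
            status = "mlockall_ok"
        elif "JNA link failure" in line:
            status = "link_failure"
    return status


if __name__ == "__main__":
    sample = [
        "INFO  CLibrary.java (line 121) JNA mlockall successful",
    ]
    print(jna_status(sample))
    # With a real log (path is an assumption, adjust for your install):
    # with open("/var/log/cassandra/system.log") as fh:
    #     print(jna_status(fh))
```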
Memtable = offheap
Configure memtables to be stored in native memory rather than on the JVM's heap. In cassandra.yaml:
memtable_allocation_type: offheap_objects
Compaction
Use the correct compaction strategy for your workload! Leveled compaction can really help READ-heavy workloads, since about 90% of reads should be able to retrieve the row you want from a single SSTable once it has been compacted to levels higher than 0. Size-tiered compaction can help deal with WRITE-burst workloads, where you expect very high-pressure peaks of writes.
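The strategy is set per table through CQL. As a rough illustration, the sketch below maps the two workload types described above to a strategy class and builds the matching ALTER TABLE statement. The workload labels, helper name, keyspace, and table are all hypothetical; only the strategy class names and the statement shape are Cassandra's.

```python
# Workload labels are made up for this example.
STRATEGY_BY_WORKLOAD = {
    "read_heavy": "LeveledCompactionStrategy",
    "write_burst": "SizeTieredCompactionStrategy",
}


def compaction_cql(keyspace: str, table: str, workload: str) -> str:
    """Build an ALTER TABLE statement that sets the compaction strategy."""
    strategy = STRATEGY_BY_WORKLOAD[workload]
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH compaction = {{'class': '{strategy}'}};"
    )


if __name__ == "__main__":
    # demo_ks.events is a placeholder table name.
    print(compaction_cql("demo_ks", "events", "read_heavy"))
```

You would run the resulting statement in cqlsh; changing the strategy on an existing table triggers recompaction, so pick a quiet period.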
Swap
Make sure you've disabled swap. We don't want Cassandra going into swap space, because performance will degrade very rapidly. Also set /proc/sys/vm/swappiness to 1, just in case swap gets re-enabled by accident.
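Both settings can be checked from the /proc files on Linux. The sketch below assumes the usual /proc/swaps layout (one header line, then one line per active swap device); the function names are mine.

```python
import os


def swap_is_active(proc_swaps_text: str) -> bool:
    """True if /proc/swaps lists at least one device after the header line."""
    lines = [ln for ln in proc_swaps_text.splitlines() if ln.strip()]
    return len(lines) > 1


def swappiness_is_low(swappiness_text: str, limit: int = 1) -> bool:
    """True if the vm.swappiness value is at or below the given limit."""
    return int(swappiness_text.strip()) <= limit


if __name__ == "__main__":
    swaps_path = "/proc/swaps"
    swappiness_path = "/proc/sys/vm/swappiness"
    if os.path.exists(swaps_path) and os.path.exists(swappiness_path):
        with open(swaps_path) as fh:
            print("swap active:", swap_is_active(fh.read()))
        with open(swappiness_path) as fh:
            print("swappiness <= 1:", swappiness_is_low(fh.read()))
    else:
        print("No /proc swap files here; this check is Linux-only.")
```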
There are whole books written about this, but these are some of the pointers off the top of my head.
Created 07-04-2019 10:34 PM
Cassandra belongs to the NoSQL family of databases, and Eileen McNulty has an explanation on Dataconomy of why you might need a NoSQL database in the first place.
The four main challenges with Apache Cassandra and how to deal with them
Created 07-08-2019 03:07 PM
One more resource:
Should you use NoSQL or SQL Db or both, by The Startup manager on Medium