Set JAVA_HOME. Check whether JAVA_HOME is already set:
echo $JAVA_HOME
If JAVA_HOME is not set, YCSB will give you an error. On Cloudera, the path to the JRE may be /usr/java/default. Once you have found the path, set JAVA_HOME as follows:
export JAVA_HOME=<path-to-jre>
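The check and the export can be combined into one short snippet. This is only a sketch: DEFAULT_JAVA_HOME is an illustrative variable holding the Cloudera path mentioned above, and that path is an assumption that may not match your system.

```shell
# Assumed fallback location (the Cloudera path mentioned above); adjust for your system.
DEFAULT_JAVA_HOME="/usr/java/default"

# Only set JAVA_HOME when it is empty or unset, so an existing value is kept.
if [ -z "${JAVA_HOME:-}" ]; then
  export JAVA_HOME="$DEFAULT_JAVA_HOME"
  echo "JAVA_HOME was not set; using $JAVA_HOME"
else
  echo "JAVA_HOME is already set to $JAVA_HOME"
fi

# Warn (without failing) if the chosen path does not contain a java executable.
if [ ! -x "$JAVA_HOME/bin/java" ]; then
  echo "Warning: no java executable found under $JAVA_HOME/bin" >&2
fi
```

The export only lasts for the current shell session; add it to your shell profile if you want it to persist.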
Load the data. Before you can run a workload, you must first "load" the data. To load data, do the following:
Note: It is best to run YCSB from a dedicated node. If you run it from a cluster node that is a master or region server node, ensure one of the following; otherwise, YCSB will be unable to run successfully:
The node you are running from has ZooKeeper
OR
hbase-site.xml (usually in the /etc/hbase/conf directory) is copied to the ycsb-0.17.0/hbase20-binding/conf directory
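The copy step above could be scripted roughly as follows. This is a minimal sketch: stage_hbase_conf is a hypothetical helper name, not part of YCSB, and both of its arguments are illustrative parameters that you would replace with the paths on your own system.

```shell
# Copy the cluster's hbase-site.xml into a YCSB binding conf directory.
# Both arguments are illustrative parameters, not YCSB options:
#   $1 = directory holding hbase-site.xml (for example /etc/hbase/conf)
#   $2 = target conf directory (for example ycsb-0.17.0/hbase20-binding/conf)
stage_hbase_conf() {
  src_dir="$1"
  dest_dir="$2"
  # Fail early with a message if the source file is missing.
  if [ ! -f "$src_dir/hbase-site.xml" ]; then
    echo "hbase-site.xml not found in $src_dir" >&2
    return 1
  fi
  # Create the target directory if needed, then copy the file into it.
  mkdir -p "$dest_dir" && cp "$src_dir/hbase-site.xml" "$dest_dir/"
}

# Example call, using the paths mentioned above:
# stage_hbase_conf /etc/hbase/conf ycsb-0.17.0/hbase20-binding/conf
```

With the configuration in place, a load would typically mirror the run command shown later in this section, with load in place of run, for example ./bin/ycsb load hbase20 -p columnfamily=family -s -P workloads/workloada. This assumes the target table and the column family named family already exist in HBase.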
Watchouts: Common errors seen with YCSB
Sometimes the YCSB data load or workload appears to run, but the operation count (insert/read/update) stays at 0 and the estimated completion time is reported as 106751991167300 days 15 hours.
Error message:
2020-02-04 23:00:09:424 20 sec: 0 operations; est completion in 106751991167300 days 15 hours
To solve this, you need to let the HBase client know where your HBase configuration is, and this can be done in two ways:
Point to the hbase-site.xml in the /etc/hbase/conf directory using the classpath parameter (-cp). Example:
./bin/ycsb run hbase20 -cp /etc/hbase/conf -p columnfamily=family -s -P workloads/workloada
Create a conf directory and copy your cluster’s hbase-site.xml to it.
By default, the hbase20-binding/conf directory is added to the classpath; otherwise, you can add it explicitly on the command line using the -cp option (-cp hbase20-binding/conf).
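To catch the stalled-run symptom described above early, the status output can be scanned for the telltale line. The sketch below assumes the status lines were saved to a file, for instance by redirecting stderr when running with -s. ycsb_load_stalled is a hypothetical helper name, and the pattern is based only on the sample message shown above.

```shell
# Return 0 (true) if the given log contains a YCSB status line that reports
# 0 operations together with an implausibly large completion estimate
# (ten or more digits of days), as in the sample message above.
ycsb_load_stalled() {
  log_file="$1"
  grep -Eq ' sec: 0 operations; est completion in [0-9]{10,} days' "$log_file"
}

# Example call (load-status.log is an illustrative file name):
# if ycsb_load_stalled load-status.log; then
#   echo "YCSB is not reaching HBase; check that hbase-site.xml is on the classpath" >&2
# fi
```

A match suggests that the client cannot find the HBase configuration, so one of the two fixes above applies.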
About YCSB
YCSB (Yahoo! Cloud Serving Benchmark) is an open-source specification and program suite for evaluating the retrieval and maintenance capabilities of computer programs. It is widely used to compare the relative performance of NoSQL database management systems.