Support Questions

aishwaryamdixit · 09-01-2017

Hi,

I would like to know the details of the different options and commands available in the pe tool.

Specifically: the number of rows, the number of clients, and columns. How does it work?

pminovic · 09-02-2017

The best way to learn about various pe options is to run "hbase pe" without any options or commands:

$ hbase pe
Usage: java org.apache.hadoop.hbase.PerformanceEvaluation <OPTIONS> [-D<property=value>]* <command> <nclients>
...

As for nclients, I already replied to you in another question: it is the level of parallelism used to run the specified command. In the default MapReduce mode, this means that 10*nclients mappers will be started. Here are the other options you asked about, plus a few others I use:

rows            Rows each client runs. Default: One million
columns         Columns to write per row. Default: 1
presplit        Create presplit table. Recommended for accurate perf analysis (see guide).  Default: disabled
compress        Compression type to use (GZ, LZO, ...). Default: 'NONE'
table           Alternate table name. Default: 'TestTable'
bloomFilter     Bloom filter type, one of [NONE, ROW, ROWCOL]
valueSize       Pass value size to use: Default: 1024

Example:

hbase pe --table=TestTable2 --compress=GZ --presplit=4 randomWrite 5
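
As a rough sketch of how much work the example above requests, based only on the descriptions in this thread (rows are counted per client, and the default MapReduce mode starts 10*nclients mappers). The rows_per_client value takes the "one million" default literally, which is an assumption; check the usage output of your version for the exact number.

```shell
#!/bin/sh
# Estimate the size of a pe run from its arguments.
nclients=5                 # last argument of the example command
rows_per_client=1000000    # assumption: the "one million" default, taken literally

mappers=$((10 * nclients))                   # default MapReduce mode, per the reply above
total_rows=$((rows_per_client * nclients))   # "rows" is per client

echo "mappers=$mappers total_rows=$total_rows"
```

Passing a smaller --rows value is an easy way to keep a first trial run short.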

Of course, run one of the write commands first, then follow it with some reads. For the results, look for the following lines in the output of the MR job:

HBase Performance Evaluation
Elapsed time in milliseconds=492463
Row count=1048560
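
Those two counters can be turned into a rows-per-second figure. The following is a minimal sketch, not part of pe itself: pe_rows_per_sec is a hypothetical helper name, and it assumes the counter lines appear exactly as shown above. How to interpret the elapsed time (wall clock versus a sum over mappers) depends on your HBase version, so treat the result as a relative number for comparing runs.

```shell
#!/bin/sh
# Hypothetical helper: reads MR job output on stdin and prints rows per second.
pe_rows_per_sec() {
  awk -F= '
    /Elapsed time in milliseconds=/ { ms = $2 }    # value after the "=" sign
    /Row count=/                    { rows = $2 }
    END { if (ms > 0) printf "%.0f\n", rows / (ms / 1000) }
  '
}

# Sample input copied from the output quoted above.
pe_rows_per_sec <<'EOF'
HBase Performance Evaluation
Elapsed time in milliseconds=492463
Row count=1048560
EOF
```

For the sample values this prints 2129. Against a real run, redirect the job's console output to a file and pass that file to the helper on stdin.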

You can also prepend "time" and run the command as "time hbase pe ...". For more details, search the web, though the results are fragmented.
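
Putting the advice in this reply together, here is a sketch of a timed write run followed by a timed read run on the same table. The run wrapper and the DRY_RUN variable are hypothetical names introduced for this example; by default the script only prints the commands, so it is safe to try on a machine without HBase. The table name and client count are carried over from the example earlier in this reply.

```shell
#!/bin/bash
# Dry run by default: print each command instead of executing it.
# Set DRY_RUN=0 on a host with the hbase client installed to execute and time them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    time "$@"
  fi
}

# Write first so that the read test has data to fetch.
run hbase pe --table=TestTable2 --presplit=4 randomWrite 5
run hbase pe --table=TestTable2 randomRead 5
```

Using the same table name and client count for both runs keeps the write and read numbers comparable.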

Cloudera Community

How does the HBase performance evaluation tool (pe) work?