Member since: 11-14-2015
Posts: 268
Kudos Received: 122
Solutions: 29
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1018 | 08-07-2017 08:39 AM |
 | 1868 | 07-26-2017 06:06 AM |
 | 5727 | 12-30-2016 08:29 AM |
 | 4368 | 11-28-2016 08:08 AM |
 | 3108 | 11-21-2016 02:16 PM |
09-07-2017
09:58 AM
1 Kudo
It's better to declare every field as VARCHAR and then use functions[1] to convert the values to numbers for mathematical operations. [1] https://phoenix.apache.org/language/functions.html#to_number
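For example, a minimal sketch (the table PRICES and column AMOUNT are hypothetical; AMOUNT is assumed to be stored as VARCHAR):
-- TO_NUMBER converts the VARCHAR value so it can be used in arithmetic and comparisons
SELECT TO_NUMBER(AMOUNT) * 1.10 AS AMOUNT_WITH_TAX
FROM PRICES
WHERE TO_NUMBER(AMOUNT) > 100;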
... View more
09-07-2017
08:35 AM
Grep for WARN or ERROR lines in the region server logs, and also check your system logs for resource-availability errors.
... View more
08-21-2017
10:59 AM
You need to set the same thing for the master opts as well, as described in my previous comment.
... View more
08-21-2017
10:17 AM
1 Kudo
@arun, you need to set -XX:MaxDirectMemorySize=<more_than_bucket_cache_size> in HBASE_MASTER_OPTS as well, because the master internally starts a region server for some tasks.
export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:MaxDirectMemorySize=<more_than_bucket_cache_size>"
... View more
08-10-2017
11:21 AM
Do you have any errors in the region server logs? It seems the hbase:namespace table is somehow not getting assigned.
... View more
08-09-2017
03:04 PM
Before running the above command, please take the region servers and the master down if they aren't already (and keep ZooKeeper running).
... View more
08-09-2017
02:39 PM
If this is not a production cluster (and you are not doing replication or anything else dependent on ZooKeeper), can you try cleaning your ZooKeeper state? It is possible there are znodes for non-existent tables.
bin/hbase clean --cleanZk
... View more
08-07-2017
04:21 PM
Can you check the application and task logs from the YARN UI for any errors? Is the destination cluster for the distcp reachable from the source cluster? Did the job eventually fail with errors?
... View more
08-07-2017
10:06 AM
Map intermediate data is written and sorted on local disk before being sent to the reducer machines. You can:
* Reduce the map output by using a Combiner in between.
* Compress the map output with Gzip to save network IO, at the cost of extra CPU (mapred.compress.map.output=true, mapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec).
* Decrease the split size (this distributes maps across servers) and increase the number of reducers so that each one has less data to sort and process.
* Stop speculative execution (mapred.map.tasks.speculative.execution=false).
* If you can optimize the sorting itself, supply your own algorithm via map.sort.class.
bq. Will I get any performance improvement if I increase the io.sort.mb parameter when the Map() task generates a huge amount of data?
Yes (though the impact may not be huge); you can tune it together with io.sort.factor.
... View more
08-07-2017
09:02 AM
Which version of Kafka are you using (does it use ZooKeeper)? Can you enable DEBUG-level logging and try to find the actual port it is trying to connect to? Is your Kafka secured (using SSL or SASL)?
... View more
08-07-2017
08:54 AM
Can't you use the CROSS operator with the FILTER operator for your use case (or the JOIN operator)?
grunt> cross_data = CROSS <detailed_table>, <lookup_table>;
grunt> joined_data = JOIN <detailed_table> BY email1, <lookup_table> BY email;
... View more
08-07-2017
08:41 AM
Can you paste the complete stack trace?
... View more
08-07-2017
08:39 AM
Currently, it seems there is no option in distcp to do so. A file can be replaced by a file, and a directory by a directory, by passing the "-update" option to the command.
... View more
08-07-2017
06:36 AM
1 Kudo
It seems that your HDFS is not healthy. Can you check your datanode and namenode logs for any exception traces or errors?
... View more
08-07-2017
06:22 AM
you can use "major_compact" command to run a major compaction on the table. In HBase shell:- hbase(main):013:0> major_compact 'tablename'
... View more
08-04-2017
01:27 PM
1 Kudo
Can you please check the causes mentioned by @Josh Elser in a comment on the thread below and try debugging? https://community.hortonworks.com/questions/11779/hbase-master-shutting-down-with-zookeeper-delete-f.html
... View more
08-04-2017
10:15 AM
1 Kudo
You can't execute "get" on a non-string row key from a shell. You need to do it through HBase Java API. Get get = new Get(toBytes("row1")); In Phoenix case, it would be difficult you to form an exact row key from the primary key values as it involves the usage of low level APIs( PTable.newKey(),PDataType.toBytes(value, column.getSortOrder) etc). However , if you are really looking for look ups, you can still do it from sql. SELECT * FROM MyTab WHERE feature='Temp' and TS=to_time('<ts_value>')
... View more
08-03-2017
02:38 PM
You can have an active and a standby HBase master on a two-node cluster independently of your HDFS configuration. But if you don't have an HA NameNode, then you have a single point of failure that can bring your whole HBase cluster down, as HBase relies on HDFS for storage.
... View more
08-03-2017
10:49 AM
Have you confirmed that the region observer is present in the table descriptor of your table? You can check this from the HBase shell using the "describe '<tablename>'" command.
... View more
08-02-2017
05:43 AM
You might be affected by these bugs:
https://issues.apache.org/jira/browse/PHOENIX-4041
https://issues.apache.org/jira/browse/PHOENIX-3611
... View more
08-02-2017
05:37 AM
It depends on what the 50K users are doing (if your cluster capacity and configuration are right, you can scale horizontally without any problem).
* If it is just point lookups (key-value access), then depending on the disks (SSD/HDD) you are using, you should be able to scale without any problem; some basic configuration tweaks are required, like increasing the number of handlers for the DataNode and RegionServer, the block cache/bucket cache, etc.
* If you are doing heavy scans, then you may need a large cluster that can bear this load; network, CPU, and disk will play an important role.
... View more
08-02-2017
05:24 AM
You can create a schema (which is similar to a database) by using the following grammar: https://phoenix.apache.org/language/index.html#create_schema
The schema will be mapped to a namespace in HBase, so your tables can be segregated logically as well as physically: https://phoenix.apache.org/namspace_mapping.html
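A minimal sketch (assuming namespace mapping is enabled via phoenix.schema.isNamespaceMappingEnabled; the schema and table names are hypothetical):
-- Creates a schema, which maps to an HBase namespace, and a table inside it
CREATE SCHEMA IF NOT EXISTS MY_SCHEMA;
CREATE TABLE MY_SCHEMA.EVENTS (
    ID BIGINT NOT NULL PRIMARY KEY,
    NAME VARCHAR
);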
... View more
08-01-2017
09:01 AM
It seems you are using the column mapping feature (it is the default from 4.11), which stores an encoded form of the column name as the HBase qualifier; this saves space and improves performance. Currently, we don't provide any API to give you that mapping (there is no standard JDBC API which exposes storage details). If your application requires using the HBase qualifier directly, then I would suggest creating the table with column encoding disabled, so that the column name is used as the HBase qualifier as well:
CREATE TABLE test
( tag varchar(10) NOT NULL,
ts DATE NOT NULL,
val INTEGER CONSTRAINT pk PRIMARY KEY (tag, ts)
)COLUMN_ENCODED_BYTES=0;
... View more
08-01-2017
08:45 AM
Your OR condition includes a filter on a non-primary-key column, so Phoenix has to read the full table anyway. There are two things you can try:
* You may add an index hint to the query (if you want a particular index to be used); see the sketch after this list.
* If you also have another index in which Branch is the leading key, then you can use a UNION of two queries to get the result fast: select AccountId,Name,Transactiondate from Accounts where Name = 'Rajesh' UNION select AccountId,Name,Transactiondate from Accounts where Branch = 'abc'
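A minimal sketch of the hint syntax (the index name IDX_ACCOUNTS_NAME is hypothetical and assumes such an index exists on Name):
-- The hint asks Phoenix to use the named index instead of the data table
SELECT /*+ INDEX(Accounts IDX_ACCOUNTS_NAME) */ AccountId, Name, Transactiondate
FROM Accounts
WHERE Name = 'Rajesh' OR Branch = 'abc';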
... View more
08-01-2017
08:36 AM
For the second question:
bq. Also, I want to add a table with defined column families and a few qualifiers. My requirement is to store the data with qualifiers at runtime. How can I make sure to fetch the valid qualifiers while fetching a particular record?
You can use dynamic columns while upserting data and doing a SELECT: https://phoenix.apache.org/dynamic_columns.html
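A minimal sketch of dynamic columns (the table EVENTLOG and its columns are hypothetical):
-- The dynamic column LASTGCTIME is declared inline with its type on both UPSERT and SELECT
UPSERT INTO EVENTLOG (EVENTID, LASTGCTIME TIME) VALUES (1, CURRENT_TIME());
SELECT EVENTID, LASTGCTIME FROM EVENTLOG (LASTGCTIME TIME) WHERE EVENTID = 1;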
... View more
07-28-2017
05:44 AM
Why SquirrelSQL is not able to make a connection to the HBase cluster is not clear from the stack trace. Would you mind enabling debug logging and pasting the logs here? Have you tried connecting using sqlline with the same URL from the same machine?
... View more
07-27-2017
10:11 AM
I wouldn't recommend updating a single component in a stack, as there will be incompatibility and upgrade issues. I suspect you might be hitting PHOENIX-2169, so it's better to reproduce the same issue somewhere in pre-prod/dev, see if the fix works, and then apply it to production.
Or, you can ask your vendor to provide a hotfix for the same version.
... View more
07-26-2017
10:49 AM
bq. One thing I wanted to highlight was my EVENT column datatype is VARCHAR....is that making EVENT column not to sort?
A VARCHAR field is also expected to sort in descending order, just like other data types. Your "DESC" declaration on the EVENT column will only be visible when there are records with the same values for col1, col2, and col3. You can order your keys as per your query pattern, or create an additional index with a new order; see the sketch below.
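A minimal sketch of such an index (the table name MY_TABLE and index name are hypothetical):
-- An index whose row key leads with EVENT in descending order
CREATE INDEX IDX_EVENT_DESC ON MY_TABLE (EVENT DESC, COL1, COL2, COL3);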
... View more
07-26-2017
10:43 AM
You need to move your snapshot from the exported directory to the directory where HBase looks for snapshots (.hbase-snapshot), so that list_snapshots and other commands (like clone_snapshot) can work easily:
hadoop dfs -mv hdfs://NAME_NODE:8020/hbase/.hbase-snapshot/<snapshot_name> hdfs://NAME_NODE:8020/apps/hbase/data/.hbase-snapshot/
hdfs dfs -mv hdfs://NAME_NODE:8020/hbase/archive/data/* hdfs://NAME_NODE:8020/apps/hbase/data/archive/data/
FYI: to list snapshots directly from the remote directory:
hbase org.apache.hadoop.hbase.snapshot.SnapshotInfo -remote-dir hdfs://NAME_NODE:8020/hbase/ -list-snapshots
... View more