Member since: 07-17-2019
Posts: 738
Kudos Received: 433
Solutions: 111

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2544 | 08-06-2019 07:09 PM |
| | 2798 | 07-19-2019 01:57 PM |
| | 3919 | 02-25-2019 04:47 PM |
| | 3967 | 10-11-2018 02:47 PM |
| | 1298 | 09-26-2018 02:49 PM |
10-25-2018
03:48 PM
1 Kudo
The more backups you have, the more information the MapReduce job needs to read to execute correctly. Similarly, the more incremental backups you accumulate without a full backup, the more files the MR job has to read and hold. The solution is to increase the Java heap that you provide to the job. It is also worth reviewing the difference between Java heap and physical memory, as your analysis suggests the two are being conflated.
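A minimal sketch of the relevant knobs, using the standard MapReduce memory properties (the class name and the values below are placeholders for illustration, not recommendations):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class BackupJobHeapSketch {
    public static void main(String[] args) {
        // Placeholder values; size these to your cluster.
        Configuration conf = HBaseConfiguration.create();

        // Physical memory handed to the YARN container for each task.
        conf.set("mapreduce.map.memory.mb", "4096");
        conf.set("mapreduce.reduce.memory.mb", "4096");

        // Java heap inside that container. Keep it below the container size,
        // because the JVM needs physical memory beyond the heap (metaspace,
        // thread stacks, direct buffers).
        conf.set("mapreduce.map.java.opts", "-Xmx3276m");
        conf.set("mapreduce.reduce.java.opts", "-Xmx3276m");

        // Hand 'conf' to whatever submits the backup/restore MapReduce job.
        System.out.println(conf.get("mapreduce.map.java.opts"));
    }
}
```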
10-11-2018
02:47 PM
1 Kudo
"waitTime=60001, operationTimeout=60000 expired" You need to include hbase-site.xml on the classpath for your application. It is obvious from the error that the hbase.rpc.timeout and phoenix.query.timeoutMs are not being respected from this error.
10-11-2018
02:46 PM
1 Kudo
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location for replica 0. "Replica 0" refers to the location of a Region. If the client can't find this location, it means the Region is not currently being hosted (it is in transition). Make sure all of the Regions for your table are assigned.
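A rough sketch of checking assignments from the client side (the table name is a placeholder; depending on your HBase version, unassigned Regions may show up with a null ServerName or simply be missing from the result):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;

public class RegionAssignmentCheck {
    public static void main(String[] args) throws Exception {
        // "MY_TABLE" is a placeholder; use your table name.
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             RegionLocator locator =
                 conn.getRegionLocator(TableName.valueOf("MY_TABLE"))) {
            for (HRegionLocation loc : locator.getAllRegionLocations()) {
                // A null ServerName means the Region has no assigned host.
                if (loc.getServerName() == null) {
                    System.out.println("Unassigned region: "
                        + loc.getRegionInfo().getRegionNameAsString());
                }
            }
        }
    }
}
```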
09-26-2018
02:49 PM
1 Kudo
No; by definition, the superuser has permission to perform all actions on the system. You can change who the superuser is if you choose. Without strong authentication via Kerberos, you're wasting your time trying to apply any kind of authorization rules to your system, because anyone will be able to masquerade as whomever they want.
09-04-2018
08:07 PM
You have DataNodes being marked as dead, which prevents HBase from reaching the replication level you have configured. Investigate why the DataNodes are being marked as failed: JVM GC pauses and networking issues are common suspects.
08-29-2018
03:32 PM
1 Kudo
Sounds like you're hitting https://issues.apache.org/jira/browse/PHOENIX-4489. This was fixed in HDP-2.6.5. However, it seems like you are using a version of Phoenix which is not included in HDP, so you are on your own to address that issue.
08-07-2018
03:31 PM
You don't need to scan the entire table if you can enumerate the salt values you used. For example, if you only use salts 000 through 009, you would have to execute 10 Gets to look for the data: 000:rowkey1, 001:rowkey1, 002:rowkey1, 003:rowkey1, ... If you used a stable hashing algorithm to choose the salt based on the rowkey value, you will know the exact salt value to use (e.g. rowkey1 always generates salt "004"). At the end of the day, HBase only stores bytes; it's up to you to know how you inserted the data so that you can retrieve it.
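As a sketch (assuming the salt is a three-character prefix followed by a colon, and placeholder table and row key names), the ten Gets can be issued in one batch:

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class SaltedGetSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("MY_TABLE"))) {

            // Build one Get per possible salt prefix: "000:rowkey1" .. "009:rowkey1".
            List<Get> gets = new ArrayList<>();
            for (int salt = 0; salt <= 9; salt++) {
                byte[] rowKey = Bytes.toBytes(String.format("%03d", salt) + ":rowkey1");
                gets.add(new Get(rowKey));
            }

            // A single batched call issues all ten Gets; only the salted key
            // that actually exists comes back non-empty.
            for (Result result : table.get(gets)) {
                if (!result.isEmpty()) {
                    System.out.println("Found: " + Bytes.toString(result.getRow()));
                }
            }
        }
    }
}
```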
08-06-2018
03:42 PM
Does your data actually span all of the regions you created split points for? Or, when this finishes generating the HFiles, does the client end up having to split the HFiles (and not just load them)? The only thing I can guess is that the HBaseStorageHandler isn't doing something right. Generating only one HFile when you have 10 regions is definitely suboptimal.
08-03-2018
02:36 PM
When you are generating HFiles for HBase, the typical pattern is one reducer per Region, because an HFile must only contain data for a single Region. As such, tweaking the number of reducers you get is really a matter of pre-splitting your table to increase the number of reducers (or merging Regions to reduce the number of reducers).
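For context, this is why HFileOutputFormat2.configureIncrementalLoad drives the reducer count: it partitions the job by the table's Region boundaries, so changing the number of reducers means changing the number of Regions. A minimal sketch with a placeholder table name:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2;
import org.apache.hadoop.mapreduce.Job;

public class BulkLoadReducerSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "hfile-generation");

        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("MY_TABLE"));
             RegionLocator locator =
                 conn.getRegionLocator(TableName.valueOf("MY_TABLE"))) {

            // Configures the TotalOrderPartitioner from the table's Region
            // boundaries, so the job gets one reducer per Region. Pre-splitting
            // or merging the table, not setNumReduceTasks(), is what changes
            // the reducer count here.
            HFileOutputFormat2.configureIncrementalLoad(job, table, locator);

            System.out.println("Reducers: " + job.getNumReduceTasks());
        }
    }
}
```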
08-01-2018
02:59 PM
How are you using Hive (MapReduce, Tez, LLAP)? Can you add some context about where you think the slowness is? For example, how long does it take just to read that data from Hive (run a SELECT)? Can you tell from the logs how much time is actually spent writing data to HBase? If you rerun the same INSERT, does it always take this long? If you change the LIMIT, does 2000 rows take twice as long to insert?