Member since 09-19-2016. 4 Posts, 0 Kudos Received, 1 Solution.
My Accepted Solutions
Title | Views | Posted |
---|---|---|
3308 | 12-08-2016 08:13 PM |
12-08-2016 08:13 PM
So ... after a long hiatus: it turns out this is actually https://issues.apache.org/jira/browse/HBASE-13262. I was using hbase-client 0.96 with HBase 1.0.0 (CDH 5.5), and we had tables housing large XML payloads, which forced the bug to manifest when hbase.client.scanner.caching was set to a high value.

There are multiple ways to fix this:
- Use hbase-client 0.98+, if you can afford to upgrade without impact.
- Lower the value of hbase.client.scanner.caching in CM (this is what I ended up doing).
- Programmatically, use Scan.setCaching(int) and/or Scan.setMaxResultSize(long) to avoid the region skipping.
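
For the programmatic option, here is a minimal sketch of one way to pick a caching value when rows are large. It is only an illustration: `ScanSizing` and `rowsPerRpc` are hypothetical names (not part of HBase or Kite), and the 2 MiB budget and 512 KiB row size are assumed numbers, not values from the cluster above. The helper itself has no HBase dependency; the commented lines show where `Scan.setCaching(int)` and `Scan.setMaxResultSize(long)` would take the results.

```java
// Hypothetical helper for illustration; not part of HBase or Kite.
final class ScanSizing {
    private ScanSizing() {}

    /**
     * Estimates how many rows fit in one scanner RPC when each row is roughly
     * avgRowBytes and the response should stay under maxResponseBytes.
     * Always returns at least 1 so a scan can still make progress.
     */
    static int rowsPerRpc(long maxResponseBytes, long avgRowBytes) {
        if (maxResponseBytes <= 0 || avgRowBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        long rows = maxResponseBytes / avgRowBytes;
        return (int) Math.max(1L, Math.min(rows, Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        long maxResponseBytes = 2L * 1024 * 1024; // assumed budget per RPC (2 MiB)
        long avgRowBytes = 512L * 1024;           // assumed average XML payload per row
        int caching = rowsPerRpc(maxResponseBytes, avgRowBytes);
        System.out.println("caching = " + caching);

        // With hbase-client on the classpath, the values would be applied like this:
        //   Scan scan = new Scan();
        //   scan.setCaching(caching);                 // rows fetched per RPC
        //   scan.setMaxResultSize(maxResponseBytes);  // byte cap per RPC
    }
}
```

The point is simply to keep rows-per-RPC times row size small; lowering hbase.client.scanner.caching in CM achieves the same thing cluster-wide without code changes.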
12-02-2016 04:32 AM
Thanks for this - it works for Parquet, but how does one do this for a table backed by CSV? Say the CSV schema changes: I want to be able to use Avro schema evolution to create the table. I tried the same create statement, but with STORED AS TEXTFILE and ROW FORMAT DELIMITED etc., and I end up getting null values.
09-19-2016 01:09 PM
Hello, I have several HBase tables defined using Avro schemas and I am trying to write a simple Java function to return the entire dataset for a given table (all records). I'm doing something like this (assume the "Customer" Avro schema has been defined):

    DatasetReader<Customer> reader = null;
    RandomAccessDataset<Customer> customers = Datasets.load(PropertyManager.getDatasetURI(HBaseHelper.CUSTOMER), Customer.class);
    reader = customers.newReader();

According to the API docs, this should return the entire unfiltered dataset. The URI method also uses the "dataset:" scheme, so it is not getting a View. What I'm seeing is that only a very small subset of the table is actually returned when I get a handle to the iterator: ~20 out of the 15000 records that are actually in the table, which is barely 0.1%.

Please advise on how to get all records, and whether this is a defect in Kite. Using the native HBase API is not an option because the Kite encoding is challenging to work with outside of Kite.

EDIT: we do not seem to see this issue on a single-node HBase, only on an HBase cluster with Kerberos auth.
Labels:
- Apache HBase
- Kerberos