Member since
07-17-2019
738 Posts
433 Kudos Received
111 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2619 | 08-06-2019 07:09 PM
 | 2854 | 07-19-2019 01:57 PM
 | 4047 | 02-25-2019 04:47 PM
 | 4030 | 10-11-2018 02:47 PM
 | 1343 | 09-26-2018 02:49 PM
07-05-2017
03:26 PM
1 Kudo
No, if you need to store 100 columns per row, you need to set the values on the prepared statement for each row.
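A minimal sketch of what this looks like with JDBC, binding the values for every row on the same prepared statement (the table name `T`, the row data, and the `buildUpsert` helper are hypothetical; only `java.sql` and the standard library are assumed):

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class UpsertSketch {
    // Build a Phoenix-style UPSERT with one "?" placeholder per column.
    static String buildUpsert(String table, int numColumns) {
        String placeholders = IntStream.range(0, numColumns)
                .mapToObj(i -> "?")
                .collect(Collectors.joining(", "));
        return "UPSERT INTO " + table + " VALUES (" + placeholders + ")";
    }

    public static void main(String[] args) {
        String sql = buildUpsert("T", 100);
        System.out.println(sql.length() + " chars, 100 placeholders");
        // With a live connection, you would then set the values per row:
        // try (PreparedStatement ps = conn.prepareStatement(sql)) {
        //     for (Object[] row : rows) {
        //         for (int i = 0; i < row.length; i++) {
        //             ps.setObject(i + 1, row[i]); // JDBC parameters are 1-indexed
        //         }
        //         ps.executeUpdate();
        //     }
        //     conn.commit();
        // }
    }
}
```

The statement is prepared once; only the bound values change between rows.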
06-30-2017
07:10 PM
Please use jstack to capture a stacktrace from the thin client as well as the PQS instance your client is talking to.
06-30-2017
03:46 PM
What's the actual error that the client sees? Or, does the client hang?
06-22-2017
04:04 PM
1 Kudo
"But I think you left an adjective out. I suppose you mean: bad?" Haha, oh my. This is why I shouldn't write responses late at night. Yes, "bad" 🙂 "We have been told by experts at Azure and Hashmap that we don't need to do major compaction and it is currently shut off." -- Again, what do you mean by "compactions are shut off"? HBase is still running compactions and will trigger compactions automatically which include all files in a region (which are "major compactions"). "We have been told that major compaction will block any writes to our
tables (we can't have this). I was told this is untrue at PhoenixCon but
when I asked Hashmap, they said that HDInsight has rewritten major
compaction and that it blocks writes." -- This sounds completely false to me. Last I checked, HDInsight was still HDP and the version of HBase included in HBase does not block writes during compactions. "2. We want to start using TTL. If minor compaction deletes these
records (that is what I took from above) is major compaction required?" Yes, this is explicitly called out in the following documentation that TTL's are applied during minor compactions https://hbase.apache.org/book.html#ttl
Major compactions will still be run on your system, whether or not you have scheduled (daily/weekly) configurations. You cannot and should-not try to disable compactions from running. Not running compactions means that the number of files in your system will continue to grow and the query performance will decrease significantly, not to mention put unnecessary pressure on the Namenode. "3. Why is there so much confusion about this?! Everyone seems to think TTL requires major compaction." I have to assume you're just frustrated and this is rhetorical. I can't tell you why people think what they do. I can point you at the official documentation.
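As a rough illustration of the TTL semantics described above (this is a sketch, not the actual HBase implementation; the class and method names are hypothetical): when any compaction, minor or major, rewrites files, cells older than the family's TTL are filtered out and never written to the compacted StoreFile.

```java
import java.util.List;
import java.util.stream.Collectors;

public class TtlSketch {
    // A cell is expired if its age exceeds the family's TTL.
    static boolean isExpired(long cellTimestamp, long nowMillis, long ttlMillis) {
        return nowMillis - cellTimestamp > ttlMillis;
    }

    // During a compaction (minor or major), expired cells are filtered out
    // and are not written back to the compacted file.
    static List<Long> compact(List<Long> cellTimestamps, long nowMillis, long ttlMillis) {
        return cellTimestamps.stream()
                .filter(ts -> !isExpired(ts, nowMillis, ttlMillis))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // One cell 50ms old (kept), one 150ms old (dropped) with a 100ms TTL.
        System.out.println(compact(List.of(999_950L, 999_850L), 1_000_000L, 100L));
    }
}
```

Note that until some compaction actually runs, expired cells still sit in the StoreFiles on disk; they are merely masked at read time.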
06-22-2017
05:04 AM
"What does "expired data is filtered out and is not written back to the compacted StoreFile." mean?" Filtered data (by TTL) would be removed on compaction. That is what this statement means. As to your confusion from your test, remember that there is a difference between a "minor compaction" and a "major compaction". A "major compaction" is a re-writing of all files in a region, whereas a "minor compaction" is (possibly) a subset of the files in a Region. Because (I'm guessing) you actually mean that you've disabled schedule major compactions doesn't mean that compactions will never run in your system (this is actually a really idea if you've somehow done this, by the way). A minor compaction can remove data masked by a TTL -- this is the simple case. However, tombstones can *only* be removed when a major compaction runs. This is because the tombstone may be masking records that exist in a file which was not included in the compaction process. Long-story short: if you want to verify why your experiment works as it does, just grep out the region identifier for your table from the RegionServer log. You should see a message INFO (if not INFO, certainly at DEBUG) which informs you that a compaction occurred on that Region, the number of input files and size and the size of the output file. Before that compaction message, the file on disk would be the full size; after, it would be the reduced size. https://hbase.apache.org/book.html#compaction does a pretty good job explaining these nuances, but feel free to ask for more clarification.
06-21-2017
03:56 PM
That is simply telling you that you have two nodes not running HBase "master" services (those in charge of coordinating cluster operations). HBase only allows one active master at a time, but you can have multiple running in hot-standby. Since you only have 3 nodes, it is completely expected that you would only have one node running HBase master services. You should just proceed with your installation.
06-21-2017
03:49 PM
This grammar was added in Apache Phoenix 4.9.0 but does not yet exist in the version of Phoenix bundled in an HDP release. https://issues.apache.org/jira/browse/PHOENIX-476
06-20-2017
03:20 PM
1 Kudo
Maybe. ZooKeeper is used for a variety of systems in HDP for high-availability and distributed locking scenarios. For example, high-availability in HDFS with automatic failover requires ZooKeeper.
06-16-2017
06:53 PM
https://hbase.apache.org/book.html#trouble.log.gc https://hbase.apache.org/book.html#gcpause https://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
06-14-2017
03:40 PM
I would never rely on undocumented code. This is likely left undocumented because there is no functionality built into the REST server to safely handle it. If a client requests more data than the REST server can fit into memory, it will cause an OutOfMemoryError and crash the server.