Member since: 07-17-2019
Posts: 738
Kudos Received: 433
Solutions: 111

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3473 | 08-06-2019 07:09 PM |
| | 3669 | 07-19-2019 01:57 PM |
| | 5192 | 02-25-2019 04:47 PM |
| | 4663 | 10-11-2018 02:47 PM |
| | 1768 | 09-26-2018 02:49 PM |
07-13-2016
02:24 PM
1 Kudo
The Phoenix Query Server API documentation can be found on the Apache Calcite website. PQS is essentially a branding of the Avatica Server.
https://calcite.apache.org/avatica/docs/protobuf_reference.html

The write performance of the thin driver with PQS is very close to that of the thick driver, as long as the client uses the batch-oriented APIs to write data (https://calcite.apache.org/avatica/docs/protobuf_reference.html#prepareandexecutebatchrequest). These map to the executeBatch() calls on PreparedStatement. In a performance evaluation I ran, thick and thin were roughly equivalent when using these calls, while non-batch writes through the thin driver were roughly 30% slower than the thick driver. On the read side, I have not noticed any discernible difference between the two approaches. Overall, the thick driver is likely to perform somewhat better than the thin driver because it does less work, but the gap is small. Depending on the physical deployment of your application -- for example, running multiple PQS instances on "beefier" server nodes rather than lightweight "edge" nodes -- the thin driver might result in better overall performance.
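As a hypothetical sketch of the batch-oriented write path described above (the table name, column names, and batch size are illustrative assumptions, not from the original question):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Illustrative sketch of batched upserts, as used by the Phoenix thin driver.
// METRICS and its columns are made up for this example.
public class BatchUpsert {

    // How many executeBatch() round trips a given row count requires.
    static int flushes(int rowCount, int batchSize) {
        return (rowCount + batchSize - 1) / batchSize;
    }

    static void writeRows(Connection conn, int rowCount, int batchSize)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "UPSERT INTO METRICS (ID, VAL) VALUES (?, ?)")) {
            for (int i = 0; i < rowCount; i++) {
                ps.setInt(1, i);
                ps.setLong(2, System.nanoTime());
                ps.addBatch();                 // buffered on the client
                if ((i + 1) % batchSize == 0) {
                    ps.executeBatch();         // one batched round trip to PQS
                }
            }
            ps.executeBatch();                 // flush any remainder
            conn.commit();
        }
    }
}
```

A thin-driver connection is typically opened with a URL of the form `jdbc:phoenix:thin:url=http://<pqs-host>:8765;serialization=PROTOBUF` (host and port here are placeholders for your deployment).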
07-13-2016
02:17 PM
Calling it a "thrift proxy" is a bit inaccurate (Apache Thrift is not in the picture at all).
07-13-2016
01:53 AM
The size of a table, in bytes, is not necessarily tied to the number of regions. For example, a configuration change might result in more or fewer regions for the same amount of data. I don't have a definitive explanation for why you saw the number of regions spike to 27; it might have just been transient. The number of regions likely increased from 5 to 17 because regions in this table split as part of the compaction. You can inspect the RegionServer and Master logs on your cluster for the given table to see whether its regions underwent any splits. There are many reasons the number of regions might increase -- it is hard to say definitively with the information provided so far. I would not worry about having 17 regions instead of only 5.
07-11-2016
08:37 PM
1 Kudo
I would highly recommend using Ambari to install your cluster to avoid future issues. It looks like the ZooKeeper nodes cannot communicate with one another. Are 10.0.1.103 and 10.0.1.105 the correct IP addresses? Can the node you copied the exception from reach the nodes at those IP addresses? Have you checked whether the other nodes report errors?
07-08-2016
05:24 PM
Thanks, @Joshua Adeleke. As in the other question linked by Srai, if you know the specific file(s) your job is reading, you can try the `hdfs debug recoverLease` command on those files. Normally, a lease on an HDFS file expires automatically if the writer goes away abnormally without closing the file. If you are sure no client is still trying to write the file, running recoverLease forces the NameNode to let this operation succeed.
07-08-2016
03:50 PM
Can you share the hdfs fsck command you ran? It definitely sounds like HDFS is not healthy.
07-07-2016
03:34 PM
Loading jars out of HDFS, as enabled by HBASE-1936, would be an alternative to copying the jars to the local filesystem on each node running HBase.
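As a sketch, the HDFS classloading enabled by HBASE-1936 is controlled through the hbase.dynamic.jars.dir property in hbase-site.xml; the path below is an example value, not a required one:

```xml
<!-- hbase-site.xml: an HDFS directory HBase scans for additional jars.
     The path is illustrative; by default it resolves under hbase.rootdir. -->
<property>
  <name>hbase.dynamic.jars.dir</name>
  <value>hdfs:///apps/hbase/lib</value>
</property>
```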
07-06-2016
07:35 PM
1 Kudo
The book will cover which properties to set in hbase-site.xml, which you can do via Ambari. However, it also depends on you copying the necessary jar(s) out to your cluster (/usr/hdp/current/hbase-client/lib should do the trick).
07-02-2016
05:35 PM
1 Kudo
It's just a "bug": the warning message needs to be suppressed. This will be fixed in the final shipped version of HDP 2.5; I assume the sandbox grabbed an earlier build that still has it.
07-01-2016
03:47 PM
60 values of CODNRBEENF per day, or in total? If you have 60 unique values of CODNRBEENF per day, leading with that column would be better; otherwise, leading with the date will probably serve you better over time. If you are also querying on CODINTERNO and CODTXF (together with FECHAOPRCNF and CODNRBEENF), then it makes sense to include them. Having four columns in the primary key constraint is not a problem.
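As an illustrative Phoenix DDL sketch (the table name, non-key column, and data types are assumptions; only the key column names come from the question), a four-column primary key leading with CODNRBEENF would look like:

```sql
-- Hypothetical schema; adjust names and types to your data.
CREATE TABLE IF NOT EXISTS OPERACIONES (
    CODNRBEENF  VARCHAR NOT NULL,   -- ~60 distinct values: candidate leading column
    FECHAOPRCNF DATE    NOT NULL,
    CODINTERNO  VARCHAR NOT NULL,
    CODTXF      VARCHAR NOT NULL,
    IMPORTE     DECIMAL(18, 2)      -- example non-key column
    CONSTRAINT PK PRIMARY KEY (CODNRBEENF, FECHAOPRCNF, CODINTERNO, CODTXF)
);
```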