Member since
07-17-2019
738
Posts
432
Kudos Received
111
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1109 | 08-06-2019 07:09 PM
 | 1345 | 07-19-2019 01:57 PM
 | 1290 | 02-25-2019 04:47 PM
 | 2347 | 10-11-2018 02:47 PM
 | 620 | 09-26-2018 02:49 PM
04-28-2021
05:00 PM
Frustrating that the link is hidden behind a 'paywall'. I have an account, but I am not allowed to view it without contacting sales.
04-16-2021
06:39 AM
I am also getting the same issue. I restarted ZooKeeper, then the RegionServer, and then the HBase Master, but the issue was not resolved. I have even deleted the HBase znode, but the issue is still there. Regards, Satya
12-14-2020
02:37 AM
echo "scan 'emp'" | $HBASE_HOME/bin/hbase shell | awk -F'=' '{print $2}' | awk -F ':' '{print $2}'|awk -F ',' '{print $1}'
07-01-2020
01:23 PM
The Phoenix-Hive storage handler as of v4.14.0 (CDH 5.12) seems buggy. I was able to get the Hive external wrapper table working for simple queries after tweaking the column mapping around upper/lower-case gotchas. However, it fails when I try the "INSERT OVERWRITE DIRECTORY ... SELECT ..." command to export to a file:
org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): Undefined column. columnName=<table name>
This is a known problem that apparently no one is looking at: https://issues.apache.org/jira/browse/PHOENIX-4804
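For reference, a hypothetical repro of the failing export via beeline (the HiveServer2 URL, wrapper table, and output directory below are made up for illustration):
# Export a Phoenix-backed Hive external table to a directory; this is the statement that fails.
beeline -u "jdbc:hive2://localhost:10000/default" -e "
  INSERT OVERWRITE DIRECTORY '/tmp/phoenix_export'
  SELECT * FROM phoenix_wrapper_table;"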
05-17-2020
08:41 PM
Hi @kettle
As this thread was marked 'Solved' in June of 2016, you would have a better chance of receiving a useful response by starting a new thread. This will also give you the opportunity to provide details specific to your use of the PutSQL processor and/or Phoenix, which could help others provide a more tailored answer to your question.
05-10-2020
06:16 AM
Hi all. Based on the configs discussed here, these are the settings that helped me. I am using DBeaver to connect to Phoenix, and queries were timing out as mentioned. In the hbase-site.xml file I made the following changes:
hbase.rpc.timeout = 1200000
phoenix.query.timeoutMs = 1800000
hbase.regionserver.lease.period = 1200000
hbase.client.scanner.caching = 1000
hbase.client.scanner.timeout.period = 1200000
I believe phoenix.query.timeoutMs alone should address the problem posed by the original poster. Please try it and see if it helps you.
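For what it's worth, a quick sketch to confirm which of these values the client is actually picking up (the config path is an assumption for a typical client install):
# Show timeout-related properties present in the client-side hbase-site.xml
grep -B1 -A1 -E "phoenix.query.timeoutMs|hbase.rpc.timeout|hbase.client.scanner" /etc/hbase/conf/hbase-site.xml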
04-15-2020
05:10 AM
I am using ConstantSizeRegionSplitPolicy and MaxFileSize is set to 30 GB, but I found that regions are not being split when the file size reaches 30 GB; some of my files are 300 GB in particular regions. Can you please help me solve this problem? I have a huge volume of data, around 10 TB.
01-18-2020
01:40 PM
Have you resolved your issue? I am facing the same issue. Any suggestion would be very helpful. Thanks in advance!
01-02-2020
08:27 AM
1 Kudo
@o20021106 Check whether you are missing the ZK quorum property in the Hive config (hive-site.xml): hbase.zookeeper.quorum=<your ZooKeeper quorum>. Because the quorum is missing, Hive is trying to connect to the default HBase znode (/hbase) and not getting anything back, since that znode does not even exist.
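A quick way to verify this from the Hive side (a sketch; substitute your own HiveServer2 URL):
# Print the values the Hive session actually sees; an empty quorum or the default
# /hbase znode parent suggests hive-site.xml is missing the HBase settings.
beeline -u "jdbc:hive2://your-hs2-host:10000/default" -e "SET hbase.zookeeper.quorum; SET zookeeper.znode.parent;"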
12-30-2019
05:53 AM
If I have more than one column in the primary key, how shall I proceed?
09-05-2019
03:16 PM
Hi @elserj, while broadly agreeing with the principle of what you are saying, I would amend your earlier comment: "Without enabling Kerberos authentication for HBase, any authorization checks you make are pointless." Knox, in fact, offers a HeaderPreAuth provider for pre-authenticated use cases. It is another matter that it is half-baked as far as user groups are concerned. Without belabouring the point any further: impersonation is for users who are not authenticated and so need to piggyback on an authenticated superuser. The permissions and ACLs still have to be granted to such pre-authenticated users for the resources that they need; it is not the permissions and ACLs of the superuser that are (or should be) used for authorization checks on behalf of the real user. If the perimeter security provides strong authentication, there should not be a need to authenticate the same user again, let alone via Kerberos. A lot of time and resources are wasted by documentation and commentary recommending this route.
09-02-2019
08:33 PM
Can you please elaborate in detail, including the commands you used to resolve it?
06-18-2019
02:27 PM
This is not an error that will cause any kind of problem with your system. RegionServers are known to be reporting the wrong version string. They should give the appropriate HDP-suffixed version string, but do not.
04-16-2018
07:39 PM
OK, thanks Josh, I figured out my mistake. I didn't realize that Phoenix automatically finds the primary key and that I don't have to specify the primary key column name explicitly. Thanks for the guidance.
01-24-2018
09:15 AM
@yassine sihi It looks like the port is being used by some other process, so HMaster is not able to bind to it. Please check the output of the following commands to determine whether any other process (PID) is consuming either of these ports. On the HMaster host:
# netstat -tnlpa | grep 16010
# netstat -tnlpa | grep 16000
If another process is already using one of these ports, check why it is using a port that is supposed to be used by HMaster, kill that process (if appropriate), and then try starting HMaster again.
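If netstat does show a conflict, a sketch like the following (using lsof) will identify the owning process so you can decide whether to stop or reconfigure it:
# Show whichever processes are listening on the HMaster ports
for port in 16000 16010; do
  echo "Port $port:"
  lsof -i TCP:"$port" -s TCP:LISTEN 2>/dev/null
done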
11-14-2017
04:37 PM
Session expiration is often hard to track down. It can be a factor of JVM pauses (due to garbage collection) on either the client (HBase Master) or the server (ZK server), or it could be the result of a znode that has an inordinately large number of children. The brute-force approach would be to disable your replication process, (potentially) drop the root znode, re-enable replication, and then sync up the tables with an ExportSnapshot or CopyTable. This would rule out the data in ZooKeeper as the problem. The other course of action would be to look more closely at the Master log and the ZooKeeper server log to understand why the ZK session is expiring (see https://zookeeper.apache.org/doc/trunk/images/state_dia.jpg for more details on the session lifecycle). A good first step would be checking the number of znodes under /hbase-unsecure/replication.
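As a concrete first check, something along these lines lists those children from the ZooKeeper CLI (the client path and the /hbase-unsecure parent are assumptions for an unsecured HDP-style install; adjust to your zookeeper.znode.parent):
# A very large number of children under the replication znode is suspect
ZK=your-zk-host:2181
/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server "$ZK" ls /hbase-unsecure/replication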
12-28-2018
01:36 PM
What helped me was first copying the jar to the $NIFI_HOME/lib folder and then giving the full path of the jar file in the ExecuteStreamCommand processor, so the config looked like "-jar; /opt/nifi-1.7.1/lib/mycode.jar". A couple of things to ensure: the jar must be owned by the same user that NiFi is running as, and the jar can actually be located anywhere; as long as you give the full path, you should be fine.
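Roughly, the steps described above look like this (the paths and the nifi user/group are assumptions for a default install):
# Copy the jar somewhere NiFi can read it and make sure the NiFi user owns it
cp mycode.jar /opt/nifi-1.7.1/lib/
chown nifi:nifi /opt/nifi-1.7.1/lib/mycode.jar
# ExecuteStreamCommand: Command Path = java, Command Arguments = -jar;/opt/nifi-1.7.1/lib/mycode.jar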
12-27-2017
02:48 PM
I deleted all the snapshots and data after getting a go-ahead from the developers...
08-08-2017
12:52 AM
That's an incorrect approach. You don't need to add the XML files to the jars. As I already mentioned, you need to add the directories where those files are located, not the files themselves. That's how the Java classpath works: it accepts only jars and directories. So if you need a resource on the Java classpath, you either need to have it in a jar file (like you did) or put its parent directory on the classpath. In SQuirreL this can be done in the Extra Class Path tab of the driver configuration.
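To illustrate the rule (the class name and paths here are examples only, not part of the original setup):
# Works: a directory (containing hbase-site.xml) plus a jar on the classpath
java -cp "/etc/hbase/conf:/opt/phoenix/phoenix-client.jar" org.example.Main
# Does not work: pointing the classpath at the .xml file itself; plain files are ignored
# java -cp "/etc/hbase/conf/hbase-site.xml:/opt/phoenix/phoenix-client.jar" org.example.Main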
07-20-2017
08:18 PM
You need to install phoenix-server.jar on all RegionServer and Master servers; that jar provides the MetaDataEndpointImpl coprocessor.
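A rough sketch of that install step (the paths are assumptions for an HDP-style layout; adjust for your distribution, and restart the Master and RegionServers afterwards):
# Copy the Phoenix server jar into HBase's lib directory on every Master and RegionServer host
cp /usr/hdp/current/phoenix-client/phoenix-server.jar /usr/hdp/current/hbase-regionserver/lib/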
06-21-2017
04:03 PM
1 Kudo
Hi @Sami Ahmad. Normally, master services can be spread across the master nodes to ensure proper resource allocation, depending on the cluster. If you have two datanodes/worker nodes that you do not want to run master services on, then no problem: just allocate the hosts you want and move on to the next step. In Ambari, you can click on the Hosts tab to see which services are installed on which host, but you may need to go through them host by host.
06-01-2017
05:00 PM
20 Kudos
I was recently involved with, quite possibly, the worst HBase performance debugging issue of my lifetime so far. The issue first arose with a generic problem statement: after X hours of processing, tasks accessing HBase begin to take over 10 times longer than before. Upon restarting HBase, performance returned to expected levels. There were no obvious errors in the HBase logs, the HDFS logs, or the host's syslog. The problem would manifest itself on a near-constant period: every X hours after restart. It affected different types of client tasks (those reading and writing) and was not limited to a specific node or set of nodes. Strangely, despite all inspection of HBase logs and profiling information, HBase seemed to be functioning perfectly fine. Just, slower.

This led us to investigate numerous operating system configuration changes and monitoring, none of which completely explained the circumstances and symptoms of the problem. After many long days of investigation and some JVM option tweaking, we stumbled onto the first answer which satisfied (or, at least, didn't invalidate) the circumstances: a known, unfixed bug in Java 7 in which JIT code compilation is disabled after the JIT's code cache executes a flush to reclaim space: https://bugs.openjdk.java.net/browse/JDK-8051955

The JIT (just-in-time) compiler runs behind the scenes in Java, compiling Java byte code into native machine code. Code compilation is a tool designed to help long-lived Java applications run fast without negatively affecting the start-up time of short-lived applications. After methods are invoked, they are compiled from Java byte code into machine code and cached by the JVM. Subsequent invocations of a cached method can directly invoke the machine code instead of having to interpret Java byte code.

Analysis:
On a 64-bit JVM with Java 7, this cache has a size of 50MB, which is sufficient for most applications. Methods which are not used frequently are evicted from this cache; this helps keep the JVM from quickly reaching the limit. However, with sufficient time, the cache can still become full, triggering a temporary halt of JIT compilation and caching while the cache is flushed. In Java 7, however, there is an unresolved issue: JIT compilation is not re-enabled after the code cache is flushed. While the process continues to run, no machine code will be cached, which means that code is constantly being re-compiled from byte code into machine code. We were able to confirm that this is what was happening by enabling two JVM options for the HBase services in hbase-env.sh:
-XX:+PrintCompilation
-XX:+PrintSafepointStatistics
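A minimal hbase-env.sh sketch for turning these on (the RegionServer variable is used here as an example; pick the variable that matches the service you are diagnosing):
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:+PrintCompilation -XX:+PrintSafepointStatistics"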
The first option prints a log message for every compilation, for every method marked as "not entrant" (the method is a candidate for removal from the cache), and for every method marked as "zombie" (removed from the cache). This is helpful in determining when JIT compilation is happening.

The second option prints debugging information about the JVM safepoints which are invoked. A JVM safepoint can be thought of as a low-level "lock": the safepoint is taken to provide mutual exclusion at the JVM level. A common use for enabling this option is to analyze the frequency and time taken by garbage collection operations; for example, the concurrent-mark-and-sweep (CMS) collector takes safepoints at various points in its execution. When the code cache becomes full and a flushing event occurs, a safepoint named "HandleFullCodeCache" is taken.

The combination of these two options can show that a Java process performs JIT compilation up until the point that the "HandleFullCodeCache" safepoint is executed, and that no further JIT compilation happens after that point. In our case, the point at which JIT compilation stopped was within about one hour of when the tasks reportedly began to see performance issues.

We did not observe the following log message, which was meant to make this obtuse issue more obvious. We missed it because we were working remotely on a decent-sized installation, which made it infeasible to collect and analyze all of the logs:

Java HotSpot(TM) 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled.

Solution:

There are two solutions to this problem: one short-term and one long-term. The short-term solution is to increase the size of the JVM code cache from the default of 50MB on 64-bit JVMs. This can be accomplished via the -XX:ReservedCodeCacheSize JVM option. Increasing this to a larger value can ultimately prevent the code cache from ever becoming completely full.

export HBASE_SERVER_OPTS="$HBASE_SERVER_OPTS -XX:ReservedCodeCacheSize=256m"

On HDP releases <= 2.6, it is necessary to set the HBASE_REGIONSERVER_OPTS variable explicitly instead:

export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:ReservedCodeCacheSize=256m"

The implication of this configuration is that it removes some available on-heap memory, but the amount is typically quite minor (hundreds of MB, when we typically work with multiple GB of heap).

The long-term solution is to upgrade to Java 8. Java 7 has long been end-of-life'd by Oracle, and this is a prime example of a known issue which was never patched in Java 7. It is strongly recommended that any user still on Java 7 have a plan to move to Java 8 as soon as possible. No other changes would be required on Java 8, as it is not subject to this bug.
10-27-2017
08:16 PM
1 Kudo
This works from Zeppelin:
select DISTINCT("TABLE_NAME") from SYSTEM.CATALOG;
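The same query can also be run from the command line with sqlline (a sketch; the client path and znode parent are assumptions for an HDP install):
echo 'select DISTINCT("TABLE_NAME") from SYSTEM.CATALOG;' > /tmp/list_tables.sql
/usr/hdp/current/phoenix-client/bin/sqlline.py your-zk-host:2181:/hbase-unsecure /tmp/list_tables.sql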
05-08-2017
03:04 PM
Thank you for your fast answer 🙂 Here is a piece from the logs:
handler.OpenRegionHandler: Failed open of region=tsdb,\x00\x02OX\xB2`\xD0\x00\x00\x01\x00\x97D\x00\x00\x03\x00\x00_\x00\x00\x04\x00\x004,1489718792446.a38ea9c28bd1a11574e831668d80c19f., starting to roll back the global memstore size.
org.apache.hadoop.hbase.DoNotRetryIOException: Compression algorithm 'snappy' previously failed test.
at org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:91)
at org.apache.hadoop.hbase.regionserver.HRegion.checkCompressionCodecs(HRegion.java:6560)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6512)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6479)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6450)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6406)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6357)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:362)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:129)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
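Since the stack trace comes from CompressionTest, a quick sanity check for snappy support on the affected RegionServer host would be something like (the scratch path is arbitrary):
# Should report SUCCESS if the native snappy libraries are available to HBase on this host
hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/hbase-snappy-test snappy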
09-12-2017
10:00 AM
2 Kudos
Just in case someone is interested, here is the systemd unit file I use for the Thrift service on CentOS 7.3.
[Unit]
Description=Thrift Service
After=network.target
[Service]
User=hbase
Type=forking
PIDFile=/var/run/hbase/hbase-hbase-thrift.pid
ExecStart=/usr/hdp/current/hbase-master/bin/hbase-daemon.sh start thrift
ExecStop=/usr/hdp/current/hbase-master/bin/hbase-daemon.sh stop thrift
Restart=on-abort
[Install]
WantedBy=default.target
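To use it, something like the following (the unit file name is whatever you save it as):
cp hbase-thrift.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable hbase-thrift.service
systemctl start hbase-thrift.service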
05-08-2017
04:27 PM
Set DEBUG logging on HBase -- it's probably a misconfiguration on your side with NiFi. DEBUG logging on HBase would prove this. Did you include a copy of hbase-site.xml and core-site.xml when providing the JAR to NiFi? If you didn't, the client isn't going to try to use Kerberos auth; there are many articles already covering how to do this.
04-05-2017
04:27 PM
4 Kudos
The Phoenix Query Server is an HTTP server which expects very specific request data. Sometimes, in the process of connecting different clients, the various configuration options of both client and server can create confusion about what data is actually being sent over the wire. This confusion leads to questions like "did my configuration property take effect?" and "is my client operating as I expect?"

Linux systems often have a number of tools available for analyzing network traffic on a node. We can use one of these tools, ngrep, to analyze the traffic flowing into the Phoenix Query Server. From a host running the Phoenix Query Server, the following command will dump all traffic from any source to the Phoenix Query Server.

$ sudo ngrep -t -d any port 8765

The above command will listen to any incoming network traffic on the current host and filter out any traffic which is not to port 8765 (the default port for the Phoenix Query Server). A specific network interface (e.g. eth0) can be provided instead of "any" to further filter traffic. When connecting a client to the server, you should be able to see the actual HTTP requests and responses sent between client and server.

T 2017/04/05 12:49:07.041213 127.0.0.1:60533 -> 127.0.0.1:8765 [AP]
POST / HTTP/1.1..Content-Length: 137..Content-Type: application/octet-stream..Host: localhost:8765..Connection: Keep-Alive..User-Agent: Apache-HttpClient/4.5.2 (Java/1.8.0_45)..Accept-Encoding: gzip,deflate.....?org.apache.calcite.avatica.proto.Requests$OpenConnectionRequest.F.$2ba8e796-1a29-4484-ac88-6075604152e6....password..none....user..none
##
T 2017/04/05 12:49:07.052011 127.0.0.1:8765 -> 127.0.0.1:60533 [AP]
HTTP/1.1 200 OK..Date: Wed, 05 Apr 2017 16:49:07 GMT..Content-Type: application/octet-stream;charset=utf-8..Content-Length: 91..Server: Jetty(9.2.z-SNAPSHOT).....Aorg.apache.calcite.avatica.proto.Responses$OpenConnectionResponse......hw10447.local:8765
##
The data above is in Protocol Buffers, which is not a fully human-readable format; however, "string" data is stored as-is, which makes reading it a reasonable task.