Member since: 06-12-2018
Posts: 19
Kudos Received: 1
Solutions: 2
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 6732 | 07-10-2018 05:31 PM |
| | 843 | 06-12-2018 06:52 PM |
07-23-2018
07:00 PM
@rinu shrivastav If you want to test before changing the parameters, you can do it from the beeline CLI: set mapred.reduce.tasks=0
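A minimal sketch of testing a setting per-session, without touching cluster-wide configuration. The JDBC URL and table name are hypothetical, and the command is printed rather than executed here (no cluster in this sketch); beeline's `-e` flag runs the statements and exits.

```shell
# Hypothetical HiveServer2 URL and table; the set command applies only to this session.
URL='jdbc:hive2://hiveserver2host:10000/default'
CMD="beeline -u \"$URL\" -e 'set mapred.reduce.tasks=0; select count(*) from mytable;'"
echo "$CMD"
```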
07-23-2018
02:57 PM
Example: I run "show databases;":
132 rows selected (56.052 seconds)
0: jdbc:hive2://zk3>
I run it again using the same server/parameters:
132 rows selected (0.232 seconds)
0: jdbc:hive2://zk3>
The 56 seconds was acceptable; sometimes we wait for hours.
07-23-2018
02:36 PM
Hello @Vinicius Higa Murakami. I tried to enable debug mode, but the problem happens after I run a beeline command, so it gets stuck without writing any log. jstack -l hiveserverPID returned a lot of information, but the final line I got was: JNI global references: 399
Zookeeper: echo wchs | nc 127.0.0.1 2181
335 connections watching 2300 paths
Total watches:22724
There are a few errors on HS2, as below:
2018-07-23 07:01:19,321 ERROR [pool-7-thread-188]: server.TThreadPoolServer (TThreadPoolServer.java:run(297)) - Error occurred during processing of message.
2018-07-23 07:01:24,324 ERROR [pool-7-thread-188]: server.TThreadPoolServer (TThreadPoolServer.java:run(297)) - Error occurred during processing of message.
2018-07-23 07:01:29,327 ERROR [pool-7-thread-188]: server.TThreadPoolServer (TThreadPoolServer.java:run(297)) - Error occurred during processing of message.
2018-07-23 08:54:09,306 WARN [pool-7-thread-198]: conf.HiveConf (HiveConf.java:initialize(3093)) - HiveConf of name hive.log.file does not exist
2018-07-23 08:54:09,411 WARN [pool-7-thread-198]: conf.HiveConf (HiveConf.java:initialize(3093)) - HiveConf of name hive.log.dir does not exist
2018-07-23 08:54:09,411 WARN [pool-7-thread-198]: conf.HiveConf (HiveConf.java:initialize(3093)) - HiveConf of name hive.log.file does not exist
2018-07-23 11:00:04,463 ERROR [pool-7-thread-192]: security.JniBasedUnixGroupsMapping (JniBasedUnixGroupsMapping.java:logError(73)) - error looking up the name of group 1000000000: No such file or directory
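The ZooKeeper four-letter-word output above can be parsed to track watch and connection counts over time. A small sketch, with the `wchs` output captured into a variable (the sample text reuses the numbers from this post) instead of querying a live ensemble via `echo wchs | nc <zk-host> 2181`:

```shell
# Sample "wchs" output (illustrative, from the post); on a live host:
#   wchs_output=$(echo wchs | nc 127.0.0.1 2181)
wchs_output='335 connections watching 2300 paths
Total watches:22724'
connections=$(echo "$wchs_output" | awk '/connections watching/ {print $1}')
watches=$(echo "$wchs_output" | sed -n 's/^Total watches:\([0-9]*\)$/\1/p')
echo "connections=$connections watches=$watches"
```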
07-18-2018
12:19 PM
Hello everyone. I am facing an intermittent problem when I use beeline with ZooKeeper service discovery in the connection string, e.g.:

beeline -u "jdbc:hive2://myZK1:2181,myZK2:2181,myZK3:2181,myZK4:2181,myZK5:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"

What I noticed is that with this string I always connect through myZK1, as follows: 0: jdbc:hive2://myZK1> Sometimes every query gets stuck, taking too long to return anything. When I changed the beeline connection string, removing the first ZK (myZK1), I was able to query normally (through myZK2): 0: jdbc:hive2://myZK2>

I suspect some overload on the first ZK. I used the following command and got almost 400 connections on ZK1 and about 250 on ZK2: # echo cons | nc localhost 2181 | wc -l
Does ZooKeeper have an internal load balancer? Why, when I connect using that connection string, do I always connect to the first ZK? If my suspicion about overload is right, what can I do? Could a load balancer such as HAProxy solve this? Thanks in advance!
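The per-server comparison described above can be sketched as a loop over the ensemble. Against a live ensemble each host's client list would come from `echo cons | nc "$host" 2181`; here that call is stood in for by a function returning illustrative output (hostnames are the hypothetical myZK1/myZK2 from the question), so the sketch is self-contained:

```shell
# Stand-in for: echo cons | nc "$1" 2181  (sample client lines, not real data)
fake_cons() {
  case "$1" in
    myZK1) printf '/10.0.0.5:41234[1](queued=0,recved=10,sent=10)\n/10.0.0.6:41235[1](queued=0,recved=7,sent=7)\n' ;;
    myZK2) printf '/10.0.0.7:41236[1](queued=0,recved=3,sent=3)\n' ;;
  esac
}
# Each client connection is reported as a line starting with "/ip:port".
for host in myZK1 myZK2; do
  count=$(fake_cons "$host" | grep -c '^ */')
  echo "$host: $count connections"
done
```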
Labels:
Apache Hive
07-12-2018
08:10 PM
@Eric Richardson Try using your IP address or hostname instead of localhost. You can check whether port 6667 is bound to your IP or to 0.0.0.0 with the command below: netstat -atnp | grep 6667
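A sketch of what to look for in that output: the fourth column of a `netstat -atnp` line is the local (bind) address. The sample line below is illustrative, standing in for the live `netstat -atnp | grep 6667` output:

```shell
# Illustrative netstat line; on a real host: sample=$(netstat -atnp | grep 6667)
sample='tcp  0  0 0.0.0.0:6667  0.0.0.0:*  LISTEN  1234/java'
bind_addr=$(echo "$sample" | awk '{print $4}' | cut -d: -f1)
if [ "$bind_addr" = "0.0.0.0" ]; then
  echo "port 6667 listens on all interfaces"
else
  echo "port 6667 bound to $bind_addr only"
fi
```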
07-12-2018
02:51 PM
@Miguel Guirao Make sure your Docker version is the one required: https://br.hortonworks.com/tutorial/sandbox-deployment-and-install-guide/section/3/ If needed, try reinstalling Docker: https://docs.docker.com/install/linux/docker-ce/ubuntu/#set-up-the-repository
07-11-2018
06:01 PM
@Xianshun Chen I tried on my sandbox and this password works for me. [root@sandbox-hdp ~]# mysql -uroot -phadoop -hlocalhost
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 5.6.39 MySQL Community Server (GPL)
You can try the following link to reset the MySQL root password: https://www.howtoforge.com/setting-changing-resetting-mysql-root-passwords
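A sketch of a typical MySQL 5.6 root-password reset (the server above reports 5.6.39). The commands are printed, not executed: they require root on the database host, and 'newpass' is a placeholder.

```shell
# Typical reset sequence for MySQL 5.6 (printed only; run as root on the DB host).
steps="service mysqld stop
mysqld_safe --skip-grant-tables &
mysql -uroot -e \"UPDATE mysql.user SET Password=PASSWORD('newpass') WHERE User='root'; FLUSH PRIVILEGES;\"
service mysqld restart"
echo "$steps"
```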
07-11-2018
05:46 PM
@Michael Bronson It is not mandatory, but if you are using swap your processes will be slower; it is strongly recommended that you disable swap and THP, as follows: https://community.hortonworks.com/articles/55637/operating-system-os-optimizations-for-better-clust.html
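A small sketch of inspecting the current swap and THP settings before tuning (values vary per host; the disable commands are shown as comments because they require root):

```shell
# Read current settings; paths may not exist on every kernel, hence the fallbacks.
swappiness=$(cat /proc/sys/vm/swappiness 2>/dev/null || echo "unavailable")
thp=$(cat /sys/kernel/mm/transparent_hugepage/enabled 2>/dev/null || echo "unavailable")
echo "vm.swappiness=$swappiness THP=$thp"
# To apply the recommendations (as root):
#   sysctl -w vm.swappiness=1     # persist in /etc/sysctl.conf
#   swapoff -a                    # and remove the swap entry from /etc/fstab
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
```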
07-11-2018
04:39 PM
@jeffin jacob Make sure you can reach that web page (you can try a telnet from your machine to the IP and port); since you are using Azure, you might have problems with internal/external IP addresses. After the "netstat" output that @Vinicius Higa Murakami mentioned above, check whether the bind address (local address) is 0.0.0.0; if the service is bound to a different IP, you need to check the routing rules.
07-11-2018
12:13 PM
@Anjali Shevadkar You are right; that's why I asked you to check the hive CLI. So it seems to be some configuration in your Ranger. Did you try to connect using the ZK hosts in your connection string?

I suggest you check the following document. Make sure the user you configure is the same as the Unix user (or LDAP, whichever you use), and try configuring another user to test. Let me know if this works for you: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/configure_ranger_authentication.html

Another important thing: check the permissions on your HDFS, because when you are using Ranger you need to change the owner/group and permissions: https://br.hortonworks.com/blog/best-practices-in-hdfs-authorization-with-apache-ranger/
07-10-2018
05:48 PM
@Anurag Mishra Can you check your Kerberos ticket? # klist -e
07-10-2018
05:31 PM
@Vinit Pandey I suggest that on your HDFS server you run a process (a shell script, for example) that executes kinit and then fetches the remote files using sftp or scp, e.g. # scp user@remoteserver:/remotepath/files localpath/ followed by # hdfs dfs -put localpath/files /hdfspath Note: to automate this process you can set up private/public SSH keys between the servers and create a crontab entry.
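The fetch-then-ingest flow above can be sketched as a script in dry-run form: with DRY_RUN=1 (the default here) each command is printed instead of executed, since kinit/scp/hdfs need a real cluster. The keytab path, principal, hosts, and paths are all hypothetical.

```shell
# Dry-run wrapper: prints commands when DRY_RUN=1, runs them otherwise.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run kinit -kt /etc/security/keytabs/user.keytab user@EXAMPLE.COM
run scp user@remoteserver:/remotepath/files localpath/
run hdfs dfs -put localpath/files /hdfspath
# Crontab entry to run this script nightly at 01:00 (assumes passwordless SSH):
#   0 1 * * * /path/to/this_script.sh
```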
07-10-2018
05:22 PM
Did you already try using hive directly instead of Hue? Are you using Ranger? If yes, have you checked the policies?
07-10-2018
05:05 PM
@Anjali Shevadkar If you do the same thing using the hive CLI instead of beeline, can you see the databases? Can other people see something? In my case I usually use a connection like this: beeline -u "jdbc:hive2://zookeeperfqdn1:2181,zookeeperfqdn2:2181,zookeeperfqdn3:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" Try using the ZK FQDNs instead of the HiveServer2 FQDN.
06-12-2018
07:06 PM
@Adi Jabkowsky Hi there. The main benefit of using MTU 9000 (jumbo frames) is that Hadoop works better with large files, and you can transfer larger packets than with the default MTU 1500. For detailed measurements you can use the iperf tool: https://support.cumulusnetworks.com/hc/en-us/articles/216509388-Throughput-Testing-and-Troubleshooting I found some best practices too: https://community.hortonworks.com/articles/8563/typical-hdp-cluster-network-configuration-best-pra.html I hope this helps.
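A quick jumbo-frame sanity check before benchmarking with iperf: with MTU 9000 the largest unfragmented ICMP payload is 9000 minus 28 header bytes, i.e. 8972. The commands are printed, not run, and datanode1 is a hypothetical host.

```shell
# 20 bytes IPv4 header + 8 bytes ICMP header = 28 bytes of overhead.
mtu=9000
payload=$((mtu - 28))
echo "ping -M do -s $payload -c 3 datanode1   # -M do sets the don't-fragment flag"
echo "iperf -c datanode1 -t 30               # server side: iperf -s"
```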
06-12-2018
06:52 PM
@Vinay K I had this problem before; researching it, I found that when you use a MapReduce process, all of your machines need to be able to communicate with each other, including the DB machine you mentioned. I'm not sure you can do this without a public IP. I suggest you use a machine that can communicate with your network and has a floating IP as a staging area, and then proceed with the MapReduce process; or, if possible, check whether you can create an internal network to the DB machine to establish this communication. I hope I could help you.
06-12-2018
06:06 PM
@Guru Kamath Considering that edge nodes just submit your jobs as a gateway, the main processing will run on your slave nodes. In my opinion you don't need much storage on edge nodes, unless you need the space to store scripts, logs, and other files that are part of your process. I suggest you use a separate filesystem to store those files, and if possible use the edge node exclusively for gateway services, installing Ambari on another machine.
06-12-2018
05:30 PM
1 Kudo
Hello @Rahul Kumar. Could you check your /etc/hosts? I saw you are using "localhost" as your broker/bootstrap. Try changing it to your hostname instead of "localhost"; some software does not listen on the loopback interface by default. Besides that, if you are using the native Kafka from Hortonworks, the default port for brokers is 6667 (or 6668 when using Kerberos), not 9092. I hope this works.
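A sketch of resolving the broker host the way a client would, to catch a "localhost" misconfiguration. Port 6667 is the HDP default (9092 upstream); on a real host you would follow up with a reachability check such as `nc -vz "$broker_host" 6667`.

```shell
# Resolve the machine's FQDN as a client-facing broker address would need to be.
broker_host=$(hostname -f 2>/dev/null || hostname)
if [ "$broker_host" = "localhost" ]; then
  echo "broker host resolves to localhost: fix /etc/hosts or the listeners config"
else
  echo "broker host: $broker_host (check it appears in /etc/hosts on the clients)"
fi
```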