Member since: 09-13-2016
Posts: 31
Kudos Received: 5
Solutions: 2
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 2709 | 06-12-2017 07:05 AM |
 | 1629 | 02-27-2017 06:42 PM |
06-13-2017 08:11 AM
Hi @Jayadeep Jayaraman, that is great. Thanks for letting me know.
06-12-2017 07:05 AM
I resolved it by removing the partition column (the column on which the table was partitioned) from the DataFrame.
12-26-2017 10:26 AM
This setting (use.hive.interactive.mode=true) has been available in Hive View 2.0 since its inception in HDP 2.5.
02-27-2017 06:42 PM
I resolved it by adding a few extra lines to the rewrite rules to handle the WebSocket interaction:
RewriteRule ^/ws(.*)$ ws://localhost:9995/ws [P]
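To see which request paths a rule like this would catch, here is a minimal sketch that applies the same pattern with Python's `re` module. It only illustrates the matching; it is not how Apache evaluates the rule internally, and the helper name `rewrite` is made up for this example. The pattern and target are copied from the post. Note that the target does not use the captured group, so every matching path is proxied to the same URL.

```python
import re
from typing import Optional

# Pattern and substitution copied from the RewriteRule in the post.
WS_PATTERN = re.compile(r"^/ws(.*)$")
WS_TARGET = "ws://localhost:9995/ws"


def rewrite(path: str) -> Optional[str]:
    """Hypothetical helper: return the proxy target if the path matches, else None."""
    return WS_TARGET if WS_PATTERN.match(path) else None


if __name__ == "__main__":
    for path in ["/ws", "/ws/session", "/api/notebook"]:
        print(path, "->", rewrite(path))
```

Any path that starts with `/ws` matches, and all other paths fall through to whatever rules follow. In Apache itself, the `[P]` flag hands the request to mod_proxy, so proxying to a `ws://` target presumably also needs the WebSocket proxy module enabled.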
10-06-2016 04:29 PM
2 Kudos
It is clear you've already done some research on YARN, so I'll try to respond briefly to each point and see whether my responses are what you need to proceed.
1. In general, the RM picks a worker node that can run the AM container, so yes, it is fine to visualize it as a round-robin approach.
2. The RM has a subtask that is responsible for watching the AMs and, if one of them dies, can restart its container on another node. The AM itself needs to be written in such a way that it is restartable, but generally speaking the ones we all use (MR, Tez, Spark, etc.) are all restartable. That said, "the client" may or may not be affected, but the "job" itself can run to completion despite an AM failure.
3. It is up to the AM to request from the RM the container sizes it needs (and how many). So, yes, theoretically they could be of different sizes, but often they will be the same size. Spark is a good example: we usually ask for N containers, but want them all to have the same number of cores and amount of memory.
4. I believe the RM is going to ensure all jars are shipped to the needed NodeManager (NM) instances on the worker nodes where your containers will run. I also believe you have some options for pre-placing your jars on HDFS, but it's been a while since I toyed with that, and it was with MapReduce.
Hope this helps your understanding some. Good luck!
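As a rough illustration of point 3, this sketch builds a `spark-submit` argument list that asks YARN for N executors of identical size. The flags `--num-executors`, `--executor-cores`, and `--executor-memory` are standard `spark-submit` options. The helper name, the numbers, and the application file `app.py` are assumptions made up for this example.

```python
from typing import List


def build_spark_submit(num_executors: int, cores: int, memory: str, app: str) -> List[str]:
    """Hypothetical helper: request N same-sized executors from YARN."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--num-executors", str(num_executors),  # how many executor containers to ask for
        "--executor-cores", str(cores),         # cores per executor, the same for all
        "--executor-memory", memory,            # memory per executor, the same for all
        app,
    ]


if __name__ == "__main__":
    # Illustrative values only.
    print(" ".join(build_spark_submit(4, 2, "4g", "app.py")))
```

The sketch only assembles the command and does not run it, so it can be inspected without a cluster.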
09-17-2016 04:56 PM
3 Kudos
@Jayadeep Jayaraman Please see my inline comments.
1. If I want to use separate Tez configurations for Hive, is changing hive-site.xml sufficient?
Take advantage of Hive session-level property settings, or use --hiveconf to set session-level parameters that override the default Hive/Tez parameters.
2. How to set up Tez on this new node?
Install tez-client on this new node.
3. How will Hive or any other application know about Tez that has been newly set up on this node? I am trying to basically understand how an application (Hive, Pig, etc.) is aware of Tez.
tez.lib.uris points to the Tez jars on HDFS, which tells the Hive client to use the Tez library at runtime. To see this location, run set tez.lib.uris; from your client.
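The `--hiveconf` approach from the first answer can be sketched as a small command builder. The `--hiveconf` and `-e` options belong to the Hive CLI. The helper name and the particular override shown are assumptions chosen for illustration, not settings recommended in the post.

```python
from typing import Dict, List


def build_hive_command(overrides: Dict[str, str], query: str) -> List[str]:
    """Hypothetical helper: build a hive CLI call with session-level overrides."""
    cmd = ["hive"]
    for key, value in overrides.items():
        # Each --hiveconf pair overrides the default for this session only.
        cmd += ["--hiveconf", f"{key}={value}"]
    cmd += ["-e", query]
    return cmd


if __name__ == "__main__":
    # Illustrative override; the query prints the configured Tez library location.
    print(build_hive_command({"hive.execution.engine": "tez"}, "set tez.lib.uris;"))
```

The same properties could instead be set inside a session with `set key=value;`, which is the other option the answer mentions.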
12-09-2016 10:31 PM
No, I don't think Spark will uncache a different dataset when a new one is cached. How are you going to load balance or fail over from one STS to another?
09-13-2016 08:25 PM
1 Kudo
@Kirk Haslbeck Michael is correct: you will get 5 total executors.