About jayadeep_jayara

dkozlowski · ‎06-13-2017

Hi @Jayadeep Jayaraman, that is great. Thanks for letting me know.

jayadeep_jayara · ‎06-12-2017

I resolved it by removing the partition column (the column on which the table is partitioned) from the DataFrame.
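A minimal sketch of that fix, assuming a PySpark DataFrame whose columns include the table's partition column. The helper name `non_partition_columns`, the column `load_date`, and the output path are hypothetical and not taken from the original thread.

```python
def non_partition_columns(columns, partition_columns):
    """Return the columns to keep, in order, leaving out the partition columns.

    Hive column names are case-insensitive, so the comparison ignores case.
    """
    excluded = {c.lower() for c in partition_columns}
    return [c for c in columns if c.lower() not in excluded]


# Plain-Python demonstration with made-up column names.
kept = non_partition_columns(["id", "amount", "load_date"], ["load_date"])
print(kept)

# With a PySpark DataFrame `df` (assumed), the same list could be used as:
#   data_df = df.select(*non_partition_columns(df.columns, ["load_date"]))
#   data_df.write.orc("/hypothetical/table/path/load_date=2017-06-12")
```

The partition value then lives only in the directory name, not in the data files themselves.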

abilgi · ‎12-26-2017

This setting (use.hive.interactive.mode=true) has been available in Hive View 2.0 since its inception in HDP 2.5.

jayadeep_jayara · ‎02-27-2017

I resolved it by adding a few extra lines to the rewrite rules to handle the WebSocket interaction:

RewriteRule ^/ws(.*)$ ws://localhost:9995/ws [P]

LesterMartin · ‎10-06-2016

It is clear you've already done some research on YARN, so I'll try to respond briefly to each question to see if my responses are what you need to proceed.

1. In general, the RM picks a worker node that can run the AM container, so yes, it is fine to visualize it as a round-robin approach.

2. The RM has a subtask that is responsible for watching the AMs and, if one of them dies, can restart its container on another node. The AM itself needs to be written in such a way that it is restartable, but generally speaking the ones we all use (MR, Tez, Spark, etc.) are all restartable. That said, "the client" may or may not be affected, but the "job" itself can run to completion despite an AM failure.

3. It is up to the AM to request from the RM the container sizes it needs (and how many). So, yes, theoretically they could be of different sizes, but often they will be the same size. Spark is a good example, as we usually ask for N containers but want them all to have the same number of cores and amount of memory.

4. I believe the RM is going to ensure all jars are shipped to the needed NodeManager (NM) instances on the worker nodes where your containers will run. I also believe you have some options for pre-placing your jars on HDFS, but it has been a while since I toyed with that, and it was with MapReduce.

Hope this helps your understanding some. Good luck!
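To illustrate point 3 with Spark on YARN, here is a small sketch that builds a spark-submit command asking for N executors of identical size. The flags `--num-executors`, `--executor-cores`, and `--executor-memory` are standard spark-submit options on YARN. The helper name, the application file `my_job.py`, and the sizes are hypothetical.

```python
def spark_submit_args(app, num_executors, executor_cores, executor_memory):
    """Build a spark-submit command that requests equally sized executors on YARN."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--num-executors", str(num_executors),    # how many executor containers
        "--executor-cores", str(executor_cores),  # cores per executor container
        "--executor-memory", executor_memory,     # heap per executor, e.g. "4g"
        app,
    ]


# Hypothetical application and sizes, for illustration only.
args = spark_submit_args("my_job.py", 4, 2, "4g")
print(" ".join(args))
```

Every executor gets the same cores and memory, and the Spark AM turns these settings into its container requests to the RM.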

rajkumar_singh · ‎09-17-2016

@Jayadeep Jayaraman Please see my inline comments.

1. If I want to use separate Tez configurations for Hive, is changing hive-site.xml sufficient?
Answer: Take advantage of Hive session-level property settings, or use --hiveconf to set session-level parameters that override the default Hive/Tez parameters.

2. How to set up Tez on this new node?
Answer: Install tez-client on this new node.

3. How will Hive or any other application know about the Tez that has been newly set up on this node? I am trying to basically understand how an application (Hive, Pig, etc.) is aware of Tez.
Answer: tez.lib.uris points to the Tez jars on HDFS, which tells the Hive client to use the Tez library at runtime. To see this location, you can run set tez.lib.uris; from your client.
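The session-level override from answer 1 and the lookup from answer 3 can be combined in one Hive CLI call. The sketch below only assembles the command line. `--hiveconf` and `-e` are Hive CLI options, while the helper name and the property values are hypothetical.

```python
def hive_command(overrides, query):
    """Build a Hive CLI command with session-level --hiveconf overrides."""
    cmd = ["hive"]
    for key, value in overrides.items():
        # Each override applies to this session only; hive-site.xml is untouched.
        cmd += ["--hiveconf", f"{key}={value}"]
    cmd += ["-e", query]
    return cmd


# Illustrative values only; tune them for your own cluster.
cmd = hive_command(
    {"hive.execution.engine": "tez", "tez.am.resource.memory.mb": "2048"},
    "set tez.lib.uris;",
)
print(" ".join(cmd))
```

Running the resulting command would print the tez.lib.uris value that the client resolves, which is where Hive picks up the Tez libraries at runtime.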

bikas · ‎12-09-2016

No, I don't think Spark will uncache a different dataset when a new one is cached. How are you going to load-balance or fail over from one STS to another?

azeltov · ‎09-13-2016

@Kirk Haslbeck Michael is correct: you will get 5 total executors.

Last Visited	‎03-05-2018 01:41 PM

Member Since	‎09-13-2016 03:55 PM
Posts	31
Kudos received	5

Cloudera Community

Re: Spark 2.1 Hive Partition Adding Issue ORC Form...

Re: Zeppelin Home Page Blank

Re: ORC Table Timestamp PySpark 2.1 CASTIssue

Re: Spark 2.1 Hive Partition Adding Issue ORC Form...

Re: Hive LLAP in Hive View Ambari

Re: Zeppelin Home Page Blank

Re: YARN Questions

Re: Hive + Hiveserver2 + Tez

Re: Integration Spark with Tableau

Re: Spark num-executors setting