Member since: 02-02-2016
Posts: 583
Kudos Received: 518
Solutions: 98
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4191 | 09-16-2016 11:56 AM |
| | 1749 | 09-13-2016 08:47 PM |
| | 6943 | 09-06-2016 11:00 AM |
| | 4175 | 08-05-2016 11:51 AM |
| | 6247 | 08-03-2016 02:58 PM |
06-02-2016
10:09 AM
7 Kudos
@R Wys Can you please try this? Just remove the snapshot by name (you cannot delete the .snapshot directory itself): hdfs dfs -deleteSnapshot /path/ s201605-17-115857.294
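A minimal sketch of the workflow, assuming a hypothetical snapshottable directory /data and an assumed snapshot name (the poster's actual path and snapshot name are not shown here):

```shell
# On a real cluster you would first list the snapshots, then delete one by name:
#   hdfs dfs -ls /data/.snapshot
#   hdfs dfs -deleteSnapshot /data s20160517-115857.294
# (deleteSnapshot takes the directory and the snapshot NAME, not the .snapshot path)

# Sample `hdfs dfs -ls` output line (assumed) — extract just the snapshot name:
listing='drwxr-xr-x - hdfs hdfs 0 2016-05-17 11:58 /data/.snapshot/s20160517-115857.294'
snapshot=$(basename "$(echo "$listing" | awk '{print $NF}')")
echo "$snapshot"
```

The snapshot name is the last path component under .snapshot; that name is what `-deleteSnapshot` expects as its second argument.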
06-01-2016
05:30 PM
1 Kudo
@omar harb There are no separate workers when Spark runs on YARN; it runs as a normal YARN application, in the form of YARN containers. Please refer to the docs below for the Spark-on-YARN architecture. https://spark-summit.org/2014/wp-content/uploads/2014/07/Spark-on-YARN-A-Deep-Dive-Sandy-Ryza.pdf http://spark.apache.org/docs/latest/running-on-yarn.html
06-01-2016
03:52 PM
3 Kudos
@omar harb I think HDP ships Spark on YARN, not standalone mode, unless you installed it manually; that is why you won't find the Spark master UI inside the sandbox. Please start with the doc below. http://hortonworks.com/hadoop-tutorial/a-lap-around-apache-spark/
06-01-2016
12:45 PM
Can you please share your Ambari & HDP version?
06-01-2016
10:07 AM
5 Kudos
Hi @Roberto Sancho Can you test the performance after disabling vectorization and running the query on Tez? set hive.vectorized.execution.enabled=false;
set hive.vectorized.execution.reduce.enabled=false;
set hive.execution.engine=tez;
Also, please refer to the doc below for Hive ORC optimization. https://streever.atlassian.net/wiki/display/HADOOP/Optimizing+ORC+Files+for+Query+Performance
06-01-2016
09:15 AM
@Roberto Sancho Great! :) Please accept the answer that helped you, to close this thread.
05-31-2016
07:56 PM
3 Kudos
@Timothy Spann Did you try this syntax? import org.apache.spark.sql.SaveMode
val Rddtb = objHiveContext.sql("select * from sample")
val dfTable = Rddtb.toDF()
dfTable.write.format("orc").mode(SaveMode.Overwrite).saveAsTable("db1.test1")
05-31-2016
07:32 PM
2 Kudos
@hari kiran You have lots of transmitted-packet errors, which will definitely degrade performance. They could be caused by many issues: a faulty NIC, a faulty cable, a bad RJ45 connector, a duplex mismatch, or other network-layer problems. Please ask your OS team to rectify this and see whether your HDFS writes improve. In the meantime, can you share the duplex setting? ethtool eth0
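A quick way to pull just the duplex mode out of the `ethtool` output; the sample text below is assumed for illustration, not taken from the poster's host:

```shell
# Assumed sample of `ethtool eth0` output (a half-duplex link is a classic
# cause of TX errors on a datanode):
sample='Settings for eth0:
	Speed: 1000Mb/s
	Duplex: Half
	Link detected: yes'

# Extract the value after "Duplex: ":
duplex=$(printf '%s\n' "$sample" | awk -F': ' '/Duplex/ {print $2}')
echo "Duplex mode: $duplex"

# On a real host:  ethtool eth0 | grep -i duplex
```

If this reports Half while the switch port is set to full duplex (or vice versa), that mismatch alone can produce the kind of packet-error counts described above.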
05-31-2016
04:54 PM
2 Kudos
@Sri Bandaru Adding the property below in the following path should solve this error. Ambari -> HDFS -> Config -> Core Site. hadoop.proxyuser.HTTP.groups = *
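For reference, what that Ambari setting renders to in core-site.xml. The `groups` value is from the reply above; the companion `hosts` property is an assumption, included because proxyuser group and host restrictions are commonly configured together (a `*` wildcard allows all, which you may want to tighten):

```xml
<property>
  <name>hadoop.proxyuser.HTTP.groups</name>
  <value>*</value>
</property>
<!-- Assumed companion setting, not stated in the original reply -->
<property>
  <name>hadoop.proxyuser.HTTP.hosts</name>
  <value>*</value>
</property>
```

A restart of HDFS (and dependent services) is needed for proxyuser changes to take effect.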
05-31-2016
04:48 PM
@Roberto Sancho Did you recently add any external jar in the cluster, or specifically in Hive?