01-04-2017 01:07 PM
This was an interesting issue we faced last week. I am putting it here for a bigger audience; it might be helpful to others too.

PROBLEM

On one of the nodes, the datanode and nodemanager were not coming up. Below is the error after starting the datanode from Ambari:

resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start datanode'' returned 1. starting datanode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-datanode-ny-node3.hwxblr.com.out
Error: Could not find or load main class org.apache.hadoop.hdfs.server.datanode.DataNode

Since the datanode process itself never started, nothing was printed in the datanode logs. The only thing we saw in the .out file was:

Error: Could not find or load main class org.apache.hadoop.hdfs.server.datanode.DataNode

We verified that the DataNode class was present in the hadoop-hdfs jar:

/usr/jdk64/jdk1.8.0_77/bin/jar -tvf /usr/hdp/2.5.0.0-1245/hadoop-hdfs/hadoop-hdfs-2.7.3.2.5.0.0-1245.jar | grep DataNode.class
org/apache/hadoop/hdfs/server/datanode/DataNode.class

ROOT CAUSE

@nvadivelu came to the rescue. We used the small utility below to figure out which class was actually missing:

public class Sample {

    public static void main(String[] args) {
        try {
            org.apache.hadoop.hdfs.server.datanode.DataNode.main(args);
        } catch (Throwable ex) {
            ex.printStackTrace();
        }
    }
}

We compiled the above code against the Hadoop classpath, and javac printed the exact class that could not be loaded:

/usr/jdk64/jdk1.8.0_77/bin/javac -cp `hadoop classpath` Sample.java
Sample.java:5: error: cannot access TraceAdminProtocol
            org.apache.hadoop.hdfs.server.datanode.DataNode.main(args);
                                                            ^
  class file for org.apache.hadoop.tracing.TraceAdminProtocol not found
1 error

The TraceAdminProtocol class is supposed to be in the hadoop-common jar. When we grepped for it in the hadoop-common jar on the broken host, we found nothing; on another host, where the datanode was running fine, we got the match below:

grep "TraceAdminProtocol" /usr/hdp/2.5.0.0-1245/hadoop/hadoop-common-2.7.3.2.5.0.0-1245.jar
Binary file /usr/hdp/2.5.0.0-1245/hadoop/hadoop-common-2.7.3.2.5.0.0-1245.jar matches

We also verified that the hadoop-common jar on the broken host was smaller than the working one.

RESOLUTION

We copied the hadoop-common jar from the working host, and the datanode and nodemanager came up fine. We never figured out where the truncated jar came from, given it was the same version, but it was a good learning experience.
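The compile-time trick works because javac has to resolve every type referenced from DataNode, so the first unresolvable dependency is reported by name. The same idea can be applied at runtime; below is a minimal sketch (ClassProbe is a hypothetical helper written for this write-up, not part of the original debugging session) that tries to load any class names passed on the command line:

public class ClassProbe {

    public static void main(String[] args) {
        for (String name : args) {
            try {
                // Class.forName loads and initializes the class, so a missing or
                // truncated dependency surfaces here as ClassNotFoundException
                // or NoClassDefFoundError instead of a silent startup failure
                Class.forName(name);
                System.out.println("OK      " + name);
            } catch (Throwable t) {
                System.out.println("MISSING " + name + " -> " + t);
            }
        }
    }
}

Compile it with javac ClassProbe.java and run it against the same classpath the daemon uses, e.g. java -cp "`hadoop classpath`:." ClassProbe org.apache.hadoop.tracing.TraceAdminProtocol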
05-25-2016 11:54 AM
HDFS can be accessed from R by specifying the namenode host directly (hdfs://<hostname>:/user/test), but in the case of a namenode failover that will stop working. Instead, we should point R at the HDFS config directory so that it recognizes the Hadoop namenode and failover configuration, and then address files through the logical HA nameservice. Set the Spark home and Hadoop config directories in the R environment as below:

# set up SPARK_HOME
Sys.setenv(SPARK_HOME = "/usr/hdp/current/spark-client")
# load the SparkR package shipped with Spark
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
# set up the Hadoop config dirs so the HA nameservice is resolvable
Sys.setenv(YARN_CONF_DIR = "/usr/hdp/current/hadoop-client/conf")
Sys.setenv(HADOOP_CONF_DIR = "/usr/hdp/current/hadoop-client/conf")
# initialize SparkR, pulling in the spark-csv package
sc = sparkR.init(sparkPackages = "com.databricks:spark-csv_2.11:1.0.3")
sqlContext = sparkRSQL.init(sc)
# read the data file through the HA nameservice instead of a single namenode host
people = read.df(sqlContext, "hdfs://HDFS-HA/users/people.json", "json")
head(people)
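The same mechanism works from any Hadoop client, not just SparkR: the logical nameservice URI (hdfs://HDFS-HA here) resolves as long as the HA configs are on the client's classpath. Below is a minimal Java sketch of that idea; HaPathCheck is a hypothetical name, and it reuses the nameservice and file path from the example above.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HaPathCheck {

    public static void main(String[] args) throws Exception {
        // the client configuration is read from the classpath (core-site.xml,
        // hdfs-site.xml), which is where the HDFS-HA nameservice, its namenodes
        // and the failover proxy provider are defined
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://HDFS-HA"), conf);
        System.out.println(fs.exists(new Path("/users/people.json")));
    }
}

Compile and run it with the Hadoop classpath, which already includes /usr/hdp/current/hadoop-client/conf: javac -cp "`hadoop classpath`" HaPathCheck.java, then java -cp "`hadoop classpath`:." HaPathCheck. If it prints true, HA resolution works regardless of which namenode is currently active.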