Member since: 04-20-2016
Posts: 61
Kudos Received: 17
Solutions: 13
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 891 | 06-30-2017 01:16 PM |
| | 897 | 06-30-2017 01:03 PM |
| | 1159 | 06-30-2017 12:50 PM |
| | 1139 | 06-30-2017 12:40 PM |
| | 13224 | 06-30-2017 12:36 PM |
08-21-2017
09:54 AM
LZO is installed and configured correctly on all nodes, and Spark and MR jobs run fine. But we see an exception when running a Spark job as a java-action through Oozie: it cannot load the hadoop native library, even though the library path is set in spark-defaults.conf. The jobs do not even use LZO compression, so it is not clear why the LZO codec is being picked up.
spark.driver.extraLibraryPath=/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
spark.executor.extraLibraryPath=/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
Below is the exception from the YARN application logs:
ERROR lzo.GPLNativeCodeLoader: Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
at java.lang.Runtime.loadLibrary0(Runtime.java:870)
at java.lang.System.loadLibrary(System.java:1122)
at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2147)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2112)
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132)
at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:179)
at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:185)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:234)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:275)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
ERROR lzo.LzoCodec: Cannot load native-lzo without native-hadoop
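For reference, a quick way to check whether the native GPL/LZO library is actually visible on a node is to look on the paths referenced by extraLibraryPath and to run the Hadoop native-library check (a diagnostic sketch only; the paths are the HDP defaults quoted above and may differ on your cluster):
# Verify libgplcompression is present where spark.*.extraLibraryPath points
ls -l /usr/hdp/current/hadoop-client/lib/native/libgplcompression*
ls -l /usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64/libgplcompression*
# Check which native libraries the Hadoop client can load on this node
hadoop checknative -a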
Labels:
- Apache Oozie
- Apache Spark
06-30-2017
01:16 PM
@rakanchi Currently Ambari does not support defining multiple nameservices; it has always assumed hdfs_site['dfs.nameservices'] is a plain string naming a single nameservice. This is a known Ambari bug: https://issues.apache.org/jira/browse/AMBARI-15506, fixed in 2.4.0.
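For illustration, with HDFS federation dfs.nameservices becomes a comma-separated list, which is exactly what older Ambari versions mishandle (a sketch; the nameservice names are made up):
# A single nameservice returns e.g. "ns1"; a federated setup returns "ns1,ns2"
hdfs getconf -confKey dfs.nameservices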
06-30-2017
01:03 PM
@rakanchi This is a known bug: multi-threaded access to CredentialProviderFactory is not thread-safe. I had a similar case with a customer and had to apply the HDFS patch HADOOP-14195.
06-30-2017
12:50 PM
@rakanchi You need to configure storm_jaas.conf with the client sections below and pass storm_jaas.conf to the Storm topology.
StormClient {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
keyTab="/etc/security/keytabs/hdfs.headless.keytab"
storeKey=true
useTicketCache=false
serviceName="nimbus"
principal="hdfs@example.com";
};
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
keyTab="/etc/security/keytabs/hdfs.headless.keytab"
storeKey=true
useTicketCache=false
serviceName="zookeeper"
principal="hdfs@example.com";
};
Then pass the JAAS file to the topology with the -c option:
storm jar /usr/hdp/current/storm-client/contrib/storm-starter/storm-starter-*-jar-with-dependencies.jar storm.starter.WordCountTopology wordcount -c java.security.auth.login.config=/my/custom/jaas/path
Let me know if it helps!
06-30-2017
12:40 PM
@rakanchi It looks like permissions are not set correctly for the topics. Verify that the Kafka policies in Ranger are correct for the ATLAS_HOOK and ATLAS_ENTITIES topics (see https://github.com/emaxwell-hw/Atlas-Ranger-Tag-Security), and grant the 'hive' and 'atlas' users permissions on those Atlas Kafka topics. A non-Ranger sketch of the equivalent grant follows. Let me know if it helps!
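If Kafka authorization is not managed by Ranger, a roughly equivalent grant can be made with the stock Kafka ACL tool (an illustrative sketch only; the ZooKeeper address is a placeholder, and when Ranger is the authorizer the policies should be created in Ranger instead):
# Grant the atlas and hive users access to the Atlas notification topics
/usr/hdp/current/kafka-broker/bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zk-host:2181 --add --allow-principal User:atlas --allow-principal User:hive --operation All --topic ATLAS_HOOK
/usr/hdp/current/kafka-broker/bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zk-host:2181 --add --allow-principal User:atlas --operation All --topic ATLAS_ENTITIES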
06-30-2017
12:36 PM
@rakanchi Spark currently doesn't support the "insert into" feature; you need to create a DataFrame and append it to the table:
var data = sqlContext.createDataFrame(Seq(("ZZ", "m:x", 34.0))).toDF("pv", "metric", "value")
data.show()
data.write.mode("append").saveAsTable("results_test_hive")
println(sqlContext.sql("select * from results_test_hive").count())
06-28-2017
08:45 AM
1 Kudo
@prsingh You need to provide the Databricks spark-csv dependencies: either let Spark download them at run time or pass the jars yourself.
1) Download the dependency at run time:
pyspark --packages com.databricks:spark-csv_2.10:1.2.0
df = sqlContext.read.load('file:///root/file.csv', format='com.databricks.spark.csv', header='true', inferSchema='true')
2) Or pass the jars when starting the shell:
a) Download the jars:
wget http://search.maven.org/remotecontent?filepath=org/apache/commons/commons-csv/1.1/commons-csv-1.1.jar -O commons-csv-1.1.jar
wget http://search.maven.org/remotecontent?filepath=com/databricks/spark-csv_2.10/1.0.0/spark-csv_2.10-1.0.0.jar -O spark-csv_2.10-1.0.0.jar
b) Start the PySpark shell with the jars:
./bin/pyspark --jars "spark-csv_2.10-1.0.0.jar,commons-csv-1.1.jar"
c) Load the file as a DataFrame:
df = sqlContext.read.load('file:///root/file.csv', format='com.databricks.spark.csv', header='true', inferSchema='true')
Let me know if the above helps!
06-28-2017
08:26 AM
Facing issues with an Oozie workflow: it is failing after we changed the schema on one of the tables.
ERROR SchemaCheckXCommand:517 -
Found [1] extra indexes for columns in table [WF_ACTIONS]: [wf_id]
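For context, the unexpected index can be inspected directly in the Oozie database (a diagnostic sketch assuming a MySQL-backed Oozie schema; the database and user names are placeholders):
# List the indexes on WF_ACTIONS to find the extra one on wf_id
mysql -u oozie -p -e "SHOW INDEX FROM WF_ACTIONS;" oozie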
Labels:
- Apache Oozie
03-31-2017
02:52 PM
Thanks @krajguru, that was exactly what I was looking for 🙂
03-31-2017
02:46 PM
How can I list the existing variable names and their values used across the various Ambari configurations through the REST API, for example {{namenode_heapsize}}?
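For reference, individual configuration types (for example hadoop-env, which is where namenode_heapsize lives) can be read through the Ambari configurations endpoint (a sketch; host, credentials, cluster name and tag are placeholders):
# List available versions (tags) of a config type
curl -u admin:admin "http://ambari-host:8080/api/v1/clusters/MYCLUSTER/configurations?type=hadoop-env"
# Fetch the properties of a specific tag
curl -u admin:admin "http://ambari-host:8080/api/v1/clusters/MYCLUSTER/configurations?type=hadoop-env&tag=version1"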
Labels:
- Apache Ambari