Member since: 05-30-2017
Posts: 22
Kudos Received: 2
Solutions: 1
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 444 | 08-16-2018 05:44 AM |
08-26-2019
11:09 PM
Thank you very much for your inputs.
08-26-2019
10:28 PM
@jsensharma Thank you for your quick reply. It would be very helpful if you could explain the changes a bit more: do you mean only POM dependency entry changes, or actual code changes? Do I need to rewrite any functions?
08-26-2019
09:40 PM
Greetings,
We are using HDP 2.6.5 mainly for the Storm, Kafka, ZooKeeper, and Spark services. We have developed our application in Java and deploy it using JBoss EAP.
If I change from HDP 2.6.5 to CDH 6.3, or upgrade to HDP 3.1.0, do I need to change anything in the JBoss configuration or from a JBoss development perspective? Will there be any development changes on the JBoss and Java side if all my Storm and Kafka code is in Java?
How will the JBoss application server use the HDP/CDH services?
01-11-2019
07:42 AM
Dear All, I have a 5-node HDP 3.1.0 cluster with Ambari 2.7.3, with the HDFS, Hive, HBase, and Spark services installed:
ser1.dev.local - HDFS, YARN
ser2.dev.local - Hive
ser3.dev.local - HBase
ser4.dev.local - Zookeeper
ser5.dev.local - Spark
I have 2 workstations, one for development and the other hosting MongoDB:
cpu1.dev.local - Spark client, Anaconda, Python, Jupyter Notebook
cpu2.dev.local - MongoDB
I installed the Spark client on the development workstation to access HDFS and Spark on the cluster using the following command: sudo yum install spark2_3_1_0_0_78*
I copied all configuration files from the cluster nodes to the workstation, and I am able to connect to Spark and retrieve data from the HDFS cluster.
The following is the code I am using to connect to MongoDB with PySpark:
import pyspark
from pyspark.sql import SQLContext
from pyspark.sql import SparkSession

sparkConf = pyspark.SparkConf().setMaster("spark://ser5.dev.local:7077") \
    .setAppName("SparkSr638").setAll([('spark.executor.memory', '16g'),
    ('spark.executor.cores', '8'), ('spark.cores.max', '32'),
    ('spark.driver.memory', '16g'), ('spark.driver.maxResultSize', '3g')])
sparkConf.set("spark.mongodb.input.uri", "mongodb://cpu2.dev.local/gkfrm.DayTime")
sc = pyspark.SparkContext(conf=sparkConf)
sqlContext = SQLContext(sc)
df = sqlContext.read.format("com.mongodb.spark.sql.DefaultSource").load()
As shown above, I created the Spark application with the following configuration: Spark executor memory: 16 GB, allocated memory: 16 GB, cores allocated: 24. The rest of the configuration is the default set by HDP and Ambari. The database I connect to has around 6,000,000 records.
Now my question: when I run df.collect() in a Jupyter notebook using PySpark, it takes around 3-4 hours. Why is it taking so much time to process the data? Is there any configuration that needs tweaking in HDP, Spark, or YARN?
Also, just to mention, after installing YARN and HBase, Ambari shows one warning for YARN TIMELINE SERVICE V2.0 READER (ATSv2 HBase Application): "The HBase application reported a 'STARTED' state. Check took 2.253s." Does Spark performance depend on this warning, and how can I resolve it? How should I boost Spark performance? Please help.
Thank you in advance.
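For reference, one likely reason a plain df.collect() on roughly 6,000,000 documents takes hours is that every record is serialized and shipped to the single driver process, so executor memory and core settings barely help. A minimal PySpark sketch of the usual alternative is below; the column names ("day", "value") are hypothetical placeholders since the real schema is not shown, and this is an illustration rather than a verified fix for this cluster:
# Keep the work on the executors and only bring back what is needed,
# instead of collecting the whole collection to the driver.
subset = (df
          .select("day", "value")               # project only the columns you need
          .filter(df["day"] == "2019-01-01"))   # the Mongo connector can push simple filters down

print(subset.count())      # an action that stays distributed
subset.limit(20).show()    # inspect a small sample instead of collecting everything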
08-28-2018
07:12 AM
Hi, we have a 5-node HDP 3.0 cluster with Spark 2.3.1 installed on YARN. The structure is as follows: Spark master: ser5.dev.local:7077; the other 4 nodes are used as worker nodes. We are trying to run a K-Means model using the pyspark.ml.clustering.KMeans package. I am able to run the model, and when I view the columns of the dataset I can see the newly created prediction column, but when I try to show the DataFrame in which the prediction column was created it gives me an error. Please find the error log below: Py4JJavaError: An error occurred while calling o188.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 30.0 failed 4 times, most recent failure: Lost task 0.3 in stage 30.0 (TID 225, ser2.dev.local, executor 2): java.io.InvalidClassException: org.apache.spark.ml.PipelineStage; local class incompatible: stream classdesc serialVersionUID = 3275105016155696140, local class serialVersionUID = 7330592925129616646
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:616)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1781)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1714)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:80)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1602)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1590)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1589)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1589)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1823)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1772)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1761)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2074)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:363)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3273)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2484)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2484)
at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3254)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3253)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2484)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2698)
at org.apache.spark.sql.Dataset.showString(Dataset.scala:254)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.InvalidClassException: org.apache.spark.ml.PipelineStage; local class incompatible: stream classdesc serialVersionUID = 3275105016155696140, local class serialVersionUID = 7330592925129616646
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:616)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1781)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1714)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:80)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Can anyone tell me what configuration I am missing? Please help. Note: the same code runs fine on a Spark cluster configured without YARN; it does not run on Spark 2.3.1 with YARN installed in HDP 3.0.
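In case it helps, a quick diagnostic sketch rather than a fix: an InvalidClassException reporting two different serialVersionUID values for org.apache.spark.ml.PipelineStage usually means the Spark build used by the driver (for example a pip-installed pyspark) is not the same build that the YARN executors load from the HDP cluster, so comparing the two is a reasonable first check (the HDP client path mentioned below is an assumption):
import pyspark

sc = pyspark.SparkContext.getOrCreate()
print("driver Spark version:", sc.version)         # version of the jars the driver launched with
print("driver pyspark module:", pyspark.__file__)  # shows whether a pip-installed pyspark is being used
# Compare these with the cluster build, e.g. the version reported by the pyspark shell
# launched from /usr/hdp/current/spark2-client on a cluster node.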
08-16-2018
05:44 AM
HDF 3.2 has been released and can be installed with Ambari 2.7. Thank you, Hortonworks.
08-16-2018
05:42 AM
Thank you very much @Felix Albani. I copied yarn-site.xml, core-site.xml, and hdfs-site.xml to the standalone Spark instance, started Spark on HDP, and the connection was established successfully. The issue is resolved. Thanks.
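For reference, a minimal sketch of how a remote PySpark client typically picks up such copied configuration files; the directory path and the yarn master below are assumptions for illustration, not the exact commands used here:
import os
import pyspark

# Hypothetical directory holding the copied yarn-site.xml / core-site.xml / hdfs-site.xml.
os.environ["HADOOP_CONF_DIR"] = "/etc/hadoop/conf"
os.environ["YARN_CONF_DIR"] = "/etc/hadoop/conf"

# With the configs visible, the application can be submitted to the cluster via YARN.
conf = pyspark.SparkConf().setMaster("yarn").setAppName("RemoteClientTest")
sc = pyspark.SparkContext(conf=conf)
print(sc.master)  # should report "yarn" once the context is up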
08-10-2018
06:48 AM
I have installed an HDP 3.0 cluster on 5 nodes and installed Spark 2.3.1 as an Ambari service on one of the nodes; the Spark node is ser5.dev.local. I am trying to access this Spark installation from another system that is not part of the cluster, say cpu686.dev.local, using PySpark in a Jupyter notebook. Please find the code below for reference:
import csv
import pyspark
from pyspark.sql import SQLContext

conf = pyspark.SparkConf().setMaster("spark://ser5.dev.local:7077").setAppName("SparkServer1").setAll([('spark.executor.memory', '16g'), ('spark.executor.cores', '8'), ('spark.cores.max', '8'), ('spark.driver.memory', '16g')])
sc = pyspark.SparkContext(conf=conf)
rddFile = sc.textFile("Filterd_data.csv")
rddFile = rddFile.mapPartitions(lambda x: csv.reader(x))
rddFile.collect()
All connections are in place: the Spark context is created with the spark://ser5.dev.local:7077 URL and the RDD rddFile is created successfully, but when I run rddFile.collect() it just keeps running, with no output and no error. We even tried a CSV file with fewer than 10 records, and it still kept running. Is there any way to configure Spark for this, or somewhere I can get the master URL to check the running application in Spark? When I click on the Spark UI in Ambari it opens the Spark History Server. We also tried loading the CSV file from HDFS using the following code:
conf = pyspark.SparkConf().setMaster("spark://ser5.dev.local:7077").setAppName("SparkServer1").setAll([('spark.executor.memory', '16g'), ('spark.executor.cores', '8'), ('spark.cores.max', '8'), ('spark.driver.memory', '16g')])
sc = pyspark.SparkContext(conf=conf)
sqlC = SQLContext(sc)
df = sqlC.read.csv("hdfs://ser2.dev.local:8020/UnusualTime/Filterd_data.csv")
The issue remains the same. Note: I installed Spark using the following documentation: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/installing-spark/content/installing_spark_using_ambari.html
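For anyone debugging the same hang, a small diagnostic sketch (using the sc created in the code above; these SparkContext attributes are standard in Spark 2.x, and this is only a starting point, not a fix):
# Print where the driver thinks it is connected and where its web UI lives.
# If the master URL points at a standalone master that is not actually running
# (the Ambari-installed Spark2 normally runs on YARN), collect() can appear to
# hang while the application waits indefinitely for executors.
print("master:", sc.master)
print("driver UI:", sc.uiWebUrl)
print("default parallelism:", sc.defaultParallelism)  # rough hint of how many cores were granted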
07-20-2018
01:12 PM
I will check whether I can make the application work with the current version of Storm. Thank you for your comment.
07-20-2018
12:21 PM
There are some version compatibility issues with the application, so we want to install lower versions.
07-20-2018
11:04 AM
I have two questions: 1) How do I install HDF 3.1.2 on Ambari 2.7? The window gets stuck when I enter the cluster name and click the Next button, and it does not ask for a version the way it does for HDP 3.0. 2) Are services installed from HDF 3.2.1 on Ambari 2.6.x compatible with services installed on HDP 3.0 with Ambari 2.7?
Labels:
- Apache Ambari
- Cloudera DataFlow (CDF)
07-17-2018
08:01 AM
Thank you, @Harald Berghoff. I will surely check that.
03-14-2018
10:42 AM
Hi all, I have installed an Ambari 2.6.1 with HDP 2.6.4 cluster on 4 nodes, in which Storm 1.1 is available, but I want Storm 0.10.2 installed in HDP 2.6.4. How do I downgrade Storm 1.1.0 to Storm 0.10.2 on my HDP 2.6.4 cluster? Please suggest how to install Storm 0.10.2 on HDP 2.6.4.
Labels:
- Apache Ambari
- Apache Storm
02-16-2018
03:03 PM
@bmasna Thank you so much. I removed everything related to storm-slider, including every storm-slider folder, and checked the repo file as well, and the issue got resolved.
02-13-2018
04:37 AM
I have a 4-node HDP cluster with a 3-node HDF cluster, and Streaming Analytics Manager (SAM) is installed. I want to create a UDAF (User Defined Aggregate Function), but I am confused about how to write the UDAF code. Which programming language should I use to create the JAR file for the UDAF? I am familiar with Python, Scala, and R. Please suggest some links to refer to.
Labels:
- Cloudera DataFlow (CDF)
01-31-2018
01:26 PM
I have installed HDP 2.6.4 with Ambari 2.6.1 on a 6-node cluster. I am trying to install the ZooKeeper, Kafka, HDFS, and Hive services, but the installation throws the following error while installing the clients: Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/SLIDER/0.60.0.2.2/package/scripts/slider_client.py", line 62, in <module>
SliderClient().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 375, in execute
method(env)
File "/var/lib/ambari-agent/cache/common-services/SLIDER/0.60.0.2.2/package/scripts/slider_client.py", line 45, in install
self.install_packages(env)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 811, in install_packages
name = self.format_package_name(package['name'])
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 546, in format_package_name
raise Fail("Cannot match package for regexp name {0}. Available packages: {1}".format(name, self.available_packages_in_repos))
resource_management.core.exceptions.Fail: Cannot match package for regexp name storm_${stack_version}-slider-client. Available packages: ['accumulo', 'accumulo-conf-standalone', 'accumulo-source', 'accumulo_2_6_4_0_91', 'accumulo_2_6_4_0_91-conf-standalone', 'accumulo_2_6_4_0_91-source', 'atlas-metadata', 'atlas-metadata-falcon-plugin', 'atlas-metadata-hive-plugin', 'atlas-metadata-sqoop-plugin', 'atlas-metadata-storm-plugin', 'atlas-metadata_2_6_4_0_91', 'atlas-metadata_2_6_4_0_91-falcon-plugin', 'atlas-metadata_2_6_4_0_91-sqoop-plugin', 'atlas-metadata_2_6_4_0_91-storm-plugin', 'bigtop-tomcat', 'datafu', 'druid', 'druid_2_6_4_0_91', 'falcon', 'falcon-doc', 'falcon_2_6_4_0_91', 'falcon_2_6_4_0_91-doc', 'flume', 'flume-agent', 'flume_2_6_4_0_91', 'flume_2_6_4_0_91-agent', 'hadoop', 'hadoop-client', 'hadoop-conf-pseudo', 'hadoop-doc', 'hadoop-hdfs', 'hadoop-hdfs-datanode', 'hadoop-hdfs-fuse', 'hadoop-hdfs-journalnode', 'hadoop-hdfs-namenode', 'hadoop-hdfs-secondarynamenode', 'hadoop-hdfs-zkfc', 'hadoop-httpfs', 'hadoop-httpfs-server', 'hadoop-libhdfs', 'hadoop-mapreduce', 'hadoop-mapreduce-historyserver', 'hadoop-source', 'hadoop-yarn', 'hadoop-yarn-nodemanager', 'hadoop-yarn-proxyserver', 'hadoop-yarn-resourcemanager', 'hadoop-yarn-timelineserver', 'hadoop_2_6_4_0_91-conf-pseudo', 'hadoop_2_6_4_0_91-doc', 'hadoop_2_6_4_0_91-hdfs-datanode', 'hadoop_2_6_4_0_91-hdfs-fuse', 'hadoop_2_6_4_0_91-hdfs-journalnode', 'hadoop_2_6_4_0_91-hdfs-namenode', 'hadoop_2_6_4_0_91-hdfs-secondarynamenode', 'hadoop_2_6_4_0_91-hdfs-zkfc', 'hadoop_2_6_4_0_91-httpfs', 'hadoop_2_6_4_0_91-httpfs-server', 'hadoop_2_6_4_0_91-mapreduce-historyserver', 'hadoop_2_6_4_0_91-source', 'hadoop_2_6_4_0_91-yarn-nodemanager', 'hadoop_2_6_4_0_91-yarn-proxyserver', 'hadoop_2_6_4_0_91-yarn-resourcemanager', 'hadoop_2_6_4_0_91-yarn-timelineserver', 'hbase', 'hbase-doc', 'hbase-master', 'hbase-regionserver', 'hbase-rest', 'hbase-thrift', 'hbase-thrift2', 'hbase_2_6_4_0_91', 'hbase_2_6_4_0_91-doc', 'hbase_2_6_4_0_91-master', 'hbase_2_6_4_0_91-regionserver', 'hbase_2_6_4_0_91-rest', 'hbase_2_6_4_0_91-thrift', 'hbase_2_6_4_0_91-thrift2', 'hive', 'hive-hcatalog', 'hive-hcatalog-server', 'hive-jdbc', 'hive-metastore', 'hive-server', 'hive-server2', 'hive-webhcat', 'hive-webhcat-server', 'hive2', 'hive2-jdbc', 'hive_2_6_4_0_91-hcatalog-server', 'hive_2_6_4_0_91-metastore', 'hive_2_6_4_0_91-server', 'hive_2_6_4_0_91-server2', 'hive_2_6_4_0_91-webhcat-server', 'hue', 'hue-beeswax', 'hue-common', 'hue-hcatalog', 'hue-oozie', 'hue-pig', 'hue-server', 'kafka', 'knox', 'knox_2_6_4_0_91', 'livy', 'livy2', 'livy2_2_6_4_0_91', 'livy_2_6_4_0_91', 'mahout', 'mahout-doc', 'mahout_2_6_4_0_91', 'mahout_2_6_4_0_91-doc', 'oozie', 'oozie-client', 'oozie-common', 'oozie-sharelib', 'oozie-sharelib-distcp', 'oozie-sharelib-hcatalog', 'oozie-sharelib-hive', 'oozie-sharelib-hive2', 'oozie-sharelib-mapreduce-streaming', 'oozie-sharelib-pig', 'oozie-sharelib-spark', 'oozie-sharelib-sqoop', 'oozie-webapp', 'oozie_2_6_4_0_91', 'oozie_2_6_4_0_91-client', 'oozie_2_6_4_0_91-common', 'oozie_2_6_4_0_91-sharelib', 'oozie_2_6_4_0_91-sharelib-distcp', 'oozie_2_6_4_0_91-sharelib-hcatalog', 'oozie_2_6_4_0_91-sharelib-hive', 'oozie_2_6_4_0_91-sharelib-hive2', 'oozie_2_6_4_0_91-sharelib-mapreduce-streaming', 'oozie_2_6_4_0_91-sharelib-pig', 'oozie_2_6_4_0_91-sharelib-spark', 'oozie_2_6_4_0_91-sharelib-sqoop', 'oozie_2_6_4_0_91-webapp', 'phoenix', 'phoenix_2_6_4_0_91', 'pig', 'ranger-admin', 'ranger-atlas-plugin', 'ranger-hbase-plugin', 'ranger-hdfs-plugin', 
'ranger-hive-plugin', 'ranger-kafka-plugin', 'ranger-kms', 'ranger-knox-plugin', 'ranger-solr-plugin', 'ranger-storm-plugin', 'ranger-tagsync', 'ranger-usersync', 'ranger-yarn-plugin', 'ranger_2_6_4_0_91-admin', 'ranger_2_6_4_0_91-atlas-plugin', 'ranger_2_6_4_0_91-hbase-plugin', 'ranger_2_6_4_0_91-kms', 'ranger_2_6_4_0_91-knox-plugin', 'ranger_2_6_4_0_91-solr-plugin', 'ranger_2_6_4_0_91-storm-plugin', 'ranger_2_6_4_0_91-tagsync', 'ranger_2_6_4_0_91-usersync', 'shc', 'shc_2_6_4_0_91', 'slider', 'spark', 'spark-master', 'spark-python', 'spark-worker', 'spark-yarn-shuffle', 'spark2', 'spark2-master', 'spark2-python', 'spark2-worker', 'spark2-yarn-shuffle', 'spark2_2_6_4_0_91', 'spark2_2_6_4_0_91-master', 'spark2_2_6_4_0_91-python', 'spark2_2_6_4_0_91-worker', 'spark_2_6_4_0_91', 'spark_2_6_4_0_91-master', 'spark_2_6_4_0_91-python', 'spark_2_6_4_0_91-worker', 'spark_llap', 'spark_llap_2_6_4_0_91', 'sqoop', 'sqoop-metastore', 'sqoop_2_6_4_0_91', 'sqoop_2_6_4_0_91-metastore', 'storm', 'storm-slider-client', 'storm_2_6_4_0_91', 'superset', 'superset_2_6_4_0_91', 'tez', 'tez_hive2', 'zeppelin', 'zeppelin_2_6_4_0_91', 'zookeeper', 'zookeeper-server', 'hadooplzo', 'hadooplzo-native', 'hadooplzo_2_6_4_0_91', 'hadooplzo_2_6_4_0_91-native', 'openblas', 'openblas-Rblas', 'openblas-devel', 'openblas-openmp', 'openblas-openmp64', 'openblas-openmp64_', 'openblas-serial64', 'openblas-serial64_', 'openblas-static', 'openblas-threads', 'openblas-threads64', 'openblas-threads64_', 'snappy', 'snappy-devel', 'atlas-metadata_2_6_4_0_91-hive-plugin', 'bigtop-jsvc', 'datafu_2_6_4_0_91', 'hadoop_2_6_4_0_91', 'hadoop_2_6_4_0_91-client', 'hadoop_2_6_4_0_91-hdfs', 'hadoop_2_6_4_0_91-libhdfs', 'hadoop_2_6_4_0_91-mapreduce', 'hadoop_2_6_4_0_91-yarn', 'hdp-select', 'hive2_2_6_4_0_91', 'hive2_2_6_4_0_91-jdbc', 'hive_2_6_4_0_91', 'hive_2_6_4_0_91-hcatalog', 'hive_2_6_4_0_91-jdbc', 'hive_2_6_4_0_91-webhcat', 'kafka_2_6_4_0_91', 'pig_2_6_4_0_91', 'ranger_2_6_4_0_91-hdfs-plugin', 'ranger_2_6_4_0_91-hive-plugin', 'ranger_2_6_4_0_91-kafka-plugin', 'ranger_2_6_4_0_91-yarn-plugin', 'slider_2_6_4_0_91', 'spark2_2_6_4_0_91-yarn-shuffle', 'spark_2_6_4_0_91-yarn-shuffle', 'tez_2_6_4_0_91', 'tez_hive2_2_6_4_0_91', 'zookeeper_2_6_4_0_91', 'zookeeper_2_6_4_0_91-server'] above error is occurred while installing slider client. same error occurred while installing every client service like zookeeper client, hdfs client, and so on. Kindly suggest on the above.
06-27-2017
07:45 AM
2 Kudos
Hi All, I have installed Ambari 2.5 with HDP 2.6 with some services such as HDFS, ZooKeeper, and Kafka. Now the requirement is to install some open-source applications, such as Apache Flink 1.3 and Apache Ignite 2.0, as services in my installed environment (Ambari 2.5 with HDP 2.6). Can anyone help me do so? Is there any way to install these services on HDP 2.6?
06-07-2017
12:22 PM
@Palanivelrajan Chellakutty Thanks for your comment. I changed the repository file and the issue is resolved. Thank you very much for your help.
05-31-2017
05:50 AM
I have tried setting up a local repository, and I have also installed ambari.repo from the Hortonworks site, but no luck; the same error comes up again and again. Please help.
05-30-2017
08:33 AM
stderr: /var/lib/ambari-agent/data/errors-836.txt
Failed to execute command: /usr/sbin/hst activity-analyzer setup root:root '/etc/rc.d/init.d'; Exit code: 127; stdout: ; stderr: /usr/sbin/hst: line 322: install-activity-analyzer.sh: command not found
stdout: /var/lib/ambari-agent/data/output-836.txt
2017-05-30 13:47:55,937 - Stack Feature Version Info: stack_version=2.6, version=None, current_cluster_version=None -> 2.6
2017-05-30 13:47:55,937 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
User Group mapping (user_group) is missing in the hostLevelParams
2017-05-30 13:47:55,938 - Group['livy'] {}
2017-05-30 13:47:55,939 - Group['spark'] {}
2017-05-30 13:47:55,939 - Group['hadoop'] {}
2017-05-30 13:47:55,939 - Group['users'] {}
2017-05-30 13:47:55,939 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-05-30 13:47:55,939 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-05-30 13:47:55,940 - User['infra-solr'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-05-30 13:47:55,940 - User['ams'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-05-30 13:47:55,940 - User['tez'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['users']}
2017-05-30 13:47:55,941 - User['accumulo'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-05-30 13:47:55,941 - User['livy'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-05-30 13:47:55,942 - User['spark'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-05-30 13:47:55,942 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['users']}
2017-05-30 13:47:55,942 - User['kafka'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-05-30 13:47:55,943 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-05-30 13:47:55,943 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-05-30 13:47:55,943 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-05-30 13:47:55,944 - User['hcat'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-05-30 13:47:55,944 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2017-05-30 13:47:55,945 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2017-05-30 13:47:55,950 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] due to not_if
2017-05-30 13:47:55,950 - Group['hduser'] {}
2017-05-30 13:47:55,951 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'hduser']}
2017-05-30 13:47:55,951 - FS Type:
2017-05-30 13:47:55,951 - Directory['/etc/hadoop'] {'mode': 0755}
2017-05-30 13:47:55,963 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2017-05-30 13:47:55,964 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 01777}
2017-05-30 13:47:55,975 - Initializing 2 repositories
2017-05-30 13:47:55,976 - Repository['HDP-2.6'] {'base_url': 'http://172.65.0.188/HDP/centos6', 'action': ['create'], 'components': ['HDP', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP', 'mirror_list': None}
2017-05-30 13:47:55,981 - File['/etc/yum.repos.d/HDP.repo'] {'content': '[HDP-2.6]\nname=HDP-2.6\nbaseurl=http://172.21.0.188/HDP/centos6\n\npath=/\nenabled=1\ngpgcheck=0'}
2017-05-30 13:47:55,981 - Repository['HDP-UTILS-1.1.0.21'] {'base_url': 'http://172.65.0.188/HDP-UTILS/', 'action': ['create'], 'components': ['HDP-UTILS', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP-UTILS', 'mirror_list': None}
2017-05-30 13:47:55,985 - File['/etc/yum.repos.d/HDP-UTILS.repo'] {'content': '[HDP-UTILS-1.1.0.21]\nname=HDP-UTILS-1.1.0.21\nbaseurl=http://172.65.0.188/HDP-UTILS/\n\npath=/\nenabled=1\ngpgcheck=0'}
2017-05-30 13:47:55,985 - Package['unzip'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2017-05-30 13:47:56,044 - Skipping installation of existing package unzip
2017-05-30 13:47:56,044 - Package['curl'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2017-05-30 13:47:56,052 - Skipping installation of existing package curl
2017-05-30 13:47:56,052 - Package['hdp-select'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2017-05-30 13:47:56,061 - Skipping installation of existing package hdp-select
installing using command: {sudo} rpm -qa | grep smartsense- || {sudo} yum -y install smartsense-hst || {sudo} rpm -i /var/lib/ambari-agent/cache/stacks/HDP/2.1/services/SMARTSENSE/package/files/rpm/*.rpm
Command: rpm -qa | grep smartsense- || yum -y install smartsense-hst || rpm -i /var/lib/ambari-agent/cache/stacks/HDP/2.1/services/SMARTSENSE/package/files/rpm/*.rpm
Exit code: 0
Std Out: smartsense-hst-1.4.0.2.5.0.3-7.x86_64
Std Err: None
('ignore_groupsusers_create', False)
2017-05-30 13:47:56,712 - User['activity_analyzer'] {'gid': 'hadoop'}
Created user without additional group hdfs
Deploying activity analyzer
Command: /usr/sbin/hst activity-analyzer setup root:root '/etc/rc.d/init.d'
Exit code: 127
Std Out: None
Std Err: /usr/sbin/hst: line 322: install-activity-analyzer.sh: command not found
Command failed after 1 tries
Labels:
- Apache Ambari