Member since: 09-17-2016
Posts: 31
Kudos Received: 2
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1482 | 03-23-2020 11:38 PM
 | 12892 | 07-27-2018 08:45 AM
 | 2264 | 05-09-2018 08:28 AM
 | 725 | 10-21-2016 06:29 AM
03-25-2022 06:20 AM
Hi @JLo_Hernandez, I have the same question as you. If you have found the answer, please let me know. Thanks
03-23-2020 11:38 PM
Solved this issue by running the commands below on the corresponding node. You need root/sudo access for this.

1) yum list installed | grep spark2
2) yum-complete-transaction
3) yum remove spark2*
4) Go to Ambari and install the Spark2 client again. If it still fails, refresh the Tez config and then retry step 4.

This issue can happen for almost any component when a yum transaction is interrupted or killed.

The output of yum remove spark2* looks like this:

Removed:
  spark2.noarch 0:2.3.2.3.1.0.0-78.el7
  spark2_3_0_0_0_1634-yarn-shuffle.noarch 0:2.3.1.3.0.0.0-1634
  spark2_3_1_0_0_78.noarch 0:2.3.2.3.1.0.0-78
  spark2_3_1_0_0_78-master.noarch 0:2.3.2.3.1.0.0-78
  spark2_3_1_0_0_78-python.noarch 0:2.3.2.3.1.0.0-78
  spark2_3_1_0_0_78-worker.noarch 0:2.3.2.3.1.0.0-78
  spark2_3_1_0_0_78-yarn-shuffle.noarch 0:2.3.2.3.1.0.0-78
  spark2_3_1_4_0_315.noarch 0:2.3.2.3.1.4.0-315
  spark2_3_1_4_0_315-python.noarch 0:2.3.2.3.1.4.0-315
  spark2_3_1_4_0_315-yarn-shuffle.noarch 0:2.3.2.3.1.4.0-315

Dependency Removed:
  datafu_3_0_0_0_1634.noarch 0:1.3.0.3.0.0.0-1634
  hadoop_3_0_0_0_1634.x86_64 0:3.1.0.3.0.0.0-1634
  hadoop_3_0_0_0_1634-client.x86_64 0:3.1.0.3.0.0.0-1634
  hadoop_3_0_0_0_1634-hdfs.x86_64 0:3.1.0.3.0.0.0-1634
  hadoop_3_0_0_0_1634-libhdfs.x86_64 0:3.1.0.3.0.0.0-1634
  hadoop_3_0_0_0_1634-mapreduce.x86_64 0:3.1.0.3.0.0.0-1634
  hadoop_3_0_0_0_1634-yarn.x86_64 0:3.1.0.3.0.0.0-1634
  hadoop_3_1_0_0_78.x86_64 0:3.1.1.3.1.0.0-78
  hadoop_3_1_0_0_78-client.x86_64 0:3.1.1.3.1.0.0-78
  hadoop_3_1_0_0_78-hdfs.x86_64 0:3.1.1.3.1.0.0-78
  hadoop_3_1_0_0_78-libhdfs.x86_64 0:3.1.1.3.1.0.0-78
  hadoop_3_1_0_0_78-mapreduce.x86_64 0:3.1.1.3.1.0.0-78
  hadoop_3_1_0_0_78-yarn.x86_64 0:3.1.1.3.1.0.0-78
  hadoop_3_1_4_0_315.x86_64 0:3.1.1.3.1.4.0-315
  hadoop_3_1_4_0_315-client.x86_64 0:3.1.1.3.1.4.0-315
  hadoop_3_1_4_0_315-hdfs.x86_64 0:3.1.1.3.1.4.0-315
  hadoop_3_1_4_0_315-libhdfs.x86_64 0:3.1.1.3.1.4.0-315
  hadoop_3_1_4_0_315-mapreduce.x86_64 0:3.1.1.3.1.4.0-315
  hadoop_3_1_4_0_315-yarn.x86_64 0:3.1.1.3.1.4.0-315
  hbase_3_0_0_0_1634.noarch 0:2.0.0.3.0.0.0-1634
  hbase_3_1_0_0_78.noarch 0:2.0.2.3.1.0.0-78
  hbase_3_1_4_0_315.noarch 0:2.0.2.3.1.4.0-315
  hive_3_0_0_0_1634.noarch 0:3.1.0.3.0.0.0-1634
  hive_3_0_0_0_1634-hcatalog.noarch 0:3.1.0.3.0.0.0-1634
  hive_3_0_0_0_1634-jdbc.noarch 0:3.1.0.3.0.0.0-1634
  hive_3_0_0_0_1634-webhcat.noarch 0:3.1.0.3.0.0.0-1634
  hive_3_1_0_0_78.noarch 0:3.1.0.3.1.0.0-78
  hive_3_1_0_0_78-hcatalog.noarch 0:3.1.0.3.1.0.0-78
  hive_3_1_0_0_78-jdbc.noarch 0:3.1.0.3.1.0.0-78
  hive_3_1_4_0_315.noarch 0:3.1.0.3.1.4.0-315
  hive_3_1_4_0_315-hcatalog.noarch 0:3.1.0.3.1.4.0-315
  hive_3_1_4_0_315-jdbc.noarch 0:3.1.0.3.1.4.0-315
  livy2_3_1_0_0_78.noarch 0:0.5.0.3.1.0.0-78
  livy2_3_1_4_0_315.noarch 0:0.5.0.3.1.4.0-315
  pig_3_0_0_0_1634.noarch 0:0.16.0.3.0.0.0-1634
  tez_3_0_0_0_1634.noarch 0:0.9.1.3.0.0.0-1634
  tez_3_1_0_0_78.noarch 0:0.9.1.3.1.0.0-78
  tez_3_1_4_0_315.noarch 0:0.9.1.3.1.4.0-315

Installing package spark2_3_1_0_0_78 ('/usr/bin/yum -y install spark2_3_1_0_0_78')
11-04-2019 11:36 PM
You need to stop the Ambari Metrics service via Ambari and then remove all its temp files. Go to the Ambari Metrics Collector service host and execute the command below:

mv /var/lib/ambari-metrics-collector /tmp/ambari-metrics-collector_OLD

Now restart the AMS service, and Ambari Metrics should be healthy again.
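The stop/move/restart sequence above can be sketched as the script below. It is a demo, not the real commands: AMS_DIR defaults to a scratch path under /tmp so it runs without root, and the mkdir stands in for the collector's existing data. On a real host the directory is /var/lib/ambari-metrics-collector and the service must be stopped in Ambari first.

```shell
# AMS_DIR is the collector's data directory; /tmp default keeps this demo
# runnable without root (real path: /var/lib/ambari-metrics-collector).
AMS_DIR="${AMS_DIR:-/tmp/ams-demo/ambari-metrics-collector}"

# demo setup only: stand in for the collector's existing data
rm -rf "${AMS_DIR}_OLD"
mkdir -p "$AMS_DIR"

# 1) stop the Ambari Metrics service from the Ambari UI first
# 2) move the collector's data aside (the mv from the post)
mv "$AMS_DIR" "${AMS_DIR}_OLD"

# 3) restart AMS; the collector re-creates its data directory from scratch,
#    and the _OLD copy can be deleted once metrics look healthy
```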
07-25-2018 05:00 AM
@Pedro Andrade thanks for your reply. I checked the permissions and they were fine (the file is owned by hdfs:hdfs with permissions set to 644). The node was out of service for an extended time, so I followed the steps below:

1) Delete all data and directories in dfs.datanode.data.dir (keep the directory itself, though), or move the data aside, for example:
   $ mv /mnt/dn/sdl/datanode/current /mnt/dn/sdl/datanode/current.24072018
2) Restart the DataNode daemon or service.
3) Later, the backup data can be deleted:
   $ rm -rf /mnt/dn/sdl/datanode/current.24072018

Now the DataNode is up and live. Thanks to Hortonworks for the help and contribution. Reference: https://community.hortonworks.com/questions/192751/databode-uuid-unassigned.html
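The move-aside, restart, clean-up sequence above can be sketched as follows. This is a demo under assumed paths: DN_DIR defaults to a scratch directory under /tmp so it runs without root (on a real node it would be the dfs.datanode.data.dir value, e.g. /mnt/dn/sdl/datanode), and the setup lines fake the existing block data.

```shell
# DN_DIR is the datanode data dir; /tmp default keeps the demo runnable
# without root (real path: e.g. /mnt/dn/sdl/datanode).
DN_DIR="${DN_DIR:-/tmp/dn-demo/datanode}"
BACKUP="$DN_DIR/current.24072018"

# demo setup only: stand in for the datanode's existing block data
rm -rf "$DN_DIR"
mkdir -p "$DN_DIR/current"
touch "$DN_DIR/current/VERSION"

# 1) move the stale data aside instead of deleting it outright
mv "$DN_DIR/current" "$BACKUP"

# 2) restart the DataNode daemon here (e.g. from Ambari); it re-creates
#    'current' with a freshly assigned DatanodeUuid

# 3) once the node is confirmed live, drop the backup
rm -rf "$BACKUP"
```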
05-10-2018 07:05 AM
I tried and verified this on my 10-node cluster. It worked perfectly.
10-03-2018 10:58 AM
Hi, please find below the steps for moving the ZooKeeper data directory:

1) Change the dataDir setting in Ambari (Ambari -> ZooKeeper -> Configs -> ZooKeeper Server -> ZooKeeper directory: /mnt/scratch/zookeeper).
2) Stop all ZooKeeper servers (ZooKeeper -> Service Actions -> Stop).
3) Copy the contents (myid and version-2/) to the new directory and fix the folder's ownership. Log in to the zookeeper1 node:
   $ cp -r /mnt/sda/zookeeper/* /mnt/scratch/zookeeper/
   $ chown -R zookeeper:hadoop /mnt/scratch/zookeeper/
4) Start only the zookeeper1 node's ZooKeeper server from the Ambari UI.
5) Repeat steps 2-4 for the other two ZooKeeper servers (zookeeper2 and zookeeper3).
6) Restart all services if required.
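The copy-and-chown step for a single node can be sketched as below. This is a demo under assumed paths: OLD_DIR/NEW_DIR default to scratch directories under /tmp so it runs without root (real paths: /mnt/sda/zookeeper and /mnt/scratch/zookeeper), the setup lines fake the myid and version-2/ contents, and the chown is commented out because it needs root.

```shell
# OLD_DIR/NEW_DIR are the old and new ZooKeeper data dirs; /tmp defaults
# keep the demo runnable without root.
OLD_DIR="${OLD_DIR:-/tmp/zk-demo/old}"    # real: /mnt/sda/zookeeper
NEW_DIR="${NEW_DIR:-/tmp/zk-demo/new}"    # real: /mnt/scratch/zookeeper

# demo setup only: stand in for the existing myid and version-2/ contents
rm -rf "$OLD_DIR" "$NEW_DIR"
mkdir -p "$OLD_DIR/version-2"
echo 1 > "$OLD_DIR/myid"

# copy the data to the new directory and fix its ownership
mkdir -p "$NEW_DIR"
cp -r "$OLD_DIR/." "$NEW_DIR/"
# chown -R zookeeper:hadoop "$NEW_DIR"   # needs root; commented out for the demo

# then start only this node's ZooKeeper server from the Ambari UI
# before moving on to the next node
```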
12-29-2016 08:17 PM
See this working example of a Kafka producer:

package com.dataflowdeveloper.kafka

import java.io.ByteArrayOutputStream
import java.util.HashMap

import com.google.gson.Gson
import org.apache.avro.Schema
import org.apache.avro.SchemaBuilder
import org.apache.avro.generic.{GenericDatumWriter, GenericRecord, GenericRecordBuilder}
import org.apache.avro.io.EncoderFactory
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.serializer.KryoSerializer
import org.apache.spark.streaming.twitter._
import org.apache.spark.streaming.{Seconds, StreamingContext}

/**
 * Created by timothyspann on 4/4/16.
 */
object TwitterKafkaProducer {
  private val gson = new Gson()

  def main(args: Array[String]) {
    Logger.getLogger("org.apache.spark").setLevel(Level.WARN)
    Logger.getLogger("org.apache.spark.storage.BlockManager").setLevel(Level.ERROR)
    val logger: Logger = Logger.getLogger("com.dataflowdeveloper.kafka.KafkaSimulator")

    // build a Tweet schema: a record with the tweet text and a timestamp
    val schema = SchemaBuilder
      .record("tweet")
      .fields
      .name("tweet").`type`().stringType().noDefault()
      .name("timestamp").`type`().longType().noDefault()
      .endRecord

    val Array(consumerKey, consumerSecret, accessToken, accessTokenSecret) = args.take(4)
    val filters = Array("hadoop", "hortonworks", "#hadoop", "#bigdata", "#spark", "#hortonworks", "#HDP")

    // Set the system properties so that the Twitter4j library used by the
    // twitter stream can use them to generate OAuth credentials
    System.setProperty("twitter4j.oauth.consumerKey", consumerKey)
    System.setProperty("twitter4j.oauth.consumerSecret", consumerSecret)
    System.setProperty("twitter4j.oauth.accessToken", accessToken)
    System.setProperty("twitter4j.oauth.accessTokenSecret", accessTokenSecret)

    val sparkConf = new SparkConf().setAppName("Spark Streaming Twitter to Avro to Kafka Producer")
    sparkConf.set("spark.cores.max", "24")
    sparkConf.set("spark.serializer", classOf[KryoSerializer].getName)
    sparkConf.set("spark.sql.tungsten.enabled", "true")
    sparkConf.set("spark.eventLog.enabled", "true")
    sparkConf.set("spark.app.id", "TwitterKafkaProducer")
    sparkConf.set("spark.io.compression.codec", "snappy")
    sparkConf.set("spark.rdd.compress", "true")
    sparkConf.set("spark.streaming.backpressure.enabled", "true")

    val sc = new SparkContext(sparkConf)
    val ssc = new StreamingContext(sc, Seconds(2))
    val stream = TwitterUtils.createStream(ssc, None, filters).map(gson.toJson(_))

    try {
      stream.foreachRDD((rdd, time) => {
        val count = rdd.count()
        if (count > 0) {
          val props = new HashMap[String, Object]()
          props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "brokerip:6667")
          props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.ByteArraySerializer")
          props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.StringSerializer")
          val producer = new KafkaProducer[String, Array[Byte]](props)
          val topList = rdd.collect()
          topList.foreach(a => {
            val tweets = serializeTwitter(schema, new GenericRecordBuilder(schema)
              .set("tweet", a)
              .set("timestamp", time.milliseconds)
              .build)
            val message = new ProducerRecord[String, Array[Byte]]("meetup", null, tweets)
            producer.send(message)
            println("Sent %s".format(a.take(512))) // take, not substring, so short tweets don't throw
          })
          producer.close() // release the per-batch producer's resources
        }
      })
    } catch {
      case e: Exception =>
        println("Twitter. Writing files after job. Exception: " + e.getMessage)
        e.printStackTrace()
    }
    ssc.start()
    ssc.awaitTermination()
  }

  // serialize one tweet record to an Avro binary byte array
  def serializeTwitter(schema: Schema, tweet: GenericRecord): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    try {
      val encoder = EncoderFactory.get.binaryEncoder(out, null)
      val writer = new GenericDatumWriter[GenericRecord](schema)
      writer.write(tweet, encoder)
      encoder.flush()
      out.close()
    } catch {
      case e: Exception => None
    }
    out.toByteArray
  }
}
// scalastyle:on println
10-21-2016 06:29 AM
Now it's working. I made one change in workflow.xml: I replaced

<file>
/dev/datalake/app/mce/oozie/rchamaku/mercureaddin_2.10-0.1-SNAPSHOT.jar#mercureaddin.jar
</file>

with

<file>
/dev/datalake/app/mce/oozie/rchamaku/mercureaddin_2.10-0.1-SNAPSHOT.jar#mercureaddin_2.10-0.1-SNAPSHOT.jar
</file>

(The part after # is the symlink name the file is exposed under in the action's working directory, so it must match the name the job actually refers to.)

Regards, Rambabu.
09-24-2018 02:53 PM
Request for @Ancil McBarnett (or anyone else who knows): please elaborate a little on "You do not want Derby in your cluster."