Member since: 08-01-2017
Posts: 65
Kudos Received: 3
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 26171 | 01-22-2018 10:19 AM
 | 3009 | 01-22-2018 10:18 AM
 | 2829 | 07-05-2017 02:33 PM
 | 3228 | 05-26-2017 09:01 AM
01-22-2018 10:18 AM
It was actually a problem in the Twitter JSON. When we get a tweet which is actually a retweet, Flume truncates it. Problem solved 🙂
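For anyone hitting the same truncation: in the standard Twitter v1.1 JSON, a retweet's top-level `text` field is truncated, and the complete original tweet is nested under `retweeted_status` (extended tweets carry their full text in `extended_tweet.full_text`). A minimal sketch of recovering the full text — the helper name is mine, the field names are from the Twitter payload:

```python
import json

def full_tweet_text(tweet: dict) -> str:
    """Return the untruncated text of a tweet, following retweets.

    A retweet's top-level `text` is truncated; the complete original
    lives under `retweeted_status`. Extended tweets put their full
    text in `extended_tweet.full_text`.
    """
    if "retweeted_status" in tweet:
        tweet = tweet["retweeted_status"]
    extended = tweet.get("extended_tweet")
    if extended and "full_text" in extended:
        return extended["full_text"]
    return tweet.get("full_text") or tweet.get("text", "")

# Example with a minimal retweet payload:
raw = '{"text": "RT @user: truncated te…", "retweeted_status": {"text": "the complete original tweet"}}'
print(full_tweet_text(json.loads(raw)))  # prints: the complete original tweet
```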
06-01-2018 02:25 PM
I had a similar problem with PyHive on my Hortonworks setup. The failure was always immediate, so it was not the timeout issue some people online were pointing to. It turned out to be `hive.server2.transport.mode`: if it is set to `binary`, PyHive works like a charm; if it is `http`, PyHive does not work. Also found https://github.com/dropbox/PyHive/issues/69, which discusses this. HTH.
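A small sketch of the diagnosis above — the checker function is mine (not part of PyHive), and the host/port/user in the commented connection are placeholders for your cluster:

```python
def check_transport_mode(hive_conf: dict) -> None:
    """Raise early when HiveServer2 is not in binary transport mode.

    PyHive speaks the plain Thrift (binary) transport; with
    hive.server2.transport.mode=http the connection fails immediately,
    which is easy to mistake for an auth or timeout problem.
    """
    mode = hive_conf.get("hive.server2.transport.mode", "binary")
    if mode != "binary":
        raise ValueError(
            f"hive.server2.transport.mode is {mode!r}; PyHive needs 'binary'"
        )

check_transport_mode({"hive.server2.transport.mode": "binary"})

# With binary transport confirmed, the usual PyHive connection works
# (placeholders; adjust for your cluster):
#   from pyhive import hive
#   conn = hive.connect(host="sandbox.hortonworks.com", port=10000,
#                       username="admin")
#   conn.cursor().execute("SHOW TABLES")
```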
09-04-2017 08:34 AM
@Sindhu Many thanks for your answer... it didn't work as-is, but it did work with the database check disabled. At least the upgrade completed. Many thanks!
10-04-2017 09:58 AM
@Dan Zaratsian I'm still stuck on this problem... the query only crashes when I query two or more columns together with the text column. If I don't query the text column, or query it alone, it works... Do you have any suggestion, please? Many thanks in advance.
07-06-2017 09:17 AM
@Dinesh Chitlangia
Many thanks once more. I'm not using spark-shell... I'm using the Zeppelin view and loading the jars with z.load. However, the spark-streaming jar is not on Maven Central (it's in the Hortonworks repo), so I can't add it with z.load... I tried with the old jar I had (2.10:1.4.1) and I get the same error on Zeppelin 0.6 (Ambari) when I run the first SQL command of the tutorial: %sql select * from tweets where text like '%girl%' limit 10
Could the jar be the cause? How do I add it in Zeppelin, since it's not on Central? Many thanks once more.
07-05-2017 02:33 PM
Now it's working...

```scala
import org.apache.spark._
import org.apache.spark.streaming._
import org.apache.spark.streaming.twitter._
import org.apache.spark.storage.StorageLevel
import scala.io.Source
import scala.collection.mutable.HashMap
import java.io.File
import org.apache.log4j.Logger
import org.apache.log4j.Level
import sys.process.stringSeqToProcess
import org.apache.spark.sql.DataFrame
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.sql._
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.streaming.StreamingContext._

/** Configures the OAuth credentials for accessing Twitter */
def configureTwitterCredentials(apiKey: String, apiSecret: String, accessToken: String, accessTokenSecret: String) {
  val configs = new HashMap[String, String] ++= Seq(
    "apiKey" -> apiKey, "apiSecret" -> apiSecret,
    "accessToken" -> accessToken, "accessTokenSecret" -> accessTokenSecret)
  println("Configuring Twitter OAuth")
  configs.foreach { case (key, value) =>
    if (value.trim.isEmpty) {
      throw new Exception("Error setting authentication - value for " + key + " not set")
    }
    val fullKey = "twitter4j.oauth." + key.replace("api", "consumer")
    System.setProperty(fullKey, value.trim)
    println("\tProperty " + fullKey + " set as [" + value.trim + "]")
  }
  println()
}

// Configure Twitter credentials
val apiKey = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
val apiSecret = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
val accessToken = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
val accessTokenSecret = "XXXXXXXXXXXXXXXXXXXXXXX"
configureTwitterCredentials(apiKey, apiSecret, accessToken, accessTokenSecret)

import org.apache.spark.streaming.twitter._

val sc = new SparkConf().setMaster("local[2]").setAppName("tweets").set("spark.cleaner.ttl", "2000")
val ssc = new StreamingContext(sc, Seconds(2))
val tweets = TwitterUtils.createStream(ssc, None)
val twt = tweets.window(Seconds(60))

case class Tweet(createdAt: Long, text: String)

twt.map(status =>
  Tweet(status.getCreatedAt().getTime() / 1000, status.getText())
).foreachRDD { rdd =>
  // Below line works only in Spark 1.3.0.
  // For Spark 1.1.x and 1.2.x, use rdd.registerTempTable("tweets") instead.
  // rdd.toDF().registerAsTable("tweets")
  // val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
  import sqlContext._
  import sqlContext.implicits._
  rdd.toDF.registerTempTable("tweets")
}
twt.print
ssc.start()
```
06-22-2017 06:09 PM
@Hugo Felix
Try this: on the Zeppelin host, run the commands below:

```
# mv /var/lib/ambari-agent/cache /var/lib/ambari-agent/cache_OLD
# mkdir /var/lib/ambari-agent/cache
# ambari-agent restart
```
05-26-2017 03:13 PM
@nshelke
Many thanks for your quick answer.
I changed the port to 8050... and get an almost instant kill. The key error is `Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [40000]`. Full log below:
2017-05-26 15:05:19,223 INFO ActionStartXCommand:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@:start:] Start action [0000010-170525174610830-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2017-05-26 15:05:19,225 INFO ActionStartXCommand:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@:start:] [***0000010-170525174610830-oozie-oozi-W@:start:***]Action status=DONE
2017-05-26 15:05:19,225 INFO ActionStartXCommand:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@:start:] [***0000010-170525174610830-oozie-oozi-W@:start:***]Action updated in DB!
2017-05-26 15:05:19,254 INFO WorkflowNotificationXCommand:520 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[] No Notification URL is defined. Therefore nothing to notify for job 0000010-170525174610830-oozie-oozi-W
2017-05-26 15:05:19,254 INFO WorkflowNotificationXCommand:520 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@:start:] No Notification URL is defined. Therefore nothing to notify for job 0000010-170525174610830-oozie-oozi-W@:start:
2017-05-26 15:05:19,261 INFO ActionStartXCommand:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@hive-add-partition] Start action [0000010-170525174610830-oozie-oozi-W@hive-add-partition] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2017-05-26 15:05:23,376 INFO HiveActionExecutor:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@hive-add-partition] Trying to get job [job_1495727958373_0003], attempt [1]
2017-05-26 15:05:23,392 INFO HiveActionExecutor:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@hive-add-partition] checking action, hadoop job ID [job_1495727958373_0003] status [RUNNING]
2017-05-26 15:05:23,393 INFO ActionStartXCommand:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@hive-add-partition] [***0000010-170525174610830-oozie-oozi-W@hive-add-partition***]Action status=RUNNING
2017-05-26 15:05:23,393 INFO ActionStartXCommand:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@hive-add-partition] [***0000010-170525174610830-oozie-oozi-W@hive-add-partition***]Action updated in DB!
2017-05-26 15:05:23,399 INFO WorkflowNotificationXCommand:520 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@hive-add-partition] No Notification URL is defined. Therefore nothing to notify for job 0000010-170525174610830-oozie-oozi-W@hive-add-partition
2017-05-26 15:05:40,648 INFO CallbackServlet:520 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@hive-add-partition] callback for action [0000010-170525174610830-oozie-oozi-W@hive-add-partition]
2017-05-26 15:05:40,776 INFO HiveActionExecutor:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@hive-add-partition] Trying to get job [job_1495727958373_0003], attempt [1]
2017-05-26 15:05:40,883 INFO HiveActionExecutor:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@hive-add-partition] action completed, external ID [job_1495727958373_0003]
2017-05-26 15:05:40,929 WARN HiveActionExecutor:523 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@hive-add-partition] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [40000]
2017-05-26 15:05:40,982 INFO ActionEndXCommand:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@hive-add-partition] ERROR is considered as FAILED for SLA
2017-05-26 15:05:41,010 INFO ActionStartXCommand:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@fail] Start action [0000010-170525174610830-oozie-oozi-W@fail] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2017-05-26 15:05:41,011 INFO ActionStartXCommand:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@fail] [***0000010-170525174610830-oozie-oozi-W@fail***]Action status=DONE
2017-05-26 15:05:41,011 INFO ActionStartXCommand:520 - SERVER[sandbox.hortonworks.com] USER[admin] GROUP[-] TOKEN[] APP[hive-add-partition-wf] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@fail] [***0000010-170525174610830-oozie-oozi-W@fail***]Action updated in DB!
2017-05-26 15:05:41,070 INFO WorkflowNotificationXCommand:520 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@fail] No Notification URL is defined. Therefore nothing to notify for job 0000010-170525174610830-oozie-oozi-W@fail
2017-05-26 15:05:41,070 INFO WorkflowNotificationXCommand:520 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[] No Notification URL is defined. Therefore nothing to notify for job 0000010-170525174610830-oozie-oozi-W
2017-05-26 15:05:41,070 INFO WorkflowNotificationXCommand:520 - SERVER[sandbox.hortonworks.com] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000010-170525174610830-oozie-oozi-W] ACTION[0000010-170525174610830-oozie-oozi-W@hive-add-partition] No Notification URL is defined. Therefore nothing to notify for job 0000010-170525174610830-oozie-oozi-W@hive-add-partition
04-20-2018 10:36 PM
Dear @Sagar Shimpi The problem I encountered was: "The following 6 host component(s) have not been upgraded to version 1.1.5.0-235. Please install and upgrade the Stack Version on those hosts and try again."
Host components:
GLOBALMASTER on host e19e07452.et15sqa
LDSERVER on host e19e07452.et15sqa
LOCALMASTER on host e19e07452.et15sqa
LDSERVER on host e19e07466.et15sqa
LDSERVER on host e19e10465.et15sqa
LOCALMASTER on host e19e10465.et15sqa
"GLOBALMASTER" is my own service component. Can you please help? Many thanks in advance.