1973 Posts | 1225 Kudos Received | 124 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1843 | 04-03-2024 06:39 AM |
| | 2872 | 01-12-2024 08:19 AM |
| | 1584 | 12-07-2023 01:49 PM |
| | 2348 | 08-02-2023 07:30 AM |
| | 3240 | 03-29-2023 01:22 PM |
08-17-2016 09:18 PM
That's great, thanks!
08-17-2016 07:51 PM
That was my second problem after fixing the StringDecoder; I raised my ulimit and that resolved the second issue. Thanks!
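For anyone hitting the same "Too many open files" error, here is a sketch of how to check and raise the per-process open-file limit for the current shell before re-running spark-submit (raising the soft limit up to the hard limit; a permanent change would go in /etc/security/limits.conf instead):

```shell
# Show the current soft and hard limits on open file descriptors
ulimit -S -n
ulimit -H -n

# Raise the soft limit up to the hard limit for this shell session only
ulimit -S -n "$(ulimit -H -n)"

# Verify the new soft limit before re-running spark-submit
ulimit -S -n
```

Note this only affects the current shell and its children, which is why it has to be done in the same session that launches the Spark job.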
08-17-2016 07:50 PM
I was missing the StringDecoder type parameters; make sure you have those!

```scala
val tweets = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](ssc, kafkaConf, topicMaps, StorageLevel.MEMORY_ONLY_SER)
```
08-17-2016 07:08 PM
Caused by: java.sql.SQLException: Exception during creation of file /tmp/spark-f4651a14-f0f5-422a-908a-1fb812c862ed/metastore/seg0/c2f0.dat for container
at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
... 97 more
Caused by: java.sql.SQLException: Java exception: '/tmp/spark-f4651a14-f0f5-422a-908a-1fb812c862ed/metastore/seg0/c2f0.dat (Too many open files): java.io.FileNotFoundException'.
at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.javaException(Unknown Source)
at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
... 98 more
Caused by: java.io.FileNotFoundException: /tmp/spark-f4651a14-f0f5-422a-908a-1fb812c862ed/metastore/seg0/c2f0.dat (Too many open files)
at java.io.RandomAccessFile.open0(Native Method)
at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
at org.apache.derby.impl.io.DirRandomAccessFile.<init>(Unknown Source)
at org.apache.derby.impl.io.DirFile4.getRandomAccessFile(Unknown Source)
at org.apache.derby.impl.store.raw.data.RAFContainer.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.derby.impl.store.raw.data.RAFContainer.createContainer(Unknown Source)
at org.apache.derby.impl.store.raw.data.RAFContainer4.createContainer(Unknown Source)
at org.apache.derby.impl.store.raw.data.FileContainer.createIdent(Unknown Source)
at org.apache.derby.impl.store.raw.data.FileContainer.createIdentity(Unknown Source)
at org.apache.derby.impl.services.cache.ConcurrentCache.create(Unknown Source)
08-17-2016 06:47 PM
It may be a coding error; I left out the StringDecoder.
08-17-2016 05:43 PM
spark-submit --packages org.apache.spark:spark-streaming-kafka_2.10:1.6.0 --class TwitterSentimentAnalysis --master yarn-client sentimentstreaming.jar --verbose

16/08/17 17:04:54 ERROR ReceiverTracker: Deregistered receiver for stream 0: Error starting receiver 0 - java.lang.NoSuchMethodException: scala.runtime.Nothing$.<init>(kafka.utils.VerifiableProperties)
at java.lang.Class.getConstructor0(Class.java:3082)
at java.lang.Class.getConstructor(Class.java:1825)
at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:103)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:575)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:565)
at org.apache.spark.SparkContext$anonfun$37.apply(SparkContext.scala:1992)
at org.apache.spark.SparkContext$anonfun$37.apply(SparkContext.scala:1992)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
This is on submit:
:05 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
[root@sandbox spark]#
[root@sandbox spark]# vi submitstreaming.sh
[root@sandbox spark]# ./submitstreaming.sh
Ivy Default Cache set to: /root/.ivy2/cache
The jars for the packages stored in: /root/.ivy2/jars
:: loading settings :: url = jar:file:/usr/hdp/2.4.0.0-169/spark/lib/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.spark#spark-streaming-kafka_2.10 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found org.apache.spark#spark-streaming-kafka_2.10;1.6.0 in central
found org.apache.kafka#kafka_2.10;0.8.2.1 in central
found com.yammer.metrics#metrics-core;2.2.0 in central
found org.slf4j#slf4j-api;1.7.10 in central
found org.apache.kafka#kafka-clients;0.8.2.1 in central
found net.jpountz.lz4#lz4;1.3.0 in central
found org.xerial.snappy#snappy-java;1.1.2 in central
found com.101tec#zkclient;0.3 in central
found log4j#log4j;1.2.17 in central
found org.spark-project.spark#unused;1.0.0 in central
downloading https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka_2.10/1.6.0/spark-streaming-kafka_2.10-1.6.0.jar ...
[SUCCESSFUL ] org.apache.spark#spark-streaming-kafka_2.10;1.6.0!spark-streaming-kafka_2.10.jar (146ms)
downloading https://repo1.maven.org/maven2/org/apache/kafka/kafka_2.10/0.8.2.1/kafka_2.10-0.8.2.1.jar ...
[SUCCESSFUL ] org.apache.kafka#kafka_2.10;0.8.2.1!kafka_2.10.jar (1068ms)
downloading https://repo1.maven.org/maven2/org/spark-project/spark/unused/1.0.0/unused-1.0.0.jar ...
[SUCCESSFUL ] org.spark-project.spark#unused;1.0.0!unused.jar (21ms)
downloading https://repo1.maven.org/maven2/com/yammer/metrics/metrics-core/2.2.0/metrics-core-2.2.0.jar ...
[SUCCESSFUL ] com.yammer.metrics#metrics-core;2.2.0!metrics-core.jar (33ms)
downloading https://repo1.maven.org/maven2/org/apache/kafka/kafka-clients/0.8.2.1/kafka-clients-0.8.2.1.jar ...
[SUCCESSFUL ] org.apache.kafka#kafka-clients;0.8.2.1!kafka-clients.jar (106ms)
downloading https://repo1.maven.org/maven2/com/101tec/zkclient/0.3/zkclient-0.3.jar ...
[SUCCESSFUL ] com.101tec#zkclient;0.3!zkclient.jar (31ms)
downloading https://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.7.10/slf4j-api-1.7.10.jar ...
[SUCCESSFUL ] org.slf4j#slf4j-api;1.7.10!slf4j-api.jar (26ms)
downloading https://repo1.maven.org/maven2/net/jpountz/lz4/lz4/1.3.0/lz4-1.3.0.jar ...
[SUCCESSFUL ] net.jpountz.lz4#lz4;1.3.0!lz4.jar (66ms)
downloading https://repo1.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.2/snappy-java-1.1.2.jar ...
[SUCCESSFUL ] org.xerial.snappy#snappy-java;1.1.2!snappy-java.jar(bundle) (166ms)
downloading https://repo1.maven.org/maven2/log4j/log4j/1.2.17/log4j-1.2.17.jar ...
[SUCCESSFUL ] log4j#log4j;1.2.17!log4j.jar(bundle) (117ms)
:: resolution report :: resolve 3263ms :: artifacts dl 1790ms
:: modules in use:
com.101tec#zkclient;0.3 from central in [default]
com.yammer.metrics#metrics-core;2.2.0 from central in [default]
log4j#log4j;1.2.17 from central in [default]
net.jpountz.lz4#lz4;1.3.0 from central in [default]
org.apache.kafka#kafka-clients;0.8.2.1 from central in [default]
org.apache.kafka#kafka_2.10;0.8.2.1 from central in [default]
org.apache.spark#spark-streaming-kafka_2.10;1.6.0 from central in [default]
org.slf4j#slf4j-api;1.7.10 from central in [default]
org.spark-project.spark#unused;1.0.0 from central in [default]
org.xerial.snappy#snappy-java;1.1.2 from central in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 10 | 10 | 10 | 0 || 10 | 10 |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
confs: [default]
10 artifacts copied, 0 already retrieved (6067kB/20ms)
16/08/17 17:04:28 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/17 17:04:29 INFO Slf4jLogger: Slf4jLogger started
16/08/17 17:04:29 INFO Remoting: Starting remoting
16/08/17 17:04:29 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.0.2.15:42397]
16/08/17 17:04:30 INFO Server: jetty-8.y.z-SNAPSHOT
16/08/17 17:04:30 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
16/08/17 17:04:30 INFO Server: jetty-8.y.z-SNAPSHOT
16/08/17 17:04:30 INFO AbstractConnector: Started SocketConnector@0.0.0.0:38910
spark.yarn.driver.memoryOverhead is set but does not apply in client mode.
16/08/17 17:04:31 INFO TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
16/08/17 17:04:32 INFO RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.0.2.15:8050
16/08/17 17:04:32 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/08/17 17:04:33 INFO YarnClientImpl: Submitted application application_1471437048143_0007
16/08/17 17:04:44 INFO TwitterSentimentAnalysis: Using Twitter topic
16/08/17 17:04:52 ERROR ReceiverTracker: Deregistered receiver for stream 0: Error starting receiver 0 - java.lang.NoSuchMethodException: scala.runtime.Nothing$.<init>(kafka.utils.VerifiableProperties)
at java.lang.Class.getConstructor0(Class.java:3082)
at java.lang.Class.getConstructor(Class.java:1825)
at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:103)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:575)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:565)
at org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1992)
at org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1992)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Scala source snippet:

```scala
val topicMaps = Map("tweets" -> 1)
logger.info("Using Twitter topic")
// Create a new stream for JSON String
val tweets = KafkaUtils.createStream(ssc, kafkaConf, topicMaps, StorageLevel.MEMORY_ONLY_SER)
try {
  tweets.foreachRDD((rdd, time) => {
```
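The NoSuchMethodException on `scala.runtime.Nothing$.<init>(kafka.utils.VerifiableProperties)` happens because `createStream` is called without explicit type parameters, so Scala infers `Nothing` for the key/value decoder types and Spark then tries to instantiate `Nothing` as a decoder. A sketch of the corrected call, with the decoders stated explicitly (reusing `ssc`, `kafkaConf`, and `topicMaps` from the snippet above):

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.kafka.KafkaUtils

// Explicit key/value types and decoders, so the compiler does not
// infer Nothing for the decoder type parameters
val tweets = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaConf, topicMaps, StorageLevel.MEMORY_ONLY_SER)
```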
Labels:
- Apache Kafka
- Apache Spark
08-16-2016 07:19 PM
Out of 14,000+ files, 3 had the wrong old schema, so Spark picked that schema. I did a skipTrash delete on those and restarted the job. Now it works.
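For reference, the delete described above can be done with the HDFS CLI like this (a sketch; the path is a made-up example, substitute the actual files carrying the stale schema):

```shell
# Permanently delete the offending files, bypassing the HDFS trash
# (example path is hypothetical; list the real files with the old schema)
hdfs dfs -rm -skipTrash /data/events/part-00001.avro
```

Bypassing the trash matters here because files in `.Trash` would still be picked up by jobs that glob the whole directory tree.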
08-16-2016 07:17 PM
I think the main thing for you to do is to set an INT value for hive.llap.daemon.vcpus.per.instance; the wrongly named one may just be a warning.
08-16-2016 07:11 PM
Is it under Custom? Where does it show up?
08-16-2016 05:40 PM
I restarted the YARN services, and that did not help.