Member since: 04-05-2016
Posts: 36
Kudos Received: 8
Solutions: 9
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1210 | 07-30-2019 11:52 PM
 | 3268 | 06-07-2019 01:01 AM
 | 8196 | 04-14-2017 08:31 PM
 | 4145 | 08-03-2016 12:52 AM
 | 1992 | 06-22-2016 02:10 AM
07-31-2020
05:54 AM
In a CDH 6.3.2 cluster I have an Anaconda parcel distributed and activated, which of course has the numpy module installed. However, the Spark nodes seem to ignore the CDH configuration and keep using the system-wide Python from /usr/bin/python. I have nevertheless installed numpy in the system-wide Python across all cluster nodes, yet I still get "ImportError: No module named numpy". I would appreciate any further advice on how to solve the problem. I am not sure how to implement the solution referred to in https://stackoverflow.com/questions/46857090/adding-pyspark-python-path-in-oozie.
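A direction often suggested for this situation is to point Spark explicitly at the parcel's interpreter instead of relying on /usr/bin/python. A sketch, assuming the Anaconda parcel is activated at its default location (the parcel path and the job script name are assumptions, not taken from the post):

```
# Client mode: export before launching pyspark / spark-submit
export PYSPARK_PYTHON=/opt/cloudera/parcels/Anaconda/bin/python
export PYSPARK_DRIVER_PYTHON=/opt/cloudera/parcels/Anaconda/bin/python

# YARN cluster mode: set the interpreter for the application master and executors per job
spark-submit \
  --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/opt/cloudera/parcels/Anaconda/bin/python \
  --conf spark.executorEnv.PYSPARK_PYTHON=/opt/cloudera/parcels/Anaconda/bin/python \
  my_job.py
```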
02-05-2020
10:12 PM
While running this program I got these errors; can anyone help me with this?

[cloudera@quickstart ~]$ spark-shell --master yarn-client
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/flume-ng/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/parquet/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/avro/avro-tools-1.7.6-cdh5.13.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_67)
Type in expressions to have them evaluated.
Type :help for more information.
20/02/05 22:03:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/02/05 22:03:42 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
Spark context available as sc (master = yarn-client, app id = application_1580968178673_0001).
SQL context available as sqlContext.
07-30-2019
11:52 PM
1 Kudo
Since the "list" command gets the apps from the ResourceManager and doesn't set any explicit filters or limits on the request (except those provided with it), it technically returns all the applications the RM currently holds. That number is controlled by the "yarn.resourcemanager.max-completed-applications" config. Hope that clarifies.
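For example, with the standard YARN CLI flags you can bound the result set explicitly instead of relying on RM retention:

```
# Everything the RM still retains (bounded by yarn.resourcemanager.max-completed-applications)
yarn application -list -appStates ALL

# Explicit filters
yarn application -list -appStates RUNNING,ACCEPTED
yarn application -list -appTypes SPARK -appStates FINISHED
```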
06-07-2019
01:01 AM
1 Kudo
As your intent seems to be capturing the driver logs in a separate file while executing the app in cluster mode, make sure that the '/some/path/to/edgeNode/' directory is present on all of the NodeManagers, essentially because in cluster mode the driver runs inside the YARN application master. If you can't ensure that, follow the general practice of pointing the log file at a pre-existing path, e.g. "/var/log/SparkDriver.log".
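A minimal sketch of that practice, assuming a custom log4j.properties is shipped with the job (the file contents and submit flags below are illustrative, not taken from the thread):

```
# log4j.properties -- route driver logs to a pre-existing path
log4j.rootCategory=INFO, file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.File=/var/log/SparkDriver.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

```
# Ship the file and point the driver's JVM at it (app.jar is a placeholder)
spark-submit --deploy-mode cluster \
  --files log4j.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" \
  app.jar
```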
09-24-2018
11:19 PM
Hello Experts,
We are upgrading our Cloudera Hive from 1.3 to 2.0. Could you please let us know if there are any known issues related to this? I searched the Tableau and Cloudera Communities but didn't find any.
Thanks in Advance!!!
Regards,
Muthu Venkatesh
08-20-2018
09:01 AM
Hi, @Utkarsh. CDS 2.2r3 has not been released yet. If you need a fix for SPARK-22306, is there a reason you cannot use CDS 2.3, which does not have the problem? Brian
03-06-2018
10:25 PM
I believe you can achieve this by following the below sequence (sketched in code right after):

1. spark.sql("SET spark.sql.shuffle.partitions=12")
2. Execute operations on the small table
3. spark.sql("SET spark.sql.shuffle.partitions=500")
4. Execute operations on the larger table
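Spelled out as a sketch (the table names are hypothetical):

```
// Keep the shuffle parallelism low while working on the small table
spark.sql("SET spark.sql.shuffle.partitions=12")
spark.sql("SELECT key, count(*) AS c FROM small_table GROUP BY key").show()

// Raise it before touching the larger table
spark.sql("SET spark.sql.shuffle.partitions=500")
spark.sql("SELECT key, count(*) AS c FROM large_table GROUP BY key").show()
```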
02-20-2018
12:47 AM
hiveContext.sql("select * from test.automation_inp3").show 18/02/19 08:53:55 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore 18/02/19 08:53:55 INFO ObjectStore: ObjectStore, initialize called 18/02/19 08:53:55 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/cloudera/parcels/CDH-5.4.10-1.cdh5.4.10.p0.16/jars/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/cloudera/parcels/CDH-5.4.10-1.cdh5.4.10.p0.16/jars/datanucleus-api-jdo-3.2.1.jar." 18/02/19 08:53:55 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/cloudera/parcels/CDH-5.4.10-1.cdh5.4.10.p0.16/jars/datanucleus-core-3.2.2.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/cloudera/parcels/CDH-5.4.10-1.cdh5.4.10.p0.16/jars/datanucleus-core-3.2.10.jar." 18/02/19 08:53:55 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/cloudera/parcels/CDH-5.4.10-1.cdh5.4.10.p0.16/jars/datanucleus-rdbms-3.2.1.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/cloudera/parcels/CDH-5.4.10-1.cdh5.4.10.p0.16/jars/datanucleus-rdbms-3.2.9.jar." 18/02/19 08:53:55 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored 18/02/19 08:53:55 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored 18/02/19 08:53:56 WARN HiveMetaStore: Retrying creating default database after error: Error creating transactional connection factory javax.jdo.JDOFatalInternalException: Error creating transactional connection factory at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587) at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788) at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333) at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965) at java.security.AccessController.doPrivileged(Native Method) at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960) at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701) at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:390) at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:419) at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:314) at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:281) at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:56) at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:65) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:584) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:562) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:611) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:453) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84) at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5716) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:198) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1486) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:64) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:74) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2895) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2914) at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3139) at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:205) at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:192) at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:302) at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:263) at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:238) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:491) at org.apache.spark.sql.hive.HiveContext.sessionState$lzycompute(HiveContext.scala:229) at org.apache.spark.sql.hive.HiveContext.sessionState(HiveContext.scala:225) at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:241) at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:240) at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:86) at $line25.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29) at $line25.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:34) at $line25.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36) at $line25.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38) at $line25.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40) at $line25.$read$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42) at $line25.$read$$iwC$$iwC$$iwC$$iwC.<init>(<console>:44) at $line25.$read$$iwC$$iwC$$iwC.<init>(<console>:46) at $line25.$read$$iwC$$iwC.<init>(<console>:48) at $line25.$read$$iwC.<init>(<console>:50) at $line25.$read.<init>(<console>:52) at 
$line25.$read$.<init>(<console>:56) at $line25.$read$.<clinit>(<console>) at $line25.$eval$.<init>(<console>:7) at $line25.$eval$.<clinit>(<console>) at $line25.$eval.$print(<console>) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065) at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338) at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819) at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:856) at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:901) at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:813) at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:656) at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:664) at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:669) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:996) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:944) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1058) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) NestedThrowablesStackTrace: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631) at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325) at org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282) at org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:240) at 
org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:286) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631) at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301) at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187) at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356) at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775) at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333) at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965) at java.security.AccessController.doPrivileged(Native Method) at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960) at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808) at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701) at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:390) at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:419) at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:314) at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:281) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:56) at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:65) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:584) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:562) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:611) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:453) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84) at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5716) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:198) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
- Tags:
- Spark
07-28-2017
12:20 PM
How do I query Hive tables from Spark 2.0? Could you share the steps?
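In Spark 2.x the usual pattern is a Hive-enabled SparkSession; a minimal sketch (the database and table names are hypothetical):

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("HiveQueryExample")
  .enableHiveSupport()   // needs hive-site.xml on the classpath
  .getOrCreate()

spark.sql("SELECT * FROM mydb.mytable").show()
```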
05-01-2017
11:11 PM
Hi, I am facing the issue mentioned below. Please help me to solve it.

17/05/02 11:07:13 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\arpitbh\AppData\Local\Temp\spark-07d9637a-2eb8-4a32-8490-01e106a80d6b
java.io.IOException: Failed to delete: C:\Users\arpitbh\AppData\Local\Temp\spark-07d9637a-2eb8-4a32-8490-01e106a80d6b
  at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1010)
  at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
  at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:62)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
  at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:62)
  at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216)
  at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
  at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
  at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
  at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1951)
  at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
  at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
  at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
  at scala.util.Try$.apply(Try.scala:192)
  at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
  at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
  at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
04-14-2017
08:31 PM
It is the line below that sets the data types of both fields to StringType:

val schema =
  StructType(
    schemaString.split(" ").map(fieldName => StructField(fieldName, StringType, true)))

You can define your own custom schema as follows (with the import it needs added):

import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType}

val customSchema = StructType(Array(
  StructField("name", StringType, true),
  StructField("age", IntegerType, true)))

You can add additional fields to the above schema definition as well. You can then use this customSchema while creating the DataFrame:

val peopleDataFrame = sqlContext.createDataFrame(rowRDD, customSchema)

For details, please see this page.
04-05-2017
07:28 PM
1 Kudo
The cause of my case was described in Messages 4-5 of the thread. Here are some possible solutions (the first is sketched below):
- Set spark.local.dir to somewhere outside /tmp. Refer to Spark Configuration for how to configure the value.
- Disable housekeeping of /tmp/spark-...
- Periodically restart your Spark Streaming job.
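For the first option, a minimal sketch (the replacement directory is illustrative):

```
# spark-defaults.conf
spark.local.dir /data/spark-tmp
```

The same value can also be passed per job with --conf spark.local.dir=/data/spark-tmp.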
08-31-2016
12:20 PM
Thanks, it did fix the issue. After getting rid of the variable in the code, I am able to execute it in cluster mode.
07-18-2016
12:22 AM
Even I am facing the same issue while trying to process an avro file in pyspark: "java.lang.NoClassDefFoundError: org/apache/avro/mapred/AvroWrapper". Spark version: 1.5.0-cdh5.5.2.
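The missing class lives in the avro-mapred artifact, so one possible workaround is to put that jar on the classpath at submit time. A sketch only; the jar path and script name are assumptions about a CDH 5.5.2 parcel layout, not taken from the thread:

```
# Locate the actual avro-mapred jar in your CDH installation first
spark-submit \
  --jars /opt/cloudera/parcels/CDH/jars/avro-mapred-1.7.6-cdh5.5.2-hadoop2.jar \
  my_avro_job.py
```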
06-22-2016
04:46 AM
Your tip was spot on, Umesh! We realised that, as part of installing CDH 5.5 and Spark, the default configuration for the parameters you mentioned was changed (not sure how). We restored the default values and were able to run PySpark and spark-shell. Many thanks for your help.
06-10-2016
04:16 AM
I am using Python 2.7.11; I followed the same document for version 5.7 and it worked. Thanks!
05-11-2016
03:56 AM
Yes it's working now, thanks for your help.
04-07-2016
04:26 AM
1 Kudo
That's because you have no new files arriving in the directory after the streaming application starts. You can try "cp" to drop files into the directory after starting the streaming application, as in the sketch below.
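A minimal sketch of that setup (the paths and batch interval are hypothetical):

```
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("DirWatch")
val ssc = new StreamingContext(conf, Seconds(10))

// Only files that appear in the directory *after* the stream starts are picked up
val lines = ssc.textFileStream("hdfs:///user/demo/incoming")
lines.count().print()

ssc.start()
ssc.awaitTermination()
```

With the stream running, cp (or hdfs dfs -put) new files into hdfs:///user/demo/incoming and they will show up in the next batch.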