Member since: 11-12-2017
Posts: 7
Kudos Received: 1
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4267 | 12-24-2017 01:05 PM
 | 5784 | 11-16-2017 09:48 AM
12-24-2017
01:05 PM
1 Kudo
Solved after getting the Maven dependencies from the Cloudera repo.

```xml
<dependencies>
    <!-- Scala and Spark dependencies -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.6.0-cdh5.9.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.10</artifactId>
        <version>1.6.0-cdh5.9.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.10</artifactId>
        <version>1.6.0-cdh5.9.2</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hive/hive-exec -->
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>1.1.0-cdh5.9.2</version>
    </dependency>
    <dependency>
        <groupId>org.scalatest</groupId>
        <artifactId>scalatest_2.10</artifactId>
        <version>3.0.0-SNAP4</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.11</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-mllib_2.10</artifactId>
        <version>1.4.1</version>
    </dependency>
    <dependency>
        <groupId>commons-dbcp</groupId>
        <artifactId>commons-dbcp</artifactId>
        <version>1.2.2</version>
    </dependency>
    <dependency>
        <groupId>com.databricks</groupId>
        <artifactId>spark-csv_2.10</artifactId>
        <version>1.4.0</version>
    </dependency>
    <dependency>
        <groupId>com.databricks</groupId>
        <artifactId>spark-xml_2.10</artifactId>
        <version>0.2.0</version>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk</artifactId>
        <version>1.0.12</version>
    </dependency>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-s3</artifactId>
        <version>1.11.172</version>
    </dependency>
    <dependency>
        <groupId>com.github.scopt</groupId>
        <artifactId>scopt_2.10</artifactId>
        <version>3.2.0</version>
    </dependency>
    <dependency>
        <groupId>javax.mail</groupId>
        <artifactId>mail</artifactId>
        <version>1.4</version>
    </dependency>
</dependencies>

<repositories>
    <repository>
        <id>maven-hadoop</id>
        <name>Hadoop Releases</name>
        <url>https://repository.cloudera.com/content/repositories/releases/</url>
    </repository>
    <repository>
        <id>cloudera-repos</id>
        <name>Cloudera Repos</name>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
</repositories>
```
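With the Spark and Hive artifacts aligned on the same CDH release, the `loadDynamicPartitions` shim resolves correctly. For reference, a minimal sketch of the kind of dynamic-partition load this enables, assuming Spark 1.6 with `HiveContext`; the table, column, and input-path names are hypothetical:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Minimal sketch (hypothetical table/column/path names) of a dynamic-partition
// load through HiveContext, which the CDH-aligned dependencies above enable.
object DynamicPartitionLoad {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("DynamicPartitionLoad"))
    val hiveContext = new HiveContext(sc)

    // Dynamic partitioning must be enabled before the insert.
    hiveContext.setConf("hive.exec.dynamic.partition", "true")
    hiveContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")

    // Stage the input data as a temporary table.
    val df = hiveContext.read.json("/tmp/input.json") // hypothetical input
    df.registerTempTable("staging")

    // The partition column (ds) goes last in the SELECT list.
    hiveContext.sql(
      "INSERT OVERWRITE TABLE my_table PARTITION (ds) " +
        "SELECT col1, col2, ds FROM staging")

    sc.stop()
  }
}
```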
12-24-2017
10:26 AM
Hi All, I am having an issue on Cloudera 5.9 when trying to load data into a partitioned table using HiveContext. I tried what is mentioned here (http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Failing-to-save-dataframe-to/m-p/50909) but had no luck, so I raised a question on Stack Overflow (https://stackoverflow.com/questions/47963059/spark-1-6-hive-context-setconf-issue).

```
Exception in thread "main"
[Loaded java.lang.Throwable$PrintStreamOrWriter from /AZ/bin/java_8/jre1.8.0_131/lib/rt.jar]
[Loaded java.lang.Throwable$WrappedPrintStream from /AZ/bin/java_8/jre1.8.0_131/lib/rt.jar]
java.lang.NoSuchMethodException: org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(org.apache.hadoop.fs.Path, java.lang.String, java.util.Map, boolean, int, boolean, boolean, boolean)
    at java.lang.Class.getMethod(Class.java:1786)
    at org.apache.spark.sql.hive.client.Shim.findMethod(HiveShim.scala:114)
    at org.apache.spark.sql.hive.client.Shim_v0_14.loadDynamicPartitionsMethod$lzycompute(HiveShim.scala:404)
    at org.apache.spark.sql.hive.client.Shim_v0_14.loadDynamicPartitionsMethod(HiveShim.scala:403)
    at org.apache.spark.sql.hive.client.Shim_v0_14.loadDynamicPartitions(HiveShim.scala:455)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$loadDynamicPartitions$1.apply$mcV$sp(ClientWrapper.scala:564)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$loadDynamicPartitions$1.apply(ClientWrapper.scala:564)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$loadDynamicPartitions$1.apply(ClientWrapper.scala:564)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$withHiveState$1.apply(ClientWrapper.scala:282)
    at org.apache.spark.sql.hive.client.ClientWrapper.liftedTree1$1(ClientWrapper.scala:228)
    at org.apache.spark.sql.hive.client.ClientWrapper.retryLocked(ClientWrapper.scala:227)
    at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:270)
    at org.apache.spark.sql.hive.client.ClientWrapper.loadDynamicPartitions(ClientWrapper.scala:563)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:225)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:127)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:276)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
    at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817)
    at com.az.spark.ingestion.loadtable.LoadTable$$anonfun$main$1.apply(LoadTable.scala:258)
    at com.az.spark.ingestion.loadtable.LoadTable$$anonfun$main$1.apply(LoadTable.scala:102)
    at scala.Option.map(Option.scala:145)
    at com.az.spark.ingestion.loadtable.LoadTable$.main(LoadTable.scala:102)
    at com.az.spark.ingestion.loadtable.LoadTable.main(LoadTable.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
```

Thanks, Sri
Labels:
- Apache Hive
- Apache Spark
11-16-2017
09:48 AM
Hi Srowen, see the issue below. Yes, it is with the library: https://github.com/springml/spark-salesforce/issues/18 Thanks, Sri
11-14-2017
12:00 PM
Hi Srowen, yes, it looks like firewall rules might be causing this issue. We were unable to find any connection log in the Salesforce application coming from the Hadoop edge node; we can only see successful connections coming from IntelliJ IDEA (a Windows PC). Error on the edge node:

```
Caused by: java.net.ConnectException: Connection refused
```

It looks like a firewall is preventing the connection from leaving the edge node; we need to check with our Cloudera Hadoop admin. Thanks, Sri
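To separate a network problem from a library problem, it can help to open a raw HTTPS connection from the edge node's JVM, since curl does its own proxy handling and a successful curl does not prove the JVM can get out. A minimal probe sketch; the proxy host and port are placeholders:

```scala
import java.net.{HttpURLConnection, InetSocketAddress, Proxy, URL}

// Minimal connectivity probe, independent of Spark and the Salesforce library.
// The proxy host and port below are placeholders; substitute the real values.
object ConnectivityProbe {
  def main(args: Array[String]): Unit = {
    val url = new URL("https://test.salesforce.com/services/Soap/u/35.0")
    val proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("proxy.example.com", 8080))

    // Run once with the proxy and once with Proxy.NO_PROXY to see which path is refused.
    val conn = url.openConnection(proxy).asInstanceOf[HttpURLConnection]
    conn.setConnectTimeout(10000)
    conn.setRequestMethod("GET")
    try {
      conn.connect()
      // A SOAP endpoint will not return 200 to a GET, but receiving any HTTP
      // status at all means the TCP/TLS connection itself succeeded.
      println(s"Connected, HTTP status: ${conn.getResponseCode}")
    } finally {
      conn.disconnect()
    }
  }
}
```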
11-12-2017
11:15 AM
Below is my run book. I will try with jssecacerts and cacerts and let you know. The library works fine on a laptop or from outside the cluster, but not from inside the cluster. Also, as I said, I can curl or wget the URL over http but not https. Is there a workaround for this?

```shell
export http_proxy=https://${ipaddress}:${port}
export no_proxy="localhost,127.0.0.0/8,ipadress/port,::1"
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HADOOP_HOME=/opt/cloudera/parcels/CDH-5.9.2-1.cdh5.9.2.p0.3
export HADOOP_MAPRED_HOME=/opt/cloudera/parcels/CDH-5.9.2-1.cdh5.9.2.p0.3/lib/hadoop-0.20-mapreduce

spark-submit --class SalesForceTest3 --master local \
  --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 \
  /AZ/bin/myjar-1.0-SNAPSHOT-jar-with-dependencies.jar \
  "https://test.salesforce.com/services/Soap/u/35.0" "yarn-client"
```

Thanks, Sri
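One thing that may matter here: the http_proxy and no_proxy exports above are honored by curl and wget but not by the JVM, so the Spark driver may still be attempting a direct connection. A hedged sketch of setting the equivalent Java system properties in the driver before the first HTTPS call (proxy host and port are placeholders):

```scala
// The JVM ignores the http_proxy/no_proxy shell variables and reads Java
// system properties instead. These must be set before the first HTTP(S) call.
// "proxy.example.com" and "8080" are placeholders for the real proxy.
System.setProperty("http.proxyHost", "proxy.example.com")
System.setProperty("http.proxyPort", "8080")
System.setProperty("https.proxyHost", "proxy.example.com")
System.setProperty("https.proxyPort", "8080")
// JVM equivalent of no_proxy: '|'-separated host patterns.
System.setProperty("http.nonProxyHosts", "localhost|127.0.0.1")
```

The same properties can also be passed at submit time, for example via `--driver-java-options "-Dhttps.proxyHost=... -Dhttps.proxyPort=..."`.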
11-12-2017
06:32 AM
Hi All, I am trying to connect Spark with Salesforce using this library (https://github.com/springml/spark-salesforce); it is a small Scala application that connects to Salesforce. I am able to connect from my laptop, but when I move the code (jar) to the cluster I get an exception. I am able to ping the Salesforce URL after setting up a proxy, but I am still unable to connect using Spark.

```shell
export http_proxy=http://${ipaddress}:${port}
export no_proxy="localhost,127.0.0.0/8,ipadress/port,::1"
curl http://somesalefroce.com/services/Soap/u/35.0
```

The API owner says he was able to test it on a Hortonworks cluster. Does a Cloudera cluster let applications connect to the outside world (the internet) from within the cluster?

Error:

```
Exception while creating connection com.sforce.ws.ConnectionException: Failed to send request to http
```

Thanks, Sri
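For context, this is roughly how the springml/spark-salesforce connector is invoked; a minimal sketch with placeholder credentials and query (per the library's README, the `password` option is the account password concatenated with the security token):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Minimal sketch of reading from Salesforce with springml/spark-salesforce.
// Username, password+token, and the SOQL query are placeholders.
object SalesforceReadSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SalesforceReadSketch"))
    val sqlContext = new SQLContext(sc)

    val df = sqlContext.read
      .format("com.springml.salesforce")
      .option("username", "user@example.com")          // placeholder
      .option("password", "passwordSecurityToken")     // password + security token
      .option("soql", "SELECT Id, Name FROM Account")  // placeholder query
      .option("version", "35.0")
      .load()

    df.show(5)
    sc.stop()
  }
}
```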
Labels:
- Apache Spark