Member since
07-24-2017
14
Posts
3
Kudos Received
0
Solutions
07-29-2017
04:13 AM
1 Kudo
The Spark request is now being submitted, but I am getting the following error:

hive> select count(*) from kaggle.test_house;
Query ID = ec2-user_20170729070303_887365d6-ce92-4ec3-bc8a-2adf3cfec117
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Spark Job = 614015ef-31f9-4e14-9b71-c161f64916db
Job hasn't been submitted after 61s. Aborting it.
Possible reasons include network issues, errors in remote driver or the cluster has no available resources, etc.
Please check YARN or Spark driver's logs for further information.
Status: SENT
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
07-29-2017
12:57 AM
Thank you for the reply. I did not have the spark folder in that location; I had SPARK2 instead. After I ran the command, I got the error below.

[ec2-user@ip-172-31-37-124 jars]$ spark-submit --class org.apache.spark.examples.SparkPi --master yarn --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 /opt/cloudera/parcels/SPARK2/lib/spark2/examples/jars/spark-examples_2.11-2.2.0.cloudera1.jar
WARNING: User-defined SPARK_HOME (/opt/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/lib/spark) overrides detected (/usr/lib/spark).
WARNING: Running spark-class from user-defined location.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/SparkSession$
  at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:28)
  at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:730)
  at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
  at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.SparkSession$
  at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
  ... 11 more
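One possible reading of the trace above: `SparkSession` was introduced in Spark 2.0, while the warning shows SPARK_HOME pointing at the Spark bundled with CDH 5.12, which typically ships Spark 1.6, so the Spark 2.2.0 examples jar cannot find the class. The sketch below assumes the SPARK2 parcel provides its own `spark2-submit` launcher (this is not confirmed anywhere in the thread). `pick_launcher` is a hypothetical helper, and the script only prints the command it would suggest rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: guess which launcher matches a jar on a CDH host.
# Assumption: jars under a SPARK2 parcel path need spark2-submit;
# anything else is left on the default spark-submit.
pick_launcher() {
  case "$1" in
    */SPARK2/*) echo "spark2-submit" ;;
    *)          echo "spark-submit" ;;
  esac
}

# Jar path taken from the post above.
jar=/opt/cloudera/parcels/SPARK2/lib/spark2/examples/jars/spark-examples_2.11-2.2.0.cloudera1.jar
launcher=$(pick_launcher "$jar")

# Print the suggested command instead of executing it.
echo "$launcher --class org.apache.spark.examples.SparkPi --master yarn $jar"
```

If the suggested launcher is not on the PATH, the assumption about the parcel layout does not hold for that host and the Spark 2 installation should be checked first.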
07-28-2017
02:51 PM
1 Kudo
I have installed Spark and configured Hive to use it as the execution engine. "select * from <table name>" works fine, but "select count(*) from <table name>" fails with the following error:

Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

At times I also got an error stating "failed to create spark client". I have tried modifying the memory parameters, but to no avail. Can you please tell me what the ideal memory settings should be?

Below is the directory structure from HDFS:

drwxr-xr-x   - admin    admin      0 2017-07-28 16:36 /user/admin
drwx------   - ec2-user supergroup 0 2017-07-28 17:50 /user/ec2-user
drwxr-xr-x   - hdfs     hdfs       0 2017-07-28 11:37 /user/hdfs
drwxrwxrwx   - mapred   hadoop     0 2017-07-16 06:03 /user/history
drwxrwxr-t   - hive     hive       0 2017-07-16 06:04 /user/hive
drwxrwxr-x   - hue      hue        0 2017-07-28 10:16 /user/hue
drwxrwxr-x   - impala   impala     0 2017-07-16 07:13 /user/impala
drwxrwxr-x   - oozie    oozie      0 2017-07-16 06:05 /user/oozie
drwxr-x--x   - spark    spark      0 2017-07-28 17:17 /user/spark
drwxrwxr-x   - sqoop2   sqoop      0 2017-07-16 06:37 /user/sqoop2

The /user directory has ec2-user as owner and supergroup as group.

I tried running the query from the CLI:

WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive> select count(*) from kaggle.test_house;
Query ID = ec2-user_20170728174949_aa9d7be9-038c-44a0-a42b-1b210a37f4ec
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
Labels:
- Apache Hive
- Apache Spark
07-25-2017
12:57 PM
Did it !! Yohuuuu !!! I can see the page and am logged in!!! Thank you so so much !! You saved my day !!
07-25-2017
12:43 PM
Thanks for the quick reply. This is my personal setup and I am a beginner in this area. Would you be able to help me with baby steps? Apologies for being too demanding 🙂 What I have done so far is create a DNS hosted zone in Route 53 as a public zone, and I have updated the name servers for my domain at GoDaddy. What should I do next in AWS? Create a record set? And what should I do on my EC2 instance where the Workbench has been installed?
07-25-2017
12:28 PM
Hi, thank you so much for the information. I am currently testing it. I have hosted my Cloudera single-node environment on an AWS EC2 instance. I also own a domain on GoDaddy, www.datacloudera.com. I have changed the name servers to the ones in the AWS DNS zone. Should the hosted zone be public, or private in a VPC?

ns-0.awsdns-00.com
ns-1024.awsdns-00.org
ns-512.awsdns-00.net
ns-1536.awsdns-00.co.uk

But I am not sure how to proceed. What steps should I take after this one?
07-25-2017
08:09 AM
1 Kudo
Hi, do you have step-by-step details for wildcarding the DNS? I am stuck at that point. I have an AWS EC2 instance on which I have installed the Workbench. However, I am unable to open the URL.
07-25-2017
04:57 AM
OK. I had tried Route 53, but it did not work. I guess my networking concepts need a bit of a refresh.