SPARK-SQL running with yarn failed creating table from existing table

I have a 4-node cluster. I started spark-sql and created a table with 5 rows. I am now trying to duplicate this table using "create table new_table as (select * from old_table);". The new table is created, but it has no content. However, if I start spark-sql in client mode (without the --master yarn option), the new table contains all the data from the old table. The screen dump below first shows the run without YARN, where the table is created correctly, and then the same steps repeated with the YARN option, where the new table is empty.

I do not see any errors in the Ambari dashboard.

spark@msl-dpe-perf88:/home/harry.li/TPC/benchmarks/spark-tpc-ds-performance-test$ spark-sql  --executor-memory 20G --num-executors 4   --driver-java-options -Dlog4j.configuration=file:////home/harry.li/TPC/benchmarks/spark-tpc-ds-performance-test/work/log4j.properties --conf spark.executor.extraJavaOptions=-Dlog4j.configuration=file:////home/harry.li/TPC/benchmarks/spark-tpc-ds-performance-test/work/log4j.properties --conf spark.sql.catalogImplementation=hive
SPARK_MAJOR_VERSION is set to 2, using Spark2
spark-sql> use tpcds;
Time taken: 1.164 seconds
spark-sql> select * from base_table;
1AAAAAAAABAAAAAAAConventional childr9777876516th ParkwaySuite 470FairviewWilliamson CountyTN35709United States-5.0
2AAAAAAAACAAAAAAAImportant issues liv138504600View FirstAvenueSuite PFairviewWilliamson CountyTN35709United States-5.0
3AAAAAAAADAAAAAAADoors canno294242534Ash LaurelDr.Suite 0FairviewWilliamson CountyTN35709United States-5.0
4AAAAAAAAEAAAAAAABad cards must make.621234368Wilson ElmDriveSuite 80FairviewWilliamson CountyTN35709United States-5.0
5AAAAAAAAFAAAAAAANULLNULLNULLNULLNULLNULLFairviewWilliamson CountyTN35709United StatesNULL
Time taken: 1.125 seconds, Fetched 5 row(s)
spark-sql> create table test1 as (select * from base_table);
Time taken: 0.777 seconds
spark-sql> select * from test1;
1AAAAAAAABAAAAAAAConventional childr9777876516th ParkwaySuite 470FairviewWilliamson CountyTN35709United States-5.0
2AAAAAAAACAAAAAAAImportant issues liv138504600View FirstAvenueSuite PFairviewWilliamson CountyTN35709United States-5.0
3AAAAAAAADAAAAAAADoors canno294242534Ash LaurelDr.Suite 0FairviewWilliamson CountyTN35709United States-5.0
4AAAAAAAAEAAAAAAABad cards must make.621234368Wilson ElmDriveSuite 80FairviewWilliamson CountyTN35709United States-5.0
5AAAAAAAAFAAAAAAANULLNULLNULLNULLNULLNULLFairviewWilliamson CountyTN35709United StatesNULL
Time taken: 0.14 seconds, Fetched 5 row(s)
spark-sql> quit;
spark@msl-dpe-perf88:/home/harry.li/TPC/benchmarks/spark-tpc-ds-performance-test$ spark-sql --master yarn  --executor-memory 20G --num-executors 4   --driver-java-options -Dlog4j.configuration=file:////home/harry.li/TPC/benchmarks/spark-tpc-ds-performance-test/work/log4j.properties --conf spark.executor.extraJavaOptions=-Dlog4j.configuration=file:////home/harry.li/TPC/benchmarks/spark-tpc-ds-performance-test/work/log4j.properties --conf spark.sql.catalogImplementation=hive
SPARK_MAJOR_VERSION is set to 2, using Spark2
spark-sql> use tpcds;
Time taken: 0.39 seconds
spark-sql> select * from base_table;
1AAAAAAAABAAAAAAAConventional childr9777876516th ParkwaySuite 470FairviewWilliamson CountyTN35709United States-5.0
2AAAAAAAACAAAAAAAImportant issues liv138504600View FirstAvenueSuite PFairviewWilliamson CountyTN35709United States-5.0
3AAAAAAAADAAAAAAADoors canno294242534Ash LaurelDr.Suite 0FairviewWilliamson CountyTN35709United States-5.0
4AAAAAAAAEAAAAAAABad cards must make.621234368Wilson ElmDriveSuite 80FairviewWilliamson CountyTN35709United States-5.0
5AAAAAAAAFAAAAAAANULLNULLNULLNULLNULLNULLFairviewWilliamson CountyTN35709United StatesNULL
Time taken: 2.334 seconds, Fetched 5 row(s)
spark-sql> create table test_with_yarn as (select * from base_table);
Time taken: 2.36 seconds
spark-sql> select * from test_with_yarn;
Time taken: 0.121 seconds
spark-sql> 
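
One way to narrow this down is to compare where each table's data is stored and how many rows each one reports. The following is only a diagnostic sketch, not a confirmed fix: the helper name `inspect_table` and the `SPARK_SQL_BIN` override are hypothetical, introduced here for illustration, while the database and table names come from the session above.

```shell
# Hypothetical helper: print the storage details and the row count of one table.
# SPARK_SQL_BIN is an assumed override for the CLI binary; it defaults to spark-sql.
inspect_table() {
  db="$1"
  table="$2"
  "${SPARK_SQL_BIN:-spark-sql}" --master yarn \
    --conf spark.sql.catalogImplementation=hive \
    -e "USE ${db}; DESCRIBE FORMATTED ${table}; SELECT COUNT(*) FROM ${table};"
}

# Example usage (uncomment to run against the cluster):
# inspect_table tpcds test1
# inspect_table tpcds test_with_yarn
```

If the location reported by DESCRIBE FORMATTED differs between the two tables, or points to a path that is not shared by all nodes, that would be worth investigating further.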
Re: SPARK-SQL running with yarn failed creating table from existing table

Hello, I have to write my dissertation on SQL table creation, and I am researching the SQL select query. Where can I find a complete guide?

Re: SPARK-SQL running with yarn failed creating table from existing table

Basically, Spark SQL is used for structured data processing in Apache Spark. The main advantage of using SQL is that we get more information about the structure of the data.

Re: SPARK-SQL running with yarn failed creating table from existing table

I think you should be able to run this successfully using the Spark2 submit command line in YARN cluster mode on CDH 5.12.

I am not a developer, but I asked a friend and he suggested this answer.

Thanks