SPARK-SQL running with yarn failed creating table from existing table

I have a 4-node cluster. I ran spark-sql and created a table with 5 rows. I am trying to duplicate this table using "create table new_table as (select * from old_table);". The new table is created, but it has no content. If I start spark-sql without the --master yarn option, however, the new table has all the data from the old table. The screen dump below first shows the run without YARN, where the new table is created correctly, and then the same steps repeated with the YARN option, where the new table is empty.

I do not see any errors in the Ambari dashboard.

spark@msl-dpe-perf88:/home/harry.li/TPC/benchmarks/spark-tpc-ds-performance-test$ spark-sql  --executor-memory 20G --num-executors 4   --driver-java-options -Dlog4j.configuration=file:////home/harry.li/TPC/benchmarks/spark-tpc-ds-performance-test/work/log4j.properties --conf spark.executor.extraJavaOptions=-Dlog4j.configuration=file:////home/harry.li/TPC/benchmarks/spark-tpc-ds-performance-test/work/log4j.properties --conf spark.sql.catalogImplementation=hive
SPARK_MAJOR_VERSION is set to 2, using Spark2
spark-sql> use tpcds;
Time taken: 1.164 seconds
spark-sql> select * from base_table;
1AAAAAAAABAAAAAAAConventional childr9777876516th ParkwaySuite 470FairviewWilliamson CountyTN35709United States-5.0
2AAAAAAAACAAAAAAAImportant issues liv138504600View FirstAvenueSuite PFairviewWilliamson CountyTN35709United States-5.0
3AAAAAAAADAAAAAAADoors canno294242534Ash LaurelDr.Suite 0FairviewWilliamson CountyTN35709United States-5.0
4AAAAAAAAEAAAAAAABad cards must make.621234368Wilson ElmDriveSuite 80FairviewWilliamson CountyTN35709United States-5.0
5AAAAAAAAFAAAAAAANULLNULLNULLNULLNULLNULLFairviewWilliamson CountyTN35709United StatesNULL
Time taken: 1.125 seconds, Fetched 5 row(s)
spark-sql> create table test1 as (select * from base_table);
Time taken: 0.777 seconds
spark-sql> select * from test1;
1AAAAAAAABAAAAAAAConventional childr9777876516th ParkwaySuite 470FairviewWilliamson CountyTN35709United States-5.0
2AAAAAAAACAAAAAAAImportant issues liv138504600View FirstAvenueSuite PFairviewWilliamson CountyTN35709United States-5.0
3AAAAAAAADAAAAAAADoors canno294242534Ash LaurelDr.Suite 0FairviewWilliamson CountyTN35709United States-5.0
4AAAAAAAAEAAAAAAABad cards must make.621234368Wilson ElmDriveSuite 80FairviewWilliamson CountyTN35709United States-5.0
5AAAAAAAAFAAAAAAANULLNULLNULLNULLNULLNULLFairviewWilliamson CountyTN35709United StatesNULL
Time taken: 0.14 seconds, Fetched 5 row(s)
spark-sql> quit;
spark@msl-dpe-perf88:/home/harry.li/TPC/benchmarks/spark-tpc-ds-performance-test$ spark-sql --master yarn  --executor-memory 20G --num-executors 4   --driver-java-options -Dlog4j.configuration=file:////home/harry.li/TPC/benchmarks/spark-tpc-ds-performance-test/work/log4j.properties --conf spark.executor.extraJavaOptions=-Dlog4j.configuration=file:////home/harry.li/TPC/benchmarks/spark-tpc-ds-performance-test/work/log4j.properties --conf spark.sql.catalogImplementation=hive
SPARK_MAJOR_VERSION is set to 2, using Spark2
spark-sql> use tpcds;
Time taken: 0.39 seconds
spark-sql> select * from base_table;
1AAAAAAAABAAAAAAAConventional childr9777876516th ParkwaySuite 470FairviewWilliamson CountyTN35709United States-5.0
2AAAAAAAACAAAAAAAImportant issues liv138504600View FirstAvenueSuite PFairviewWilliamson CountyTN35709United States-5.0
3AAAAAAAADAAAAAAADoors canno294242534Ash LaurelDr.Suite 0FairviewWilliamson CountyTN35709United States-5.0
4AAAAAAAAEAAAAAAABad cards must make.621234368Wilson ElmDriveSuite 80FairviewWilliamson CountyTN35709United States-5.0
5AAAAAAAAFAAAAAAANULLNULLNULLNULLNULLNULLFairviewWilliamson CountyTN35709United StatesNULL
Time taken: 2.334 seconds, Fetched 5 row(s)
spark-sql> create table test_with_yarn as (select * from base_table);
Time taken: 2.36 seconds
spark-sql> select * from test_with_yarn;
Time taken: 0.121 seconds
spark-sql>
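
One way to narrow down the difference between the two runs is to compare where each new table keeps its data. The statements below are a minimal sketch, assuming they are run in the same spark-sql session started with --master yarn and that the table names are the ones from the screen dump above. They only inspect metadata and row counts; they do not change anything and are not a confirmed fix.

```sql
-- Print the warehouse directory this session is using.
SET spark.sql.warehouse.dir;

-- Show table metadata, including the Location line, for the table
-- created without YARN and for the one created with YARN.
DESCRIBE FORMATTED test1;
DESCRIBE FORMATTED test_with_yarn;

-- Confirm the row counts seen by this session.
SELECT count(*) FROM test1;
SELECT count(*) FROM test_with_yarn;
```

If the two Location values point to different filesystems or directories, that would suggest the data was written somewhere the session is not reading from, but this is only a guess until the output is compared.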