07-07-2021
04:34 PM
@Shelton Thank you for your response. I tried your suggestion, but the table still did not show up in beeline. I checked the MySQL metastore database as well, and the new table's info does not appear there either. I still see the following in the stderr file, even though my hive-site.xml is set to use MySQL:

21/07/07 23:07:56 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY

Not sure if I am missing something else. These are the relevant properties in my hive-site.xml:

<name>hive.metastore.db.type</name>
<value>mysql</value>

<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://<mysql end point>:3306/metastore_db</value>
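For reference, a minimal sketch of the properties a MySQL-backed metastore usually needs in hive-site.xml; the host, user, and password values are placeholders, not taken from the post, and the driver class may differ with newer MySQL connectors (com.mysql.cj.jdbc.Driver). If Spark never actually picks this file up from the classpath, it falls back to the embedded Derby metastore, which is consistent with the DERBY log line above.

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://<mysql-host>:3306/metastore_db</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value><metastore-user></value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value><metastore-password></value>
</property>

The MySQL JDBC driver jar also needs to be on Spark's classpath for the connection to work.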
07-07-2021
09:07 AM
I have a Spark cluster with one master and two workers on different servers. I have copied hive-site.xml into the Spark conf directory on all three servers and started the thrift server on the master, pointing at the Spark master. I use beeline to connect to the thrift server and run Spark SQL queries.

I created a Scala application that loads data from a CSV into a DataFrame and then into a Spark SQL table backed by S3 parquet. The metadata is in MySQL. I run the app on the cluster with spark-submit, but I do not see the table created by the app in beeline. However, when I use spark-shell on the master (connected to the Spark master) and do the same CSV load and table creation, I can see the table in beeline. Am I missing something with the Scala app? Also, do we need HDFS to make this work?

hive.metastore.uris in hive-site.xml is not set currently. I am not sure what to set it to, since I don't have anything running on port 9083.

I started the thrift server (which runs on port 10000) from the Spark sbin directory like this:

/opt/spark/sbin/start-thriftserver.sh --master spark://<master-ip>:7077 --total-executor-cores 1

This is my Scala code (see the note after it about the metastore):
import org.apache.spark.sql.SparkSession

object CsvToHiveTable {
  def main(args: Array[String]): Unit = {
    println("Hello World!")
    val warehouseLocation = "/home/ubuntu/test/hive_warehouse"
    // Build a Hive-enabled session so the table is registered in the metastore
    val spark = SparkSession
      .builder()
      .appName("Spark SQL basic example")
      .config("spark.sql.warehouse.dir", warehouseLocation)
      .enableHiveSupport()
      .getOrCreate()
    // Load the CSV passed as the first argument and expose it as a temp view
    val df = spark.read.format("csv").load(args(0))
    df.createOrReplaceTempView("my_temp_table")
    // CTAS into an S3-backed parquet table (STORED AS must come before LOCATION)
    spark.sql("create table test1_0522 stored as PARQUET location 's3a://<test-bucket>/data/test1_0522' as select * from my_temp_table")
    spark.sql("SHOW TABLES").show()
  }
}
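One thing worth checking: if the driver launched by spark-submit never sees hive-site.xml, enableHiveSupport() silently falls back to a local Derby metastore, so the table lands somewhere beeline never looks. A minimal sketch of pointing the session at a shared metastore explicitly; this assumes a standalone Hive metastore service is listening on port 9083, and <metastore-host> is a placeholder:

import org.apache.spark.sql.SparkSession

// Sketch: point the session at a shared Hive metastore explicitly,
// instead of relying on hive-site.xml being found on the classpath.
// Assumes a metastore service at thrift://<metastore-host>:9083.
val spark = SparkSession
  .builder()
  .appName("Spark SQL basic example")
  .config("hive.metastore.uris", "thrift://<metastore-host>:9083")
  .config("spark.sql.warehouse.dir", "/home/ubuntu/test/hive_warehouse")
  .enableHiveSupport()
  .getOrCreate()

Alternatively, shipping the same hive-site.xml with the job via spark-submit --files /path/to/hive-site.xml, or confirming it sits in $SPARK_HOME/conf on the node where the driver actually runs, should make the app and the thrift server share one metastore, so tables created by either become visible in beeline.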
Labels:
- Apache Spark