
Can anybody help with the Spark syntax for defining the schema, parsing, creating a DataFrame, and registering the data as a table to be queried in SQL?

New Contributor

Here is the log data, stored as a text file. Can anybody help with the syntax for defining the schema, parsing the lines, creating a DataFrame, and registering the data as a table that can be queried in SQL?

45AEDRS_3423.XT,8/27/2013 5:12:52 PM,2.000000,150,locat

45AEDRS_3423.XT,8/27/2013 5:12:53 PM,2.000000,150,locat

45AEDRS_3423.XT,8/27/2013 5:12:54 PM,2.000000,150,locat

45AEDRS_3423.XT,8/27/2013 5:12:55 PM,2.000000,150,locat

45AEDRS_3423.XT,8/27/2013 5:12:56 PM,2.000000,150,locat

45AEDRS_3423.XT,8/27/2013 5:12:57 PM,2.000000,150,locat

45AEDRS_3423.XT,8/27/2013 5:12:58 PM,2.000000,150,locat

45AEDRS_3423.XT,8/27/2013 5:12:59 PM,2.000000,150,locat.

Re: Can anybody help with the Spark syntax for defining the schema, parsing, creating a DataFrame, and registering the data as a table to be queried in SQL?

Rising Star

If you want to read it into a DataFrame, you can simply use the spark-csv library by Databricks: https://github.com/databricks/spark-csv. If you would rather have a table that you can query with SQL at any time via Spark SQL or Hive, you can create an external table through Hive or the Spark SQL Thrift server with the following syntax:

CREATE EXTERNAL TABLE IF NOT EXISTS your_table (
        ...)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION '<the HDFS location of your data>';
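For the DataFrame route the question asks about, here is a minimal PySpark sketch. It rests on several assumptions: the column names (tag, event_time, value, code, location) are hypothetical, since the thread never names the fields; the timestamp is assumed to be month/day/year with a 12-hour clock; and the code assumes Spark 2.0 or later, where SparkSession and createOrReplaceTempView are available (on Spark 1.x the equivalents would be SQLContext and registerTempTable). The parsing step is plain Python, so it can be checked without a cluster.

```python
from datetime import datetime


def parse_line(line):
    """Split one CSV log line into a typed tuple.

    Field order follows the sample data; the field meanings are assumptions.
    """
    tag, ts, value, code, location = line.strip().split(",")
    # Assumed timestamp layout: month/day/year, 12-hour clock with AM/PM
    event_time = datetime.strptime(ts, "%m/%d/%Y %I:%M:%S %p")
    return (tag, event_time, float(value), int(code), location)


def register_log_table(spark, path, table_name="logs"):
    """Read the text file, apply an explicit schema, and register a temp view.

    `spark` is an existing SparkSession; `path` and `table_name` are
    placeholders chosen for this sketch.
    """
    # Imported here so that parse_line can be used without PySpark installed
    from pyspark.sql.types import (
        StructType, StructField, StringType,
        TimestampType, DoubleType, IntegerType,
    )

    # Hypothetical column names; adjust them to the real meaning of each field
    schema = StructType([
        StructField("tag", StringType(), True),
        StructField("event_time", TimestampType(), True),
        StructField("value", DoubleType(), True),
        StructField("code", IntegerType(), True),
        StructField("location", StringType(), True),
    ])

    rows = (
        spark.sparkContext.textFile(path)
        .filter(lambda line: line.strip())  # skip blank lines
        .map(parse_line)
    )
    df = spark.createDataFrame(rows, schema)
    df.createOrReplaceTempView(table_name)  # makes the data queryable by name
    return df
```

With a SparkSession named spark, calling register_log_table(spark, "<the HDFS location of your data>") should make the data queryable with something like spark.sql("SELECT tag, event_time, value FROM logs WHERE code = 150").show(). Malformed lines would raise an error inside parse_line, so real log files may need extra validation.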

Re: Can anybody help with the Spark syntax for defining the schema, parsing, creating a DataFrame, and registering the data as a table to be queried in SQL?

New Contributor

Marco, thank you very much. I will take a look.
