Support Questions

Find answers, ask questions, and share your expertise
Announcements
Welcome to the upgraded Community! Read this blog to see What’s New!

Create Hive Table from HDFS files

avatar
Contributor

I sqooped serveral tables from DB2 to hadoop HDFS.  The data landed fine.  I created hive tables using the following format(follows an example table create):

-- CCCE_APL
CREATE EXTERNAL TABLE IF NOT EXISTS ccce_apl_csv(
    APL_LNK INT,
    UPDT_DTTM CHAR(26),
    UPDT_USER CHAR(8),
    RLS_ORDR_MOD_CD CHAR(12),
    RLS_ORDR_MOD_TXT VARCHAR(255) )
ROW FORMAT DELIMITED
STORED AS TEXTFILE
location '/hdfs/data-lake/master/criminal/csv/ccce_apl';

 

The table is successfully created.  Now when I excute the  following query(select * from ccce_apl_csv)  in Hue, I only see NULLs in all the cols:

crct_cs_csv.cs_lnkcrct_cs_csv.yrcrct_cs_csv.tpcrct_cs_csv.seqcrct_cs_csv.cnty_geo_lnkcrct_cs_csv.updt_usercrct_cs_csv.updt_dttmcrct_cs_csv.num_gnrtn_cdcrct_cs_csv.inttng_doc_lnk

  crct_cs_csv.cs_lnk crct_cs_csv.yr crct_cs_csv.tp crct_cs_csv.seq crct_cs_csv.cnty_geo_lnk crct_cs_csv.updt_user crct_cs_csv.updt_dttm crct_cs_csv.num_gnrtn_cd crct_cs_csv.inttng_doc_lnk

1NULLNULLNULLNULLNULLNULLNULLNULLNULL
2NULLNULLNULLNULLNULLNULLNULLNULLNULL
3NULLNULLNULLNULLNULLNULLNULLNULLNULL
4NULLNULLNULLNULLNULLNULLNULLNULLNULL
5NULLNULLNULLNULLNULLNULLNULLNULLNULL
6NULLNULLNULLNULLNULLNULLNULLNULLNULL

 

Any Ideas what went wrong.  Again the data on HDFS looks fine, below sample data:

 

1,2014-06-15 12:49:50.06,ICPSRK ,RMN SM ,\N 2,2014-06-15 15:15:42.424,ICPSRK ,RMN SM ,\N 3,2014-06-16 18:26:29.515,ICPSRK ,RMN SM ,\N 4,2014-06-17 08:20:52.825,C79AJL ,\N,\N 5,2014-06-17 08:56:04.507,C79TEW ,RMN SM ,\N 6,2014-06-17 09:00:02.569,C79TEW ,RMN SM ,\N 7,2014-06-17 10:42:15.601,C79EDN ,OTHR ,\N 8,2014-06-17 11:12:56.218,C79EDN ,OTHR ,SEE FILE #13CR56266 9,2014-06-17 12:43:40.972,C79WBS ,OTHR ,OTHER-$5000 SEC BOND;NUPILLCS/ALC; 22,2014-06-19 14:42:22.799,C79AJL ,RMN SM ,\N
 

 

1 ACCEPTED SOLUTION

avatar
Contributor

Your comment gave me the clue, when I generated the script, I missed the statment that follows:                   ROW FORMAT DELIMITED, namely, -FIELDS TERMINATED BY ','.  So the correct create statement would be:

CREATE EXTERNAL TABLE IF NOT EXISTS ccce_apl(
    APL_LNK INT,
    UPDT_DTTM CHAR(26),
    UPDT_USER CHAR(8),
    RLS_ORDR_MOD_CD CHAR(12),
    RLS_ORDR_MOD_TXT VARCHAR(255) )
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
location '/hdfs/data-lake/master/criminal/csv/ccce_apl';

 

Thanks.

View solution in original post

2 REPLIES 2

avatar
Expert Contributor

Hi imad87,

 

What is the purpose of specifying "ROW FORMAT DELIMITED" without a delimiter character?

 

On first glance it looks like your data file contains the substring "\N" (the \ character followed by the N character) to delimit lines, instead of the "\n" character (ASCII 0xA). Can you double check the file in a hex editor?

 

Cheers, Lars

avatar
Contributor

Your comment gave me the clue, when I generated the script, I missed the statment that follows:                   ROW FORMAT DELIMITED, namely, -FIELDS TERMINATED BY ','.  So the correct create statement would be:

CREATE EXTERNAL TABLE IF NOT EXISTS ccce_apl(
    APL_LNK INT,
    UPDT_DTTM CHAR(26),
    UPDT_USER CHAR(8),
    RLS_ORDR_MOD_CD CHAR(12),
    RLS_ORDR_MOD_TXT VARCHAR(255) )
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
location '/hdfs/data-lake/master/criminal/csv/ccce_apl';

 

Thanks.

Labels