Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Can we create external hive table on top of Fixed width file?

Solved Go to solution
Highlighted

Can we create external hive table on top of Fixed width file?

Can we create external hive table on top of Fixed width file? If yes then how it can be done.

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Can we create external hive table on top of Fixed width file?

Rising Star

Use Regex Serde to create the external table like below

CREATE EXTERNAL TABLE customers (userid STRING, fb_id STRING, twitter_id STRING, status STRING) ROW FORMAT SERDE ‘org.apache.hadoop.hive.contrib.serde2.RegexSerDe’ WITH SERDEPROPERTIES (“input.regex” = “(.{10})(.{10})(.{10})(.2})” )

LOCATION ‘path/to/data’;

Reference: http://www.confusedcoders.com/bigdata/hive/use-hive-serde-for-fixed-length-index-based-strings

View solution in original post

6 REPLIES 6
Highlighted

Re: Can we create external hive table on top of Fixed width file?

Super Collaborator

@Bala Vignesh N V what do you mean by 'Fixed width file' ? Can you give a more detailed example?

Highlighted

Re: Can we create external hive table on top of Fixed width file?

@Sergey SoldatovFixed width files where there will not be any delimiters available. Each column data will be avilable in the specific length but with no delimiters.

Re: Can we create external hive table on top of Fixed width file?

Rising Star

Use Regex Serde to create the external table like below

CREATE EXTERNAL TABLE customers (userid STRING, fb_id STRING, twitter_id STRING, status STRING) ROW FORMAT SERDE ‘org.apache.hadoop.hive.contrib.serde2.RegexSerDe’ WITH SERDEPROPERTIES (“input.regex” = “(.{10})(.{10})(.{10})(.2})” )

LOCATION ‘path/to/data’;

Reference: http://www.confusedcoders.com/bigdata/hive/use-hive-serde-for-fixed-length-index-based-strings

View solution in original post

Highlighted

Re: Can we create external hive table on top of Fixed width file?

Hi @Bala Vignesh N V.

Lester Martin has an excellent Pig script to do this type of work. It is not an external table solution but a good way to do this type of work... https://martin.atlassian.net/wiki/pages/viewpage.action?pageId=21299205

Highlighted

Re: Can we create external hive table on top of Fixed width file?

New Contributor

****** SERDE for MULTI LINE delimiter: ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'

WITH SERDEPROPERTIES ("field.delim"="\\s+") *************************************************************************** ====================================================================================================================================== CREATE EXTERNAL TABLE weather (USAF INT, WBAN INT, `Date` STRING, DIR STRING, SPD INT, GUS INT, CLG INT, SKC STRING, L STRING, M STRING, H STRING, VSB DECIMAL, MW1 STRING, MW2 STRING, MW3 STRING, MW4 STRING, AW1 STRING, AW2 STRING, AW3 STRING, AW4 STRING, W STRING, TEMP INT, DEWP INT, SLP DECIMAL, ALT DECIMAL, STP DECIMAL, MAX INT, MIN INT, PCP01 DECIMAL, PCP06 DECIMAL, PCP24 DECIMAL, PCPXX DECIMAL, SD INT) COMMENT 'weather table' ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES ("field.delim"="\\s+") STORED AS TEXTFILE LOCATION '/data/Weather';

Highlighted

Re: Can we create external hive table on top of Fixed width file?

@bpreachukCan we not load the data into a single column of a stage table. And then use split() function to divide the string into multiple columns. Does this approach have any performance issues that would make us choose Pig / Serde approach instead ?Please share if you see any issues with this approach.

Don't have an account?
Coming from Hortonworks? Activate your account here