Hive External Table Partitioned by Columns (Zero-Byte Files Are Being Generated) Is Not Returning Results


Hi,

Has anyone faced an issue while fetching data from an external table? We are copying data from an upstream system into our S3 storage. As part of the copy, directories are copied along with zero-byte files. The source files are in JSON format and gzip-compressed (.gz). Below is the folder hierarchy:

DATE --> Folder
    DAY=201803250 --> Folder
        1.json.gz --> File
        2.json.gz --> File
    DAY=201803250 --> Empty zero-byte file

Please see the screenshot below.

[Screenshot: 72528-77z5s.png]

We are trying to create an external table with the JSON SerDe:

ADD JAR wasb://jsonserde@XYZ.blob.core.windows.net/json/json-serde-1.3.9.jar;
SET hive.mapred.supports.subdirectories=TRUE;
SET mapred.input.dir.recursive=TRUE;
SET hive.merge.mapfiles = true;
SET hive.merge.mapredfiles = true;
SET hive.merge.tezfiles = true;


DROP TABLE IF EXISTS Ext_STG1;
CREATE EXTERNAL TABLE Ext_STG1 (Col1 STRING, Col2 STRING, Col3 STRING)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ("case.insensitive" = "true", "ignore.malformed.json" = "true")
STORED AS TEXTFILE
LOCATION 'wasb://container1@xyz.blob.core.windows.net/date/day=201803250/'
TBLPROPERTIES ('serialization.null.format' = '');

select * from Ext_STG1 limit 100;
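
The table above points directly at a single day folder and is not actually declared as partitioned. For comparison, a partitioned variant over the same layout might look roughly like the sketch below. The table name Ext_STG1_part and the partition column name day are assumptions for illustration, and the statements are meant to run in the same session as the ADD JAR and SET commands above. The final query uses Hive's INPUT__FILE__NAME virtual column to show which files actually contribute rows, which may help tell whether the zero-byte marker files are involved.

```sql
-- Hypothetical partitioned variant of Ext_STG1 (table and partition column names are assumptions).
-- The table location is the parent folder; each day=... folder is registered as a partition.
CREATE EXTERNAL TABLE IF NOT EXISTS Ext_STG1_part (Col1 STRING, Col2 STRING, Col3 STRING)
PARTITIONED BY (`day` STRING)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ("case.insensitive" = "true", "ignore.malformed.json" = "true")
STORED AS TEXTFILE
LOCATION 'wasb://container1@xyz.blob.core.windows.net/date/';

-- Register the one day folder shown in the hierarchy above as a partition.
ALTER TABLE Ext_STG1_part ADD IF NOT EXISTS PARTITION (`day` = '201803250')
LOCATION 'wasb://container1@xyz.blob.core.windows.net/date/day=201803250/';

-- Count rows per source file to see which files Hive is actually reading.
SELECT INPUT__FILE__NAME, COUNT(*) AS row_count
FROM Ext_STG1_part
WHERE `day` = '201803250'
GROUP BY INPUT__FILE__NAME;
```

If the last query returns rows for 1.json.gz and 2.json.gz, the SerDe and compression handling are working and the problem is more likely in how the location or the zero-byte entries are resolved.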