Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Impala select count(*) returns result, but select * return empty

Impala select count(*) returns result, but select * return empty

Contributor

Hi,

 

I created an exteranl table (HDFS in avor) in HIVE, i can select count in impala and it returns the number, but if i do select * it return 0 row. What might be the issue?

 

Thanks

Shannon

16 REPLIES 16
Highlighted

Re: Impala select count(*) returns result, but select * return empty

Contributor

Can you run refresh <table_name> and then run select * ?

Re: Impala select count(*) returns result, but select * return empty

Contributor


Query: refresh external_traffic_data_10_18_09

Fetched 0 row(s) in 1.09s

 

 

still no data.

 

 

 

Also select * has a warning,

 

WARNINGS: Field STATION_ID is nullable in the file schema but not the table schema.

 

is this related?

 

Thanks

Shannon

Re: Impala select count(*) returns result, but select * return empty

Contributor

Yes, this might be related to the problem

 

Warning suggest that avro schema in data file and avro schema in the table metadata are different.

 

1. download the avro file from hdfs to local 

 

hdfs dfs -get /<path_to_avro_file>

 

2. get the schema out of the file

 

avro-tools getschema <filename>

 

3. get the table schema 

 

show create table <table_name>

 

4. If you had used a avsc file for the schema, can you post its content in the comments and also the result of 2 and 3

 

Re: Impala select count(*) returns result, but select * return empty

Contributor

I pulled the data from oracle with jdbc connect, schema

 

 

{
"type" : "record",
"name" : "ConnectDefault",
"namespace" : "io.confluent.connect.avro",
"fields" : [ {
"name" : "TMS_DATA_ID",
"type" : {
"type" : "bytes",
"connect.version" : 1,
"connect.parameters" : {
"scale" : "0"
},
"connect.name" : "org.apache.kafka.connect.data.Decimal"
}
}, {
"name" : "STATION_ID",
"type" : [ "null", {
"type" : "bytes",
"connect.version" : 1,
"connect.parameters" : {
"scale" : "0"
},
"connect.name" : "org.apache.kafka.connect.data.Decimal"
} ]
}, {
"name" : "VOLUME",
"type" : [ "null", {
"type" : "bytes",
"connect.version" : 1,
"connect.parameters" : {
"scale" : "0"
},
"connect.name" : "org.apache.kafka.connect.data.Decimal"
} ]
}, {
"name" : "SPEED",
"type" : [ "null", {
"type" : "bytes",
"connect.version" : 1,
"connect.parameters" : {
"scale" : "0"
},
"connect.name" : "org.apache.kafka.connect.data.Decimal"
} ]
}, {
"name" : "DATE_CREATED",
"type" : {
"type" : "long",
"connect.version" : 1,
"connect.name" : "org.apache.kafka.connect.data.Timestamp"
}
}, {
"name" : "SAMPLE_TIME",
"type" : [ "null", {
"type" : "long",
"connect.version" : 1,
"connect.name" : "org.apache.kafka.connect.data.Timestamp"
} ]
}, {
"name" : "SEGMENT_ID",
"type" : [ "null", {
"type" : "bytes",
"connect.version" : 1,
"connect.parameters" : {
"scale" : "0"
},
"connect.name" : "org.apache.kafka.connect.data.Decimal"
} ]
}, {
"name" : "SEGMENT_NAME",
"type" : "string"
}, {
"name" : "STATION_NAME",
"type" : "string"
}, {
"name" : "SECTION_NAME",
"type" : "string"
}, {
"name" : "SECTION_ID",
"type" : [ "null", {
"type" : "bytes",
"connect.version" : 1,
"connect.parameters" : {
"scale" : "0"
},
"connect.name" : "org.apache.kafka.connect.data.Decimal"
} ]
} ],
"connect.version" : 1
}

Re: Impala select count(*) returns result, but select * return empty

Contributor

When i create external table in hive, i was using integer for _ID, but it complains when i do select, so i changed to string.

 

 

 

"fields":[
{"name":"TMS_DATA_ID","type":"string"},
{"name":"STATION_ID","type":"string"},
{"name":"VOLUME","type":"string"},
{"name":"SPEED","type":"string"},
{"name":"DATE_CREATED","type":"long"},
{"name":"SAMPLE_TIME","type":"long"},
{"name":"SEGMENT_ID","type":"string"},
{"name":"SEGMENT_NAME","type":"string"},
{"name":"STATION_NAME","type":"string"},
{"name":"SECTION_NAME","type":"string"},
{"name":"SECTION_ID","type":"string"}]}')
;

Re: Impala select count(*) returns result, but select * return empty

Contributor

When i create external table in hive, i was using integer for _ID, but it complains when i do select, so i changed to string.

 

 

 

"fields":[
{"name":"TMS_DATA_ID","type":"string"},
{"name":"STATION_ID","type":"string"},
{"name":"VOLUME","type":"string"},
{"name":"SPEED","type":"string"},
{"name":"DATE_CREATED","type":"long"},
{"name":"SAMPLE_TIME","type":"long"},
{"name":"SEGMENT_ID","type":"string"},
{"name":"SEGMENT_NAME","type":"string"},
{"name":"STATION_NAME","type":"string"},
{"name":"SECTION_NAME","type":"string"},
{"name":"SECTION_ID","type":"string"}]}')
;

Re: Impala select count(*) returns result, but select * return empty

Contributor

https://github.com/cloudera/Impala/blob/b669dfc9cf72c8faedd8b5eab27e7472b090dd1d/be/src/exec/hdfs-av...

per the code, impala expects if a column is nullable in file schema, it should be nullable in the table schema as well


You can have the type as string for table but can you make each column nullable in the avsc file.

For example:

instead of
{"name":"TMS_DATA_ID","type":"string"},

try

{"name":"TMS_DATA_ID","type":["null","string"]}

Do this for all columns

Then create the table using the reframed avsc file and do select * again?

Re: Impala select count(*) returns result, but select * return empty

Contributor

Sorry, not  for all columns.

 

make changes for columns that has nullable value in file schema. 

 

i.e 

 

"name" : "STATION_ID",
"name" : "VOLUME",
"name" : "SPEED",
"name" : "SAMPLE_TIME",
"name" : "SEGMENT_ID",
"name" : "SECTION_ID",

Re: Impala select count(*) returns result, but select * return empty

Contributor

Thank you! It is working now.

 

Now the question, how can i cast/convert the bytes to number?

 

also for time, i remember i can do 

 

cast(sample_time as timestamp)

 

but now it returns null.