<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Cannot read parquet files in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51452#M5933</link>
    <description>&lt;P&gt;This is the result:&lt;/P&gt;&lt;PRE&gt;/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/parquet/bin/parquet-tools cat /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0
File /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0 does not exist&lt;/PRE&gt;&lt;P&gt;But the file exists&lt;/P&gt;&lt;PRE&gt;sudo -u hdfs hdfs dfs -ls /user/spot/flow/hive/y=2017/m=02/d=24/h=20
Found 12 items
&lt;FONT color="#FF0000"&gt;-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:05 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0&lt;/FONT&gt;
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:10 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_1
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:55 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_10
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 18:00 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_11
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:15 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_2
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:20 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_3
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:25 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_4
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:30 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_5
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:35 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_6
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:40 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_7
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:45 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_8
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:50 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_9&lt;/PRE&gt;</description>
    <pubDate>Fri, 24 Feb 2017 21:42:20 GMT</pubDate>
    <dc:creator>JoaquinS</dc:creator>
    <dc:date>2017-02-24T21:42:20Z</dc:date>
    <item>
      <title>Cannot read parquet files</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51443#M5929</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a table that is pointing to a location in HDFS, like this:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;# col_name              data_type               comment

treceived               string
unix_tstamp             bigint
tryear                  int
trmonth                 int
trday                   int
trhour                  int
trminute                int
trsec                   int
tdur                    float
sip                     string
dip                     string
sport                   int
dport                   int
proto                   string
flag                    string
fwd                     int
stos                    int
ipkt                    bigint
ibyt                    bigint
opkt                    bigint
obyt                    bigint
input                   int
output                  int
sas                     int
das                     int
dtos                    int
dir                     int
rip                     string

# Partition Information
# col_name              data_type               comment

y                       int
m                       int
d                       int
h                       int

# Detailed Table Information
Database:               spotdb
Owner:                  spot
CreateTime:             Thu Feb 23 16:41:20 CLST 2017
LastAccessTime:         UNKNOWN
Protect Mode:           None
Retention:              0
Location:               hdfs://HDFS-namenode:8020/user/spot/flow/hive
Table Type:             EXTERNAL_TABLE
Table Parameters:
        EXTERNAL                TRUE
        avro.schema.literal     {\n    \"type\":   \"record\"\n  , \"name\":   \"FlowRecord\"\n  , \"namespace\" : \"com.cloudera.accelerators.flows.avro\"\n  , \"fields\": [\n        {\"name\": \"treceived\",                  \"type\":[\"string\",   \"null\"]}\n     ,  {\"name\": \"unix_tstamp\",                 \"type\":[\"long\",     \"null\"]}\n     ,  {\"name\": \"tryear\",                    \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"trmonth\",                    \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"trday\",                    \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"trhour\",                    \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"trminute\",                    \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"trsec\",                    \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"tdur\",                    \"type\":[\"float\",   \"null\"]}\n     ,  {\"name\": \"sip\",              \"type\":[\"string\",   \"null\"]}\n     ,  {\"name\": \"sport\",                 \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"dip\",         \"type\":[\"string\",   \"null\"]}\n     ,  {\"name\": \"dport\",        \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"proto\",            \"type\":[\"string\",   \"null\"]}\n     ,  {\"name\": \"flag\",            \"type\":[\"string\",   \"null\"]}\n     ,  {\"name\": \"fwd\",                 \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"stos\",                 \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"ipkt\",                 \"type\":[\"bigint\",   \"null\"]}\n     ,  {\"name\": \"ibytt\",                 \"type\":[\"bigint\",   \"null\"]}\n     ,  {\"name\": \"opkt\",                 \"type\":[\"bigint\",   \"null\"]}\n     ,  {\"name\": \"obyt\",                 \"type\":[\"bigint\",   \"null\"]}\n     ,  {\"name\": \"input\",                 \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"output\",                 \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"sas\",                 \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"das\",                 \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"dtos\",                 \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"dir\",                 \"type\":[\"int\",   \"null\"]}\n     ,  {\"name\": \"rip\",                    \"type\":[\"string\",   \"null\"]}\n  ]\n}
        transient_lastDdlTime   1487878880

# Storage Information
SerDe Library:          org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat:            org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
Compressed:             No
Num Buckets:            -1
Bucket Columns:         []
Sort Columns:           []
Storage Desc Params:
        field.delim             ,
        serialization.format    ,&lt;/PRE&gt;&lt;P&gt;But when I select that table it says that it's empty. I checked that HDFS loacation and it has parquet files:&lt;/P&gt;&lt;PRE&gt;sudo -u hdfs hdfs dfs -ls /user/spot/flow/hive/y=2017/m=02/d=23/h=23
Found 12 items
-rwxr-xr-x   3 spot supergroup        440 2017-02-23 20:05 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0
-rwxr-xr-x   3 spot supergroup        440 2017-02-23 20:10 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_1
-rwxr-xr-x   3 spot supergroup        440 2017-02-23 20:55 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_10
-rwxr-xr-x   3 spot supergroup        440 2017-02-23 21:00 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_11
-rwxr-xr-x   3 spot supergroup        440 2017-02-23 20:15 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_2
-rwxr-xr-x   3 spot supergroup        440 2017-02-23 20:20 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_3
-rwxr-xr-x   3 spot supergroup        440 2017-02-23 20:25 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_4
-rwxr-xr-x   3 spot supergroup        440 2017-02-23 20:30 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_5
-rwxr-xr-x   3 spot supergroup        440 2017-02-23 20:35 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_6
-rwxr-xr-x   3 spot supergroup        440 2017-02-23 20:40 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_7
-rwxr-xr-x   3 spot supergroup        440 2017-02-23 20:45 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_8
-rwxr-xr-x   3 spot supergroup        440 2017-02-23 20:50 /user/spot/flow/hive/y=2017/m=02/d=23/h=23/000000_0_copy_9&lt;/PRE&gt;&lt;P&gt;What did i do wrong?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Joaquín Silva&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 11:09:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51443#M5929</guid>
      <dc:creator>JoaquinS</dc:creator>
      <dc:date>2022-09-16T11:09:18Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot read parquet files</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51446#M5930</link>
      <description>What is the output of 'show partitions &amp;lt;tablename&amp;gt;;'?&lt;BR /&gt;&lt;BR /&gt;I suspect that HMS does not know about the partitions.&lt;BR /&gt;&lt;BR /&gt;If you don't see any partitions in the output you can add them manually using one of the below commands.&lt;BR /&gt;&lt;BR /&gt;This will add all missing partition that it finds in HDFS.&lt;BR /&gt;&lt;BR /&gt;msck repair &amp;lt;tablename&amp;gt;;&lt;BR /&gt;&lt;BR /&gt;This will add one at a time:&lt;BR /&gt;&lt;BR /&gt;alter table &amp;lt;tablename&amp;gt; add partition (y=2017,m=02,d=23,h=23);</description>
      <pubDate>Fri, 24 Feb 2017 21:05:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51446#M5930</guid>
      <dc:creator>mbigelow</dc:creator>
      <dc:date>2017-02-24T21:05:17Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot read parquet files</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51448#M5931</link>
      <description>&lt;P&gt;This is the result:&lt;/P&gt;&lt;PRE&gt;show partitions spotdb.flow;
OK
y=2017/m=02/d=23/h=20
y=2017/m=02/d=23/h=21
y=2017/m=02/d=23/h=22
y=2017/m=02/d=23/h=23
y=2017/m=02/d=24/h=00
y=2017/m=02/d=24/h=01
y=2017/m=02/d=24/h=02
y=2017/m=02/d=24/h=03
y=2017/m=02/d=24/h=04
y=2017/m=02/d=24/h=05
y=2017/m=02/d=24/h=06
y=2017/m=02/d=24/h=07
y=2017/m=02/d=24/h=08
y=2017/m=02/d=24/h=09
y=2017/m=02/d=24/h=10
y=2017/m=02/d=24/h=11
y=2017/m=02/d=24/h=12
y=2017/m=02/d=24/h=13
y=2017/m=02/d=24/h=14
y=2017/m=02/d=24/h=15
y=2017/m=02/d=24/h=16
y=2017/m=02/d=24/h=17
y=2017/m=02/d=24/h=18
y=2017/m=02/d=24/h=19
y=2017/m=02/d=24/h=20&lt;/PRE&gt;&lt;P&gt;As I can see it recognize the partitions.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 24 Feb 2017 21:08:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51448#M5931</guid>
      <dc:creator>JoaquinS</dc:creator>
      <dc:date>2017-02-24T21:08:49Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot read parquet files</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51451#M5932</link>
      <description>Try to read the file directly. There are other parquet tool options as well to validate the schema, etc.&lt;BR /&gt;&lt;BR /&gt;/opt/cloudera/parcels/CDH/lib/parquet/bin/parquet-tools cat /path/to/parquet.file</description>
      <pubDate>Fri, 24 Feb 2017 21:19:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51451#M5932</guid>
      <dc:creator>mbigelow</dc:creator>
      <dc:date>2017-02-24T21:19:23Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot read parquet files</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51452#M5933</link>
      <description>&lt;P&gt;This is the result:&lt;/P&gt;&lt;PRE&gt;/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/parquet/bin/parquet-tools cat /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0
File /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0 does not exist&lt;/PRE&gt;&lt;P&gt;But the file exists&lt;/P&gt;&lt;PRE&gt;sudo -u hdfs hdfs dfs -ls /user/spot/flow/hive/y=2017/m=02/d=24/h=20
Found 12 items
&lt;FONT color="#FF0000"&gt;-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:05 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0&lt;/FONT&gt;
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:10 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_1
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:55 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_10
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 18:00 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_11
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:15 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_2
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:20 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_3
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:25 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_4
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:30 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_5
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:35 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_6
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:40 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_7
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:45 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_8
-rwxr-xr-x   3 spot supergroup        440 2017-02-24 17:50 /user/spot/flow/hive/y=2017/m=02/d=24/h=20/000000_0_copy_9&lt;/PRE&gt;</description>
      <pubDate>Fri, 24 Feb 2017 21:42:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51452#M5933</guid>
      <dc:creator>JoaquinS</dc:creator>
      <dc:date>2017-02-24T21:42:20Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot read parquet files</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51454#M5934</link>
      <description>Are sure they parquet files? I haven't used the parquet tools in a bit so I don't know the behavior of trying to read a non-parquet file with them. I could see it view it as not existing if that is the case and it would also be why you can see the data through the table.&lt;BR /&gt;&lt;BR /&gt;How was this data added to HDFS/table?&lt;BR /&gt;Is this a Hive or Impala table?</description>
      <pubDate>Fri, 24 Feb 2017 21:50:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51454#M5934</guid>
      <dc:creator>mbigelow</dc:creator>
      <dc:date>2017-02-24T21:50:46Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot read parquet files</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51457#M5935</link>
      <description>I found the issue, the parquet files where empty.</description>
      <pubDate>Fri, 24 Feb 2017 22:32:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Cannot-read-parquet-files/m-p/51457#M5935</guid>
      <dc:creator>JoaquinS</dc:creator>
      <dc:date>2017-02-24T22:32:06Z</dc:date>
    </item>
  </channel>
</rss>

