I am trying to use avro-tools to get the Avro schema from an Avro file, but I get a message saying the file doesn't exist. If I run "hadoop fs -ls" I can see the file, and I can even view its contents by giving the path. But when I run "avro-tools getschema <avro file path>", I get the "file doesn't exist" error. Has anybody seen this error before? I have also attached a screenshot of the error.
Copy the .avro file from HDFS to the local file system with the command below:
[cloudera@quickstart~] hdfs dfs -get /user/Mahe/custom_retail_db/orders_AVRO/part-m-00000.avro
The hdfs dfs -get command copies a file from HDFS to the local file system. With no destination given, the file lands in the current directory.
[cloudera@quickstart~] ls -ltr    # list all files, sorted by modification time
[cloudera@quickstart~] avro-tools getschema part-m-00000.avro
The avro-tools utility expects the file to be on the local file system, not in HDFS. That is why you get the error saying part-m-00000.avro does not exist.
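The copy-then-inspect steps above can be wrapped in one shell function. This is only a minimal sketch, assuming hdfs and avro-tools are both on the PATH; the function name schema_via_local_copy is made up for illustration and is not part of either tool.

```shell
# Hypothetical helper: copy an Avro file out of HDFS into a temporary
# local directory, then print its schema with avro-tools.
#   $1 = HDFS path of the .avro file
schema_via_local_copy() {
    hdfs_path="$1"
    tmp_dir="$(mktemp -d)" || return 1
    local_file="$tmp_dir/$(basename "$hdfs_path")"

    # avro-tools reads from the local file system, so fetch the file first.
    if hdfs dfs -get "$hdfs_path" "$local_file"; then
        avro-tools getschema "$local_file"
        status=$?
    else
        echo "could not copy $hdfs_path from HDFS" >&2
        status=1
    fi

    # Remove the temporary copy whether or not the schema was printed.
    rm -rf "$tmp_dir"
    return "$status"
}
```

Called as schema_via_local_copy /user/Mahe/custom_retail_db/orders_AVRO/part-m-00000.avro, it prints the schema to standard output and leaves no local copy behind.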
If you want to get the schema directly from the file in HDFS instead of copying it to the local file system, you need to download the avro-tools jar from the link below.
Click the Download button; once the download is complete, move avro-tools-1.8.1.jar to the local file system.
[cloudera@quickstart~] hadoop jar <path-to>/avro-tools-1.8.1.jar getschema /user/Mahe/custom_retail_db/orders_AVRO/part-m-00000.avro | hdfs dfs -put -f - /user/Mahe/avro_schema.avsc
This command reads the file from its HDFS path, extracts the schema, and stores it in the avro_schema.avsc file in the /user/Mahe directory.
[cloudera@quickstart~] hdfs dfs -cat /user/Mahe/avro_schema.avsc    # print the contents of avro_schema.avsc
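The HDFS-side pipeline above can also be put into a small function. This is a hedged sketch rather than a definitive script: the name schema_to_hdfs and its argument order are made up for illustration, and it assumes hadoop and hdfs are on the PATH. It differs from the one-line pipeline in that it captures the schema first, so a failed getschema does not overwrite the destination with an empty file.

```shell
# Hypothetical helper: extract the schema of an Avro file stored in HDFS
# and write it to another HDFS path.
#   $1 = local path to the avro-tools jar
#   $2 = HDFS path of the .avro file
#   $3 = HDFS path for the resulting .avsc file
schema_to_hdfs() {
    jar="$1"
    src="$2"
    dst="$3"

    # Capture the schema first so a failed extraction does not
    # overwrite the destination with empty content.
    schema="$(hadoop jar "$jar" getschema "$src")" || return 1

    # "-put -f -" reads from standard input and overwrites the target.
    printf '%s\n' "$schema" | hdfs dfs -put -f - "$dst"
}
```

For example: schema_to_hdfs <path-to>/avro-tools-1.8.1.jar /user/Mahe/custom_retail_db/orders_AVRO/part-m-00000.avro /user/Mahe/avro_schema.avsc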
Either way you get the schema from the .avro file; choose whichever approach suits your case.