Issue in using avro-tools


I am trying to use avro-tools to get the Avro schema from an Avro file, but I get a message saying the file doesn't exist. If I run "hadoop fs -ls" I can see the file, and I can even view the file's content by giving its path. But when I run "avro-tools getschema <avro file path>" I get the "file doesn't exist" error. Has anybody seen this type of error before? I have also attached a screenshot of the error.




Method 1:-

Copy the .avro file from HDFS to the local file system using the command below:

[cloudera@quickstart ~]$ hdfs dfs -get /user/Mahe/custom_retail_db/orders_AVRO/part-m-00000.avro

The hdfs dfs -get command copies a file from HDFS to the local file system.


[cloudera@quickstart ~]$ ls -ltr   # list all files, sorted by modification time
[cloudera@quickstart ~]$ avro-tools getschema part-m-00000.avro

When invoked this way, the avro-tools utility expects the file to be on the local file system, not in HDFS. That is why you are getting the "part-m-00000.avro does not exist" error.
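The steps of Method 1 can be sketched as a small dry-run helper. This is only an illustration: plan_getschema is a hypothetical function name (not part of avro-tools), it only prints the commands rather than running them, and it assumes that any path not found locally refers to a file in HDFS.

```shell
#!/bin/sh
# Sketch: print the commands needed to read the schema of an Avro file.
# plan_getschema is a hypothetical helper name, not part of avro-tools.
plan_getschema() {
    path="$1"
    if [ -f "$path" ]; then
        # The file is already on the local file system, so it can be read directly.
        echo "avro-tools getschema $path"
    else
        # Assumption: a path that is not found locally refers to a file in HDFS.
        # Copy it into the current directory first, then read the local copy.
        echo "hdfs dfs -get $path ."
        echo "avro-tools getschema $(basename "$path")"
    fi
}

# Example call with the path from this thread.
plan_getschema /user/Mahe/custom_retail_db/orders_AVRO/part-m-00000.avro
```

Printing the commands first makes it easy to check whether the path is being treated as local or as HDFS before anything is copied.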


Method 2:-

If you want to get the schema from the HDFS file directly instead of copying the file to the local file system, you need to download the avro-tools dependencies from the link below.


Click the Download button. Once the download is complete, move avro-tools-1.8.1.jar to the local file system.

Then run:

[cloudera@quickstart ~]$ hadoop jar <path-to>/avro-tools-1.8.1.jar getschema /user/Mahe/custom_retail_db/orders_AVRO/part-m-00000.avro | hdfs dfs -put -f - /user/Mahe/avro_schema.avsc

This command reads the file from its HDFS path, extracts the schema, and stores it in the avro_schema.avsc file in the /user/Mahe directory.

[cloudera@quickstart ~]$ hdfs dfs -cat /user/Mahe/avro_schema.avsc   # print the contents of avro_schema.avsc
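To reuse the Method 2 pipeline for other files, it can be parameterized. The following is a minimal sketch: build_schema_pipeline is a hypothetical helper name, the jar location ./avro-tools-1.8.1.jar is an assumed placeholder for wherever you saved the jar, and the function only prints the command so you can review it before running it.

```shell
#!/bin/sh
# Sketch: build the Method 2 pipeline for any jar, input file, and output path.
# build_schema_pipeline is a hypothetical helper; it only prints the command.
build_schema_pipeline() {
    jar="$1"      # local path to the avro-tools jar
    input="$2"    # HDFS path of the .avro file
    output="$3"   # HDFS path where the schema should be stored
    echo "hadoop jar $jar getschema $input | hdfs dfs -put -f - $output"
}

# Example call; the jar location here is an assumed placeholder.
build_schema_pipeline ./avro-tools-1.8.1.jar \
    /user/Mahe/custom_retail_db/orders_AVRO/part-m-00000.avro \
    /user/Mahe/avro_schema.avsc
```

Once the printed command looks right, it can be copied to the prompt and run as in the example above.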

Either way you can get the schema from the .avro file; choose the method that best fits your case.

Re: Issue in using avro-tools


Thanks, Shu. The problem was fixed after I moved the file to the local file system. Thanks for the response.

