Created 06-21-2016 06:19 AM
I am able to download an HDFS file, /org/project/archived/data/hive/warehouse/Stats/2016_06_20.txt, in a browser through Knox using the URL below.
Now I have a file in a Hadoop archive, as shown below.
har:///org/project/archived/data/hive/warehouse/test.har/Stats/2016_06_20.txt
How can I do the same for the above file?
Created 06-21-2016 06:49 AM
@pooja khandelwal: You can use the following approach to copy the archived file to your local machine.
hadoop fs -text har:///org/project/archived/data/hive/warehouse/test.har/Stats/2016_06_20.txt > 2016_06_20.txt
Created 06-21-2016 07:06 AM
I want to download the file through the browser only.
Created 06-21-2016 07:17 AM
First download the HAR's _index file, located at /org/project/archived/data/hive/warehouse/test.har/_index. Then find the entry for Stats/2016_06_20.txt in _index and note three values: the data-n file that holds it, the offset within that data file, and the file length. Suppose it is in data-0 with offset=125000 and file-length=8200; then you can access
http://hostname:8443/knox/nm1/webhdfs/v1/org/project/archived/data/hive/warehouse/test.har/data-0?op...
Check this nicely written blog for a full example and a PHP script which can automate the process.
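For anyone who prefers to script the lookup, here is a rough Python sketch of the steps above. It is only an illustration under some assumptions. It assumes each file line in _index is space-separated as "URL-encoded path, the word file, data file name, offset, length, properties", which is how I understand the HAR index layout; check it against your own _index. The query string in the URL above is cut off, so the sketch assumes the standard WebHDFS op=OPEN call with its offset and length parameters. The helper names (find_har_entry, build_open_url) and the sample index lines are made up for the example.

```python
from urllib.parse import unquote, urlencode


def find_har_entry(index_text, wanted_path):
    """Return (data_file, offset, length) for wanted_path, or None if absent.

    Assumed line layout (verify against your own _index):
    <url-encoded path> file <data file> <offset> <length> <properties>
    """
    for line in index_text.splitlines():
        fields = line.split(" ")
        # Skip directory entries and anything too short to be a file entry.
        if len(fields) < 5 or fields[1] != "file":
            continue
        if unquote(fields[0]) == wanted_path:
            return fields[2], int(fields[3]), int(fields[4])
    return None


def build_open_url(base_url, har_path, data_file, offset, length):
    """Build a WebHDFS OPEN URL that reads only one byte range of a data file."""
    query = urlencode({"op": "OPEN", "offset": offset, "length": length})
    return f"{base_url}{har_path}/{data_file}?{query}"


# Made-up sample standing in for the downloaded _index contents.
sample_index = (
    "%2F dir none 0 0 Stats\n"
    "%2FStats dir none 0 0 2016_06_20.txt\n"
    "%2FStats%2F2016_06_20.txt file data-0 125000 8200 props\n"
)

entry = find_har_entry(sample_index, "/Stats/2016_06_20.txt")
url = build_open_url(
    "http://hostname:8443/knox/nm1/webhdfs/v1",
    "/org/project/archived/data/hive/warehouse/test.har",
    *entry,
)
print(url)
```

Pasting the printed URL into the browser should then return just the bytes of the archived file, provided the index layout and gateway path match your cluster.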
Created 06-22-2016 03:10 AM
Thank you.