The products.tsv file could not be loaded correctly using the script in the tutorial

New Contributor

Try it out yourself: the extracted file is not even named products.tsv; it is urlmap.tsv instead. Even if I proceed to the end of this tutorial, it displays the wrong results.

http://hortonworks.com/hadoop-tutorial/loading-data-into-the-hortonworks-sandbox/

2 REPLIES

Re: The products.tsv file could not be loaded correctly using the script in the tutorial

Super Guru
@bruce lee

I hope you have downloaded RefineDemoData.zip mentioned in Step 1.

As per step 2.6:

2.6) Now, we navigate to /tmp/maria_dev, click on upload and browse the Omniture.0.tsv.
Repeat this procedure for users.tsv file and for products.tsv. 

You need to extract the files from RefineDemoData.zip and upload the required ones.

See below:

[Screenshot: 3492-screen-shot-2016-04-17-at-103815-am.png]

Re: The products.tsv file could not be loaded correctly using the script in the tutorial

New Contributor

@bruce lee I found the same problem - are you using WinRAR or similar to extract the files?

Gzip files normally extract to the name of the archive minus the .gz extension, but the original filename is also stored inside the compressed file. The standard *nix command-line gunzip tool has a -N option to extract using that original filename, and WinRAR does this by default. The .gz files in RefineDemoData.zip were renamed after compression, and the demo steps require the new names, not the original ones.
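To check whether an archive carries a different stored name, the gzip header can be read directly. The following is a minimal Python sketch based on the gzip header layout in RFC 1952; the function name `stored_gzip_name` is my own, not something from the tutorial.

```python
import struct


def stored_gzip_name(path):
    """Return the original filename recorded in a gzip header, or None."""
    with open(path, "rb") as f:
        header = f.read(10)  # fixed part: magic, method, flags, mtime, xfl, os
        if len(header) < 10 or header[:2] != b"\x1f\x8b":
            raise ValueError(f"{path} does not look like a gzip file")
        flags = header[3]
        if flags & 0x04:  # FEXTRA: skip the optional extra field
            (xlen,) = struct.unpack("<H", f.read(2))
            f.read(xlen)
        if not flags & 0x08:  # FNAME not set: no stored name
            return None
        # The stored name is a zero-terminated Latin-1 string.
        name = bytearray()
        while (byte := f.read(1)) not in (b"", b"\x00"):
            name += byte
        return name.decode("latin-1")
```

Calling `stored_gzip_name("products.tsv.gz")` (an assumed archive name) on an affected file should return the name it was compressed under rather than the name the tutorial expects.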

I can't see an option to disable this WinRAR behaviour, so I suspect you'll have to rename the files after extraction.
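An alternative to renaming by hand is to decompress with a tool that ignores the stored name. As a rough sketch, Python's `gzip` module only reads the compressed data, so the output name can be derived from the archive name. The helper name `gunzip_keep_archive_name` and the example archive name `products.tsv.gz` are assumptions for illustration.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_keep_archive_name(gz_path, dest_dir=None):
    """Decompress gz_path to <archive name minus .gz>, ignoring the stored name."""
    gz_path = Path(gz_path)
    if gz_path.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {gz_path.name}")
    dest_dir = Path(dest_dir) if dest_dir else gz_path.parent
    target = dest_dir / gz_path.stem  # e.g. products.tsv.gz -> products.tsv
    with gzip.open(gz_path, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream, so large files are fine
    return target
```

Running this over each .gz file extracted from RefineDemoData.zip should produce files with the names the demo steps ask for, ready to upload.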

@Kuldeep Kulkarni as this is a pretty likely use case, perhaps you could update the zip to contain files which were named appropriately before compression?