Getting Started: Tutorial 1

New Contributor

Hi,

 

After I launched the Sqoop job and verified the files in the categories directory, I see only 2 files instead of 4.

 

[cloudera@quickstart ~]$ hadoop fs -ls /user/hive/warehouse/categories
Found 2 items
-rw-r--r-- 1 cloudera hive 0 2017-02-06 10:34 /user/hive/warehouse/categories/_SUCCESS
-rw-r--r-- 1 cloudera hive 1344 2017-02-06 10:34 /user/hive/warehouse/categories/part-m-00000.avro
[cloudera@quickstart ~]$

 

Can anyone tell me whether this is normal or whether something went wrong? If something went wrong, what was it and how do I fix it?

 

Thanks!

 

Geoffrey

1 REPLY

Re: Getting Started: Tutorial 1

Contributor

Hi Geoffrey,

 

That’s normal on a single-node setup. In the tutorial, there’s a note underneath the screenshot that shows the output of hadoop fs -ls /user/hive/warehouse/categories:

 

Note: The number of .parquet files shown will be equal to the number of mappers used by Sqoop. On a single-node you will just see one, but larger clusters will have a greater number of files.

 

I do notice that your file extension is .avro, although the commands in the tutorial import the data as Parquet files. Either way, you should still see just one .avro file on a single-node setup.
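
If you want to check the count yourself, here is a rough sketch that counts the data files in a listing. It assumes the files follow the part-m-NNNNN naming seen in your output (a Parquet import may name its files differently), and count_part_files is just a helper name I made up, not part of Hadoop or Sqoop. The sample listing is copied from your post so the snippet runs without a cluster.

```shell
#!/bin/sh
# Hypothetical helper: count map-output data files in a `hadoop fs -ls`
# listing read from stdin. Assumes part-m-NNNNN naming, as in the listing
# from the question; _SUCCESS is only a job-completion marker.
count_part_files() {
  # grep -c exits non-zero when nothing matches, so `|| true` keeps the
  # function usable under `set -e` while still printing 0.
  grep -c '/part-m-[0-9][0-9]*' || true
}

# Sample listing copied from the question.
listing='Found 2 items
-rw-r--r-- 1 cloudera hive 0 2017-02-06 10:34 /user/hive/warehouse/categories/_SUCCESS
-rw-r--r-- 1 cloudera hive 1344 2017-02-06 10:34 /user/hive/warehouse/categories/part-m-00000.avro'

printf '%s\n' "$listing" | count_part_files   # prints 1
```

On the VM itself you could pipe the real listing in instead: hadoop fs -ls /user/hive/warehouse/categories | count_part_files. One data file means one mapper wrote output, which matches the tutorial's note for a single node.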


Cheers