New Contributor
Posts: 1
Registered: 02-06-2017

Getting Started: Tutorial 1

Hi,

After I launched the Sqoop job and checked the files in the categories directory, I see only 2 files instead of 4.

 

[cloudera@quickstart ~]$ hadoop fs -ls /user/hive/warehouse/categories
Found 2 items
-rw-r--r-- 1 cloudera hive 0 2017-02-06 10:34 /user/hive/warehouse/categories/_SUCCESS
-rw-r--r-- 1 cloudera hive 1344 2017-02-06 10:34 /user/hive/warehouse/categories/part-m-00000.avro
[cloudera@quickstart ~]$

 

Can anyone tell me whether this is normal or whether something went wrong? If something did go wrong, what was it and how do I fix it?

Thanks!

Geoffrey

Cloudera Employee
Posts: 17
Registered: 12-14-2016

Re: Getting Started: Tutorial 1

Hi Geoffrey,

That's normal on a single-node setup. In the tutorial, there's a note underneath the screenshot that shows the output of hadoop fs -ls /user/hive/warehouse/categories:

Note: The number of .parquet files shown will be equal to the number of mappers used by Sqoop. On a single-node you will just see one, but larger clusters will have a greater number of files.

 

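I do notice that your file extension is .avro, although the command in the tutorial imports the data as Parquet files. Either way, you should still see exactly one data file (here, one .avro file) on a single-node setup; the _SUCCESS entry is only a zero-byte job-completion marker, not data.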
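If you want to double-check the mapper count from the listing itself, here is a rough sketch. It assumes each mapper writes one part-m-NNNNN file, as in your output, and count_part_files is just a hypothetical helper name I made up for illustration. The sample input is the listing copied from your post, so it runs without a cluster:

```shell
#!/usr/bin/env bash
# Hypothetical helper: counts Sqoop map-output files in an
# `hadoop fs -ls` listing read from stdin.
# Assumption: each mapper writes exactly one part-m-NNNNN file.
count_part_files() {
  # grep -c exits non-zero when it counts 0 matches; `|| true` keeps
  # the function from failing in that case (the count is still printed).
  grep -c '/part-m-[0-9]' || true
}

# Sample listing copied from the question; _SUCCESS is not counted.
count_part_files <<'EOF'
Found 2 items
-rw-r--r-- 1 cloudera hive 0 2017-02-06 10:34 /user/hive/warehouse/categories/_SUCCESS
-rw-r--r-- 1 cloudera hive 1344 2017-02-06 10:34 /user/hive/warehouse/categories/part-m-00000.avro
EOF
```

On the VM you could pipe the real listing into it instead: hadoop fs -ls /user/hive/warehouse/categories | count_part_files. The mapper count itself is set with Sqoop's -m (--num-mappers) option, so a result of 1 simply means the import ran with a single mapper.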


Cheers
