New Contributor
Posts: 1
Registered: 02-06-2017

Getting Started: Tutorial 1

Hi,

 

After I launched the Sqoop job and checked the files in the categories directory, I see only 2 files instead of 4.

 

[cloudera@quickstart ~]$ hadoop fs -ls /user/hive/warehouse/categories
Found 2 items
-rw-r--r-- 1 cloudera hive 0 2017-02-06 10:34 /user/hive/warehouse/categories/_SUCCESS
-rw-r--r-- 1 cloudera hive 1344 2017-02-06 10:34 /user/hive/warehouse/categories/part-m-00000.avro
[cloudera@quickstart ~]$

 

Can anyone tell me whether this is normal or whether something went wrong? If something did go wrong, what was it and how do I fix it?

 

Thanks!

 

Geoffrey

Cloudera Employee
Posts: 39
Registered: 12-14-2016

Re: Getting Started: Tutorial 1

Hi Geoffrey,

 

That’s normal on a single-node setup. In the tutorial, there’s a note underneath the screenshot showing the output of hadoop fs -ls /user/hive/warehouse/categories:

 

Note: The number of .parquet files shown will be equal to the number of mappers used by Sqoop. On a single-node you will just see one, but larger clusters will have a greater number of files.

 

I do notice that your file extension is .avro, although the commands in the tutorial import the data as Parquet files. Either way, you should still see just one data file (here, one .avro file) on a single-node setup.
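
If you want to double-check this on your side, here is a small sketch of a shell helper that counts the Sqoop data files in a listing and reports their extension. The function name count_part_files is made up for illustration, and it assumes the data files follow the part-m-NNNNN.<ext> naming seen in your output. In Sqoop, the mapper count is set with -m / --num-mappers and the file format with options such as --as-parquetfile or --as-avrodatafile, so those are the options to compare against the tutorial command if the count or extension looks unexpected.

```shell
# Hypothetical helper (not part of Hadoop or Sqoop): reads an
# `hadoop fs -ls` listing on stdin and prints "<count> <extension>"
# for each kind of Sqoop data file (part-m-NNNNN.<ext>) it finds.
count_part_files() {
    # Take the last field of each line (the path), keep only part-m
    # data files, strip everything but the extension, then tally.
    awk '{ print $NF }' \
        | grep -E '/part-m-[0-9]+\.[A-Za-z0-9]+$' \
        | sed -E 's/.*\.([A-Za-z0-9]+)$/\1/' \
        | sort | uniq -c | awk '{ print $1, $2 }'
}

# Usage on the QuickStart VM (needs a working hadoop client):
#   hadoop fs -ls /user/hive/warehouse/categories | count_part_files
```

Fed the listing from your post, this should print "1 avro": one data file, which matches a single mapper, with _SUCCESS being only a job-completion marker rather than data.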


Cheers