New Contributor
Posts: 3
Registered: ‎12-15-2014

HDFS replication error

Hi,

I'm trying to load a file into a Cloudera Live cluster (5.1) using Pentaho Kettle, aka PDI (5.2), and I'm getting this error:

File /user/pdi/weblogs/in/weblogs_parse.txt could only be replicated to 0 nodes instead of minReplication (=1).

Does anybody know how to fix this? I've formatted the data node but it's still not working.

Thanks!

Cloudera Employee
Posts: 578
Registered: ‎01-20-2014

Re: HDFS replication error

Are you able to check whether the datanodes have enough free space? By
default a datanode must have room for at least five new blocks
(128 MB each by default), otherwise the write operation will fail.

You can simply run the command "sudo -u hdfs hdfs dfsadmin -report" to
get this information.
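For example, the relevant fields in the report look something like the sketch below (field names as printed by CDH 5; the exact service user name may differ on your cluster):

# Run as the HDFS superuser to get cluster-wide and per-datanode capacity.
sudo -u hdfs hdfs dfsadmin -report

# In each datanode section of the output, check that:
#   DFS Remaining  is comfortably above 5 x dfs.blocksize (about 640 MB with 128 MB blocks)
#   DFS Used%      is not close to 100%
#   Last contact   is recent, otherwise the datanode may not be heartbeating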

Regards,
Gautam Gopalakrishnan

New Contributor
Posts: 3
Registered: ‎12-15-2014

Re: HDFS replication error

Hi Gautam,

 

The 3 datanodes have plenty of space:

DFS Used%: 0.01%
DFS Remaining%: 95.18%

 

Could it be a security issue? The ETL tool uses my OS user to load the data. At first I was getting an access error, which I solved by changing the destination folder's permissions with hadoop chmod on the master node.
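For reference, the permission change was along these lines (the path is the one from my error above, and the mode is from memory):

# Run on the master node as the HDFS superuser.
sudo -u hdfs hadoop fs -chmod -R 777 /user/pdi/weblogs/in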

 

Regards

New Contributor
Posts: 4
Registered: ‎12-09-2015

Re: HDFS replication error

@GautamG

 

@bmasciarelli

Can you please tell me the commands and steps you used? These are my commands and steps:

 

  1. create the user test1 (the full command sequence is sketched below this list)
  2. add the test1 user to the supergroup
  3. make the folder in Hadoop [ sudo -u hdfs hadoop fs -mkdir /user/test1 ]
  4. sudo -u hdfs hadoop fs -chmod -R 1777 /user/test1
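For clarity, the whole sequence looked roughly like this. Steps 1 and 2 are from memory, and the group name assumes the default dfs.permissions.superusergroup of "supergroup":

# 1. Create the OS user (run as root on the machine the job connects from).
useradd test1

# 2. Add test1 to the HDFS superuser group (create the group first if it does not exist).
groupadd supergroup
usermod -a -G supergroup test1

# 3. Create the user's folder in HDFS.
sudo -u hdfs hadoop fs -mkdir /user/test1

# 4. Open up the permissions (sticky bit plus rwx for everyone).
sudo -u hdfs hadoop fs -chmod -R 1777 /user/test1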

 movies.txt is a 30 KB file.

 

My Pentaho job creates the file movies.txt under the /user/test1/ folder, but it is empty (0 bytes).

 

The following exception is raised:

 Caused by: File /user/test1/movies.txt could only be replicated to 0 nodes instead of minReplication (=1).  There are 3 datanode(s) running and 3 node(s) are excluded in this operation.

 

New Contributor
Posts: 3
Registered: ‎12-15-2014

Re: HDFS replication error

Hi technet,

 

It's been a long time since I tried this and I really don't remember what was causing the exception, but these are the steps I ended up following:

 

Besides creating the folder and changing its permissions to 777, I had to give ownership to root:

sudo -u hdfs hadoop fs -chown -R root:root /user/test1

 

Then the only thing that worked for me was copying PDI onto the master node and using Carte to run the jobs:

http://wiki.pentaho.com/display/EAI/Carte+User+Documentation
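Starting Carte on the master node is a single command from the PDI installation directory; roughly like this (the host name and port here are just examples):

# Run from the data-integration directory that was copied to the master node.
./carte.sh master-node.example.com 8081

Then define that host and port as a slave server in Spoon and execute the job remotely on it.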

 

Hope this helps, regards!

New Contributor
Posts: 4
Registered: ‎12-09-2015

Re: HDFS replication error

Hi bmasciarelli,

 

sudo -u hdfs hadoop fs -chown -R root:root /user/test1 did not work in my case either; I will check the Carte option to execute transformations and jobs remotely.

 

Thanks for your help.
