12-15-2014 10:06 AM
I'm trying to load a file into a Cloudera Live cluster (5.1) using Pentaho Kettle, aka PDI (5.2), and I'm getting this error:
File /user/pdi/weblogs/in/weblogs_parse.txt could only be replicated to 0 nodes instead of minReplication (=1).
Does anybody know how to fix this? I've formatted the datanode, but it's still not working.
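Is there anything else I should check? As far as I can tell, the namenode's view of the datanodes can be verified with the dfsadmin report (run as the hdfs superuser on a cluster node):

sudo -u hdfs hdfs dfsadmin -report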
12-16-2014 04:28 AM
The 3 datanodes have plenty of space:
DFS Used%: 0.01%
DFS Remaining%: 95.18%
Could it be some security issue? The ETL tool uses my OS user to load the data. At first I was getting an access error, which I solved by changing the destination folder's permissions with hadoop chmod on the master node.
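The permission change was something like this (I don't remember the exact flags any more; run as the hdfs superuser, with the destination folder from the job above):

sudo -u hdfs hadoop fs -chmod -R 777 /user/pdi/weblogs/in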
01-02-2016 10:57 AM
Can you please tell me the commands and steps? These are mine:
movies.txt is a 30 KB file.
My Pentaho job creates the file movies.txt under the /user/test1/ folder, but it is empty (0 bytes).
The following exception is raised:
Caused by: File /user/test1/movies.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
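Listing the file confirms it is there but empty:

hadoop fs -ls /user/test1/movies.txt

So as far as I can tell, the namenode creates the file entry, but the block write to the datanodes fails, which is why it stays at 0 bytes.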
01-04-2016 04:38 AM
It's been a long time since I tried this and I really don't remember what was causing the exception, but these are the steps I ended up following:
Besides creating the folder and changing its permissions to 777, I had to give ownership to root:
sudo -u hdfs hadoop fs -chown -R root:root /user/test1
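For completeness, the whole sequence was roughly this (run as the hdfs superuser; adjust the path to your own):

sudo -u hdfs hadoop fs -mkdir -p /user/test1
sudo -u hdfs hadoop fs -chmod -R 777 /user/test1
sudo -u hdfs hadoop fs -chown -R root:root /user/test1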
Then the only thing that worked for me was uploading PDI to the master node and using Carte to run the jobs there.
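If I remember right, Carte is started from the PDI directory with a hostname and a port, something like this (hostname and port here are just examples):

sh carte.sh localhost 8081

Looking back, my guess is that running the jobs on the master node worked because HDFS clients write blocks directly to the datanodes; if PDI runs outside the cluster and can't reach the datanodes' data transfer port (50010 by default in CDH 5), the namenode ends up excluding all of them, which would match the "could only be replicated to 0 nodes" error.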
Hope this helps, regards!
01-04-2016 07:49 AM
sudo -u hdfs hadoop fs -chown -R root:root /user/test1 did not work in my case either, so I will check the Carte option to execute transformations and jobs remotely.
Thanks for your help.