Created on 06-03-2015 12:13 AM - edited 06-03-2015 12:17 AM
Hi,
I'm working with Cloudera Manager and CDH 5.4.2 with Flume installed, and I cannot save the data I pull from Twitter.
When I run the Flume agent, it starts fine but fails when it attempts to write new event data into HDFS.
I get the following error:
INFO org.apache.flume.sink.hdfs.BucketWriter: Creating hdfs://192.168.109.6:8020/user/flume/tweets/2015/06/03/06//FlumeData.1433311217583.tmp
WARN org.apache.flume.sink.hdfs.HDFSEventSink: HDFS IO error
java.net.ConnectException: Call From cluster-05.xxxx.com/192.168.109.6 to cluster-05.xxxx.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
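(For anyone else hitting this: before touching the Flume config, it helps to confirm whether anything is actually listening on that host and port. A quick check with standard tools, using the host and port taken from the error message above:)

# Check whether the NameNode RPC port is reachable from the Flume host
# (telnet works the same way if nc is not installed)
nc -vz 192.168.109.6 8020

If the connection is refused here too, the problem is the address/port or the NameNode service itself, not Flume.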
My configuration is as follows:
flume-conf.properties:
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://192.168.109.6:8020/user/flume/tweets/%Y/%m/%d/%H/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
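For completeness, this is only the sink section; the agent also needs its source and channel defined, and is then started with the standard flume-ng command. A sketch of the launch (the config file name and agent name below match the snippet above; adjust the paths to your install):

# Start the agent defined in flume-conf.properties (agent name: TwitterAgent)
flume-ng agent --conf /etc/flume-ng/conf \
  --conf-file flume-conf.properties \
  --name TwitterAgent \
  -Dflume.root.logger=INFO,console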
I am using the following plugins (I replaced the twitter4j-*-3.0.3.jar files with the twitter4j-*-2.2.6.jar versions).
I also created the target directory as the hdfs user; hadoop fs -ls /user/flume shows:
drwxrwxrwx - flume flume /user/flume/tweets
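For anyone setting this up from scratch, a sketch of how such a directory is typically created (the paths and the wide-open permissions match the listing above; these exact steps are an assumption, and you should tighten the permissions for anything beyond a test):

# Create the landing directory and hand it over to the flume user
sudo -u hdfs hadoop fs -mkdir -p /user/flume/tweets
sudo -u hdfs hadoop fs -chown -R flume:flume /user/flume
sudo -u hdfs hadoop fs -chmod -R 777 /user/flume/tweets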
In core-site.xml (at /hadoop/conf) I added:
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:8020</value>
</property>
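Note that fs.default.name is the deprecated name for fs.defaultFS, and a value of hdfs://localhost:8020 only resolves to the NameNode on the NameNode host itself; on any other node of the cluster, localhost points at the wrong machine, which is one likely way to end up with exactly this kind of Connection refused error. You can check which value the client actually picks up with:

# Print the effective default filesystem URI from the client configuration
hdfs getconf -confKey fs.defaultFS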
I also ran hadoop dfsadmin -safemode leave as the hdfs user on the host where the Flume agent runs.
I would really appreciate your help with this issue.
Regards,
AR
Created 06-03-2015 10:07 PM
I found the solution myself, and I am leaving it here in case anyone hits the same error.
My mistake was that (since I am running on a cluster) the sink has to point at the Hadoop NameNode host, so I changed the address here:
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://192.168.109.6:8020/user/flume/tweets/%Y/%m/%d/%H/
After that everything ran smoothly.
Thanks
thanks
Created 03-29-2016 10:39 PM
Hi, as you explained above, you changed an address to solve the HDFS IO error, but I don't see any difference between the address in your solution and the one in your original config. Can you explain clearly what you changed to fix the error?
Created 07-24-2017 03:06 PM
In my case the problem was an incorrect port number. I made sure to use the NameNode RPC port.
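If you are not sure which port the NameNode RPC service is using, the client configuration can tell you (a standard hdfs getconf option; run it on any host with the cluster's client configs deployed):

# Print the NameNode RPC host:port pairs known to the client configuration
hdfs getconf -nnRpcAddresses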