
Flume + HDFS IO error + ConnectException

Solved

Explorer

Hi,
I'm working with Cloudera Manager CDH 5.4.2 and have Flume installed, but I cannot save the data I get from Twitter.

When I run the Flume agent, it starts okay but ends in an error when it attempts to write new event data into HDFS.

I get the following error:

 

 INFO org.apache.flume.sink.hdfs.BucketWriter: Creating hdfs://192.168.109.6:8020/user/flume/tweets/2015/06/03/06//FlumeData.1433311217583.tmp

 

WARN org.apache.flume.sink.hdfs.HDFSEventSink: HDFS IO error
java.net.ConnectException: Call From cluster-05.xxxx.com/192.168.109.6 to cluster-05.xxxx.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
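"Connection refused" generally means nothing is listening on that host:port from where the client stands. Before digging into Flume itself, it can help to verify that the NameNode RPC port is reachable from the Flume host. A generic, non-Flume-specific sketch of such a check (the host/port below are the ones from the error message):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers ConnectionRefusedError, timeouts, DNS failures
        return False

# Example (address taken from the error above):
#   port_open("192.168.109.6", 8020) stays False until the NameNode is
#   actually listening on that interface and port.
```

If this returns False while the NameNode process is running, the service is likely bound to a different interface or port than the one the client is using.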

 

My configuration was:

 

flume-conf.properties:

 

TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://192.168.109.6:8020/user/flume/tweets/%Y/%m/%d/%H/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
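For completeness, the sink above also needs a source and a channel wired into the same agent. A minimal sketch of those sections (the OAuth values are placeholders, and the source class name is the one shipped in the flume-sources plugin listed below — adjust if your build differs):

```
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS

TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = <your-consumer-key>
TwitterAgent.sources.Twitter.consumerSecret = <your-consumer-secret>
TwitterAgent.sources.Twitter.accessToken = <your-access-token>
TwitterAgent.sources.Twitter.accessTokenSecret = <your-access-token-secret>

TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 1000
```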

I am using the following plugins:

  • flume-sources-1.0-SNAPSHOT.jar
  • twitter4j-core-2.2.6.jar
  • twitter4j-media-support-2.2.6.jar
  • twitter4j-stream-2.2.6.jar

(I replaced the twitter4j-*-3.0.3.jar files with the twitter4j-*-2.2.6.jar versions.)

 

I also created the target directory as the hdfs user:

hadoop fs -ls /user/flume :

drwxrwxrwx - flume flume  /user/flume/tweets

 

In core-site.xml (at /hadoop/conf) I added:

 

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:8020</value>
</property>
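One thing worth checking here: this value points at localhost, while the error message above shows the NameNode running on cluster-05.xxxx.com (192.168.109.6) — on a multi-node cluster the client-side value must name the actual NameNode host. Also, fs.default.name is a deprecated alias; newer Hadoop versions use fs.defaultFS. A corrected entry (host taken from the error message; adjust to your own NameNode) might look like:

```xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://cluster-05.xxxx.com:8020</value>
</property>
```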

 

I also ran hadoop dfsadmin -safemode leave as the HDFS user on the host where the Flume agent runs.

 

 

I'd really appreciate your help with this issue.

 Regards,

AR

1 ACCEPTED SOLUTION


Re: Flume + HDFS IO error + ConnectException

Explorer

I found the solution myself, and I'll leave it here in case anyone else hits the same error.

My mistake was that (because I was running in a cluster) I had to point at the Hadoop host, so I changed the address here:

TwitterAgent.sinks.HDFS.hdfs.path = hdfs://192.168.109.6:8020/user/flume/tweets/%Y/%m/%d/%H/

and everything ran smoothly.

Thanks


3 REPLIES


Re: Flume + HDFS IO error + ConnectException

Contributor

Hi, you explained above that you changed an address to solve the HDFS IO error, but I don't see any change in the address you give in the solution. Can you explain clearly what you did to solve this error?

 

Re: Flume + HDFS IO error + ConnectException

New Contributor

In my case the problem was an incorrect port number; I made sure to use the NameNode's RPC port.