I keep getting this error:
14 Dec 2016 09:44:03,218 WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:461) - HDFS IO error
java.io.IOException: Callable timed out after 10000 ms on file: hdfs://tmp/flumetest/FlumeData.1481737433180.tmp
	at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:720)
	at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:266)
	at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:541)
	at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:424)
	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
	at java.util.concurrent.FutureTask.get(FutureTask.java:205)
	at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:713)
	... 6 more
14 Dec 2016 09:44:08,219 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSSequenceFile.configure:63) - writeFormat = Text, UseRawLocalFileSystem = false
14 Dec 2016 09:44:08,251 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:265) - Creating hdfs://tmp/flumetest/FlumeData.1481737448220.tmp
Below is the Flume agent config, set up through Ambari:
agent.sources = pstream
agent.channels = memoryChannel
agent.channels.memoryChannel.type = memory
agent.sources.pstream.channels = memoryChannel
agent.sources.pstream.type = exec
agent.sources.pstream.command = tail -f /etc/passwd
agent.sinks = hdfsSink
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = memoryChannel
agent.sinks.hdfsSink.hdfs.path = hdfs://tmp/flumetest
agent.sinks.hdfsSink.hdfs.fileType = SequenceFile
agent.sinks.hdfsSink.hdfs.writeFormat = Text
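For reference, a minimal sketch of what the sink section could look like with a fully qualified HDFS URI and an explicit call timeout. The NameNode host and port (`namenode.example.com:8020`) and the timeout value are placeholders, not values taken from this cluster:

```
# Hypothetical sink settings -- host, port and timeout are placeholders
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = memoryChannel
# Fully qualified URI: authority (host:port) first, then the directory path
agent.sinks.hdfsSink.hdfs.path = hdfs://namenode.example.com:8020/tmp/flumetest
agent.sinks.hdfsSink.hdfs.fileType = SequenceFile
agent.sinks.hdfsSink.hdfs.writeFormat = Text
# Time allowed for HDFS open/write/flush/close calls (milliseconds; default 10000)
agent.sinks.hdfsSink.hdfs.callTimeout = 60000
```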
The HDFS path is writable.
What version of Hadoop? Of Flume? Of the JDK?
Are there known issues with that version? What is the timeout set to?
What user is running Flume, and does that user have write permissions? It seems that a firewall or a permissions problem is blocking the write.
Try running the agent with debug logging enabled.
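A sketch of how debug logging can be turned on when starting the agent from the command line; the agent name and config paths are assumptions, and under Ambari the log level can also be raised in the managed log4j settings instead:

```
# Hypothetical invocation -- adjust paths and agent name to your install
flume-ng agent \
  --conf /etc/flume/conf \
  --conf-file /etc/flume/conf/flume.conf \
  --name agent \
  -Dflume.root.logger=DEBUG,console
```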
Look at this article: http://www.thecloudavenue.com/2013/03/analyse-tweets-using-flume-hadoop-and.html
Also consider trying the same thing in Apache NiFi.
What version of Hadoop? Of Flume? Of the JDK? // Hadoop 188.8.131.52.3; Flume 184.108.40.206.3; JDK 1.8
Are there known issues with that version? What is the timeout set to? // It is the default timeout. I also tried setting the timeout to 1000000, but got the same error.
All of the Flume agent configuration was set up through Ambari.
The firewall is turned off and the HDFS directory has 777 permissions.
Any other thoughts?
What are the local permissions on the Flume directory and the current directory? Can you run tail -f /etc/passwd as that user?
It should be, but is the HDFS server set in the Flume configuration so that it points to the correct NameNode? Note that in a URI like hdfs://tmp/flumetest, the first component after hdfs:// is parsed as the NameNode hostname, so here "tmp" would be treated as a host, not a directory.
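One quick way to check which NameNode URI the cluster itself is configured with (a sketch; assumes the HDFS client is on the PATH on the Flume host):

```
# Print the configured default filesystem, e.g. hdfs://namenode.example.com:8020
hdfs getconf -confKey fs.defaultFS
```

The authority printed here is what the sink's hdfs.path should start with.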
Which server does the Flume agent run on? Does it have the Hadoop/HDFS client installed? Can you do hdfs dfs -put /etc/passwd /tmp/flumetest/ ?
Is there anything in the hdfs://tmp/flumetest/FlumeData.1481737448220.tmp file?
What are the local permissions on the Flume directory and the current directory? Can you run tail -f /etc/passwd as that user? // It runs under the root user.
It should be, but is the HDFS server set in the Flume configuration so that it points to the correct NameNode? // Yes.
Which server does the Flume agent run on? Does it have the Hadoop/HDFS client installed? Can you do hdfs dfs -put /etc/passwd /tmp/flumetest/ ? // The Flume agent runs on all three servers of the HDFS cluster, configured through Ambari. I can put the passwd file using hdfs dfs -put /etc/passwd /tmp/flumetest/
hdfs dfs -ls /tmp/flumetest/
Found 1 items
-rw-r--r--   3 root hdfs       2379 2016-12-14 11:22 /tmp/flumetest/passwd
Is there anything in the hdfs://tmp/flumetest/FlumeData.1481737448220.tmp file? // No.