Support Questions

Find answers, ask questions, and share your expertise

Flume + secure cluster + insecure cluster: HDFS java.io.EOFException

Expert Contributor

Hi,

 

Very urgent!

 

We are working with two CDH 5.7.1 clusters: one secure and the other insecure.

We installed the Flume agent service on the secure cluster.

1. On the secure cluster, we ran commands to access the insecure cluster:

 a.  hdfs dfs -ls hdfs://cache01.dev1.fn:8020/flume/app_logs/
ls: End of File Exception between local host is: "arch-od-tracker04.beta1.fn/10.202.251.14"; destination host is: "cache01.dev1.fn":8020; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException

 b. hdfs dfs -ls webhdfs://cache01.dev1.fn:50070/flume/app_logs/
Found 1 items
drwxrwxrwx - flume supergroup 0 2016-07-28 13:15 webhdfs://cache01.dev1.fn:50070/flume/app_logs/2016-07-28

 

So we ran Flume, with the following results:

 

1.
secure cluster -----> insecure cluster : fail, with exception (a) above (java.io.EOFException)
secure cluster -----> secure cluster (itself) : success

2.
secure cluster -----> insecure cluster : success
secure cluster -----> insecure cluster : success

3.
secure cluster -----> secure cluster : success
secure cluster -----> secure cluster : success

 

Now, my question is: how do we configure the Flume agent so that case 1 works?

 

Thanks in advance.

 

BR

Paul

2 ACCEPTED SOLUTIONS

Mentor
On your insecure cluster, even though it does not use security, it may still need
to parse a secure username such as foo@REALM. To allow for this, edit, in the
insecure cluster's CM, the value of HDFS -> Configuration -> Trusted Realms and
add the realm used on the secure cluster. Save and restart the insecure cluster
as indicated by CM.

This change won't alter your security state; it only adds rules for parsing such
incoming secure usernames and avoids the EOFException (which occurs when the
cluster closes the connection because it is unable to parse the username from
the secure accessor).
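For reference, the Trusted Realms setting in CM effectively generates hadoop.security.auth_to_local mapping rules on the insecure cluster. A minimal sketch of the equivalent core-site.xml entry, assuming the secure cluster's realm is OD.BETA (taken from the principal used later in this thread; substitute your own realm):

```xml
<!-- core-site.xml on the insecure cluster (sketch; realm OD.BETA is an assumption) -->
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[1:$1@$0](.*@OD\.BETA)s/@.*//
    RULE:[2:$1@$0](.*@OD\.BETA)s/@.*//
    DEFAULT
  </value>
</property>
```

The [1:...] rule handles single-component principals such as kai.he@OD.BETA and the [2:...] rule handles two-component ones such as hdfs/host@OD.BETA; both strip the realm so the insecure cluster sees a plain short username.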


Mentor
I've not tried it, but you should be able to use webhdfs:// instead of
hdfs:// in that config. You will also need to change the 8020 to 50070 (or your
custom NN HTTP port).
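Applied to the configuration posted later in this thread, that would mean changing the second sink's path as sketched below (untested, as noted above; this assumes the insecure NameNode's HTTP port is the default 50070, which matches the working webhdfs listing earlier in the thread):

```
# Sketch: point the insecure-cluster sink at WebHDFS instead of native RPC
agent.sinks.hdfs_sink2.hdfs.path = webhdfs://cache01.dev1.fn:50070/flume/app_logs/%Y-%m-%d
```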


4 REPLIES


Expert Contributor

Hi Harsh,

Thank you very much.

Is there any other solution? The insecure cluster is our production environment, so we cannot restart it.

Can Flume use the hftp or webhdfs protocol?

 

Here is our configuration:

 

 

agent.sources = source1
agent.channels = channel1 channel2
agent.sinks = hdfs_sink1 hdfs_sink2

agent.sources.source1.selector.type = replicating
agent.sources.source1.channels = channel1 channel2
agent.sources.source1.type = spooldir
agent.sources.source1.spoolDir = /flumeDataTest
agent.sources.source1.interceptors = i1
agent.sources.source1.interceptors.i1.type = timestamp
agent.sources.source1.deserializer = LINE
agent.sources.source1.deserializer.maxLineLength = 65535
agent.sources.source1.decodeErrorPolicy = IGNORE

agent.channels.channel1.type = memory
agent.channels.channel1.capacity = 10000
agent.channels.channel1.transactionCapacity = 10000

agent.channels.channel2.type = memory
agent.channels.channel2.capacity = 10000
agent.channels.channel2.transactionCapacity = 10000

agent.sinks.hdfs_sink1.channel = channel1
agent.sinks.hdfs_sink1.type = hdfs
agent.sinks.hdfs_sink1.hdfs.path = hdfs://arch-od-data01.beta1.fn:8020/user/kai.he/app_logs/%Y-%m-%d
agent.sinks.hdfs_sink1.hdfs.fileType = DataStream
agent.sinks.hdfs_sink1.hdfs.writeFormat = TEXT
agent.sinks.hdfs_sink1.hdfs.useLocalTimeStamp = true
agent.sinks.hdfs_sink1.hdfs.filePrefix = ev
agent.sinks.hdfs_sink1.hdfs.inUsePrefix = .
agent.sinks.hdfs_sink1.hdfs.request-timeout = 30000
agent.sinks.hdfs_sink1.hdfs.rollCount = 6000
agent.sinks.hdfs_sink1.hdfs.rollInterval = 60
agent.sinks.hdfs_sink1.hdfs.rollSize = 0
agent.sinks.hdfs_sink1.hdfs.kerberosKeytab = /tmp/kai.keytab
agent.sinks.hdfs_sink1.hdfs.kerberosPrincipal = kai.he@OD.BETA

agent.sinks.hdfs_sink2.channel = channel2
agent.sinks.hdfs_sink2.type = hdfs
agent.sinks.hdfs_sink2.hdfs.path = hdfs://cache01.dev1.fn:8020/flume/app_logs/%Y-%m-%d
agent.sinks.hdfs_sink2.hdfs.fileType = DataStream
agent.sinks.hdfs_sink2.hdfs.writeFormat = TEXT
agent.sinks.hdfs_sink2.hdfs.useLocalTimeStamp = true
agent.sinks.hdfs_sink2.hdfs.filePrefix = f3
agent.sinks.hdfs_sink2.hdfs.inUsePrefix = .
agent.sinks.hdfs_sink2.hdfs.request-timeout = 30000
agent.sinks.hdfs_sink2.hdfs.rollCount = 6000
agent.sinks.hdfs_sink2.hdfs.rollInterval = 60
agent.sinks.hdfs_sink2.hdfs.rollSize = 0

 

Thank you again.

 

BR

Paul


Expert Contributor

Hi Harsh,

I changed HDFS -> Configuration -> Trusted Realms and the issue is gone.

Great! Thanks for your excellent work.

BR

Paul