
NiFi PutHDFS processor fails to write data into HDFS on my remote AWS instance

New Contributor

Hi All,

I have NiFi installed on my Windows machine to transfer data from an Oracle database to the HDFS layer of an AWS instance using the PutHDFS processor. The files are transferred to HDFS, but with a size of 0 bytes.

In the logs I can see this:

WARN [Thread-5441] org.apache.hadoop.hdfs.DFSClient DataStreamer Exception
java.nio.channels.UnresolvedAddressException: null
    at sun.nio.ch.Net.checkAddress(Unknown Source)
    at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)

How can I resolve the UnresolvedAddressException? I have made an entry in the hosts file of the AWS instance of the form:

    <Private-IP>    <Public-DNS>

But this didn't resolve my issue.

Thanks,


Mentor

@Sudheer Nulu

The hosts entry on your Windows machine should use the public DNS name, as the private IP is only reachable from within the AWS data centre.
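For example, a minimal sketch of such an entry (the IP address and hostname below are placeholders, not your actual values):

    # %systemroot%\system32\drivers\etc\hosts on the Windows machine:
    # map the instance's *public* IP to its public DNS name
    54.12.34.56    ec2-54-12-34-56.compute-1.amazonaws.com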

Hope that helps

New Contributor

@Geoffrey Shelton Okot

Thanks for looking into this issue.

I had previously configured it in the way you mentioned, and I have now tried it again, but the same issue occurs.

Please find my hosts file configuration below.

On my Windows machine (%systemroot%\system32\drivers\etc\hosts) I have made an entry for the AWS instance like:

    <Public-IPv4>    <Public-DNS-(IPv4)>

On the AWS CentOS machine I have added the same entry in the /etc/hosts file.

Thanks

Mentor

@Sudheer Nulu

To communicate successfully with your AWS cluster, you will need to create inbound and outbound firewall (security group) rules that allow "My IP", which defaults to the IP of your Windows machine. This will allow your laptop/desktop to communicate with your AWS cluster.
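If you prefer the AWS CLI over the console, the inbound rule looks roughly like this (the security group ID, port, and IP address below are placeholders; 8020 is the usual HDP NameNode RPC port, and DataNode transfer typically uses 50010):

    # allow the Windows machine's public IP in to the NameNode RPC port
    aws ec2 authorize-security-group-ingress \
        --group-id sg-0123456789abcdef0 \
        --protocol tcp --port 8020 \
        --cidr 203.0.113.5/32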

Please see attached screenshot

Please let me know if that worked!!!


[Attachment: aws-fw.jpg]

New Contributor

@Geoffrey Shelton Okot

I have added my public IP to both the inbound and outbound rules of the AWS instance, as attached, but the issue still exists. I have attached a snapshot showing how my files are being saved in HDFS with 0 bytes.

I have opened the port range 0-65535 to my IP address.

[Attachment: 64931-nifisnap.png]

[Attachment: 64932-hdfs-data.png]

Thanks.

Mentor

@Sudheer Nulu

Has the connection issue, the UnresolvedAddressException, now been resolved? Does the NiFi user have permission to write to that HDFS directory? If not, can you create an ACL to allow that and retry?
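For example, you can check and grant write access along these lines (the directory path and username are placeholders taken from this thread; run as the hdfs superuser):

    # inspect current ownership and permissions on the target directory
    hdfs dfs -ls /user/nifi
    # hand the directory to the user NiFi runs as, then allow group writes
    sudo -u hdfs hdfs dfs -chown sudheer.n:hdfs /user/nifi
    sudo -u hdfs hdfs dfs -chmod 775 /user/nifi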

New Contributor


@Geoffrey Shelton Okot

In the NiFi logs I can still see the UnresolvedAddressException warning.

Coming to the permission issue: NiFi is running as the Windows user "sudheer.n", so I created a user "sudheer.n" on the HDFS AWS instance and gave that user full access to the directory where I am placing my files, roughly as shown below.
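(A sketch of those steps; the exact directory differs, see the snapshot further down:)

    # on the AWS instance: create the matching OS user
    sudo useradd sudheer.n
    # as the hdfs superuser: give that user its own HDFS directory
    sudo -u hdfs hdfs dfs -mkdir -p /user/sudheer.n
    sudo -u hdfs hdfs dfs -chown sudheer.n /user/sudheer.n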

But this didn't help in writing content to the files in the target folder.

My target directory has the access shown in the attached snapshot for the "nifi" folder.

[Attachment: 64934-hdfs-permissions.png]

Let me know your thoughts on whether I am doing something wrong in creating the user in HDFS.

Thanks

New Contributor

@Geoffrey Shelton Okot:

I need a few clarifications:

1. When I run the hostname command on my AWS instance it shows the private DNS name, but when I give the configuration files to NiFi on the Windows machine, I replace all the private DNS names with the public DNS name in both core-site.xml and hdfs-site.xml. Is this the right approach?

2. I changed the hostname on the AWS instance to my public DNS and also changed it in both core-site.xml and hdfs-site.xml, but my hostname is not reflected in the Ambari UI, and in both files the hostname dynamically reverts to the private DNS. Could you please let me know where this hostname is being read from?

I think this could also be a cause of the hostname resolution issue.

Mentor

@Sudheer Nulu

1. When I run the hostname command on my AWS instance it shows the private DNS name, but when I give the configuration files to NiFi on the Windows machine, I replace all the private DNS names with the public DNS name in both core-site.xml and hdfs-site.xml. Is this the right approach?

YES. You need to copy core-site.xml and hdfs-site.xml from one of your Hadoop nodes to the machine where NiFi is running, then configure PutHDFS so that the configuration resources are "/path/to/core-site.xml,/path/to/hdfs-site.xml". That is all that is required from the NiFi perspective; those files contain all of the information it needs to connect to the Hadoop cluster. Ensure that the machine where NiFi is running has network access to all of the machines in your Hadoop cluster. Look through those config files, find any hostnames and IP addresses, and make sure they can be reached from the machine where NiFi is running.
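For reference, the address NiFi ultimately connects to comes from properties such as fs.defaultFS in core-site.xml; it should carry a name that resolves from the NiFi machine (the hostname and port below are placeholders; HDP typically uses 8020):

    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://ec2-54-12-34-56.compute-1.amazonaws.com:8020</value>
    </property>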

2. I changed the hostname on the AWS instance to my public DNS and also changed it in both core-site.xml and hdfs-site.xml, but my hostname is not reflected in the Ambari UI, and in both files the hostname dynamically reverts to the private DNS. Could you please let me know where this hostname is being read from?

Have a look at this AWS documentation:

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/set-hostname.html

https://aws.amazon.com/premiumsupport/knowledge-center/linux-static-hostname-rhel7-centos7/
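In short, on CentOS 7 that typically comes down to something like the following (the hostname below is a placeholder; see the links above for the full procedure):

    # set a static hostname on the AWS CentOS 7 instance
    sudo hostnamectl set-hostname ec2-54-12-34-56.compute-1.amazonaws.com
    # stop cloud-init from resetting the hostname on reboot
    echo "preserve_hostname: true" | sudo tee /etc/cloud/cloud.cfg.d/99_hostname.cfg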

Explorer

@Geoffrey Shelton Okot

Hi Geoffrey, I have the same issue.

I run NiFi in Docker and the HDP Sandbox in a VM, both on the same host PC.

The configuration of PutHDFS is shown below:

[Attachment: 109817-1562897818853.png]

The error shown is an HDFS configuration error:

[Attachment: 109818-1562898010552.png]

Could you please help me to address it? Thank you.

Kind regards,

Raymond