Connect to HDFS from Microsoft PowerBI with Kerberos

Explorer

I am trying to connect Microsoft's Power BI Desktop to HDFS, but I am unable to do so. I keep getting error 400. I also tried connecting with the Web data source instead of HDFS, but I get the same error.

By default the HDFS data source sets the port to 50070. The secure port is 50470, and I am currently looking for a way to change that. I have also updated my JAAS settings to forward a Kerberos ticket, but that didn't have any effect.

Has anyone tried this or gotten it to work on a secure cluster? Any ideas are greatly appreciated.

Thanks,

1 ACCEPTED SOLUTION

Explorer

OK, I figured out the issue: the entire URL is needed, including the full path of the files in HDFS.

Port 8020 is for back-end HDFS communication; you cannot connect to that port directly.

The URLs to connect to HDFS are listed below. Please keep in mind these are the default ports and can be changed.

  • Non-Secure: http://<namenode>:50070/webhdfs/v1/<directory>
  • Secure: http://<namenode>:50470/webhdfs/v1/<directory>
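As a rough illustration of the URL patterns above, here is a minimal Python sketch that assembles a WebHDFS URL from its parts. The helper name `build_webhdfs_url` and the path `/user/rusty/data` are made up for the example. The scheme is left as a parameter because some clusters serve the secure port over HTTPS, which is an assumption about your cluster and not something stated in this thread. The optional `op=LISTSTATUS` query is the standard WebHDFS operation for listing a directory and can be handy for testing the URL in a browser or with curl before pasting the base URL into Power BI.

```python
from urllib.parse import quote, urlencode


def build_webhdfs_url(namenode, directory, port=50070, scheme="http", op=None):
    """Assemble scheme://<namenode>:<port>/webhdfs/v1/<directory>.

    Hypothetical helper for illustration; the defaults mirror the
    default non-secure port mentioned in the post.
    """
    # Strip surrounding slashes so exactly one slash follows /webhdfs/v1,
    # and percent-encode characters that are not URL-safe (slashes are kept).
    path = quote(directory.strip("/"))
    url = f"{scheme}://{namenode}:{port}/webhdfs/v1/{path}"
    if op:
        # WebHDFS operations are passed as the "op" query parameter.
        url += "?" + urlencode({"op": op})
    return url


# Non-secure default port (the path is a placeholder, not from the thread).
print(build_webhdfs_url("namenode01.test.com", "/user/rusty/data"))
# Secure default port.
print(build_webhdfs_url("namenode01.test.com", "/user/rusty/data", port=50470))
# Same URL with a directory listing operation, useful for a quick manual test.
print(build_webhdfs_url("namenode01.test.com", "/user/rusty/data", op="LISTSTATUS"))
```

The first two results are the kind of value that goes into the Power BI server field; the third is only for checking that the endpoint answers.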

However, my issue was that when I create the connection from Power BI, I only have one option: a field for the server name, with no description or help for what goes in it. I entered the name of the namenode where HDFS web is running. After inputting this info I got the error below.

DataSource.Error: HDFS cannot connect to server 'namenode01.test.com'. Unable to connect to the remote server.
Details:
    DataSourceKind=Hdfs
    DataSourcePath=http://namenode01.test.com:50070/webhdfs/v1
    Url=http://namenode01.test.com:50070/webhdfs/v1/

Instead of putting only the server name, I placed "http://<namenode>:50470/webhdfs/v1/<directory>" in the server field, replacing <namenode> with the name of the server and <directory> with the HDFS path I want to get the data from.

The issue now is that Power BI does not support the Parquet or sequence file formats, /cry. Only text or open formats currently seem to work, which is not unexpected.
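If you are not sure which format a given HDFS file is in, its first few bytes usually tell you: Parquet files begin with the magic bytes `PAR1` and Hadoop SequenceFiles begin with `SEQ`. The sketch below is only a pre-check you could run yourself, not something Power BI does; the function name `sniff_hdfs_file_format` is made up, and how you fetch the first bytes of the file is left out.

```python
import codecs


def sniff_hdfs_file_format(head: bytes) -> str:
    """Guess a file's format from its leading bytes (hypothetical helper)."""
    if head.startswith(b"PAR1"):
        # Parquet files start (and end) with the 4-byte magic "PAR1".
        return "parquet"
    if head.startswith(b"SEQ"):
        # Hadoop SequenceFiles start with "SEQ" followed by a version byte.
        return "sequencefile"
    # Otherwise, treat the data as text if it is valid UTF-8. An incremental
    # decoder with final=False tolerates a multi-byte character cut off at
    # the end of the sample.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(head, final=False)
    except UnicodeDecodeError:
        return "unknown-binary"
    return "text"


print(sniff_hdfs_file_format(b"PAR1\x15\x04"))
print(sniff_hdfs_file_format(b"SEQ\x06"))
print(sniff_hdfs_file_format(b"id,name\n1,alice\n"))
```

Files reported as `text` are the ones that, per the post above, currently seem to load in Power BI.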

Thanks,

Rusty

11 REPLIES

Contributor

Hi.

I found a solution for the same issue.

Just uninstall the Hive ODBC driver and reinstall it, then try to connect with ODBC and enter all the cluster information you have. It will work.

Thanks

HadoopHelp

Contributor

Hi.

I got a solution for the same issue.

Please map the cluster nodes in the hosts file on your local machine. It will work without any issue.
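One plausible reason this helps is that WebHDFS redirects file reads from the namenode to a datanode by hostname, so the client machine has to be able to resolve every node name. As a rough sketch, the Python below checks hosts-file content for missing cluster hostnames. The function name `unmapped_hosts`, the IP addresses, and `datanode01.test.com` / `datanode02.test.com` are placeholders for the example; on Windows the real file is usually `C:\Windows\System32\drivers\etc\hosts`.

```python
def unmapped_hosts(hosts_text, required):
    """Return the hostnames in `required` that have no entry in hosts_text."""
    mapped = set()
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Format: <ip> <hostname> [alias ...]; hostnames are case-insensitive.
        fields = line.split()
        mapped.update(name.lower() for name in fields[1:])
    return [host for host in required if host.lower() not in mapped]


# Placeholder hosts-file content; the addresses and datanode names are made up.
sample_hosts = """
# cluster nodes
10.0.0.10  namenode01.test.com  namenode01
10.0.0.11  datanode01.test.com
"""

cluster = ["namenode01.test.com", "datanode01.test.com", "datanode02.test.com"]
print(unmapped_hosts(sample_hosts, cluster))
```

To check a real machine, pass the contents of the local hosts file instead of `sample_hosts`; any name in the returned list still needs a mapping.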

Thanks

HadoopHelp