
Cannot create table in Ambari Hive 2.0 view

Explorer

I am trying to run the HDP tutorial (trucking example) on an HDP 2.6.5 cluster.

I was able to upload the CSV data files into HDFS. When I try to upload a new table from trucks.csv, the table preview works fine, but I get a "ServiceFormattedException" when I click the "Create" button, with the following stack trace in the Ambari Server logs:

org.apache.ambari.view.utils.hdfs.HdfsApiException: HDFS020 Could not write file /user/admin/hive/jobs/hive-job-54-2018-11-25_10-49/logs
	at org.apache.ambari.view.utils.hdfs.HdfsUtil.putStringToFile(HdfsUtil.java:57)
	at org.apache.ambari.view.hive20.resources.jobs.viewJobs.JobControllerImpl.setupLogFile(JobControllerImpl.java:220)
	at org.apache.ambari.view.hive20.resources.jobs.viewJobs.JobControllerImpl.setupLogFileIfNotPresent(JobControllerImpl.java:189)
	at org.apache.ambari.view.hive20.resources.jobs.viewJobs.JobControllerImpl.afterCreation(JobControllerImpl.java:182)
	at org.apache.ambari.view.hive20.resources.jobs.viewJobs.JobResourceManager.create(JobResourceManager.java:56)
	at org.apache.ambari.view.hive20.resources.jobs.JobServiceInternal.createJob(JobServiceInternal.java:27)
	at org.apache.ambari.view.hive20.resources.browser.DDLProxy.createJob(DDLProxy.java:384)
	at org.apache.ambari.view.hive20.resources.browser.DDLProxy.createTable(DDLProxy.java:256)
	at org.apache.ambari.view.hive20.resources.browser.DDLService.createTable(DDLService.java:147)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        ...
Caused by: java.io.IOException: Unexpected HTTP response: code=504 != 201, op=CREATE, message=Gateway Timeout
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:467)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:114)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$FsPathOutputStreamRunner$1.close(WebHdfsFileSystem.java:950)
	at org.apache.ambari.view.utils.hdfs.HdfsUtil$1.run(HdfsUtil.java:51)
	at org.apache.ambari.view.utils.hdfs.HdfsUtil$1.run(HdfsUtil.java:46)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.ambari.view.utils.hdfs.HdfsApi.execute(HdfsApi.java:513)
	at org.apache.ambari.view.utils.hdfs.HdfsUtil.putStringToFile(HdfsUtil.java:46)
	... 105 more
Caused by: java.io.IOException: Content-Type "text/html" is incompatible with "application/json" (parsed="text/html")
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.jsonParse(WebHdfsFileSystem.java:443)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:465)
	... 114 more

Any ideas on what's causing the content type error? Why is the Ambari client not setting the content type correctly when calling the HDFS API?

By the way, I have already added

hadoop.proxyuser.root.groups=*
hadoop.proxyuser.root.hosts=*

so the exception should not be caused by the Ambari user being unable to write to the HDFS volume.
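
A quick way to double-check that those proxyuser values are actually in effect (just a sanity check; this assumes the HDFS client is available on the node) is something like:

# print the proxyuser settings from the local client configuration (both should be "*")
hdfs getconf -confKey hadoop.proxyuser.root.hosts
hdfs getconf -confKey hadoop.proxyuser.root.groups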


8 REPLIES

Rising Star

You also need to add the Ambari admin user as a proxy user (the same way as root), or ensure that admin itself has read/write access to that HDFS location.
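
For example, mirroring the root entries you already added (assuming the Ambari login user is admin; change the user name if yours differs):

hadoop.proxyuser.admin.groups=*
hadoop.proxyuser.admin.hosts=*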

The error comes down to the user that Hive executes as on HDFS not having read/write access to the files. This could be hive (if impersonation is disabled) or the end user you are signed into Ambari as (if impersonation is enabled).

Explorer

@rtheron actually both "hive" and "admin" (the end user I sign into Ambari with) were added as proxy users; the error is the same.

Rising Star

Try to write to that HDFS folder using each of these users (with hdfs dfs -put, for example); whichever user Hive impersonates to access HDFS likely lacks permissions on that folder. To get around this I usually just run hdfs dfs -chmod 777 on that folder in HDFS. You may not want open permissions on that file/folder, but it's a good way to confirm the issue is actually file permissions.
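
Something along these lines, for example (paths are taken from your post, so adjust them; impersonating admin via HADOOP_USER_NAME only works on a non-kerberized cluster):

# write a test file as the hive service account
su - hive -c "hdfs dfs -put /tmp/test.txt /user/admin/hive/jobs/"
# write a test file as the admin end user (non-kerberized clusters only)
HADOOP_USER_NAME=admin hdfs dfs -put /tmp/test.txt /user/admin/hive/jobs/
# temporarily open up the folder (as the hdfs superuser) to confirm it really is a permissions issue
su - hdfs -c "hdfs dfs -chmod -R 777 /user/admin/hive/jobs"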

When impersonation is disabled, the user that needs access to the folder is 'hive'; when it is enabled, it is the user logged in to Ambari. So check your impersonation settings for Hive Views and for Hive itself.
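
If it helps while looking: the HiveServer2 impersonation flag itself is hive.server2.enable.doAs, so you can usually pull it straight out of the client config on a cluster node (the path below is the usual HDP default and may differ on your cluster):

# show the current impersonation setting from the local Hive client config
grep -A1 "hive.server2.enable.doAs" /etc/hive/conf/hive-site.xml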

Explorer

@rtheron thanks much for the quick response, but are you sure it's a file permission issue? hdfs dfs commands work just fine. The strange thing is that Ambari is able to create the Hive job folder (e.g. /user/admin/hive/jobs/hive-job-75-2018-11-26_05-14) and create the correct "query.hql" DDL file in it, but is somehow not able to write the execution results to the very same folder.

By the way, I can't seem to find the impersonation settings under Hive -> Configs -> Advanced. Do you know where they are? Thanks

Rising Star

I saw the "could not write" error in your logs and figured it would be worth confirming; if the hive user or the end user can read/write to that location, it's probably not the problem.

There are some gateway-related errors in there as well; is this happening via the Knox gateway?

Explorer

@rtheron yes, I just realized the Knox gateway is on. I turned it off and restarted Hive and HDFS, but the error remains. Should I restart the Ambari Server? I'll need to do that in the office tomorrow and give it a try. Thanks again

Hi @Eric Yuan,

I see this error:

Caused by: java.io.IOException: Unexpected HTTP response: code=504 != 201, op=CREATE, message=Gateway Timeout

Please check whether you have an HTTP/network proxy enabled at your end.

I suspect that the WebHDFS requests originating from the Hive View are actually passing through an HTTP proxy configured on your cluster. That would also explain the content-type error: a proxy typically returns an HTML error page (504 Gateway Timeout) where WebHDFS expects a JSON response. You may need to either make the requests bypass the proxy server or make the proxy work.

So please check the following:

1. Check the environment variables to find out whether any HTTP proxy is set (look for 'proxy'):

# /var/lib/ambari-agent/ambari-sudo.sh su hdfs -l -s /bin/bash -c 'env'  


2. See whether you are able to make a WebHDFS call from a terminal on the Ambari Server host, and check in the output whether the request is being passed via a proxy:

# curl -ivL -X GET "http://$ACTIVE_NAME_NODE:50070/webhdfs/v1/user/admin?op=GETHOMEDIRECTORY&user.name=admin"


3. You can also refer to the following doc on configuring HTTP proxy settings for the Ambari Server (it also covers the Ambari JVM property that excludes requests to your cluster nodes from being passed via the proxy). See: https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.2.0/bk_ambari-administration/content/ch_setting...

-Dhttp.nonProxyHosts=<pipe|separated|list|of|hosts>


4. Alternatively, you can configure "no_proxy" in "~/.bash_profile" or globally in "/etc/profile" to make sure that your internal cluster requests are not passed via the proxy:

no_proxy=".example.com"
export no_proxy


Please see if this helps, and accept the answer if it did.

Explorer

Thanks Akhil! I did forget to set nonProxyHosts in the ambari-env.sh file. Now everything works fine.
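
In case anyone else hits this: the change was along these lines (the host pattern is just an example; list your own cluster hosts), followed by an Ambari Server restart:

# in /var/lib/ambari-server/ambari-env.sh, add the flag to AMBARI_JVM_ARGS, e.g. by appending:
export AMBARI_JVM_ARGS="$AMBARI_JVM_ARGS -Dhttp.nonProxyHosts=localhost|*.example.com"
# then restart the Ambari Server
ambari-server restart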