Member since: 07-20-2018
Posts: 35
Kudos Received: 4
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 848 | 10-01-2018 10:34 PM
 | 956 | 09-07-2018 07:52 PM
11-27-2018
10:05 PM
1 Kudo
You can use NiFi: it can read data from Hive and then execute the CQL statements with PutCassandraQL to post to Cassandra.
11-27-2018
03:43 AM
I saw the "could not write to" error in your logs and figured it would be worth confirming; if the hive user or the end user can read/write to that location, it's probably not the problem. There are some gateway-related errors in there as well — is this happening via the Knox gateway?
11-26-2018
11:39 PM
Try writing to that HDFS folder as each of these users (with hdfs dfs -put, for example); whichever user Hive impersonates to access HDFS likely lacks permissions on that folder. To get around this I usually just run hdfs dfs -chmod 777 on that folder. You may not want open permissions on that file/folder long term, but it's a good way to confirm the issue really is file permissions. When impersonation is disabled, the user that needs access to the folder is 'hive'; when enabled, it is the user logged in to Ambari — so check your impersonation settings for Hive Views and for Hive itself.
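A minimal sketch of that permission check — the warehouse path and test file here are placeholders, substitute your own:

```shell
# Check current permissions on the target folder
hdfs dfs -ls -d /apps/hive/warehouse/mydb.db/mytable

# Try writing as the user Hive impersonates (run as that OS user)
sudo -u hive hdfs dfs -put /tmp/testfile /apps/hive/warehouse/mydb.db/mytable/

# To confirm the diagnosis, temporarily open permissions (undo afterwards)
sudo -u hdfs hdfs dfs -chmod 777 /apps/hive/warehouse/mydb.db/mytable
```

If the put succeeds only after the chmod, file permissions were the problem.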
11-26-2018
10:13 PM
You also need to add the Ambari admin user as a proxy user (the same way as root), or ensure that admin itself has read/write access in that HDFS location. The error comes down to this: the user Hive executes as on HDFS doesn't have read/write access to the files. That user could be hive (if impersonation is disabled) or the end user you are signed in to Ambari as (if impersonation is enabled).
11-26-2018
10:06 PM
As an alternative to the Sandbox, you can use Cloudbreak to deploy on virtual infrastructure in AWS or Azure; recent releases of HDP don't do as well on smaller/single-node instances. https://docs.hortonworks.com/HDPDocuments/Cloudbreak/Cloudbreak-2.7.0/content/index.html#get-started
11-16-2018
08:51 PM
Another option could be to use Ambari Log Search.
10-02-2018
10:58 PM
Using the Hive Web UI (Hive View) does not mimic the Pentaho doAs command correctly. Hive View executes the doAs as the "admin" user while impersonating the end user (the user logged in to Ambari), and "admin" would by default have the privilege to do this. You need to test this on the command line using the beeline utility, specifically with a JDBC connection that invokes the impersonation command on behalf of the user that Pentaho is configured to connect as (if the Pentaho processor has a specific connection string, you can use that as your JDBC connection string in beeline). The exercise here is to connect to Hive exactly the same way that Pentaho would; using Hive View does not (necessarily) do that. An example of a Kerberos-authenticated Hive user impersonating user "testuser": jdbc:hive2://HiveHost:10001/default;principal=hive/_HOST@HOST1.COM;hive.server2.proxy.user=testuser For more information, see this article on impersonation in the Zeppelin notebook interface: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_zeppelin-component-guide/content/config-hive-access.html
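A sketch of that beeline test — the host, realm, superuser principal, and proxied user are placeholders carried over from the example connection string:

```shell
# Authenticate as the superuser that Pentaho connects as
kinit admin@HOST1.COM

# Connect via beeline, impersonating testuser through the proxy-user setting
beeline -u "jdbc:hive2://HiveHost:10001/default;principal=hive/_HOST@HOST1.COM;hive.server2.proxy.user=testuser"
```

If doAs is not permitted for that superuser, the connection itself should fail with the real HiveServer2 error rather than a Pentaho-wrapped one.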
10-02-2018
04:52 PM
There's some info on this thread that should point you in the right direction, let me know if you get stuck: https://community.hortonworks.com/questions/67900/startstop-processor-via-nifi-api.html
10-02-2018
09:16 AM
You can generate a flow file that contains your SQL commands and then execute it with PutHiveQL. I've done this by replacing the flow file content in an existing flow with SQL commands (a replace processor, with the SQL typed straight into the processor config). If you start your flow with the processor that creates the flow file of SQL commands, it will just re-execute continuously. You can also read an external file containing your SQL using GetFile, and then execute that with PutHiveQL.
10-01-2018
10:46 PM
How large is the cluster? How many nodes, and how much memory? I've seen similar issues on smaller clusters with Hive Views, because Hive Views does not reuse YARN containers; if a new query is executed before previous containers get cleaned up, queries can remain in the ACCEPTED state indefinitely.
10-01-2018
10:39 PM
Have you tried executing the same insert using beeline with the same credentials? It looks like the Kettle engine is invoking doAs (execute a command on behalf of user x while logged in as some superuser); confirm that doAs is in fact enabled/possible for that admin user. You can invoke the same doAs in your beeline connection when testing — in doing this you should see the actual Hive error (if any) that's happening.
10-01-2018
10:34 PM
You should be able to call the NiFi API from a shell-executed script running within NiFi itself.
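A sketch of that kind of API call using curl against the NiFi REST API — the host and processor id are placeholders, and the revision version must match the processor's current revision, which is why it's fetched first:

```shell
# Placeholder NiFi host and processor id
NIFI=http://localhost:8080/nifi-api
PROC_ID=01234567-89ab-cdef-0123-456789abcdef

# Read the processor entity to get its current revision version
VERSION=$(curl -s "$NIFI/processors/$PROC_ID" \
  | python -c 'import sys, json; print(json.load(sys.stdin)["revision"]["version"])')

# Start the processor (use "STOPPED" instead of "RUNNING" to stop it)
curl -s -X PUT -H 'Content-Type: application/json' \
  -d "{\"revision\":{\"version\":$VERSION},\"state\":\"RUNNING\"}" \
  "$NIFI/processors/$PROC_ID/run-status"
```

On a secured cluster you would also need to pass authentication (a token or client certificate) with each request.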
09-18-2018
03:01 AM
In HDP 3.0 and above there should be no significant impact on tables that have ACID v2 applied; in fact, the standard upgrade process to HDP 3.0 will enable this globally by default. Keep in mind that tables with significant volumes of updates/deletes issued against them may see some performance degradation in various scenarios; please review the 3.0 documentation for more detail on this. The ability to update/delete on a given table is controlled by Ranger, and users will require the appropriate Ranger permissions in order to use these capabilities.
09-10-2018
06:38 PM
Install the LDAP utilities via yum (you may need to Google or run yum whatprovides for ldapsearch — I believe the package is openldap-clients), or you can test from another server that already has ldapsearch installed. If you're not familiar with the ldapsearch syntax, you will have to do a bit of research on how to structure the group membership filter correctly.
09-10-2018
06:33 PM
This would depend on your Ranger configuration; your group DN and the membership/memberOf attributes need to be correct for your AD setup. Try testing your configuration using ldapsearch commands with the parameters you've configured, to ensure you are getting the proper membership response from your config.
09-07-2018
07:52 PM
The problem here could be that the external table isn't structured to make the filter/split of this file optimal. For example, "WHERE department = 'xxx' AND time='yyyy';" executed against a non-partitioned external table causes a complete file scan for each statement, so you're reading the entire 10 GB every time. You may want to read the file into a NiFi flow file with a configured buffer, as actual data, instead of taking the external table approach. Alternatively, you can use an intermediate ORC table that inserts the entire external file in some sorted manner before splitting it into multiple tables based on some filter (which you would optimize for in your intermediate table structure). I'd personally recommend the first approach, though.
09-07-2018
06:38 PM
Then yes, that is supported as far as I'm aware
09-07-2018
06:00 PM
How large is this cluster? There's another thread on here discussing similar behavior on small clusters.
09-07-2018
05:52 PM
NiFi can absolutely do this, but you may want to look at skipping the HDFS layer and going directly from the RDBMS step to a Hive-managed ORC table: rdbms --(via data in NiFi flow file)--> ORC table --> R model layer --> model output Hive tables? The reporting layer should probably be independent and access the Hive layer on its own.
09-07-2018
05:41 PM
You should be able to use Ranger group policies in this scenario. Are your end users using any edge nodes to access the HDP environment? They would still need to kinit (which would require AD credentials) on those edge nodes, so the local host account may not be AD-authorized, but access into any HDP service would be.
08-21-2018
01:42 AM
In 2.6, ACID needs to be enabled on a per-table basis in addition to enabling ACID transactions globally in Ambari. It's worth noting that you should consider your workload before enabling ACID on a table: for tables with large-volume updates/deletes, this could cause performance issues. HDP 3.0 features ACID v2 for Hive with significant improvements; when upgrading to 3.0, ACID v2 is enabled globally, since these performance impacts have been reduced to negligible levels.
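A sketch of the per-table step in HDP 2.6, run through beeline — the connection string, table name, and bucket count are illustrative (Hive 1.x ACID tables must be bucketed ORC with the transactional property set):

```shell
# Create a transactional (ACID) table; assumes ACID is already enabled globally
beeline -u "jdbc:hive2://hivehost:10000/default" -e "
CREATE TABLE acid_demo (id INT, val STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');"
```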
08-21-2018
01:36 AM
1 Kudo
It would depend on how the CLI is invoked; as long as consecutive queries happen within the same session, the existing container should be reused, which is essentially the behavior you would see in beeline. Do you recall whether the first query from Hive View succeeded and subsequent queries never got executed (just submitted)? That would confirm the lack of available memory for new containers.
08-17-2018
05:39 PM
See https://community.hortonworks.com/questions/212611/hivepartitionssmall-filesconcatenate.html
08-17-2018
05:37 PM
Try to manually SSH between the Ambari host and the new host using the private/public key pair in a terminal; in some cases a first-time connection needs to be established to add the host to the known_hosts file.
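A sketch of that check — the hostname and key path are placeholders, run it from the Ambari server:

```shell
# First-time connection: accept the host key prompt when asked
ssh -i /root/.ssh/id_rsa root@newhost.example.com exit

# Confirm the host key was recorded
grep newhost.example.com ~/.ssh/known_hosts
```

Once the key is in known_hosts, the Ambari agent bootstrap should no longer be blocked by the host-key confirmation prompt.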
08-17-2018
05:35 PM
I've had success in the past by first using all the Ambari-required details to run an ldapsearch query in a terminal. Do this from the host where you are configuring Ambari; if there are any issues with the credentials or any of the configuration parameters, the ldapsearch query should highlight them (the OpenLDAP utilities need to be installed to access ldapsearch).
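An ldapsearch sketch of that kind of check — the server, bind DN, base DN, and username are placeholders, substitute the values from your Ambari LDAP configuration:

```shell
# Bind with the manager credentials Ambari will use and look up a user,
# returning the attributes Ambari needs (name and group membership)
ldapsearch -H ldap://ad.example.com:389 \
  -D "cn=ambari-bind,ou=service,dc=example,dc=com" -W \
  -b "ou=users,dc=example,dc=com" \
  "(sAMAccountName=jdoe)" cn memberOf
```

A bind failure points at the manager DN/password; zero results with a successful bind points at the base DN or filter attributes.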
08-17-2018
05:25 PM
Yes, as long as the appropriate clients are installed on the slave node. If you also have the /etc/config populated with the correct details for connecting to your instance, then no connection parameters need to be specified for the clients (this is populated automatically if the slave node is deployed/configured by Ambari). In that case you submit the job exactly as you would on any other node.
08-17-2018
05:23 PM
As far as I'm aware, the Hive database engine uses the metastore to build the query plan; this is also where the optimization process happens. I'm curious as to why the specifics of this would impact your use case.
08-17-2018
05:20 PM
Can maxSplitSize be set globally for the cluster, to allow a size large enough to combine those two files?
08-09-2018
11:05 PM
I agree that's the expected behavior; I'm merely suggesting some troubleshooting to dig into exactly why the issue is happening.
08-09-2018
11:04 PM
1 Kudo
I've seen this happen on smaller clusters because the Ambari Hive View (and Hive View v2) doesn't reuse containers; it requires a new container for each query while previous containers haven't been completely released yet. Do you see the same behavior in beeline, for example? In some cases, reserving a number of Tez containers for Hive can help, assuming the resources for the containers are typically available.