Member since 07-20-2018 | 35 Posts | 4 Kudos Received | 2 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1526 | 10-01-2018 10:34 PM |
| | 1596 | 09-07-2018 07:52 PM |
11-27-2018
03:43 AM
I saw the "could not write to" error in your logs and figured it would be worth confirming. If user hive or the end user can read from and write to that location, it's probably not the problem. There are some gateway-related errors in there as well. Is this happening via the Knox gateway?
11-26-2018
11:39 PM
Try writing to that HDFS folder as each of these users (with hdfs dfs -put, for example). Whichever user Hive impersonates to access HDFS likely lacks permissions on that folder.

To get around this I usually just run hdfs dfs -chmod 777 on that folder in HDFS. You may not want open permissions on that file/folder, but it's a good way to confirm that the issue is actually file permissions.

When impersonation is disabled, the user that needs access to the folder is 'hive'; when it is enabled, it is the user logged in to Ambari. So check your impersonation settings for Hive views and for Hive itself.
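
A rough sketch of the per-user write test, in case it helps. It only builds the command lines and prints them, so nothing is executed against the cluster. The `sudo -u` prefix is an assumption that fits a non-Kerberized cluster (on a Kerberized one you would kinit as each user instead), and the user names and paths are placeholders, not values from this thread.

```python
import shlex


def write_test_commands(users, local_file, hdfs_dir):
    """Build one 'hdfs dfs -put' command per user for a manual write test."""
    commands = []
    for user in users:
        # Assumption: switching user with sudo is enough to act as that user.
        # On a Kerberized cluster, authenticate as each user with kinit instead.
        commands.append(
            f"sudo -u {shlex.quote(user)} hdfs dfs -put "
            f"{shlex.quote(local_file)} {shlex.quote(hdfs_dir)}"
        )
    return commands


# Placeholder users and paths; substitute the ones from your own setup.
for command in write_test_commands(["hive", "admin"], "/tmp/probe.txt", "/tmp/target"):
    print(command)
```

Whichever of the printed commands fails with a permission error when you run it points at the user that lacks write access.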
11-26-2018
10:13 PM
You also need to add the Ambari admin user as a proxy user (the same way as root), or ensure that admin itself has read/write access to that HDFS location. The error comes down to the user that Hive is executing as on HDFS not having read/write access to the files. This could be hive (if impersonation is disabled) or the end user you are signed in to Ambari as (if impersonation is enabled).
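
For reference, proxy users in Hadoop are declared through the hadoop.proxyuser.<user>.hosts and hadoop.proxyuser.<user>.groups properties in core-site. The sketch below just generates those two keys for a given user. The "*" default is an assumption that mirrors a wide-open test setup; you would likely want to restrict hosts and groups on a real cluster.

```python
def proxyuser_properties(user, hosts="*", groups="*"):
    """Return the core-site properties that let `user` act as a proxy user."""
    return {
        # Hosts from which the proxy user may impersonate others.
        f"hadoop.proxyuser.{user}.hosts": hosts,
        # Groups whose members the proxy user may impersonate.
        f"hadoop.proxyuser.{user}.groups": groups,
    }


# "admin" is the Ambari admin user from the post; "root" is shown for comparison.
for name in ("root", "admin"):
    for key, value in proxyuser_properties(name).items():
        print(f"{key}={value}")
```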
11-16-2018
08:51 PM
Another option could be to use Ambari Log Search.
10-02-2018
10:58 PM
Using the Hive web UI (Hive View) does not mimic the Pentaho DoAs command correctly. Hive View will execute the DoAs as the "admin" user while impersonating the end user (the user logged in to Ambari), and "admin" would by default have the privilege to do this.

You need to test this on the command line using the beeline utility, specifically with a JDBC connection that invokes the impersonation command on behalf of the user that Pentaho is configured to connect as. (If the Pentaho processor has a specific connection string, you can use that as your JDBC connection string in beeline as well.) The exercise here is to connect to Hive exactly the same way Pentaho would; using Hive View does not (necessarily) do that.

An example of the Kerberos-authenticated user hive impersonating user "testuser":

jdbc:hive2://HiveHost:10001/default;principal=hive/_host@HOST1.COM;hive.server2.proxy.user=testuser

For more information, see the article below on impersonation in the Zeppelin notebook interface: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_zeppelin-component-guide/content/config-hive-access.html
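
If you want to script the test, here is a minimal sketch that assembles a connection string with the hive.server2.proxy.user parameter and wraps it in a beeline invocation using beeline's -u option. The host, port, principal, and user are the example values from this post, not values from your cluster, and the helper names are made up for illustration.

```python
def hive_jdbc_url(host, port, database, principal, proxy_user):
    """Assemble a HiveServer2 JDBC URL that asks Hive to impersonate proxy_user."""
    return (
        f"jdbc:hive2://{host}:{port}/{database}"
        f";principal={principal}"
        f";hive.server2.proxy.user={proxy_user}"
    )


def beeline_command(url):
    """Return the argument list for connecting with beeline to the given URL."""
    return ["beeline", "-u", url]


# Example values taken from the post; replace them with your own.
url = hive_jdbc_url("HiveHost", 10001, "default", "hive/_host@HOST1.COM", "testuser")
print(" ".join(beeline_command(f"'{url}'")))
```

The URL is quoted in the printed command because the semicolons would otherwise be interpreted by the shell.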
10-02-2018
04:52 PM
There's some info in this thread that should point you in the right direction: https://community.hortonworks.com/questions/67900/startstop-processor-via-nifi-api.html Let me know if you get stuck.
10-01-2018
10:39 PM
Have you tried executing the same insert using beeline with the same credentials? It looks like the Kettle engine is invoking doAs (executing a command on behalf of user x while logged in as some superuser). Confirm that doAs is in fact enabled/possible for that admin user. You can invoke the same doAs in your beeline connection when testing; in doing this you should see the actual Hive error (if any) that's happening.
10-01-2018
10:34 PM
You should be able to call a NiFi processor through the API from a shell script executed from within NiFi.
09-07-2018
07:52 PM
The problem here could be that the external table isn't structured to make the filter/split of this file optimal. For example, " WHERE department = 'xxx' AND time='yyyy';" executed against a non-partitioned external table causes a complete scan of the 10 GB file for each statement (so you're reading the entire 10 GB every time).

You may want to read the file into a NiFi flow file with a configured buffer, as actual data, instead of taking the external table approach. Alternatively, you can use an intermediate ORC table: insert the entire external file into it in some sorted manner, then split it into multiple tables based on some filter (which you would optimize for in your intermediate table structure). I'd personally recommend the first approach, though.
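
To illustrate why the repeated filters hurt, here is a toy model in plain Python (not Hive): each filter against an unorganized file touches every row, while grouping the rows once by (department, time) lets each later lookup touch only the matching rows. The rows and function names are invented for illustration; real Hive partitioning and ORC behaviour are more involved than this.

```python
from collections import defaultdict

# Invented sample rows standing in for the contents of the large file.
ROWS = [
    {"department": "sales", "time": "2018-01", "value": 1},
    {"department": "sales", "time": "2018-01", "value": 2},
    {"department": "sales", "time": "2018-02", "value": 3},
    {"department": "hr", "time": "2018-01", "value": 4},
    {"department": "hr", "time": "2018-01", "value": 5},
    {"department": "hr", "time": "2018-02", "value": 6},
]


def full_scan(rows, department, time):
    """Filter like a non-partitioned table: every row is read for every query."""
    rows_read = 0
    matches = []
    for row in rows:
        rows_read += 1
        if row["department"] == department and row["time"] == time:
            matches.append(row)
    return matches, rows_read


def build_partitions(rows):
    """One pass over the data that groups rows by the columns used in the filter."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[(row["department"], row["time"])].append(row)
    return partitions


def partition_lookup(partitions, department, time):
    """Filter like a partitioned table: only the matching group is read."""
    matches = partitions.get((department, time), [])
    return matches, len(matches)


partitions = build_partitions(ROWS)
for dept, period in [("sales", "2018-01"), ("hr", "2018-02")]:
    _, scanned = full_scan(ROWS, dept, period)
    _, read = partition_lookup(partitions, dept, period)
    print(dept, period, "full scan rows:", scanned, "partitioned rows:", read)
```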
08-17-2018
05:39 PM
See https://community.hortonworks.com/questions/212611/hivepartitionssmall-filesconcatenate.html