Member since: 08-08-2013
Posts: 339
Kudos Received: 132
Solutions: 27
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 14779 | 01-18-2018 08:38 AM
 | 1555 | 05-11-2017 06:50 PM
 | 9114 | 04-28-2017 11:00 AM
 | 3418 | 04-12-2017 01:36 AM
 | 2809 | 02-14-2017 05:11 AM
03-23-2017 05:01 AM
Hi @mathieu.d, do you think this needs to be raised as an issue/bug, or is my case that unusual (I don't think so, honestly 😉)?
03-22-2017 10:15 AM
Thanks for this hint. I can confirm that by granting SELECT privileges at the database level, the INSERT works. From a security perspective this looks a bit weird. Is this a known bug?!
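For reference, a minimal sketch of the database-level grant described above, assuming the sentrydemo database from this thread, a hypothetical role name sentry_readwrite, and the quickstart beeline URL from the original question (run as a Sentry admin user):

```
# Grant SELECT at the database level so the temporary
# values__tmp__table__N created by INSERT ... VALUES can be read.
# Role name 'sentry_readwrite' and the connecting user are assumptions.
beeline -u "jdbc:hive2://quickstart.cloudera:10000/default" -n hive -e "
GRANT SELECT ON DATABASE sentrydemo TO ROLE sentry_readwrite;
"
```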
03-22-2017 09:43 AM
Thanks for jumping in here 😉 @mathieu.d, @saranvisa: I played around further in this 5.8 sandbox, and even using the Hue => Security UI produces the same error/issue. What I did (see the sketch below this list):
- created 2 groups in Linux (one for read-only, the other for read-write) and added users to them
- created 2 roles in Sentry (again, one for read-only and the other for read-write) and granted the corresponding, previously created group to each
- created a Hive database "sentrydemo" and one table in it
- opened 2 terminals with 2 different users (one 'read', one 'read-write') and connected to Hive via beeline
- "select * from hivetablename" works fine
- "insert into hivetablename values (100, 'test entry')" gives me the following error: The required privileges: Server=server1->Db=sentrydemo->Table=values__tmp__table__2->Column=tmp_values_col1->action=select; (state=42000,code=40000)
"Enable Sentry synchronisation" is enabled in the HDFS config, and the "real" table folders have the correct permissions set according to the groups. Where are those "values__tmp__table..." tables stored, and why are they not covered by the Sentry permissions I defined? Any ideas? In the end it is not just an "INSERT-only permission not working" problem; right now it is a general "INSERT not working" issue, since the "read-write" group has the privileges SELECT & INSERT.
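A minimal sketch of the setup described above, with assumed group, role, and user names (hive_read / hive_readwrite, sentry_read / sentry_readwrite, reader1 / writer1), the table name hivetablename from the post, and the quickstart beeline URL from the original question; the real names in the sandbox may differ:

```
# OS groups for the read-only and read-write users (names are assumptions)
sudo groupadd hive_read
sudo groupadd hive_readwrite
sudo usermod -aG hive_read reader1        # 'reader1' is a placeholder user
sudo usermod -aG hive_readwrite writer1   # 'writer1' is a placeholder user

# Sentry roles and table-level grants, issued as an admin via beeline
cat > /tmp/sentry_setup.sql <<'EOF'
CREATE ROLE sentry_read;
CREATE ROLE sentry_readwrite;
GRANT ROLE sentry_read TO GROUP hive_read;
GRANT ROLE sentry_readwrite TO GROUP hive_readwrite;
GRANT SELECT ON sentrydemo.hivetablename TO ROLE sentry_read;
GRANT SELECT ON sentrydemo.hivetablename TO ROLE sentry_readwrite;
GRANT INSERT ON sentrydemo.hivetablename TO ROLE sentry_readwrite;
EOF
beeline -u "jdbc:hive2://quickstart.cloudera:10000/default" -n hive -f /tmp/sentry_setup.sql
```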
03-20-2017 12:04 PM
Thanks @saranvisa for replying. But the HDFS permissions should be adjusted by Sentry itself, since HDFS sync is enabled, right? Nevertheless, the HDFS permissions look like this:
# file: /user/hive/warehouse/shipment_test
# owner: hive
# group: hive
group:ingester:-wx
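For completeness, the output above can be reproduced on any node with the HDFS client; a minimal check, using the warehouse path from the output:

```
# Show the (Sentry-managed) HDFS ACLs on the table directory
hdfs dfs -getfacl /user/hive/warehouse/shipment_test
```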
03-20-2017 08:05 AM
Hi, I am playing around with Sentry and want to provide "write-only" permission to a user via: grant insert ON default.shipment_test TO ROLE ingester; But if I connect to Hive via beeline and execute an insert statement, I receive this error:
0: jdbc:hive2://quickstart.cloudera:10000/def> insert into shipment_test values (1,'1111');
Error: Error while compiling statement: FAILED: SemanticException No valid privileges
User writer does not have privileges for QUERY
The required privileges: Server=server1->Db=default->Table=values__tmp__table__2->Column=tmp_values_col1->action=select; (state=42000,code=40000)
The environment is the CDH 5.8 sandbox. The Linux user 'writer' is a member of the group 'ingester'; the group 'ingester' is assigned to the proper role in Sentry and was given the INSERT privilege:
grant role sentry_ingester to group ingester;
grant INSERT ON default.shipment_test TO ROLE sentry_ingester;
Why does the statement error out due to a "select" issue on a tmp table? Do I have to specify some more privileges, or how do you grant "INSERT"-only permissions to a group? Thanks in advance...
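As a troubleshooting aid, the roles and privileges that Sentry actually sees can be listed from beeline; a minimal sketch using the group and role names from the post, assuming an admin connection (the exact output format varies by CDH version):

```
# List the roles granted to the group and the privileges granted to the role
cat > /tmp/check_grants.sql <<'EOF'
SHOW ROLE GRANT GROUP ingester;
SHOW GRANT ROLE sentry_ingester;
EOF
beeline -u "jdbc:hive2://quickstart.cloudera:10000/default" -n hive -f /tmp/check_grants.sql
```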
Labels:
- Apache Hive
- Apache Sentry
03-14-2017 10:25 PM
Hi, to be able to set timestamp-based filters on HBase scan/get commands, I first need to know which timestamps are there. Therefore my question: how can I select/get/retrieve the max timestamp within a table? ...preferably via 'hbase shell'. Finally, I want to store this max timestamp in a file and use it in the next run of a manual script to get all HBase rows which have timestamp > previous_max_timestamp. Thanks in advance...
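One possible approach, sketched below: scan the table from hbase shell, extract the timestamps the shell prints for each cell, and keep the maximum for the next run. The table name, the state file, and the assumption that the shell prints timestamps as epoch milliseconds (as HBase 1.x / CDH 5.x does) are all assumptions, and a full first scan can be expensive on large tables:

```
#!/bin/bash
# Incremental scan sketch: remember the max cell timestamp and only scan newer cells next time.
TABLE=mytable                 # placeholder table name
STATE=/tmp/last_max_ts        # where the previous max timestamp is kept

PREV=$(cat "$STATE" 2>/dev/null || echo 0)

# Scan only cells newer than the previous max; TIMERANGE is [min, max) in epoch millis
echo "scan '$TABLE', {TIMERANGE => [$((PREV + 1)), 9999999999999]}" | hbase shell > /tmp/scan_out.txt

# Pull the 'timestamp=...' values out of the shell output and keep the largest
MAX=$(grep -o 'timestamp=[0-9]*' /tmp/scan_out.txt | cut -d= -f2 | sort -n | tail -1)
[ -n "$MAX" ] && echo "$MAX" > "$STATE"
```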
Labels:
- Apache HBase
02-14-2017 05:11 AM
Hi again, just fixed it by adding the field 'id' to the Solr schema, but I didn't find that hint anywhere in the HBaseMapReduceIndexer doc... therefore I was unsure, initially 😉
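For anyone hitting the same error, a minimal sketch of the fix on a Cloudera Search setup managed with solrctl; the instancedir/collection name 'hbase_collection' is an assumption:

```
# Pull the instance directory, add an 'id' field to conf/schema.xml, push it back and reload
solrctl instancedir --get hbase_collection /tmp/hbase_collection
# In /tmp/hbase_collection/conf/schema.xml add, inside the <fields> section:
#   <field name="id" type="string" indexed="true" stored="true" required="true"/>
# and make sure the schema declares:
#   <uniqueKey>id</uniqueKey>
solrctl instancedir --update hbase_collection /tmp/hbase_collection
solrctl collection --reload hbase_collection
```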
02-14-2017 04:43 AM
Hello, my HBaseIndexer MR job failed with the following error message:
2017-02-14 10:57:29,676 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1484216395768_0113_r_000000_3: Error: java.io.IOException: Batch Write Failure
at org.apache.solr.hadoop.BatchWriter.throwIf(BatchWriter.java:239)
at org.apache.solr.hadoop.BatchWriter.queueBatch(BatchWriter.java:181)
at org.apache.solr.hadoop.SolrRecordWriter.close(SolrRecordWriter.java:275)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.solr.common.SolrException: ERROR: [doc=#0;#0;#0;#0;#27;�z] unknown field 'id'
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:185)
at org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:78)
Why does it want to write to a field called 'id'? Neither the Solr schema nor the morphline defines a field called 'id'... Is this a prerequisite for the HBaseIndexer MR tool to work? Thanks in advance...
Labels:
- Apache HBase
- Apache Solr
01-31-2017 09:19 AM
Hi @Baruch AMOUSSOU DJANGBAN, to avoid running out of disk space you should use OS tools to manage the log files, e.g. logrotate. You can configure it to run on a daily basis, compressing log files and removing old ones. How exactly depends on your requirements, e.g. how long you want to keep the log files on the local nodes, etc. You can also check whether you have DEBUG logging enabled in one of your services where you do not need it, which heavily increases the amount of log output. If you want to collect the logs from all nodes in one central location for much easier digging into issues, you could create a data flow, e.g. ingesting all the log entries into Solr. HTH
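For illustration, a minimal logrotate policy along those lines; the log path, retention, and schedule are assumptions and should be adapted to your own requirements:

```
# Install a daily rotation policy for a Hadoop service's logs (path is an assumption)
sudo tee /etc/logrotate.d/hadoop-hdfs >/dev/null <<'EOF'
/var/log/hadoop-hdfs/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
EOF
```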
12-13-2016 07:37 AM
Thanks @slachterman, that's perfect. I missed the attached XML on my first view of your article 😉