Member since: 07-01-2015
Posts: 460
Kudos Received: 78
Solutions: 43
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1345 | 11-26-2019 11:47 PM
 | 1304 | 11-25-2019 11:44 AM
 | 9470 | 08-07-2019 12:48 AM
 | 2172 | 04-17-2019 03:09 AM
 | 3483 | 02-18-2019 12:23 AM
01-12-2017
07:06 AM
The problem is probably that your CDH release ships an old version of Sentry (sentry-1.5.1+cdh5.9.0+261). The Kafka broker needs the sentry-binding-kafka.jar file, which is only available starting with Sentry 1.7: "Starting from 1.7.0 release, Apache Sentry has Kafka binding that can be used to enable authorization in Apache Kafka with Apache Sentry." Confirmed here: https://cwiki.apache.org/confluence/display/SENTRY/Apache+Kafka+Authorization+with+Apache+Sentry
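For reference, once you have a Sentry release with the Kafka binding (1.7.0+), the wiki wires it into the broker roughly like this. I am quoting the property names and the authorizer class from memory of that wiki page, so treat them as assumptions to verify against your distribution:

# server.properties (Kafka broker) -- sketch based on the Sentry wiki, not verified on CDH
authorizer.class.name=org.apache.sentry.kafka.authorizer.SentryKafkaAuthorizer
# location of the Sentry client configuration; path is a placeholder
sentry.kafka.site.url=file:///etc/sentry/conf/sentry-site.xml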
01-04-2017
02:49 AM
I would go directly.
10-05-2016
02:34 AM
1 Kudo
Update: running EXPLAIN on the problematic query fails:

explain insert into tmp.tab partition (day_id)
select t.fn, cast( t.ROW_COUNT as int ), cast(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') as int) as DAY_ID
from ( select count(*) as row_count, max( file_name ) as fn from tmp.tab group by file_name ) t;

Query: explain insert into tmp.tab partition (day_id)
select t.fn, cast( t.ROW_COUNT as int ), cast(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') as int) as DAY_ID
from ( select count(*) as row_count, max( file_name ) as fn from tmp.tab group by file_name ) t
ERROR: IllegalStateException: null

But EXPLAIN on the working query succeeds:

explain insert into tmp.tab partition (day_id)
select t.fn, cast( t.ROW_COUNT as int ), cast(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') as int) as DAY_ID
from ( select rc as ROW_COUNT, file_name as fn from tmp.tab ) t;

Query: explain insert into tmp.tab partition (day_id)
select t.fn, cast( t.ROW_COUNT as int ), cast(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') as int) as DAY_ID
from ( select rc as ROW_COUNT, file_name as fn from tmp.tab ) t
+--------------------------------------------------------------------------------------------------------------------------+
| Explain String |
+--------------------------------------------------------------------------------------------------------------------------+
| WARNING: The following tables are missing relevant table and/or column statistics. |
| tmp.tab |
| |
| WRITE TO HDFS [tmp.tab, OVERWRITE=false, PARTITION-KEYS=(CAST(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') AS INT))] |
| | partitions=1 |
| | |
| 00:SCAN HDFS [tmp.tab] |
| partitions=0/0 files=0 size=0B |
+--------------------------------------------------------------------------------------------------------------------------+
Fetched 8 row(s) in 0.03s

I noticed the warning about missing stats on the empty table, so I ran COMPUTE STATS on tmp.tab. This helped, so the problem is solved!

Query: explain insert into tmp.tab partition (day_id)
select t.fn, cast( t.ROW_COUNT as int ), cast(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') as int) as DAY_ID
from ( select count(*) as row_count, max( file_name ) as fn from tmp.tab group by file_name ) t
+--------------------------------------------------------------------------------------------------------------------------+
| Explain String |
+--------------------------------------------------------------------------------------------------------------------------+
| Estimated Per-Host Requirements: Memory=0B VCores=0 |
| |
| WRITE TO HDFS [tmp.tab, OVERWRITE=false, PARTITION-KEYS=(CAST(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') AS INT))] |
| | partitions=1 |
| | |
| 01:AGGREGATE [FINALIZE] |
| | output: count(*), max(file_name) |
| | group by: file_name |
| | |
| 00:SCAN HDFS [tmp.tab] |
| partitions=0/0 files=0 size=0B |
+--------------------------------------------------------------------------------------------------------------------------+
Fetched 11 row(s) in 0.03s
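To recap the fix as one runnable snippet (all names exactly as in the post):

-- compute table and column statistics so the planner has estimates for tmp.tab
compute stats tmp.tab;

-- the originally failing statement now plans and executes
insert into tmp.tab partition (day_id)
select t.fn, cast( t.row_count as int ), cast(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') as int) as day_id
from ( select count(*) as row_count, max( file_name ) as fn from tmp.tab group by file_name ) t;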
08-17-2016
01:13 AM
Yes, you are right: there has to be an explicit grant on that URI, not just HDFS access to the given directory. I don't understand why the documentation doesn't explain this more clearly. Thanks!
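For anyone who finds this later, such a URI grant (issued through Hive or Impala with Sentry enabled) looks roughly like the following; the role name, NameNode address, and path are placeholders:

-- grant the role explicit access to the URI, in addition to its HDFS permissions
GRANT ALL ON URI 'hdfs://namenode:8020/path/to/dir' TO ROLE my_role;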
07-29-2016
12:09 PM
1 Kudo
Filed the JIRA: https://issues.cloudera.org/browse/IMPALA-3938. Thanks a lot for the detailed report and the easy reproduction!
06-13-2016
10:20 PM
But how would he fix it if he did need it? I am experiencing the same thing.
06-10-2016
05:44 AM
Actually, this has already been resolved. We changed the CREATE TABLE statement and added #b (hash b, meaning binary) to the column mapping, so the HBase SerDe reads the cell as a binary-encoded value instead of a string:

create external table md_extract_file_status (
  table_key string,
  fl_counter bigint
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,colfam:FL_Counter#b')
TBLPROPERTIES ('hbase.table.name' = 'HBTABLE');
04-27-2016
07:56 AM
Hello, YARN's container-executor makes three checks (see the source code):
- it compares the user name with the string "root": strcmp(user, "root") == 0
- it verifies whether the user is whitelisted: !is_whitelisted(user)
- it checks the user's UID against the minimum user id: user_info->pw_uid < min_uid
For now, the only workaround I found is to create a new user with UID and GID equal to 0, add that user's name to the whitelist, and set the minimum user id to 0 (see the sketch below). There is an important motivation for using root: making a backup with distcp to a target location that is an NFS filesystem, or another shared filesystem mounted locally on the datanode/worker node. In that case, if you run the job as a normal user, it is not possible to change the owner of the files, so the distcp backup fails. Obviously, running it as root fails too, because of the hard-coded check. Kind Regards
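A minimal sketch of that workaround in container-executor.cfg; min.user.id and allowed.system.users are the standard properties of this file, but the user name below is a hypothetical example, and note that running containers with UID 0 is a real security risk:

# container-executor.cfg -- illustrative values only
min.user.id=0
# hypothetical user created with uid=0 and gid=0
allowed.system.users=rootalias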
02-17-2016
04:12 AM
1 Kudo
Try this one:
http://www.cloudera.com/documentation/manager/5-1-x/Configuring-Hadoop-Security-with-Cloudera-Manager/cm5chs_enable_security_s8.html