Member since: 07-01-2015
Posts: 460
Kudos Received: 78
Solutions: 43
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1346 | 11-26-2019 11:47 PM
 | 1304 | 11-25-2019 11:44 AM
 | 9471 | 08-07-2019 12:48 AM
 | 2175 | 04-17-2019 03:09 AM
 | 3484 | 02-18-2019 12:23 AM
01-12-2017
07:06 AM
The problem is probably that CDH ships an old version of Sentry (sentry-1.5.1+cdh5.9.0+261). The Kafka broker needs the sentry-binding-kafka.jar file, which is only available starting with Sentry 1.7: "Starting from 1.7.0 release, Apache Sentry has Kafka binding that can be used to enable authorization in Apache Kafka with Apache Sentry." Confirmed here: https://cwiki.apache.org/confluence/display/SENTRY/Apache+Kafka+Authorization+with+Apache+Sentry
01-04-2017
02:49 AM
I would go directly.
10-05-2016
02:34 AM
1 Kudo
Update: EXPLAIN on the problematic query fails:

explain insert into tmp.tab partition (day_id)
select t.fn, cast( t.ROW_COUNT as int ), cast(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') as int) as DAY_ID
from ( select count(*) as row_count, max( file_name ) as fn from tmp.tab group by file_name ) t;

Query: explain insert into tmp.tab partition (day_id)
select t.fn, cast( t.ROW_COUNT as int ), cast(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') as int) as DAY_ID
from ( select count(*) as row_count, max( file_name ) as fn from tmp.tab group by file_name ) t
ERROR: IllegalStateException: null

But EXPLAIN on the working query succeeds:

explain insert into tmp.tab partition (day_id)
select t.fn, cast( t.ROW_COUNT as int ), cast(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') as int) as DAY_ID
from ( select rc as ROW_COUNT, file_name as fn from tmp.tab ) t;

Query: explain insert into tmp.tab partition (day_id)
select t.fn, cast( t.ROW_COUNT as int ), cast(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') as int) as DAY_ID
from ( select rc as ROW_COUNT, file_name as fn from tmp.tab ) t
+--------------------------------------------------------------------------------------------------------------------------+
| Explain String |
+--------------------------------------------------------------------------------------------------------------------------+
| WARNING: The following tables are missing relevant table and/or column statistics. |
| tmp.tab |
| |
| WRITE TO HDFS [tmp.tab, OVERWRITE=false, PARTITION-KEYS=(CAST(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') AS INT))] |
| | partitions=1 |
| | |
| 00:SCAN HDFS [tmp.tab] |
| partitions=0/0 files=0 size=0B |
+--------------------------------------------------------------------------------------------------------------------------+
Fetched 8 row(s) in 0.03s

I noticed the warning about missing stats on the empty table, so I ran COMPUTE STATS on tmp.tab. This helped, so the problem is solved!

Query: explain insert into tmp.tab partition (day_id)
select t.fn, cast( t.ROW_COUNT as int ), cast(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') as int) as DAY_ID
from ( select count(*) as row_count, max( file_name ) as fn from tmp.tab group by file_name ) t
+--------------------------------------------------------------------------------------------------------------------------+
| Explain String |
+--------------------------------------------------------------------------------------------------------------------------+
| Estimated Per-Host Requirements: Memory=0B VCores=0 |
| |
| WRITE TO HDFS [tmp.tab, OVERWRITE=false, PARTITION-KEYS=(CAST(from_unixtime(unix_timestamp(now()), 'yyyyMMdd') AS INT))] |
| | partitions=1 |
| | |
| 01:AGGREGATE [FINALIZE] |
| | output: count(*), max(file_name) |
| | group by: file_name |
| | |
| 00:SCAN HDFS [tmp.tab] |
| partitions=0/0 files=0 size=0B |
+--------------------------------------------------------------------------------------------------------------------------+
Fetched 11 row(s) in 0.03s
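For completeness, a minimal sketch of the fix, using the same table as in the examples above (Impala SQL; the SHOW statements are just optional verification):

-- Compute table and column statistics so the planner no longer
-- fails with IllegalStateException on the empty table.
compute stats tmp.tab;

-- Optional: verify that row-count and per-column stats are now recorded.
show table stats tmp.tab;
show column stats tmp.tab;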
08-17-2016
01:13 AM
Yes, you are right: there has to be an explicit grant on that URI, not just HDFS access to the given directory. I don't understand why the documentation does not explain this more clearly. Thanks!
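For anyone else hitting this, a minimal sketch of such a URI grant in Sentry's SQL syntax (the role name and HDFS path below are made-up placeholders, not from the original thread):

-- Hypothetical role and path; adjust to your environment.
CREATE ROLE etl_role;
GRANT ROLE etl_role TO GROUP etl_users;

-- The explicit URI privilege, required in addition to plain HDFS access:
GRANT ALL ON URI 'hdfs://nameservice1/data/landing' TO ROLE etl_role;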
07-29-2016
12:09 PM
1 Kudo
Filed the JIRA: https://issues.cloudera.org/browse/IMPALA-3938. Thanks a lot for your detailed report and the easy reproduction!
06-13-2016
10:20 PM
But how would he fix it if he did need it? I am experiencing the same thing.
06-10-2016
05:44 AM
Actually, this has already been resolved: we changed the CREATE TABLE statement and added #b (hash b, meaning binary) to the column mapping.

create external table md_extract_file_status (
  table_key string,
  fl_counter bigint
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,colfam:FL_Counter#b')
TBLPROPERTIES ('hbase.table.name' = 'HBTABLE');
04-27-2016
07:56 AM
Hello, YARN makes three checks (see the source code):

- compare the user name with the string "root": strcmp(user, "root") == 0
- verify whether the user is whitelisted: !is_whitelisted(user)
- check the UID of the user against the minimum UID: user_info->pw_uid < min_uid

For now, the only workaround I found is to create a new user with UID and GID equal to 0, add that user name to the whitelist, and set the minimum user id to 0 (see the config sketch below). There is an important motivation for using root: if you need to use distcp for a backup whose target location is an NFS or other shared filesystem mounted locally on the datanode/worker node. In that case, if you run the job as a normal user, it is not possible to change the owner of the files, so the distcp backup will fail. Obviously, if you run it as root it will fail too, because of the hard-coded check. Kind Regards
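A sketch of the corresponding container-executor.cfg settings for that workaround (the user name rootalias is a made-up placeholder; allowed.system.users and min.user.id are standard container-executor keys):

# container-executor.cfg (sketch; adjust to your cluster)
# Let the UID-0 alias user pass the whitelist check:
allowed.system.users=rootalias
# Lower the minimum UID so the user_info->pw_uid < min_uid check passes:
min.user.id=0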
02-17-2016
04:12 AM
1 Kudo
Try this one:
http://www.cloudera.com/documentation/manager/5-1-x/Configuring-Hadoop-Security-with-Cloudera-Manager/cm5chs_enable_security_s8.html