Member since: 06-20-2016
251 Posts | 196 Kudos Received | 36 Solutions
04-27-2017
07:54 PM
1 Kudo
The HDFS Balancer program can be invoked to rebalance HDFS blocks when data nodes are added to or removed from the cluster. For more information about the HDFS Balancer, see this HCC article. Since Kerberos tickets are designed to expire, a common question that arises in secure clusters is whether one needs to account for ticket expiration (namely, expiration of the TGT) when invoking long-running Balancer jobs. To cut to the chase: the answer depends on just how long the job takes to run. Let's discuss some background context (I am referencing Chris Nauroth's excellent answer on Stack Overflow as well as HDFS-9698 below). The primary use case for Kerberos authentication in the Hadoop ecosystem is Hadoop's RPC framework, which uses SASL for authentication. Many daemon processes, i.e., non-interactive processes, call UserGroupInformation#loginUserFromKeytab at process startup, using a keytab to authenticate to the KDC. Moreover, Hadoop implements an automatic re-login mechanism directly inside the RPC client layer. The code for this is visible in the RPC Client#handleSaslConnectionFailure method:
// try re-login
if (UserGroupInformation.isLoginKeytabBased()) {
UserGroupInformation.getLoginUser().reloginFromKeytab();
} else if (UserGroupInformation.isLoginTicketBased()) {
UserGroupInformation.getLoginUser().reloginFromTicketCache();
}
However, the Balancer is not designed to be run as a daemon (in Hadoop 2.7.3, i.e., HDP 2.6 and earlier)! Please see HDFS-9804, which introduces this capability. With that in place, the Balancer would log in with a keytab and the above re-login mechanism would take care of everything. Since the Balancer is designed to be run interactively, the assumption is that kinit has already run and there is a TGT sitting in the ticket cache. Now we need to understand some Kerberos configuration settings, in particular the distinction between ticket_lifetime and renew_lifetime. Every ticket, including the TGT, has a ticket_lifetime (usually around 18 hours); this strikes a balance between annoying the user by requiring multiple logins during the workday and mitigating the risk of stolen TGTs (note there is separate support for preventing replay of authenticators). A ticket can be renewed to extend its lifetime, but only up to its renew_lifetime (usually around 7 days).
Since a TGT is generated by the user and provided to the Balancer (which means that in the Balancer context, UserGroupInformation.isLoginTicketBased() == true), Client#handleSaslConnectionFailure behaves correctly and renews the TGT when its ticket_lifetime runs out. But there is no way to extend beyond the renew_lifetime! To summarize: if your Balancer job is going to run longer than the configured renew_lifetime in your environment (a week by default), then you need to worry about ticket renewal (or you need HDFS-9804). Otherwise, you can rely on the RPC framework to renew the TGT within its ticket_lifetime (as long as the job doesn't outlast the renew_lifetime).
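As a practical illustration (the principal name, lifetimes, and threshold below are placeholders, and your KDC policy may cap the requested values), you can request an explicitly renewable TGT before kicking off the Balancer and verify its lifetimes with klist:
kinit -l 24h -r 7d hdfs@EXAMPLE.COM
klist   # check the "renew until" timestamp before starting a long run
hdfs balancer -threshold 10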
04-27-2017
06:07 PM
1 Kudo
This article assumes you have already identified the GUID associated with your Atlas entity and that the tag you wish to associate with this entity already exists. For more information on how to identify the GUID for your entity, please see this HCC article by Mark Johnson. For example, if we wanted to add a new tag to all Hive tables containing the word "claim", we could use the Full Text Search capability to identify all such entities (replace admin:admin with the username:password values for an Atlas administrator):
curl -u admin:admin http://$ATLAS_SERVER:21000/api/atlas/discovery/search/fulltext?query=claim
Now we are ready to assign a new tag, called "PII", to all such entities. We are using the v1 API for Atlas in HDP 2.5. In HDP 2.6, the Atlas API has been revamped and simplified; please see this HCC article for more details. Let's construct an example using the first GUID, f7a24ec6-5b0c-42d8-ba8a-1ac654d24f45. We will use the traits resource associated with this entity and POST our payload to this endpoint.
curl -u admin:admin http://$ATLAS_SERVER:21000/api/atlas/entities/f7a24ec6-5b0c-42d8-ba8a-1ac654d24f45/traits -X POST -H 'Content-Type: application/json' --data-binary '{"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct","typeName":"PII","values":{}}'
We can now query the traits of this entity using a GET request:
curl -u admin:admin http://$ATLAS_SERVER:21000/api/atlas/entities/f7a24ec6-5b0c-42d8-ba8a-1ac654d24f45/traits
We can also see the tag now exists in the Atlas UI.
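If you need to tag many entities at once, a minimal sketch (assuming bash, with the GUIDs below standing in for the ones you collected from the full-text search output) is to loop over them and repeat the POST:
for guid in f7a24ec6-5b0c-42d8-ba8a-1ac654d24f45 <another-guid>; do
  curl -u admin:admin -X POST -H 'Content-Type: application/json' \
    --data-binary '{"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct","typeName":"PII","values":{}}' \
    http://$ATLAS_SERVER:21000/api/atlas/entities/$guid/traits
done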
04-27-2017
05:04 PM
3 Kudos
Q1) What is the difference between using the programs kadmin.local and kadmin, respectively? A1) The difference between kadmin and kadmin.local is that kadmin.local directly accesses the KDC database (a file called principal in /var/kerberos/krb5kdc) and does not use Kerberos for authentication. Since kadmin.local directly accesses the KDC database, it must be run directly on the master KDC as a user with sufficient permissions to read the KDC database. When using kadmin to administer the KDC database, the user is communicating with the kadmind daemon over the network and will authenticate using Kerberos to the KDC master. Hence, the first principal must already exist before connecting over the network—this is the motivation for the existence of kadmin.local. This also means the KDC administrator will need to kinit as the administrator principal (by default, admin/admin@REALM) to run kadmin from another box. Q2) How can we restrict which users can administer the MIT-KDC service? A2) There is an ACL, /var/kerberos/krb5kdc/kadm5.acl, which authorizes access to the KDC database. By default, there is a line:
*/admin@REALM *
This provides authorization for all operations to all Kerberos principals that have an instance matching that form, like admin/admin@REALM. Q3) What local (POSIX) permissions are needed by MIT-KDC administrators? A3) It's important to make a distinction between the user's network account and the associated Kerberos principal. The network account needs permissions to read the KDC database when running kadmin.local, per the above. If this user has sudo access on this box, that is sufficient (the KDC database is usually only readable by root). Tangentially, this is a good motivation to run the KDC on a different box than one of the cluster hosts, for separation of concerns between cluster and KDC administrators.
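As a concrete sketch of the bootstrap flow (EXAMPLE.COM and the principal names are placeholders for your realm), the first admin principal is created locally on the KDC, after which administration can happen remotely through kadmind:
# on the KDC host, as a user who can read the KDC database
kadmin.local -q "addprinc admin/admin@EXAMPLE.COM"
# from any host that can reach kadmind
kinit admin/admin@EXAMPLE.COM
kadmin -p admin/admin@EXAMPLE.COM -q "listprincs"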
02-18-2017
05:20 PM
2 Kudos
Imagine we have a use case of exchanging data with an external party via AWS S3 buckets: we want to push these messages into our internal Kafka cluster, after enriching each message with additional metadata, in an event-driven fashion. In AWS, this is supported by associating notifications with an S3 bucket. These notifications can be delivered to a few different destinations, namely SQS, SNS, and Lambda. We'll focus on the SQS approach and will make use of NiFi's GetSQS processor. To configure this in AWS, navigate to the S3 bucket, then to the Properties tab, and scroll down to Advanced settings > Events. You'll need to create an SQS queue for this purpose. With this configured, a new SQS message will appear any time an object is created within our S3 bucket. We need to configure some IAM policies in order for our NiFi data flow to be authorized to read from the S3 bucket and to read from the SQS queue. We will authenticate from NiFi using the Access Key and Secret Key associated with a particular IAM user. First, the IAM policy for reading from the S3 bucket called nifi:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:ListBucket",
"s3:ListBucketMultipartUploads"
],
"Resource": [
"arn:aws:s3:::nifi"
]
},
{
"Effect": "Allow",
"Action": [
"s3:GetObject"
],
"Resource": [
"arn:aws:s3:::nifi/*"
]
}
]
}
Second, the IAM policy for the SQS queue, which we need to read from as well as delete from, since we'll configure GetSQS to auto-delete received messages. We'll need the ARN and URL associated with our SQS queue; these can be retrieved from the SQS Management Console by navigating to the queue we created above. Note: we could harden this by restricting the permitted SQS actions further.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "sqs:*",
"Resource": "arn:aws:sqs:$REGION:$UUID:$QUEUE_NAME"
}
]
}
You will then need to attach these policies to your IAM user via the IAM Management Console. We are now ready to build the NiFi data flow. For the GetSQS processor, just use the SQS queue URL that we retrieved from the SQS Management Console above, the Region, and the Access Key and Secret Key associated with the IAM user. We'll use SplitJSON to extract the name of the file associated with the SQS notification (a sample notification payload is shown at the end of this article); we'll need this to fetch the object from S3. ExtractText is used to associate the result of the JsonPath expression with a new custom attribute, $filename, which we'll pass into the FetchS3Object processor. Finally, we can enrich the message with UpdateAttribute using Advanced > Rules and push to our Kafka topic using the PublishKafka processor. I'd like to credit this blog post: https://adamlamar.github.io/2016-01-30-monitoring-an-s3-bucket-in-apache-nifi/
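For reference, the body of an SQS message for an S3 ObjectCreated event looks roughly like the following (trimmed to the fields of interest; the bucket name and object key are just examples):
{
  "Records": [
    {
      "eventSource": "aws:s3",
      "eventName": "ObjectCreated:Put",
      "s3": {
        "bucket": { "name": "nifi" },
        "object": { "key": "incoming/claims-2017-02-18.csv", "size": 1024 }
      }
    }
  ]
}
Splitting on $.Records yields one flowfile per record, and the object key (here incoming/claims-2017-02-18.csv) is the value we capture into the custom attribute and hand to FetchS3Object.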
02-03-2017
06:26 PM
Hi @Michał Kabocik, please post this as a separate question. Group Roles Map is used to map AD groups to Zeppelin roles. The Search Base can be used to restrict which users can authenticate.
01-31-2017
11:19 PM
2 Kudos
In Zeppelin LDAP Authentication with OpenLDAP and How to Set Up OpenLDAP we've shown how to use LDAP authentication with Zeppelin. In this article, we'll harden that configuration by ensuring that Zeppelin and OpenLDAP communicate over LDAPS. LDAPS is a secure protocol that uses TLS to assure authenticity, confidentiality, and integrity of communications. This prevents man-in-the-middle attacks that sniff traffic to discover LDAP credentials communicated in plaintext, which could compromise the security of the cluster. The first step is to modify the configuration of the OpenLDAP server, as root, to expose LDAPS connectivity; we'll need to modify /etc/openldap/ldap.conf. Recall that we created /etc/openldap/certs/myldap.field.hortonworks.com.cert in the How to Set Up OpenLDAP article.
#TLS_CACERTDIR /etc/openldap/certs
TLS_CACERT /etc/openldap/certs/myldap.field.hortonworks.com.cert
URI ldaps://myldap.field.hortonworks.com ldap://myldap.field.hortonworks.com
BASE dc=field,dc=hortonworks,dc=com
We also need to modify /etc/sysconfig/slapd:
SLAPD_URLS="ldapi:/// ldap:/// ldaps:///"
Then restart slapd:
systemctl restart slapd
You can confirm that slapd is listening on port 636:
netstat -anp | grep 636
Finally, confirm TLS connectivity and a secure ldapsearch (with the appropriate bind user and password from the previous articles):
# should succeed
openssl s_client -connect myldap.field.hortonworks.com:636 </dev/null
# should succeed
ldapsearch -H ldaps://myldap.field.hortonworks.com:636 -D cn=ldapadm,dc=field,dc=hortonworks,dc=com -w $password -b "ou=People,dc=field,dc=hortonworks,dc=com"
The next step is the client-side configuration. Since we are using a self-signed certificate for the OpenLDAP server, we need to import it into the Java truststore, called cacerts, which is in /etc/pki/ca-trust/extracted/java on my CentOS 7 system. Copy the myldap.field.hortonworks.com.cert file from the OpenLDAP server to the Zeppelin server (this file does not contain sensitive key material, only public keys), and run the following, making sure you mark the certificate as trusted when prompted:
keytool -import -alias myldap -file /etc/security/certificates/myldap.field.hortonworks.com.cert -keystore cacerts
Otherwise, you will see errors like:
Root exception is javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed
Lastly, in Ambari, we just need to make one small change to the shiro.ini configuration in Zeppelin > Config > Advanced zeppelin-env > shiro_ini_content:
ldapRealm.contextFactory.url = ldaps://myldap.field.hortonworks.com:636
Note the protocol change to LDAPS and the port number change to 636. To test, restart the Zeppelin service and confirm that users can still log in to the Zeppelin UI.
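One more sanity check: you can confirm the certificate actually landed in the Java truststore by listing the alias we imported above (changeit is the stock cacerts password and is an assumption about your environment):
keytool -list -alias myldap -keystore /etc/pki/ca-trust/extracted/java/cacerts -storepass changeit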
01-25-2017
07:35 PM
4 Kudos
OpenLDAP is an open-source implementation of the Lightweight Directory Access Protocol. It is used for central management of accounts (users, hosts, and services) and can be used in concert with a KDC to provide authentication within the Hadoop ecosystem. Fundamentally, LDAP functions like a database in many ways and can be used to store any information. We will assume that you have a fresh CentOS 7 host available that will host OpenLDAP. Make sure you have network connectivity between any clients and this server and that DNS resolution is working. Let's ssh to the host and install, as root, the packages we need with yum:
yum -y install openldap compat-openldap openldap-clients openldap-servers openldap-servers-sql openldap-devel
We'll also start the LDAP daemon (called slapd) and enable it to auto-start on system boot:
systemctl start slapd.service
systemctl enable slapd.service
Next, run the slappasswd command to create an LDAP root password. Take note of the entire hashed value that is returned as output and starts with {SSHA}, as you'll use it throughout this article.
We'll now configure the OpenLDAP server in a couple of steps. We'll create LDIF text files and then use the ldapmodify command to push the configuration to the server. These will ultimately land in /etc/openldap/slapd.d, but those files shouldn't be edited manually. The first file will update the values of olcSuffix, the domain name for which your LDAP server provides account information, and olcRootDN, the root distinguished name (DN) of the user who has unrestricted administrative access. My domain is field.hortonworks.com, or dc=field,dc=hortonworks,dc=com, and my root DN is cn=ldapadm,dc=field,dc=hortonworks,dc=com. Create the following db.ldif file using vi or your favorite editor.
dn: olcDatabase={2}hdb,cn=config
changetype: modify
replace: olcSuffix
olcSuffix: dc=field,dc=hortonworks,dc=com

dn: olcDatabase={2}hdb,cn=config
changetype: modify
replace: olcRootDN
olcRootDN: cn=ldapadm,dc=field,dc=hortonworks,dc=com

dn: olcDatabase={2}hdb,cn=config
changetype: modify
replace: olcRootPW
olcRootPW: {SSHA}theHashedPasswordValueFromSlapPasswd
We'll then push this config:
ldapmodify -Y EXTERNAL -H ldapi:/// -f db.ldif
We'll next restrict monitor access to the ldapadm user. Create monitor.ldif with the following content:
dn: olcDatabase={1}monitor,cn=config
changetype: modify
replace: olcAccess
olcAccess: {0}to * by dn.base="gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth" read by dn.base="cn=ldapadm,dc=field,dc=hortonworks,dc=com" read by * none
And push that config change:
ldapmodify -Y EXTERNAL -H ldapi:/// -f monitor.ldif
In order to communicate securely with the OpenLDAP server, we'll need a certificate and an associated private key. These would likely be obtained from our PKI administrator in a production environment, but a self-signed certificate and private key can be created in development environments with a command like the one below:
openssl req -new -x509 -nodes -out /etc/openldap/certs/myldap.field.hortonworks.com.cert -keyout /etc/openldap/certs/myldap.field.hortonworks.com.key -days 365
Set the owner and group to ldap:ldap for both files. We'll then create certs.ldif to configure OpenLDAP for secure communication over LDAPS:
dn: cn=config
changetype: modify
replace: olcTLSCertificateFile
olcTLSCertificateFile: /etc/openldap/certs/myldap.field.hortonworks.com.cert

dn: cn=config
changetype: modify
replace: olcTLSCertificateKeyFile
olcTLSCertificateKeyFile: /etc/openldap/certs/myldap.field.hortonworks.com.key
We can then push the config file and finally test the configuration:
ldapmodify -Y EXTERNAL -H ldapi:/// -f certs.ldif
slaptest -u
We're now ready to set up the initial LDAP database. First, copy the sample database configuration file to /var/lib/ldap and update the file permissions:
cp /usr/share/openldap-servers/DB_CONFIG.example /var/lib/ldap/DB_CONFIG
chown ldap:ldap /var/lib/ldap/*
Next, add the cosine, nis, and inetorgperson LDAP schemas:
ldapadd -Y EXTERNAL -H ldapi:/// -f /etc/openldap/schema/cosine.ldif
ldapadd -Y EXTERNAL -H ldapi:/// -f /etc/openldap/schema/nis.ldif
ldapadd -Y EXTERNAL -H ldapi:/// -f /etc/openldap/schema/inetorgperson.ldif
Finally, create a base.ldif file for your domain:
dn: dc=field,dc=hortonworks,dc=com
dc: field
objectClass: top
objectClass: domain

dn: cn=ldapadm,dc=field,dc=hortonworks,dc=com
objectClass: organizationalRole
cn: ldapadm
description: LDAP Manager

dn: ou=People,dc=field,dc=hortonworks,dc=com
objectClass: organizationalUnit
ou: People

dn: ou=Group,dc=field,dc=hortonworks,dc=com
objectClass: organizationalUnit
ou: Group
We'll now push these changes to OpenLDAP using the ldapadm user (sometimes referred to as the bind user):
ldapadd -x -W -D "cn=ldapadm,dc=field,dc=hortonworks,dc=com" -f base.ldif
You'll be prompted for the root password. From here, I prefer to use a GUI to create additional users. Apache Directory Studio is a nice multi-platform tool and can be downloaded here. Within Apache Directory Studio, create a new connection in the lower left-hand pane, supply the connection and authentication details for your server and bind user, and once you connect successfully you can create your organizational structure and users accordingly. These steps are based on the valuable tutorial provided here.
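If you'd rather stay on the command line, here is a minimal sketch of a user entry (the uid jdoe, the numeric IDs, and the home directory are made up for illustration); save it as user.ldif and push it with ldapadd exactly as we did for base.ldif:
dn: uid=jdoe,ou=People,dc=field,dc=hortonworks,dc=com
objectClass: top
objectClass: inetOrgPerson
objectClass: posixAccount
cn: John Doe
sn: Doe
uid: jdoe
uidNumber: 10001
gidNumber: 10001
homeDirectory: /home/jdoe

ldapadd -x -W -D "cn=ldapadm,dc=field,dc=hortonworks,dc=com" -f user.ldif
A password can then be set for the user with the ldappasswd utility or from Apache Directory Studio.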
01-25-2017
06:50 PM
4 Kudos
Apache Ranger uses an embedded Tomcat server to provide the Web UI functionality for administration of Ranger. A previous HCC article provided details on maintenance of the log files that are managed by the log4j configuration, including xa_portal.log, ranger_admin_perf.log, and xa_portal_sql.log.
We're going to focus on maintenance of the access_log* logs that get automatically generated by Tomcat, but which are not managed by this log4j configuration. With embedded Tomcat, the configuration is contained within the code for the AccessLogValve (as you can see, it uses an hourly rotation pattern unless overridden by ranger.accesslog.dateformat).
We'll use the logrotate application in CentOS/RHEL to manage these access_log* logs, as the number of files can grow large without rotation and removal in place. You can check how many of these files you have on your Ranger Admin node by running the following (there will be one access_log* file per hour for each day during which the service has run continuously):
ls /var/log/ranger/admin | cut -d '.' -f 1 | uniq -c
Within /etc/logrotate.d, we'll create a configuration specific to these Ranger logs, as the main logrotate configuration (/etc/logrotate.conf by default) includes these application-specific configurations.
Create a new file (as root) ranger_access in /etc/logrotate.d in your favorite editor and then insert:
/var/log/ranger/admin/access_log* {
daily
copytruncate
compress
dateext
rotate 5
maxage 7
olddir /var/log/ranger/admin/old
missingok
}
This is just an example logrotate configuration. I'll make note of a couple of items; please see the man page for details on each of these options and some additional examples.
The copytruncate option ensures that Tomcat can keep writing to the same file handle (as opposed to writing to a newly created file, which would require recycling Tomcat)
The compress option will use gzip by default
The maxage option limits how long rotated files are kept before being removed
The olddir option moves rotated logs into the specified directory
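Note that logrotate will not create the olddir target for you, so, assuming the paths in the example above, create it (as root) before the first rotation:
mkdir -p /var/log/ranger/admin/old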
Logrotate will be invoked daily as a cronjob by default, due to the existence of the logrotate file in /etc/cron.daily. You can run logrotate manually by specifying the configuration:
sudo /usr/sbin/logrotate /etc/logrotate.conf
Note that logrotate keeps the state of files in /var/lib/logrotate.status and uses the date of last execution captured there to decide what to do with a logfile. You can also run logrotate with the -d flag to test your configuration (this won't actually rotate anything; it will just produce output describing what would happen).
sudo /usr/sbin/logrotate -d /etc/logrotate.conf 2> /tmp/logrotate.debug
As a result of this configuration, only 5 days' worth of logs are kept, they're moved into the /var/log/ranger/admin/old directory, and they're compressed. This ensures that the Ranger Admin access_log* data does not grow unmanageably large.
01-16-2017
10:22 PM
Thanks @Arpan Rajani, appreciate the feedback and additional info. Yes, ownership of the file is important (there is a chown step in the instructions above).
01-08-2017
07:12 PM
5 Kudos
In order to secure access to the Zeppelin UI, we will want to enable TLS (as well as authentication) to ensure confidentiality of communication and to assure the identity of the Zeppelin server. Zeppelin uses Jetty as the underlying HTTP server, so we'll refer to the Jetty documentation. In this how-to we'll use a self-signed certificate. In production environments, you will likely obtain a CA-issued certificate or a trusted root certificate from your PKI team specific to your environment. Since self-signed certificates won't be trusted by your browser by default, we'll show how to trust this certificate on OS X 10.11.6 with Chrome version 55.0.2883.95 (other OS/browser combinations are out of the scope of this article). To generate the self-signed certificate, we'll use the openssl and keytool utilities as follows (see this Jetty doc for reference):
openssl genrsa -des3 -out zeppelin.key
openssl req -new -x509 -key zeppelin.key -out zeppelin.crt
keytool -keystore keystore -import -alias zeppelin -file zeppelin.crt -trustcacerts
openssl pkcs12 -inkey zeppelin.key -in zeppelin.crt -export -out zeppelin.pkcs12
keytool -importkeystore -srckeystore zeppelin.pkcs12 -srcstoretype PKCS12 -destkeystore keystore
These steps, respectively: 1) create a new private key, 2) create a new self-signed certificate using this key, 3) import this self-signed certificate into a new keystore (called "keystore"), 4) create a PKCS12 file that combines the private key and certificate chain, and 5) convert this PKCS12 file to JKS format and import it into the keystore. We'll then need to move this keystore to the appropriate location with the appropriate ownership and permissions:
mv keystore /usr/hdp/current/zeppelin-server/conf
chown zeppelin:zeppelin /usr/hdp/current/zeppelin-server/conf/keystore
Finally, we'll configure Zeppelin to use TLS in Ambari (the relevant properties are sketched at the end of this article). There is currently a bug affecting HDP 2.5.0 and 2.5.3 regarding using relative paths for the keystore and truststore. This bug was introduced by ZEPPELIN-1319: when using a relative path like conf/keystore, the Zeppelin server is unreachable and the error in the logs is as below. ZEPPELIN-1810 fixes the bug introduced by ZEPPELIN-1319. The error looks like:
FAILED SslContextFactory@6cd166b8(/usr/hdp/current/zeppelin-server/conf/null,/usr/hdp/current/zeppelin-server/conf/null): java.io.FileNotFoundException: /etc/zeppelin/2.5.0.0-1245/0/null (No such file or directory)
However, with absolute paths for the keystore and truststore, such as /usr/hdp/current/zeppelin-server/conf/keystore, the Zeppelin server starts normally and is reachable over HTTPS. Now we need to ensure that our Chrome browser trusts this self-signed certificate. Copy the certificate to your Desktop (click the broken HTTPS link > Details > View Certificate, then drag and drop to the desktop). We can then import the certificate into the OS X keychain and set it as trusted. Make sure you restart Chrome. After doing so, you should see the green lock icon next to the HTTPS URL and should no longer see a browser warning.
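For reference, here is a minimal sketch of the TLS-related settings. In Ambari these live under Zeppelin > Configs, and the property names below reflect zeppelin-site.xml as shipped with Zeppelin 0.6/0.7; verify them against your version, and note that the port and password values are placeholders:
zeppelin.ssl = true
zeppelin.server.ssl.port = 9995
zeppelin.ssl.keystore.path = /usr/hdp/current/zeppelin-server/conf/keystore
zeppelin.ssl.keystore.type = JKS
zeppelin.ssl.keystore.password = yourKeystorePassword
zeppelin.ssl.key.manager.password = yourKeyPassword
zeppelin.ssl.truststore.path = /usr/hdp/current/zeppelin-server/conf/keystore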