Member since: 09-18-2015
Posts: 3274
Kudos Received: 1159
Solutions: 426

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2625 | 11-01-2016 05:43 PM |
| | 8763 | 11-01-2016 05:36 PM |
| | 4940 | 07-01-2016 03:20 PM |
| | 8271 | 05-25-2016 11:36 AM |
| | 4437 | 05-24-2016 05:27 PM |
03-01-2016
12:20 AM
3 Kudos
@rbalam You can move a file out of a TDE encryption zone to another location as long as you have the keys/access needed to read the file. https://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html#Accessing_data_within_an_encryption_zone

When creating a new file in an encryption zone, the NameNode asks the KMS to generate a new EDEK encrypted with the encryption zone's key. The EDEK is then stored persistently as part of the file's metadata on the NameNode.

When reading a file within an encryption zone, the NameNode provides the client with the file's EDEK and the encryption zone key version used to encrypt the EDEK. The client then asks the KMS to decrypt the EDEK, which involves checking that the client has permission to access the encryption zone key version. Assuming that is successful, the client uses the DEK to decrypt the file's contents.

All of the above steps for the read and write path happen automatically through interactions between the DFSClient, the NameNode, and the KMS. Access to encrypted file data and metadata is controlled by normal HDFS filesystem permissions. This means that if HDFS is compromised (for example, by gaining unauthorized access to an HDFS superuser account), a malicious user only gains access to ciphertext and encrypted keys. However, since access to encryption zone keys is controlled by a separate set of permissions on the KMS and key store, this does not pose a security threat.
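To make the "as long as you have keys/access" part concrete, here is a minimal sketch using the hdfs CLI; the key name, zone path, and destination path are made up for the example:

```bash
# Admin side: create a key in the KMS and an encryption zone backed by it
hadoop key create myzonekey
hdfs dfs -mkdir /secure
hdfs crypto -createZone -keyName myzonekey -path /secure

# Client side: a user authorized for the zone key reads files transparently,
# so copying a file out of the zone writes the decrypted contents to the destination
hdfs dfs -cp /secure/data.csv /landing/data.csv

# Confirm which paths are encryption zones
hdfs crypto -listZones
```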
02-29-2016
04:59 PM
1 Kudo
@wsalazar You can increase the size and, to be on the safe side, run the HDFS balancer afterwards. https://wiki.apache.org/hadoop/FAQ
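If you do rebalance after adding capacity, a minimal example of running the balancer (the 10 percent threshold is just an illustrative value):

```bash
# Move blocks until each DataNode's utilization is within 10% of the cluster average
hdfs balancer -threshold 10
```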
02-29-2016
01:00 PM
@Aniruddha Joshi There is a chance that you have configured a static IP at the VM level and it is getting the 101 address from there. You can download the OVA and try with that instead.
02-29-2016
12:58 PM
@Aniruddha Joshi The setup does not look right. root/hadoop are the login credentials. See this demo. Shut down all the VMs on your laptop, delete the Hortonworks VM, and re-import the image.
02-29-2016
12:54 PM
@Krishna Srinivas There is no official date. You can keep checking www.hortonworks.com. It may be in a couple of months or so.
02-29-2016
12:29 PM
@Kibrom Gebrehiwot Thank you so much for the final update. I have converted your comment to an answer and accepted it as best answer
02-29-2016
12:12 PM
1 Kudo
@Saurabh Kumar In order to distcp between two HDFS HA clusters (for example, A and B), modify the following in hdfs-site.xml on both clusters:
For example, say the nameservices for clusters A and B are HAA and HAB respectively.
- Add both nameservices to dfs.nameservices on both clusters: dfs.nameservices = HAA,HAB
- Add property dfs.internal.nameservices
In cluster A:
dfs.internal.nameservices = HAA
In cluster B:
dfs.internal.nameservices = HAB
- Add dfs.ha.namenodes.<nameservice>
In cluster A
dfs.ha.namenodes.HAB = nn1,nn2
In cluster B
dfs.ha.namenodes.HAA = nn1,nn2
- Add property dfs.namenode.rpc-address.<cluster>.<nn>
In cluster A
dfs.namenode.rpc-address.HAB.nn1 = <NN1_fqdn>:8020
dfs.namenode.rpc-address.HAB.nn2 = <NN2_fqdn>:8020
In cluster B
dfs.namenode.rpc-address.HAA.nn1 = <NN1_fqdn>:8020
dfs.namenode.rpc-address.HAA.nn2 = <NN2_fqdn>:8020
- Add property dfs.client.failover.proxy.provider.<cluster - i.e HAA or HAB>
In cluster A
dfs.client.failover.proxy.provider.HAB = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
In cluster B
dfs.client.failover.proxy.provider.HAA = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
- Restart HDFS service.
Once complete, you will be able to run the distcp command using the nameservices, similar to:
hadoop distcp hdfs://HAA/tmp/testDistcp hdfs://HAB/tmp/
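As a quick sanity check with the illustrative HAA/HAB nameservices above, you can first confirm that cluster A resolves cluster B's nameservice and then run distcp; the -update and -p flags are optional additions here, not part of the original command:

```bash
# From a node in cluster A: list a directory on cluster B via its nameservice
hdfs dfs -ls hdfs://HAB/tmp

# Copy from A to B; -update skips files already present with the same size/checksum,
# -p preserves attributes such as replication, permissions, and ownership
hadoop distcp -update -p hdfs://HAA/tmp/testDistcp hdfs://HAB/tmp/
```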
02-29-2016
12:01 PM
1 Kudo
@Saurabh Kumar See this https://hortonworks.jira.com/browse/BUG-22998 and https://issues.apache.org/jira/browse/HDFS-6376
02-29-2016
11:02 AM
@nejm hadj Adding more information based on your comments: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.flume.ExecuteFlumeSink/additionalDetails.html

You should stick with NiFi and use its built-in processors to ingest the data from the various social media sources. Please do read the docs.

In NiFi the contents of a FlowFile are accessed via a stream, but in Flume they are stored in a byte array. This means the full content will be loaded into memory when a FlowFile is processed by the ExecuteFlumeSink processor. You should consider the typical size of the FlowFiles you'll process, and the batch size (if any) your sink is configured with, when setting NiFi's heap size.
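Since heap sizing is the practical takeaway here, this is roughly where it is set: in a default NiFi install the JVM heap arguments live in conf/bootstrap.conf, and the 4g value below is only an illustrative choice to be sized against your typical FlowFile and batch sizes:

```properties
# conf/bootstrap.conf - JVM memory settings for NiFi
java.arg.2=-Xms4g
java.arg.3=-Xmx4g
```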
02-29-2016
10:55 AM
@henryon wen Which Ambari version are you on? See https://cwiki.apache.org/confluence/display/RANGER/Configure+Ranger+UserSync+for+LDAP. This is also handy: https://cwiki.apache.org/confluence/display/RANGER/LDAP+Connection+Check+Tool