09-10-2016
02:03 PM
Vagrant provides a VM that is run by the provider of your choice, for instance, VirtualBox or VMware. The network configuration of your VM determines whether you can connect to the outside network. In your case, you would typically use one of two configurations:

In a bridged network configuration, the VM has full access to the outside network and can see any machine out there. It also means that your VM is visible as its very own network device from the outside. While this is very convenient, it may be a security issue, and corporate networks may ban you from adding non-approved network devices.

In a NAT configuration, traffic is routed through the host machine. In short, this means the VM can see the outside network, but the outside network cannot see the VM. You can, however, expose some of the VM's services using port forwarding.

If you want to "bake" your data sets into your Vagrant boxes, this can all be scripted. To always get the most recent version of the data set, you might want to create a Vagrant box, based on a plain sandbox, that goes out to the production system and fetches its data when it is spun up for the first time. Because the Vagrant box acts as a client using standard APIs, generally speaking I believe you would not have to change your production systems. To give you a precise answer, though, I would need to know your case in more detail.
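To make the two options concrete, here is a minimal Vagrantfile sketch; the box name, the bridge device, the forwarded port, and the fetch-data.sh script are all assumptions for illustration, not part of your setup:

```
# A minimal sketch: writes a Vagrantfile showing both networking modes.
# "hdp-sandbox", "eth0", port 8080, and fetch-data.sh are hypothetical.
cat > Vagrantfile <<'EOF'
Vagrant.configure("2") do |config|
  config.vm.box = "hdp-sandbox"   # hypothetical base box

  # Option 1: bridged ("public") network -- the VM appears as its own device
  # config.vm.network "public_network", bridge: "eth0"

  # Option 2: NAT (the VirtualBox default) with selected ports forwarded,
  # e.g. exposing a web UI to the host
  config.vm.network "forwarded_port", guest: 8080, host: 8080

  # Fetch data from the production system during provisioning
  config.vm.provision "shell", inline: "/vagrant/fetch-data.sh"
end
EOF
```

Shell provisioners only run on the first `vagrant up` by default, which is what makes the fetch-on-first-boot approach work.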
09-09-2016
09:20 AM
1 Kudo
1) What could be the root cause? I think it's just the wrong ldapsearch filter; it should be:

ldapsearch -h unix-ldap.company.com -p 389 -x -b "dc=company,dc=se" "(&(cn=devdatalakeadm)(memberUid=ojoqcu))"

cn=devdatalakeadm,ou=Group,dc=company,dc=se is actually the full DN, and you cannot search on it, as it's not an attribute.

2) Your problem is still the userDnTemplate; that's why you're still getting the LDAP authentication exception:

ldapRealm.userDnTemplate = uid={0},cn=devdatalakeadm,ou=Group,dc=company,dc=se

Why are you trying to search for the user inside the cn=devdatalakeadm subtree? That's not how users and groups are represented in LDAP (unless you did something very specific). Users and groups are normally kept in separate trees, and membership, in your case, is decided only by the memberUid attribute. But if memberUid is ojoqcu, it doesn't mean uid=ojoqcu,cn=devdatalakeadm,ou=Group,dc=company,dc=se actually exists; the ojoqcu user could be in a separate tree/OU, like uid=ojoqcu,ou=User,dc=company,dc=se. To further help you find the correct userDnTemplate, I'd need an ldapsearch output for a user, just like the one you showed for groups.
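In the meantime, a sketch of the lookup I mean (the ou=User location is an assumption about your directory layout, not something I can see from here):

```
# Look up the user entry itself to learn its real DN;
# search the whole base since we don't know which ou the user sits in
ldapsearch -h unix-ldap.company.com -p 389 -x \
  -b "dc=company,dc=se" "(uid=ojoqcu)" dn

# If the DN comes back as uid=ojoqcu,ou=User,dc=company,dc=se,
# the matching template would be:
# ldapRealm.userDnTemplate = uid={0},ou=User,dc=company,dc=se
```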
09-06-2016
05:14 AM
without compression: [numFiles=8, numRows=6547431, totalSize=66551787, rawDataSize=3154024078]
with zlib: [numFiles=8, numRows=6547431, totalSize=44046849, rawDataSize=3154024078]

As you can see, the totalSize is smaller with zlib.
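For anyone who wants to reproduce this kind of comparison, a sketch (the table names are placeholders, not the ones from this thread):

```
# Create a zlib-compressed ORC copy of a table and inspect its stats;
# "my_table" and "my_table_zlib" are hypothetical names
hive -e "
  CREATE TABLE my_table_zlib STORED AS ORC
    TBLPROPERTIES ('orc.compress'='ZLIB')
  AS SELECT * FROM my_table;

  -- numFiles, totalSize, and rawDataSize appear under Table Parameters
  DESCRIBE FORMATTED my_table_zlib;
"
```

If the size figures don't show up, running ANALYZE TABLE my_table_zlib COMPUTE STATISTICS first should populate them.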
08-18-2016
11:39 AM
I have either discovered something strange or I lack an understanding of how Sqoop works:

- The Sqoop documentation says that in case of a composite PK, the --split-by column should be specified during sqoop import; however, I proceeded without doing so, and Sqoop then picked one int column belonging to the PK.
- Only in the case of a few tables (all of them having at least 1.2 billion rows) did I face this mismatch issue.
- I then used --split-by for those tables and also added --validate, and then I got the same no. of rows imported (the shape of the command is sketched below).
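For reference, the shape of the command (the connection string, table, and column names are placeholders, not the real ones):

```
# --split-by names one integer column of the composite PK;
# --validate compares source and target row counts after the import
sqoop import \
  --connect 'jdbc:sqlserver://<IP>;database=MyDb' \
  --username myuser -P \
  --table BigTable \
  --split-by RecordId \
  --validate
```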
08-12-2016
09:05 AM
1 Kudo
Well, I'm unsure whether it was an authorization issue, a mere parsing problem, or both. I did the following and it worked:

1. Did an 'su hive'.
2. Executed the following command (probably the -- --schema should be the last argument; Sqoop simply ignores/breaks after that!):

sqoop import --hcatalog-home /usr/hdp/current/hive-webhcat --hcatalog-database FleetManagement_Ape --hcatalog-table DatabaseLog --create-hcatalog-table --hcatalog-storage-stanza "stored as orcfile" --connect 'jdbc:sqlserver://<IP>;database=FleetManagement' --username --password --table DatabaseLog -- --schema ape
08-09-2016
04:10 PM
You can always remove the files in .Trash as you would any other directory/file:

hdfs dfs -rm -r -skipTrash /user/hdfs/.Trash/*
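If you only want to clear out old trash rather than delete everything at once, there is also the built-in expunge command, which removes checkpoints older than the configured fs.trash.interval:

```
# Deletes trash checkpoints older than fs.trash.interval
hdfs dfs -expunge
```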
07-07-2016
04:24 PM
I am not sure what they mean by ORC not being a general-purpose format. Anyway, in this case you are still going through HCatalog (there are HCatalog APIs for MR and Pig). When I said you can transform this data as necessary, I mean things like creating new partitions, buckets, sorting, Bloom filters, and even redesigning tables for better access. There will be data duplication with any data transform if you want to keep the raw data as well.
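A sketch of the kind of transform I mean, in Hive DDL (every table and column name here is hypothetical):

```
# Rewrite raw data into a partitioned, bucketed ORC table with a Bloom filter;
# "events_raw", "events_orc", and the columns are placeholders
hive -e "
  SET hive.exec.dynamic.partition=true;
  SET hive.exec.dynamic.partition.mode=nonstrict;

  CREATE TABLE events_orc (user_id BIGINT, payload STRING)
    PARTITIONED BY (event_date STRING)
    CLUSTERED BY (user_id) SORTED BY (user_id) INTO 32 BUCKETS
    STORED AS ORC
    TBLPROPERTIES ('orc.bloom.filter.columns'='user_id');

  -- Rewriting the raw data into the new layout duplicates it, as noted above
  INSERT OVERWRITE TABLE events_orc PARTITION (event_date)
    SELECT user_id, payload, event_date FROM events_raw;
"
```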
07-14-2016
07:43 PM
@Kaliyug Antagonist You will typically need to do some configuration on the views to make them work properly. In a secured cluster, you have to specify all of the parameters for connecting to the particular service instead of using the "Local Cluster" configuration drop-down. The Ambari Views documentation contains instructions for configuring all of the various views.
07-26-2016
08:01 AM
@Kaliyug Antagonist We've found another neat solution to this, using a resource path of the form "/user/${id}". Credit to Naveed Hussain, who found it after we moaned a lot about the alternatives. Screenshot attached (ranger-home-directory-policy.png).
06-23-2016
04:30 PM
1. If you need home directories for each of the users, then you need to create the home directories. Ownership can be changed from the CLI, or you can set it using Ranger (though I think changing it from the CLI is better than creating a new policy in Ranger for these things); a minimal sketch is shown after this list.

2. I am talking about principals here, not service users (like hdfs, hive, yarn) coming from AD (using SSSD or some other such tool). So, with your setup, local users are created on each node, but they still need to authenticate with your KDC. Ambari can create the principals for you in the OU once you give the credentials to Ambari.

3. It's not mandatory to have /user/<username> for each user. We have cases where BI users who use ODBC/JDBC don't even have login access to the nodes, so they don't need /user/<username>. Even users that do log in don't need /user/<username> and could use something like /data/<group>/... to read from and write to HDFS.
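The sketch for item 1 (the username "alice" is a placeholder):

```
# Create the HDFS home directory and hand ownership to the user;
# run as the hdfs superuser, and replace "alice" with the real username
sudo -u hdfs hdfs dfs -mkdir -p /user/alice
sudo -u hdfs hdfs dfs -chown alice:hdfs /user/alice
```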