About mtdeguzis

mtdeguzis · ‎01-28-2019

I also noticed you can monitor the "Need to move" message for the remaining space to be balanced. This can go up or down depending on how busy the cluster is:

cat /tmp/hdfs_rebalancer.log | grep "Need to move" | tail -n 10

19/01/28 12:23:02 INFO balancer.Balancer: Need to move 11.11 TB to make the cluster balanced.
19/01/28 12:43:48 INFO balancer.Balancer: Need to move 11.10 TB to make the cluster balanced.
19/01/28 13:04:38 INFO balancer.Balancer: Need to move 10.89 TB to make the cluster balanced.
19/01/28 13:25:23 INFO balancer.Balancer: Need to move 10.83 TB to make the cluster balanced.
19/01/28 13:45:59 INFO balancer.Balancer: Need to move 10.83 TB to make the cluster balanced.
19/01/28 14:06:30 INFO balancer.Balancer: Need to move 10.78 TB to make the cluster balanced.
19/01/28 14:27:14 INFO balancer.Balancer: Need to move 10.73 TB to make the cluster balanced.
19/01/28 14:47:53 INFO balancer.Balancer: Need to move 10.70 TB to make the cluster balanced.
19/01/28 15:08:42 INFO balancer.Balancer: Need to move 10.66 TB to make the cluster balanced.
19/01/28 15:29:23 INFO balancer.Balancer: Need to move 10.75 TB to make the cluster balanced.
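If you want to track the trend rather than eyeball it, the same log lines can be parsed with a few lines of Python. This is only a rough sketch: it assumes the log format shown in this post, the function names are made up for illustration, and it uses two sample lines from the output above instead of reading the real log file.

```python
import re

# Matches balancer log lines of the form shown in the post, e.g.
# 19/01/28 12:23:02 INFO balancer.Balancer: Need to move 11.11 TB to make the cluster balanced.
NEED_TO_MOVE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) .*"
    r"Need to move (?P<amount>[\d.]+) (?P<unit>\w+) to make the cluster balanced"
)


def parse_need_to_move(lines):
    """Return (timestamp, amount, unit) tuples for every 'Need to move' line."""
    samples = []
    for line in lines:
        match = NEED_TO_MOVE.search(line)
        if match:
            samples.append((match["stamp"], float(match["amount"]), match["unit"]))
    return samples


def net_reduction(samples):
    """Amount moved between the first and last sample, or None if not comparable."""
    if len(samples) < 2:
        return None
    first, last = samples[0], samples[-1]
    if first[2] != last[2]:
        return None  # units differ; this sketch does not convert between them
    return round(first[1] - last[1], 2)


# Two lines copied from the output above; in practice, pass an open log file instead.
sample_lines = [
    "19/01/28 12:23:02 INFO balancer.Balancer: Need to move 11.11 TB to make the cluster balanced.",
    "19/01/28 15:29:23 INFO balancer.Balancer: Need to move 10.75 TB to make the cluster balanced.",
]
samples = parse_need_to_move(sample_lines)
for stamp, amount, unit in samples:
    print(stamp, amount, unit)
print("Net reduction:", net_reduction(samples))
```

A negative result would mean the remaining amount grew between the two samples, which matches the "can go up or down" behaviour described above.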

mtdeguzis · ‎09-10-2018

We do not yet use this in production because of other priorities, but I'd suspect your krb5.conf should be validated before going further. That is a pretty basic Kerberos message.

mtdeguzis · ‎05-07-2018

Another good one for those looking for properties. You can then write out the known config files from the response, converting each JSON dictionary to XML if needed.

Making API connection to: https://host.port/api/v1/clusters/cluster_name/configurations/service_config_versions?is_current=true
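As a rough illustration of the JSON dictionary > XML step, here is a minimal sketch using only the standard library. It assumes the response has the shape {"items": [{"configurations": [{"type": ..., "properties": {...}}]}]}; check that against what your cluster actually returns. The function names and the sample payload are hypothetical.

```python
import xml.etree.ElementTree as ET


def properties_to_xml(properties):
    """Render a flat {name: value} dict as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name in sorted(properties):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(properties[name])
    return ET.tostring(root, encoding="unicode")


def configs_from_response(payload):
    """Map each config type (e.g. 'hive-site') to its XML text.

    Assumption: payload looks like
    {"items": [{"configurations": [{"type": ..., "properties": {...}}]}]}.
    """
    rendered = {}
    for item in payload.get("items", []):
        for config in item.get("configurations", []):
            rendered[config["type"]] = properties_to_xml(config.get("properties", {}))
    return rendered


# Hand-written stand-in for a real API response (illustrative values only).
sample = {
    "items": [
        {
            "configurations": [
                {"type": "hive-site", "properties": {"hive.server2.thrift.port": "10000"}}
            ]
        }
    ]
}
for config_type, xml_text in configs_from_response(sample).items():
    print(config_type + ".xml")
    print(xml_text)
```

The output is unindented XML; pretty-printing and writing each entry to its own file are left out to keep the sketch short.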

mtdeguzis · ‎04-26-2018

I was able to get the client config via tarball with python requests:

TARBALL_URL = AMBARI_URI + ":" + AMBARI_PORT + "/api/v1/clusters/" + CLUSTER_NAME + '/components?format=client_config_tar'

However, as others have stated, this is a limited set. I also need the Ranger configs, like ranger-hive-security.xml. I have been looking at the Ranger API and the webpages that describe developing Ranger plugins, since obviously, when something in Hive etc. needs to talk to Ranger, it has to be aware of this config. It is available under the Hive conf.server folder on a given hiveserver2 host:

$ sudo ls /usr/hdp/current/hive-client/conf/conf.server/
hadoop-metrics2-hiveserver2.properties hive-env.sh.template hiveserver2-site.xml ranger-hive-audit.xml ranger-security.xml
hive-default.xml.template hive-exec-log4j.properties hive-site.xml ranger-hive-security.xml zkmigrator_jaas.conf
hive-env.sh hive-log4j.properties mapred-site.xml ranger-policymgr-ssl.xml

I need essentially this set from conf.server (working on a Hive sidecar instance). I do not* want to pull these from a server via rsync or use cp, as it needs to be portable for my purposes.

Related: https://community.hortonworks.com/questions/135415/rest-api-to-fetch-server-configs.html
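For anyone wanting to reproduce the tarball download, a minimal sketch follows. It uses the standard library's urllib in place of requests to stay dependency-free, and it assumes HTTP basic authentication; the function names, host, port, and cluster name are placeholders, not values from a real environment.

```python
import base64
import urllib.request


def client_config_tar_url(ambari_uri, ambari_port, cluster_name):
    """Build the same URL as TARBALL_URL in the post."""
    return (
        f"{ambari_uri}:{ambari_port}/api/v1/clusters/{cluster_name}"
        "/components?format=client_config_tar"
    )


def download_client_configs(url, user, password, dest):
    """Fetch the tarball and write it to dest (assumes basic auth is accepted)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(request) as response, open(dest, "wb") as out:
        out.write(response.read())
    return dest


# Placeholder values; the download itself is not called here.
url = client_config_tar_url("https://ambari.example.com", "8443", "mycluster")
print(url)
```

As noted in the post, the tarball only contains the client configs, so this does not cover the conf.server files such as ranger-hive-security.xml.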

mtdeguzis · ‎04-25-2018

How can one download other configs, such as ranger-security.xml? Do you need to use other APIs to get these files?

mtdeguzis · ‎09-11-2017

Hmm... so it does* appear you need to provide just* the filename for S1 and S2. Interesting.
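In a script, one way to guard against passing a full path by mistake is to reduce whatever you have to the bare snapshot name before building the command. This is a sketch only: the helper names are made up, the example paths are placeholders, and the argument order simply follows the distcp command used in this thread.

```python
import posixpath


def snapshot_name(value):
    """Reduce '/data/a/.snapshot/s1' or plain 's1' to the bare name 's1'."""
    name = posixpath.basename(value.rstrip("/"))
    if not name:
        raise ValueError(f"no snapshot name in {value!r}")
    return name


def distcp_diff_command(from_snapshot, to_snapshot, source, target):
    """Build the argument list for a snapshot-diff distcp run."""
    return [
        "hadoop", "distcp",
        "-diff", snapshot_name(from_snapshot), snapshot_name(to_snapshot),
        "-update", source, target,
    ]


# Placeholder paths; pass the list to subprocess.run(...) to execute it.
print(" ".join(distcp_diff_command("/data/a/.snapshot/s1", "s2", "/data/a", "/data/a_target")))
```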

mtdeguzis · ‎09-11-2017

I have the same issue when trying to compute the diff.

hadoop distcp -diff s1 s2 -update /data/a /data/a_target

/data/a_target is on another cluster. s1 (yesterday's snap) and s2 (today's snap) on the first cluster location are side by side, of course. I wonder if the diff needs the snapshot filename only, and not the absolute path.

mtdeguzis · ‎09-09-2017

Just so everyone is aware: the snapshot directories must be named the same on both sides to do the diff distcp:

Cannot find the snapshot of directory /group/bti/snapshot with name /group/bti/.snapshot/s20170908-080603.486
#LOF: /group/bti/snapshot/.snapshot/s20170908-212827.054

Due to default naming conventions, the folders will not be the same. The default folder names are seemingly time-stamped to the second. Name each created folder with today's date, such as "s20170908", so that when the diff distcp runs, it can find and update the same-day folder on the LOF side.
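A small sketch of that naming scheme, in case it helps: it builds a day-stamped name like "s20170908" and the matching createSnapshot command, so both clusters end up with the same snapshot name for a given day. The helper names are hypothetical; the directory is the one from the error message above.

```python
from datetime import date


def daily_snapshot_name(day=None):
    """Return a day-stamped snapshot name such as 's20170908'."""
    day = day or date.today()
    return "s" + day.strftime("%Y%m%d")


def create_snapshot_command(directory, day=None):
    """Build the argument list that creates a day-named snapshot of directory."""
    return ["hdfs", "dfs", "-createSnapshot", directory, daily_snapshot_name(day)]


# Run the same command on both clusters so the snapshot names line up.
print(" ".join(create_snapshot_command("/group/bti/snapshot", date(2017, 9, 8))))
```

With one snapshot per day this keeps the names aligned; taking more than one snapshot per day would need a different scheme.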

mtdeguzis · ‎09-09-2017

Thanks for the update! I ran into this myself when designing a Python HDFS snapshot manager for two of our clusters.

mtdeguzis · ‎08-04-2017

We got this working. For those using Hortonworks HiveServer2 with Kerberos, this is what you need to do (provided your Kerberos / krb5.conf is valid on your target host). Plus signs are for diff representation only.

dbeaver.ini:

-startup
plugins/org.eclipse.equinox.launcher_1.3.201.v20161025-1711.jar
--launcher.library
plugins/org.eclipse.equinox.launcher.win32.win32.x86_64_1.1.401.v20161122-1740
-showsplash
-vmargs
-Xms128m
-Xmx2048m
+ -Djavax.security.auth.useSubjectCredsOnly=false
+ -Djava.security.krb5.conf="krb5.conf"

Place the krb5.conf in the main installation path, or provide a path to it. After debugging for hours, and checking traces and more, this is all it took.

Class name: org.apache.hive.jdbc.HiveDriver
DBeaver URL template: jdbc:hive2://{host}:{port}/{database};principal=hive/{host}.host.com@HOST.COM

Last Visited	‎12-03-2019 08:00 AM

Member Since	‎01-12-2017 08:41 PM
Posts	59
Kudos received	1

Cloudera Community

Re: Question on HDFS Rebalance

Re: Connecting to HiveServer2 Interactive / LLAP u...

Re: How to export all HDP configuration files (xml...

Re: How to export all HDP configuration files (xml...

Re: Download Client Configs

Re: distcp update difference between two snapshot ...

Re: distcp update difference between two snapshot ...

Re: Managing Hadoop DR with 'distcp' and 'snapshot...

Re: Distcp with snapshot diff copy doesn't work wi...

Re: Connecting to HiveServer2 Interactive / LLAP u...