Member since: 01-19-2017
Posts: 3679
Kudos Received: 632
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 881 | 06-04-2025 11:36 PM |
| | 1472 | 03-23-2025 05:23 AM |
| | 728 | 03-17-2025 10:18 AM |
| | 2622 | 03-05-2025 01:34 PM |
| | 1740 | 03-03-2025 01:09 PM |
05-06-2018
09:07 PM
@Liana Napalkova That's simple. As the hdfs user, create the directory:
$ hdfs dfs -mkdir /app-logs/centos
Change the owner:
$ hdfs dfs -chown -R centos /app-logs/centos
The permissions should then look like this (see the third line):
$ hdfs dfs -ls /app-logs
Found 5 items
drwxrwx--- - admin hadoop 0 2018-05-04 18:03 /app-logs/admin
drwxrwx--- - ambari-qa hadoop 0 2017-10-19 13:59 /app-logs/ambari-qa
drwxr-xr-x - centos hadoop 0 2018-05-06 21:31 /app-logs/centos
drwxrwx--- - hive hadoop 0 2018-04-13 23:04 /app-logs/hive
Now you can launch your spark-submit (a minimal example is sketched below). Please let me know how it goes.
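For reference, a minimal spark-submit against YARN might look like the sketch below (the example class, jar path, and application ID are placeholders, not taken from this thread):
$ spark-submit --master yarn --deploy-mode cluster \
    --class org.apache.spark.examples.SparkPi \
    /usr/hdp/current/spark2-client/examples/jars/spark-examples*.jar 10
Once the application finishes, yarn logs -applicationId <application_id> run as the centos user should be able to read the aggregated logs from /app-logs/centos.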
05-06-2018
07:59 PM
@Sim kaur Have you tried switching to Beeline and running your queries there with:
set hive.execution.engine=mr;
Take note: the Hive CLI is deprecated, as the Hive community has long recommended the Beeline plus HiveServer2 configuration instead.
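If it helps, here is a minimal sketch of running the same query through Beeline (the HiveServer2 host, port, user, and table name are placeholders):
$ beeline -u "jdbc:hive2://<hiveserver2-host>:10000/default" -n <your-user>
0: jdbc:hive2://...> set hive.execution.engine=mr;
0: jdbc:hive2://...> select count(*) from <your_table>;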
05-06-2018
07:33 PM
3 Kudos
@Mudit Kumar Below is an outline of the procedure.
Assumptions:
- CentOS 6 or RHEL 6 (on CentOS/RHEL 7 the service commands differ, i.e. systemctl)
- The REALM is EXAMPLE.COM

# Install a new MIT KDC
Install the KDC server packages:
# yum install krb5-server krb5-libs krb5-workstation
On the KDC clients (cluster clients, datanodes, etc.):
# yum install krb5-workstation

# Edit the KDC server configuration file
In the [realms] section, replace the default "kerberos.example.com" value of the kdc and admin_server properties with the fully qualified domain name of the KDC server host. In the following example, "kerberos.example.com" has been replaced with "my.kdc.server".
# vi /etc/krb5.conf
[realms]
EXAMPLE.COM = {
kdc = my.kdc.server
admin_server = my.kdc.server
}
Some components, such as long-running Spark jobs, require renewable tickets. To configure MIT KDC to support them, ensure the following setting is specified in the [libdefaults] section of /etc/krb5.conf:
renew_lifetime = 7d
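For reference, a fuller [libdefaults] section with that setting might look like the sketch below (the default_realm and lifetime values are illustrative assumptions; adjust them to your environment):
[libdefaults]
  default_realm = EXAMPLE.COM
  dns_lookup_kdc = false
  dns_lookup_realm = false
  ticket_lifetime = 24h
  renew_lifetime = 7d
  forwardable = true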
# Create the Kerberos database (this takes a while)
# kdb5_util create -s
Loading random data
Initializing database '/var/kerberos/krb5kdc/principal' for realm 'EXAMPLE.COM',
master key name 'K/M@EXAMPLE.COM'
You will be prompted for the database Master Password.
It is important that you NOT FORGET this password.
Enter KDC database master key: xxxxxxxx   (do not lose this password)
Re-enter KDC database master key to verify: xxxxxxxx

# Start the KDC
Start the KDC server and the KDC admin server:
# service krb5kdc start
# service kadmin start
Set up the KDC server to auto-start on boot:
# chkconfig krb5kdc on
# chkconfig kadmin on

# Create a Kerberos admin
Create a KDC admin by creating an admin principal:
# kadmin.local -q "addprinc admin/admin"
Authenticating as principal admin/admin@EXAMPLE.COM with password.
WARNING: no policy specified for admin/admin@EXAMPLE.COM; defaulting to no policy
Enter password for principal "admin/admin@EXAMPLE.COM":
Re-enter password for principal "admin/admin@EXAMPLE.COM":
Principal "admin/admin@EXAMPLE.COM" created.
Confirm that this admin principal has permissions in the KDC ACL. Using a text editor, open the KDC ACL file /var/kerberos/krb5kdc/kadm5.acl and ensure it includes an entry allowing the admin principal to administer the KDC for your specific realm. When using a realm other than EXAMPLE.COM, be sure there is an entry for the realm you are using; if it is not present, principal creation will fail. For example, for an admin/admin@HADOOP.COM principal, you should have the entry:
*/admin@HADOOP.COM *
After editing and saving the kadm5.acl file, you must restart the kadmin process. On RHEL/CentOS/Oracle Linux 6:
# service kadmin restart
Check the status:
# service krb5kdc status
Desired output: krb5kdc (pid 2204) is running...
# service kadmin status
Desired output: kadmind (pid 16891) is running...

# Install the JCE
On the Ambari Server, obtain the JCE policy file appropriate for the JDK version in your cluster. For Oracle JDK 1.8:
# wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jce/8/jce_policy-8.zip"
# unzip jce_policy-8.zip
Save the policy file archive in a temporary location. On the Ambari Server and on each host in the cluster, add the unlimited security policy JCE jars to $JAVA_HOME/jre/lib/security/.
# unzip -o -j -q jce_policy-8.zip -d /usr/jdk64/jdk1.8.0_77/jre/lib/security/

# Restart Ambari Server
# ambari-server restart

# Running the Kerberos Security Wizard
When choosing Existing MIT KDC or Existing Active Directory, the Kerberos Wizard prompts for information related to the KDC, the KDC admin account, and the service and Ambari principals. Once provided, Ambari automatically creates principals, generates keytabs, and distributes keytabs to the hosts in the cluster. The services are configured for Kerberos and the service components are restarted to authenticate against the KDC.
To continue: http://docs.hortonworks.com/HDPDocuments/Ambari-2.4.1.0/bk_ambari-security/content/ch_advanced_security_options_for_ambari.html
Go to the Ambari GUI to enable Kerberos; the inputs are quite straightforward: admin principal, password, REALM, etc. Good luck!

After the successful installation, all the services are restarted. Test without a Kerberos ticket as the hdfs user:
$ su - hdfs
Destroy any valid ticket:
$ kdestroy
The command below should now error out:
$ hdfs dfs -ls /user
List the generated keytabs:
$ ls /etc/security/keytabs
Test with a valid Kerberos ticket as hdfs:
$ klist -kt /etc/security/keytabs/hdfs.service.keytab
Keytab name: FILE:/etc/security/keytabs/hdfs.service.keytab
KVNO Timestamp Principal
---- ----------------- --------------------------------------------------------
1 02/02/17 23:00:12 hdfs/london.EXAMPLE.COM@EXAMPLE.COM
1 02/02/17 23:00:12 hdfs/london.EXAMPLE.COM@EXAMPLE.COM
1 02/02/17 23:00:12 hdfs/london.EXAMPLE.COM@EXAMPLE.COM
1 02/02/17 23:00:12 hdfs/london.EXAMPLE.COM@EXAMPLE.COM
1 02/02/17 23:00:12 hdfs/london.EXAMPLE.COM@EXAMPLE.COM
Get a ticket:
$ kinit -kt /etc/security/keytabs/hdfs.service.keytab hdfs/london.EXAMPLE.COM@EXAMPLE.COM
You should now see a valid ticket:
$ klist
Ticket cache: FILE:/tmp/krb5cc_504
Default principal: hdfs/london.EXAMPLE.COM@EXAMPLE.COM
Valid starting Expires Service principal
02/10/17 01:32:45 02/11/17 01:32:45 krbtgt/EXAMPLE.COM@EXAMPLE.COM
renew until 02/10/17 01:32:45
The command below should now succeed:
$ hdfs dfs -ls /user
Hope that helps.
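As an aside, if you later want a dedicated headless user for testing instead of reusing the hdfs service keytab, a minimal sketch with kadmin.local would be (the principal name and keytab path are illustrative assumptions):
# kadmin.local -q "addprinc -randkey testuser@EXAMPLE.COM"
# kadmin.local -q "ktadd -k /etc/security/keytabs/testuser.keytab testuser@EXAMPLE.COM"
$ kinit -kt /etc/security/keytabs/testuser.keytab testuser@EXAMPLE.COM
$ hdfs dfs -ls /user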
05-06-2018
07:29 AM
@Sim kaur I just answered your YARN memory setup question on another thread; you were scheduling your Hive queries through Oozie 🙂 Please do use the Excel file to set up your vCores and memory correctly (a rough example of the resulting settings is sketched at the end of this post). Please accept and close the previous threads where part of your issue was resolved, because the memory issue is different from the ZooKeeper setup and cluster setup questions I had already answered. Try to keep your threads component-specific (it is easier to resolve that way) and open a new thread for each new issue 🙂
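For context, the kind of settings that worksheet helps you derive look roughly like the sketch below for a worker node with 32 GB of RAM and 8 cores (these values are illustrative only, not a recommendation for your cluster):
yarn.nodemanager.resource.memory-mb = 24576
yarn.nodemanager.resource.cpu-vcores = 8
yarn.scheduler.minimum-allocation-mb = 3072
yarn.scheduler.maximum-allocation-mb = 24576
mapreduce.map.memory.mb = 3072
mapreduce.reduce.memory.mb = 6144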
05-05-2018
08:18 PM
@Sim kaur Okay, I will wait for your feedback! How many ZooKeepers do you have running now? Anything less than 3 is NOT good; check the ZooKeeper split-brain documentation. For the failed installation, run ps aux | grep supervisor. If there is an existing supervisor process, kill its PID, and then start the agent to make sure it is using the right supervisord.conf: /var/run/cloudera-scm-agent/supervisor/supervisord.conf
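A minimal sketch of that check-and-restart sequence on the affected host (the PID shown is made up):
$ ps aux | grep [s]upervisord
# note the PID of any running supervisord process, e.g. 4242
$ kill 4242
$ service cloudera-scm-agent start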
05-05-2018
07:12 PM
@Sim kaur Any updates?
05-05-2018
07:11 PM
@Raj ji Any updates?
05-05-2018
04:17 PM
@Vishal G I did sit for the HDPCA and I can confirm you will have a link to the official HDP 2.3 documentation. The exam instances are rather dodgy, though, so start with the easiest questions and save most of your time for the more difficult ones. If you have tried the practice exam in AWS, the look and feel is exactly the same. Good luck!
05-05-2018
12:43 PM
@Sim kaur I would suggest you install node managers at least on the servers where the datanodes are running; that way the node managers can find the data locally. Datanodes are part of HDFS and node managers are part of YARN: datanodes store data on HDFS, whereas node managers start containers on YARN. There is no strict rule that datanodes and node managers have to be on the same host. If you have node managers on all nodes, the containers running on hosts where no datanode is installed will still run the application by copying data over from the datanodes, and that could well be the cause of the timeouts you are experiencing.
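To see how your datanodes and node managers currently line up, here is a quick sketch using standard commands (run them as a suitably privileged user, e.g. hdfs and yarn):
$ hdfs dfsadmin -report | grep '^Name'   # hosts running datanodes
$ yarn node -list -all                   # hosts running node managers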