Member since: 01-19-2017
Posts: 3679
Kudos Received: 632
Solutions: 372

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 1562 | 06-04-2025 11:36 PM |
|  | 2035 | 03-23-2025 05:23 AM |
|  | 959 | 03-17-2025 10:18 AM |
|  | 3644 | 03-05-2025 01:34 PM |
|  | 2529 | 03-03-2025 01:09 PM |
01-29-2019
12:02 PM
@Ali Erdem YES, it's possible to connect and run a Sqoop job against a SQL Server without typing a password.

Hadoop credential provider API
The CredentialProvider API in Hadoop allows for the separation of applications from how they store their required passwords/secrets. With Sqoop 1.4.5 or higher, the credential API keystore is supported by Sqoop. The AD user ONLY needs to include -Dhadoop.security.credential.provider.path in the sqoop command.

Here are the steps. The API expects the password .jceks file to be in HDFS and accessible to that user, preferably in his/her home directory.

Assumption: this is the password for the production SQL Server. It's good to standardize the alias names, e.g. sql_prod, sql_dev or ora_prod, ora_dev, etc.

$ hadoop credential create sql_prod.password -provider jceks://hdfs/user/erdem/sql_prod.password.jceks

The above command will prompt for the target database password, see the output below:

Enter password: {the_target_database_password}
Enter password again: {the_target_database_password}
sql_prod.password has been successfully created.
org.apache.hadoop.security.alias.JavaKeyStoreProvider has been updated.

Now the keystore should be in your home directory, and the file should be readable:

$ hdfs dfs -ls /user/erdem
Found 1 items
-rwx------ 3 erdem erdem 502 2019-01-29 11:08 /user/erdem/sql_prod.password.jceks

Now the user erdem can run a sqoop job:

sqoop import \
-Dhadoop.security.credential.provider.path=jceks://hdfs/user/erdem/sql_prod.password.jceks \
-Doraoop.timestamp.string=false -Dmapreduce.job.user.classpath.first=true \
--verbose --connect jdbc:sqlserver://sqlserver-name \
--username erdem \
--password-alias sql_prod.password \
--driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
--table test \
--target-dir "{some_dir}" \
--split-by NOOBJETRISQUECONTRAT --direct --as-parquetfile

In the above, I adapted the command from my Oracle Sqoop job, especially the driver part, but it should work without issue. You will notice that the user erdem didn't key in a password on the CLI, closing a security loophole. There you go, revert if you need more help.
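If you want to double-check that the alias landed in the keystore before running the import, a quick verification (reusing the same keystore path as above) could be:

$ hadoop credential list -provider jceks://hdfs/user/erdem/sql_prod.password.jceks

It should list sql_prod.password as a stored alias.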
01-29-2019
12:53 AM
Part 3 of the previous kerberization document
01-29-2019
12:52 AM
@Tom Burke Set up the server: install the Kerberos KDC and Admin Server.

$ apt update && apt upgrade -y
$ apt install krb5-kdc krb5-admin-server krb5-config -y
$ krb5_newrealm

Locate and edit the krb5.conf:

[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
default_realm = TEST.COM
dns_lookup_realm = false
dns_lookup_kdc = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
[realms]
TEST.COM = {
kdc = server.test.com
admin_server = server.test.com
}
[domain_realm]
.test.com = TEST.COM
test.com = TEST.COM
KDC configuration
Locate and edit the kdc.conf in /etc/krb5kdc/kdc.conf.

[kdcdefaults]
kdc_ports = 88
kdc_tcp_ports = 88
[realms]
TEST.COM = {
#master_key_type = aes256-cts
acl_file = /var/kerberos/krb5kdc/kadm5.acl
dict_file = /usr/share/dict/words
admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
}
Create the Kerberos database
This should pick up your REALM from the krb5.conf and kdc.conf. You will be prompted for a master password; keep it safe, as it will be needed for the Ambari Kerberos wizard.

# /usr/sbin/kdb5_util create -s

Output:

Loading random data
Initializing database '/var/kerberos/krb5kdc/principal' for realm 'TEST.COM',
master key name 'K/M@TEST.COM'
You will be prompted for the database Master Password.
It is important that you NOT FORGET this password.
Enter KDC database master key:
Re-enter KDC database master key to verify:

Locate and edit the kadm5.acl
Assign administrator privilege by editing the kadm5.acl in /var/kerberos/krb5kdc/kadm5.acl, replacing EXAMPLE.COM with your realm:

*/admin@TEST.COM *

Restart the KDC and kadmin
Set the 2 daemons to auto-start at boot, else your cluster won't start.

# /etc/rc.d/init.d/krb5kdc start
Starting Kerberos 5 KDC: [ OK ]
# /etc/rc.d/init.d/kadmin start
Starting Kerberos 5 Admin Server:

Create a Kerberos admin
Use the same master password.

# kadmin.local -q "addprinc admin/admin"

Output:

Authenticating as principal root/admin@TEST.COM with password.
WARNING: no policy specified for admin/admin@TEST.COM; defaulting to no policy
Enter password for principal "admin/admin@TEST.COM":
Re-enter password for principal "admin/admin@TEST.COM":
Principal "admin/admin@TEST.COM" created.

Check that the admin principal was created, then go to Ambari and enable Kerberos. See the attached Kerberos setup for HDP 3.1; they are quite similar save for the new UI.
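As a quick sanity check that the KDC works before handing it over to Ambari, you could also create a throwaway test principal and request a ticket (the principal name here is just an example):

# kadmin.local -q "addprinc testuser@TEST.COM"
# kinit testuser@TEST.COM
# klist

klist should show a valid krbtgt/TEST.COM@TEST.COM ticket if everything is in order.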
01-28-2019
04:07 PM
1 Kudo
@Marcel-Jan Krijgsman So frustrating indeed. Have you tried running the Hive import from /usr/hdp/2.6.5.0-292/atlas/hook-bin? The output should look like the below:

# ./import-hive.sh
Using Hive configuration directory [/etc/hive/conf]
Log file for import is /usr/hdp/current/atlas-server/logs/import-hive.log
log4j:WARN No such property [maxFileSize] in org.apache.log4j.PatternLayout.
log4j:WARN No such property [maxBackupIndex] in org.apache.log4j.PatternLayout.
Enter username for atlas :- admin
Enter password for atlas :-
Hive Meta Data imported successfully!!!

After it runs successfully, you should be able to see your tables in Atlas.
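If the one-off import works but newly created tables still don't show up, it may also be worth confirming that the Atlas Hive hook itself is registered in hive-site.xml (path assumed from a default HDP layout):

# grep -A1 "hive.exec.post.hooks" /etc/hive/conf/hive-site.xml

The value should contain org.apache.atlas.hive.hook.HiveHook.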
01-28-2019
11:57 AM
1 Kudo
@Michael Bronson If you have exhausted all other avenues, YES.

Step 1: Check and compare the /usr/hdp/current/kafka-broker symlinks (see the example after this list).
Step 2: Download both envs as a backup from the problematic and the functioning cluster. Upload the functioning cluster's env to the problematic one (since you have a backup) and start Kafka through Ambari.
Step 3: Disable Python certificate verification:
sed -i 's/verify=platform_default/verify=disable/' /etc/python/cert-verification.cfg
Step 4: Lastly, if the above steps don't remedy the issue, then remove and re-install the ambari-agent, and remember to manually point it to the correct Ambari server in ambari-agent.ini.
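For step 1, a simple way to compare the symlinks is to run the below on a broker node in each cluster and compare the targets:

# ls -l /usr/hdp/current/kafka-broker

The link should resolve to the HDP version directory actually installed on that node, e.g. /usr/hdp/2.6.5.0-292/kafka.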
01-28-2019
09:01 AM
1 Kudo
@Michael Bronson If you can start your brokers from the CLI but not from Ambari, that means your env is not set properly, as Ambari depends on that env to successfully start or stop a component. What you could do is export the env from the problematic cluster and compare it meticulously against the env from the working cluster using the procedures I sent above; you should be able to see the difference. Can you also validate that the symlinks are okay?
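For example, once you have exported kafka-env from both clusters to JSON files (the file names below are only placeholders), a plain diff already makes the differences obvious:

# diff /tmp/kafka-env-working.json /tmp/kafka-env-problematic.json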
01-28-2019
08:50 AM
@Bhushan Kandalkar Good that it worked out, but you shouldn't have omitted the information about the architecture, i.e. the load balancer; such info is critical in the analysis :-) Happy hadooping!
01-27-2019
10:07 PM
@Michael Bronson Then what you could do, using configs.py, is copy the kafka-env to /tmp on the working cluster, see below:

# /var/lib/ambari-server/resources/scripts/configs.py --user=admin --password=admin --port=8080 --action=get --host=localhost --cluster={your_clustername} --config-type=kafka-env --file=/tmp/kafka-env.json

Sample output:

2019-01-27 22:27:09,474 INFO ### Performing "get" content:
2019-01-27 22:27:09,474 INFO ### to file "/tmp/kafka.env.json"
2019-01-27 22:27:09,600 INFO ### on (Site:kafka.env, Tag:version1)

Validate the contents of the .json in /tmp/kafka-env.json. Sample output:

{
"properties": {
"kafka_user_nproc_limit": "65536",
"content": "\n#!/bin/bash\n\n# Set KAFKA specific environment variables here.\n\n# The java implementation to use.\nexport JAVA_HOME={{java64_home}}\nexport PATH=$PATH:$JAVA_HOME/bin\nexport PID_DIR={{kafka_pid_dir}}\nexport LOG_DIR={{kafka_log_dir}}\n{% if kerberos_security_enabled or kafka_other_sasl_enabled %}\nexport KAFKA_KERBEROS_PARAMS=\"-Djavax.security.auth.useSubjectCredsOnly=false {{kafka_kerberos_params}}\"\n{% else %}\nexport KAFKA_KERBEROS_PARAMS={{kafka_kerberos_params}}\n{% endif %}\n# Add kafka sink to classpath and related depenencies\nif [ -e \"/usr/lib/ambari-metrics-kafka-sink/ambari-metrics-kafka-sink.jar\" ]; then\n export CLASSPATH=$CLASSPATH:/usr/lib/ambari-metrics-kafka-sink/ambari-metrics-kafka-sink.jar\n export CLASSPATH=$CLASSPATH:/usr/lib/ambari-metrics-kafka-sink/lib/*\nfi\nif [ -f /etc/kafka/conf/kafka-ranger-env.sh ]; then\n. /etc/kafka/conf/kafka-ranger-env.sh\nfi",
"kafka_log_dir": "/var/log/kafka",
"kafka_pid_dir": "/var/run/kafka",
"kafka_user_nofile_limit": "128000",
"is_supported_kafka_ranger": "true",
"kafka_user": "kafka"
  }
}

Copy the file over to your cluster (using scp or whatever you prefer) and run the below command with --action=set to update your problematic cluster. Before you start Kafka, check the properties in kafka-env.json and adjust them to match your cluster config, e.g. memory.

# /var/lib/ambari-server/resources/scripts/configs.py --user=admin --password=admin --port=8080 --action=set --host=localhost --cluster={your_clustername} --config-type=kafka-env --file=/tmp/kafka-env.json

Sample output:

2019-01-27 22:29:38,568 INFO ### Performing "set":
2019-01-27 22:29:38,568 INFO ### from file /tmp/kafka.env.json
2019-01-27 22:29:38,569 INFO ### PUTting file: "/tmp/kafka.env.json"
2019-01-27 22:29:38,569 INFO ### PUTting json into: doSet_version1.json
2019-01-27 22:29:38,719 INFO ### NEW Site:kafka.env, Tag:version2

Start your Kafka from Ambari; this should work. Please let me know.
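If you want to confirm that Ambari really created a new kafka-env version after the set (credentials and cluster name below are placeholders), you could also query the Ambari REST API directly:

# curl -u admin:admin "http://localhost:8080/api/v1/clusters/{your_clustername}/configurations?type=kafka-env"

The latest tag returned should be the version2 shown in the sample output above.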
01-27-2019
09:17 PM
@Michael Bronson Do you have a working cluster of the same HDP version? Is it a kerberized environment?
01-27-2019
10:01 AM
@Bhushan Kandalkar Can you start with the following checks to investigate the SaslTransport issues?

First, the hive keytab:

# ll /etc/security/keytabs/hive.service.keytab

Desired output (note the ownership and permission bits!):

-r--r----- 1 hive hadoop 353 Oct 11 10:49 /etc/security/keytabs/hive.service.keytab

Check Hive ---> Configs ---> Advanced hive-site and look at hive.server2.authentication.kerberos.principal. Desired value:

hive/_HOST@REALM

This should match the entry in the Kerberos database. Validate it by running the below command on the KDC server as the root user:

# kadmin.local
kadmin.local: listprincs

Desired output:

hive/$FQDN@REALM

Lastly, can you regenerate the keytabs using the Ambari Kerberos wizard, then restart the cluster. HTH
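As an extra check, you could also validate the keytab itself by authenticating with it (REALM is a placeholder, adjust to your environment):

# klist -kt /etc/security/keytabs/hive.service.keytab
# kinit -kt /etc/security/keytabs/hive.service.keytab hive/$(hostname -f)@REALM
# klist

If kinit succeeds without prompting for a password, the keytab and the principal in the KDC match.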