Member since
06-04-2019
9
Posts
0
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
5798 | 01-09-2024 03:40 AM |
09-18-2024
09:19 PM
1 Kudo
This solution eliminated the error, but no data is being fetched from the table. The resulting DataFrame is empty.
01-09-2024
03:40 AM
As I was already using the Hadoop Credential Provider, I found a solution that does not require decrypting the password, as follows.
PySpark code:
from pyspark.sql import SparkSession
# Spark session; the keytab is passed to .config() as a key/value pair
spark = SparkSession.builder \
    .config("spark.yarn.keytab", "/etc/security/keytabs/<APPLICATION_USER>.keytab") \
    .appName('SPARK_TEST') \
    .master("yarn") \
    .getOrCreate()
credential_provider_path = 'jceks://hdfs/<PATH>/<CREDENTIAL_FILE>.jceks'
credential_name = 'PASSWORD.ALIAS'
# Hadoop credential: point the Hadoop configuration at the provider
conf = spark.sparkContext._jsc.hadoopConfiguration()
conf.set('hadoop.security.credential.provider.path', credential_provider_path)
# getPassword returns a Java char array
credential_raw = conf.getPassword(credential_name)
# Build the Python string one character at a time
password = ''
for i in range(len(credential_raw)):
    password = password + str(credential_raw[i])
The important point above is the .config() line in SparkSession. You must provide the keytab to access the password. Otherwise you will get the encrypted value.
I can't say that I'm very happy with being able to directly manipulate the password value in the code. I would like to delegate this to some component so that the programmer does not have direct access to the password value. Maybe what I'm looking for is some kind of authentication provider, but for now the solution above works for me.
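To keep the password out of the main script body, the character-joining step could be wrapped in a small helper. This is only a sketch: `chars_to_password` and `jdbc_options` are hypothetical names, not part of PySpark or Hadoop, and it assumes `credential_raw` behaves like an indexable sequence of single characters (or is None when the alias is not found). The `url`, `dbtable`, `user` and `password` keys are the standard Spark JDBC data source options.

```python
from typing import Dict, Optional, Sequence


def chars_to_password(credential_raw: Optional[Sequence[str]]) -> str:
    """Join the characters returned by getPassword into one string.

    Hypothetical helper: raises instead of silently returning an
    empty password when the alias was not found (None).
    """
    if credential_raw is None:
        raise KeyError("credential alias not found in the provider")
    # Index access only, so a Java char array proxied into Python
    # and a plain Python list are handled the same way.
    return "".join(str(credential_raw[i]) for i in range(len(credential_raw)))


def jdbc_options(url: str, table: str, user: str,
                 credential_raw: Optional[Sequence[str]]) -> Dict[str, str]:
    """Build the option dict for a JDBC read (hypothetical helper)."""
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": chars_to_password(credential_raw),
    }
```

With the session from the post above, the options could then be passed as `spark.read.format("jdbc").options(**jdbc_options(url, table, user, credential_raw)).load()`. The password still exists as a string in the driver process, so this only narrows where it is handled; it does not hide it from the programmer.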
12-26-2023
03:31 AM
Doesn't this approach expose the password passed to the connection in plain text?
07-20-2022
08:25 AM
As far as I know, this is not something that Ambari or Sqoop allows for. To achieve your goal, you could do one of the following:
1. Prepare sh scripts and refer to your JDBC string as a variable.
2. Prepare an Oozie workflow and pass the JDBC string as a variable.
At that point you might have an external tool (e.g. Jenkins) maintaining a list of JDBC strings and taking responsibility for specifying the desired one. In solution 1, Jenkins should SSH to the node, set the variable to the JDBC string, and launch the sh script. In solution 2, Jenkins should use the Oozie API to start the workflow while specifying the desired variable value. Solution 2 is much better than solution 1, since it relies on a distributed, highly available service (Oozie).
Regards
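As a rough illustration of solution 2, the sketch below builds (but does not send) a job submission for the Oozie Web Services API, which accepts a POST to `/v1/jobs?action=start` with an XML `<configuration>` body. The function names, the `jdbcUrl` property name, the host and the HDFS path are all assumptions made for illustration; the workflow itself would have to reference the variable as `${jdbcUrl}`.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import Dict


def build_job_config(properties: Dict[str, str]) -> bytes:
    """Serialize job properties as an Oozie-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="utf-8")


def build_start_request(oozie_url: str,
                        properties: Dict[str, str]) -> urllib.request.Request:
    """Prepare the POST that submits and starts a workflow (not sent here)."""
    return urllib.request.Request(
        url=oozie_url.rstrip("/") + "/v1/jobs?action=start",
        data=build_job_config(properties),
        headers={"Content-Type": "application/xml;charset=UTF-8"},
        method="POST",
    )


# Hypothetical values: the caller (e.g. a Jenkins job) picks the JDBC string.
request = build_start_request(
    "http://oozie-host:11000/oozie",
    {
        "user.name": "etl_user",
        "oozie.wf.application.path": "hdfs:///apps/sqoop-import",
        "jdbcUrl": "jdbc:mysql://db-host/sales",
    },
)
```

Sending it would be a matter of calling `urllib.request.urlopen(request)`. On a Kerberized cluster the call would also need SPNEGO authentication, which this sketch leaves out.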
01-07-2022
09:37 AM
@cardozogp Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks
06-15-2021
11:41 AM
Hi @Bender, thanks for your reply. The FAQ wasn't useful. Its answer is: "With BDA V2.0 Sqoop automatically supports Oracle Database and MySQL. Hence connect strings beginning with jdbc:oracle or jdbc:mysql:// are handled with no additional setup." 😞