Member since 09-29-2015 · 186 Posts · 63 Kudos Received · 12 Solutions
09-20-2021
04:19 PM
To demonstrate what is happening here, see these steps and output.

Create a database at a custom location:

0: jdbc:hive2://hostname.cloudera.co> create database grandtour location 'hdfs://hostname.cloudera.com:8020/bdr-test/grandtour.db';
<TRUNCATED>
INFO : Executing command(queryId=hive_20210818180811_d96dd7f8-2713-440f-8e9f-
8eebd2954d05): create database grandtour location 'hdfs://hostname.cloudera.com:8020/bdr-test/grandtour.db'
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20210818180811_d96dd7f8-2713-
440f-8e9f-8eebd2954d05); Time taken: 0.031 seconds
INFO : OK
No rows affected (0.083 seconds)

Create a table and insert a record:

0: jdbc:hive2://hostname.cloudera.co> use grandtour;
<TRUNCATED>
INFO : Completed executing command(queryId=hive_20210818180835_f365f042-e2a9-
4f53-a9ed-317d21dcfc07); Time taken: 0.008 seconds
INFO : OK
No rows affected (0.065 seconds)
0: jdbc:hive2://hostname.cloudera.co> create table madagascar (name string);
<TRUNCATED>
INFO : Completed executing command(queryId=hive_20210818180858_79bb34bf-b703-
43c9-a720-4756c22cb661); Time taken: 0.053 seconds
INFO : OK
No rows affected (0.117 seconds)
0: jdbc:hive2://hostname.cloudera.co> insert into madagascar values('james may');
<TRUNCATED>
INFO : Completed executing command(queryId=hive_20210818181038_83ea0501-5d24-
4c57-bdde-2e214e7abb9c); Time taken: 19.26 seconds
INFO : OK
1 row affected (19.47 seconds)

Table contents:

0: jdbc:hive2://hostname.cloudera.co> select * from madagascar;
<TRUNCATED>
INFO : Completed executing command(queryId=hive_20210818181134_346ee6b8-0c2e-
4eae-9ee9-e88d94aa6b3e); Time taken: 0.001 seconds
INFO : OK
+------------------+
| madagascar.name |
+------------------+
| james may |
+------------------+
1 row selected (0.421 seconds)
0: jdbc:hive2://hostname.cloudera.co>

HDFS listing in CDH 6.x:

[hdfs@c441-node4 ~]$ hdfs dfs -ls /bdr-test
Found 1 items
drwxrwxrwx - hive supergroup 0 2021-08-18 18:08 /bdr-test/grandtour.db
[hdfs@c441-node4 ~]$ hdfs dfs -ls /bdr-test/grandtour.db
Found 1 items
drwxrwxrwx - hive supergroup 0 2021-08-18 18:08 /bdr-test/grandtour.db/madagascar
[hdfs@c441-node4 ~]$ hdfs dfs -ls /bdr-test/grandtour.db/madagascar
Found 1 items
-rwxrwxrwx 3 hive supergroup 10 2021-08-18 18:10 /bdr-test/grandtour.db/madagascar/000000_0
[hdfs@c441-node4 ~]$ hdfs dfs -cat /bdr-test/grandtour.db/madagascar/000000_0
james may

Run a Hive BDR job to the CDP cluster. After BDR, on the destination cluster:

0: jdbc:hive2://hostname.cloudera.co> show databases;
<TRUNCATED>
INFO : Completed executing command(queryId=hive_20210818185431_6c2ad328-78c4-
454d-bead-3aa7baae907e); Time taken: 0.007 seconds
INFO : OK
+---------------------+
| database_name |
+---------------------+
| default |
| grandtour |
| information_schema |
| sys |
+---------------------+
4 rows selected (0.044 seconds)

Describe output shows a different location. It should be /bdr-test, but it shows the default location. Even though describe shows a wrong location, the table is at the correct location on HDFS. The listing looks as follows:

[root@c241-node3 ~]# hdfs dfs -ls /
Found 7 items
drwxrwxrwx - hive supergroup 0 2021-08-18 18:40 /bdr-test
drwxrwxrwx - hbase hbase 0 2021-08-18 18:29 /hbase
drwxrwxrwx - hdfs supergroup 0 2021-08-18 01:14 /ranger
drwxrwxrwx - solr solr 0 2021-08-18 01:14 /solr-infra
drwxrwxrwx - hdfs supergroup 0 2021-08-18 18:25 /tmp
drwxrwxrwx - hdfs supergroup 0 2021-08-18 18:25 /user
drwxrwxrwx - hdfs supergroup 0 2021-08-18 01:14 /warehouse
[root@c241-node3 ~]# hdfs dfs -ls /bdr-test
Found 1 items
drwxrwxrwx - hive supergroup 0 2021-08-18 18:40 /bdr-test/grandtour.db
[root@c241-node3 ~]# hdfs dfs -ls /bdr-test/grandtour.db
Found 1 items
drwxrwxrwx - hive supergroup 0 2021-08-18 18:40 /bdr-test/grandtour.db/madagascar
[root@c241-node3 ~]# hdfs dfs -ls -R /bdr-test/grandtour.db
drwxrwxrwx - hive supergroup 0 2021-08-18 18:40 /bdr-test/grandtour.db/madagascar
-rwxrwxrwx 3 hive supergroup 10 2021-08-18 18:10 /bdr-test/grandtour.db/madagascar/000000_0
[root@c241-node3 ~]# hdfs dfs -cat /bdr-test/grandtour.db/madagascar/000000_0
james may

The reason the HDFS listing gets created this way is that this is the table location. The create_database call from the client comes with "locationUri:/warehouse/tablespace/managed/hive/grandtour.db", while the create_table call from the client comes with "location:/bdr-test/grandtour.db/madagascar". You can verify this in the Hive metastore log. Note the locationUri in these messages.

For the database:

2021-08-18 18:40:47,421 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-7-thread-196]: 203: source:172.xx.xx.xx create_database: Database(name:grandtour, description:null, locationUri:/warehouse/tablespace/managed/hive/grandtour.db, parameters:{}, ownerName:hive, ownerType:USER, catalogName:hive, createTime:1629310091)

For the table:

2021-08-18 18:40:47,563 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-7-thread-198]: 205: source:172.xx.xx.xx create_table_req: Table(tableName:madagascar, dbName:grandtour, owner:hive, createTime:1629310138, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:name, type:string, comment:null)], location:/bdr-test/grandtour.db/madagascar, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{external.table.purge=true, numRows=1, rawDataSize=9, transient_lastDdlTime=1629310257, numFilesErasureCoded=0, totalSize=10, EXTERNAL=TRUE, COLUMN_STATS_ACCURATE={"BASIC_STATS":"true"}, numFiles=1}, viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE, catName:hive, ownerType:USER)

Now, how to fix the database location?
This issue has been resolved in Cloudera Manager 7.4.4.
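On releases without the fix, one possible manual correction (a hedged sketch, not an official procedure: ALTER DATABASE ... SET LOCATION is available in Hive 2.2.1/2.4.0 and later, and it only changes the default parent directory for tables created afterwards, not existing table paths) would be:

```sql
-- Hypothetical repair on the destination cluster: point the database
-- back at the intended custom location. Existing tables keep their paths.
ALTER DATABASE grandtour SET LOCATION 'hdfs://hostname.cloudera.com:8020/bdr-test/grandtour.db';
```

Verify afterwards with DESCRIBE DATABASE grandtour;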
09-20-2021
11:58 AM
1 Kudo
In this example, I am importing encryption keys from an HDP 3.1.5 cluster to an HDP 2.6.5 cluster.

Create the key "testkey" in Ranger KMS on the HDP 3.1.5 cluster, following the steps in List and Create Keys. In HDP 3.1.5, note the current master key in the Ranger KMS UI.

Create an encryption zone with "testkey":

[hdfs@c241-node3 ~]$ hdfs crypto -createZone -keyName testkey -path /testEncryptionZone
Added encryption zone /testEncryptionZone

List to confirm the zone and keys:

[hdfs@c241-node3 ~]$ hdfs crypto -listZones
/testEncryptionZone  testkey

Export the keys:
1. Log in to the KMS host.
2. export JAVA_HOME
3. cd /usr/hdp/current/ranger-kms
4. ./exportKeysToJCEKS.sh $filename

The output will look as follows:

[root@c241-node3 ranger-kms]# export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.292.b10-1.el7_9.x86_64/jre
[root@c241-node3 ranger-kms]# ./exportKeysToJCEKS.sh /tmp/hdp315keys.keystore
Enter Password for the keystore FILE :
Enter Password for the KEY(s) stored in the keystore:
Keys from Ranger KMS Database has been successfully exported into /tmp/hdp315keys.keystore

On the HDP 2.6.5 cluster where we need to import the keys, do the following:
1. Log in to the KMS host.
2. Add org.apache.hadoop.crypto.key.**; to the property jceks.key.serialFilter. This needs to be changed in the following file on the KMS host only: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.292.b10-1.el7_9.x86_64/jre/lib/security/java.security

After the change, the entry in the file should look like this:

jceks.key.serialFilter = java.lang.Enum;java.security.KeyRep;\
java.security.KeyRep$Type;javax.crypto.spec.SecretKeySpec;org.apache.hadoop.crypto.key.**;!*

3. export JAVA_HOME, RANGER_KMS_HOME, RANGER_KMS_CONF, SQL_CONNECTOR_JAR
4. cd /usr/hdp/current/ranger-kms/
5. Run ./importJCEKSKeys.sh $filename JCEKS

The output looks like this:

[root@c441-node3 ranger-kms]# export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.292.b10-1.el7_9.x86_64/jre
[root@c441-node3 ranger-kms]# export RANGER_KMS_HOME=/usr/hdp/2.6.5.0-292/ranger-kms
[root@c441-node3 ranger-kms]# export RANGER_KMS_CONF=/etc/ranger/kms/conf
[root@c441-node3 ranger-kms]# export SQL_CONNECTOR_JAR=/var/lib/ambari-agent/tmp/mysql-connector-java.jar
[root@c441-node3 security]# cd /usr/hdp/current/ranger-kms/
[root@c441-node3 ranger-kms]# ./importJCEKSKeys.sh /tmp/hdp315keys.keystore JCEKS
Enter Password for the keystore FILE :
Enter Password for the KEY(s) stored in the keystore:
2021-08-12 23:58:06,729 ERROR RangerKMSDB - DB Flavor could not be determined
Keys from /tmp/hdp315keys.keystore has been successfully imported into RangerDB.

To confirm that the encryption keys are imported, check the ranger_keystore table in the HDP 2.6.5 cluster's database for the "testkey" entry. Also, check that the master key in HDP 2.6.5 is untouched; it is still the one that Ranger KMS created.

Now create an encryption zone in HDP 2.6.5 with the imported key:

[hdfs@c441-node3 ~]$ hdfs dfs -mkdir /testEncryptionZone-265
[hdfs@c441-node3 ~]$ hdfs crypto -createZone -keyName testkey -path /testEncryptionZone-265
Added encryption zone /testEncryptionZone-265

Confirm the zone and keys:

[hdfs@c441-node3 ~]$ hdfs crypto -listZones
/testEncryptionZone-265  testkey

Now for the distcp, note that it needs /.reserved/raw before the encryption zone path, and the -px option. Command:

hadoop distcp -px /.reserved/raw/$encryptionZonePath/filename hdfs://destination/.reserved/raw/$encryptionZonePath/filename

Check this document to read about these options: Configuring Apache HDFS Encryption.

Following is the output of distcp. It is truncated but shows the copied file. Note that skipCRC is false.

[hdfs@c241-node3 ~]$ hadoop distcp -px /.reserved/raw/testEncryptionZone/text.txt hdfs://172.25.37.10:8020/.reserved/raw/testEncryptionZone-265/
ERROR: Tools helper /usr/hdp/3.1.5.0-152/hadoop/libexec/tools/hadoop-distcp.sh was
not found.
21/08/13 01:52:58 INFO tools.DistCp: Input Options:
DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false,
ignoreFailures=false, overwrite=false, append=false, useDiff=false, useRdiff=false,
fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true,
numListstatusThreads=0, maxMaps=20, mapBandwidth=0.0, copyStrategy='uniformsize',
preserveStatus=[XATTR], atomicWorkPath=null, logPath=null, sourceFileListing=null,
sourcePaths=[/.reserved/raw/testEncryptionZone/text.txt],
targetPath=hdfs://172.25.37.10:8020/.reserved/raw/testEncryptionZone-265,
filtersFile='null', blocksPerChunk=0, copyBufferSize=8192, verboseLog=false,
directWrite=false}, sourcePaths=[/.reserved/raw/testEncryptionZone/text.txt],
targetPathExists=true, preserveRawXattrs=false
<TRUNCATED>
21/08/13 01:52:59 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 1; dirCnt
= 0
21/08/13 01:52:59 INFO tools.SimpleCopyListing: Build file listing completed.
21/08/13 01:52:59 INFO tools.DistCp: Number of paths in the copy list: 1
21/08/13 01:52:59 INFO tools.DistCp: Number of paths in the copy list: 1
<TRUNCATED>
DistCp Counters
Bandwidth in Btyes=21
Bytes Copied=21
Bytes Expected=21
Files Copied=1

Another question that came up: what happens to old keys when I import a new key? It just gets added to the existing keys.
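The ranger_keystore check mentioned above can be sketched as a query like this (the column names are assumptions; verify them against your Ranger KMS database schema):

```sql
-- Hypothetical check in the Ranger KMS database on the HDP 2.6.5 cluster:
-- confirm the imported "testkey" entry is present.
SELECT kms_alias, kms_createdDate FROM ranger_keystore WHERE kms_alias LIKE '%testkey%';
```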
04-07-2021
10:34 PM
Did you make any changes to the SQL connector recently, such as upgrading it or copying a new jar over? Would you be able to attach the full log file here?
03-24-2021
09:19 PM
They will be in the process directory for the component. For example: hive.keytab is in:
/var/run/cloudera-scm-agent/process/*-hive_on_tez-HIVESERVER2
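To grab the newest copy on a node, something like this can work (a sketch; the role directory name varies with the service and role type):

```shell
# List hive.keytab under the Cloudera Manager agent process dirs, newest first;
# prints nothing if no HiveServer2 role has run on this node.
ls -t /var/run/cloudera-scm-agent/process/*-hive_on_tez-HIVESERVER2/hive.keytab 2>/dev/null | head -n 1
```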
03-23-2021
10:29 AM
@zampJeri This /tmp is on the local OS file system, not HDFS. Hive wants to create the _resources files there but is unable to. Does the user have permissions on /tmp/hive?
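A quick local check can be sketched like this (the path comes from the error context; run it as the user that submits the job):

```shell
# Verify the local /tmp/hive directory exists and is writable by this user;
# Hive tries to create its _resources files under a directory like this.
d=/tmp/hive
mkdir -p "$d" && touch "$d/_resources_test" && rm -f "$d/_resources_test" \
  && echo "writable" || echo "not writable"
```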
03-22-2021
06:23 PM
The 'hive.notification.sequence.lock.max.retries' parameter sets the number of retries for acquiring a lock when getting the next notification ID for entries in the 'NOTIFICATION_LOG' table. The error that you are seeing does seem to be related to this. Can you add more context: when are you seeing this, what job are you running, and can you share the full stack trace?
03-19-2021
11:01 AM
If you keep seeing this, check the bash profile on all nodes to see if the umask is set to something else. It should be 0022.
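The effect of umask on new files can be demonstrated locally (a minimal sketch using GNU stat):

```shell
# With umask 0022, new files are created 644 (666 minus 022)
# and new directories 755 (777 minus 022).
umask 0022
f=/tmp/umask_demo_$$
rm -f "$f"
touch "$f"
stat -c '%a' "$f"   # prints 644
rm -f "$f"
```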
03-19-2021
10:47 AM
Do you have SPNEGO enabled for browsers? https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.0.1/authentication-with-kerberos/content/authe_spnego_enabling_browser_access_to_a_spnego_enabled_web_ui.html Are you seeing any error on the UI?
03-18-2021
11:10 PM
What is the spark command you are using? What is the HDP version? Are you using LLAP?
03-18-2021
10:10 PM
@mohammad_shamim Have you enabled Kerberos in the cluster? Can you paste a screenshot of the RM UI? Do you see any error on the page?