Member since: 09-29-2015
Posts: 186
Kudos Received: 63
Solutions: 12

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3443 | 08-11-2017 05:27 PM
 | 2334 | 06-27-2017 10:58 PM
 | 2437 | 04-09-2017 09:43 PM
 | 3464 | 04-01-2017 02:04 AM
 | 4667 | 03-13-2017 06:35 PM
08-29-2023
01:01 PM
Here is the log file: https://drive.google.com/file/d/1ApuElSIlU0zmDmI12pQ0PfJN8B1-du2k/view?usp=share_link
09-23-2021
06:59 AM
1 Kudo
The keytabs are pushed from a database to a runtime location at startup of the services, so the configuration you are describing is not really viable as far as I understand. You will see them under /var/run/cloudera-scm-agent/process/, but this directory is ephemeral; the next restart will use another location. You could experiment with providing the manual keytabs to the necessary services through a safety valve.
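To illustrate why a fixed path cannot work, the only reliable way to find the current keytab directory is to look up the newest process directory at runtime. A minimal sketch, assuming the default Cloudera Manager agent layout (the helper name and the role pattern are mine, for illustration):

```shell
# Hypothetical helper: print the most recently modified directory under
# "$base" whose name contains "$pattern". Against a Cloudera Manager agent
# it could be called as, e.g.:
#   latest_proc_dir /var/run/cloudera-scm-agent/process NAMENODE
# to locate the current (ephemeral) keytab directory after a restart.
latest_proc_dir() {
  base="$1"; pattern="$2"
  # -t sorts by modification time (newest first); -d lists the dirs themselves
  ls -td "$base"/*"$pattern"* 2>/dev/null | head -n 1
}
```

Any script relying on such a path would have to re-resolve it after every service restart.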
09-20-2021
04:19 PM
To demonstrate what is happening here, see these steps and output.

Create a database at a custom location:

0: jdbc:hive2://hostname.cloudera.co> create database grandtour
location 'hdfs://hostname.cloudera.com:8020/bdr-test/grandtour.db';
<TRUNCATED>
INFO : Executing command(queryId=hive_20210818180811_d96dd7f8-2713-440f-8e9f-8eebd2954d05): create database grandtour location 'hdfs://hostname.cloudera.com:8020/bdr-test/grandtour.db'
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20210818180811_d96dd7f8-2713-440f-8e9f-8eebd2954d05); Time taken: 0.031 seconds
INFO : OK
No rows affected (0.083 seconds)

Create a table and insert a record:

0: jdbc:hive2://hostname.cloudera.co> use grandtour;
<TRUNCATED>
INFO : Completed executing command(queryId=hive_20210818180835_f365f042-e2a9-4f53-a9ed-317d21dcfc07); Time taken: 0.008 seconds
INFO : OK
No rows affected (0.065 seconds)
0: jdbc:hive2://hostname.cloudera.co> create table madagascar (name string);
<TRUNCATED>
INFO : Completed executing command(queryId=hive_20210818180858_79bb34bf-b703-43c9-a720-4756c22cb661); Time taken: 0.053 seconds
INFO : OK
No rows affected (0.117 seconds)
0: jdbc:hive2://hostname.cloudera.co> insert into madagascar values('james may');
<TRUNCATED>
INFO : Completed executing command(queryId=hive_20210818181038_83ea0501-5d24-4c57-bdde-2e214e7abb9c); Time taken: 19.26 seconds
INFO : OK
1 row affected (19.47 seconds)

Table contents:

0: jdbc:hive2://hostname.cloudera.co> select * from madagascar;
<TRUNCATED>
INFO : Completed executing command(queryId=hive_20210818181134_346ee6b8-0c2e-4eae-9ee9-e88d94aa6b3e); Time taken: 0.001 seconds
INFO : OK
+------------------+
| madagascar.name |
+------------------+
| james may |
+------------------+
1 row selected (0.421 seconds)
0: jdbc:hive2://hostname.cloudera.co>

HDFS listing in CDH 6.x:

[hdfs@c441-node4 ~]$ hdfs dfs -ls /bdr-test
Found 1 items
drwxrwxrwx - hive supergroup 0 2021-08-18 18:08 /bdr-test/grandtour.db
[hdfs@c441-node4 ~]$ hdfs dfs -ls /bdr-test/grandtour.db
Found 1 items
drwxrwxrwx - hive supergroup 0 2021-08-18 18:08 /bdr-test/grandtour.db/madagascar
[hdfs@c441-node4 ~]$ hdfs dfs -ls /bdr-test/grandtour.db/madagascar
Found 1 items
-rwxrwxrwx 3 hive supergroup 10 2021-08-18 18:10 /bdr-test/grandtour.db/madagascar/000000_0
[hdfs@c441-node4 ~]$ hdfs dfs -cat /bdr-test/grandtour.db/madagascar/000000_0
james may

Run a Hive BDR job in the CDP cluster. After BDR, on the destination cluster:

0: jdbc:hive2://hostname.cloudera.co> show databases;
<TRUNCATED>
INFO : Completed executing command(queryId=hive_20210818185431_6c2ad328-78c4-454d-bead-3aa7baae907e); Time taken: 0.007 seconds
INFO : OK
+---------------------+
| database_name |
+---------------------+
| default |
| grandtour |
| information_schema |
| sys |
+---------------------+
4 rows selected (0.044 seconds)

The describe output shows a different location: it should be /bdr-test, but it shows the default location. Even though describe shows the wrong location, the table is at the correct location on HDFS. The listing looks as follows:

[root@c241-node3 ~]# hdfs dfs -ls /
Found 7 items
drwxrwxrwx - hive supergroup 0 2021-08-18 18:40 /bdr-test
drwxrwxrwx - hbase hbase 0 2021-08-18 18:29 /hbase
drwxrwxrwx - hdfs supergroup 0 2021-08-18 01:14 /ranger
drwxrwxrwx - solr solr 0 2021-08-18 01:14 /solr-infra
drwxrwxrwx - hdfs supergroup 0 2021-08-18 18:25 /tmp
drwxrwxrwx - hdfs supergroup 0 2021-08-18 18:25 /user
drwxrwxrwx - hdfs supergroup 0 2021-08-18 01:14 /warehouse
[root@c241-node3 ~]# hdfs dfs -ls /bdr-test
Found 1 items
drwxrwxrwx - hive supergroup 0 2021-08-18 18:40 /bdr-test/grandtour.db
[root@c241-node3 ~]# hdfs dfs -ls /bdr-test/grandtour.db
Found 1 items
drwxrwxrwx - hive supergroup 0 2021-08-18 18:40 /bdr-test/grandtour.db/madagascar
[root@c241-node3 ~]# hdfs dfs -ls -R /bdr-test/grandtour.db
drwxrwxrwx - hive supergroup 0 2021-08-18 18:40 /bdr-test/grandtour.db/madagascar
-rwxrwxrwx 3 hive supergroup 10 2021-08-18 18:10 /bdr-test/grandtour.db/madagascar/000000_0
[root@c241-node3 ~]# hdfs dfs -cat /bdr-test/grandtour.db/madagascar/000000_0
james may

The reason the HDFS listing gets created this way is that this is the table location. The create_database call from the client comes with "locationUri:/warehouse/tablespace/managed/hive/grandtour.db", while the create_table call from the client comes with "location:/bdr-test/grandtour.db/madagascar". You can verify this in the Hive metastore log; note the locationUri in these messages.

For the database:

2021-08-18 18:40:47,421 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-7-thread-196]: 203: source:172.xx.xx.xx create_database: Database(name:grandtour, description:null, locationUri:/warehouse/tablespace/managed/hive/grandtour.db, parameters:{}, ownerName:hive, ownerType:USER, catalogName:hive, createTime:1629310091)

For the table:

2021-08-18 18:40:47,563 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-7-thread-198]: 205: source:172.xx.xx.xx create_table_req: Table(tableName:madagascar, dbName:grandtour, owner:hive, createTime:1629310138, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:name, type:string, comment:null)], location:/bdr-test/grandtour.db/madagascar, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{external.table.purge=true, numRows=1, rawDataSize=9, transient_lastDdlTime=1629310257, numFilesErasureCoded=0, totalSize=10, EXTERNAL=TRUE, COLUMN_STATS_ACCURATE={"BASIC_STATS":"true"}, numFiles=1}, viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE, catName:hive, ownerType:USER)

Now, how to fix the database location?
This issue has been resolved in Cloudera Manager 7.4.4.
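For clusters that cannot upgrade right away, one possible manual workaround is sketched below. This is an assumption on my part, not the official fix: ALTER DATABASE ... SET LOCATION is available in Hive 2.2.1 and later, and it only changes the default location recorded in the metastore; it does not move existing data.

```shell
# Sketch: assemble (and optionally apply) the DDL that repoints the database
# location in the metastore. build_alter_db only builds the statement string;
# the commented-out beeline call is what would actually run it.
build_alter_db() {
  printf "ALTER DATABASE %s SET LOCATION '%s';" "$1" "$2"
}
sql=$(build_alter_db grandtour "hdfs://hostname.cloudera.com:8020/bdr-test/grandtour.db")
echo "$sql"
# beeline -u "$JDBC_URL" -e "$sql"
```

Test the statement on a scratch database first, since behavior can vary between Hive versions.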
09-20-2021
11:58 AM
1 Kudo
In this example, I am importing encryption keys from an HDP 3.1.5 cluster to an HDP 2.6.5 cluster.

Create the key "testkey" in Ranger KMS on the HDP 3.1.5 cluster with the steps in List and Create Keys. In HDP 3.1.5, check the current master key (Encryption Key).

Create an encryption zone with "testkey":

[hdfs@c241-node3 ~]$ hdfs crypto -createZone -keyName testkey -path /testEncryptionZone
Added encryption zone /testEncryptionZone

List to confirm the zone and key:

[hdfs@c241-node3 ~]$ hdfs crypto -listZones
/testEncryptionZone testkey

Export the keys:

1. Log in to the KMS host.
2. Export JAVA_HOME.
3. cd /usr/hdp/current/ranger-kms
4. Run ./exportKeysToJCEKS.sh $filename

The output will look as follows:

[root@c241-node3 ranger-kms]# export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.292.b10-1.el7_9.x86_64/jre
[root@c241-node3 ranger-kms]# ./exportKeysToJCEKS.sh /tmp/hdp315keys.keystore
Enter Password for the keystore FILE :
Enter Password for the KEY(s) stored in the keystore:
Keys from Ranger KMS Database has been successfully exported into /tmp/hdp315keys.keystore

On the HDP 2.6.5 cluster, where we need to import the keys, do the following:

1. Log in to the KMS host.
2. Add org.apache.hadoop.crypto.key.**; to the property jceks.key.serialFilter. This needs to be changed in the following file on the KMS host only: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.292.b10-1.el7_9.x86_64/jre/lib/security/java.security. After the change, the entry in the file should look like this:

jceks.key.serialFilter = java.lang.Enum;java.security.KeyRep;\
java.security.KeyRep$Type;javax.crypto.spec.SecretKeySpec;org.apache.hadoop.crypto.key.**;!*

3. Export JAVA_HOME, RANGER_KMS_HOME, RANGER_KMS_CONF, and SQL_CONNECTOR_JAR.
4. cd /usr/hdp/current/ranger-kms/
5. Run ./importJCEKSKeys.sh $filename JCEKS

The output looks like this:

[root@c441-node3 ranger-kms]# export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.292.b10-1.el7_9.x86_64/jre
[root@c441-node3 ranger-kms]# export RANGER_KMS_HOME=/usr/hdp/2.6.5.0-292/ranger-kms
[root@c441-node3 ranger-kms]# export RANGER_KMS_CONF=/etc/ranger/kms/conf
[root@c441-node3 ranger-kms]# export SQL_CONNECTOR_JAR=/var/lib/ambari-agent/tmp/mysql-connector-java.jar
[root@c441-node3 security]# cd /usr/hdp/current/ranger-kms/
[root@c441-node3 ranger-kms]# ./importJCEKSKeys.sh /tmp/hdp315keys.keystore JCEKS
Enter Password for the keystore FILE :
Enter Password for the KEY(s) stored in the keystore:
2021-08-12 23:58:06,729 ERROR RangerKMSDB - DB Flavor could not be determined
Keys from /tmp/hdp315keys.keystore has been successfully imported into RangerDB.

To confirm that the encryption keys are imported, check the ranger_keystore table in the HDP 2.6.5 cluster's database for the "testkey" entry. Also check that the master key in HDP 2.6.5 is untouched; it is still the one that Ranger KMS created.

Now create an encryption zone in HDP 2.6.5 with the imported key:

[hdfs@c441-node3 ~]$ hdfs dfs -mkdir /testEncryptionZone-265
[hdfs@c441-node3 ~]$ hdfs crypto -createZone -keyName testkey -path /testEncryptionZone-265
Added encryption zone /testEncryptionZone-265

Confirm the zone and key:

[hdfs@c441-node3 ~]$ hdfs crypto -listZones
/testEncryptionZone-265 testkey

Now for the distcp: note that it needs /.reserved/raw before the encryption zone path, plus the -px option. Command:

hadoop distcp -px /.reserved/raw/$encryptionZonePath/filename hdfs://destination/.reserved/raw/$encryptionZonePath/filename

Check this document to read about these options: Configuring Apache HDFS Encryption.

Following is the output of distcp. It is truncated but shows the copied file. Note that skipCRC is false.

[hdfs@c241-node3 ~]$ hadoop distcp -px /.reserved/raw/testEncryptionZone/text.txt hdfs://172.25.37.10:8020/.reserved/raw/testEncryptionZone-265/
ERROR: Tools helper /usr/hdp/3.1.5.0-152/hadoop/libexec/tools/hadoop-distcp.sh was
not found.
21/08/13 01:52:58 INFO tools.DistCp: Input Options:
DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false,
ignoreFailures=false, overwrite=false, append=false, useDiff=false, useRdiff=false,
fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true,
numListstatusThreads=0, maxMaps=20, mapBandwidth=0.0, copyStrategy='uniformsize',
preserveStatus=[XATTR], atomicWorkPath=null, logPath=null, sourceFileListing=null,
sourcePaths=[/.reserved/raw/testEncryptionZone/text.txt],
targetPath=hdfs://172.25.37.10:8020/.reserved/raw/testEncryptionZone-265,
filtersFile='null', blocksPerChunk=0, copyBufferSize=8192, verboseLog=false,
directWrite=false}, sourcePaths=[/.reserved/raw/testEncryptionZone/text.txt],
targetPathExists=true, preserveRawXattrs=false
<TRUNCATED>
21/08/13 01:52:59 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 1; dirCnt = 0
21/08/13 01:52:59 INFO tools.SimpleCopyListing: Build file listing completed.
21/08/13 01:52:59 INFO tools.DistCp: Number of paths in the copy list: 1
21/08/13 01:52:59 INFO tools.DistCp: Number of paths in the copy list: 1
<TRUNCATED>
DistCp Counters
Bandwidth in Btyes=21
Bytes Copied=21
Bytes Expected=21
Files Copied=1

Another question that came up: what happens to old keys when I import a new key? The new key just gets added to the existing keys.
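The /.reserved/raw prefixing used in the distcp above can be captured in a tiny helper. This is just a sketch (the helper name is mine), reusing the same paths as the example:

```shell
# Prefix an HDFS path with /.reserved/raw so that distcp -px copies the raw
# encrypted bytes and EDEK xattrs instead of decrypting through the KMS.
raw_path() {
  printf '/.reserved/raw%s' "$1"
}

src=$(raw_path /testEncryptionZone/text.txt)
dst="hdfs://172.25.37.10:8020$(raw_path /testEncryptionZone-265/)"
echo "hadoop distcp -px $src $dst"
```

Both the source and the destination must use the raw prefix, or distcp will fail or silently re-encrypt.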
05-22-2021
02:35 PM
@LPottakkattil Sorry for the late reply; yes, I can confirm that no alteration will happen.
03-31-2021
02:37 AM
1 Kudo
Ok, we found the very stupid issue. This specific job, running standalone, was passing the hive-site.xml file to spark-submit, whereas all other jobs run under Oozie and use a generic spark-submit that doesn't pass the hive-site.xml file. This file specifies /tmp/hive as the default directory for dumping temporary resources, and it turned out that our user still has issues with that folder, issues that are being investigated. The workaround so far is to not pass the hive-site.xml file, so the default directory is /tmp instead, where we can happily live without issues. All in all, it was a stupid "mistake" that made us aware of other issues with our current system. Cheers and thanks to all for the support!
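An alternative we did not try, so treat it as an untested sketch: keep passing hive-site.xml but override just the scratch directory on the spark-submit command line, since spark.hadoop.* settings are forwarded into the Hadoop/Hive configuration.

```shell
# Hypothetical alternative: override hive.exec.scratchdir (the /tmp/hive
# default that caused the trouble) instead of dropping hive-site.xml.
# "your_job.py" is a placeholder for the actual application.
cmd="spark-submit --files hive-site.xml --conf spark.hadoop.hive.exec.scratchdir=/tmp your_job.py"
echo "$cmd"
```

This would keep the rest of the Hive settings intact while sidestepping the problematic /tmp/hive directory.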
03-30-2021
01:37 AM
Hi, I saw this error while selecting data from one Hive table and inserting it into another. It is resolved now after restarting YARN.
03-20-2021
03:24 AM
I'm using PySpark with Hadoop 2.7.7, and I'm not using LLAP.
03-19-2021
11:01 AM
If you keep seeing this, check the bash profile on all nodes to see whether the umask is set to something else. It should be 0022.
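A quick check that could be run on each node looks like the sketch below (check_umask is my own helper name; 0022 is the value the advice above expects):

```shell
# check_umask: succeed only when the current umask equals the expected value.
check_umask() {
  expected="${1:-0022}"
  [ "$(umask)" = "$expected" ]
}

if check_umask 0022; then
  echo "umask OK"
else
  echo "umask is $(umask); expected 0022"
fi
```

Remember that the umask set in a profile only applies to new shells, so re-login (or restart the affected services) after changing it.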
03-19-2021
10:47 AM
Do you have SPNEGO enabled for browsers? https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.0.1/authentication-with-kerberos/content/authe_spnego_enabling_browser_access_to_a_spnego_enabled_web_ui.html Are you seeing any error on the UI?