Member since 01-19-2017
Posts: 3679
Kudos Received: 632
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1001 | 06-04-2025 11:36 PM |
| | 1568 | 03-23-2025 05:23 AM |
| | 784 | 03-17-2025 10:18 AM |
| | 2824 | 03-05-2025 01:34 PM |
| | 1862 | 03-03-2025 01:09 PM |
06-14-2019
07:57 AM
@Michael Bronson Use the journal directory from the healthy JournalNode (the one serving the active NameNode). After saving the namespace, also wipe out the other JournalNode that had edits_inprogress_0000000000018783114.empty, and remember to back up/zip all the JournalNode directories first as good practice. Once you have copied the good edits to all 3 destinations, proceed; when you start the NameNodes after starting the JournalNodes, one should become active and the other standby thanks to the ZKFailoverController.
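A hedged sketch of that backup-and-copy (the journal path /hadoop/hdfs/journal/hdfsha/current comes from this thread; jn3 is a hypothetical hostname for the bad JournalNode, so substitute your own):
On every JournalNode, back up the current journal directory first
$ cd /hadoop/hdfs/journal/hdfsha/current
$ tar -zcvf /tmp/jn_backup_$(hostname).tar.gz .
From the healthy JournalNode, copy the good edits to the bad one
$ scp -r /hadoop/hdfs/journal/hdfsha/current/* root@jn3:/hadoop/hdfs/journal/hdfsha/current/
Then fix ownership on the destination before restarting that JournalNode
$ chown -R hdfs:hadoop /hadoop/hdfs/journal/hdfsha/current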
06-14-2019
06:34 AM
@Michael Bronson Can you confirm that the other 2 JournalNodes have a last-promised-epoch of 30? If that is where the failure occurred, it's okay to replace the contents of /hadoop/hdfs/journal/hdfsha/current/* with the contents from the good (active) NameNode's JournalNode, then proceed with the subsequent steps.
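For reference, a quick way to cross-check (same journal path as used in this thread) is to read the epoch files on each JournalNode host:
$ cat /hadoop/hdfs/journal/hdfsha/current/last-promised-epoch
$ cat /hadoop/hdfs/journal/hdfsha/current/last-writer-epoch
The healthy nodes should agree, and the corrupted one will typically show a lower value, as noted in the steps below.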
06-13-2019
10:04 PM
1 Kudo
@Shashank Naresh Great news! If you found this answer addressed your question, please take a moment to log in and click the "Accept" link on the answer. That would be a great help to Community users looking for a quick solution to these kinds of errors.
06-13-2019
10:02 PM
@Shashank Naresh What do you mean by choosing only one network adapter? There are different adapters, and all of them are network adapters. Can you elaborate?
06-13-2019
09:38 PM
2 Kudos
@Michael Bronson Yes, it's possible to recover from this situation, which happens sometimes in a NameNode HA setup. JournalNodes form a distributed system for storing the edit log. The active NameNode, acting as a client, writes edits to the JournalNodes and commits only once an edit has been replicated to a quorum (majority) of them. The standby NameNode needs to read the edits to stay in sync with the active one, and it can read from any of the replicas stored on the JournalNodes. ZKFC makes sure that only one NameNode is active at a time. However, when a failover occurs, it is still possible that the previous active NameNode serves read requests to clients, which may be out of date until that NameNode shuts down when trying to write to the JournalNodes. For this reason, we should configure fencing methods even when using the Quorum Journal Manager.
To implement fencing, the Quorum Journal Manager uses epoch numbers. An epoch number is an integer that only ever increases and is unique once assigned. The NameNode generates an epoch number using a simple algorithm and includes it in RPC requests to the QJM. When you configure NameNode HA, the first active NameNode gets epoch value 1. On each failover or restart the epoch number is incremented, and the NameNode with the higher epoch number is considered newer than any NameNode with an earlier epoch number.
Now let's proceed with the real case. Note the hostname of the healthy NameNode. Assuming you are logged on as root, here is how to fix one corrupted JournalNode's edits:
# su - hdfs
1) Put both NameNodes in safe mode (NN HA)
$ hdfs dfsadmin -safemode enter
Sample output:
Safe mode is ON in namenode1/xxx.xxx.xx.xx:8020
Safe mode is ON in namenode2/xxx.xxx.xx.xx:8020
2) Save the namespace
$ hdfs dfsadmin -saveNamespace
3) On the non-working node, change directory to /hadoop/hdfs/journal/hdfsha/current/ and get the epoch. Note the number; it should be lower than the one on the working node (cross-check).
$ cat last-promised-epoch
4) On the non-working node, back up all the files in the journal dir /hadoop/hdfs/journal/hdfsha/current/*; they should look like below
-rw-r--r-- 1 hdfs hadoop 1019566 Jun 10 09:45 edits_0000000000000928232-0000000000000935461
-rw-r--r-- 1 hdfs hadoop 1014516 Jun 10 15:45 edits_0000000000000935462-0000000000000942657
-rw-r--r-- 1 hdfs hadoop 1017540 Jun 10 21:46 edits_0000000000000942658-0000000000000949874
-rw-r--r-- 1 hdfs hadoop 1048576 Jun 10 23:36 edits_0000000000000949875-0000000000000952088
-rw-r--r-- 1 hdfs hadoop 1048576 Jun 13 22:27 edits_inprogress_0000000000000952089
-rw-r--r-- 1 hdfs hadoop 277083 Jun 10 21:46 fsimage_0000000000000949874
-rw-r--r-- 1 hdfs hadoop 62 Jun 10 21:46 fsimage_0000000000000949874.md5
-rw-r--r-- 1 hdfs hadoop 276740 Jun 13 22:13 fsimage_0000000000000952088
-rw-r--r-- 1 hdfs hadoop 62 Jun 13 22:13 fsimage_0000000000000952088.md5
-rw-r--r-- 1 hdfs hadoop 7 Jun 13 22:13 seen_txid
-rw-r--r-- 1 hdfs hadoop 206 Jun 13 22:13 VERSION
5) While in the current directory, back up all the files; note the (.) indicating the current dir
$ tar -zcvf editsbck.tar.gz .
6) Move editsbck.tar.gz to a safe location
$ scp editsbck.tar.gz /home/bronson
7) Back up or move any directory therein, e.g.
$ mv paxos paxos.bck
8) Delete all files in /hadoop/hdfs/journal/hdfsha/current/ on the bad node; remember you have the backup editsbck.tar.gz
$ rm -rf /hadoop/hdfs/journal/hdfsha/current/*
9) Zip or tar the journal dir /hadoop/hdfs/journal/hdfsha/current/ from a working JournalNode (again while inside the current directory)
$ tar -zcvf good_editsbck.tar.gz .
10) Copy good_editsbck.tar.gz to the non-working JournalNode, to the same path as on the working node, /hadoop/hdfs/journal/hdfsha/current/
# scp good_editsbck.tar.gz root@namenode2:/hadoop/hdfs/journal/hdfsha/current/
11) Untar the files
# tar xvzf good_editsbck.tar.gz -C /hadoop/hdfs/journal/hdfsha/current/
12) Change ownership to hdfs; the -R (recursive) covers any directories
# chown -R hdfs:hadoop /hadoop/hdfs/journal/hdfsha/current/*
Log on to the unhealthy node.
13) Restart the JournalNodes. Start all 3 JournalNodes; note that I run the command as root. If they were already running, the start will report "journalnode running as process xxxx. Stop it first.", so stop them first.
14) Stopping a JournalNode
# su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-journalnode/../hadoop/sbin/hadoop-daemon.sh stop journalnode"
15) Starting a JournalNode
# su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-journalnode/../hadoop/sbin/hadoop-daemon.sh start journalnode"
Restart HDFS from the Ambari UI. After a few minutes the alerts should clear and you should see healthy active and standby NameNodes. All should be fine now; NameNode failover should occur transparently and the alerts should gradually disappear. HTH
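As a hedged follow-up check (not part of the original steps), once both NameNodes are back up you can confirm the HA state with hdfs haadmin; nn1 and nn2 are placeholder NameNode service IDs taken from dfs.ha.namenodes in your hdfs-site.xml, so use your own:
$ hdfs haadmin -getServiceState nn1
$ hdfs haadmin -getServiceState nn2
One should report active and the other standby.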
06-10-2019
11:50 AM
@sugata kar If you already have an SSL cert file, then you can generate your own JKS file and import your cert into it. Run the following commands. Assumptions: alias = sugatajks, SSL cert = exadata.crt
keytool -genkey -alias sugatajks -keystore exadata.jks -storepass {hidden_passwd}
keytool -delete -alias sugatajks -keystore exadata.jks -storepass {hidden_passwd}
keytool -import -alias sugatajks -file /etc/pki/CA/certs/exadata.crt -keypass {hidden_passwd} -keystore exadata.jks -storepass {hidden_passwd}
Now run the import command:
$ sqoop import --connect "jdbc:mysql://hadoop.node1.com:3306/test;username={hidden_username};password={hidden_passwd};encrypt=true;trustServerCertificate=false;trustStore=/exadata.jks" + {options}
e.g. --table customer --fields-terminated-by , --escaped-by \\ --enclosed-by '"' --compress -m 1 --target-dir /user/sugata/ --append --hive-drop-import-delims -- --schema exadat --table-hints NOLOCK
Hope that gives you the idea.
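One optional, hedged extra step before running the import: confirm the certificate actually landed in the keystore (the alias and file names follow the assumptions above):
$ keytool -list -v -alias sugatajks -keystore exadata.jks -storepass {hidden_passwd}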
06-10-2019
10:27 AM
@Adrián Gil Can you share the below entries of your ambari.ini and your ambari.properties?
[agent]
...
[security]
...
[heartbeat]
...
[logging]
...
Please revert.
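If it helps, here is one hedged way to pull just those sections in one go; it assumes the agent config sits at the default /etc/ambari-agent/conf/ambari-agent.ini, so adjust the path if your ambari.ini lives elsewhere:
$ grep -A 10 -E '^\[(agent|security|heartbeat|logging)\]' /etc/ambari-agent/conf/ambari-agent.ini
(-A 10 just prints roughly the next ten lines after each section header; tune it to your file.)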
06-10-2019
08:17 AM
@sugata kar Yes, you can use SSL/TLS with Sqoop, but you have to do a couple of configurations and set up the keystore; see "Sqoop 2 shell support for TLS/SSL". With Sqoop 1.4.5 you can use the Hadoop credential provider API. The CredentialProvider API in Hadoop separates applications from how they store their required passwords/secrets; see the example below.
# Encrypting the Sqoop password
Generating the jceks file; the password you enter should be the database password
$ hadoop credential create mysql.testDB.alias -provider jceks://hdfs/user/sugata/mysql.testDB.password.jceks
Enter password:
Enter password again:
mysql.testDB.alias has been successfully created.
org.apache.hadoop.security.alias.JavaKeyStoreProvider has been updated.
Validating the creation
$ hdfs dfs -ls /user/sugata
Found 1 items
-rwx------ 3 sheltong hdfs 503 2019-06-09 01:40 /user/sugata/mysql.testDB.password.jceks
Running the Sqoop import with the jceks alias
$ sqoop import -Dhadoop.security.credential.provider.path=jceks://hdfs/user/sugata/mysql.testDB.password.jceks --driver com.mysql.jdbc.Driver --connect jdbc:mysql://hadoop.node1.com:3306/test --username sugata --password-alias mysql.testDB.alias --table "customer" --target-dir /user/sugata/test
Success output:
Warning: /usr/hdp/2.6.2.0-205/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/09/02 02:08:04 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.2.0-205
18/09/02 02:08:06 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
18/09/02 02:08:06 INFO manager.SqlManager: Using default fetchSize of 1000
18/09/02 02:08:06 INFO tool.CodeGenTool: Beginning code generation
18/09/02 02:08:07 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM customer AS t WHERE 1=0
18/09/02 02:08:07 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM customer AS t WHERE 1=0
18/09/02 02:08:07 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.6.2.0-205/hadoop-mapreduce
Note: /tmp/sqoop-sheltong/compile/32c3e11ab1e1878e6ca7638a96feb30b/customer.java uses or overrides a deprecated API.
Physical memory (bytes) snapshot=669270016
Virtual memory (bytes) snapshot=18275794944
Total committed heap usage (bytes)=331350016
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=243892
18/09/02 02:11:48 INFO mapreduce.ImportJobBase: Transferred 238.1758 KB in 218.8164 seconds (1.0885 KB/sec)
18/09/02 02:11:48 INFO mapreduce.ImportJobBase: Retrieved 2170 records.
Sqoop import results in HDFS
$ hdfs dfs -ls /user/sugata/test
Found 5 items
-rw-r--r-- 3 sugata hdfs 0 2019-06-09 02:11 /user/sugata/test/_SUCCESS
-rw-r--r-- 3 sugata hdfs 60298 2019-06-09 02:10 /user/sugata/test/part-m-00000
-rw-r--r-- 3 sugata hdfs 60894 2019-06-09 02:10 /user/sugata/test/part-m-00001
-rw-r--r-- 3 sugata hdfs 62050 2019-06-09 02:11 /user/sugata/test/part-m-00002
-rw-r--r-- 3 sugata hdfs 60650 2019-06-09 02:11 /user/sugata/test/part-m-00003
So there you have the solution.
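As a small, hedged extra: the same hadoop credential CLI can list or remove the stored alias later, using the provider path from the example above:
$ hadoop credential list -provider jceks://hdfs/user/sugata/mysql.testDB.password.jceks
$ hadoop credential delete mysql.testDB.alias -provider jceks://hdfs/user/sugata/mysql.testDB.password.jceks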
06-09-2019
07:40 PM
@Nani Bigdata Could you please share a screenshot of the command-line snippet?
06-08-2019
06:39 PM
@Nani Bigdata Could you please try this different approach: invoke beeline as the user hive
$ beeline
Beeline version 0.14.0.2.2.7.1-10 by Apache Hive
beeline> !connect jdbc:hive2://headnodehost:10001/;transportMode=http admin
Hope that helps.
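An equivalent non-interactive form, in case the shell prompt is awkward (headnodehost:10001 and the admin user are taken from the example above; add -p for the password if your cluster requires one):
$ beeline -u "jdbc:hive2://headnodehost:10001/;transportMode=http" -n admin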