Member since
01-09-2019
401
Posts
163
Kudos Received
80
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2605 | 06-21-2017 03:53 PM |
| | 4312 | 03-14-2017 01:24 PM |
| | 2404 | 01-25-2017 03:36 PM |
| | 3842 | 12-20-2016 06:19 PM |
| | 2103 | 12-14-2016 05:24 PM |
05-03-2016
01:43 PM
Was there an earlier install of HDP, either successful or unsuccessful? It's important to clean up failed installs, or to clean up existing installs before a reinstall. After host registration, Ambari gives a good estimate of issues that can affect your install. You can take a look at https://community.hortonworks.com/questions/1110/how-to-completely-remove-uninstall-ambari-and-hdp.html, though this is geared towards CentOS. For this specific issue, you can unlink /usr/hdp/current/zookeeper-client/ and retry. This should take care of it.
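A minimal sketch of that fix (the helper function is mine; the path is the default HDP layout). Note that `unlink` removes only the symlink itself, never the directory it points to:

```shell
remove_stale_link() {
  # usage: remove_stale_link <path>; no-op unless <path> is a symlink
  if [ -L "$1" ]; then
    unlink "$1"
  fi
}

# On the failing host, then retry the install step in Ambari:
# remove_stale_link /usr/hdp/current/zookeeper-client
```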
05-03-2016
05:48 AM
1 Kudo
As a workaround, you can try manual registration of the Ambari agents, since this is only a 5-node cluster. Steps are here.
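A hedged sketch of manual registration on CentOS (paths are the defaults; the helper below just points an agent's config at the Ambari server host, which is the key step):

```shell
# On each of the 5 nodes, first install the agent, e.g.:
#   yum install -y ambari-agent

set_ambari_server() {
  # usage: set_ambari_server <ambari-agent.ini> <ambari-server-host>
  # rewrites the hostname= line under [server] to point at the Ambari server
  sed -i "s/^hostname=.*/hostname=$2/" "$1"
}

# set_ambari_server /etc/ambari-agent/conf/ambari-agent.ini ambari.example.com
# ambari-agent start
```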
05-03-2016
04:53 AM
@Christian Guegi You could write your audits to an NFS share with a shell action, then load everything from that folder into HDFS every few minutes. This is an example of a microbatching strategy. You could also put the audits onto a JMS queue and use storm-jms to direct them to HDFS; this is a streaming approach.
05-03-2016
04:16 AM
I think you should open a new question for this. Please take a look at the HBase Master logs. The most likely cause is that the HBase Master is missing the Phoenix jars.
05-03-2016
03:49 AM
2 Kudos
It looks like you are executing one insert per row, and I see you have 900-byte files in HDFS. The most likely reason for your error is two workflows running in parallel, both trying to insert into the table. Even if you get the insert flow right, 900-byte files in HDFS will hurt Hive performance and overload the NameNode. You should change your Oozie workflow and consider microbatching or streaming your data into HDFS/Hive. For example, you could write your audits to an NFS share with a shell action and load everything from that folder into HDFS every few minutes (microbatching), or put the audits onto a JMS queue and use storm-jms to direct them to HDFS (streaming).
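A minimal sketch of the microbatch loader (function name and paths are mine; adjust the `hdfs` invocation for your cluster). It moves each audit file from the NFS share into HDFS, deleting the local copy only after a successful put, and is meant to be run from cron every few minutes:

```shell
batch_load() {
  # usage: batch_load <local-audit-dir> <hdfs-target-dir>
  for f in "$1"/*.log; do
    [ -e "$f" ] || continue              # glob matched nothing: dir is empty
    hdfs dfs -put "$f" "$2"/ && rm -f "$f"
  done
}

# batch_load /mnt/nfs/audits /data/audits
```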
05-03-2016
02:34 AM
1 Kudo
You can use NiFi to ingest logs into Solr/Elasticsearch and visualize them with Banana/Kibana. You can also use NiFi to ingest them into Splunk. Since all Hadoop component logs use log4j, it is easy to standardize log collection.
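As one hypothetical starting point (host and port are placeholders), a component's log4j.properties can add a syslog appender alongside its normal appenders, which a NiFi ListenSyslog processor can then pick up:

```properties
log4j.rootLogger=INFO, console, remote
log4j.appender.remote=org.apache.log4j.net.SyslogAppender
log4j.appender.remote.SyslogHost=nifi-host.example.com:1514
log4j.appender.remote.Facility=LOCAL0
log4j.appender.remote.layout=org.apache.log4j.PatternLayout
log4j.appender.remote.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
```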
05-03-2016
02:11 AM
3 Kudos
You are right that it is not on the NN UI, but you can get this from JMX (see `LastCheckpointTime`, which here shows 1462223090012):

```json
{
  "name" : "Hadoop:service=NameNode,name=FSNamesystem",
  "modelerType" : "FSNamesystem",
  "tag.Context" : "dfs",
  "tag.HAState" : "standby",
  "tag.TotalSyncTimes" : "",
  "tag.Hostname" : "demo2.cloud.hortonworks.com",
  "MissingBlocks" : 0,
  "MissingReplOneBlocks" : 0,
  "ExpiredHeartbeats" : 0,
  "TransactionsSinceLastCheckpoint" : -8853,
  "TransactionsSinceLastLogRoll" : 0,
  "LastWrittenTransactionId" : 50372,
  "LastCheckpointTime" : 1462223090012,
  "CapacityTotal" : 44338987008,
  "CapacityTotalGB" : 41.0,
  "CapacityUsed" : 4014164154,
  "CapacityUsedGB" : 4.0,
  "CapacityRemaining" : 13009718052,
  "CapacityRemainingGB" : 12.0,
  "CapacityUsedNonDFS" : 27315104802,
  "TotalLoad" : 22,
  "SnapshottableDirectories" : 0,
  "Snapshots" : 0,
  "LockQueueLength" : 0,
  "BlocksTotal" : 1145,
  "NumFilesUnderConstruction" : 4,
  "NumActiveClients" : 4,
  "FilesTotal" : 1385,
  "PendingReplicationBlocks" : 0,
  "UnderReplicatedBlocks" : 0,
  "CorruptBlocks" : 0,
  "ScheduledReplicationBlocks" : 0,
  "PendingDeletionBlocks" : 0,
  "ExcessBlocks" : 0,
  "PostponedMisreplicatedBlocks" : 0,
  "PendingDataNodeMessageCount" : 4,
  "MillisSinceLastLoadedEdits" : 39983,
  "BlockCapacity" : 2097152,
  "StaleDataNodes" : 0,
  "TotalFiles" : 1385,
  "TotalSyncCount" : 0
}
```

You could also monitor the Standby NN log for this. Below is the log from the Standby NN during a checkpoint:

```
2016-05-02 17:04:49,810 INFO ha.StandbyCheckpointer (StandbyCheckpointer.java:doWork(336)) - Triggering checkpoint because it has been 21600 seconds since the last checkpoint, which exceeds the configured interval 21600
2016-05-02 17:04:49,810 INFO namenode.FSImage (FSImage.java:saveNamespace(1090)) - Save namespace ...
2016-05-02 17:04:50,014 INFO namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:getImageTxIdToRetain(203)) - Going to retain 2 images with txid >= 50371
2016-05-02 17:04:50,187 INFO namenode.TransferFsImage (TransferFsImage.java:setTimeout(443)) - Image Transfer timeout configured to 60000 milliseconds
2016-05-02 17:04:50,339 WARN namenode.FSNamesystem (FSNamesystem.java:getCorruptFiles(7324)) - Get corrupt file blocks returned error: Operation category READ is not supported in state standby
2016-05-02 17:04:50,363 INFO namenode.TransferFsImage (TransferFsImage.java:uploadImageFromStorage(237)) - Uploaded image with txid 59225 to namenode at http://demo1.cloud.hortonworks.com:50070 in 0.222 seconds
2016-05-02 17:04:53,644 WARN namenode.FSNamesystem (FSNamesystem.java:getCorruptFiles(7324)) - Get corrupt file blocks returned error: Operation category READ is not supported in state standby
2016-05-02 17:05:19,677 INFO ha.EditLogTailer (EditLogTailer.java:triggerActiveLogRoll(271)) - Triggering log roll on remote NameNode xomdemo1.cloud.hortonworks.com/172.24.64.97:8020
2016-05-02 17:05:21,042 INFO namenode.FSImage (FSImage.java:loadEdits(834)) - Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@6fba8d60 expecting start txid #59226
```
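If you want to script this rather than read logs, a hedged sketch: pull the bean over Hadoop's standard `/jmx` HTTP endpoint (host/port from the output above; the python parsing is my assumption) and convert the millisecond timestamp into "seconds since last checkpoint":

```shell
secs_since_checkpoint() {
  # usage: secs_since_checkpoint <LastCheckpointTime-in-epoch-millis>
  echo $(( $(date +%s) - $1 / 1000 ))
}

# last_ms=$(curl -s 'http://demo2.cloud.hortonworks.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' \
#             | python -c 'import sys,json; print(json.load(sys.stdin)["beans"][0]["LastCheckpointTime"])')
# secs_since_checkpoint "$last_ms"   # alert if this exceeds your checkpoint interval
```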
05-03-2016
01:07 AM
1 Kudo
I was able to reproduce this issue, and it looks like a Parquet jar version mismatch on the sandbox. Is there a reason for using Parquet instead of ORC here? While both are supported in Hive, ORC has advantages since some of the Stinger initiative improvements to Hive take advantage of ORC. Here is an example of using ORC from Sqoop:

```
sqoop import \
  --connect "jdbc:sqlserver://<ServerName>:<PortNo>;database=<DatabaseName>" \
  --username <UserName> --password <Password> \
  --table <TableName> \
  --columns "COLUMN1, COLUMN2" \
  --where "COLUMN1 = 7390" \
  --hcatalog-database default \
  --hcatalog-table my_table_orc \
  --create-hcatalog-table \
  --hcatalog-storage-stanza "stored as orcfile"
```
05-02-2016
10:15 PM
1 Kudo
The IoT demo configs are available under "Advanced demo-env", but do not require updating, as all required configs will be auto-populated:

- Ambari host
- Name node host/port
- Nimbus host
- Hive metastore host/port
- Supervisor host
- HBase master host
- Kafka host/port (also where ActiveMQ will be installed)

You can click on IoT-demo in Ambari, go to Advanced demo-env, and set the correct HBase master (and any other hosts that are different).
05-02-2016
09:00 PM
How big is this specific ORC file, and can it be shared with us? Can you also check whether the job hangs in one of the mappers (the one reading this ORC file), or before it even gets into the application/mapper in YARN?