Member since
01-09-2019
401
Posts
163
Kudos Received
80
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2605 | 06-21-2017 03:53 PM |
| | 4312 | 03-14-2017 01:24 PM |
| | 2404 | 01-25-2017 03:36 PM |
| | 3842 | 12-20-2016 06:19 PM |
| | 2103 | 12-14-2016 05:24 PM |
05-03-2016
01:43 PM
Was there an earlier install of HDP, either successful or unsuccessful? It's important to clean up failed installs, or to clean up existing installs before a reinstall. After host registration, Ambari gives a good estimate of issues that can affect your install. You can take a look at https://community.hortonworks.com/questions/1110/how-to-completely-remove-uninstall-ambari-and-hdp.html, though this is geared towards CentOS. For this specific issue, you can unlink /usr/hdp/current/zookeeper-client/ and retry. This should take care of it.
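A minimal sketch of that fix (the helper function is mine; the path is the default HDP layout). Note that `unlink` removes only the symlink itself, never the directory it points to:

```shell
remove_stale_link() {
  # usage: remove_stale_link <path>; no-op unless <path> is a symlink
  if [ -L "$1" ]; then
    unlink "$1"
  fi
}

# On the failing host, then retry the install step in Ambari:
# remove_stale_link /usr/hdp/current/zookeeper-client
```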
05-03-2016
05:48 AM
1 Kudo
As a workaround, you can try manual registration of the Ambari agents, since this is only a 5-node cluster. Steps are here.
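A hedged sketch of manual registration on CentOS (paths are the defaults; the helper below just points an agent's config at the Ambari server host, which is the key step):

```shell
# On each of the 5 nodes, first install the agent, e.g.:
#   yum install -y ambari-agent

set_ambari_server() {
  # usage: set_ambari_server <ambari-agent.ini> <ambari-server-host>
  # rewrites the hostname= line under [server] to point at the Ambari server
  sed -i "s/^hostname=.*/hostname=$2/" "$1"
}

# set_ambari_server /etc/ambari-agent/conf/ambari-agent.ini ambari.example.com
# ambari-agent start
```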
05-03-2016
04:53 AM
@Christian Guegi You could write your audits to an NFS share with a shell action, then load everything from that folder into HDFS every few minutes. This is an example of a microbatching strategy. You could also put the audits onto a JMS queue and use storm-jms to direct them to HDFS; this is a streaming approach.
05-03-2016
04:16 AM
I think you should open a new question for this. Please take a look at the HBase Master logs. The most likely cause is that the HBase Master is missing the Phoenix jars.
05-03-2016
03:49 AM
2 Kudos
It looks like you are executing one insert per row, and I see you have 900-byte files in HDFS. The most likely reason for your error is two workflows running in parallel, both trying to insert into the table. Even if you get the insert flow right, 900-byte files in HDFS will hurt Hive performance and overload the NameNode. You should change your Oozie workflow and consider microbatching or streaming your data into HDFS/Hive. For example, you could write your audits to an NFS share with a shell action and load everything from that folder into HDFS every few minutes (microbatching), or put the audits onto a JMS queue and use storm-jms to direct them to HDFS (streaming).
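A minimal sketch of the microbatch loader (function name and paths are mine; adjust the `hdfs` invocation for your cluster). It moves each audit file from the NFS share into HDFS, deleting the local copy only after a successful put, and is meant to be run from cron every few minutes:

```shell
batch_load() {
  # usage: batch_load <local-audit-dir> <hdfs-target-dir>
  for f in "$1"/*.log; do
    [ -e "$f" ] || continue              # glob matched nothing: dir is empty
    hdfs dfs -put "$f" "$2"/ && rm -f "$f"
  done
}

# batch_load /mnt/nfs/audits /data/audits
```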
05-03-2016
02:34 AM
1 Kudo
You can use NiFi to ingest logs into Solr/Elasticsearch and visualize them with Banana/Kibana. You can also use NiFi to ingest them into Splunk. Since all Hadoop component logs use log4j, it is easy to standardize log collection.
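As one hypothetical starting point (host and port are placeholders), a component's log4j.properties can add a syslog appender alongside its normal appenders, which a NiFi ListenSyslog processor can then pick up:

```properties
log4j.rootLogger=INFO, console, remote
log4j.appender.remote=org.apache.log4j.net.SyslogAppender
log4j.appender.remote.SyslogHost=nifi-host.example.com:1514
log4j.appender.remote.Facility=LOCAL0
log4j.appender.remote.layout=org.apache.log4j.PatternLayout
log4j.appender.remote.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
```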
05-03-2016
02:11 AM
3 Kudos
You are right that it is not on the NN UI, but you can get this from JMX (see `LastCheckpointTime`, which here shows 1462223090012):

```json
{
  "name" : "Hadoop:service=NameNode,name=FSNamesystem",
  "modelerType" : "FSNamesystem",
  "tag.Context" : "dfs",
  "tag.HAState" : "standby",
  "tag.TotalSyncTimes" : "",
  "tag.Hostname" : "demo2.cloud.hortonworks.com",
  "MissingBlocks" : 0,
  "MissingReplOneBlocks" : 0,
  "ExpiredHeartbeats" : 0,
  "TransactionsSinceLastCheckpoint" : -8853,
  "TransactionsSinceLastLogRoll" : 0,
  "LastWrittenTransactionId" : 50372,
  "LastCheckpointTime" : 1462223090012,
  "CapacityTotal" : 44338987008,
  "CapacityTotalGB" : 41.0,
  "CapacityUsed" : 4014164154,
  "CapacityUsedGB" : 4.0,
  "CapacityRemaining" : 13009718052,
  "CapacityRemainingGB" : 12.0,
  "CapacityUsedNonDFS" : 27315104802,
  "TotalLoad" : 22,
  "SnapshottableDirectories" : 0,
  "Snapshots" : 0,
  "LockQueueLength" : 0,
  "BlocksTotal" : 1145,
  "NumFilesUnderConstruction" : 4,
  "NumActiveClients" : 4,
  "FilesTotal" : 1385,
  "PendingReplicationBlocks" : 0,
  "UnderReplicatedBlocks" : 0,
  "CorruptBlocks" : 0,
  "ScheduledReplicationBlocks" : 0,
  "PendingDeletionBlocks" : 0,
  "ExcessBlocks" : 0,
  "PostponedMisreplicatedBlocks" : 0,
  "PendingDataNodeMessageCount" : 4,
  "MillisSinceLastLoadedEdits" : 39983,
  "BlockCapacity" : 2097152,
  "StaleDataNodes" : 0,
  "TotalFiles" : 1385,
  "TotalSyncCount" : 0
}
```

You could also monitor the Standby NN log for this. Below is the log from the Standby NN during a checkpoint:

```
2016-05-02 17:04:49,810 INFO ha.StandbyCheckpointer (StandbyCheckpointer.java:doWork(336)) - Triggering checkpoint because it has been 21600 seconds since the last checkpoint, which exceeds the configured interval 21600
2016-05-02 17:04:49,810 INFO namenode.FSImage (FSImage.java:saveNamespace(1090)) - Save namespace ...
2016-05-02 17:04:50,014 INFO namenode.NNStorageRetentionManager (NNStorageRetentionManager.java:getImageTxIdToRetain(203)) - Going to retain 2 images with txid >= 50371
2016-05-02 17:04:50,187 INFO namenode.TransferFsImage (TransferFsImage.java:setTimeout(443)) - Image Transfer timeout configured to 60000 milliseconds
2016-05-02 17:04:50,339 WARN namenode.FSNamesystem (FSNamesystem.java:getCorruptFiles(7324)) - Get corrupt file blocks returned error: Operation category READ is not supported in state standby
2016-05-02 17:04:50,363 INFO namenode.TransferFsImage (TransferFsImage.java:uploadImageFromStorage(237)) - Uploaded image with txid 59225 to namenode at http://demo1.cloud.hortonworks.com:50070 in 0.222 seconds
2016-05-02 17:04:53,644 WARN namenode.FSNamesystem (FSNamesystem.java:getCorruptFiles(7324)) - Get corrupt file blocks returned error: Operation category READ is not supported in state standby
2016-05-02 17:05:19,677 INFO ha.EditLogTailer (EditLogTailer.java:triggerActiveLogRoll(271)) - Triggering log roll on remote NameNode xomdemo1.cloud.hortonworks.com/172.24.64.97:8020
2016-05-02 17:05:21,042 INFO namenode.FSImage (FSImage.java:loadEdits(834)) - Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@6fba8d60 expecting start txid #59226
```
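If you want to script this rather than read logs, a hedged sketch: pull the bean over Hadoop's standard `/jmx` HTTP endpoint (host/port from the output above; the python parsing is my assumption) and convert the millisecond timestamp into "seconds since last checkpoint":

```shell
secs_since_checkpoint() {
  # usage: secs_since_checkpoint <LastCheckpointTime-in-epoch-millis>
  echo $(( $(date +%s) - $1 / 1000 ))
}

# last_ms=$(curl -s 'http://demo2.cloud.hortonworks.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' \
#             | python -c 'import sys,json; print(json.load(sys.stdin)["beans"][0]["LastCheckpointTime"])')
# secs_since_checkpoint "$last_ms"   # alert if this exceeds your checkpoint interval
```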
05-03-2016
01:07 AM
1 Kudo
I was able to reproduce this issue, and it looks like a Parquet jar version mismatch on the sandbox. Is there a reason for using Parquet instead of ORC here? While both are supported in Hive, ORC has advantages since some of the Stinger initiative improvements to Hive take advantage of ORC. Here is an example of using ORC from Sqoop:

```
sqoop import \
  --connect "jdbc:sqlserver://<ServerName>:<PortNo>;database=<DatabaseName>" \
  --username <UserName> --password <Password> \
  --table <TableName> \
  --columns "COLUMN1, COLUMN2" \
  --where "COLUMN1 = 7390" \
  --hcatalog-database default \
  --hcatalog-table my_table_orc \
  --create-hcatalog-table \
  --hcatalog-storage-stanza "stored as orcfile"
```
05-02-2016
10:15 PM
1 Kudo
The IoT demo configs are available under "Advanced demo-env", but do not require updating, as all required configs will be auto-populated:

- Ambari host
- Name node host/port
- Nimbus host
- Hive metastore host/port
- Supervisor host
- HBase master host
- Kafka host/port (also where ActiveMQ will be installed)

You can click on IoT-demo in Ambari, go to Advanced demo-env, and set the correct HBase master (and any other hosts that are different).
05-02-2016
09:00 PM
How big is this specific ORC file, and can it be shared with us? Can you also check whether the job hangs in one of the mappers (the one reading this ORC file), or before it even gets into the application/mapper in YARN?