Member since: 01-09-2019
Posts: 401
Kudos Received: 163
Solutions: 80

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 2596 | 06-21-2017 03:53 PM |
|  | 4297 | 03-14-2017 01:24 PM |
|  | 2393 | 01-25-2017 03:36 PM |
|  | 3840 | 12-20-2016 06:19 PM |
|  | 2102 | 12-14-2016 05:24 PM |
05-25-2016
02:41 PM
This is a case of a corrupt pig.tar.gz in the HDFS /hdp/apps/<version>/pig folder. I am not sure how a corrupt version ended up there on a fresh Ambari-based install, but once I manually replaced it with the pig.tar.gz from /usr/hdp/<version>/pig/, the error got resolved. The confusing part is that the Pig view throws a completely unrelated error (File does not exist at /user/rmutyal/pig/jobs/test_23-05-2016-14-46-54/stdout).
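For anyone hitting the same thing, this is roughly what the manual replacement looked like (a sketch; the version string is a placeholder, and the ownership/permission bits are what Ambari typically sets, so verify them on your cluster):

```bash
VERSION=<version>   # placeholder: your installed HDP version string

# Check whether the copy in HDFS differs from the one shipped on the local filesystem
hdfs dfs -checksum /hdp/apps/${VERSION}/pig/pig.tar.gz
tar -tzf /usr/hdp/${VERSION}/pig/pig.tar.gz > /dev/null && echo "local tarball OK"

# Overwrite the corrupt copy in HDFS with the good local one (run as the hdfs user)
hdfs dfs -put -f /usr/hdp/${VERSION}/pig/pig.tar.gz /hdp/apps/${VERSION}/pig/pig.tar.gz
hdfs dfs -chown hdfs:hadoop /hdp/apps/${VERSION}/pig/pig.tar.gz   # typical ownership on HDP; confirm on your install
hdfs dfs -chmod 444 /hdp/apps/${VERSION}/pig/pig.tar.gz           # typical permissions; confirm on your install
```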
05-23-2016
05:36 PM
HA failover is automatic by default if you enabled failover from Ambari. MapReduce jobs won't fail during a failover scenario.
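A quick way to confirm automatic failover is actually enabled (a sketch; nn1/nn2 are the usual Ambari service IDs, check dfs.ha.namenodes.<nameservice> for yours):

```bash
hdfs getconf -confKey dfs.ha.automatic-failover.enabled   # should print "true"
hdfs haadmin -getServiceState nn1                         # prints "active" or "standby"
hdfs haadmin -getServiceState nn2
```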
05-23-2016
04:12 PM
No luck with that. This is a cluster with HTTPS configured for Ambari.
05-23-2016
03:43 PM
Please check if proxy users are properly set. https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_ambari_views_guide/content/_setup_HDFS_proxy_user.html
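You can read the effective values back from the command line; a quick sketch, assuming the Ambari server runs as a user named ambari-server (substitute whatever user your ambari-server process runs as):

```bash
# Both should come back non-empty (host list / group list, or "*")
hdfs getconf -confKey hadoop.proxyuser.ambari-server.hosts
hdfs getconf -confKey hadoop.proxyuser.ambari-server.groups
```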
05-23-2016
02:51 PM
Pig jobs work from the gateway node but fail from the Ambari view. I cross-checked the proxyuser configs, and the ambariid user that runs ambari-server is configured there. The error from the RM shows this: AM Container for appattempt_1463770749228_0048_000001 exited with exitCode: -1000
For more detailed output, check application tracking page: http://<hostname>:8088/cluster/app/application_1463770749228_0048 Then, click on links to logs of each attempt.
Diagnostics: ExitCodeException exitCode=2:
gzip: /grid/6/hadoop/yarn/local/filecache/33_tmp/tmp_pig.tar.gz: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
Failing this attempt. The error from the Pig view is "File does not exist: /user/rmutyal/pig/jobs/test_23-05-2016-14-46-54/stdout", but the WebHCat user configuration looks alright. What am I missing?
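For reference, one check that may help narrow this down is testing whether the Pig framework tarball that YARN localizes is itself truncated (a sketch; replace <version> with your HDP version):

```bash
# Pull the tarball YARN localizes and test the archive end-to-end
hdfs dfs -get /hdp/apps/<version>/pig/pig.tar.gz /tmp/pig-check.tar.gz
tar -tzf /tmp/pig-check.tar.gz > /dev/null && echo "archive OK" || echo "archive is corrupt"
```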
Labels:
- Apache Ambari
- Apache Hive
05-23-2016
01:29 PM
2 Kudos
You don't need to restart the HDP 2.4 cluster, but it is recommended to decommission the node with the dead disk, change the disk, and add the node back to the cluster. This ensures that data is evenly distributed across all the data disks on that node.
1. To decommission, go to Ambari -> Host -> DataNode, which has a Decommission option.
2. Decommission the NodeManager on that host as well.
3. Once both are in the decommissioned state, stop the DataNode and NodeManager on that host and replace the disk.
4. Start the DataNode and NodeManager back up.
5. You will see a Recommission option in the same place; click it to take the host out of the decommissioned state.
No other services across the cluster need to be stopped, and if you have more than 3 DataNodes and your default replication factor is 3, all services will continue running. The sketch below shows how to verify each stage from the command line.
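A minimal verification sketch (the hostname is a placeholder):

```bash
# Watch the DataNode reach "Decommissioned" before stopping it
hdfs dfsadmin -report | grep -A 10 "<datanode-hostname>"

# NodeManager state for all nodes, including decommissioned ones
yarn node -list -all

# After recommissioning, confirm there are no missing or under-replicated blocks
hdfs fsck /
```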
05-22-2016
05:56 PM
Hadoop is a distributed filesystem plus distributed compute, so you can store and process any kind of data. A lot of examples point to CSV and DB imports because they are the most common use cases. Here is how each of the data types you listed can be stored and processed in Hadoop; you can also find examples in blogs and public repos.
1. CSV: like you said, you will see a lot of examples, including in our sandbox tutorials.
2. doc: you can put raw 'doc' documents into HDFS and use Tika or Tesseract to do text extraction/OCR on these documents.
3. Audio and video: you can put the raw data into HDFS again; processing depends on what you want to do with this data. You can extract metadata out of it using YARN.
4. Relational DB: take a look at Sqoop examples for how to ingest a relational DB into HDFS and use Hive/HCatalog to access the data (a minimal Sqoop sketch follows below).
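For point 4, a minimal Sqoop import sketch (the connection string, credentials, table, and target directory are all placeholders):

```bash
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/raw/orders \
  --num-mappers 4
```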
05-21-2016
01:50 AM
If this is a production cluster and you are on support, I suggest opening a support ticket, since any tweaks can lead to data loss. Before you move further, please take a backup of the NN metadata and the edits from the JournalNodes.
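A sketch of the kind of backup I mean, assuming the usual HDP paths (check dfs.journalnode.edits.dir for your actual edits directory):

```bash
# Pull a copy of the current fsimage from the active NameNode (run as the hdfs user)
hdfs dfsadmin -fetchImage /backup/nn-metadata/

# On each JournalNode, archive the edits directory as well (path is a typical default)
tar czf /backup/jn-edits-$(hostname).tar.gz /hadoop/hdfs/journal
```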
05-20-2016
08:21 PM
1 Kudo
If you are looking for open-source volume-level encryption tools, we have seen LUKS being used; there will be some overhead from it. You can take a look at LUKS at https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Security_Guide/sec-Encryption.html
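A rough sketch of preparing a data disk with LUKS (device, mapper name, and mount point are placeholders; note that luksFormat destroys any existing data on the device):

```bash
cryptsetup luksFormat /dev/sdX            # WARNING: wipes the device; only on a new/empty disk
cryptsetup luksOpen /dev/sdX hdfs_data1   # maps it to /dev/mapper/hdfs_data1
mkfs.xfs /dev/mapper/hdfs_data1
mount /dev/mapper/hdfs_data1 /grid/1      # then include /grid/1 in dfs.datanode.data.dir
```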
05-20-2016
02:29 PM
1 Kudo
You can increase memory on your mappers. Take a look at mapreduce.map.memory.mb, mapreduce.reduce.memory.mb, mapreduce.map.java.opts, and mapreduce.reduce.java.opts. I think your mapreduce.map.memory.mb is set to 256 MB based on the error. I don't know what else is running on your 3 GB node and what heap it has been given, but you may be able to allocate 1 GB of it to YARN (container memory). It is also possible to get the job to run on the 15 GB node by using node labels. You can also switch off the NodeManager on the 3 GB node if other processes are running on it, so that the job uses the 15 GB node.
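For example, a per-job override sketch (values are illustrative, the jar and class are placeholders, and it assumes the job driver uses ToolRunner so -D generic options are honored):

```bash
# 1 GB containers with roughly 0.8x of that as JVM heap
hadoop jar my-job.jar com.example.MyJob \
  -Dmapreduce.map.memory.mb=1024 \
  -Dmapreduce.map.java.opts=-Xmx819m \
  -Dmapreduce.reduce.memory.mb=1024 \
  -Dmapreduce.reduce.java.opts=-Xmx819m \
  input output
```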