Member since: 01-09-2019
Posts: 401
Kudos Received: 163
Solutions: 80
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 1983 | 06-21-2017 03:53 PM |
|  | 3012 | 03-14-2017 01:24 PM |
|  | 1929 | 01-25-2017 03:36 PM |
|  | 3082 | 12-20-2016 06:19 PM |
|  | 1507 | 12-14-2016 05:24 PM |
05-25-2016 02:41 PM
This is a case of a corrupt pig.tar.gz in the HDFS /hdp/apps/<version>/pig folder. I am not sure how a corrupt copy ended up there on a fresh Ambari-based install, but once I manually replaced it with the pig.tar.gz from /usr/hdp/<version>/pig/, the error was resolved. The confusing part is that Pig View throws a completely unrelated error (File does not exist at /user/rmutyal/pig/jobs/test_23-05-2016-14-46-54/stdout).
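For anyone hitting the same issue, a minimal sketch of the manual fix; the <version> placeholder and the hdfs user are assumptions, so adjust for your HDP version and setup:

```bash
# Replace the corrupt Pig tarball in HDFS with the known-good local copy.
sudo -u hdfs hdfs dfs -rm /hdp/apps/<version>/pig/pig.tar.gz
sudo -u hdfs hdfs dfs -put /usr/hdp/<version>/pig/pig.tar.gz /hdp/apps/<version>/pig/
# Restore read-only permissions on the tarball.
sudo -u hdfs hdfs dfs -chmod 444 /hdp/apps/<version>/pig/pig.tar.gz
```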
05-22-2016 05:56 PM
Hadoop is a distributed filesystem plus distributed compute, so you can store and process any kind of data. A lot of examples focus on CSV and database imports because those are the most common use cases. Here is how each of the data types you listed can be stored and processed in Hadoop; you can also find examples in blogs and public repos.
1. CSV: as you said, you will see a lot of examples, including in our sandbox tutorials.
2. doc: you can put raw 'doc' documents into HDFS and use Tika or Tesseract to extract text (OCR) from them.
3. Audio and video: you can again put the raw data in HDFS. Processing depends on what you want to do with this data; for example, you can extract metadata from it using YARN applications.
4. Relational DB: take a look at Sqoop examples for ingesting relational databases into HDFS and using Hive/HCatalog to access the data (a sketch follows this list).
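For item 4, a minimal Sqoop sketch; the JDBC URL, database, credentials, and table name below are all placeholders for illustration:

```bash
# Import one relational table into HDFS; every connection detail here is hypothetical.
sqoop import \
  --connect jdbc:mysql://db-host.example.com:3306/sales \
  --username sqoop_user -P \
  --table orders \
  --target-dir /user/hive/warehouse/orders \
  --num-mappers 4
```

Adding --hive-import to the same command would create and load a Hive table directly, which is one way to get the Hive/HCatalog access mentioned above.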
05-26-2016 04:44 AM
You can get a sandbox from http://hortonworks.com/downloads/#sandbox, but the sandbox needs at least 8GB of RAM, so make sure you are on a machine with 12-16GB if you go that route. If you don't have a machine with that much RAM, Azure/AWS is your option. For any further questions, please open a new thread per question so this doesn't become one long thread of questions and answers.
05-17-2016 01:29 PM
@Sagar Shimpi I am not using HA with this cluster (it is a small demo cluster), but I will take note of that for when we build future clusters. Thanks!
05-18-2016 11:15 AM
@Saurabh Kumar Then I can only think of increasing the space available to yarn.nodemanager.log-dirs by adding multiple mount points. But I still suspect that something else is also occupying the space.
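A minimal sketch of that change in yarn-site.xml (or the equivalent field in Ambari); the /grid1 and /grid2 mount points are illustrative:

```xml
<!-- Comma-separated list of local dirs for container logs; this spreads log
     space across multiple mounts. Replace the paths with your actual mount points. -->
<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/grid1/hadoop/yarn/log,/grid2/hadoop/yarn/log</value>
</property>
```

Before adding mounts, running du -sh on the existing log-dirs parent is a quick way to confirm whether YARN logs or something else is actually consuming the space.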
05-13-2016 09:36 PM
If there are no HA components (NameNode HA and two HiveServer2 instances), then there is no dependency on ZooKeeper. Check the hiveserver2.log at /var/log/hiveserver2/hiveserver2.log to see if there are any errors. If you have two HiveServer2 instances, they will register with ZooKeeper, which may be where they are running into issues.
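If you do have two HiveServer2 instances, a quick way to check their ZooKeeper registration (assuming the default hive.server2.zookeeper.namespace of "hiveserver2"; the ZK host is illustrative):

```bash
# Open a ZooKeeper CLI session against one of your ZK servers.
/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server zk-host.example.com:2181
# Inside the zkCli shell, list the HiveServer2 registrations:
#   ls /hiveserver2
# Each healthy HiveServer2 instance should show up as a child znode.
```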
05-23-2016 05:46 PM
Thanks @Ravi Mutyala and @Artem Ervits. After getting stuck, I ended up starting over using the instructions on the Apache wiki. I'm not sure what exactly was different, but there were no password-related problems.
05-13-2016 05:21 PM
1. NN data lost. Did the disk with the NN directories crash, or did you delete them? Is this HA or non-HA? With non-HA, if both NN data directories have no data, you will run into data loss. You can revive to some state from the secondary NN's data directories, but there is no guarantee against data loss. If there is no data worth keeping, you can always format the NN and start fresh. (You will then need to manually upload the Tez and MapReduce apps; the manual install documentation has the details, and there is a sketch below.)
2. Ambari not starting. Clean up /var/log and start it back up.
3. Most likely the HDFS services are not up; a filled-up disk can kill processes. Once Ambari is up, see which services are running and which are not. Again, if the NN data is lost, the NN will not start.
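For steps 1 and 2, a rough sketch of the commands involved; only format the NameNode if you have accepted losing the old HDFS data, and the hdfs user is an assumption based on a standard HDP install:

```bash
# Step 2: find and clean up whatever is filling the disk.
du -sh /var/log/*            # identify the biggest log directories
# archive or delete old logs as appropriate, then restart Ambari

# Step 1 (DESTRUCTIVE): format the NameNode to start with a fresh filesystem.
sudo -u hdfs hdfs namenode -format
```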
05-06-2016 05:54 AM
@santosh rai Out of curiosity, do you have any specific use case for using 2.2?
05-05-2016 05:43 PM
I'm not sure how your sandbox was missing that folder in the first place. I sent you the steps from the manual install guide; the files should have been there from the start, but since they were not, we followed the manual steps to put them there. Tez and MapReduce use the tar.gz files under /hdp/apps/<hdp_version>/ when submitting applications.
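A sketch of that manual upload, following the HDP manual install steps; <hdp_version> is a placeholder for your actual version string:

```bash
# Upload the Tez tarball to the location YARN expects.
sudo -u hdfs hdfs dfs -mkdir -p /hdp/apps/<hdp_version>/tez/
sudo -u hdfs hdfs dfs -put /usr/hdp/<hdp_version>/tez/lib/tez.tar.gz /hdp/apps/<hdp_version>/tez/
# Upload the MapReduce tarball.
sudo -u hdfs hdfs dfs -mkdir -p /hdp/apps/<hdp_version>/mapreduce/
sudo -u hdfs hdfs dfs -put /usr/hdp/<hdp_version>/hadoop/mapreduce.tar.gz /hdp/apps/<hdp_version>/mapreduce/
# Lock down permissions as the install guide does: dirs traversable, tarballs read-only.
sudo -u hdfs hdfs dfs -chmod -R 555 /hdp/apps/<hdp_version>/tez /hdp/apps/<hdp_version>/mapreduce
sudo -u hdfs hdfs dfs -chmod 444 /hdp/apps/<hdp_version>/tez/tez.tar.gz /hdp/apps/<hdp_version>/mapreduce/mapreduce.tar.gz
```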