Member since: 06-07-2016
Posts: 923 · Kudos Received: 322 · Solutions: 115
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4082 | 10-18-2017 10:19 PM |
| | 4336 | 10-18-2017 09:51 PM |
| | 14833 | 09-21-2017 01:35 PM |
| | 1838 | 08-04-2017 02:00 PM |
| | 2417 | 07-31-2017 03:02 PM |
08-15-2016 01:34 AM
2 Kudos
@Tech Guy Your jar appears to be in HDFS. The jar cannot be run from inside HDFS; it needs to be on your local file system. Copy it to the local file system and run again. It should work.
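A minimal sketch of the fix, with a hypothetical jar path and main class:

```bash
# Minimal sketch, assuming the jar sits in HDFS at a hypothetical path
# and is launched with `hadoop jar`; adjust the paths and main class.
hdfs dfs -get /user/techguy/app.jar /tmp/app.jar   # copy from HDFS to local disk
hadoop jar /tmp/app.jar com.example.Main           # hypothetical main class
```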
08-15-2016 01:28 AM
2 Kudos
@Emily Sharpe That's a great question. I have updated the answer. You would basically use ViewFS, which uses mount tables similar to Linux to solve the problem of using relative paths across different namespaces without needing to specify the NameNode URI. I must admit that it gets ugly; unless you have thousands of nodes, this should ideally be avoided. Please share your motivation for considering Federation. Maybe there is a better and cleaner solution.
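A sketch of what such a mount table looks like, assuming two hypothetical nameservices ns1 and ns2; the fs.viewfs.mounttable.* entries belong in core-site.xml (shown here as a heredoc for illustration):

```bash
# Hypothetical ViewFS mount table for a cluster named "cluster";
# each mount point routes to a different namespace (NameNode).
cat <<'EOF'
<property><name>fs.defaultFS</name><value>viewfs://cluster</value></property>
<property><name>fs.viewfs.mounttable.cluster.link./user</name>
          <value>hdfs://ns1/user</value></property>
<property><name>fs.viewfs.mounttable.cluster.link./data</name>
          <value>hdfs://ns2/data</value></property>
EOF
# With this in place, relative paths resolve through the mount table:
hdfs dfs -ls /user   # served by ns1
hdfs dfs -ls /data   # served by ns2
```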
08-14-2016 11:56 PM
1 Kudo
@Obaid Salikeen NiFi nodes do not talk to each other; they only talk to the NCM (NiFi Cluster Manager). So if you would like to add a new node, you don't need to bring the cluster down; see the sketch below. Check the following links for details. https://community.hortonworks.com/articles/8607/how-to-create-nifi-fault-tolerance-using-multiple.html https://community.hortonworks.com/articles/8631/how-to-create-nifi-fault-tolerance-using-multiple-1.html
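A sketch of what adding a node involves, assuming NiFi 0.x NCM-style clustering; the nifi.properties names below are from that era and are worth double-checking against your version:

```bash
# On the NEW node only (the NCM and existing nodes keep running).
# Edit conf/nifi.properties to point the node at the existing NCM;
# hostnames below are hypothetical:
#   nifi.cluster.is.node=true
#   nifi.cluster.node.address=new-node.example.com
#   nifi.cluster.node.unicast.manager.address=ncm.example.com
# Then start the node; it registers with the NCM and joins the cluster.
bin/nifi.sh start
```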
08-14-2016 07:19 PM
@zkfs Just because your block size is set to 128 MB doesn't mean you don't have small files. Please use fsck to find out more details about your filesystem; it is likely that you have a lot of small files.
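For example, a quick check sketch:

```bash
# Sketch: the summary at the end of fsck includes total files, total
# blocks, and average block size; an average block size far below the
# configured 128 MB suggests lots of small files.
hdfs fsck / -files -blocks | tail -n 20
```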
08-14-2016 07:06 PM
@kishore sanchina The error "Failed to create log directory "logs": [Errno 13] Permission denied: 'logs'" means you don't have write permission on the logs directory. Give this user write permission so it is able to write to the logs folder, or ask your Hue admin to do this for you.
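A sketch of the fix, assuming the process runs as a hypothetical user "hue" and a hypothetical logs path; adjust both to your installation:

```bash
sudo chown hue:hue /path/to/logs   # hypothetical location of the logs dir
sudo chmod u+rwx /path/to/logs     # ensure the owner can write into it
```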
08-14-2016 06:36 PM
2 Kudos
@Fasil Ahamed
Please see my replies below.

1. "So the limiting factor is the memory capacity of a NameNode's RAM. HDFS Federation comes to its rescue by dividing the metadata/namespace across multiple NameNodes, thereby offloading part of one application's metadata/namespace that has grown beyond the capacity of a single NameNode's RAM onto a second NameNode's RAM. Is my understanding correct?"

Answer: Yes.

2. "If yes, then as per the documentation, NameNodes are independent and do not communicate with each other. In that case, who manages an application's metadata/namespace information that is split between two NameNodes?"

Answer: No one. The client needs to know the namespace it is connecting to, and hence the NameNode. There are multiple nameservices in HDFS Federation. Even today, without Federation, your client applications connect to the NameNode using a nameservice; they do the same thing when Federation is enabled, they just need to know which nameservice they are connecting to. Now, the question is how Hive or other client tools would know where the Hive metastore is, and what happens with external tables. When you enable HDFS Federation, you use what's called ViewFS, which lets you manage multiple namespaces by mounting file system locations from different NameNodes at logical mount points, similar to a Linux mount table; see the sketch below. I would highly recommend reading the following two links (I like the first one better). http://thriveschool.blogspot.com/2014/07/hdfs-federation.html https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/ViewFs.html
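A sketch of how a client addresses the two namespaces, assuming hypothetical federated nameservices ns1 and ns2:

```bash
# Without ViewFS, the client must name the right nameservice itself:
hdfs dfs -ls hdfs://ns1/user/alice   # namespace served by the first NameNode
hdfs dfs -ls hdfs://ns2/projects     # namespace served by the second NameNode
# With ViewFS mount points configured (see the mount-table sketch in the
# earlier post), one logical path space covers both:
hdfs dfs -ls viewfs://cluster/user/alice
```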
08-14-2016 06:15 PM
2 Kudos
@RAMESH K Spark is the engine that processes data. The data it processes can sit in HDFS or in other file systems and data repositories that Spark supports. For example, Spark can read and then process data from S3; HDFS is just one of the file systems Spark supports. Similarly, Spark can read from JDBC data sources like Oracle. When Spark runs in parallel across machines, that is a Spark cluster. For example, you can have a Spark cluster that reads from S3 and processes the data in parallel, or one that reads data from HDFS and processes it in parallel. In the latter case, Spark is processing data in parallel on a number of machines while HDFS is also being used to read data in parallel from different machines. The distinction to keep in mind is "reading data in parallel" (HDFS) versus "processing data in parallel" (Spark); see the sketch below.
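A sketch of the point, with a hypothetical bucket and HDFS path; the same Spark code does the parallel processing regardless of where the data is stored, only the path scheme changes:

```bash
spark-shell --master yarn <<'EOF'
val fromS3   = sc.textFile("s3a://my-bucket/events/")       // storage: S3
val fromHdfs = sc.textFile("hdfs://mycluster/data/events/") // storage: HDFS
println(fromS3.count())   // Spark processes in parallel in both cases
println(fromHdfs.count())
EOF
```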
08-13-2016 05:14 AM
@sam coderunner According to the following link, json-lib requires additional dependencies on your classpath, including groovy-all.jar. Do you have these dependencies in your classpath? http://json-lib.sourceforge.net/dependencies.html
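A sketch of a run with the dependencies on the classpath; jar names and the main class are hypothetical, so check the dependencies page for the exact list your json-lib version needs:

```bash
java -cp 'json-lib.jar:groovy-all.jar:commons-lang.jar:commons-beanutils.jar:commons-collections.jar:commons-logging.jar:ezmorph.jar:.' \
  MyJsonApp   # hypothetical main class
```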
08-13-2016 01:11 AM
@venkat v Can you please share more information? When you say one node is down, do you mean the DataNode process? Can we see the HDFS logs from the /var/log folder? Is the node able to talk to Ambari? What kind of instance is this? Some low-end instances share network bandwidth and other resources with applications other than yours; those applications may at times be using resources and impacting your system. If that's the case, the node will show up as working again as soon as resources become available. Is it possible to restart the node? I know in AWS it's not as simple a decision as on an on-prem cluster, but sometimes that might be it.
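A quick triage sketch for the affected node; the log path is a common default and may differ on your installation:

```bash
jps | grep -i datanode                       # is the DataNode JVM running?
sudo tail -n 100 /var/log/hadoop/hdfs/*.log  # recent HDFS log entries
hdfs dfsadmin -report | grep -A1 'Dead'      # does the NameNode see it as dead?
```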
08-10-2016 08:58 PM
1 Kudo
The following link, which is for Ambari 2.1, should work for your scenario: http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.0/bk_upgrading_Ambari/content/_ambari_upgrade_guide.html