Member since: 06-07-2016
Posts: 923 · Kudos Received: 322 · Solutions: 115
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4082 | 10-18-2017 10:19 PM |
| | 4336 | 10-18-2017 09:51 PM |
| | 14833 | 09-21-2017 01:35 PM |
| | 1838 | 08-04-2017 02:00 PM |
| | 2417 | 07-31-2017 03:02 PM |
08-15-2016 01:34 AM
2 Kudos
@Tech Guy Your jar appears to be in HDFS. The jar cannot be run from inside HDFS; it needs to be on your local file system. Copy it to the local file system and run again. It should work.
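A minimal sketch of the fix, with a hypothetical jar path and main class:

```bash
# Minimal sketch, assuming the jar sits in HDFS at a hypothetical path
# and is launched with `hadoop jar`; adjust the paths and main class.
hdfs dfs -get /user/techguy/app.jar /tmp/app.jar   # copy from HDFS to local disk
hadoop jar /tmp/app.jar com.example.Main           # hypothetical main class
```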
08-15-2016 01:28 AM
2 Kudos
@Emily Sharpe That's a great question. I have updated the answer. You would basically use ViewFS, which uses mount tables similar to Linux to solve the problem of using relative paths across different namespaces without needing to specify the NameNode URI. I must admit that it gets ugly; unless you have thousands of nodes, this should ideally be avoided. Please share your motivation for considering Federation. Maybe there is a better and cleaner solution.
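A sketch of what such a mount table looks like, assuming two hypothetical nameservices ns1 and ns2; the fs.viewfs.mounttable.* entries belong in core-site.xml (shown here as a heredoc for illustration):

```bash
# Hypothetical ViewFS mount table for a cluster named "cluster";
# each mount point routes to a different namespace (NameNode).
cat <<'EOF'
<property><name>fs.defaultFS</name><value>viewfs://cluster</value></property>
<property><name>fs.viewfs.mounttable.cluster.link./user</name>
          <value>hdfs://ns1/user</value></property>
<property><name>fs.viewfs.mounttable.cluster.link./data</name>
          <value>hdfs://ns2/data</value></property>
EOF
# With this in place, relative paths resolve through the mount table:
hdfs dfs -ls /user   # served by ns1
hdfs dfs -ls /data   # served by ns2
```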
08-14-2016 11:56 PM
1 Kudo
@Obaid Salikeen NiFi nodes do not talk to each other; they only talk to the NCM (NiFi Cluster Manager). So if you would like to add a new node, you don't need to bring the cluster down; see the sketch below. Check the following links for details. https://community.hortonworks.com/articles/8607/how-to-create-nifi-fault-tolerance-using-multiple.html https://community.hortonworks.com/articles/8631/how-to-create-nifi-fault-tolerance-using-multiple-1.html
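A sketch of what adding a node involves, assuming NiFi 0.x NCM-style clustering; the nifi.properties names below are from that era and are worth double-checking against your version:

```bash
# On the NEW node only (the NCM and existing nodes keep running).
# Edit conf/nifi.properties to point the node at the existing NCM;
# hostnames below are hypothetical:
#   nifi.cluster.is.node=true
#   nifi.cluster.node.address=new-node.example.com
#   nifi.cluster.node.unicast.manager.address=ncm.example.com
# Then start the node; it registers with the NCM and joins the cluster.
bin/nifi.sh start
```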
08-14-2016 07:19 PM
@zkfs Just because your block size is set to 128 MB doesn't mean you don't have small files. Please use fsck to find out more details about your filesystem; it is likely that you have a lot of small files.
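For example, a quick check sketch:

```bash
# Sketch: the summary at the end of fsck includes total files, total
# blocks, and average block size; an average block size far below the
# configured 128 MB suggests lots of small files.
hdfs fsck / -files -blocks | tail -n 20
```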
08-14-2016 07:06 PM
@kishore sanchina The error "Failed to create log directory "logs": [Errno 13] Permission denied: 'logs'" means you don't have write permission on the logs directory. Give this user write permission so it is able to write to the logs folder, or ask your Hue admin to do this for you.
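A sketch of the fix, assuming the process runs as a hypothetical user "hue" and a hypothetical logs path; adjust both to your installation:

```bash
sudo chown hue:hue /path/to/logs   # hypothetical location of the logs dir
sudo chmod u+rwx /path/to/logs     # ensure the owner can write into it
```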
08-14-2016 06:36 PM
2 Kudos
@Fasil Ahamed
Please see my replies below.

1. "So the limiting factor is the memory capacity of a NameNode's RAM. HDFS Federation comes to its rescue by dividing the metadata/namespace across multiple NameNodes, thereby offloading part of one application's metadata/namespace that has grown beyond the capacity of a single NameNode's RAM onto a second NameNode's RAM. Is my understanding correct?"

Answer: Yes.

2. "If yes, then as per the documentation, NameNodes are independent and do not communicate with each other. In that case, who manages an application's metadata/namespace information that is split between two NameNodes?"

Answer: No one. The client needs to know the namespace it is connecting to, and hence the NameNode. There are multiple nameservices in HDFS Federation. Even today, without Federation, your client applications connect to the NameNode using a nameservice; they do the same thing when Federation is enabled, they just need to know which nameservice they are connecting to. Now, the question is how Hive or other client tools would know where the Hive metastore is, and what happens with external tables. When you enable HDFS Federation, you use what's called ViewFS, which lets you manage multiple namespaces by mounting file system locations from different NameNodes at logical mount points, similar to a Linux mount table; see the sketch below. I would highly recommend reading the following two links (I like the first one better). http://thriveschool.blogspot.com/2014/07/hdfs-federation.html https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/ViewFs.html
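A sketch of how a client addresses the two namespaces, assuming hypothetical federated nameservices ns1 and ns2:

```bash
# Without ViewFS, the client must name the right nameservice itself:
hdfs dfs -ls hdfs://ns1/user/alice   # namespace served by the first NameNode
hdfs dfs -ls hdfs://ns2/projects     # namespace served by the second NameNode
# With ViewFS mount points configured (see the mount-table sketch in the
# earlier post), one logical path space covers both:
hdfs dfs -ls viewfs://cluster/user/alice
```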
08-14-2016 06:15 PM
2 Kudos
@RAMESH K Spark is the engine that processes data. The data it processes can sit in HDFS or in other file systems and data repositories that Spark supports. For example, Spark can read and then process data from S3; HDFS is just one of the file systems Spark supports. Similarly, Spark can read from JDBC data sources like Oracle. When Spark runs in parallel across machines, that is a Spark cluster. For example, you can have a Spark cluster that reads from S3 and processes the data in parallel, or one that reads data from HDFS and processes it in parallel. In the latter case, Spark is processing data in parallel on a number of machines while HDFS is also being used to read data in parallel from different machines. The distinction to keep in mind is "reading data in parallel" (HDFS) versus "processing data in parallel" (Spark); see the sketch below.
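A sketch of the point, with a hypothetical bucket and HDFS path; the same Spark code does the parallel processing regardless of where the data is stored, only the path scheme changes:

```bash
spark-shell --master yarn <<'EOF'
val fromS3   = sc.textFile("s3a://my-bucket/events/")       // storage: S3
val fromHdfs = sc.textFile("hdfs://mycluster/data/events/") // storage: HDFS
println(fromS3.count())   // Spark processes in parallel in both cases
println(fromHdfs.count())
EOF
```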
08-13-2016 05:14 AM
@sam coderunner According to the following link, json-lib requires additional dependencies on your classpath, including groovy-all.jar. Do you have these dependencies in your classpath? http://json-lib.sourceforge.net/dependencies.html
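A sketch of a run with the dependencies on the classpath; jar names and the main class are hypothetical, so check the dependencies page for the exact list your json-lib version needs:

```bash
java -cp 'json-lib.jar:groovy-all.jar:commons-lang.jar:commons-beanutils.jar:commons-collections.jar:commons-logging.jar:ezmorph.jar:.' \
  MyJsonApp   # hypothetical main class
```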
08-13-2016 01:11 AM
@venkat v Can you please share more information? When you say one node is down, do you mean the DataNode process? Can we see the HDFS logs from the /var/log folder? Is the node able to talk to Ambari? What kind of instance is this? Some low-end instances share network bandwidth and other resources with applications other than yours; those applications may at times be using resources and impacting your system. If that's the case, the node will show up as working again as soon as resources become available. Is it possible to restart the node? I know in AWS it's not as simple a decision as on an on-prem cluster, but sometimes that might be it.
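A quick triage sketch for the affected node; the log path is a common default and may differ on your installation:

```bash
jps | grep -i datanode                       # is the DataNode JVM running?
sudo tail -n 100 /var/log/hadoop/hdfs/*.log  # recent HDFS log entries
hdfs dfsadmin -report | grep -A1 'Dead'      # does the NameNode see it as dead?
```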
08-10-2016 08:58 PM
1 Kudo
The following link, which is for Ambari 2.1, should work for your scenario: http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.0/bk_upgrading_Ambari/content/_ambari_upgrade_guide.html