Member since: 09-02-2016
Posts: 523
Kudos Received: 89
Solutions: 42
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2724 | 08-28-2018 02:00 AM |
| | 2697 | 07-31-2018 06:55 AM |
| | 5688 | 07-26-2018 03:02 AM |
| | 2988 | 07-19-2018 02:30 AM |
| | 6466 | 05-21-2018 03:42 AM |
03-02-2017
04:39 PM
@Akira191
1. Go to Cloudera Manager -> Spark -> Instance and identify the node where the Spark server is installed.
2. Log in to that node via the CLI and go to the path "/opt/cloudera/parcels/CDH-<version>/lib/spark/bin". It lists the binaries (spark-shell, pyspark, spark-submit, etc.) that are used to log in to Spark and submit jobs.
If spark-sql is listed there, you can run the command you mentioned. In your case the spark-sql binary appears to be missing, which is why you are getting this error. You need to talk to your admin.
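As a quick check, a small shell sketch along these lines can tell you whether spark-sql is present in a given bin directory (the parcel path with the `<version>` placeholder comes from the post; substitute your own):

```shell
# Hedged sketch: report whether the spark-sql binary exists in a Spark
# bin directory. Call it with the directory to inspect, e.g.:
#   check_spark_sql /opt/cloudera/parcels/CDH-<version>/lib/spark/bin
check_spark_sql() {
  bin_dir="$1"
  if [ -x "$bin_dir/spark-sql" ]; then
    echo "spark-sql available"
  else
    echo "spark-sql missing"
  fi
}
```

If it reports "missing", your CDH build simply does not ship that binary and an admin would need to install or enable it.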
02-28-2017
09:10 PM
@matt123 Go to http://ipaddress:8088 and check the Cluster Metrics for RAM, container, and vcore usage. You can also click on "Active Nodes" to see the same information per node.
Alternatively: Cloudera Manager -> HDFS -> Web UI -> NameNode UI shows the complete metrics.
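The same cluster metrics shown on that page are also exposed by the ResourceManager's REST API at `/ws/v1/cluster/metrics`, which is handy for scripting. A minimal sketch (the JSON sample below is made up for illustration; field names follow the YARN metrics response):

```shell
# On a live cluster you would fetch the metrics with:
#   curl -s "http://<resourcemanager-host>:8088/ws/v1/cluster/metrics"
# The JSON below is a hypothetical sample response used for illustration.
metrics='{"clusterMetrics":{"availableMB":8192,"allocatedMB":4096,"availableVirtualCores":16}}'

# Extract the available-memory field (python3 is used for JSON parsing
# since jq may not be installed on every node).
echo "$metrics" | python3 -c 'import sys, json; print(json.load(sys.stdin)["clusterMetrics"]["availableMB"])'
```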
02-28-2017
08:46 PM
@codenchips Go to CM -> Hosts -> click on each host -> Health History (bottom left) and share the details with me.
02-28-2017
07:11 AM
@codenchips Go to Cloudera Manager -> Hosts, check the host status, and see what kind of issue it shows.
Also, log in as root on Linux and run the commands below:
service ntpd status
service ntpd start
service ntpd status
Then restart the CM and try again.
02-27-2017
09:24 AM
Upon further analysis, I've noticed that "Navigator policies" might help with this: https://www.cloudera.com/documentation/enterprise/5-5-x/topics/navigator_policies.html
It seems that I need to write a search query; let me try to write one. In the meantime, it would be great if someone could share the query for the above scenario.
02-27-2017
08:38 AM
Hi,
Does Cloudera Navigator have an option to identify unused objects over a particular period (more than 6 months, 1 year, etc.)? The objects can be HDFS files, Hive/Impala tables, Oozie datasets, etc.
This is my requirement: our non-prod environment has been used by multiple users for different purposes like dev, test, etc. Sometimes they use a common user id and user space to create DBs, create/import tables, and so on. After a task is finished, they move on to the next task without cleaning up the old DBs, tables, and files, which become garbage after a few days. This has accumulated into a large amount of garbage by now (with 3x replication).
I want to identify the DBs, tables, and files that have not been used for more than 6 months (or 1 year) and delete them (with proper approval...). Is this possible with Navigator? Are there any other options/ideas?
Thanks,
Kumar
Labels:
- Cloudera Navigator
02-26-2017
06:31 PM
1 Kudo
@wenjie Please check the telnet response:
1. Cloudera Manager -> Impala -> Instance -> Impala Catalog Server -> get the hostname.
2. Try the below command in Linux:
$ telnet <hostname> 25020
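If telnet happens not to be installed on the node, a rough alternative is bash's built-in `/dev/tcp` redirection; a sketch (the port 25020 comes from the post, the hostname is whatever you found in step 1):

```shell
# Hedged sketch: probe whether a TCP port is reachable using bash's
# /dev/tcp pseudo-device (bash-specific; not available in plain sh).
# Usage: probe_port <hostname> <port>   e.g. probe_port catalog-host 25020
probe_port() {
  # Open and immediately close a connection in a subshell; the exit status
  # of the redirection tells us whether the connect succeeded.
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}
```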
02-24-2017
10:48 AM
@Rashmi22 As an alternative, you can also run the "hdfs version" or "hadoop version" command in the CLI to get the CDH version.
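For scripting, the version token can be pulled out of the first line of that output. A small sketch (the example string below mirrors the typical "Hadoop <version>" first line and is illustrative, not taken from a real node):

```shell
# Hedged sketch: extract the version token from the first line of
# `hadoop version` output, which looks like "Hadoop 2.6.0-cdh5.12.0".
parse_hadoop_version() {
  echo "$1" | awk '{print $2}'
}

# On a real node you would feed it live output:
#   parse_hadoop_version "$(hadoop version | head -n 1)"
```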
02-23-2017
08:57 AM
@RakeshE You can use Access Control Lists (ACLs) to protect your files in HDFS. Please refer to the links below:
https://community.cloudera.com/t5/Security-Apache-Sentry/Hadoop-Security-for-beginners/m-p/48576#M174
https://hortonworks.com/blog/hdfs-acls-fine-grained-permissions-hdfs-files-hadoop/
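As a concrete illustration of the ACL approach, the dry-run sketch below prints the HDFS commands you would run. The user "alice" and the path are hypothetical; remove the leading `echo` to actually execute them on a cluster where ACLs are enabled (dfs.namenode.acls.enabled=true):

```shell
# Dry-run sketch: print (not execute) the HDFS ACL commands that grant a
# hypothetical user "alice" read-only access to a hypothetical path.
target="/data/protected/file.txt"
echo hdfs dfs -setfacl -m user:alice:r-- "$target"   # add the ACL entry
echo hdfs dfs -getfacl "$target"                     # verify the result
```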
02-21-2017
02:16 PM
@MasterOfPuppets That is a very hypothetical "one line" question. I don't think just adding a few extra nodes will double the performance. A few additional factors you need to consider:
1. The way services are distributed across the cluster is also very important. For example, you have 3 nodes now; suppose 10 services are configured across those 3 nodes. After 3 more nodes are added, you need to properly distribute the services across the new nodes as well.
On the existing cluster, without adding new nodes:
1. If possible, add RAM to the existing nodes.
2. Identify which particular services need better performance (Hive, Impala, etc.) and tune the environment configuration for those services, e.g. increase the Java heap size.
3. Prioritize the jobs, etc.