Member since
11-21-2018
33
Posts
3
Kudos Received
3
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 635 | 12-27-2019 04:20 AM
 | 1237 | 12-17-2019 10:06 PM
 | 5880 | 04-08-2019 08:28 PM
01-08-2020
03:07 AM
Hello, If your Spark jobs are slow to get resources scheduled, please check how many jobs are running in parallel in your cluster. Are you submitting all the jobs to the same pool? Try creating a new pool, allocating enough resources to it, and then submitting your small-data job to that pool so that your application gets scheduled immediately.
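As a minimal sketch, assuming YARN as the resource manager and a hypothetical pool named `small_jobs` (both the pool name and the application file are placeholders), the small job could be routed to its own pool like this:

```shell
# Submit the small job to a dedicated YARN pool/queue so it is scheduled
# immediately; "small_jobs" and your_app.py are placeholder names.
spark-submit \
  --master yarn \
  --queue small_jobs \
  your_app.py
```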
... View more
01-01-2020
10:51 PM
Hi, We cannot get a list of application IDs for applications running in dedicated YARN pools using tsquery. You can refer to document [1] for the options available to create a chart for a YARN pool. However, you can get the list of applications running in a dedicated YARN pool via Cloudera Manager > YARN > Applications, searching with "pool=<pool_name>". [1] https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_metrics_yarn_pool.html
... View more
12-31-2019
04:45 AM
Could you please try passing /etc/hadoop/conf (where the Hadoop conf files reside) in HADOOP_CONF_DIR and then submit the Spark job? {{ export HADOOP_CONF_DIR="/etc/hadoop/conf:/etc/hive/conf" }}
... View more
12-31-2019
03:43 AM
Hello, It looks like the Hadoop configuration files are not being passed to Spark. Please check that the hdfs-site.xml and core-site.xml files are passed to spark-sql properly.
... View more
12-28-2019
06:22 PM
Hello, As the log trace itself says, "This is usually caused by trying to read a non-flume event." It means Flume is receiving events that were not written by a Flume source. So presumably your Kafka producer is not Flume, or you may have multiple sources writing into the same topic. In this scenario, please set 'parseAsFlumeEvent' to 'false'. Please refer to the Flume documentation [1]. [1] https://flume.apache.org/FlumeUserGuide.html#kafka-channel
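For illustration, assuming a Flume agent named `a1` with a Kafka channel `c1` (both names are placeholders), the setting would go in the agent's properties file like this:

```properties
# Treat incoming Kafka messages as plain payloads rather than
# serialized Flume events (agent/channel names are examples)
a1.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.c1.parseAsFlumeEvent = false
```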
... View more
12-28-2019
06:04 PM
Hello, You can try setting 'spark.scheduler.mode' to 'FAIR': {{ conf.set("spark.scheduler.mode", "FAIR") }} so that multiple jobs will be executed in parallel. Please refer to document [1]. [1] https://spark.apache.org/docs/latest/job-scheduling.html#scheduling-within-an-application
... View more
12-27-2019
04:40 AM
Hi, In order to check the resources available after a container is allocated to your application, run the below grep command on the Resource Manager logs: {{ grep "Released container" RM_logs_file | grep "1577297544619_0002" | grep "> available" }}
... View more
12-27-2019
04:20 AM
Hi, Upgrading the Hive version within CDH means you keep getting supported features from Cloudera. You can try upgrading to CDH 6.3.2, but it will still ship Hive 2.1.1. However, you will get a lot of Hive issue fixes in CDH 6.3.2. You can refer to a similar question thread [1]. [1] https://community.cloudera.com/t5/Support-Questions/Upgrading-CDH-to-use-Hive-1-2-0-or-higher/td-p/62936
... View more
12-25-2019
03:46 AM
Hi, Are you unable to see the note you created in the "Notebook List Box"? If so, please try creating a new note from the Notebook list box at the top left of the screen --> Create New Note --> provide a name and an interpreter. Then you will be able to see the note you created.
... View more
12-25-2019
03:42 AM
Hi, Please find the answers inline. 1) How can I view the history of switching modes (standby/active) in the NameNode service? --> You can check the history of mode switches (standby/active) in the failover controller's log file. 2) After how long of active NameNode unavailability does the standby become active? --> It depends on how large the edits/fsimage files to be synced are. We recommend you refer to documents [1][2]. [1] https://docs.cloudera.com/runtime/7.0.1/hdfs-overview/topics/hdfs-moving-highly-available-namenode-failover-controller-and-journalnode-roles-using-the-migrate-roles-wizard.html [2] https://stackoverflow.com/questions/27266267/namenode-ha-failover-time
... View more
12-25-2019
03:30 AM
Hi Peruvian, Did you get a chance to refer to document [1]? It may help you recover the NameNode. [1] http://www.augmentedintel.com/wordpress/index.php/recover-corrupt-hdfs-namenode/
... View more
12-25-2019
03:16 AM
Hi Michael, Rebalancing just the first 20 partitions of a table is similar to balancing all the partitions; the only difference is that you need to specify those 20 partitions alone in the JSON file.
... View more
12-25-2019
02:44 AM
Hi, You can configure a proxy user using the property hadoop.proxyuser.$superuser.hosts along with either or both of hadoop.proxyuser.$superuser.groups and hadoop.proxyuser.$superuser.users. Refer: [1] https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html
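As an illustrative core-site.xml fragment, assuming the superuser is named `hue` and restricting it to two hosts (all values below are example placeholders):

```xml
<!-- Allow the superuser "hue" to impersonate members of group "hadoop"
     from the listed hosts only (all values are examples) -->
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>host1.example.com,host2.example.com</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>hadoop</value>
</property>
```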
... View more
12-23-2019
09:03 PM
Hello, Could you please ensure that Spark Streaming connects to the right Kafka broker host? Check whether 10.20.0.44:29092 is the correct IP:port. Also, please monitor the Kafka broker logs to verify that the Spark Streaming job is connecting to the broker.
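As a quick sketch (the IP and port below are taken from your setup), you can test plain TCP reachability from the Spark driver host before digging into broker logs:

```shell
# Returns success (exit 0) if a TCP connection to host:port can be opened
check_port() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example: verify the broker endpoint used by the streaming job
check_port 10.20.0.44 29092 && echo "broker reachable" || echo "broker unreachable"
```

This only proves the port accepts connections; a successful connect still needs to be confirmed against the broker logs.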
... View more
12-23-2019
08:23 PM
Check whether the new IP has been updated in hdfs-site.xml and core-site.xml. Also, try clearing the nscd cache on all the nodes, then restart the CDH cluster, CM agents, and CM services.
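For example, assuming nscd is managed as a system service, the cached hosts table can be invalidated like this (run as root on each node):

```shell
# Invalidate nscd's cached hosts table, then restart the daemon
nscd -i hosts
service nscd restart
```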
... View more
12-17-2019
10:06 PM
Please check whether the user 'solr' is a member of "supergroup"; if not, add solr to the supergroup.
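A sketch of the group change, assuming "supergroup" is resolved from a local OS group on the NameNode host (run as root; verify the group name against dfs.permissions.superusergroup in your cluster):

```shell
# Add the solr user to the supergroup OS group, then confirm HDFS sees it
usermod -a -G supergroup solr
hdfs groups solr
```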
... View more
04-25-2019
10:26 AM
This is a Hive Metastore health test that checks that a client can connect and perform basic operations. The operations include: (1) creating a database, (2) creating a table within that database with several types of columns and two partition keys, (3) creating a number of partitions, and (4) dropping both the table and the database. The database is created under /user/hue/.cloudera_manager_hive_metastore_canary/<Hive Metastore role name>/ and is named "cloudera_manager_metastore_canary_test_db".

The test returns "Bad" health if any of these operations fail, and "Concerning" health if an unknown failure happens. The canary publishes a metric 'canary_duration' for the time it took the canary to complete. Here is an example of a trigger, defined for the Hive Metastore role configuration group, that changes the health to "Bad" when the duration of the canary is longer than 5 seconds:

"IF (SELECT canary_duration WHERE entityName=$ROLENAME AND category = ROLE and last(canary_duration) > 5s) DO health:bad"

A failure of this health test may indicate that the Hive Metastore is failing basic operations. Check the logs of the Hive Metastore and the Cloudera Manager Service Monitor for more details. This test can be enabled or disabled using the Hive Metastore Canary Health Test monitoring setting. Ref: https://www.cloudera.com/documentation/enterprise/5-7-x/topics/cm_ht_hive_metastore_server.html#concept_p03_hon_yk
... View more
04-25-2019
10:21 AM
This is a garbage collection (GC) pause. Check how much JVM heap has been used by the service (HS2, etc.) for which you received this alert. From the alert, you can see that the JVM pause took 2+ minutes, and you have configured it to alert when GC pauses take 60% of 1 minute. Look at the JVM heap memory usage and GC pause charts for the service in question; if the heap is constantly high, that is the likely reason. In that case, the solution could be as simple as increasing the heap size. You can refer to the Cloudera documents [1][2]. [1] https://www.cloudera.com/documentation/enterprise/5-7-x/topics/cm_ht_hiveserver2.html [2] https://www.cloudera.com/documentation/enterprise/5-7-x/topics/cm_ht_hive_metastore_server.html
... View more
04-16-2019
10:09 AM
Hi Kamal, If you don't mind, could you please share which charts you would like to understand, and which errors you would like to correlate with those charts, so that we get a chance to help you understand Cloudera Manager --> Charts. Thanks, Senthil Kumar
... View more
04-09-2019
05:54 AM
1 Kudo
You can try increasing the weight of DRP3 so that it gets higher priority; jobs submitted to this pool will then get more resources than jobs in other pools, based on the weight configured.
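For illustration, a Fair Scheduler allocation-file entry giving DRP3 a higher weight might look like this (the value 3.0 is an assumption; tune it for your cluster):

```xml
<!-- fair-scheduler.xml fragment: DRP3 receives 3x the share of a weight-1.0 pool -->
<queue name="DRP3">
  <weight>3.0</weight>
</queue>
```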
... View more
04-08-2019
08:28 PM
1 Kudo
From the details you shared, we can see that pyspark is pointing to an older version of libboost (libboost_system.so.1.65.1) than the one expected (libboost_system.so.1.66.0): {{ dzdo /opt/cloudera/parcels/Anaconda/bin/conda list | grep boost libboost 1.65.1 habcd387_4 }} It looks like the new version of PyArrow was not installed properly. So please clean up the older packages and then install pyarrow again using the below command: {{ conda install -c conda-forge pyarrow }} Best Regards, Senthil Kumar
... View more
01-03-2019
10:58 AM
Could you please try submitting the Spark job with 'spark.streaming.unpersist' disabled? {{ --conf spark.streaming.unpersist=false }}
... View more