About MyNamesNotRick

MyNamesNotRick · ‎01-26-2021

@mike_bronson7 This looks like a symptom of the default replication factor being 3, which HDFS uses for redundancy and processing capability. A cluster should have at least 3 data nodes so that each block can have 3 healthy replicas, because HDFS will not write two replicas of the same block to the same data node. In your scenario, 1 healthy replica is placed on the available data node and the blocks are reported as under-replicated.

You can run setrep [1] to change the replication factor. If the path is a directory, the command recursively changes the replication factor of all files under the directory tree rooted at that path:

hdfs dfs -setrep -w 1 /user/hadoop/dir1

[1] https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#setrep
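To make the reasoning above concrete, here is a minimal Python sketch. It assumes the only placement rule that matters is the one stated in the post (at most one replica of a block per data node) and ignores rack awareness and other policies. The function names are hypothetical helpers, not part of any Hadoop API, and the command is only built as an argument list, never executed.

```python
from typing import List


def placeable_replicas(replication_factor: int, live_datanodes: int) -> int:
    """Replicas that can be placed when each data node holds at most one copy of a block."""
    return min(replication_factor, live_datanodes)


def missing_replicas(replication_factor: int, live_datanodes: int) -> int:
    """Replicas that cannot be placed; a value above 0 means the block is under-replicated."""
    return replication_factor - placeable_replicas(replication_factor, live_datanodes)


def setrep_command(path: str, replication_factor: int, wait: bool = True) -> List[str]:
    """Build (but do not run) the setrep invocation quoted in the post."""
    cmd = ["hdfs", "dfs", "-setrep"]
    if wait:
        cmd.append("-w")  # wait until the target replication is reached
    cmd += [str(replication_factor), path]
    return cmd


if __name__ == "__main__":
    # Assumed scenario: default replication of 3 and a single available data node.
    print(placeable_replicas(3, 1))
    print(missing_replicas(3, 1))
    print(" ".join(setrep_command("/user/hadoop/dir1", 1)))
```

With one data node and a replication factor of 3, two replicas per block have nowhere to go, which is why lowering the factor to 1 clears the under-replication report in that setup.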

MyNamesNotRick · ‎01-25-2021

CDH 6.3.x supports Spark 2.4.0 - https://docs.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_63_packaging.html#c...
You may find the CSD + Parcel here: https://docs.cloudera.com/documentation/spark2/latest/topics/spark2_packaging.html#packaging
Spark 3 is officially supported in CDP 7.1.5 - https://docs.cloudera.com/cdp-private-cloud-base/7.1.5/cds-3/topics/spark-spark-3-overview.html

MyNamesNotRick · ‎01-25-2021

@Paop You will need an HDFS and a Spark gateway role on the node where you trigger the job. The class named in the error, "Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream", is an HDFS class, which leads me to believe that you do not have a gateway role on the node from which you are running the command.

MyNamesNotRick · ‎08-28-2019

Slow get rate is the number of Gets that took over 1000 ms to complete.
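As a rough illustration of that definition, the sketch below counts Gets above the 1000 ms threshold in a list of latencies. The function name and the sample values are made up for the example; this is not how HBase or Cloudera Manager computes the metric internally.

```python
from typing import Iterable

SLOW_GET_THRESHOLD_MS = 1000  # threshold taken from the definition above


def count_slow_gets(latencies_ms: Iterable[float],
                    threshold_ms: float = SLOW_GET_THRESHOLD_MS) -> int:
    """Count Gets whose latency is strictly over the threshold."""
    return sum(1 for ms in latencies_ms if ms > threshold_ms)


if __name__ == "__main__":
    # Hypothetical latencies in milliseconds.
    sample = [12, 1000, 1001, 2500]
    print(count_slow_gets(sample))
```

A Get that takes exactly 1000 ms is not counted here, since the definition says "over 1000 ms".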

Last Visited	‎08-24-2021 10:08 PM

Member Since	‎08-14-2018 09:40 AM
Posts	47
Kudos received	2

Cloudera Community

Re: how to identify the problem about under replic...

Re: HBase Slow get monitoring


Re: Spark 3.0.1 on a CDH-6.3.4 cluster

Re: Hive-on-Spark: Failed to create spark client
