Member since: 10-07-2018
Posts: 16
Kudos Received: 0
Solutions: 0
05-16-2019
10:47 PM
I checked the failed job's log and found this entry: java.lang.NoSuchFieldError: HBASE_CLIENT_PREFETCH. I then looked at the classpath of the failed job and found a different version of the HBase jar in hive.aux.jars.path. I deleted that jar, restarted HiveServer2, and everything returned to normal.
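In case someone hits the same error, this is roughly how I tracked down the conflicting jar (paths below are examples, not my exact layout):

  grep -A1 'hive.aux.jars.path' /etc/hive/conf/hive-site.xml   # list the aux jars HiveServer2 loads
  ls -l /opt/hive/auxlib/hbase-*.jar                           # look for two different HBase versions side by side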
... View more
05-14-2019
08:49 PM
Hi, CDH 5.7.6: when I use the Hive client, writing data to HBase through Hive works normally. But when I use the Beeline client to write data to HBase, an error occurs: Error: Could not initialize class org.apache.hadoop.hbase.client.HConnectionKey. The task sometimes succeeds and sometimes fails. Why?
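For context, this is the kind of session that fails intermittently (connection details and table names are placeholders, not my real ones):

  beeline -u 'jdbc:hive2://hs2-host:10000/default'
  -- then, inside beeline, insert into an HBase-backed Hive table:
  INSERT INTO TABLE hbase_backed_table SELECT key, value FROM source_table;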
... View more
Labels:
- Apache HBase
- Apache Hive
03-25-2019
01:30 AM
Thanks first. I uninstalled Spark 1, installed Spark 2.1, and everything is OK.
... View more
03-14-2019
11:38 PM
CDH version 5.7.6, CM version 5.7.2.
This morning I was installing Livy on the cluster. I changed some configuration and tried to restart the whole cluster, but then the YARN service could not start. I tried many times and it always reported the same error, yet I went through all the configurations and could not find spark_shuffle anywhere.
What happened? Can anyone help me?
------ error info ------
Start a role
Role failed to start due to error com.cloudera.cmf.service.config.ConfigGenException: Conflicting yarn extensions of key [spark_shuffle].
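For reference, this is roughly how I searched for the key (typical CDH paths, adjust to your layout) and still found nothing:

  grep -r spark_shuffle /etc/hadoop/conf/
  sudo grep -rl spark_shuffle /var/run/cloudera-scm-agent/process/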
... View more
Labels:
- Apache Spark
- Apache YARN
- Cloudera Manager
03-05-2019
06:20 PM
I have done what you said. Apart from the DFS directory '/dnn/data1/dfs/dn/current', the only other data on the data node's disks is the YARN container log files, and those are only about 5 GB. I tried 'lsof | grep deleted' and found no deleted files still being held open. Restarting YARN's JobHistory alone did not help. Finally I had to restart HDFS, and then something magical happened: the moment the restart completed, Non DFS Used dropped from 54 TB to 4 TB. I'm curious how the DataNode calculates the remaining available capacity of a node. I read the source code but found no external commands such as du or df.
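In case it helps others, the per-DataNode numbers can also be watched live over JMX, without a restart (50075 is the CDH5 default DataNode web port; the bean name may vary by version):

  curl 'http://dn-host:50075/jmx?qry=Hadoop:service=DataNode,name=FSDatasetState*'
  # exposes Capacity, DfsUsed and Remaining as the DataNode itself computes them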
... View more
03-03-2019
09:58 PM
Hi, thanks first for the reply. For the disk reserved blocks, I have already done that: tune2fs -m 1 /dev/sda1 (and sdb1, sdc1, sdd1). Yet the cluster's Non DFS Used keeps growing slowly, from 44 TB to 54 TB in 7 days. How can I reduce the Non DFS Used?
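To confirm the setting took effect (sda1 shown; same for the other disks):

  tune2fs -l /dev/sda1 | grep -i 'Reserved block count'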
... View more
03-03-2019
07:25 PM
Hi, I have a problem. Our HDFS cluster capacity is 300 TB, but Non DFS Used is 50 TB. The CDH version is 5.7.2. I picked one of the DataNodes to check; it has 4 disks, and dfs.datanode.du.reserved = 10G.
------ hdfs dfsadmin -report info ------
Name: (hadoop07)
Hostname: hadoop07
Rack: /default
Decommission Status : Normal
Configured Capacity: 23585072676864 (21.45 TB)
DFS Used: 15178100988126 (13.80 TB)
Non DFS Used: 5833234295881 (5.31 TB)
DFS Remaining: 2573737392857 (2.34 TB)
DFS Used%: 64.35%
DFS Remaining%: 10.91%
Configured Cache Capacity: 4294967296 (4 GB)
Cache Used: 0 (0 B)
Cache Remaining: 4294967296 (4 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 60
---- df -h info ----
/dev/sda1  5.4T  3.5T  1.9T  65%  /dnn/data1
/dev/sdb1  5.4T  3.5T  1.9T  65%  /dnn/data2
/dev/sdc1  5.4T  3.5T  1.9T  66%  /dnn/data3
/dev/sdd1  5.4T  3.5T  1.9T  66%  /dnn/data4
The remaining disk size is 1.9 * 4 = 7.6 TB, but DFS Remaining is 2.34 TB. Why are these two values so different?
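For what it's worth, the report fields at least add up (this is just arithmetic on the numbers above; the formula is my understanding of how the report is assembled):

  # Non DFS Used = Configured Capacity - DFS Used - DFS Remaining
  # 21.45 TB - 13.80 TB - 2.34 TB = 5.31 TB, which matches the report
  hdfs dfsadmin -report | grep -E 'Configured Capacity|DFS Used:|Non DFS Used|DFS Remaining:'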
... View more
Labels:
- HDFS
11-21-2018
12:48 AM
Hi, I have done it. I would like to mention a few points (see the backup sketch after the list for point 1):
1. Back up the host's disk mount information before taking the host offline (the partition-UUID mount information is very important), and also back up the host's /etc/passwd and /etc/group files.
2. Do not check the Decommission option when deleting all roles on the host from the cluster; otherwise you have to wait for block replication and the operation becomes slow.
3. Shortly after the DN goes offline, HDFS detects the under-replicated blocks and starts re-replicating. Some configurations can be tuned to slow this down, but since I didn't want to restart the cluster I left them alone, and it did not have much impact on cluster operation.
4. It is best to set up local private CM, CDH and OS repositories. This greatly speeds up node installation.
5. After the host's operating system is reinstalled, mount the disks back to their original locations using the previously backed-up partition UUIDs, so that the data can be reused.
6. It is very important to manually re-create the backed-up users and groups on the host, keeping the UID/GID mapping identical to the original, so that HDFS can correctly recognize the permissions of the existing data.
7. After everything goes smoothly, 'hdfs fsck /' can quickly check whether the replicas are sufficient, and you can stop the earlier re-replication work in time.
My English is not good, so the wording may not be very precise; feel free to discuss any problems with me.
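A minimal sketch of the backups from point 1 (destination paths are just examples; copy them off the host before reinstalling the OS):

  blkid > /root/backup/blkid.txt             # partition UUIDs
  cp /etc/fstab /root/backup/fstab           # mount points per UUID
  cp /etc/passwd /etc/group /root/backup/    # users and groups, to restore the UID/GID mapping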
... View more
11-08-2018
05:58 PM
Hi, I'm the cluster administrator, and the CDH version is 5.7.2. I have the same trouble. Are there some parameters I can change in CM to solve this problem?
... View more
10-22-2018
06:43 PM
Thank you so much! I changed the group of '/tmp/logs' to hadoop and restarted the JobHistoryServer role, and everything is OK now. So happy!
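For anyone else landing here, the change was along these lines (assuming /tmp/logs is the YARN log-aggregation directory on HDFS):

  hdfs dfs -chgrp -R hadoop /tmp/logs
  # then restart the JobHistoryServer role in CM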
... View more