Member since: 01-11-2017
Posts: 18
Kudos Received: 5
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2857 | 01-11-2017 02:19 AM
02-23-2021 04:41 AM
Yes, you can download the HDFS client configuration from Cloudera Manager, but that is not always possible, for example when you work in a different department or run into a bureaucratic issue. And if you make any change to the HDFS configuration, you must download the configuration again, which is not a scalable solution in big environments. The best solution is to work on the same cluster (on a gateway host if possible), but for external Flume agents I think there is no proper, scalable solution.
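For what it's worth, the download can at least be scripted against the Cloudera Manager REST API instead of the UI; a minimal sketch, assuming the host, port, API version, credentials, and the cluster/service names (Cluster1, hdfs) match your deployment:
$ curl -u admin:admin "http://cm-host.example.com:7180/api/v19/clusters/Cluster1/services/hdfs/clientConfig" -o hdfs-clientconfig.zip
$ unzip hdfs-clientconfig.zip
As noted above, this still has to be re-run after every HDFS configuration change, so it only eases the pain rather than removing it.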
02-23-2021 04:32 AM
1 Kudo
Hi community, I have found an issue on CDH 6 with Hive log rotation. In this CDH version the Hive log4j configuration changes from rotation by size (configured by Cloudera Manager) to rotation by size and day (RFA to DRFA).
ISSUE
This change results in:
1. Cloudera Manager cannot remove old logs (so the filesystem can fill up); the "Maximum Log File Backups" property does not work properly, because the rotated log name changed from an index to a date.
2. If we produce more logs than "Max Log Size", we still get only one file per day, so the new logs overwrite that file and the older logs from the same day are lost. We end up with X logs per day but probably lose logs every day (a huge problem when resolving issues).
I am sure this is a new issue, because the Hive log4j configuration changed in this version.
SOLUTION
The solution for me was changing the "appender.DRFA.rollingPolicy.FileNamePattern" property through the Cloudera Manager safety valve for HiveServer2 ("Logging Advanced Configuration Snippet"), adding this new line:
appender.DRFA.rollingPolicy.FileNamePattern=${hive.log.dir}/${hive.log.file}.%i
After restarting HiveServer2 the rotation behaviour changes: I keep strictly "Maximum Log File Backups" files of "Max Log Size" size, preventing the filesystem from filling up.
DEBUGGING
These actions are only for debugging purposes. I recommend checking whether you are on this CDH version (6.0.1 in my case) and whether you have this issue too. You can validate the behaviour by generating more logs: change "Debug Level" to TRACE, reduce "Max Log Size" to 1 MB and "Maximum Log File Backups" to 2, and run any query on Hive. If your new logs are not rotated properly (with proper rotation Hive keeps only 2 new logs), you most probably have the same issue. In later versions Cloudera adds new properties to the Hive log4j configuration, probably to avoid this issue, but it keeps rotation by day, so we can experience issue 2 as well. I think that a rotation-by-day pattern is not the best option; by index is more effective, or even by day and hour (hour, minute and second). Marc Casajus @salimhussain @na
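To verify the fix, a quick check, assuming the default CDH log directory /var/log/hive (adjust to your hive.log.dir; the file names below are only illustrative): after restarting HiveServer2 and running a few queries, the rotated files should carry an index instead of a date and be capped at "Maximum Log File Backups":
$ ls -lh /var/log/hive/
hive-server2.log
hive-server2.log.1
hive-server2.log.2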
Labels:
- Apache Hive
10-16-2017 01:54 AM
This is not useful for remote HDFS clusters... Is it possible to use WebHDFS from Flume?
09-20-2017 05:41 AM
Hi, try running manually (as the hdfs user):
hdfs balancer -threshold 5
The HDFS balancer skips tiny blocks; check if this is your case (see JIRA HDFS-8824). Regards, Marc Casajús
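In case it helps, per-DataNode utilization can be compared before and after the run with standard HDFS commands (using sudo to become the hdfs user is an assumption; switch users however you normally do):
$ sudo -u hdfs hdfs dfsadmin -report | grep -E 'Name:|DFS Used%'
$ sudo -u hdfs hdfs balancer -threshold 5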
09-05-2017 10:36 PM
It's possible, but if you cannot upgrade to the latest version, you can try my steps to recreate it manually. Regards, Marc.
08-23-2017 04:33 AM
That command loads the environment variables that you will use in the next command:
telnet $server_host $server_port
You need to check whether the problem is a network issue or an application issue. Regards, Marc.
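For context, a short sketch of why sourcing works (config.ini is an INI file, not a shell script): the simple key=value pairs such as server_host and server_port become shell variables, while the [section] headers only produce errors that the &>/dev/null redirection hides:
$ source /etc/cloudera-scm-agent/config.ini &>/dev/null
$ echo $server_host $server_port
$ telnet $server_host $server_port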
08-22-2017 04:00 AM
Please run:
$ source /etc/cloudera-scm-agent/config.ini &>/dev/null
$ telnet $server_host $server_port
Write &> without any space between & and >. Regards, Marc.
08-22-2017 03:45 AM
Hi, do you have any snapshots enabled? I think that with "hdfs fsck /" you are not checking snapshots. Remove the snapshot with the missing blocks and the error will disappear. From Cloudera Manager you can check in Top Menu > Backup > Snapshot Policies. A BDR replica can use snapshots automatically, so you also need to check from the command line.
Check snapshottable directories:
$ hdfs lsSnapshottableDir
drwxrwx--- 0 user1 group1 0 2017-08-22 04:00 0 655432 /dir1
Remove the snapshot:
$ hdfs dfs -ls /dir1/.snapshot
drwxrwx--- 0 user1 group1 0 2017-08-22 04:00 0 655432 /dir1/.snapshot/<snapshot_id>/dir1
$ hdfs dfs -deleteSnapshot /dir1 <snapshot_id>
Check HDFS Snapshots for more information. Regards, Marc Casajus.
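As an extra check, newer HDFS releases also let fsck look inside snapshots directly; verify that the option exists on your version with hdfs fsck -help before relying on it:
$ hdfs fsck / -includeSnapshots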
08-22-2017 03:04 AM
Hi cdhhadoop, is the Cloudera agent completely down? Does it happen on more servers? Can you provide the /var/log/cloudera-scm-agent/cloudera-scm-agent.out output? Can you provide the output of the following commands?
$ netstat -ltnp | grep :9000
$ source /etc/cloudera-scm-agent/config.ini &>/dev/null
$ ping -w1 $server_host
$ telnet $server_host $server_port
Regards, Marc Casajus
08-22-2017 02:45 AM
1 Kudo
Hi tasch, /tmp/hadoop-yarn has an incorrect owner; it needs to be yarn. /var/run/cloudera-scm-agent/cgroups/cpu/hadoop-yarn needs to be created on all NodeManagers. Can you try to create these directories?
In HDFS:
hdfs dfs -mkdir /tmp/hadoop-yarn
hdfs dfs -chmod 755 /tmp/hadoop-yarn
hdfs dfs -chown yarn:supergroup /tmp/hadoop-yarn
In the operating system:
for NodeManager in $NodeManagerList
do
  ssh $NodeManager 'mkdir --mode=775 /var/run/cloudera-scm-agent/cgroups/cpu/hadoop-yarn'
  ssh $NodeManager 'chown yarn:hadoop /var/run/cloudera-scm-agent/cgroups/cpu/hadoop-yarn'
done
If you find another solution, please share it. This works for me on CDH 5.9. Regards, Marc Casajús.
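If it helps, a quick way to verify the result afterwards (paths and $NodeManager as above):
$ hdfs dfs -ls -d /tmp/hadoop-yarn
$ ssh $NodeManager 'ls -ld /var/run/cloudera-scm-agent/cgroups/cpu/hadoop-yarn'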