Support Questions

maykiwogno · ‎03-24-2017

Hi all

Do you know how I can clean up the files in this version-2 directory? It has grown to about 1.2 GB.

It contains log.* and snapshot.*

/var/opt/hosting/nifi-1.1.0/conf/state/zookeeper
-bash-4.1# du -ks version-2/
1126428 version-2/

thanks

MattWho · ‎03-24-2017

@mayki wogno

Is this directory the same size on every one of your ZooKeeper nodes? If not, you may have an issue on only one of them. You should be able to shut down that ZooKeeper node and purge all those files. The pertinent files will be rewritten from the other nodes in the ZooKeeper cluster when the node rejoins it.

ZooKeeper stores information about which node is your current cluster coordinator, which node is the primary node, and any cluster-wide state from the various processors in your dataflows.

I am assuming you are running the embedded zookeeper here. In that case the zookeeper.properties file should control the auto purge of the snapshots through the following properties:

autopurge.purgeInterval=24

autopurge.snapRetainCount=30
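
For reference, autopurge.purgeInterval is expressed in hours and autopurge.snapRetainCount is the number of most recent snapshots to keep. To confirm which values a node actually has, a minimal sketch like the one below can pull the autopurge.* entries out of the properties text. The function name and the sample text are illustrative assumptions, not part of NiFi or ZooKeeper; in practice you would pass in the contents of your own zookeeper.properties.

```python
def read_autopurge_settings(properties_text):
    """Return the autopurge.* entries found in Java-style properties text."""
    settings = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments ('#' or '!' start a comment).
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if separator and key.strip().startswith("autopurge."):
            settings[key.strip()] = value.strip()
    return settings


# Sample text using the two values quoted in this thread (illustrative only).
SAMPLE = """
# purge settings
autopurge.purgeInterval=24
autopurge.snapRetainCount=30
"""

for name, value in sorted(read_autopurge_settings(SAMPLE).items()):
    print(f"{name} = {value}")
```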

The transaction logs should be handled via routine maintenance, which is described here:

http://archive.cloudera.com/cdh4/cdh/4/zookeeper/zookeeperAdmin.html#sc_maintenance

Thanks,

Matt

maykiwogno · ‎03-24-2017

@Matt: it seems that the purge is not running correctly. Yes, I'm running the embedded ZooKeeper with the default properties.

-bash-4.1# ls -rtl | grep snapshot |wc -l
93

-bash-4.1# ls -rtl | more
total 1126420
-rw-r--r-- 1 root root 67108880 Dec 12 10:28 log.100000001
-rw-r--r-- 1 root root      979 Dec 12 14:33 snapshot.200000006
-rw-r--r-- 1 root root 67108880 Dec 12 14:34 log.200000007
-rw-r--r-- 1 root root 67108880 Dec 12 15:00 log.300000001
-rw-r--r-- 1 root root     1167 Dec 12 15:01 snapshot.400000006
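
In the listing above, each log.* file is 67108880 bytes while the snapshots are only about 1 KB, so most of the space appears to be in the transaction logs rather than the snapshots. A rough way to check this for the whole directory is to total the files by prefix. The sketch below is an assumption-laden illustration: the function name is made up, and the demo runs against a temporary directory with tiny stand-in files rather than a real version-2 directory.

```python
import os
import tempfile
from collections import defaultdict


def summarize_version2(directory):
    """Count files and total bytes per prefix: 'log', 'snapshot', or 'other'."""
    totals = defaultdict(lambda: {"files": 0, "bytes": 0})
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue
        prefix = entry.name.split(".", 1)[0]
        kind = prefix if prefix in ("log", "snapshot") else "other"
        totals[kind]["files"] += 1
        totals[kind]["bytes"] += entry.stat().st_size
    return dict(totals)


# Demo on a throwaway directory; the file names mimic the listing above,
# but the sizes are tiny placeholders.
with tempfile.TemporaryDirectory() as demo_dir:
    for name, size in [("log.100000001", 64), ("snapshot.200000006", 2)]:
        with open(os.path.join(demo_dir, name), "wb") as handle:
            handle.write(b"\0" * size)
    for kind, stats in sorted(summarize_version2(demo_dir).items()):
        print(kind, stats["files"], "file(s)", stats["bytes"], "bytes")
```

To use it on a real node, call summarize_version2 with the path to the version-2 directory.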

MattWho · ‎03-24-2017

@mayki wogno

Both ZooKeeper and NiFi can be very resource-intensive applications. The embedded ZooKeeper is fine for development, but I recommend setting up your own external ZooKeeper cluster for use in production environments. It is possible that load is affecting the ZooKeeper cleanup. You can use the linked ZooKeeper maintenance guide to clean up your ZooKeeper version-2 directory.

Snapshots are nothing more than point-in-time backups. Considering that the information NiFi stores in ZooKeeper is ever changing, I personally don't see much value in being able to restore from backup (going back to a different retained state).

Thanks,

Matt

MattWho · ‎03-24-2017

Try changing the values from their defaults to very small numbers:

autopurge.purgeInterval=1
autopurge.snapRetainCount=3

A restart of ZooKeeper (in your case, NiFi) will be needed for the changes to take effect.
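
To get a feel for what a small retain count leaves behind, here is a dry-run sketch that splits snapshot.* names into the newest N and everything older. It relies on the suffix after the dot being a hexadecimal transaction id, which is how ZooKeeper names these files. The function name is hypothetical, nothing is deleted, and the sketch does not model how ZooKeeper's own purge decides which log.* files to keep alongside the retained snapshots.

```python
def split_snapshots_by_retention(file_names, retain_count):
    """Split snapshot.* names into (kept, older) by hexadecimal zxid suffix."""
    snapshots = []
    for name in file_names:
        prefix, _, suffix = name.partition(".")
        if prefix != "snapshot":
            continue
        try:
            snapshots.append((int(suffix, 16), name))
        except ValueError:
            continue  # suffix is not a hexadecimal id, so ignore the file
    snapshots.sort(reverse=True)  # newest (highest zxid) first
    kept = [name for _, name in snapshots[:retain_count]]
    older = [name for _, name in snapshots[retain_count:]]
    return kept, older


# File names taken from the listing earlier in this thread.
names = [
    "log.100000001",
    "snapshot.200000006",
    "log.200000007",
    "log.300000001",
    "snapshot.400000006",
]
kept, older = split_snapshots_by_retention(names, retain_count=3)
print("kept:", kept)
print("older than the retained set:", older)
```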
