Member since: 01-09-2014
Posts: 283
Kudos Received: 68
Solutions: 50

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 812 | 06-19-2019 07:50 AM |
| | 1373 | 05-01-2019 08:07 AM |
| | 1422 | 04-10-2019 08:49 AM |
| | 1030 | 03-20-2019 09:30 AM |
| | 1345 | 01-23-2019 10:58 AM |
06-19-2019
04:52 PM
2 Kudos
How many brokers have you configured? If it is fewer than 3, you need to make sure that offsets.topic.replication.factor is reduced to match. If that isn't the problem, there should be some indication in the broker logs of what the issue is. -pd
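For reference, a minimal sketch of the broker-side setting, assuming a hypothetical single-broker setup (apply it via server.properties or the equivalent Kafka safety valve in your deployment):

```
# Assumed single-broker setup: the internal __consumer_offsets topic cannot be
# created with the default replication factor of 3, so reduce it to match.
offsets.topic.replication.factor=1
```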
06-19-2019
07:50 AM
Moving them 10% at a time would be a good plan. You'll want to make sure they stay on the same filesystem, just in a different directory, so the move isn't copying data across filesystems, only relinking inodes. Going forward, it would be recommended to add a Flume channel trigger to alert you when the channel starts filling up because your downstream agent isn't accepting events. -pd
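As an illustration only, with hypothetical paths, a same-filesystem move that only relinks inodes:

```
# Both directories are assumed to be on the same mount point, so mv only
# updates directory entries (inodes) rather than copying data blocks.
df --output=target /data1/incoming /data1/flume/spool   # should print the same mount
mv /data1/incoming/batch-001* /data1/flume/spool/
```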
06-05-2019
08:35 AM
Can you please provide the reassign-partitions command and files that you are using to migrate? What version of CDK are you using? -pd
05-01-2019
08:07 AM
No, if you only have one sink, you would have one file (assuming you don't use header variable buckets). The sink will consume from all three partitions and may deliver those in one batch to one file. -pd
05-01-2019
08:06 AM
No, the hdfs.path and any variables used in it determine how many files get created in HDFS. Whether you use headers (for example a %{topic} header) in the hdfs.path or hdfs.filePrefix determines how many files get written. The sink consumes events from the channel and won't differentiate between topics. A Kafka channel can only have one topic, and the sink can only have one channel, so effectively one topic. If you use the Flume Kafka source with multiple topics, then all of those events will end up in the one channel that the sink pulls from. -pd
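A hedged sketch, with hypothetical agent/component names, of what per-topic files could look like when a Kafka source sets the 'topic' header:

```
# %{topic} is resolved from the event header added by the Flume Kafka source,
# so events from different topics land in different directories and files.
tier1.sinks.hdfssink1.type = hdfs
tier1.sinks.hdfssink1.channel = channel1
tier1.sinks.hdfssink1.hdfs.path = /data/flume/%{topic}/%Y-%m-%d
tier1.sinks.hdfssink1.hdfs.filePrefix = %{topic}
tier1.sinks.hdfssink1.hdfs.useLocalTimeStamp = true
```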
04-10-2019
08:49 AM
1 Kudo
What version of CDH are you using? In newer versions it displays a warning, but still allows you to save the changes without a source. -pd
04-09-2019
08:26 AM
Can you provide some more of the logs from the agent that should be sending but isn't? It's hard to tell whether you have any errors from such a small snippet of logs. Did you verify that you are overriding the agent name to 'a1' and 'a2' in each Flume instance's configuration? I would also recommend using the taildir source instead of the exec source; it's much more reliable, and you can use patterns to match the files that you want to send: http://flume.apache.org/releases/content/1.9.0/FlumeUserGuide.html#taildir-source -pd
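A minimal sketch of a taildir source, with hypothetical agent/component names and paths, following the linked user guide:

```
a1.sources = r1
a1.channels = c1
a1.sources.r1.type = TAILDIR
a1.sources.r1.channels = c1
a1.sources.r1.positionFile = /var/lib/flume-ng/taildir_position.json
a1.sources.r1.filegroups = fg1
# Regex-style pattern matching the files to tail; adjust to your log layout.
a1.sources.r1.filegroups.fg1 = /var/log/myapp/.*\.log
```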
03-22-2019
08:28 AM
1 Kudo
Unfortunately, we don't support either SafeNet or Elasticsearch. The recommendation would be to run Elasticsearch on nodes that are separate from the CDH cluster; then you can configure SafeNet without any concern about other service users. Since Elasticsearch provides all interaction through its API, other service users shouldn't need any access to decrypt the data that Elasticsearch is using. Alternatively, you could use Cloudera Navigator Encrypt [1] to encrypt the data at rest and Solr as your search engine, which is fully integrated into CDH. -pd [1] https://www.cloudera.com/documentation/enterprise/latest/topics/sg_navigator_encrypt.html#concept_navigator_encrypt
03-22-2019
08:16 AM
Try running your query against each replica, adding "distrib=false" to the custom parameters. Are you seeing the same numDocs on each replica in the shard? -pd
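For example (hypothetical host, collection, and replica names), querying one replica core directly:

```
# distrib=false keeps the query on this core only, so the count reflects just
# that replica rather than a distributed result.
curl "http://solr-host-1.example.com:8983/solr/collection1_shard1_replica1/select?q=*:*&rows=0&distrib=false"
```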
03-20-2019
09:30 AM
The snapshots are part of the indexes, representing a point-in-time list of the segments in the index. When you perform the backup, the metadata (information about the cluster) and the specified snapshot indicate which set of index files is to be backed up/copied to the destination HDFS directory (as specified in the <backup> section of the source solr.xml). This blog walks through the process: https://blog.cloudera.com/blog/2017/05/how-to-backup-and-disaster-recovery-for-apache-solr-part-i/

When you run --prepare-snapshot-export, it creates a copy of the metadata and a copy listing of all the files that will be copied by the distcp command to the remote cluster. Then, when you execute the snapshot export, the distcp command copies those files to the remote cluster. The -b on the restore command is just the name of the directory (represented by the snapshot name) that was created and copied by distcp. -pd
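A hedged sketch of the end-to-end flow with hypothetical collection, snapshot, and path names; the exact flags should be verified against `solrctl collection --help` and the linked blog post:

```
# 1. Create a point-in-time snapshot of the collection's index segments.
solrctl collection --create-snapshot mysnap -c mycollection
# 2. Prepare the export: writes the metadata and the copy listing used by distcp.
solrctl collection --prepare-snapshot-export mysnap -c mycollection -d /backups/mycollection
# 3. Copy the listed files to the remote (DR) cluster.
hadoop distcp hdfs://source-nn:8020/backups/mycollection hdfs://dr-nn:8020/backups/mycollection
# 4. Restore on the remote cluster; -b is the snapshot/directory name created above.
solrctl collection --restore mycollection_restored -l /backups/mycollection -b mysnap -i restore-req-1
```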
03-19-2019
04:10 PM
You are correct that there isn't a predictable or guaranteed order for the core_node names. The recommendation would be to use the Solr backup and restore functionality (which uses distcp to transfer the index files and metadata) between your source cluster and your target cluster: https://www.cloudera.com/documentation/enterprise/latest/topics/search_backup_restore.html -pd
02-27-2019
11:29 AM
That's odd that the VM is read-only. Are you making the change in CM, in the Flume logging safety valve? -pd
02-27-2019
11:26 AM
There isn't any way to run extra scripts when CM restarts Kafka, so it sounds like your Python solution may be the best option. You could also consider creating a custom CSD that would run your scripts/daemons: https://www.cloudera.com/documentation/enterprise/latest/topics/cm_mc_addon_services.html -pd
02-25-2019
11:52 AM
Have you enabled log4j DEBUG to see if there is any additional information? If you review the /data/flume/positions/tuzla2kafka-taildir_position.json file, do you see reference to the missing files there? -pd
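One quick way to inspect that position file (path taken from above) and see which files taildir is currently tracking:

```
# Pretty-print the taildir position file; each entry shows a tracked file and
# the byte offset Flume has read up to.
python -m json.tool /data/flume/positions/tuzla2kafka-taildir_position.json
```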
01-23-2019
02:16 PM
1 Kudo
Just realized, the log4j setting should go in the Flume logging safety valve, not the broker's. Also, make sure you can run a kafka-console-consumer and connect to the topic as well, just to make sure it's not something with Kafka. -pd
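A quick sanity check, with hypothetical broker and topic names:

```
# If this returns messages, the topic and brokers are fine and the problem is
# on the Flume side; --max-messages keeps the check short.
kafka-console-consumer --bootstrap-server broker1.example.com:9092 --topic mytopic --from-beginning --max-messages 10
```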
01-23-2019
10:58 AM
1 Kudo
Morphlines would be the preferred way to selectively choose the data that passes through the source to the sink. You can use the morphline removeFields command [1] to selectively drop the fields you don't want. If you need to review what is happening with the data, you can turn on morphline TRACE by adding the following to the Flume logging safety valve: log4j.logger.org.kitesdk.morphline=TRACE -pd [1] http://kitesdk.org/docs/1.1.0/morphlines/morphlines-reference-guide.html#removeFields
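A hedged sketch of a morphline using removeFields; the field names here are hypothetical and only illustrate the pattern:

```
morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**"]
    commands : [
      # Drop fields you don't want forwarded; "literal:" matches an exact field name.
      { removeFields { blacklist : ["literal:debug_info", "literal:internal_id"] } }
    ]
  }
]
```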
01-23-2019
10:49 AM
1 Kudo
What is your channel size reported as on the Flume metrics page? Is it decreasing? Flume keeps at least the two most recent log files in the Flume file channel at all times, regardless of whether it is fully drained or not. The best approach is to review the channel size on the Flume metrics page, or on the channel size charts. -pd
01-23-2019
10:47 AM
1 Kudo
The problem is usually that the Kafka consumer is not configured properly and is failing silently while it is running. You can verify whether the Flume consumer group is actually connected to partitions by running the "kafka-consumer-groups" command. You could also turn on log4j.logger.org.apache.kafka=DEBUG in the broker logging safety valve and review the messages when Flume tries to connect to Kafka. A lot of "errors" are retryable, meaning they won't throw an exception, but you won't see any output. -pd
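For example, with hypothetical broker and group names:

```
# Shows which partitions (if any) the Flume consumer group owns, plus its
# current offset and lag per partition.
kafka-consumer-groups --bootstrap-server broker1.example.com:9092 --describe --group flume
```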
01-17-2019
01:19 PM
The recommended path in this situation is to just comment out the sources line that specifies which sources are configured (e.g. # tier1.sources = kafkasource1 kafkasource2). The Flume agent can function without any sources and will then drain the channel through the sinks, without adding any new data to the channel. -pd
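A minimal sketch, with hypothetical component names, of what the drained-down configuration could look like:

```
# Sources commented out; channels and sinks stay defined so the channel drains.
# tier1.sources = kafkasource1 kafkasource2
tier1.channels = channel1
tier1.sinks = hdfssink1
tier1.sinks.hdfssink1.channel = channel1
```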
12-27-2018
01:59 PM
1 Kudo
It is possible to use it, although Kafka Connect isn't officially supported by Cloudera: https://www.cloudera.com/documentation/kafka/latest/topics/kafka_known_issues.html -pd
12-27-2018
01:50 PM
1 Kudo
You can use the Flume JMS source (http://flume.apache.org/FlumeUserGuide.html#jms-source) to consume messages off the IBM MQ queue and either use a Kafka channel or a Kafka sink to send those messages to Kafka. -pd
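A hedged sketch of a JMS source feeding a Kafka channel; the JNDI/MQ values below are placeholders and must be replaced with ones matching your IBM MQ setup:

```
a1.sources = jms1
a1.channels = kc1
a1.sources.jms1.type = jms
a1.sources.jms1.channels = kc1
# JNDI settings below are placeholders for an IBM MQ bindings-file setup.
a1.sources.jms1.initialContextFactory = com.sun.jndi.fscontext.RefFSContextFactory
a1.sources.jms1.connectionFactory = GenericConnectionFactory
a1.sources.jms1.providerURL = file:///opt/mq/jndi-bindings
a1.sources.jms1.destinationName = MY.QUEUE
a1.sources.jms1.destinationType = QUEUE
a1.channels.kc1.type = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.kc1.kafka.bootstrap.servers = broker1.example.com:9092
a1.channels.kc1.kafka.topic = mq-events
a1.channels.kc1.parseAsFlumeEvent = false
```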
12-26-2018
03:50 PM
2 Kudos
A word of caution: Flume isn't really designed for transferring large files. It would be recommended to use Oozie or an NFS gateway with cron to transfer files on a regular basis, especially if you want each file preserved in its entirety. One thing you will observe is that if Flume has any temporary transmission errors, it will attempt to resend parts of those files, which will result in duplicates (a standard and expected scenario when using Flume), so your resulting files in HDFS would contain those duplicates. Additionally, when you do have interruptions, existing HDFS files are closed and new ones are opened. -pd
11-08-2018
01:08 PM
The issue is not whether Kerberos is used, rather that the curl command expects Kerberos support to be there (since it is there by default with the standard OS distribution of curl). Since it is not there, the curl command fails, and thus the solrctl script fails. If you run the following, what is your result?

curl --version

If you are running Red Hat, can you also run the following and provide the output?

which curl
yum whatprovides curl

-pd
11-05-2018
04:18 PM
CDH6 has rebased to Solr 7. Given the large new set of features, it is included in a major release and not a minor release. If you need the functionality in Solr 7, the recommendation would be to upgrade to CDH6. -pd
11-05-2018
04:16 PM
That's your problem: you are using a version of curl that doesn't support Kerberos. You should see something like this for the curl --version command:

[root@nightly515-1 ~]# curl --version
curl 7.29.0 (x86_64-redhat-linux-gnu) libcurl/7.29.0 NSS/3.21 Basic ECC zlib/1.2.7 libidn/1.28 libssh2/1.4.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smtp smtps telnet tftp
Features: AsynchDNS GSS-Negotiate IDN IPv6 Largefile NTLM NTLM_WB SSL libz unix-sockets

It needs to support "GSS-Negotiate". It's likely you installed a custom version of curl, or updated to a version that doesn't support it. -pd
11-02-2018
12:53 PM
Does the curl command I noted return an actual web page? From the output, it is possible there is something wrong with the curl binaries that you are using... -pd
11-01-2018
09:58 AM
It looks like it's failing to contact the Solr nodes. Are you able to run this successfully from the host where the solrctl command is running?

curl -i --retry 5 -s -L -k --negotiate -u : http://ip-172-31-82-140.ec2.internal:8983/solr

-pd
11-01-2018
08:37 AM
Can you run with the --trace option and see if there's any indication of why the ZK_ENSEMBLE is not being used? -pd
08-31-2018
09:01 AM
FLUME-3027 has been backported to CDH 5.11.0 and above, so if you are able to upgrade, it would prevent the issue of offsets bouncing back and forth. One thing you may want to consider: if you are getting rebalances, it may be because your sink is taking too long to deliver before polling Kafka again. You may want to lower your sink batch size in order to deliver and ack the messages in a timely fashion. Additionally, if you upgrade to CDH 5.14 or higher, the Flume Kafka client is 0.10.2, and you would be able to set max.poll.records to match the batchSize you are using for the Flume sink. You could also increase max.poll.interval.ms, which is decoupled from session.timeout.ms in 0.10.0 and above. This would prevent the rebalancing from occurring, since the client would still heartbeat without having to do a poll to pull more records before session.timeout.ms expires. -pd
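A hedged sketch, with hypothetical component names, of how those overrides could be passed through a Flume Kafka source on CDH 5.14+ (Kafka client 0.10.2), assuming kafka.consumer.* keys are forwarded to the client:

```
# Keep the Flume batch size and max.poll.records aligned, and allow more time
# between polls before a rebalance is triggered.
tier1.sources.kafkasource1.batchSize = 500
tier1.sources.kafkasource1.kafka.consumer.max.poll.records = 500
tier1.sources.kafkasource1.kafka.consumer.max.poll.interval.ms = 600000
tier1.sinks.hdfssink1.hdfs.batchSize = 500
```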
08-30-2018
01:22 PM
You can have multiple Flume agents running on multiple hosts. If they share the same Flume-configured group.id, the messages will be distributed across all the agents (not duplicated). If you don't need to do any processing on the events (via an interceptor), you could just use a Kafka channel and an HDFS sink, which would deliver events directly from the channel. In that case you can only use one topic per channel, but you could then have an associated sink delivering just that topic. If you did want to use a Flume Kafka source, it adds a 'topic' header that specifies the topic name that the message was consumed from, and you could put that in the hdfs.path or hdfs.filePrefix as %{topic}. -pd
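A hedged sketch, with hypothetical names, of the source-less layout described above: a Kafka channel feeding an HDFS sink directly, run on several hosts with the same group.id so partitions are spread across agents:

```
tier1.channels = kc1
tier1.sinks = hdfs1
tier1.channels.kc1.type = org.apache.flume.channel.kafka.KafkaChannel
tier1.channels.kc1.kafka.bootstrap.servers = broker1.example.com:9092
tier1.channels.kc1.kafka.topic = mytopic
tier1.channels.kc1.kafka.consumer.group.id = flume-hdfs
# parseAsFlumeEvent=false because the messages are produced outside Flume.
tier1.channels.kc1.parseAsFlumeEvent = false
tier1.sinks.hdfs1.type = hdfs
tier1.sinks.hdfs1.channel = kc1
tier1.sinks.hdfs1.hdfs.path = /data/kafka/mytopic/%Y-%m-%d
tier1.sinks.hdfs1.hdfs.useLocalTimeStamp = true
```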