Member since: 01-09-2014
Posts: 283
Kudos Received: 70
Solutions: 50
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1714 | 06-19-2019 07:50 AM |
| | 2731 | 05-01-2019 08:07 AM |
| | 2780 | 04-10-2019 08:49 AM |
| | 2697 | 03-20-2019 09:30 AM |
| | 2363 | 01-23-2019 10:58 AM |
06-21-2016
03:49 PM
1 Kudo
You have specified that all roll values are zero:

```
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 0
```

This means the latest file will never roll (since you have hdfs.maxOpenFiles=1). I'd suggest adding hdfs.idleTimeout if you want to make sure files roll after they have been ingested and written to HDFS. -pd
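A minimal sketch of that sink section with an idle timeout added; the 60-second value is just an illustration, tune it to your ingest cadence:

```
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.maxOpenFiles = 1
# Close (and therefore roll) the open file after 60 seconds with no writes
a1.sinks.k1.hdfs.idleTimeout = 60
```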
06-07-2016
04:23 PM
The Flume HTTP source is for creating a REST endpoint that an upstream sender can post data to. If you are looking for a source that consumes from SQL Server by calling the SQL Server API, you'll need to write a custom source, or possibly try this: https://github.com/keedio/flume-ng-sql-source Additionally, if you don't need real-time processing, you may want to consider using Sqoop to import the data in batches; it can also handle incremental updates. -pd
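As a rough sketch of the Sqoop alternative, here is a hypothetical incremental import from SQL Server; the host, database, table, and column names below are placeholders, not values from your setup:

```shell
# Hypothetical incremental Sqoop import from SQL Server.
# Replace sqlhost, mydb, orders, and order_id with your own values.
sqoop import \
  --connect "jdbc:sqlserver://sqlhost:1433;databaseName=mydb" \
  --username myuser \
  --password-file /user/me/sqlserver.password \
  --table orders \
  --target-dir /user/me/orders \
  --incremental append \
  --check-column order_id \
  --last-value 0
```

On subsequent runs, Sqoop records the new `--last-value` so only rows added since the previous import are pulled.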
05-31-2016
10:11 AM
Can you confirm whether you have topic auto-creation disabled:

```
auto.create.topics.enable=false
```

If so, have you created the t1 topic beforehand? Can you run the following:

```
kafka-topics --zookeeper <zkhost>:2181 --list
kafka-topics --zookeeper <zkhost>:2181 --describe --topic t1
```
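If the topic turns out to be missing, it can be created up front; this is a sketch, and the partition and replication counts are placeholders to adjust for your cluster:

```shell
# Hypothetical example: create the t1 topic explicitly when
# auto.create.topics.enable=false. Partition/replication values are placeholders.
kafka-topics --zookeeper <zkhost>:2181 --create --topic t1 \
  --partitions 3 --replication-factor 2
```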
05-25-2016
01:41 PM
1 Kudo
Please confirm: there should be no space between localhost:2181 and /kafka. The zk option should be:

```
--zookeeper localhost:2181/kafka
```

My apologies, it appears I had a typo in my original recommendation. -pd
05-25-2016
10:38 AM
This documentation goes over stopping and starting Flume when not using Cloudera Manager. It assumes you are running packages, not parcels, on this edge node: http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_flume_run.html
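For a package-based install, controlling the agent boils down to the standard service commands; this assumes the CDH `flume-ng-agent` package is installed on the edge node:

```shell
# Stop, start, and check the package-based Flume agent on an edge node
sudo service flume-ng-agent stop
sudo service flume-ng-agent start
sudo service flume-ng-agent status
```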
05-25-2016
09:13 AM
There is a regression introduced in CDH 5.7.0 that throws that stack trace (at INFO level) for every update or /admin/cores command. The workaround is to set the following in the logging safety valve for the Solr service:

```
log4j.logger.org.apache.solr.servlet.SolrDispatchFilter=WARN
```

This will be resolved in CDH 5.7.1. -pd
05-25-2016
08:11 AM
It looks like ZooKeeper didn't get initialized. Can you shut down the Solr service and run the "Initialize Solr" action from the Solr service page? If that doesn't work, then try the following from the command line:

```
solrctl init --force
```

(Note: this is a destructive command that removes all Solr configuration in ZK and reinitializes it. Do not use it if you have Solr configuration you need to keep.) -pd
05-24-2016
08:29 AM
Please define what you mean by "Trigger Flume agent". Are you referring to starting the flume agent, or being able to deliver events to hdfs?
05-19-2016
12:40 PM
Here is info on setting up a Flume service in CM: http://www.cloudera.com/documentation/enterprise/latest/topics/cm_mc_flume_service.html You can have multiple Flume services within a CM cluster; each one has its own separate configuration. -pd
05-19-2016
11:16 AM
This error:

```
org.apache.avro.AvroRuntimeException: Excessively large list allocation request detected: 150994944 items! Connection closed.
```

is usually caused when something upstream tries to send non-Avro data to the Avro source. In your source config, you are specifying the Avro source with the same port as the HDFS NameNode:

```
tier1.sources.source1.type = avro
tier1.sources.source1.bind = 192.168.4.110
tier1.sources.source1.port = 8021
tier1.sources.source1.channels = channel1
tier1.channels.channel1.type = memory
tier1.sinks.sink1.type = hdfs
tier1.sinks.sink1.channel = channel1
tier1.sinks.sink1.hdfs.path = hdfs://192.168.4.110:8021/user/hadoop/flumelogs/
```

I believe that will cause issues in your configuration, as the sink will try to connect to the Avro source port, thinking that's the NameNode port. If your NameNode port is indeed 8021, then you need to change your Avro source port to something different. -pd
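A minimal sketch of the corrected layout, assuming the NameNode really does listen on 8021; the Avro source port 4545 below is an arbitrary free port, not a required value:

```
# Avro source moved off the NameNode port (4545 is just an example)
tier1.sources.source1.type = avro
tier1.sources.source1.bind = 192.168.4.110
tier1.sources.source1.port = 4545
tier1.sources.source1.channels = channel1
tier1.channels.channel1.type = memory
tier1.sinks.sink1.type = hdfs
tier1.sinks.sink1.channel = channel1
# The HDFS path keeps the real NameNode port
tier1.sinks.sink1.hdfs.path = hdfs://192.168.4.110:8021/user/hadoop/flumelogs/
```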