Send text file into HDFS using Flume in Cloudera

Hi, I want to use Flume to send a text file to HDFS. I changed the Configuration File in the Flume service in Cloudera Manager as follows:

# Sources, channels, and sinks are defined per
# agent name, in this case 'tier1'.
tier1.sources  = source1
tier1.channels = channel1
tier1.sinks    = sink1

# For each source, channel, and sink, set
# standard properties.
# source details
tier1.sources.source1.type     = spooldir
tier1.sources.source1.spoolDir = /data/diem
tier1.sources.source1.fileHeader = false
tier1.sources.source1.basenameHeader = true
tier1.sources.source1.fileSuffix  = .COMPLETED
tier1.sources.source1.thread = 4
tier1.sources.source1.interceptors = newint
tier1.sources.source1.interceptors.newint.type = timestamp
tier1.sources.source1.channels = channel1

# channel details
tier1.channels.channel1.type   = file
tier1.channels.channel1.capacity = 10000
tier1.channels.channel1.transactionCapacity = 10000
tier1.channels.channel1.write-timeout = 60
tier1.channels.channel1.checkpointDir = /data
tier1.channels.channel1.dataDirs = /data

# sink details
tier1.sinks.sink1.type         = hdfs
tier1.sinks.sink1.channel      = channel1
# HDFS sink settings carry the "hdfs." prefix
tier1.sinks.sink1.hdfs.fileType = DataStream
tier1.sinks.sink1.hdfs.path = hdfs://localhost:8020/user/cloudera/flume/events
tier1.sinks.sink1.hdfs.writeFormat  = Text
tier1.sinks.sink1.hdfs.filePrefix = %{basename}
tier1.sinks.sink1.hdfs.threadsPoolSize = 4
tier1.sinks.sink1.hdfs.idleTimeout = 60
tier1.sinks.sink1.hdfs.batchSize = 100000



I don't know how to start Flume from the terminal to send the file into HDFS. Can someone help me? Could someone also review the configuration file and correct it if there are errors?

 

 

Expert Contributor

Hi,

 

After saving the changes, you should see an icon to refresh the cluster. Clicking it runs the steps that apply the updated values. The configuration looks good.

 

Check the value under CM > Flume > Configuration > Agent; this tells you which node the tier1 agent is configured to run on.

 

You can check the logs on that node to confirm whether sink1 started. (By default, the logs are under /var/log/flume-nd.) If you do not see the data in HDFS, look at the logs; you should see a corresponding error message if there is any issue writing to HDFS.
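
The log check described above can be scripted. Here is a minimal sketch, assuming the agent writes log4j-style lines in which the level (ERROR, WARN) appears as a separate word. `flume_log_problems` is a hypothetical helper name, not part of Flume, and the log files to pass in are whatever exists on your node.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Flume): print ERROR and WARN lines,
# with line numbers, from the log files given as arguments.
# Exit status is 0 when at least one such line is found, non-zero otherwise.
flume_log_problems() {
    grep -n -w -E 'ERROR|WARN' "$@"
}
```

Call it with the agent's log files as arguments. If it prints nothing, the logs contain no ERROR or WARN entries, and the next thing to check is whether any data is reaching the source at all.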

 

Regards
Bimal


Do you mean the flume.log file in the flume-ng folder? I don't see a flume-nd folder.

Expert Contributor

Hi @AlohaDecember

 

Yes, flume-ng. I guess it was a typo in the previous comment. Please check whether there are any ERROR or suspicious messages.

 

Additionally, could you please check whether your source spool directory is receiving content to pass to Flume?
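
One way to check the spool directory is to count how many files are still waiting and how many the source has already renamed. This is a rough sketch, assuming the `.COMPLETED` suffix from the configuration in the question. `spool_status` is a hypothetical helper name, not a Flume command.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Flume): summarize a spooling directory.
# $1 = spool directory, $2 = completed-file suffix (defaults to .COMPLETED).
spool_status() {
    local dir="$1" suffix="${2:-.COMPLETED}"
    local pending=0 completed=0 f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue      # skip subdirectories and an unmatched glob
        case "$f" in
            *"$suffix") completed=$((completed + 1)) ;;
            *)          pending=$((pending + 1)) ;;
        esac
    done
    echo "pending=$pending completed=$completed"
}
```

With the configuration in the question, the call would be `spool_status /data/diem`. If both counts stay at zero, nothing is being dropped into the directory. If the pending count never goes down, the source is probably not running, so go back to the agent logs.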

 

Thanks,
Satz


Thank you very much, I solved my problem.

Community Manager

I'm happy to see you resolved your issue. Please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. 

 


 


Cy Jervis, Manager, Community Program


Yeah, I did, tks 😄