Created 01-13-2016 12:17 AM
Can anybody help me test Flume on the sandbox? I did the following three steps. Please help with the next steps to test this, such as using telnet to input data.
A. I installed Flume on the HDP sandbox following the Hortonworks documentation (yum install flume; yum install flume-agent).
B. I used the sample agent configuration shown below.
=============================
Configuration File
==============================
# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel that buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
C. Started the agent: flume-ng agent -n <agent-name> -f <configuration file name>
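One thing worth checking at this step: the name passed to -n has to match the prefix used in the configuration file (a1 in the sample above). If it does not, the agent typically starts without loading any source, channel, or sink. The following is a minimal pre-flight sketch, assuming bash; `agent_names` is a hypothetical helper name and example.conf is an assumed file name for the configuration shown above.

```shell
#!/usr/bin/env bash
# agent_names FILE
# Hypothetical helper: lists the agent names declared in a Flume properties
# file, i.e. the prefix of every "<agent>.sources = ..." line.
# Lines such as "a1.sources.r1.type = netcat" are ignored on purpose.
agent_names() {
  sed -n 's/^[[:space:]]*\([A-Za-z0-9_-]\{1,\}\)\.sources[[:space:]]*=.*/\1/p' "$1" | sort -u
}

# Usage (file name is an assumption, agent name comes from the sample config):
#   agent_names example.conf              # should list a1
#   flume-ng agent -n a1 -f example.conf
```

If the helper prints a name other than the one given to -n, adjust one of the two so they agree before starting the agent.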
Created 01-13-2016 05:20 AM
Here's my tutorial for a netcat source (tested with telnet) and a logger sink, and additionally an HDFS sink.
# tested on HDP 2.3.2 Sandbox
# Example, single-node Flume configuration using netcat source, memory channel and logger sink
# install telnet
yum install -y telnet
# start flume with this configuration
******************************************************************************
# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
******************************************************************************
# in another terminal
telnet localhost 44444
# type anything
# then in the original terminal
tail -f /var/log/flume/flume-a1.log
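If telnet is not installed, the same test can be scripted. The sketch below assumes bash (it relies on bash's /dev/tcp redirection) and a netcat source running with its default behaviour of acknowledging each received line; `send_event` is a hypothetical helper name, not a Flume command.

```shell
#!/usr/bin/env bash
# send_event HOST PORT MESSAGE
# Hypothetical helper: writes one line to a Flume netcat source and prints
# the first line the source sends back. Returns non-zero if the connection
# cannot be opened or no reply arrives within 5 seconds.
send_event() {
  local host=$1 port=$2 msg=$3 reply
  {
    # fd 3 is a read/write TCP connection for the duration of this block
    printf '%s\n' "$msg" >&3 && IFS= read -r -t 5 reply <&3
  } 3<>"/dev/tcp/${host}/${port}" || return 1
  printf '%s\n' "$reply"
}

# Usage, with the host and port from the configuration above:
#   send_event localhost 44444 "hello flume"
```

After a successful call, the event body should also appear in the agent log that the tail command above is following.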
# Example: netcat source, hdfs sink as DataStream
# create hdfs flume directory
sudo -u hdfs hdfs dfs -mkdir /flume
sudo -u hdfs hdfs dfs -mkdir /flume/events
sudo -u hdfs hdfs dfs -chown -R flume:hdfs /flume/events
******************************************************************************
# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/%S
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.fileType = DataStream
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
******************************************************************************
# show the output in hdfs
sudo -u flume hdfs dfs -ls /flume/events/
sudo -u flume hdfs dfs -cat /flume/events/*/*/*/*
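The four wildcards in the last command mirror the directory layout produced by hdfs.path: a date directory, an hour-minute directory, a seconds directory, and then the file itself. As a worked example of how the rounding settings shape that layout, here is a sketch, assuming bash and GNU date (as on the CentOS-based sandbox); `bucket_path` is a hypothetical helper and not part of Flume.

```shell
#!/usr/bin/env bash
# bucket_path EPOCH_SECONDS
# Hypothetical helper that mimics the sink settings above:
#   hdfs.round = true, hdfs.roundValue = 10, hdfs.roundUnit = minute
# The timestamp is rounded down to a multiple of 600 seconds and then
# formatted with the same escapes as hdfs.path, in the local time zone.
# (Rounding on epoch seconds matches wall-clock rounding as long as the
# zone's UTC offset is a multiple of 10 minutes.)
bucket_path() {
  local ts=$1
  local rounded=$(( ts - ts % 600 ))
  date -d "@${rounded}" '+/flume/events/%y-%m-%d/%H%M/%S'
}

# Example: an event stamped 2016-01-13 05:27:45 UTC
#   ( export TZ=UTC; bucket_path 1452662865 )
#   prints /flume/events/16-01-13/0520/00
```

With these settings the seconds directory is always 00, and all events received within the same 10-minute window end up under the same directory.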
Created 01-13-2016 12:24 AM
You can send events to localhost port 44444 and they will show up in the logs. That basically shows that the Flume agent is working as expected.
Created 01-13-2016 12:50 AM
Can you give the sequence of commands step by step, please? I am a newbie.
Created 01-13-2016 01:06 AM
You can find detailed steps in the Apache Flume User Guide.
Created 01-13-2016 01:20 AM
In one terminal, I started the agent. Then I opened a new terminal in the sandbox. When I run telnet, it gives the error: command not found.
$ telnet localhost 44444
Created 01-13-2016 01:28 AM
If you are logged in as the root user, you can install the telnet package:
yum -y install telnet
Created 01-13-2016 01:38 AM
Great. Thanks a lot Deepesh. Succeeded.