Flume - question

Rising Star

Can anybody help me test Flume on the sandbox? I did the following three steps. Please help with the next steps to test it, for example using telnet to input data.

A. I installed Flume on the HDP sandbox following the Hortonworks documentation (yum install flume; yum install flume-agent).

B. I used the sample agent configuration given, shown below.

=============================

Configuration File

==============================

# example.conf: A single-node Flume configuration

# Name the components on this agent

a1.sources = r1

a1.sinks = k1

a1.channels = c1

# Describe/configure the source

a1.sources.r1.type = netcat

a1.sources.r1.bind = localhost

a1.sources.r1.port = 44444

# Describe the sink

a1.sinks.k1.type = logger

# Use a channel that buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1

C. Started the agent: flume-ng agent -n <agent-name> -f <configuration file name>

1 ACCEPTED SOLUTION

Master Mentor
@Vidya SK

Here's my tutorial for a netcat source (tested with telnet) and a logger sink, plus an HDFS sink.

# tested on HDP 2.3.2 Sandbox

# Example, single-node Flume configuration using netcat source, memory channel and logger sink

# install telnet

yum install -y telnet

# start flume with this configuration

******************************************************************************

# example.conf: A single-node Flume configuration

# Name the components on this agent

a1.sources = r1

a1.sinks = k1

a1.channels = c1

# Describe/configure the source

a1.sources.r1.type = netcat

a1.sources.r1.bind = localhost

a1.sources.r1.port = 44444

# Describe the sink

a1.sinks.k1.type = logger

# Use a channel which buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1

******************************************************************************
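The step "start flume with this configuration" can be sketched as follows. This is only an illustration: it assumes the configuration above was saved as example.conf in the current directory and that the Flume conf directory is /etc/flume/conf (neither is stated in the thread). The agent name must match the a1 prefix used in the configuration. The RUN_AGENT switch is a hypothetical guard so that the script only prints the command by default.

```shell
#!/bin/sh
# Assumptions (not from the original post): file name and conf directory.
AGENT_NAME=a1                 # must match the "a1." prefix in the config
CONF_FILE=example.conf        # where the configuration above was saved
CONF_DIR=/etc/flume/conf      # assumed Flume conf directory on the sandbox

# Build the start command from flume-ng's documented options:
#   --name (-n), --conf (-c), --conf-file (-f)
START_CMD="flume-ng agent --name $AGENT_NAME --conf $CONF_DIR --conf-file $CONF_FILE"
echo "$START_CMD"

# Start the agent only when asked to and when flume-ng is installed.
# The agent runs in the foreground, so keep this terminal open.
if [ "${RUN_AGENT:-0}" = "1" ] && command -v flume-ng >/dev/null 2>&1; then
  $START_CMD
fi
```

To see events directly in this terminal instead of the log file, the Flume User Guide adds -Dflume.root.logger=INFO,console to the same command.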

# in another terminal

telnet localhost 44444

# type anything

# then in the original terminal

tail -f /var/log/flume/flume-a1.log
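As a rough check that the typed lines actually arrived, the log can be searched for the "Event: " entries that the logger sink writes. A minimal sketch, assuming the log path from the tail command above; count_logged_events is a hypothetical helper name.

```shell
#!/bin/sh
# Count the lines in a Flume log that the logger sink wrote.
# The logger sink prefixes each logged event with "Event: ".
count_logged_events() {
  grep -c 'Event: ' "$1"
}

# Log path taken from the tail command above; override with LOG_FILE if needed.
LOG_FILE=${LOG_FILE:-/var/log/flume/flume-a1.log}
if [ -r "$LOG_FILE" ]; then
  echo "events logged so far: $(count_logged_events "$LOG_FILE")"
fi
```

If the count stays at zero after typing into telnet, check that the agent is still running and that the port matches a1.sources.r1.port.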

# Example netcat source, hdfs sink as DataStream

# create hdfs flume directory

sudo -u hdfs hdfs dfs -mkdir /flume

sudo -u hdfs hdfs dfs -mkdir /flume/events

sudo -u hdfs hdfs dfs -chown -R flume:hdfs /flume/events

******************************************************************************

# example.conf: A single-node Flume configuration

# Name the components on this agent

a1.sources = r1

a1.sinks = k1

a1.channels = c1

# Describe/configure the source

a1.sources.r1.type = netcat

a1.sources.r1.bind = localhost

a1.sources.r1.port = 44444

# Describe the sink

a1.sinks.k1.type = hdfs

a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/%S

a1.sinks.k1.hdfs.filePrefix = events-

a1.sinks.k1.hdfs.round = true

a1.sinks.k1.hdfs.roundValue = 10

a1.sinks.k1.hdfs.roundUnit = minute

a1.sinks.k1.hdfs.useLocalTimeStamp = true

a1.sinks.k1.hdfs.fileType = DataStream

# Use a channel which buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1

******************************************************************************
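With round = true, roundValue = 10 and roundUnit = minute, the timestamp is rounded down to the last 10-minute mark before the escape sequences in hdfs.path are filled in, so the %S directory is always 00. The sketch below only imitates that rounding to show which directory an event would land in. It assumes GNU date (as on the sandbox), and it uses UTC purely to keep the result reproducible, whereas the agent uses its local time; bucket_path is a hypothetical helper name.

```shell
#!/bin/sh
# Print the HDFS directory for an event with the given epoch timestamp,
# imitating hdfs.round = true, roundValue = 10, roundUnit = minute.
bucket_path() {
  epoch=$1
  rounded=$(( epoch - epoch % 600 ))   # round down to a 10-minute boundary
  # Same pattern as a1.sinks.k1.hdfs.path; UTC is an assumption for this demo.
  date -u -d "@$rounded" '+/flume/events/%y-%m-%d/%H%M/%S'
}

# 2016-02-01 12:29:45 UTC falls into the 12:20 bucket.
bucket_path 1454329785
```

This is also why the -cat command uses three wildcard levels below /flume/events before the file name: date, hour and minute, and seconds.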

# show the output in hdfs

sudo -u flume hdfs dfs -ls /flume/events/

sudo -u flume hdfs dfs -cat /flume/events/*/*/*/*

7 REPLIES


You can send events to localhost port 44444 and the agent will show them in its log. That basically shows that the Flume agent is working as expected.

Rising Star

@Deepesh

Can you give the sequence of commands step by step, please? I am a newbie.


You can find detailed steps in the Apache Flume User Guide.

Rising Star

In one terminal, I started the agent. Then I opened a new terminal in the sandbox. When I run telnet, it fails with the error "command not found".

$ telnet localhost 44444


If you are logged in as the root user, you can install the telnet package:

yum -y install telnet

Rising Star

Great. Thanks a lot Deepesh. Succeeded.
