how to import data from an external source to my hdfs ?

Hello, I would like to import data from an external data source into my HDP (Hortonworks Data Platform) cluster. So far I have found Flume, which is based on "agents", along with the following configuration:

1. Could someone explain clearly how this works?

2. Does anyone have a link to a very specific example?

Thank you. Best regards!
agent.sources = syslog    
agent.channels = memoryChannel    
agent.channels.memoryChannel.type = memory 
agent.channels.memoryChannel.capacity = 10000
    
agent.sources.syslog.type = syslogtcp    
agent.sources.syslog.port = 5140    
agent.sources.syslog.host = 127.0.0.1    
agent.sources.syslog.channels = memoryChannel

agent.sinks = HDFSEventSink    
agent.sinks.HDFSEventSink.channel = memoryChannel 
agent.sinks.HDFSEventSink.type = hdfs 
agent.sinks.HDFSEventSink.hdfs.path = hdfs://PATH_TO_YOUR_HDFS

Re: how to import data from an external source to my hdfs ?

Hi @alain TSAFACK,

From your Flume configuration I see that you are trying to get syslog data into HDFS. For this, I recommend using NiFi (HDF) instead of Flume. You can easily create a workflow with the following processors:

If you are new to NiFi, you can find information on how to install it here and here.

You can also use the NiFi Ambari Service if you want to install it with Ambari: https://github.com/abajwa-hw/ambari-nifi-service

Hope this helps

Re: how to import data from an external source to my hdfs ?

You can use Flume by configuring a syslog source and an HDFS sink. You can find the information for configuring the syslog source here and the HDFS sink here. The configuration you are using looks OK. Can you post the agent logs that you see on the console? If you do not see the agent logs on the console, try adding the following flag to your agent command:

-Dflume.root.logger=INFO,console

The agent log output will give us clues about where the problem lies.
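
On the first question, how the configuration works: a Flume agent moves events from a source into a channel, and a sink drains the channel to its destination. Below is a commented sketch of the configuration from the question. The last four `hdfs.*` lines are optional additions that are not in the original post, with illustrative values only; `hdfs://PATH_TO_YOUR_HDFS` is still a placeholder to replace with your own path.

```
# Name the components of the agent called "agent".
agent.sources = syslog
agent.channels = memoryChannel
agent.sinks = HDFSEventSink

# Source: listens for syslog messages over TCP on port 5140 and turns
# each message into a Flume event. With host = 127.0.0.1 it only
# accepts connections from the same machine.
agent.sources.syslog.type = syslogtcp
agent.sources.syslog.host = 127.0.0.1
agent.sources.syslog.port = 5140
agent.sources.syslog.channels = memoryChannel

# Channel: in-memory buffer between source and sink (up to 10000 events).
# Events held here are lost if the agent process stops.
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 10000

# Sink: takes events from the channel and writes them to HDFS.
agent.sinks.HDFSEventSink.type = hdfs
agent.sinks.HDFSEventSink.channel = memoryChannel
agent.sinks.HDFSEventSink.hdfs.path = hdfs://PATH_TO_YOUR_HDFS

# Assumptions (not in the original post): write plain text rather than
# the default SequenceFile, and roll to a new file every 300 seconds
# instead of by size or event count.
agent.sinks.HDFSEventSink.hdfs.fileType = DataStream
agent.sinks.HDFSEventSink.hdfs.rollInterval = 300
agent.sinks.HDFSEventSink.hdfs.rollSize = 0
agent.sinks.HDFSEventSink.hdfs.rollCount = 0
```

Note that a source uses `channels` (plural) while a sink uses `channel` (singular), and that the name passed to `flume-ng agent --name` must match the `agent` prefix used in the property keys.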