
I know this is a basic question, but I just started with NiFi. I want to save syslog data into HDFS. Can anyone help?

Expert Contributor

It would be helpful if someone could suggest which processors to use between ListenSyslog and PutHDFS.

1 ACCEPTED SOLUTION

Master Guru

@Hadoop User

The "which processors to use between ListenSyslog and PutHDFS" question is hard for anyone to answer without understanding the end result you are looking for.

The following processors are commonly used in this kind of flow:

- ParseSyslog: extracts fields from the syslog content into FlowFile attributes. You can then use those attributes to make routing decisions (RouteOnAttribute), or to define unique target HDFS directories based on attribute values in PutHDFS.

- SplitText or SplitContent: can be used on FlowFiles that contain more than one syslog message each. You get improved performance if ListenSyslog ingests in batches.

- UpdateAttribute: used to add your own custom attributes or manipulate existing attributes on FlowFiles.
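To make the ParseSyslog step concrete, here is a small Python sketch of the kind of parsing it performs: pulling the priority, timestamp, hostname, and message body out of an RFC 3164-style syslog line so they can be used as attributes. The regex and the field names are illustrative assumptions, not NiFi's actual implementation.

```python
import re

# Rough sketch of RFC 3164 syslog parsing: <PRI>TIMESTAMP HOSTNAME BODY.
# The named groups stand in for the FlowFile attributes a parser would set.
SYSLOG_RE = re.compile(
    r"<(?P<priority>\d{1,3})>"                               # e.g. <34>
    r"(?P<timestamp>\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"   # Oct 11 22:14:15
    r"(?P<hostname>\S+)\s"                                   # mymachine
    r"(?P<body>.*)"                                          # free-form message
)

def parse_syslog(line):
    """Return a dict of extracted fields, or None if the line does not match."""
    m = SYSLOG_RE.match(line)
    return m.groupdict() if m else None

attrs = parse_syslog(
    "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick"
)
```

Once fields like the hostname are attributes, RouteOnAttribute can branch on them, and PutHDFS can build a per-host directory from them with the Expression Language.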

Thanks,

Matt


2 REPLIES 2

Expert Contributor

Hi, you just need to add a PutHDFS processor, then right-click it to open the configuration panel. There you need to provide the HDFS configuration files (hdfs-site.xml, core-site.xml) and the directory where you want to put the data. That covers writing data to HDFS. To read data (for example, from a local file) use the TailFile processor.
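As a rough sketch, the PutHDFS settings described above might look like this in the processor's configuration panel. The file paths and target directory are example values; adjust them for your cluster.

```
# Minimal PutHDFS configuration (example values):
Hadoop Configuration Resources : /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
Directory                      : /data/syslog
Conflict Resolution Strategy   : fail
```

The Directory property also accepts NiFi Expression Language, so if earlier processors have set attributes on the FlowFile, you can route data into per-attribute subdirectories.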
