Member since
06-21-2016
6
Posts
1
Kudos Received
0
Solutions
06-21-2016
11:11 PM
Currently i am using spooldir(source) for copying the files from local file system to HDFS, but i want to copy files from remote windows system. So can some one suggest which source option can i use to copy the files from remote windows system to HDFS using flume where i can specify the username and password.
... View more
Labels:
- Labels:
-
Apache Flume
-
HDFS
06-21-2016
07:09 AM
Hi I am using flume to copy the files from spooling directory to HDFS using file as the channel.
#Component names
a1.sources = src
a1.channels = c1
a1.sinks = k1
#Source details
a1.sources.src.type = spooldir
a1.sources.src.channels = c1
a1.sources.src.spoolDir = /home/cloudera/onetrail
a1.sources.src.fileHeader = false
a1.sources.src.basenameHeader = true
# a1.sources.src.basenameHeaderKey = basename
a1.sources.src.fileSuffix = .COMPLETED
a1.sources.src.threads = 4
a1.sources.src.interceptors = newint
a1.sources.src.interceptors.newint.type = timestamp
#Sink details
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs:///data/contentProviders/cnet/%Y%m%d/
# a1.sinks.k1.hdfs.round = false
# a1.sinks.k1.hdfs.roundValue = 1
# a1.sinks.k1.hdfs.roundUnit = second
a1.sinks.k1.hdfs.writeFormat = Text
a1.sinks.k1.hdfs.fileType = DataStream
#a1.sinks.k1.hdfs.file.Type = DataStream
a1.sinks.k1.hdfs.filePrefix = %{basename}
# a1.sinks.k1.hdfs.fileSuffix = .xml
a1.sinks.k1.threadsPoolSize = 4
# use a single file at a time
a1.sinks.k1.hdfs.maxOpenFiles = 1
# rollover file based on maximum size of 10 MB
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.batchSize = 12
# Channel details
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /tmp/flume/checkpoint/
a1.channels.c1.dataDirs = /tmp/flume/data/
# Bind the source and sink to the channel
a1.sources.src.channels = c1
a1.sinks.k1.channels = c1
with the above configuration it is able to copy the files to hdfs but the problem which i am facing is one file is keep staying as .tmp and not copying the complete file content.
Can some one help me what could be the problem.
... View more
Labels:
- Labels:
-
Apache Flume
-
HDFS