About Raghava9

Raghava9 · 06-21-2016

Currently I am using spooldir (source) to copy files from the local file system to HDFS, but I want to copy files from a remote Windows system. Can someone suggest which source option I can use to copy files from a remote Windows system to HDFS using Flume, where I can specify the username and password?

Raghava9 · 06-21-2016

Hi, I am using Flume to copy files from a spooling directory to HDFS, with a file channel.

#Component names
a1.sources = src
a1.channels = c1
a1.sinks = k1

#Source details
a1.sources.src.type = spooldir
a1.sources.src.channels = c1
a1.sources.src.spoolDir = /home/cloudera/onetrail
a1.sources.src.fileHeader = false
a1.sources.src.basenameHeader = true
# a1.sources.src.basenameHeaderKey = basename
a1.sources.src.fileSuffix = .COMPLETED
a1.sources.src.threads = 4
a1.sources.src.interceptors = newint
a1.sources.src.interceptors.newint.type = timestamp

#Sink details
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs:///data/contentProviders/cnet/%Y%m%d/
# a1.sinks.k1.hdfs.round = false
# a1.sinks.k1.hdfs.roundValue = 1
# a1.sinks.k1.hdfs.roundUnit = second
a1.sinks.k1.hdfs.writeFormat = Text
a1.sinks.k1.hdfs.fileType = DataStream
#a1.sinks.k1.hdfs.file.Type = DataStream
a1.sinks.k1.hdfs.filePrefix = %{basename}
# a1.sinks.k1.hdfs.fileSuffix = .xml
a1.sinks.k1.threadsPoolSize = 4
# use a single file at a time
a1.sinks.k1.hdfs.maxOpenFiles = 1
# rollover file based on maximum size of 10 MB
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.batchSize = 12

# Channel details
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /tmp/flume/checkpoint/
a1.channels.c1.dataDirs = /tmp/flume/data/

# Bind the source and sink to the channel
a1.sources.src.channels = c1
a1.sinks.k1.channels = c1

With the above configuration Flume is able to copy the files to HDFS, but the problem I am facing is that one file stays as .tmp and its complete content is not copied. Can someone help me figure out what the problem could be?
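One possible direction, offered as a sketch and not a confirmed fix: with rollCount, rollInterval and rollSize all set to 0, the HDFS sink never rolls a file on its own, so a file keeps its in-use .tmp suffix until something else closes it. With maxOpenFiles = 1, opening the next file closes the previous one, which would likely leave only the most recent file as .tmp. Setting hdfs.idleTimeout lets the sink close a file after a period without new events. The 60-second value below is an assumption chosen for illustration and has not been tested against this setup.

```properties
# Sketch only: keep one output file per input file (no size/count/time rolling),
# but let the sink close a file once it has been idle.
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollInterval = 0

# Close the open .tmp file after 60 seconds without new events.
# 60 is an assumed value; tune it to how quickly files arrive.
a1.sinks.k1.hdfs.idleTimeout = 60

# A sink is bound to its channel with the singular "channel" key.
a1.sinks.k1.channel = c1
```

Two smaller points in the original configuration may also be worth checking: the sink binding at the end uses `channels` where the HDFS sink expects `channel`, and `threadsPoolSize` is written without the `hdfs.` prefix that the other HDFS sink properties use.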

Member Since	06-21-2016 07:10 AM
Last Visited	03-17-2017 12:06 AM
Posts	6
Kudos received	1

Cloudera Community
