Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

scala - Read files to compare the lines

Highlighted

scala - Read files to compare the lines

New Contributor

Hi,

I wanted to write hdfs file compare in scala functional programming. To start with I have written some code(googled to handle file closing and catching exceptions) to read a single file. I have proceeded so far, I was successful in reading first line but the code does not loop to read the next lines. Any help please. I do not want to use spark.

import java.io.{BufferedReader, FileInputStream, InputStreamReader}
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FSDataInputStream, FileSystem, Path}
import scala.util.{Failure, Success, Try}

object DRCompareHDFSFiles {
def main(args: Array[String]): Unit = {
val hdfs = FileSystem.get(new Configuration())
val path1 = new Path(args(0))
val path2 = new Path(args(1))
readHDFSFile(hdfs, path1, path2)
}

// Accept a parameter which implements a close method
def using[A <: { def close(): Unit }, B](resource: A)(f: A => B): B =
try {
f(resource)
} finally {
resource.close()
}

def readHDFSFile(hdfs: FileSystem, path1: Path, path2: Path): Option[Stream[(String,String)]] = {
Try(using(new BufferedReader(new InputStreamReader(hdfs.open(path1))))(readFileStream))
} match {
case Success(result) => {
I am expecting collections of string but get only string
}
case Failure(ex) => {
println(s"Could not read file $path1, detail ${ex.getClass.getName}:${ex.getMessage}")
None
}
}

def readFileStream(br: BufferedReader)= {
for {
line <- Try(br.readLine())
if (line != null )
} yield line
}

}

1 REPLY 1

Re: scala - Read files to compare the lines

New Contributor

I got it working. I used Streams. Thanks

Don't have an account?
Coming from Hortonworks? Activate your account here