
IntelliJ IDEA: "No FileSystem for scheme: null" error since upgrading to Spark 2.0



After recently upgrading to Spark 2.0, I cannot seem to get anything that relies on Spark SQL to run in IntelliJ IDEA on Windows. My code is provided below.

import org.apache.spark.sql.SparkSession

object PropertyInvestmentCalcs {
  def main(args: Array[String]): Unit = {

    val spark = SparkSession
      .builder()
      .master("local[*]") // for runs inside the IDE; may instead be set in the run configuration
      .appName("Spark PropertyInvestmentCalcs")
      .config("spark.sql.warehouse.dir", "\\\\TJVRLAPTOP\\Users\\tjoha\\Google Drive\\Programming\\IntelliJ\\PropertyInvestmentCalcs\\spark-warehouse")
      //.config("fs.hdfs.impl", classOf[org.apache.hadoop.hdfs.DistributedFileSystem].getName)
      //.config("fs.file.impl", classOf[org.apache.hadoop.fs.LocalFileSystem].getName)
      .getOrCreate()

    //val sqlContext = new org.apache.spark.sql.SparkSession(spark)

    // Get the number of data records in the table
    val nrRecordsDF ="jdbc")
      .option("url", "jdbc:mysql://localhost:3306/test")
      .option("driver", "com.mysql.jdbc.Driver")
      .option("dbtable", "(SELECT COUNT(*) AS nrRecords FROM test.propertydb) AS nrRecords_tmp")
      .option("user", "tjohannvr")
      .option("password", "[5010083]")
      .load()
    val nrRecords = nrRecordsDF.head().getLong(0)
    println("nrRecords = " + nrRecords)

    // Select data from MySQL, a specific number of records at a time (see the sketch below)
    val NrRecordsAtATime = nrRecords * 0 + 100000 // the nrRecords * 0 term keeps this a Long
    println("NrRecordsAtATime = " + NrRecordsAtATime)
  }
}



The error that occurs when nrRecordsDF is evaluated is shown below:

Exception in thread "main" No FileSystem for scheme: null

Is it perhaps unable to load the Hadoop classes, specifically those related to HDFS? Why would this be the case?
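
For what it's worth, the "scheme: null" part of the message appears to come from URI parsing rather than from missing HDFS classes: Hadoop turns spark.sql.warehouse.dir into a Path/URI, and a bare UNC path carries no scheme component. A minimal sketch illustrating this (the paths are placeholders):

import org.apache.hadoop.fs.Path

object SchemeCheck {
  def main(args: Array[String]): Unit = {
    // A bare UNC path parses to a URI with no scheme component...
    val unc = new Path("\\\\TJVRLAPTOP\\Users\\tjoha\\spark-warehouse")
    println(unc.toUri.getScheme)   // null -> "No FileSystem for scheme: null"

    // ...while an explicit file: URI carries a scheme Hadoop can resolve.
    val local = new Path("file:///C:/tmp/spark-warehouse") // placeholder path
    println(local.toUri.getScheme) // file
  }
}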

Below are the contents of my build.sbt file.

import sbt._

name := "PropertyInvestmentCalcs"

version := "0.1.0-SNAPSHOT"

organization := "TJVR"

scalaVersion := "2.11.8"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0" //% "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0" //% "provided"
libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.0.0" //% "provided"
libraryDependencies += "mysql" % "mysql-connector-java" % "5.1.39"

lazy val commonSettings = Seq(
  version := "0.1-SNAPSHOT",
  organization := "TJVR",
  scalaVersion := "2.11.8"

lazy val app = (project in file("app"))
  .settings(commonSettings: _*)
  // your settings here

artifact in (Compile, assembly) := {
  val art = (artifact in (Compile, assembly)).value
  art.copy(`classifier` = Some("assembly"))

addArtifact(artifact in (Compile, assembly), assembly)
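
An aside on the assembly setup, since the commented-out fs.hdfs.impl/fs.file.impl configs above are the usual workaround for it: when hadoop-common and hadoop-hdfs are merged into one assembly jar, their META-INF/services/org.apache.hadoop.fs.FileSystem registration files can overwrite each other, which also produces "No FileSystem for scheme" errors. This does not apply to runs inside IntelliJ, but for the assembly path a hedged sbt-assembly sketch would be:

// Sketch only: merge service registrations instead of letting one clobber the other
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", "services", xs @ _*) => MergeStrategy.filterDistinctLines
  case PathList("META-INF", xs @ _*)             => MergeStrategy.discard
  case _                                         => MergeStrategy.first
}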

Thanks in advance for any help.



Try enabling Spark debug logging and, if possible, get the complete stack trace. Does the error only happen on Windows?
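
(One way to do that from code, a minimal sketch using the spark session from the question; a file on the classpath works too:)

spark.sparkContext.setLogLevel("DEBUG") // raise verbosity on the running context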


Hi @vshukla

Thanks for the answer. I don't intend to set up a Linux or other machine to test on, as I prefer to stick with Windows.

Here is the stack trace:

16/08/04 13:08:58 INFO SharedState: Warehouse path is '\\SERVER\Users\USER\STORAGE\Programming\IntelliJ\PropertyInvestmentCalcs\spark-warehouse'.

Exception in thread "main" No FileSystem for scheme: null
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(
    at org.apache.hadoop.fs.FileSystem.createFileSystem(
    at org.apache.hadoop.fs.FileSystem.access$200(
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(
    at org.apache.hadoop.fs.FileSystem$Cache.get(
    at org.apache.hadoop.fs.FileSystem.get(
    at org.apache.hadoop.fs.Path.getFileSystem(
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.makeQualifiedPath(SessionCatalog.scala:115)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createDatabase(SessionCatalog.scala:145)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.<init>(SessionCatalog.scala:89)
    at org.apache.spark.sql.internal.SessionState.catalog$lzycompute(SessionState.scala:95)
    at org.apache.spark.sql.internal.SessionState.catalog(SessionState.scala:95)
    at org.apache.spark.sql.internal.SessionState$$anon$1.<init>(SessionState.scala:112)
    at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:112)
    at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:111)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
    at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:382)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:143)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:122)
    at PropertyInvestmentCalcs$.main(PropertyInvestmentCalcs.scala:27)
    at PropertyInvestmentCalcs.main(PropertyInvestmentCalcs.scala)

16/08/04 13:09:00 INFO SparkContext: Invoking stop() from shutdown hook

16/08/04 13:09:00 INFO SparkUI: Stopped Spark web UI at

16/08/04 13:09:00 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!

16/08/04 13:09:00 INFO MemoryStore: MemoryStore cleared

16/08/04 13:09:00 INFO BlockManager: BlockManager stopped

16/08/04 13:09:00 INFO BlockManagerMaster: BlockManagerMaster stopped

16/08/04 13:09:00 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!

16/08/04 13:09:00 INFO SparkContext: Successfully stopped SparkContext

16/08/04 13:09:00 INFO ShutdownHookManager: Shutdown hook called

16/08/04 13:09:00 INFO ShutdownHookManager: Deleting directory C:\Users\USER\AppData\Local\Temp\spark-523c95ef-a46c-4b16-88c6-3da8f6f2a801

Does that give sufficient additional information?

Thanks in advance for any help!
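
For anyone who lands on the same trace: the failing frames (SessionCatalog.makeQualifiedPath -> Path.getFileSystem) show Hadoop resolving spark.sql.warehouse.dir, and the UNC warehouse path above carries no URI scheme, hence "scheme: null". A possible workaround (a sketch, not a confirmed fix; "C:/tmp/spark-warehouse" is a placeholder local path) is to give the warehouse directory an explicit file: scheme:

import org.apache.spark.sql.SparkSession

// Sketch only: an explicit file: URI gives FileSystem resolution a scheme
val spark = SparkSession
  .builder()
  .master("local[*]")
  .appName("Spark PropertyInvestmentCalcs")
  .config("spark.sql.warehouse.dir", "file:///C:/tmp/spark-warehouse")
  .getOrCreate()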