java.lang.IllegalArgumentException: java.net.UnknownHostException: user
Labels: Apache Spark
Created on ‎05-28-2014 04:15 AM - edited ‎09-16-2022 01:59 AM
Hi there,
I have installed a Hadoop cluster together with Spark using the free edition of Cloudera Manager, keeping all default values.
Spark is working: I can start spark-shell, and on the web UI I can see the master and all the workers, but every time I try to run anything in spark-shell I get:
java.lang.IllegalArgumentException: java.net.UnknownHostException: user
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:237)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:141)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:576)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:521)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:146)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2397)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:221)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:270)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:140)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:205)
at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:205)
at org.apache.spark.rdd.FlatMappedRDD.getPartitions(FlatMappedRDD.scala:30)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:205)
at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:207)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:205)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:205)
at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:58)
at org.apache.spark.rdd.PairRDDFunctions.reduceByKey(PairRDDFunctions.scala:355)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:14)
at $iwC$$iwC$$iwC.<init>(<console>:19)
at $iwC$$iwC.<init>(<console>:21)
at $iwC.<init>(<console>:23)
at <init>(<console>:25)
at .<init>(<console>:29)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:772)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1040)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:609)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:640)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:604)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:795)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:840)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:752)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:600)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:607)
at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:610)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:935)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:883)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:883)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:883)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:981)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
Caused by: java.net.UnknownHostException: user
... 69 more
Does anybody have a clue what the issue is here?
I did find a few posts talking about the localhost name and HADOOP_CONF_DIR being defined on the workers, but nothing helped.
Help is appreciated.
regards,
DaliborJ
Created ‎05-28-2014 04:57 AM
The proximate problem is that some host name is configured as "user", which doesn't sound like a host name. I would first look through all the host-related settings in your Hadoop conf to see where "user" appears and identify if any of these should be something else.
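Not part of the original reply, but one quick way to check this from inside spark-shell is to read the Hadoop client configuration the shell picked up. This is a sketch using the standard core-site.xml keys; the example value is a placeholder:

```scala
// Loads core-site.xml / hdfs-site.xml from the driver's classpath (HADOOP_CONF_DIR).
val hadoopConf = new org.apache.hadoop.conf.Configuration()

// If either of these prints something like "hdfs://user" instead of a real host,
// that's where the bogus "user" host name is coming from.
println(hadoopConf.get("fs.defaultFS"))     // current key, e.g. hdfs://namenode.example.com:8020
println(hadoopConf.get("fs.default.name"))  // older, deprecated key still common on CDH
```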
Created ‎05-28-2014 05:01 AM
Hi srowen,
Thanks for the reply.
I'm running 5.0.0-1.cdh5.0.0.p0.47 on two clusters (prod and dev), both installed with everything left at the defaults.
I have no idea where this setting might be; have you seen any other questions like this?
Created ‎06-05-2014 07:36 AM
So the issue was:
val file = spark.textFile("hdfs://...")
where, after hdfs://, you need the full host name of the node where the Spark master is running.
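To spell that out with a hedged sketch (the host name and path below are placeholders, `sc` is assumed to be the shell's SparkContext, and strictly speaking the host in an hdfs:// URI is the HDFS NameNode host):

```scala
// "user" ends up in the authority slot of the URI, so the first action
// throws UnknownHostException: user
val broken = sc.textFile("hdfs://user/cloudera/data.txt")

// The authority names the real NameNode host (port optional), so this resolves correctly.
val fixed = sc.textFile("hdfs://namenode.example.com:8020/user/cloudera/data.txt")
fixed.count()
```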
Created ‎08-05-2014 08:54 PM
I am using the Cloudera QuickStart VM.
I am not sure what host name to provide after hdfs://.
I tried the host name I found in Cloudera Manager:
"Spark Master at spark://localhost.localdomain:7077"
After using this host name, I get a different exception:
java.io.IOException: Failed on local exception: java.io.EOFException;
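Not confirmed in the thread, but one common way to get exactly that EOFException is to reuse the spark:// master host and port in the hdfs:// URI: port 7077 speaks the Spark master protocol, not HDFS RPC. A minimal sketch under that assumption (host names, port, and path are placeholders for a QuickStart-style setup):

```scala
// Wrong: points the HDFS client at the Spark master port -> EOFException on connect.
// val bad = sc.textFile("hdfs://localhost.localdomain:7077/user/cloudera/data.txt")

// Plausible fix: point it at the NameNode instead (8020 is the usual CDH NameNode RPC port).
val ok = sc.textFile("hdfs://localhost.localdomain:8020/user/cloudera/data.txt")
ok.count()
```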
Created ‎08-05-2014 09:31 PM
One of the errors suggested the host name to use.
It's working now.
Created ‎08-03-2016 05:32 PM
Can you describe the fix in more detail? I'm hitting the same issue.
Created ‎09-07-2016 02:37 AM
I guess that they were using a path like
hdfs://user
or
//user
Spark interprets this as an HDFS path, but the server is not included in it. A fully qualified hdfs:// path must include the hostname of the server that serves the file (the NameNode), i.e.
hdfs://hostname/user
rather than
hdfs://user
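A small illustration of why "user" gets treated as a host name; this is plain URI parsing, the same rule Hadoop paths follow (the paths are made up):

```scala
import java.net.URI

// Everything between "//" and the next "/" is the authority, i.e. the NameNode host.
val bad = new URI("hdfs://user/data.txt")
println(bad.getHost)   // user       <- HDFS tries (and fails) to resolve this as a host
println(bad.getPath)   // /data.txt

val good = new URI("hdfs://namenode.example.com/user/data.txt")
println(good.getHost)  // namenode.example.com
println(good.getPath)  // /user/data.txt
```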
Created ‎09-21-2017 04:34 PM
If you are running on a node with an HDFS gateway, or in one of our training VMs, or in other situations where the host that provides access to HDFS is 'localhost', then you can use three forward slashes, as in 'hdfs:///user/'. It's a common mistake to miss the third slash.
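For example, a minimal sketch under that assumption (the path is a placeholder; with an empty authority, hdfs:/// falls back to whatever host fs.defaultFS points at):

```scala
// Three slashes = empty authority, so the NameNode host/port come from fs.defaultFS.
val lines = sc.textFile("hdfs:///user/cloudera/input.txt")
println(lines.count())
```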
Created ‎01-18-2022 01:20 AM
In my case, I observed that the Spark job was working fine on some hosts but hitting the above exception on a couple of the worker hosts.
I found the issue with spark-submit --version on those hosts: on the working hosts the spark-submit version was 2.4.7.7.1.7.0-551, while on the non-working hosts it was 3.1.2.
I created a symbolic link to the correct spark-submit version and the issue got resolved:
```
[root@host bin]# cd /usr/local/bin
[root@host bin]# ln -s /etc/alternatives/spark-submit spark-submit
```
