Support Questions
Find answers, ask questions, and share your expertise

Exercise 3--Error

New Contributor

I get this error:

 

<console>:25: error: not found: value order_items
val orders = order_items.map { x => (

 

 

After  copy paste :

scala> val orders = order_items.map { x => (
| x.get("order_item_product_id"),
| (x.get("order_item_order_id"), x.get("order_item_quantity")))
| }.join(
| products.map { x => (
| x.get("product_id"),
| (x.get("product_name")))
| }
| ).map(x => (
| scala.Int.unbox(x._2._1._1), // order_id
| (
| scala.Int.unbox(x._2._1._2), // quantity
| x._2._2.toString // product_name
| )
| )).groupByKey()

 


What am i doing wrong?.

 

 

 

3 REPLIES 3

Master Collaborator
There's a previous line that starts with "val order_items = " that has
apparently not been done successfully. Can you see the output from when
that line was entered?

New Contributor

scala> val order_items = rddFromParquetHdfsFile(warehouse + "order_items"); java.io.IOException: Incomplete HDFS URI, no host: hdfs://%7B%7Bcluster_data.manager_node_hostname%7D%7D/user/hive/warehouse/order_items at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:142) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2643) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2680) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2662) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:379) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:500) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:469) at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.rddFromParquetHdfsFile(:29) at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.(:31) at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.(:36) at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.(:38) at $iwC$$iwC$$iwC$$iwC$$iwC.(:40) at $iwC$$iwC$$iwC$$iwC.(:42) at $iwC$$iwC$$iwC.(:44) at $iwC$$iwC.(:46) at $iwC.(:48) at (:50) at .(:54) at .() at .(:7) at .() at $print() at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065) at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340) at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819) at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857) at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902) at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814) at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657) at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665) at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945) at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Master Collaborator
Looks like you're pasting from a copy of the tutorial that has some blanks
you need to fill in. Cloudera Live clusters and QuickStart VMs both have a
copy of the tutorial that should fill these in for you. Specifically, it
looks like when warehouse is getting initialized, the HDFS URL does not
have the NameNode address replaced with a real value.