Member since
11-04-2016
74
Posts
16
Kudos Received
7
Solutions
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
| | 3326 | 02-28-2019 03:22 AM |
| | 2944 | 02-01-2019 01:15 AM |
| | 4168 | 04-16-2018 03:38 AM |
| | 32649 | 09-16-2017 04:36 AM |
| | 9119 | 09-11-2017 02:43 PM |
02-20-2018
01:18 PM
Apache Livy is not exactly like running Apache Spark on YARN in Zeppelin. Even in Zeppelin 0.7.3 with Livy 0.4.0, there are things you will notice you don't have:

- No multiple outputs: if you have a count, a show, and a bunch of other statements in one block, you will only see the last result in Livy.
- No ZeppelinContext in Livy.
- No dep() for users to add dependencies themselves; you have to add them manually and restart the Livy server.

I like Livy, but I had to move to the Spark interpreter because of these missing features. Also, from time to time you see an error 500, which is really hard to debug to find out what crashed your app, whereas the Spark interpreter will just show you the error itself.
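When Livy only surfaces an opaque error 500, the Livy REST API's session log endpoint is one place to dig for the underlying stack trace. A minimal sketch (the `livy-host` URL and the helper name are placeholders, not from this post):

```python
import json
from urllib.request import Request, urlopen

# Assumption: your Livy server's address -- adjust host/port for your cluster.
LIVY_URL = "http://livy-host:8998"

def session_log_request(session_id, size=100):
    """Build a GET request for Livy's session log endpoint
    (GET /sessions/{id}/log), which often contains the real error
    behind an 'error 500' shown in the notebook UI."""
    return Request(f"{LIVY_URL}/sessions/{session_id}/log?size={size}",
                   headers={"Accept": "application/json"})

# usage (against a live Livy server):
#   with urlopen(session_log_request(0)) as resp:
#       for line in json.load(resp)["log"]:
#           print(line)
```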
01-25-2018
08:40 AM
I just upgraded the entire cluster to 5.14 and the issue still remains: CDH 5.14, CM 5.14.
01-22-2018
07:48 AM
1 Kudo
Unfortunately this happened again in 5.13.1 as I did "Update Hive Metastore NameNodes" and it added the port twice.
01-22-2018
03:11 AM
If I export the current job it shows me this:

"applications" : [ {
"applicationId" : "application_1516618738289_0001",
"name" : "livy-session-0",
"startTime" : "1970-01-01T00:00:00.000Z",
"user" : "maziyar",
"pool" : "root.users.maziyar",
"state" : "RUNNING",
"progress" : 10.0,
"attributes" : { },
"mr2AppInformation" : { }
}, {
"applicationId" : "application_1516618738289_0002",
"name" : "Main",
"startTime" : "1970-01-01T00:00:00.000Z",
"user" : "maziyar",
"pool" : "root.users.maziyar",
"state" : "RUNNING",
"progress" : 10.0,
"attributes" : { },
"mr2AppInformation" : { }
} ],
"warnings" : [ ]
}

The startTime is in 1970 for some reason! This date is really famous in Unix: "January 1, 1970 is the so-called Unix epoch. It's the date where they started counting Unix time. If you get this date as a return value, it usually means that the conversion of your date to the Unix timestamp returned a (near-)zero result, so the date conversion didn't succeed." So is it the backend of Cloudera Manager that returns 0, or a MySQL conversion somewhere being passed an unsupported format?
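As a sanity check, a (near-)zero millisecond timestamp rendered as a UTC date comes out as exactly the value in the export. A minimal Python sketch of that rendering (an illustration of the epoch effect, not Cloudera Manager's actual code):

```python
from datetime import datetime, timezone

def render_start_time(start_time_ms):
    """Render a millisecond timestamp in the ISO-8601 style the export
    uses, e.g. 1970-01-01T00:00:00.000Z for a zero value."""
    dt = datetime.fromtimestamp(start_time_ms / 1000.0, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"

# A failed/missing conversion (0 ms) yields the value seen in the export:
print(render_start_time(0))              # 1970-01-01T00:00:00.000Z

# whereas the timestamp embedded in the application ID is a plausible
# real start time in January 2018:
print(render_start_time(1516618738289))
```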
01-18-2018
09:17 AM
I cleared all the logs and previous jobs, but CM still has all the finished jobs with the wrong date. Also, it still shows the new apps with that weird duration (it looks like the conversion from milliseconds to another time format went wrong). Does anybody know where this data comes from? I have a MySQL setup for my CM. Can I look there to see whether this is a front-end issue, a back-end issue, or data being inserted into a file/table wrongly from the beginning? Many thanks.
01-15-2018
11:07 AM
1 Kudo
Hello, I am having a problem for which I can't find any logical solution. Every job that requires YARN shows up in the "YARN Applications" UI in Cloudera Manager. Even though I can see all the running jobs in the YARN Applications UI, the ResourceManager UI, or the Spark UI, I have to widen my time selector to a year or two to see the finished jobs. I think this has something to do with the displayed time. All the running jobs have the static `17540.7d` as their duration. At the same time, these applications in the ResourceManager UI show up with the right date/time. As you can see, this makes it really hard to monitor and track anything in the YARN Applications view in Cloudera Manager. Cloudera Manager Express: 5.13.1, CDH: 5.13.1, Ubuntu Server 16.04. I also checked the date/time on all the machines to make sure they are in sync, but unfortunately I can't find any issue in my cluster. NOTE: there is one similar issue here, but I guess he can't see any jobs even by widening the time window (I can see jobs with a wider time window of 1-2 years): http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Completed-YARN-applications-not-visible-in-Cloudera-Manager-s/m-p/19858#M617 Best, Maziyar
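For what it's worth, `17540.7d` is consistent with the duration being computed as `now - startTime` with `startTime` stuck at 0 (the 1970 epoch): about 48 years expressed in days. A minimal Python sketch of that arithmetic (an assumption about how the UI derives the duration, not CM's actual code):

```python
from datetime import datetime, timezone

def duration_days(start_time_ms, now):
    """Duration in days the way a UI might compute it:
    (now - start) in milliseconds, divided by ms per day."""
    now_ms = now.timestamp() * 1000.0
    return (now_ms - start_time_ms) / 86_400_000.0

# With startTime = 0, a "running" job observed in early January 2018
# shows a duration of roughly 17540 days -- about 48 years:
observed = datetime(2018, 1, 9, 16, 48, tzinfo=timezone.utc)
print(round(duration_days(0, observed), 1))  # 17540.7
```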
Labels:
- Apache YARN
11-02-2017
01:57 AM
1 Kudo
I think you are missing this, which was mentioned here:

[desktop]
use_new_editor=true

Hope it helps.
09-16-2017
04:36 AM
Hi, Sorry I forgot to come back here and say how I found a quick workaround. So, here's how I do it:

// every database can have a different warehouse; I am not using the default
// warehouse, but users' directories for warehousing DBs and tables
val options = Map("path" -> "this is the path to your warehouse")

// and simply write it!
df.write.options(options).saveAsTable("db_name.table_name")

So as you can see, a simple path to the warehouse of the database solves the problem. I want to say Spark 2 is not aware of this metadata, but when you look at spark.catalog you can see everything is there! So I don't know why it can't determine the path to your database when you want to write/save. Hope this helps 🙂
09-12-2017
03:49 PM
OK, I have tried it, and it seems it's best to copy hive-site.xml into livy/conf/; Livy will then load it in every session. Best,
09-12-2017
03:31 PM
Hi, Something to be careful about: when you do "Deploy Client Configuration" on your Spark2 service, it will remove the symlink or the hive-site.xml you copied. I have noticed all these configs are in $SPARK_CONF_DIR/yarn-conf/, so I wish Livy could also load them when it starts up Spark.