Member since: 01-05-2016
Posts: 55
Kudos Received: 37
Solutions: 6
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 802 | 10-21-2019 05:16 AM
 | 3762 | 01-29-2018 07:05 AM
 | 2574 | 06-27-2017 06:42 AM
 | 37035 | 05-26-2016 04:05 AM
 | 26679 | 05-17-2016 02:15 PM
06-26-2017
12:37 PM
As I believe the problem is definitely due to differences between CDH 5.7 and CDH 5.11 in how YARN allocates resources to containers, I've tried to follow the YARN Tuning Guide again from scratch. The latest version of the YARN Tuning Guide available is apparently for CDH 5.10: https://www.cloudera.com/documentation/enterprise/5-10-x/topics/cdh_ig_yarn_tuning.html On that page an XLS sheet is available to help plan the various parameters in a correct and working fashion. No luck: I always find myself with jobs stuck in "ACCEPTED" state that never start running.

I also found this interesting page suggesting how to configure Dynamic Resource Pools for YARN: https://www.cloudera.com/documentation/enterprise/5-10-x/topics/cm_mc_resource_pools.html#concept_xkk_l1d_wr__section_c3f_vwf_4n I tried to limit the "number of concurrent jobs" to just 2 in the relevant configuration page of the Dynamic Resource Pools, but again, no success.

Can anybody please point out any new feature introduced in CDH 5.11 related to YARN resource allocation (and that I have not mentioned here)? My workflows were running smoothly before the upgrade, and now I'm facing heavy trouble! Workarounds are welcome too, as well as methods for monitoring/tracing resource usage that would let me understand which parameters I've set up in a way that no longer works in CDH 5.11. Thanks a lot for any hints or insights!
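For whoever hits the same wall: the most direct way I've found to see why applications sit in "ACCEPTED" is the ResourceManager REST API, which reports each application's queue and a diagnostics message. A minimal Python sketch (RM_HOST is a placeholder, 8088 is the default RM web port, and the requests library is an assumption of mine):

# Minimal sketch: list applications stuck in ACCEPTED via the ResourceManager REST API,
# together with their queue and diagnostics message (which often explains the wait).
# RM_HOST is a placeholder; 8088 is the default RM web port; 'requests' is assumed installed.
import requests

RM = "http://RM_HOST:8088"

resp = requests.get(RM + "/ws/v1/cluster/apps", params={"states": "ACCEPTED"}).json()
apps = (resp.get("apps") or {}).get("app", [])
for app in apps:
    print(app["id"], app["name"], app["queue"], app["state"])
    print("  diagnostics:", app.get("diagnostics", ""))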
06-21-2017
11:31 AM
Hello, after successfully upgrading a small (5-node) CDH 5.7 cluster to CDH 5.11, I am experiencing various problems with existing Oozie workflows that used to work correctly. The most significant example: I have a workflow scheduling 8 jobs in parallel (a mix of Hive, Shell and Sqoop actions). The 8 jobs are acquired and start running, but the 8 sub-jobs performing the actual actions are stuck in "ACCEPTED" status and never switch to the "RUNNING" state. After hours of work I've not been able to find anything significant in the logs, apart from a few warnings complaining about log4j. So I decided to upgrade the JDK from 1.7 to 1.8 too, but without any improvement. Any help or suggestion pointing me in the right direction would be very much appreciated! Thanks
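A quick way to check whether the cluster has any free memory/vcores at all is the ResourceManager REST API; here is a rough Python sketch (RM_HOST is a placeholder, 8088 is the default RM web port, and the requests library is an assumption of mine):

# Rough sketch: check overall cluster headroom via the ResourceManager REST API.
# RM_HOST is a placeholder; 8088 is the default RM web port; 'requests' is assumed installed.
import requests

RM = "http://RM_HOST:8088"

m = requests.get(RM + "/ws/v1/cluster/metrics").json()["clusterMetrics"]
print("apps pending/running :", m["appsPending"], "/", m["appsRunning"])
print("memory avail/total MB:", m["availableMB"], "/", m["totalMB"])
print("vcores avail/total   :", m["availableVirtualCores"], "/", m["totalVirtualCores"])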
01-03-2017
12:38 PM
Hi AdrianMonter, sorry to say I haven't found a specific solution for the Avro file format in the meantime. I've been sticking to the Parquet file format since I hit this problem, and for now it covers all my needs... Maybe in the latest CDH/Spark releases this has been fixed? Maybe somebody from @Former Member can tell us something more?
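For completeness, this is roughly what the Parquet workaround looks like on the Spark 1.x shipped with CDH 5 (the table name and path below are placeholders of mine, not from the original thread):

# Rough sketch of the Parquet workaround mentioned above (Spark 1.x style, as shipped with CDH 5).
# The table name and path are placeholders for illustration only.
from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="ParquetWorkaround")
sqlCtx = HiveContext(sc)

df = sqlCtx.table("source_table")             # read via the Hive metastore
df.write.parquet("/user/me/data_parquet")     # write as Parquet instead of Avro
back = sqlCtx.read.parquet("/user/me/data_parquet")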
10-17-2016
01:40 PM
2 Kudos
Hi aj, yes I did manage to solve it. Please take a look at the following thread and see if it can be of help. It may seem a bit unrelated to the "test.py not found" issue, but it contains detailed info about how to specify all the needed parameters to make the whole thing run smoothly: http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Oozie-workflow-Spark-action-using-simple-Dataframe-quot-Table/m-p/40834 HTH
08-15-2016
05:15 AM
Thank you. Useful insight and crystal clear argumentation, as usual from you. I have to say that in the meantime I had the chance to study a bit more, and in the end I came to a conclusion which matches your considerations, so I'm glad that apparently I moved in the right direction. As a matter of fact I've looked at the Open Source project http://opentsdb.net , and generally speaking the approach they use is the last one you explained. To provide a practical example, in my case:
- A new record every week for the same Customer Entity
- Therefore, column Versioning is NOT used at all (as you suggested)
- A "speaking" record key, e.g. "<CUST_ID> + <YYYY-MM-DD>"
- This sort of key is not monotonically increasing, because the "CUST_ID" part is "stable", so this approach should also be good from a "Table Splitting" perspective (when the table grows, it will split up evenly and all the splits will take care of a part of the future inserts, balancing the machine load evenly)
- The same set of columns for each record, containing the new sampled value of that field for that week, e.g. "<Total progressive time used Service X>"
This is the approach I used in the end, which has nothing to do with my original idea of using Versions but perfectly matches the last approach you described in your answer. Regarding the fixed values (e.g. "Name", "Surname") I've decided to replicate them every week too, as if they were time series themselves... I know, a waste of storage. I'm planning to modify this structure soon and move the fixed values into another table (Hive or HBase, I don't know yet) and pick up the information I'd eventually need at processing time (for instance, during data processing I'll join the relevant customer master data into the relevant DataFrames). I just wanted to write a few more lines about the issue for posterity. I hope this post will be useful to people 🙂 Thanks again!
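To make the final layout concrete, here's a rough sketch of a weekly insert with this schema. I'm using the happybase Python client purely for illustration (it needs the HBase Thrift server); the table and column names are made up:

# Sketch of the weekly-row schema described above, using the happybase client.
# happybase, the Thrift host, the table name and the column names are assumptions for illustration.
import happybase

conn = happybase.Connection("HBASE_THRIFT_HOST")   # HBase Thrift server
table = conn.table("customer_weekly")

# One row per customer per week: key = "<CUST_ID>+<YYYY-MM-DD>", no column versions.
row_key = b"CUST0042+2016-08-15"
table.put(row_key, {
    b"metrics:credit": b"120.50",
    b"metrics:services_subscribed": b"3",
    b"metrics:total_time_service_x": b"3600",
})

# Reading the whole time series for one customer is a simple prefix scan.
for key, data in table.scan(row_prefix=b"CUST0042+"):
    print(key, data)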
07-31-2016
08:06 AM
Hi all, I have the following design question for my new table in HBase.

Scenario:
-------------
- The table will contain Customer information
- The table will be refreshed every week by a procedure inserting new info (see below)
- The row key would be "Customer ID" (fixed)
- There would be fixed-content columns, e.g. "Name", "Surname"
- There would be variable-content columns, e.g. "Credit", "No. of Services subscribed", "Total Time used Service X"

The question:
------------------
- Should I take advantage of Column Versioning, e.g. every week putting in a new version of column (e.g.) "Total Time used Service X"? The table would then have a fixed number of columns, some of them versioned and others fixed.
- Or is it a better approach NOT to use Column Versioning, and for every new week of data coming in just add a new column named (e.g.) "Total Time used Service X - WEEK YY"? In this case I'd put the week number in the column name to be able to look it up in later analysis.

Please keep in mind that:
----------------------------------
- The main use will be to process the "variable information" columns later with a Spark procedure, so it is of CRITICAL IMPORTANCE to be able to process each and every "time series" easily, on the fly, without convoluted workarounds to manage e.g. column names and then loop through columns in weird ways (this is why at the moment I'm thinking the "Column Versioning" solution would be the best one, but my knowledge of HBase is just basic and I'd like to hear other voices too before making a mistake)
- I'm proposing that the row key would be FIXED, but I'm open to other suggestions (e.g. multiple rows with a variable key for the same Customer Entity) if that would be the best approach in the described scenario. I just didn't want to mess things up too much while describing my problem.

Any insight and/or link to examples for this particular case will be very much appreciated! Thanks
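To make the two alternatives concrete, here is a rough illustration of how the weekly put would differ in the two cases (the happybase Python client, the Thrift host and all table/column names are just placeholders of mine, not part of the actual design):

# Illustration only: how the two alternatives would look with the happybase client.
# happybase, the Thrift host and the table/column names are assumptions, not part of the question.
import happybase

conn = happybase.Connection("HBASE_THRIFT_HOST")
table = conn.table("customers")

# Option 1: fixed column, one new version per week (the column family must be set to keep enough versions).
table.put(b"CUST0042", {b"usage:total_time_service_x": b"3600"})   # week N
table.put(b"CUST0042", {b"usage:total_time_service_x": b"3720"})   # week N+1, older value kept as a version
history = table.cells(b"CUST0042", b"usage:total_time_service_x", versions=52)

# Option 2: no versioning, one new column per week, the week number encoded in the qualifier.
table.put(b"CUST0042", {b"usage:total_time_service_x_w33": b"3600"})
table.put(b"CUST0042", {b"usage:total_time_service_x_w34": b"3720"})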
Labels: Apache HBase
07-28-2016
03:11 AM
Thanks. It seems a good alternative, and as a matter of fact I was not aware of its availability in CDH 5.7. I'm marking the thread as solved, even if for now I don't know yet whether all the features I'd need are there in the native hbase-spark connector.
05-29-2016
11:39 AM
Hi all, I wanted to experiment with the "it.nerdammer.bigdata:spark-hbase-connector_2.10:1.0.3" package (you can find it at spark-packages.org). It's an interesting add-on giving RDD visibility/operability on HBase tables via Spark. If I run this extension library in a standard spark-shell (with Scala support), everything works smoothly:

spark-shell --packages it.nerdammer.bigdata:spark-hbase-connector_2.10:1.0.3 \
--conf spark.hbase.host=<HBASE_HOST>

scala> import it.nerdammer.spark.hbase._
import it.nerdammer.spark.hbase._

If I try to run it in a PySpark shell (my goal is to use the extension with Python), I'm not able to import the functions and I'm not able to use anything:

PYSPARK_DRIVER_PYTHON=ipython pyspark --packages it.nerdammer.bigdata:spark-hbase-connector_2.10:1.0.3 \
--conf spark.hbase.host=<HBASE_HOST>

In [1]: from it.nerdammer.spark.hbase import *
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-37dd5a5ffba0> in <module>()
----> 1 from it.nerdammer.spark.hbase import *
ImportError: No module named it.nerdammer.spark.hbase

I have tried different combinations of environment variables, parameters, etc. when launching PySpark, but to no avail. Maybe I'm just trying to do something deeply wrong here, or maybe it's simply that there is no Python API for this library. As a matter of fact, the examples on the package's home page are all in Scala (but they say you can install the package in PySpark too, with the classic "--packages" parameter). Can anybody help out with the "ImportError: No module named it.nerdammer.spark.hbase" error message? Thanks for any insight
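In case it helps others: one Python-only route I'm aware of, independent of the nerdammer connector (so an alternative approach rather than a fix for the import), is to read HBase through the TableInputFormat with the converter classes shipped in the Spark examples jar. A rough sketch, assuming that jar is passed with --jars and that <HBASE_HOST>/<TABLE_NAME> are replaced:

# Alternative sketch (not the nerdammer connector): read an HBase table from PySpark
# via TableInputFormat. Assumes the spark-examples jar (which provides the converters
# below) is on the classpath via --jars, and that <HBASE_HOST>/<TABLE_NAME> are replaced.
from pyspark import SparkContext

sc = SparkContext(appName="HBaseRead")

conf = {
    "hbase.zookeeper.quorum": "<HBASE_HOST>",
    "hbase.mapreduce.inputtable": "<TABLE_NAME>",
}

rdd = sc.newAPIHadoopRDD(
    "org.apache.hadoop.hbase.mapreduce.TableInputFormat",
    "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
    "org.apache.hadoop.hbase.client.Result",
    keyConverter="org.apache.spark.examples.pythonconverters.ImmutableBytesWritableToStringConverter",
    valueConverter="org.apache.spark.examples.pythonconverters.HBaseResultToStringConverter",
    conf=conf,
)

print(rdd.take(5))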
Labels: Apache HBase, Apache Spark
05-26-2016
04:05 AM
4 Kudos
Update: I got to a working solution; this is a brief howto to get to the result.

JOB MAIN BOX CONFIGURATION (CLICK THE "PENCIL" EDIT ICON ON TOP OF THE WORKFLOW MAIN SCREEN):

Spark Master: yarn-cluster
Mode: cluster
App Name: MySpark
Jars/py files: hdfs:///user/hue/oozie/workspaces/hue-oozie-1463575878.15/lib/test.py
Main Class: <WHATEVER_STRING_HERE> (E.g. "clear", or "org.apache.spark.examples.mllib.JavaALS"). We do not have a Main Class in our ".py" script!
Arguments: NO ARGUMENTS DEFINED

WORKFLOW SETTINGS (CLICK GEAR ICON ON TOP RIGHT OF THE WORKFLOW MAIN SCREEN):

Variables: oozie.use.system.libpath --> true
Workspace: hue-oozie-1463575878.15
Hadoop Properties: oozie.launcher.yarn.app.mapreduce.am.env --> SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
Show Graph Arrows: CHECKED
Version: uri.oozie.workflow.0.5
Job XML: EMPTY
SLA Configuration: UNCHECKED

JOB DETAILED CONFIGURATION (CLICK THE "PENCIL" EDIT ICON ON TOP OF THE WORKFLOW MAIN SCREEN AND THEN THE TRIANGULAR ICON ON TOP RIGHT OF THE MAIN JOB BOX TO EDIT IT IN DETAIL):

- PROPERTIES TAB:
-----------------
Options List: --files hdfs:///user/hue/oozie/workspaces/hue-oozie-1463575878.15/hive-site.xml
Prepare: NO PREPARE STEPS DEFINED
Job XML: EMPTY
Properties: NO PROPERTIES DEFINED
Retry: NO RETRY OPTIONS DEFINED
- SLA TAB:
----------
Enabled: UNCHECKED
- CREDENTIALS TAB:
------------------
Credentials: NO CREDENTIALS DEFINED
- TRANSITIONS TAB:
------------------
Ok End
Ko Kill

MANUALLY EDIT A MINIMAL "hive-site.xml" FILE TO BE PASSED TO THE SPARK-ON-HIVE CONTAINER, SO THAT THE TABLES METASTORE CAN BE ACCESSED FROM ANY NODE IN THE CLUSTER, AND UPLOAD IT TO HDFS:

vi hive-site.xml
---
<configuration>
<property>
<name>hive.metastore.uris</name>
<value>thrift://<THRIFT_HOSTNAME>:9083</value>
</property>
</configuration>
---
hdfs dfs -put hive-site.xml /user/hue/oozie/workspaces/hue-oozie-1463575878.15

EDIT THE PYSPARK SCRIPT AND UPLOAD IT INTO THE "lib" DIRECTORY IN THE WORKFLOW FOLDER:

vi test.py
---
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext
from pyspark.sql import HiveContext
from pyspark.sql.functions import *

# Build the SparkContext (master/mode are also set in the Oozie Spark action above)
sconf = SparkConf().setAppName("MySpark").set("spark.driver.memory", "1g").setMaster("yarn-cluster")
sc = SparkContext(conf=sconf)

# HiveContext finds the metastore thanks to the hive-site.xml shipped via --files
sqlCtx = HiveContext(sc)

# Read an existing Hive table and save a projection of it as a new table
xxx_DF = sqlCtx.table("table")
xxx_DF.select("fieldname").saveAsTable("new_table")
---
hdfs dfs -put test.py /user/hue/oozie/workspaces/hue-oozie-1463575878.15/lib

NOW YOU CAN SUBMIT THE WORKFLOW IN YARN:

- Click the "PLAY" Submit icon on top of the screen

ADDITIONAL INFO: AUTO-GENERATED "workflow.xml":

<workflow-app name="Spark_on_Oozie" xmlns="uri:oozie:workflow:0.5">
<global>
<configuration>
<property>
<name>oozie.launcher.yarn.app.mapreduce.am.env</name>
<value>SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark</value>
</property>
</configuration>
</global>
<start to="spark-9fa1"/>
<kill name="Kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="spark-9fa1">
<spark xmlns="uri:oozie:spark-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<master>yarn-cluster</master>
<mode>cluster</mode>
<name>MySpark</name>
<class>clear</class>
<jar>hdfs:///user/hue/oozie/workspaces/hue-oozie-1463575878.15/lib/test.py</jar>
<spark-opts>--files hdfs:///user/hue/oozie/workspaces/hue-oozie-1463575878.15/hive-site.xml</spark-opts>
</spark>
<ok to="End"/>
<error to="Kill"/>
</action>
<end name="End"/>
</workflow-app>

ADDITIONAL INFO: AUTO-GENERATED "job.properties":

oozie.use.system.libpath=true
security_enabled=False
dryrun=False
jobTracker=<JOBTRACKER_HOSTNAME>:8032
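Not part of the Hue steps above, but possibly useful: once submitted, the workflow status can also be polled programmatically via the Oozie Web Services API. A rough Python sketch (the Oozie host, the default port 11000, the requests library and the job id below are placeholders/assumptions of mine):

# Rough sketch: poll the Oozie Web Services API for the status of a submitted workflow.
# OOZIE_HOST, the default port 11000, 'requests' and the job id are placeholders/assumptions.
import requests

OOZIE = "http://OOZIE_HOST:11000/oozie"

# List recent workflow jobs (the same information Hue shows in its dashboard).
jobs = requests.get(OOZIE + "/v1/jobs", params={"jobtype": "wf", "len": 10}).json()
for wf in jobs.get("workflows", []):
    print(wf["id"], wf["appName"], wf["status"])

# Details of a single job, including the status of each action.
job_id = "0000000-000000000000000-oozie-oozi-W"   # placeholder
info = requests.get(OOZIE + "/v1/job/" + job_id, params={"show": "info"}).json()
print(info["status"])
for action in info.get("actions", []):
    print(action["name"], action["status"])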