About MilesYao

MilesYao · ‎07-21-2017

That's good news. But I think the requester would like to know when Cloudera plans to integrate Spark 2 into CDH itself, as Hortonworks does, rather than offering it as a separate install. Thanks, Miles

MilesYao · ‎07-12-2017

On HDP 2.6, appending $CLASSPATH seems to break the Spark2 interpreter with: "org.apache.zeppelin.interpreter.InterpreterException: Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;" Is the included Phoenix-Spark driver (phoenix-spark-4.7.0.2.6.1.0-129.jar) certified to work with Spark2? I thought it was the preferred route, rather than going through JDBC. Thanks!

MilesYao · ‎07-07-2017

I had the same problem with a valid Linux/HDFS user as the Ambari ID. The solution worked - thanks!

MilesYao · ‎05-22-2017

Debian 8 (Jessie) has been the current stable version for over a year now. When do you plan to support it? Are there any known issues blocking its adoption?

MilesYao · ‎05-22-2017

We have HDP 2.4 on Debian 7. There is no /usr/lib/python2.6/site-packages/ambari_server/os_type_check.sh installed - only os_check_type.py. And all it checks is whether the current node's OS matches the cluster's, not whether the OS version is supported. /usr/lib/ambari-server/lib/ambari_commons/resources/os_family.json seems to list the supported OS versions (e.g. RedHat 6/7, Debian 7, Ubuntu 12/14), which matches the documentation.
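For anyone who wants to check the same file on their own node, here is a minimal Python sketch that prints the versions listed per OS family. The JSON layout is an assumption on my part (each family entry carrying a "versions" list, either at the top level or under a "mapping" key), and the function names are made up for illustration, so compare against your own os_family.json before relying on it.

```python
import json


def supported_versions(os_family_doc):
    """Map each OS family name to the list of versions it declares."""
    # Assumption: families sit at the top level or under a "mapping" key.
    families = os_family_doc.get("mapping", os_family_doc)
    return {
        name: entry["versions"]
        for name, entry in families.items()
        # Skip anything that is not a family entry with a "versions" list.
        if isinstance(entry, dict) and "versions" in entry
    }


def load_supported_versions(path):
    """Read an os_family.json-style file and return its version map."""
    with open(path) as fh:
        return supported_versions(json.load(fh))
```

Calling load_supported_versions() with the os_family.json path quoted above should then give one entry per OS family, provided the layout assumption holds.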

MilesYao · ‎03-16-2017

First, thanks for the helpful, detailed explanation. We have a similar issue: migrating from the default embedded DB to a separate PostgreSQL instance. Some comments:

The documentation needs to be clearer - the criteria you listed for determining "embeddedness" are not intuitive and could not have been inferred from the documentation. Your writeup should have been included right there.

The embeddedness criteria seem over-strict. Insisting that the DB be off-cluster rests on the old 3-tier architecture assumption, whereas the Hadoop architectural principle is about co-hosting data and software. On the practical side, basing such a central component off-cluster just seems needlessly inefficient and difficult to manage. Can't the best practice be to use one dedicated node for CM, CMS, and DB? Can Cloudera provide some guidelines?

For production use, the external DB option requires too many manual steps across multiple services. Can Cloudera Manager provide more central admin and integration, including transparent migration from the embedded DB? This again requires the DB node to be part of the cluster under CM management.

Thanks, Miles Yao

MilesYao · ‎01-19-2017

Can you elaborate a bit on how to set up the environment properly in the shell wrapper before calling spark-submit? Which login should the action run as (owner/yarn/spark/oozie)? We've had a lot of problems getting the setup right when we implemented shell actions that wrap Hive queries (to process query output). spark-submit itself is a shell wrapper that does a lot of environment initialization, so I imagine it won't be smooth. Thanks! Miles

MilesYao · ‎01-04-2017

We were able to install the official parcel. The only problem we encountered was that all the signature files in the repository have the extension .sha1, while our CM (5.8.3) was expecting .sha. Manually renaming them allowed the install to complete.
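If there are several signature files, the rename can be scripted. Below is a minimal Python sketch, assuming the .sha1 files sit in a local directory you are allowed to modify; rename_sha1_to_sha is a hypothetical helper name, not anything shipped with CM.

```python
from pathlib import Path


def rename_sha1_to_sha(repo_dir):
    """Rename every *.sha1 file in repo_dir to *.sha and return the new paths."""
    renamed = []
    for sha1_file in sorted(Path(repo_dir).glob("*.sha1")):
        # Only the final ".sha1" suffix changes; the rest of the name is kept.
        target = sha1_file.with_suffix(".sha")
        if target.exists():
            continue  # do not overwrite an existing .sha file
        sha1_file.rename(target)
        renamed.append(target)
    return renamed
```

Copying instead of renaming would keep the original .sha1 files around, which may be preferable if the directory is a mirror that gets re-synced.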

MilesYao · ‎12-13-2016

Hi Cloudera folks: The new official Spark2 release looks identical to the beta version released last month. Are there any differences to expect if we already have the beta installed? Should we re-install? Thanks, Miles

MilesYao · ‎11-08-2016

Yes, that works. "CSD file" sounds like a text config file. A simple note on the instruction page that it's a JAR would have made this clear. Thanks again. Miles
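Since a JAR is a ZIP archive, a quick way to confirm what kind of file you were handed is to try opening it as one. This is a minimal Python sketch; list_jar_entries is a hypothetical helper name and the path you pass in is whatever file you downloaded.

```python
import zipfile


def list_jar_entries(path):
    """Return the entry names if path is a ZIP/JAR archive, else None."""
    if not zipfile.is_zipfile(path):
        return None  # a plain text config file ends up here
    with zipfile.ZipFile(path) as jar:
        return jar.namelist()
```

A None result means the file is not an archive at all; a list of names means it can be treated as a JAR.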

Last Visited	‎03-25-2021 12:17 PM

Member Since	‎03-04-2015 03:05 PM
Posts	96
Kudos received	10

Cloudera Community

Re: Spark 2

Re: Cannot pass value from Hive query output direc...

Re: spark 2.2 parcel availability in CDH

Re: Enable phoenix access from Zeppelin in secure ...

Re: Issue while using Hive View in Ambari console

Re: HDP 2.6/Ambari 2.5 GA!

Re: How to register host with different OS to Amba...

Re: change from embedded to external database uncl...

Re: Run Oozie Shell Action instead of Oozie Spark ...

Re: Spark 2

Spark 2 - official and beta

Re: What to do with Spark 2.0 CSD jar