About darouwan

darouwan · ‎04-01-2018

I submitted a Spark Java program via "Spark Submit Jar" and it appears to be running well. However, when I click the logs link for that application in the Job tab in Hue, it shows "cannot access: /jobbrowser/jobs/application_****/single_logs." How can I find the logs of a running Spark application?

darouwan · ‎03-26-2018

Fixed it by restoring the Spark home setting.

darouwan · ‎03-23-2018

I am testing Spark within Zeppelin. When I run the tutorial paragraph

    %spark2.spark
    spark.version

it throws the following error:

    java.lang.NullPointerException
        at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
        at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
        at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_2(SparkInterpreter.java:391)
        at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:380)
        at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
        at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:828)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:483)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
        at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

I then disabled the Hive context as suggested in https://stackoverflow.com/questions/43289067/getting-nullpointerexception-when-running-spark-code-in-zeppelin-0-7-1 , but the same exception is still thrown. How can I solve it?

=========================================================

Update 1: I checked the Spark interpreter log and found the following error:

    requirement failed: /python/lib/pyspark.zip not found; cannot run pyspark application in YARN mode.

How can I locate this file or configure the path?
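The path in the interpreter log begins directly with /python/lib, which would be consistent with an empty Spark home prefix. The following is a minimal Java sketch of that idea, assuming the archive path is formed by appending a fixed suffix to the Spark home value. It is not Zeppelin's actual code; the class name, method name, and any install directory passed to it are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

final class PysparkZipPath {
    // Assumption: the archive location is the Spark home value plus a fixed suffix.
    static String resolve(String sparkHome) {
        String home = (sparkHome == null) ? "" : sparkHome;
        return home + "/python/lib/pyspark.zip";
    }

    public static void main(String[] args) {
        // Read SPARK_HOME from the environment; it may be unset (null).
        String candidate = resolve(System.getenv("SPARK_HOME"));
        System.out.println("Candidate path: " + candidate);
        System.out.println("Exists: " + Files.exists(Paths.get(candidate)));
    }
}
```

With an unset or empty Spark home, the sketch yields the same /python/lib/pyspark.zip path that appears in the log, so checking the Spark home value configured for the interpreter is a reasonable first step.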

darouwan · ‎03-19-2018

I am trying to install and run Apache Ranger. However, it throws the warning "Java patch PatchPasswordEncryption_J10001 is being applied by some other process" and gets stuck at this stage. I have followed the instructions at https://community.hortonworks.com/content/supportkb/148592/errorjava-patch-patchpasswordencryption-j10001-is.html , but it still does not work. Can anyone help me?

darouwan · ‎03-13-2018

Thanks, I have resolved it. I had mistaken "Service Check" for "Pre-upgrade check".

darouwan · ‎03-13-2018

When I try to upgrade HDP from 2.6.1 to 2.6.4, the pre-upgrade checks report the following error:

The following service configurations have been updated and their Service Checks should be run again: HDFS, OOZIE, ZOOKEEPER, HIVE, FLUME, KAFKA, SPARK2

Failed on: HDFS, OOZIE, ZOOKEEPER, HIVE, FLUME, KAFKA, SPARK2

darouwan · ‎03-08-2018

I am trying to read data from Kafka and save it to Parquet files on HDFS. My code is similar to the following, the difference being that I am writing it in Java.

    val df = spark
      .readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "host1:port1,host2:port2")
      .option("subscribe", "topic1")
      .load()

    df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
      .writeStream
      .format("parquet")
      .option("path", outputPath)
      .option("checkpointLocation", "/tmp/sparkcheckpoint1/")
      .outputMode("append")
      .start()
      .awaitTermination()

However, it threw a "Uri without authority: hdfs:/data/_spark_metadata" exception, where "hdfs:///data" is the output path. When I change the code to use spark.read and df.write to write the Parquet file once, there is no exception, so I guess it is not related to my HDFS config. Can anyone help me?
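The exception message points at the missing authority (the host[:port] part) in the output URI. As a hedged illustration, the Java sketch below uses only java.net.URI to show that "hdfs:///data" has no authority component and how a fully qualified location could be built instead. The class name, the helper methods, and the NameNode address "namenode.example.com:8020" are hypothetical placeholders, and whether a fully qualified path resolves the streaming error on a given cluster is an assumption to verify.

```java
import java.net.URI;

final class HdfsUriCheck {
    // True when the URI carries an authority component (host[:port]).
    static boolean hasAuthority(String location) {
        return URI.create(location).getAuthority() != null;
    }

    // Rebuilds the location with an explicit authority.
    // The nameNode argument is a hypothetical placeholder for the cluster's NameNode.
    static String withAuthority(String location, String nameNode) {
        URI uri = URI.create(location);
        return uri.getScheme() + "://" + nameNode + uri.getPath();
    }

    public static void main(String[] args) {
        String outputPath = "hdfs:///data";
        System.out.println("Has authority: " + hasAuthority(outputPath));

        // Hypothetical NameNode address; replace with the real one.
        String qualified = withAuthority(outputPath, "namenode.example.com:8020");
        System.out.println("Qualified path: " + qualified);
        System.out.println("Has authority: " + hasAuthority(qualified));
    }
}
```

The qualified string could then be passed as the "path" option of the stream writer in place of the original value.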

darouwan · ‎01-10-2018

Hi all, I ran into a problem when trying to install CDH using MySQL as my external database. After installing and configuring MySQL, the Cloudera SCM server service log shows:

Tables have unsupported engine type [MyISAM, CSV]. InnoDB is required. Table mapping: ****

I have set InnoDB as the default MySQL engine and tried to convert the existing tables to InnoDB. However, InnoDB cannot be applied to some tables, such as user, tables_priv and slow_log. The error message is:

ERROR 1579: This storage engine cannot be used for table ***

How can I resolve this? Thanks a lot!

darouwan · ‎01-10-2018

Thanks a lot! That may be a good solution for my issue!

darouwan · ‎01-10-2018

Hi, I have tried it. However, I have no permission to change the hosts file on my computer, since the computer belongs to my company and I am not an administrator. Even if I could access the web UI directly by changing the local hosts file, it would still be unavailable to other users. After all, I cannot force every user to change his or her local hosts file.

Last Visited	‎04-02-2018 12:02 AM

Member Since	‎12-21-2017 12:43 AM
Posts	67
Kudos received	3

Cloudera Community

Re: Access file on hdfs via proxy

Re: NullPointerException when running spark on Zep...

How to view the log of submitted spark jar program...

Re: NullPointerException when running spark on Zep...

NullPointerException when running spark on Zeppeli...

"Java patch PatchPasswordEncryption_J10001 is bein...

Re: The following service configurations have been...

The following service configurations have been upd...

Uri without authority: hdfs:/data/_spark_metadata ...

MySQL with InnoDB failed

Re: How to configure to access web ui (like hue) v...

Re: How to configure to access web ui (like hue) v...