Member since: 01-24-2017
Posts: 69
Kudos Received: 2
Solutions: 0
04-28-2017
10:09 PM
Can a Spark job running under YARN write a file not to HDFS (that works fine) but to a shared file system? (We use GPFS, but I doubt it matters.) So far I could not make it work. The command that fails is:

ts.saveAsTextFile("file:///home/me/z11")

Notice that /home/me is mounted on all the nodes of the Hadoop cluster. The error that I am getting is:

============
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Mkdirs failed to create file:/home/me/z11/_temporary/0/_temporary/attempt_201704290002_0002_m_000000_15 (exists=false, cwd=file:/data/6/yarn/nm/usercache/ivy2/appcache/application_1490816225123_1660/container_e04_1490816225123_1660_01_000002)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:447)
============

The empty directory /home/me/z11/_temporary/0/ was created, but that's all.
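For now I am considering a workaround along these lines, a sketch only and assuming the result is small enough to fit on the driver (the RDD contents and output path below just mirror the example above): instead of every executor creating directories under the shared mount, collect the data back to the driver and write it there with plain Python I/O. The usual suspect for the Mkdirs failure seems to be that the YARN container user on the worker nodes cannot write under /home/me, so checking permissions on the mount from each node may also help.

============
from pyspark import SparkContext

sc = SparkContext(appName="write-to-shared-fs-sketch")
ts = sc.parallelize(["line1", "line2"])  # stand-in for the real RDD from the post

# What the post attempts: every executor (running as the container user on its
# own node) must be able to create /home/me/z11/_temporary/... for this to work.
# ts.saveAsTextFile("file:///home/me/z11")

# Workaround sketch: collect to the driver and write from the driver node only.
with open("/home/me/z11.txt", "w") as out:
    for line in ts.collect():
        out.write(line + "\n")

sc.stop()
============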
Labels: Apache Spark
04-06-2017
09:36 PM
Hi Jordan, Yes, Cloudera also recommended increasing the heap size, and since I did that a couple of weeks ago I have not seen any more crashes. It is rather surprising, though, that the default configuration causes crashes. That raises the question of how optimal, or even acceptable, the other default parameters are and how to tune them. Thank you, Igor
03-07-2017
09:04 AM
All of a sudden (I do not think it was really used), HBase shows red status. The log message says:

==============
Mar 6, 2:05:39.141 PM ERROR org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer
Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Bad version in readMessageBegin
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:223)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
at org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
=============

I have started the HBase Thrift Server and so far it appears to be running. The only recent significant change to the cluster I can think of was fully enabling TLS. Could that affect HBase?
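One quick check I can run (a sketch, assuming the default Thrift port 9090; the hostname below is a placeholder): "Bad version in readMessageBegin" is often just a client speaking a different protocol or framing than the server expects, for example TLS or HTTP bytes arriving at a plain binary Thrift listener. The probe below only shows whether the port completes a TLS handshake or only plain TCP.

============
from __future__ import print_function
import socket
import ssl

# Placeholder host; 9090 is the usual HBase Thrift port, adjust to your setup.
HOST, PORT = "hbase-thrift-host.example.com", 9090

def plain_connect():
    # Plain TCP connect: succeeds if anything is listening at all.
    socket.create_connection((HOST, PORT), timeout=5).close()

def tls_connect():
    # TLS handshake: succeeds only if the listener actually speaks TLS.
    # Requires Python 2.7.9+ or Python 3 for ssl.create_default_context.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.wrap_socket(socket.create_connection((HOST, PORT), timeout=5),
                    server_hostname=HOST).close()

for name, probe in (("plain TCP", plain_connect), ("TLS", tls_connect)):
    try:
        probe()
        print(name, "connection succeeded")
    except Exception as exc:
        print(name, "failed:", exc)
============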
Labels: Apache HBase
03-02-2017
10:26 PM
And a lot of other components have TLS settings in their Security section ... Are those mandatory, or only needed for Kerberos?
03-02-2017
10:21 PM
It looks like YARN also needs to be told about TLS? Would it work without that if TLS is fully enabled?
03-02-2017
08:37 PM
I have undone all the TLS enabling and still had the same problem. Eventually it occurred to me that some processes might be stuck, so I physically rebooted all the Hadoop machines and that resolved the problem. After that I was able to re-enable all the steps for TLS.

However, I still have a problem with pyspark and spark-submit: they run with --master=yarn --deploy-mode=client but fail with --master=yarn --deploy-mode=cluster

==============
[ivy2@md01 ~]$ pyspark --master=yarn --deploy-mode=cluster
Python 2.7.5 (default, Nov 20 2015, 02:00:19)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Error: Cluster deploy mode is not applicable to Spark shells.
Run with --help for usage help or --verbose for debug output
Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/spark/python/pyspark/shell.py", line 43, in <module>
    sc = SparkContext(pyFiles=add_files)
  File "/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/spark/python/pyspark/context.py", line 112, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/spark/python/pyspark/context.py", line 245, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/spark/python/pyspark/java_gateway.py", line 94, in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number
>>>
==========

Could TLS interfere with Spark?
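For what it is worth, the first line of that output, "Cluster deploy mode is not applicable to Spark shells", comes from Spark itself: interactive shells such as pyspark can only run in client mode, and cluster mode needs a self-contained script submitted with spark-submit. A minimal sketch (the script name and the submit command are only illustrative):

============
# Save as cluster_mode_check.py and submit with something like:
#   spark-submit --master yarn --deploy-mode cluster cluster_mode_check.py
# In cluster mode the output goes to the YARN application logs, not the terminal.
from pyspark import SparkContext

sc = SparkContext(appName="cluster-mode-check")
# Trivial job just to confirm the application starts and runs under YARN.
print(sc.parallelize(range(100)).sum())
sc.stop()
============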
03-02-2017
02:40 PM
As a backup solution, how do I disable TLS? Set use_tls=0 in /etc/cloudera-scm-agent/config.ini, undo all the TLS/SSL enables on the two web pages, then restart the server, the agents, and the Cloudera Management Services? I need to have the cluster in production within a few days.
03-02-2017
02:23 PM
Also, Pig actually seems to work with both -x local and -x mapreduce. I think I just messed up the directories the first time. But Spark is definitely a problem.
03-02-2017
02:17 PM
I think I misinterpreted it: pyspark and spark-submit crash on the cluster but not locally. As far as I understand, --master=yarn --deploy-mode=client runs the driver locally and --master=yarn --deploy-mode=cluster runs it on the cluster, and pyspark is probably trying to run on the cluster by default.
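One way to see the distinction concretely (a sketch, with a hypothetical script name): in client mode the driver's hostname is the submitting machine, while in cluster mode it is one of the YARN nodes; the executor hostnames show where the tasks themselves ran in either case.

============
# where_does_it_run.py (hypothetical name) -- submit twice, once with
# --deploy-mode client and once with --deploy-mode cluster, and compare.
from __future__ import print_function
import socket
from pyspark import SparkContext

sc = SparkContext(appName="where-does-it-run")
print("driver host:", socket.gethostname())
executor_hosts = (sc.parallelize(range(8), 4)
                    .map(lambda _: socket.gethostname())
                    .distinct()
                    .collect())
print("executor hosts:", executor_hosts)
sc.stop()
============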