Member since: 08-15-2016
Posts: 22
Kudos Received: 0
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
| 1264 | 08-25-2016 11:12 PM
04-12-2017
08:05 PM
Anyone have any ideas?
04-11-2017
09:05 PM
@bgooley Here are my issue and log. The ZooKeeper service was added successfully, but the problem appeared when adding HDFS. Any idea what is going on?
04-11-2017
09:02 PM
I ran into an issue when adding the HDFS service. Here is the relevant part of my cloudera-scm-agent.log on host 'SZB-L0020042':

[12/Apr/2017 11:40:32 +0000] 1128 MainThread heartbeat_tracker INFO HB stats (seconds): num:43 LIFE_MIN:0.05 min:0.00 mean:0.05 max:0.31 LIFE_MAX:0.20
[12/Apr/2017 11:44:56 +0000] 1128 CP Server Thread-7 _cplogging INFO 192.168.0.200 - - [12/Apr/2017:11:44:56] "GET /heartbeat HTTP/1.1" 200 2 "" "NING/1.0"
[12/Apr/2017 11:44:56 +0000] 1128 MainThread util INFO Using generic audit plugin for process cluster-host-inspector
[12/Apr/2017 11:44:56 +0000] 1128 MainThread util INFO Creating metadata plugin for process cluster-host-inspector
[12/Apr/2017 11:44:56 +0000] 1128 MainThread util INFO Using specific metadata plugin for process cluster-host-inspector
[12/Apr/2017 11:44:56 +0000] 1128 MainThread util INFO Using generic metadata plugin for process cluster-host-inspector
[12/Apr/2017 11:44:56 +0000] 1128 MainThread agent WARNING Not creating a monitor for 146-cluster-host-inspector: MonitorDaemon not created yet
[12/Apr/2017 11:44:56 +0000] 1128 MainThread agent INFO First time to activate the process [146-cluster-host-inspector].
[12/Apr/2017 11:44:56 +0000] 1128 MainThread agent INFO Created /opt/cm-5.7.0/run/cloudera-scm-agent/process/146-cluster-host-inspector
[12/Apr/2017 11:44:56 +0000] 1128 MainThread agent INFO Chowning /opt/cm-5.7.0/run/cloudera-scm-agent/process/146-cluster-host-inspector to root (0) root (0)
[12/Apr/2017 11:44:56 +0000] 1128 MainThread agent INFO Chmod'ing /opt/cm-5.7.0/run/cloudera-scm-agent/process/146-cluster-host-inspector to 0751
[12/Apr/2017 11:44:56 +0000] 1128 MainThread parcel INFO prepare_environment begin: {}, [], []
[12/Apr/2017 11:44:56 +0000] 1128 MainThread parcel INFO No parcels activated for use
[12/Apr/2017 11:44:56 +0000] 1128 MainThread util INFO Extracted 1 files and 0 dirs to /opt/cm-5.7.0/run/cloudera-scm-agent/process/146-cluster-host-inspector.
[12/Apr/2017 11:44:56 +0000] 1128 MainThread agent INFO Created /opt/cm-5.7.0/run/cloudera-scm-agent/process/146-cluster-host-inspector/logs
[12/Apr/2017 11:44:56 +0000] 1128 MainThread agent INFO Chowning /opt/cm-5.7.0/run/cloudera-scm-agent/process/146-cluster-host-inspector/logs to root (0) root (0)
[12/Apr/2017 11:44:56 +0000] 1128 MainThread agent INFO Chmod'ing /opt/cm-5.7.0/run/cloudera-scm-agent/process/146-cluster-host-inspector/logs to 0751
[12/Apr/2017 11:44:56 +0000] 1128 MainThread agent INFO Triggering supervisord update.
[12/Apr/2017 11:44:56 +0000] 1128 MainThread agent INFO Begin audit plugin refresh
[12/Apr/2017 11:44:56 +0000] 1128 MainThread navigator_plugin INFO Scheduling a refresh for Audit Plugin for cluster-host-inspector with pipelines []
[12/Apr/2017 11:44:56 +0000] 1128 MainThread agent INFO Begin metadata plugin refresh
[12/Apr/2017 11:44:56 +0000] 1128 MainThread navigator_plugin INFO Scheduling a refresh for Metadata Plugin for cluster-host-inspector with pipelines []
[12/Apr/2017 11:44:56 +0000] 1128 MainThread agent INFO Daemon refresh complete.
[12/Apr/2017 11:44:58 +0000] 1128 Metadata-Plugin navigator_plugin INFO Refreshing Metadata Plugin for cluster-host-inspector with pipelines []
[12/Apr/2017 11:44:58 +0000] 1128 Metadata-Plugin navigator_plugin_pipeline INFO Stopping Navigator Plugin Pipeline '' for cluster-host-inspector (log dir: None)
[12/Apr/2017 11:44:58 +0000] 1128 Audit-Plugin navigator_plugin INFO Refreshing Audit Plugin for cluster-host-inspector with pipelines []
[12/Apr/2017 11:44:58 +0000] 1128 Audit-Plugin navigator_plugin_pipeline INFO Stopping Navigator Plugin Pipeline '' for cluster-host-inspector (log dir: None)
[12/Apr/2017 11:45:11 +0000] 1128 CP Server Thread-8 _cplogging INFO 192.168.0.200 - - [12/Apr/2017:11:45:11] "GET /process/146-cluster-host-inspector/files/inspector HTTP/1.1" 200 1978 "" "Java/1.7.0_67"
[12/Apr/2017 11:45:26 +0000] 1128 MainThread agent INFO Process with same id has changed: 146-cluster-host-inspector.
[12/Apr/2017 11:45:26 +0000] 1128 MainThread agent INFO Deactivating process 146-cluster-host-inspector
[12/Apr/2017 11:45:28 +0000] 1128 Metadata-Plugin navigator_plugin INFO stopping Metadata Plugin for cluster-host-inspector with pipelines []
[12/Apr/2017 11:45:28 +0000] 1128 Metadata-Plugin navigator_plugin_pipeline INFO Stopping Navigator Plugin Pipeline '' for cluster-host-inspector (log dir: None)
[12/Apr/2017 11:45:28 +0000] 1128 Audit-Plugin navigator_plugin INFO stopping Audit Plugin for cluster-host-inspector with pipelines []
[12/Apr/2017 11:45:28 +0000] 1128 Audit-Plugin navigator_plugin_pipeline INFO Stopping Navigator Plugin Pipeline '' for cluster-host-inspector (log dir: None)
[12/Apr/2017 11:49:26 +0000] 1128 CP Server Thread-9 _cplogging INFO 192.168.0.200 - - [12/Apr/2017:11:49:26] "GET /heartbeat HTTP/1.1" 200 2 "" "NING/1.0"
[12/Apr/2017 11:49:26 +0000] 1128 MainThread client_configs INFO Client configs to delete: set([])
[12/Apr/2017 11:49:26 +0000] 1128 MainThread client_configs INFO Client configs to deploy: set([<hadoop-conf,/etc/hadoop/conf.cloudera.hdfs>])
[12/Apr/2017 11:49:26 +0000] 1128 MainThread client_configs INFO Client configs to update: set([])
[12/Apr/2017 11:49:26 +0000] 1128 MainThread parcel INFO prepare_environment begin: {u'CDH': u'5.7.0-1.cdh5.7.0.p0.45'}, [u'cdh'], [u'cdh-plugin', u'hdfs-plugin']
[12/Apr/2017 11:49:26 +0000] 1128 MainThread parcel INFO The following requested parcels are not available: {}
[12/Apr/2017 11:49:26 +0000] 1128 MainThread parcel INFO Obtained tags ['cdh', 'impala', 'sentry', 'solr', 'spark'] for parcel CDH
[12/Apr/2017 11:49:26 +0000] 1128 MainThread parcel INFO prepare_environment end: {'CDH': '5.7.0-1.cdh5.7.0.p0.45'}
[12/Apr/2017 11:49:26 +0000] 1128 MainThread util INFO Extracted 7 files and 0 dirs to /opt/cm-5.7.0/run/cloudera-scm-agent/process/ccdeploy_hadoop-conf_etchadoopconf.cloudera.hdfs_6724255950192424543.
[12/Apr/2017 11:49:26 +0000] 1128 MainThread throttling_logger ERROR Could not load post-deployment data from /etc/hadoop/conf.cloudera.hdfs
Traceback (most recent call last):
File "/opt/cm-5.7.0/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.7.0-py2.6.egg/cmf/client_configs.py", line 276, in load
deploy_pm = open(os.path.join(conf_path, METADATA_FILENAME), "r")
IOError: [Errno 2] No such file or directory: u'/etc/hadoop/conf.cloudera.hdfs/__cloudera_metadata__'
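The traceback shows `client_configs.load` opening `__cloudera_metadata__` unconditionally and failing with Errno 2 because the deploy never finished. A minimal sketch of the same check done defensively (the path and file name come from the traceback above; the function name is my own, not Cloudera Manager's API):

```python
import os

# Marker-file name taken from the IOError in the traceback above.
METADATA_FILENAME = "__cloudera_metadata__"

def load_deploy_metadata(conf_path):
    """Return the post-deployment metadata contents, or None when the
    client config was never fully deployed (marker file missing) --
    the situation that raises IOError Errno 2 in the agent log."""
    marker = os.path.join(conf_path, METADATA_FILENAME)
    if not os.path.exists(marker):
        return None
    with open(marker, "r") as fh:
        return fh.read()

# An empty /etc/hadoop/conf.cloudera.hdfs-style directory would return
# None here instead of raising, which is the condition to look for.
```

If this returns None for `/etc/hadoop/conf.cloudera.hdfs`, the directory was created but the deploy step that writes the marker never completed, which is why the agent keeps retrying.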
[From 11:49:41 through 11:51:56 the agent repeats the same client_configs / parcel / util cycle every 15 seconds, identical to the 11:49:26 entries above (plus one periodic heartbeat_tracker stats line): "Client configs to deploy: set([<hadoop-conf,/etc/hadoop/conf.cloudera.hdfs>])", "prepare_environment begin/end" for CDH 5.7.0-1.cdh5.7.0.p0.45, and "Extracted 7 files and 0 dirs to /opt/cm-5.7.0/run/cloudera-scm-agent/process/ccdeploy_hadoop-conf_etchadoopconf.cloudera.hdfs_6724255950192424543."]
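When the agent retries like this, the log is easier to read after collapsing consecutive repeats. A small standalone helper (my own triage script, not part of Cloudera Manager) that strips the timestamp prefix and counts duplicate messages:

```python
import re

# Strips a leading "[12/Apr/2017 11:49:41 +0000] " style timestamp.
TS_PREFIX = re.compile(r"^\[[^\]]+\]\s+")

def collapse_repeats(lines):
    """Yield (count, message) pairs, merging consecutive log lines that
    are identical once the leading timestamp is removed."""
    prev, count = None, 0
    for line in lines:
        msg = TS_PREFIX.sub("", line.rstrip("\n"))
        if msg == prev:
            count += 1
        else:
            if prev is not None:
                yield count, prev
            prev, count = msg, 1
    if prev is not None:
        yield count, prev
```

Run over the block above, this reduces dozens of lines to a handful of messages with repeat counts, making the single ERROR stand out.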
04-11-2017
08:17 PM
@bgooley OK, I'll submit a new post and upload my agent log.
04-10-2017
06:21 PM
@bgooley My parcel version is CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel and I'm using CM 5.7.0. Do you see the same issue?
04-09-2017
08:58 PM
@naimishv I have almost the same problem as you. Did you find a solution? My error log is here:

(95 skipped) Could not load post-deployment data from /etc/hadoop/conf.cloudera.hdfs
Traceback (most recent call last):
File "/opt/cm-5.7.0/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.7.0-py2.6.egg/cmf/client_configs.py", line 276, in load
deploy_pm = open(os.path.join(conf_path, METADATA_FILENAME), "r")
IOError: [Errno 2] No such file or directory: u'/etc/hadoop/conf.cloudera.hdfs/__cloudera_metadata__'
03-02-2017
01:38 AM
The reason is that this --conf setting does not take effect: spark.driver.extraClassPath. How can I fix it?
02-22-2017
11:47 PM
CDH version is 5.4.7, Oozie 0.5, Spark 1.3.0.
02-22-2017
11:39 PM
The Spark batch job worked fine when submitted from the shell:

spark-submit --class com.raiyi.spark.smscount.batch.SmsStatBy3DayDrive \
--master yarn-cluster \
--num-executors 5 \
--driver-memory 3g \
--executor-memory 3g \
--executor-cores 1 \
--conf "spark.driver.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar" \
--conf "spark.executor.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar" \
--conf "spark.driver.extraJavaOptions=-XX:MaxPermSize=1024m -XX:PermSize=256m" \
--conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=1024m -XX:PermSize=256m" \
spark_demo-1.0-SNAPSHOT-shaded.jar 20170219

But when I submit the Spark batch job through Oozie, an exception occurs. Here is the log:

17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with environment: Map(CLASSPATH -> /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CLIENT_CONF_DIR<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$MR2_CLASSPATH<CPS>/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*, SPARK_LOG_URL_STDERR -> http://datanode3:8042/node/containerlogs/container_1487752257960_0334_02_000006/zhuj/stderr?start=0, SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1487752257960_0334, SPARK_YARN_CACHE_FILES_FILE_SIZES -> 98343005,176766064, SPARK_USER -> zhuj, SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PUBLIC, SPARK_YARN_MODE -> true, SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1487832656812,1487815532194, SPARK_LOG_URL_STDOUT -> http://datanode3:8042/node/containerlogs/container_1487752257960_0334_02_000006/zhuj/stdout?start=0, SPARK_YARN_CACHE_FILES -> hdfs://nameservice/user/zhuj/.sparkStaging/application_1487752257960_0334/spark-assembly.jar#__spark__.jar,hdfs://nameservice/user/zhuj/batchjars/spark_demo-1.0-SNAPSHOT-shaded.jar#__app__.jar)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with environment: Map(CLASSPATH -> /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CLIENT_CONF_DIR<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$MR2_CLASSPATH<CPS>/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*, SPARK_LOG_URL_STDERR -> http://namenode:8042/node/containerlogs/container_1487752257960_0334_02_000003/zhuj/stderr?start=0, SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1487752257960_0334, SPARK_YARN_CACHE_FILES_FILE_SIZES -> 98343005,176766064, SPARK_USER -> zhuj, SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PUBLIC, SPARK_YARN_MODE -> true, SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1487832656812,1487815532194, SPARK_LOG_URL_STDOUT -> http://namenode:8042/node/containerlogs/container_1487752257960_0334_02_000003/zhuj/stdout?start=0, SPARK_YARN_CACHE_FILES -> hdfs://nameservice/user/zhuj/.sparkStaging/application_1487752257960_0334/spark-assembly.jar#__spark__.jar,hdfs://nameservice/user/zhuj/batchjars/spark_demo-1.0-SNAPSHOT-shaded.jar#__app__.jar)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with environment: Map(CLASSPATH -> /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CLIENT_CONF_DIR<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$MR2_CLASSPATH<CPS>/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*, SPARK_LOG_URL_STDERR -> http://datanode:8042/node/containerlogs/container_1487752257960_0334_02_000004/zhuj/stderr?start=0, SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1487752257960_0334, SPARK_YARN_CACHE_FILES_FILE_SIZES -> 98343005,176766064, SPARK_USER -> zhuj, SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PUBLIC, SPARK_YARN_MODE -> true, SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1487832656812,1487815532194, SPARK_LOG_URL_STDOUT -> http://datanode:8042/node/containerlogs/container_1487752257960_0334_02_000004/zhuj/stdout?start=0, SPARK_YARN_CACHE_FILES -> hdfs://nameservice/user/zhuj/.sparkStaging/application_1487752257960_0334/spark-assembly.jar#__spark__.jar,hdfs://nameservice/user/zhuj/batchjars/spark_demo-1.0-SNAPSHOT-shaded.jar#__app__.jar)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with environment: Map(CLASSPATH -> /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CLIENT_CONF_DIR<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$MR2_CLASSPATH<CPS>/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*, SPARK_LOG_URL_STDERR -> http://datanode2:8042/node/containerlogs/container_1487752257960_0334_02_000002/zhuj/stderr?start=0, SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1487752257960_0334, SPARK_YARN_CACHE_FILES_FILE_SIZES -> 98343005,176766064, SPARK_USER -> zhuj, SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PUBLIC, SPARK_YARN_MODE -> true, SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1487832656812,1487815532194, SPARK_LOG_URL_STDOUT -> http://datanode2:8042/node/containerlogs/container_1487752257960_0334_02_000002/zhuj/stdout?start=0, SPARK_YARN_CACHE_FILES -> hdfs://nameservice/user/zhuj/.sparkStaging/application_1487752257960_0334/spark-assembly.jar#__spark__.jar,hdfs://nameservice/user/zhuj/batchjars/spark_demo-1.0-SNAPSHOT-shaded.jar#__app__.jar)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with environment: Map(CLASSPATH -> /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CLIENT_CONF_DIR<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$MR2_CLASSPATH<CPS>/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*, SPARK_LOG_URL_STDERR -> http://datanode0:8042/node/containerlogs/container_1487752257960_0334_02_000005/zhuj/stderr?start=0, SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1487752257960_0334, SPARK_YARN_CACHE_FILES_FILE_SIZES -> 98343005,176766064, SPARK_USER -> zhuj, SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PUBLIC, SPARK_YARN_MODE -> true, SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1487832656812,1487815532194, SPARK_LOG_URL_STDOUT -> http://datanode0:8042/node/containerlogs/container_1487752257960_0334_02_000005/zhuj/stdout?start=0, SPARK_YARN_CACHE_FILES -> hdfs://nameservice/user/zhuj/.sparkStaging/application_1487752257960_0334/spark-assembly.jar#__spark__.jar,hdfs://nameservice/user/zhuj/batchjars/spark_demo-1.0-SNAPSHOT-shaded.jar#__app__.jar)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with commands: List({{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms3072m, -Xmx3072m, '-XX:PermSize=1024m', -Djava.io.tmpdir={{PWD}}/tmp, '-Dspark.driver.port=40826', '-Dspark.ui.port=0', -Dspark.yarn.app.container.log.dir=<LOG_DIR>, org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, akka.tcp://sparkDriver@datanode3:40826/user/CoarseGrainedScheduler, --executor-id, 1, --hostname, datanode2, --cores, 1, --app-id, application_1487752257960_0334, --user-class-path, file:$PWD/__app__.jar, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with commands: List({{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms3072m, -Xmx3072m, '-XX:PermSize=1024m', -Djava.io.tmpdir={{PWD}}/tmp, '-Dspark.driver.port=40826', '-Dspark.ui.port=0', -Dspark.yarn.app.container.log.dir=<LOG_DIR>, org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, akka.tcp://sparkDriver@datanode3:40826/user/CoarseGrainedScheduler, --executor-id, 5, --hostname, datanode3, --cores, 1, --app-id, application_1487752257960_0334, --user-class-path, file:$PWD/__app__.jar, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with commands: List({{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms3072m, -Xmx3072m, '-XX:PermSize=1024m', -Djava.io.tmpdir={{PWD}}/tmp, '-Dspark.driver.port=40826', '-Dspark.ui.port=0', -Dspark.yarn.app.container.log.dir=<LOG_DIR>, org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, akka.tcp://sparkDriver@datanode3:40826/user/CoarseGrainedScheduler, --executor-id, 3, --hostname, datanode, --cores, 1, --app-id, application_1487752257960_0334, --user-class-path, file:$PWD/__app__.jar, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with commands: List({{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms3072m, -Xmx3072m, '-XX:PermSize=1024m', -Djava.io.tmpdir={{PWD}}/tmp, '-Dspark.driver.port=40826', '-Dspark.ui.port=0', -Dspark.yarn.app.container.log.dir=<LOG_DIR>, org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, akka.tcp://sparkDriver@datanode3:40826/user/CoarseGrainedScheduler, --executor-id, 2, --hostname, namenode, --cores, 1, --app-id, application_1487752257960_0334, --user-class-path, file:$PWD/__app__.jar, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with commands: List({{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms3072m, -Xmx3072m, '-XX:PermSize=1024m', -Djava.io.tmpdir={{PWD}}/tmp, '-Dspark.driver.port=40826', '-Dspark.ui.port=0', -Dspark.yarn.app.container.log.dir=<LOG_DIR>, org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, akka.tcp://sparkDriver@datanode3:40826/user/CoarseGrainedScheduler, --executor-id, 4, --hostname, datanode0, --cores, 1, --app-id, application_1487752257960_0334, --user-class-path, file:$PWD/__app__.jar, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
17/02/23 14:51:46 INFO impl.ContainerManagementProtocolProxy: Opening proxy : namenode:8041
17/02/23 14:51:46 INFO impl.ContainerManagementProtocolProxy: Opening proxy : datanode:8041
17/02/23 14:51:46 INFO impl.ContainerManagementProtocolProxy: Opening proxy : datanode2:8041
17/02/23 14:51:46 INFO impl.ContainerManagementProtocolProxy: Opening proxy : datanode3:8041
17/02/23 14:51:46 INFO impl.ContainerManagementProtocolProxy: Opening proxy : datanode0:8041
17/02/23 14:51:49 INFO cluster.YarnClusterSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@datanode:36494/user/Executor#311844886] with ID 3
17/02/23 14:51:49 INFO storage.BlockManagerMasterActor: Registering block manager datanode:40821 with 1589.8 MB RAM, BlockManagerId(3, datanode, 40821)
17/02/23 14:51:49 INFO cluster.YarnClusterSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@datanode0:44241/user/Executor#-1406859909] with ID 4
17/02/23 14:51:49 INFO cluster.YarnClusterSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@datanode2:35227/user/Executor#66771502] with ID 1
17/02/23 14:51:49 INFO storage.BlockManagerMasterActor: Registering block manager datanode0:34517 with 1589.8 MB RAM, BlockManagerId(4, datanode0, 34517)
17/02/23 14:51:49 INFO storage.BlockManagerMasterActor: Registering block manager datanode2:59608 with 1589.8 MB RAM, BlockManagerId(1, datanode2, 59608)
17/02/23 14:51:50 INFO cluster.YarnClusterSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@datanode3:40349/user/Executor#1475870089] with ID 5
17/02/23 14:51:50 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
17/02/23 14:51:50 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
17/02/23 14:51:50 ERROR yarn.ApplicationMaster: User class threw exception: org/apache/hadoop/hive/conf/HiveConf
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
at com.raiyi.spark.smscount.batch.SmsStatBy3DayDrive$.main(SmsStatBy3DayDrive.scala:87)
at com.raiyi.spark.smscount.batch.SmsStatBy3DayDrive.main(SmsStatBy3DayDrive.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

How do I set the Spark extra options? My Oozie workflow definition is here:

<workflow-app name="SmsStatBy3DayDrive" xmlns="uri:oozie:workflow:0.5">
<global>
<configuration>
<property>
<name></name>
<value></value>
</property>
</configuration>
</global>
<start to="spark-3b65"/>
<kill name="Kill">
<message>Action failed, error message [${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="spark-3b65">
<spark xmlns="uri:oozie:spark-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<master>yarn-cluster</master>
<mode>cluster</mode>
<name>SmsStatBy3DayDrive</name>
<class>com.raiyi.spark.smscount.batch.SmsStatBy3DayDrive</class>
<jar>${nameNode}/user/zhuj/batchjars/spark_demo-1.0-SNAPSHOT-shaded.jar</jar>
<spark-opts>--num-executors 5 --driver-memory 3g --executor-memory 3g --executor-cores 1 --conf spark.driver.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar --conf spark.executor.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar --conf spark.driver.extraJavaOptions=-XX:PermSize=1024m --conf spark.executor.extraJavaOptions=-XX:PermSize=1024m</spark-opts>
<arg>${executeDate}</arg>
</spark>
<ok to="End"/>
<error to="Kill"/>
</action>
<end name="End"/>
</workflow-app>

Please help me!
02-14-2017 01:29 AM
--conf "spark.driver.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar" \
--conf "spark.executor.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar" \
--conf "spark.driver.extraJavaOptions=-XX:MaxPermSize=1024m -XX:PermSize=256m" \
--conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=1024m -XX:PermSize=256m" \

worked for me.
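For reference, these settings can also be passed on a plain spark-submit command line. This is only a sketch: the main class and jar are taken from the workflow in this thread, and the parcel paths assume the CDH 5.4.7 layout, which may differ on your cluster.

```shell
# Sketch of an equivalent spark-submit invocation (class/jar from the
# workflow above; CDH parcel paths assume the 5.4.7 layout).
CDH=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3
HIVE_CP="$CDH/jars/htrace-core-3.1.0-incubating.jar:$CDH/lib/hive/conf:$CDH/lib/hive/lib/*.jar"

spark-submit \
  --master yarn-cluster \
  --class com.raiyi.spark.smscount.batch.SmsStatBy3DayDrive \
  --num-executors 5 --driver-memory 3g --executor-memory 3g --executor-cores 1 \
  --conf "spark.driver.extraClassPath=$HIVE_CP" \
  --conf "spark.executor.extraClassPath=$HIVE_CP" \
  --conf "spark.driver.extraJavaOptions=-XX:MaxPermSize=1024m -XX:PermSize=256m" \
  --conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=1024m -XX:PermSize=256m" \
  spark_demo-1.0-SNAPSHOT-shaded.jar "$executeDate"
```

Note that the original spark-opts used -XX:PermSize=1024m alone; adding -XX:MaxPermSize is what actually raises the PermGen ceiling on a pre-Java-8 JVM.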
... View more
02-13-2017 07:14 PM
I just started a daemon thread with scheduleAtFixedRate that cleans these "empty" directories containing a _SUCCESS file, and another thread that runs the Hive command "alter table xxx concatenate":

Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(new Runnable {
  override def run(): Unit = {
    val fs = FileSystem.get(new Configuration())
    val status = fs.listStatus(new Path(s"hdfs://nameservice/user/xxx/warehouse/$tableName/"))
    status.foreach(stat =>
      if (stat.isDirectory && stat.getPath.getName.contains("hive-staging")
          && fs.getContentSummary(stat.getPath).getSpaceConsumed < 1024) {
        println("empty path : " + stat.getPath)
        if (directoryHasSuccess(stat.getPath, fs)) {
          fs.delete(stat.getPath, true)
        }
        val now = new Date().getTime
        if (now - stat.getModificationTime > 5 * 60 * 1000 &&
            (now - stat.getAccessTime > 5 * 60 * 1000)) { // older than 5 minutes
          println("delete path " + stat.getPath)
          fs.delete(stat.getPath, true)
        }
      })
  }
}, 5, interval, TimeUnit.SECONDS)
01-17-2017 07:45 PM
Added two background threads, one to delete the empty directories and another for Hive Concatenate. But it's really an ugly way.
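The Concatenate step mentioned above can also be issued from the shell. A sketch only: the table name comes from the code in this thread, but the partition value is illustrative, and it requires a working Hive CLI; CONCATENATE merges the small ORC files inside a partition.

```shell
# Merge the small ORC files of one partition into fewer, larger files.
# Table name is from this thread; the partition value '20170116' is an
# illustrative example, not taken from the original post.
hive -e "ALTER TABLE SmsSendHistData_orc PARTITION (partittime='20170116') CONCATENATE;"
```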
01-16-2017 11:25 PM
@mbigelow My active partition's data is also split into small files. It's really strange.
01-16-2017 11:21 PM
@mbigelow Some of these directories only have a _SUCCESS file, which is the result of a batch. In a Spark Streaming scenario, should I periodically remove the empty directories, since there are so many of them? Could you explain your solution in more detail? I'm still a bit lost.
01-16-2017 10:54 PM
It seems like every streaming batch creates a .hive-staging directory. Any suggestions?
01-16-2017 10:53 PM
This seems to be a bug in Spark, see https://github.com/apache/spark/pull/16325. My code is here; it works, but after a while there are a lot of .hive-stagingXXXX directories. A photo of my warehouse is attached.

val sparkConf = new SparkConf()
  .setAppName("SendHistoryConsumer")
  // .setMaster("local[2]") // for debug
val ssc = new StreamingContext(sparkConf, Seconds(batchInterval.toInt))
ssc.checkpoint("/opt/cloudera/checkPoint")

// kafka configuration
val topicsSet = Set(KafkaConfigConstant.PRIORITY_10)
val kafkaParams = Map[String, String](
  "metadata.broker.list" -> brokers,
  "serializer.class" -> "kafka.serializer.StringEncoder",
  "group.id" -> groupId,
  "auto.offset.reset" -> "smallest")
val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topicsSet)
// messages.print()
val dStream = messages
  .map(_._2)
  .filter(message => !message.isEmpty)
  .map(parseTopic10)
dStream.foreachRDD(rdd =>
  if (!rdd.isEmpty && !rdd.partitions.isEmpty) {
    val hiveCtx = new HiveContext(rdd.sparkContext)
    val sendCallBackPoolDF = hiveCtx.createDataFrame(rdd, classOf[SmsSendHistDataSede])
    hiveCtx.sql("use db")
    sendCallBackPoolDF.registerTempTable("sendCallBack")
    hiveCtx.sql("set hive.exec.dynamic.partition=true")
    hiveCtx.sql("set hive.exec.dynamic.partition.mode=nonstrict")
    val smTrgPart = hiveCtx.sql("insert into table SmsSendHistData_orc partition" +
      "(partittime) select userid,city,mobile,msgcount,msgbody,sender,accepttype,sendcode," +
      "msgid,bmsgid,usermsgid,userbulkmsgid,channelmsgid,channelid,fileid,notifyurl,smscode,smsstat," +
      "notifycallnum,sendsmsbody,parentmsgid,errormessage,queuename,startdeliveryid,acctime,sendtime," +
      "notifytime,smstype,from_unixtime(floor(acctime/1000),'yyyyMMdd') as partittime from sendCallBack")
    smTrgPart.saveAsTable("SmsSendHistData_orc", SaveMode.Append)
  })
01-15-2017 11:30 PM
I have the same problem on CDH 5.4.7, after a streaming job with HiveContext.
08-25-2016 11:12 PM
I have resolved this. It's weird: I just paused the download and retried, and it worked!
08-25-2016 08:23 AM
I have put the parcels on my local httpd server, and I can see that CM can reach them. But when installing the parcels, CM blocks at 0%. Any ideas? Is there some log I can use to trace this problem?
08-24-2016 08:53 AM
Resolved this by setting the hostname; on CentOS 7 the hostname is not set via /etc/sysconfig/network.
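For anyone hitting the same thing: on CentOS 7 the hostname lives in /etc/hostname and is managed by hostnamectl, not by the /etc/sysconfig/network file used on CentOS 6. A sketch, using an illustrative hostname from the /etc/hosts file in this thread:

```shell
# CentOS 7: set the hostname with hostnamectl (writes /etc/hostname);
# hadoop1.example.com is an illustrative name; requires root.
hostnamectl set-hostname hadoop1.example.com
hostname -f   # verify the FQDN that the CM agent will report
```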
08-24-2016 07:38 AM
The host change.example.com is one I configured previously, but it should not exist anymore...
08-24-2016 07:36 AM
I have a similar problem; the log is here:

[24/Aug/2016 21:54:58 +0000] 13517 MainThread _cplogging INFO [24/Aug/2016:21:54:58] ENGINE Started monitor thread '_TimeoutMonitor'.
[24/Aug/2016 21:54:58 +0000] 13517 HTTPServer Thread-2 _cplogging ERROR [24/Aug/2016:21:54:58] ENGINE Error in HTTP server: shutting down
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/process/servers.py", line 187, in _start_http_thread
    self.httpserver.start()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/CherryPy-3.2.2-py2.7.egg/cherrypy/wsgiserver/wsgiserver2.py", line 1825, in start
    raise socket.error(msg)
error: No socket could be created on ('change.example.com', 9000) -- [Errno 99] Cannot assign requested address

[root@change ~]# python -c "import socket; print socket.getfqdn(); print socket.gethostbyname(socket.getfqdn())"
change.example.com
202.102.110.203

But my hosts config is this:

127.0.0.1       localhost localhost.localdomain localhost4 localhost4.localdomain4
::1             localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.0.111   hadoop1.example.com hadoop1
192.168.0.112   hadoop2.example.com hadoop2
192.168.0.113   hadoop3.example.com hadoop3
192.168.0.114   hadoop4.example.com hadoop4
192.168.0.115   hadoop5.example.com hadoop5
192.168.0.110   base.example.com base

Any idea?
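The Errno 99 above means the agent tried to bind its web server to whatever IP the machine's FQDN resolves to, and that IP (202.102.110.203 here, coming from an outside DNS answer rather than /etc/hosts) is not assigned to any local interface. A minimal reproduction of the failure, assuming a default Linux host where that public IP is not local:

```shell
# Binding a listening socket to an IP that is not on any local interface
# fails with errno 99 (EADDRNOTAVAIL) -- the same error CherryPy reports.
python3 -c "
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s.bind(('202.102.110.203', 9000))
    print('bound')
except OSError as e:
    print(e.errno)
finally:
    s.close()
"
```

So the fix is to make the FQDN resolve to a local interface address (via /etc/hosts or the hostname), not to whatever a public resolver returns.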