Member since
02-12-2016
37 Posts
6 Kudos Received
3 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 30468 | 08-08-2016 08:35 AM
 | 6098 | 04-06-2016 03:29 PM
 | 33447 | 02-25-2016 07:25 AM
08-03-2016
02:02 PM
Hi all! It's now time to replace our Cloudera 4 cluster with Cloudera 5. The first step, installing the new hardware, is done, so now I'm starting the application installation and configuration. FYI, I'm installing Cloudera 5.7.1. I have one machine for the Cloudera Manager and one for a MySQL server holding the "monitor/metastore" databases. I installed the Manager with the MySQL information for the "scm" database; everything went fine. Then I added the Manager machine to a new cluster and configured the Host Monitor to access the monitor database. It works fine, and all the Cloudera Management Services now look healthy, as far as I can tell.

Next I want to add 4 new nodes to the cluster: 2 will serve as HA NameNodes and 2 as DataNodes. Six more nodes will join the cluster later. So in Cloudera Manager I launched the wizard to add the new hosts, answered all the questions, and the installation ran and completed successfully. But after that it has to run the host inspector on all hosts, and here I hit these warnings:

newnode.domain.ltd: Command aborted because of exception: Command timed-out after 150 seconds
4 hosts are reporting with NONE CDH version
There are mismatched versions across the system, which will cause failures. See below for details on which hosts are running what versions of components.

I use a local reposync mirror that only syncs the 5.7.1 version, so there should be no version mismatch: 5.7.1 is installed everywhere. In the Manager interface I can see all the hosts in the "Hosts" section, but they are all in red status. If I select one, I get this message:

This host is in contact with the Cloudera Manager Server. This host is not in contact with the Host Monitor.

So the Cloudera Manager Agent seems to be OK, but it apparently cannot contact the Host Monitor, even though the Host Monitor is installed and configured on the same server as the Manager. The Host Monitor status is green and it can reach my remote MySQL database, so I don't know why I get this error. There is no firewall between my machines and SELinux is disabled. I don't understand why the agents can contact the Cloudera Manager but not the Host Monitor. The /etc/cloudera-scm-agent/config.ini file is correct, with the right IP and port:

server_host=10.x.x.x
server_port=7182

On the Cloudera Manager host, cloudera-scm-server and cloudera-scm-agent run fine. On my new nodes, cloudera-scm-agent runs fine too, but I get the error below in its log. I'm not sure it is the cause, and if it is, I don't know how to fix it.

[03/Aug/2016 20:56:32 +0000] 158533 MainThread agent INFO Flood daemon (re)start attempt
[03/Aug/2016 20:56:32 +0000] 158533 MainThread agent ERROR Failed to handle Heartbeat Response: {u'firehoses': [{u'roletype': u'ACTIVITYMONITOR', u'rolename': u'mgmt-ACTIVITYMONITOR-728c1b31088c1d8ddc2547d70b884cf7', u'port': 9999, u'report_interval': 60, u'address': u'clouderamanager.domain.ltd'}, {u'roletype': u'SERVICEMONITOR', u'rolename': u'mgmt-SERVICEMONITOR-728c1b31088c1d8ddc2547d70b884cf7', u'port': 9997, u'report_interval': 60, u'address': u'clouderamanager.domain.ltd'}, {u'roletype': u'HOSTMONITOR', u'rolename': u'mgmt-HOSTMONITOR-728c1b31088c1d8ddc2547d70b884cf7', u'port': 9995, u'report_interval': 60, u'address': u'clouderamanager.domain.ltd'}], u'rm_enabled': False, u'client_configs': [], u'create_parcel_symlinks': True, u'server_managed_parcels': [], u'extra_configs': None, u'host_collection_config_data': [{u'config_name': u'host_network_interface_collection_filter', u'config_value': u'^lo$'}, {u'config_name': u'host_disk_collection_filter', u'config_value': u'^$'}, {u'config_name': u'host_fs_collection_filter', u'config_value': u'^$'}, {u'config_name': u'host_log_tailing_config', u'config_value': u'{}\n'}, {u'config_name': u'host_dns_resolution_duration_thresholds', u'config_value': u'{"critical":"never","warning":"1000.0"}'}, {u'config_name': u'host_dns_resolution_enabled', u'config_value': u'true'}, {u'config_name': u'host_clock_offset_thresholds', u'config_value': u'{"critical":"10000.0","warning":"3000.0"}'}], u'apply_parcel_users_groups_permissions': True, u'flood_torrent_port': 7191, u'log_tailing_config': u'{}\n', u'active_parcels': {}, u'flood_rack_peers': [u'10.2.0.33:7191', u'10.2.0.31:7191', u'10.2.0.34:7191', u'10.2.0.29:7191', u'10.2.0.30:7191'], u'retain_parcels_in_cache': True, u'processes': [{u'status_links': {}, u'name': u'cluster-host-inspector', u'config_generation': 0, u'configuration_data': 'PK\x03\x04\x14\x00\x08\x08\x08\x00\x83\x90\x03I\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\n\x00\x00\x00input.json\xb5\xd2\xbb\x0e\xc2 \x18\x05\xe0\xbdOA\x98[\x02\xbd\x98\xe8\xd6\xe8\xd0\xc5\xd4\xb8\x1a\x07\x14\x92\x12)4\xa5\x9d\x9a\xbe\xbb\x80q\x04\xbb8r\xfe\xc3\x07\t,\t\x00\x90J\xd9h3\x19\x08\x0e\xe0\x06\x16\x1b\xd9\xb0\xb3\x89\xa2=w!4Jd\x1deZ\x0f\x19\xc69\x923\x13\x14\xd9Y\xfa\xe9\n\xe6Z\xd5w5\xd4\x8c\x8d\xdcx\x0f\x12\x8cr\x84Q\x81\xa1\x9d\xae\xe9o~\x17\xe0\xf3(_n\xe5I\x80/c|\xbe\xdf\xcaW\x01\xbe\x88\xde\xbe\xd8\xca\x17\x01\x9eDy\xe2ypw%(\x94\x19\xf8s\x12Z\xf9\x92\x9a\xa5\xf4\xf9\x8b\x8f\x0f>js\xe5T\xf6~{S\x9f\xda\xf6\x82\x8e\xed\xd9\x1f\x06\xa7N\x18\xf7Q\xdc\xf0\xaf\xcf\x98\xacoPK\x07\x089m\\\xdd\xbe\x00\x00\x00\x98\x02\x00\x00PK\x01\x02\x14\x00\x14\x00\x08\x08\x08\x00\x83\x90\x03I9m\\\xdd\xbe\x00\x00\x00\x98\x02\x00\x00\n\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00input.jsonPK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x008\x00\x00\x00\xf6\x00\x00\x00\x00\x00', u'refresh_files': [], u'user': u'root', u'parcels': {}, u'auto_restart': False, u'run_generation': 2, u'extra_groups': [], u'environment': {}, u'optional_tags': [], u'running': False, u'program': u'mgmt/mgmt.sh', u'required_tags': [], u'arguments': [u'inspector', u'input.json', u'output.json', u'DEFAULT'], u'special_file_info': [], u'group': u'root', u'id': 34, u'resources': [], u'one_off': True}], u'server_manages_parcels': True, u'heartbeat_interval': 15, u'parcels_directory': u'/opt/cloudera/parcels', u'host_id': u'8a18c6eb-7f32-4e90-a6f9-88d1feeacd21', u'eventserver_host': u'clouderamanager.domain.ltd', u'enabled_metric_reporters': [u'ACCUMULO16', u'ACCUMULO16', 
u'KEYTRUSTEE-KMS_KEYTRUSTEE', u'KMS_KEYTRUSTEE', u'SPARK_ON_YARN-SPARK_YARN_HISTORY_SERVER', u'SPARK_YARN_HISTORY_SERVER', u'SOLR-SOLR_SERVER', u'SOLR_SERVER', u'HBASE-HBASERESTSERVER', u'HBASERESTSERVER', u'HOST', u'KEYTRUSTEE_SERVER-KEYTRUSTEE_PASSIVE_SERVER', u'KEYTRUSTEE_PASSIVE_SERVER', u'IMPALA-STATESTORE', u'STATESTORE', u'SPARK', u'SPARK', u'HBASE', u'HBASE', u'ACCUMULO-ACCUMULO_TRACER', u'ACCUMULO_TRACER', u'HDFS-DATANODE', u'DATANODE', u'ACCUMULO-ACCUMULO_MASTER', u'ACCUMULO_MASTER', u'YARN-RESOURCEMANAGER', u'RESOURCEMANAGER', u'HUE-HUE_SERVER', u'HUE_SERVER', u'ACCUMULO-ACCUMULO_MONITOR', u'ACCUMULO_MONITOR', u'MGMT-EVENTSERVER', u'EVENTSERVER', u'MGMT-NAVIGATORMETASERVER', u'NAVIGATORMETASERVER', u'HBASE-MASTER', u'MASTER', u'KAFKA-KAFKA_BROKER', u'KAFKA_BROKER', u'KEYTRUSTEE_SERVER-DB_PASSIVE', u'DB_PASSIVE', u'HBASE-REGIONSERVER', u'REGIONSERVER', u'SPARK_ON_YARN', u'SPARK_ON_YARN', u'MGMT-REPORTSMANAGER', u'REPORTSMANAGER', u'MGMT-SERVICEMONITOR', u'SERVICEMONITOR', u'IMPALA-IMPALAD', u'IMPALAD', u'MGMT-ALERTPUBLISHER', u'ALERTPUBLISHER', u'HIVE-HIVESERVER2', u'HIVESERVER2', u'MGMT-ACTIVITYMONITOR', u'ACTIVITYMONITOR', u'ISILON', u'ISILON', u'YARN-NODEMANAGER', u'NODEMANAGER', u'MAPREDUCE-FAILOVERCONTROLLER', u'FAILOVERCONTROLLER', u'ACCUMULO', u'ACCUMULO', u'MAPREDUCE', u'MAPREDUCE', u'ZOOKEEPER', u'ZOOKEEPER', u'KMS', u'KMS', u'ACCUMULO16-ACCUMULO16_TRACER', u'ACCUMULO16_TRACER', u'ACCUMULO16-ACCUMULO16_MONITOR', u'ACCUMULO16_MONITOR', u'MGMT-HOSTMONITOR', u'HOSTMONITOR', u'YARN-JOBHISTORY', u'JOBHISTORY', u'KEYTRUSTEE', u'KEYTRUSTEE', u'HDFS-JOURNALNODE', u'JOURNALNODE', u'KAFKA', u'KAFKA', u'IMPALA', u'IMPALA', u'SPARK-SPARK_HISTORY_SERVER', u'SPARK_HISTORY_SERVER', u'KEYTRUSTEE_SERVER-KEYTRUSTEE_ACTIVE_SERVER', u'KEYTRUSTEE_ACTIVE_SERVER', u'HDFS-NAMENODE', u'NAMENODE', u'HUE-BEESWAX_SERVER', u'BEESWAX_SERVER', u'SOLR', u'SOLR', u'ACCUMULO16-ACCUMULO16_TSERVER', u'ACCUMULO16_TSERVER', u'MAPREDUCE-TASKTRACKER', u'TASKTRACKER', u'IMPALA-CATALOGSERVER', u'CATALOGSERVER', u'HDFS-DSSDDATANODE', u'DSSDDATANODE', u'SENTRY', u'SENTRY', u'ACCUMULO16-ACCUMULO16_GC', u'ACCUMULO16_GC', u'MGMT-NAVIGATOR', u'NAVIGATOR', u'HIVE', u'HIVE', u'HBASE-HBASETHRIFTSERVER', u'HBASETHRIFTSERVER', u'SQOOP-SQOOP_SERVER', u'SQOOP_SERVER', u'KAFKA-KAFKA_MIRROR_MAKER', u'KAFKA_MIRROR_MAKER', u'FLUME', u'FLUME', u'HUE', u'HUE', u'HDFS-SECONDARYNAMENODE', u'SECONDARYNAMENODE', u'SENTRY-SENTRY_SERVER', u'SENTRY_SERVER', u'ACCUMULO-ACCUMULO_TSERVER', u'ACCUMULO_TSERVER', u'ACCUMULO-ACCUMULO_GC', u'ACCUMULO_GC', u'HIVE-HIVEMETASTORE', u'HIVEMETASTORE', u'IMPALA-LLAMA', u'LLAMA', u'ACCUMULO16-ACCUMULO16_MASTER', u'ACCUMULO16_MASTER', u'SPARK-SPARK_WORKER', u'SPARK_WORKER', u'MGMT', u'MGMT', u'HIVE-WEBHCAT', u'WEBHCAT', u'SQOOP', u'SQOOP', u'HUE-HUE_LOAD_BALANCER', u'HUE_LOAD_BALANCER', u'ACCUMULO-ACCUMULO_LOGGER', u'ACCUMULO_LOGGER', u'HDFS', u'HDFS', u'FLUME-AGENT', u'AGENT', u'OOZIE', u'OOZIE', u'SQOOP_CLIENT', u'SQOOP_CLIENT', u'OOZIE-OOZIE_SERVER', u'OOZIE_SERVER', u'KMS-KMS', u'KMS', u'HDFS-FAILOVERCONTROLLER', u'FAILOVERCONTROLLER', u'KS_INDEXER', u'KS_INDEXER', u'SPARK-SPARK_MASTER', u'SPARK_MASTER', u'YARN', u'YARN', u'ZOOKEEPER-SERVER', u'SERVER', u'HDFS-NFSGATEWAY', u'NFSGATEWAY', u'HDFS-HTTPFS', u'HTTPFS', u'HUE-KT_RENEWER', u'KT_RENEWER', u'KEYTRUSTEE_SERVER', u'KEYTRUSTEE_SERVER', u'KEYTRUSTEE_SERVER-DB_ACTIVE', u'DB_ACTIVE', u'MAPREDUCE-JOBTRACKER', u'JOBTRACKER', u'KS_INDEXER-HBASE_INDEXER', u'HBASE_INDEXER'], u'flood_seed_timeout': 100, u'eventserver_port': 7185}
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 1335, in handle_heartbeat_response
self._handle_heartbeat_response(response)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 1357, in _handle_heartbeat_response
response["flood_torrent_port"])
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 1823, in handle_heartbeat_flood
self.mkabsdir(flood_dir, user=FLOOD_FS_USER, group=FLOOD_FS_GROUP, mode=0755)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 1918, in mkabsdir
path_grp = grp.getgrgid(stat_info.st_gid)[0]
KeyError: 'getgrgid(): gid not found: 167

Do you have any idea what causes this error? I hope somebody can help me get the configuration right so my nodes turn green. Regards, Fabien
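For reference, a minimal diagnostic sketch for the two symptoms above (assumptions: a CentOS/RHEL node with nc installed; the group that should own gid 167 is not known from the log, so the name below is only a placeholder):

# Check whether gid 167 (from the KeyError) resolves to a group name on the new node
getent group 167

# Compare /etc/group with a node where the agent is healthy; if the group is simply missing,
# it can be recreated with the same gid (placeholder name, not taken from the log):
# groupadd -g 167 some_group

# Check connectivity from the new node to the Host Monitor port reported in the heartbeat (9995)
nc -zv clouderamanager.domain.ltd 9995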
04-06-2016
03:29 PM
Hi tseader, sorry I wasn't available! As an update: it works now. The problem was the Dynamic Resource Pool configuration. I created a resource pool for my username, and now the job starts and runs. This is different from how our Cloudera 4 cluster behaved. The job now runs the Sqoop and Hive steps and finishes successfully. Great news!

However, it is very slow for a small table import, so I think there is something to tune in the Dynamic Resource Pool or YARN settings to use more resources: during the job, the CPU and memory usage of my 2 DataNodes stayed very low. Maybe you can give me some pointers on how to calculate the maximum number of containers possible? (See the rough sketch below.)

To answer your questions:
- Yes, Sqoop was working on its own.
- Yes, our analytics jobs use <args>, because in CDH4 <command> sometimes failed on specific characters.
- Yes, sqoop/oozie/hive all work now. We will try Impala next.
- No, we haven't tried creating a workflow from Hue. I will check with our developers about that.
- No, we didn't try with another database.

As you suspected, the problem wasn't the workflow but the configuration. I'm new to Cloudera/Hadoop, so I'm learning and discovering the configuration as I go! Now I have to find the best configuration to make better use of our DataNodes. Thanks again tseader!
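For reference, a rough back-of-the-envelope sketch for the container question (the property names are standard YARN settings, but the numbers are made-up examples, not values from this cluster):

# Containers per NodeManager are bounded by memory and by vcores:
#   containers <= min( yarn.nodemanager.resource.memory-mb  / per-container memory,
#                      yarn.nodemanager.resource.cpu-vcores / per-container vcores )
node_mem_mb=16384        # yarn.nodemanager.resource.memory-mb (example value)
node_vcores=8            # yarn.nodemanager.resource.cpu-vcores (example value)
container_mem_mb=2048    # e.g. mapreduce.map.memory.mb
container_vcores=1       # e.g. mapreduce.map.cpu.vcores
echo "by memory: $(( node_mem_mb / container_mem_mb )) containers, by vcores: $(( node_vcores / container_vcores )) containers"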
04-01-2016
04:47 PM
Yes, we always use the real FQDN to start the job. And we found the mistake: the "job_tracker" value was assigned to "name_node" and vice versa. After swapping them back, the process starts, but it never completes. The workflow stays in "running" status, I can see an oozie:launcher job in running state, and it creates an oozie:action task, but that task stays in "accepted" status and I can't find out why. I tried some YARN memory settings without success. In the ResourceManager I can find this log about the job:

>>> Invoking Sqoop command line now >>>
4624 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
4654 [uber-SubtaskRunner] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6-cdh5.5.2
4671 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
4672 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.BaseSqoopTool - Using Hive-specific delimiters for output. You can override
4672 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.BaseSqoopTool - delimiters with --fields-terminated-by, etc.
4690 [uber-SubtaskRunner] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
4816 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.MySQLManager - Preparing to use a MySQL streaming resultset.
4820 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
5360 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `table` AS t LIMIT 1
5521 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `table` AS t LIMIT 1
5616 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
7274 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-yarn/compile/f695dd68db2ed1ecf703a5405d308df5/table.jar
7282 [uber-SubtaskRunner] WARN org.apache.sqoop.manager.MySQLManager - It looks like you are importing from mysql.
7282 [uber-SubtaskRunner] WARN org.apache.sqoop.manager.MySQLManager - This transfer can be faster! Use the --direct
7282 [uber-SubtaskRunner] WARN org.apache.sqoop.manager.MySQLManager - option to exercise a MySQL-specific fast path.
7282 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.MySQLManager - Setting zero DATETIME behavior to convertToNull (mysql)
7284 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Beginning import of game_session
7398 [uber-SubtaskRunner] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
8187 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.DBInputFormat - Using read commited transaction isolation
8211 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat - BoundingValsQuery: SELECT MIN(`session_id`), MAX(`session_id`) FROM `table`
8237 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.IntegerSplitter - Split size: 9811415567004; Num splits: 4 from: 14556292800030657 to: 14595538462298675
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat

The log just keeps looping on "Heart beat" lines. I don't know whether this comes from the memory configuration or something else. Do you have any idea about that? (A couple of diagnostic commands are sketched below.)
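For reference, a couple of standard YARN CLI diagnostics for an application stuck in ACCEPTED (a sketch; the application id below is a placeholder to replace with the real one):

# List applications and their states; the oozie:launcher should be RUNNING and the child job ACCEPTED
yarn application -list -appStates ACCEPTED,RUNNING

# Show details (including the queue/pool) for the stuck application
# yarn application -status application_XXXXXXXXXXXXX_XXXX

# Then check the ResourceManager scheduler page (port 8088 by default) to see whether the
# pool/queue still has free memory and vcores for a second container.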
03-31-2016
04:23 PM
Hi tseader! Thanks for your help, and good eyes! Yes, I think that could be an error that stops the process. I modified these 2 settings, but the problem is still there; it seems to happen earlier. Oozie seems to read the file but never starts the MySQL connection step. I also found something else in the Oozie logs. When I launch the command from my VM, the workflow appears in Hue, but the log starts with these 2 lines:

2016-03-31 19:04:18,709 WARN org.apache.oozie.util.ParameterVerifier: SERVER[hostname] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] The application does not define formal parameters in its XML definition
2016-03-31 19:04:18,744 WARN org.apache.oozie.service.LiteWorkflowAppService: SERVER[hostname] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] libpath [hdfs://hostname.domain.com:8020/path/to/oozie/lib] does not exist

Yet that is not the libpath I set in my job file... Here is the complete log, from job start to the end:

2016-03-31 19:04:18,709 WARN org.apache.oozie.util.ParameterVerifier: SERVER[hostname] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] The application does not define formal parameters in its XML definition
2016-03-31 19:04:18,744 WARN org.apache.oozie.service.LiteWorkflowAppService: SERVER[hostname] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] libpath [hdfs://hostname.domain.com:8020/path/to/oozie/lib] does not exist
2016-03-31 19:04:18,805 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@:start:] Start action [0000001-160331185825562-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-03-31 19:04:18,809 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@:start:] [***0000001-160331185825562-oozie-oozi-W@:start:***]Action status=DONE
2016-03-31 19:04:18,809 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@:start:] [***0000001-160331185825562-oozie-oozi-W@:start:***]Action updated in DB!
2016-03-31 19:04:18,898 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@extract] Start action [0000001-160331185825562-oozie-oozi-W@extract] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-03-31 19:04:18,907 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@extract] [***0000001-160331185825562-oozie-oozi-W@extract***]Action status=DONE
2016-03-31 19:04:18,907 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@extract] [***0000001-160331185825562-oozie-oozi-W@extract***]Action updated in DB!
2016-03-31 19:04:19,077 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@session] Start action [0000001-160331185825562-oozie-oozi-W@session] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-03-31 19:04:22,804 WARN org.apache.hadoop.security.UserGroupInformation: SERVER[hostname] PriviledgedActionException as:username (auth:PROXY) via oozie (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: http
2016-03-31 19:04:22,805 WARN org.apache.hadoop.security.UserGroupInformation: SERVER[hostname] PriviledgedActionException as:username (auth:PROXY) via oozie (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
2016-03-31 19:04:22,805 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@session] Error starting action [session]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.]
org.apache.oozie.action.ActionExecutorException: JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:454)
at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:434)
at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1032)
at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1203)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:250)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:64)
at org.apache.oozie.command.XCommand.call(XCommand.java:286)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:321)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:250)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
at org.apache.hadoop.mapred.JobClient.init(JobClient.java:472)
at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:450)
at org.apache.oozie.service.HadoopAccessorService$3.run(HadoopAccessorService.java:436)
at org.apache.oozie.service.HadoopAccessorService$3.run(HadoopAccessorService.java:434)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.oozie.service.HadoopAccessorService.createJobClient(HadoopAccessorService.java:434)
at org.apache.oozie.action.hadoop.JavaActionExecutor.createJobClient(JavaActionExecutor.java:1246)
at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:980)

Maybe this gives you an idea! I checked the hostnames in the configuration files and they seem to be OK. This error message is not very clear... (Two quick checks are sketched below.)
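For reference, the "No AbstractFileSystem for scheme: http" line usually means a property that should be an hdfs:// URI received an http:// or host:port value instead. A quick way to compare the values (a sketch, reusing the paths from this thread):

# Compare the job properties with what the workflow's <global> section actually references
grep -E 'job_tracker|name_node' config-default.xml
hdfs dfs -cat /path/to/file/simple-etl-wf.xml | grep -E 'job-tracker|name-node'
# The job-tracker should be the ResourceManager address (host:8032) and the name-node an hdfs:// URI.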
03-29-2016
01:38 PM
Hello everyone, I'm getting an error from one of our jobs that is not really explicit... I found another topic about the same error, but it doesn't seem to have the same origin. To reproduce the problem, I start this Oozie job from my VM (we have a standalone lab running Cloudera 5.5.2 on a remote server).

The command to start the job:

oozie job -oozie http://host.domain.com:11000/oozie -config config-default.xml -run

The content of the config-default.xml file:

<configuration>
<property><name>job_tracker</name><value>host.domain.com:8032</value></property>
<property><name>job_xml</name><value>/path/to/file/hive-site.xml</value></property>
<property><name>name_node</name><value>hdfs://host.domain.com:8020</value></property>
<property><name>oozie.libpath</name><value>${name_node}/user/oozie/share/lib/lib_20160216173849</value></property>
<property><name>oozie.use.system.libpath</name><value>true</value></property>
<property><name>oozie.wf.application.path</name><value>${name_node}/path/to/file/simple-etl-wf.xml</value></property>
<property><name>db_user</name><value>user</value></property>
<property><name>db_pass</name><value>password</value></property>
<property><name>target_dir</name><value>/path/to/destination</value></property>
<property><name>hive_db_schema</name><value>default</value></property>
<property><name>table_suffix</name><value>specific_suffix</value></property>
</configuration>

I also tried setting "job_tracker" with an http:// prefix, but I get the same error. The content of the simple-etl-wf.xml file:

<workflow-app xmlns="uri:oozie:workflow:0.5" name="simple-etl-wf">
<global>
<job-tracker>${name_node}</job-tracker>
<name-node>${job_tracker}</name-node>
<job-xml>${job_xml}</job-xml>
</global>
<start to="extract"/>
<fork name="extract">
<path start="table" />
</fork>
<action name="table">
<sqoop xmlns="uri:oozie:sqoop-action:0.4">
<arg>import</arg>
<arg>--connect</arg>
<arg>jdbc:mysql://db.domain.com/database</arg>
<arg>username</arg>
<arg>${db_user}</arg>
<arg>password</arg>
<arg>${db_pass}</arg>
<arg>--table</arg>
<arg>table</arg>
<arg>--target-dir</arg>
<arg>${target_dir}/table</arg>
<arg>--split-by</arg>
<arg>column</arg>
<arg>--hive-import</arg>
<arg>--hive-overwrite</arg>
<arg>--hive-table</arg>
<arg>${hive_db_schema}.table_${table_suffix}</arg>
</sqoop>
<ok to="join"/>
<error to="fail"/>
</action>
<join name="join" to="transform" />
<action name="transform">
<hive xmlns="uri:oozie:hive-action:0.4">
<script>script.hql</script>
<param>hive_db_schema=${hive_db_schema}</param>
<param>table_suffix=${table_suffix}</param>
</hive>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>

The job starts, but it blocks at about 20%, and we get this error: JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.

2016-03-29 15:45:17,149 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[host.domain.com] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000004-160325161246127-oozie-oozi-W] ACTION[0000004-160325161246127-oozie-oozi-W@session] Error starting action [session]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.]
org.apache.oozie.action.ActionExecutorException: JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:454)
at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:434)
at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1032)
at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1203)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:250)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:64)
at org.apache.oozie.command.XCommand.call(XCommand.java:286)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:321)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:250)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
at org.apache.hadoop.mapred.JobClient.init(JobClient.java:472)
at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:450)
at org.apache.oozie.service.HadoopAccessorService$3.run(HadoopAccessorService.java:436)
at org.apache.oozie.service.HadoopAccessorService$3.run(HadoopAccessorService.java:434)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.oozie.service.HadoopAccessorService.createJobClient(HadoopAccessorService.java:434)
at org.apache.oozie.action.hadoop.JavaActionExecutor.createJobClient(JavaActionExecutor.java:1246)
at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:980)
... 10 more

Yet the job_tracker and name_node have the correct URL and path. The mysql-connector-java.jar is present in the sharelib folder. I put Oozie in debug mode, but it gives no more information about this. "mapreduce.framework.name" is set to "yarn" in every XML configuration file on the cluster. Do you have any idea about this error? (A couple of sanity checks are sketched below.)
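For reference, two client-side sanity checks for JA009 (standard Hadoop CLI; run them where the same Hadoop configuration as the Oozie server is deployed):

# Confirm the configuration that Hadoop actually loads resolves to YARN, not classic MapReduce
hdfs getconf -confKey mapreduce.framework.name

# Confirm the ResourceManager address the client would use
hdfs getconf -confKey yarn.resourcemanager.address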
Labels: Apache Oozie
02-25-2016
07:25 AM
5 Kudos
It works! As we can see in the output log, there is a HADOOP_CLASSPATH variable, yet it did not contain any path to the libs in the Hive directory. I had already tried adding the Hive lib folder itself to HADOOP_CLASSPATH, but that didn't work. The solution is to add the folder with /* so that all the jars are picked up. So I added this line to .bash_profile:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/lib/hive/lib/*

Then:

source ~/.bash_profile

And now it works: the data were imported into Hive! Now we can continue our lab with Cloudera 5. Thanks!
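A small follow-up sketch, not from the original post (assumption: /etc/hadoop/conf/hadoop-env.sh is the active client configuration on the machine): the same classpath addition can be made machine-wide so that non-login shells pick it up as well, instead of relying on .bash_profile:

# Append the Hive jars to HADOOP_CLASSPATH for every Hadoop client process on this machine
echo 'export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/lib/hive/lib/*' | sudo tee -a /etc/hadoop/conf/hadoop-env.sh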
02-22-2016
02:57 PM
On my client machine, I tried to locate the class and found it in a jar file. I went to the /usr/lib/hive/lib folder and looked inside hive-common.jar with this command:

jar tf hive-common.jar

At the end of the output I can see this line:

org/apache/hadoop/hive/conf/HiveConf.class

So the class is present. Why can't Sqoop find it when it starts the import? HIVE_HOME is set to /usr/lib/hive, so the path is valid... I'll keep searching, but maybe this gives you more information about why it happens and how to solve it. (A quick classpath check is sketched below.)
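For reference, a quick way to confirm the jar is on the classpath the Sqoop/Hadoop JVM actually sees, not just present on disk (a sketch, standard Hadoop CLI):

# Print the effective client classpath and look for the Hive jars
hadoop classpath | tr ':' '\n' | grep -i hive
# If nothing matches, HADOOP_CLASSPATH needs to include /usr/lib/hive/lib/*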
02-22-2016
07:14 AM
Hi Matt, thanks, but my problem is still present... Maybe someone else with a fresh install of Cloudera Manager/CDH 5.5 has hit the same problem. As a test, I did a fresh install on another single machine: same error! So maybe the problem comes from the client configuration. For the installation I use our Cloudera Manager/CDH repository, which is synced every day, so I install from packages, not parcels. My test VMs run CentOS 6.6, a supported version. I run the command from my own machine (Ubuntu), where I installed these services:

sudo apt-get install hadoop-client hive oozie-client sqoop

I added these variables to my ".bash_profile":

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/share/java/slf4j-simple.jar
export HIVE_HOME=/usr/lib/hive
export PATH=$PATH:$HIVE_HOME/bin

Then I used "scp" to copy over "/etc/sqoop", "/etc/hive" and "/etc/hadoop/". The configuration seems to be OK; if it weren't, the command couldn't start at all. I tried adding the HIVE_CONF_DIR variable in different files:
- sqoop-env.sh
- hadoop-env.sh
- hive-env.sh
without any success. The process starts, but the error is still there. I hope somebody can help me!
02-18-2016
03:01 PM
To reproduce the problem, I installed hadoop-client and sqoop on my machine. Same error here... The job starts and the data import to HDFS completes successfully (I can see the job status in Hue and the data is in HDFS):

16/02/18 17:01:15 INFO mapreduce.Job: Running job: job_1455812803225_0020
16/02/18 17:01:24 INFO mapreduce.Job: Job job_1455812803225_0020 running in uber mode : false
16/02/18 17:01:24 INFO mapreduce.Job: map 0% reduce 0%
16/02/18 17:01:33 INFO mapreduce.Job: map 25% reduce 0%
16/02/18 17:01:34 INFO mapreduce.Job: map 50% reduce 0%
16/02/18 17:01:41 INFO mapreduce.Job: map 100% reduce 0%
16/02/18 17:01:41 INFO mapreduce.Job: Job job_1455812803225_0020 completed successfully
16/02/18 17:01:41 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=555640
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=473
HDFS: Number of bytes written=8432
HDFS: Number of read operations=16
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=4
Other local map tasks=4
Total time spent by all maps in occupied slots (ms)=25664
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=25664
Total vcore-seconds taken by all map tasks=25664
Total megabyte-seconds taken by all map tasks=26279936
Map-Reduce Framework
Map input records=91
Map output records=91
Input split bytes=473
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=351
CPU time spent (ms)=4830
Physical memory (bytes) snapshot=802369536
Virtual memory (bytes) snapshot=6319828992
Total committed heap usage (bytes)=887095296
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=8432
16/02/18 17:01:41 INFO mapreduce.ImportJobBase: Transferred 8,2344 KB in 30,7491 seconds (274,219 bytes/sec)
16/02/18 17:01:41 INFO mapreduce.ImportJobBase: Retrieved 91 records.

But when the Hive import starts:

16/02/18 17:01:41 WARN hive.TableDefWriter: Column last_updated had to be cast to a less precise type in Hive
16/02/18 17:01:41 INFO hive.HiveImport: Loading uploaded data into Hive
16/02/18 17:01:41 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
16/02/18 17:01:41 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:195)
at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
... 12 more

I tried a few things:
- Adding HIVE_CONF_DIR=/etc/hive/conf to my .bash_profile file: no success
- Adding the same variable to /usr/lib/hive/conf/hive-env.sh: no success
- Copying /usr/lib/sqoop/conf/sqoop-env-template.sh and adding the variable inside: no success
I hope somebody has an idea to help us!
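One hedged detail about the third attempt (an assumption about the packaged Sqoop layout, worth verifying on this install): the template file is only read once it is copied to sqoop-env.sh, and even then the class itself still has to be on the classpath:

# The template is ignored until it is copied to the name Sqoop actually sources
sudo cp /usr/lib/sqoop/conf/sqoop-env-template.sh /usr/lib/sqoop/conf/sqoop-env.sh
echo 'export HIVE_HOME=/usr/lib/hive' | sudo tee -a /usr/lib/sqoop/conf/sqoop-env.sh
# HIVE_CONF_DIR alone cannot fix a ClassNotFoundException: the HiveConf class lives in
# /usr/lib/hive/lib/hive-common*.jar, which must end up on HADOOP_CLASSPATH.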
02-15-2016
07:15 AM
I tried stopping all the services in the cluster and restarting them. I followed this documentation for the start order: Cloudera 5 documentation for stop/start order. (An API-based alternative is sketched below.)
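A hedged alternative (assumptions: the Cloudera Manager REST API is reachable on port 7180, the cluster is named "Cluster 1", and v11 is an available API version; adjust credentials and names to the real setup):

# Stop the whole cluster, then start it again, via the Cloudera Manager API;
# CM itself applies the correct service dependency order for these commands.
curl -s -u admin:admin -X POST 'http://clouderamanager.domain.ltd:7180/api/v11/clusters/Cluster%201/commands/stop'
# Wait for the command to finish (Commands page in the CM UI), then:
curl -s -u admin:admin -X POST 'http://clouderamanager.domain.ltd:7180/api/v11/clusters/Cluster%201/commands/start'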