Impala daemon fails to start on CDH 5.11.1, even after a clean reinstall

Contributor

Hi,

Three days ago I started testing Cloudera Hadoop HA on CDH 5.11 using 4 nodes, each with 8 GB RAM and 4 cores (Google Compute Engine instances).

All nodes were defined as data nodes.

During these tests I restarted nodes.

After a while, one node failed to start the Impala services, then two, and finally all of them.

Whether I restarted the whole Impala service or a single instance, the service failed to start, with no informative error message in the Impala log files or the agent logs.

 

cloudera-scm-server log:

 

2017-06-22 11:50:26,011 INFO CommandPusher:com.cloudera.cmf.service.GenericBringUpRoleCommand: BringUp command (119) has finished on service impala for role 61/impala-IMPALAD-9cfb5b1ab405f5aa4093cb4531cd05dd, with status FAILURE and message MessageWithArgs{messageId=message.command.role.bringUp.supervisor.fatal, args=[]}

 

cloudera-scm-agent log: nothing significant.

 

 

I deleted the service and created it again. That did not solve the problem.

I deleted the services and restarted all machines. Again, it did not help.

I uninstalled Cloudera on all machines following this guide:

https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cm_ig_uninstall_cm.html

 

I restarted all machines and created the cluster again. It failed while creating the Impala services.

All other services (HDFS, YARN, ZooKeeper, Oozie) are working with no issues.

What is going on?

Am I missing something?

I have a QA environment with a failed Impala instance and no remedy.

I'm afraid of hitting this problem in production.

I appreciate your help.

Many thanks

 

Alon

2 ACCEPTED SOLUTIONS

You might be running into https://issues.apache.org/jira/browse/IMPALA-5578, which is an issue with the Java Virtual Machine (there's one embedded in the Impala daemon) and a Linux kernel update. See that JIRA for details.

You could try downgrading your kernel and restarting to confirm that this is indeed the issue. The suggested workaround, if confirmed, is to increase the -Xss parameter passed to the JVM.
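
To make the "downgrade and confirm" step concrete, here is a minimal sketch that reports whether the running kernel is the release you intended to boot into. The helper names `kernel_is` and `report_kernel` are hypothetical and only for illustration; the only real command used is `uname -r`.

```shell
# Hypothetical helper: succeed if the running kernel release equals the expected one.
kernel_is() {
  [ "$(uname -r)" = "$1" ]
}

# Print the running release and whether it matches the release under test.
report_kernel() {
  if kernel_is "$1"; then
    echo "running $(uname -r): matches $1"
  else
    echo "running $(uname -r): differs from $1"
  fi
}
```

For example, `report_kernel 4.4.0-38-generic` checks against the release that a poster in this thread reported as working. Run it after the reboot and before restarting the Impala roles, so you know which kernel the test actually ran on.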

Contributor

Thanks a lot, Tim.

It solved the problem.

I downgraded the Ubuntu kernel to 4.4.0-38-generic and Impala ran successfully.

8 REPLIES

I'd suggest looking at the log files for the Impala Daemon role that failed.


There will typically be an explanation for the failed startup in there.
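
As a rough first pass over such a log: Impala writes glog-style files in which every message line starts with a severity letter (I, W, E, or F) followed by the date, so filtering for E and F lines surfaces errors quickly. This is only a sketch; the function name `glog_errors` is hypothetical, and the log path in the usage note is an assumption about a typical install, not something stated in this thread.

```shell
# Hypothetical helper: print only ERROR (E) and FATAL (F) lines from a glog-style log.
# Expected line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
# Note: grep exits non-zero when no line matches.
glog_errors() {
  grep -E '^[EF][0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}\.' "$1"
}
```

Usage might look like `glog_errors /var/log/impalad/impalad.ERROR` (assumed path; adjust to your role's log directory). If it prints nothing useful, the log alone may not explain the failure and the cause has to be sought elsewhere.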

Contributor

Thanks, but there is nothing there.

****************************************************
Log file created at: 2017/06/26 05:25:51
Running on machine: gc-dp-pdpint-data-02
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0626 05:25:51.308739 26469 logging.cc:120] stdout will be logged to this file.
E0626 05:25:51.308979 26469 logging.cc:121] stderr will be logged to this file.
I0626 05:25:51.309459 26469 minidump.cc:229] Setting minidump size limit to 20971520.
I0626 05:25:51.309499 26469 atomicops-internals-x86.cc:93] vendor GenuineIntel  family 6  model 15  sse2 1  cmpxchg16b 1
I0626 05:25:51.319569 26469 authentication.cc:998] Using LDAP authentication with server ldap://gc-dp-pdpint-app-01.c.bi-environment-1271.internal:389
W0626 05:25:51.319586 26469 authentication.cc:1002] LDAP authentication is being used, but without TLS. ALL PASSWORDS WILL GO OVER THE NETWORK IN THE CLEAR.
I0626 05:25:51.319600 26469 authentication.cc:1066] Internal communication is not authenticated
I0626 05:25:51.319603 26469 authentication.cc:1083] External communication is authenticated with LDAP
I0626 05:25:51.321149 26469 init.cc:206] impalad version 2.8.0-cdh5.11.0 RELEASE (build e09660de6b503a15f07e84b99b63e8e745854c34)
Built on Wed Apr  5 19:51:24 PDT 2017
I0626 05:25:51.321161 26469 init.cc:207] Using hostname: gc-dp-pdpint-data-02.c.bi-environment-1271.internal
I0626 05:25:51.321844 26469 logging.cc:156] Flags (see also /varz are on debug webserver):
--catalog_service_port=26000
--initial_hms_cnxn_timeout_s=120
--load_catalog_in_background=false
--num_metadata_loading_threads=16
--sentry_config=
--asm_module_dir=
--disable_optimization_passes=false
--dump_ir=false
--opt_module_dir=
--perf_map=false
--print_llvm_ir_instruction_count=false
--unopt_module_dir=
--abort_on_config_error=true
--be_port=22000
--be_principal=
--compact_catalog_topic=false
--disable_kudu=false
--disable_mem_pools=false
--enable_accept_queue_server=true
--enable_process_lifetime_heap_profiling=false
--heap_profile_dir=
--hostname=gc-dp-pdpint-data-02.c.bi-environment-1271.internal
--inc_stats_size_limit_bytes=2147483648
--keytab_file=
--krb5_conf=
--krb5_debug_file=
--kudu_operation_timeout_ms=180000
--load_auth_to_local_rules=false
--max_minidumps=9
--mem_limit=53687091200
--minidump_path=/var/log/impala-minidumps/impalad
--minidump_size_limit_hint_kb=20480
--principal=
--redaction_rules_file=
--max_log_files=10
--pause_monitor_sleep_time_ms=500
--pause_monitor_warn_threshold_ms=10000
--log_filename=impalad
--redirect_stdout_stderr=true
--data_source_batch_size=1024
--exchg_node_buffer_size_bytes=10485760
--enable_partitioned_aggregation=true
--enable_partitioned_hash_join=true
--enable_probe_side_filtering=true
--enable_quadratic_probing=true
--skip_lzo_version_check=false
--parquet_min_filter_reject_ratio=0.10000000000000001
--runtime_filter_wait_time_ms=1000
--suppress_unknown_disk_id_warnings=false
--max_row_batches=0
--kudu_max_row_batches=0
--kudu_scanner_keep_alive_period_us=15000000
--kudu_read_mode=READ_LATEST
--kudu_scanner_keep_alive_period_sec=15
--pick_only_leaders_for_tests=false
--kudu_mutation_buffer_size=10485760
--kudu_sink_mem_required=20971520
--convert_legacy_hive_parquet_utc_timestamps=false
--max_page_header_size=8388608
--enable_phj_probe_side_filtering=true
--accepted_cnxn_queue_depth=10000
--enable_ldap_auth=true
--internal_principals_whitelist=hdfs
--kerberos_reinit_interval=60
--ldap_allow_anonymous_binds=false
--ldap_baseDN=ou=users,dc=localdomain
--ldap_bind_pattern=
--ldap_ca_certificate=
--ldap_domain=
--ldap_manual_config=false
--ldap_passwords_in_clear_ok=true
--ldap_tls=false
--ldap_uri=ldap://gc-dp-pdpint-app-01.c.bi-environment-1271.internal:389
--sasl_path=
--rpc_cnxn_attempts=10
--rpc_cnxn_retry_interval_ms=2000
--disk_spill_encryption=false
--insert_inherit_permissions=false
--datastream_sender_timeout_ms=120000
--max_cached_file_handles=0
--max_free_io_buffers=128
--min_buffer_size=1024
--num_disks=0
--num_remote_hdfs_io_threads=24
--num_s3_io_threads=16
--num_threads_per_disk=0
--read_size=8388608
--backend_client_connection_num_retries=3
--backend_client_rpc_timeout_ms=300000
--catalog_client_connection_num_retries=3
--catalog_client_rpc_timeout_ms=0
--catalog_service_host=gc-dp-pdpint-name-02.c.bi-environment-1271.internal
--cgroup_hierarchy_path=
--coordinator_rpc_threads=12
--enable_rm=false
--enable_webserver=true
--llama_addresses=
--llama_callback_port=28000
--llama_host=
--llama_max_request_attempts=5
--llama_port=15000
--llama_registration_timeout_secs=30
--llama_registration_wait_secs=3
--num_hdfs_worker_threads=48
--resource_broker_cnxn_attempts=1
--resource_broker_cnxn_retry_interval_ms=3000
--resource_broker_recv_timeout=0
--resource_broker_send_timeout=0
--staging_cgroup=impala_staging
--state_store_host=gc-dp-pdpint-name-02.c.bi-environment-1271.internal
--state_store_subscriber_port=23000
--use_statestore=true
--s3a_access_key_cmd=
--s3a_secret_key_cmd=
--local_library_dir=/var/lib/impala/udfs
--serialize_batch=false
--status_report_interval=5
--max_filter_error_rate=0.75
--num_threads_per_core=3
--use_local_tz_for_unix_timestamp_conversions=false
--scratch_dirs=/plarium/1/impala/impalad,/plarium/2/impala/impalad,/plarium/3/impala/impalad
--queue_wait_timeout_ms=60000
--rm_always_use_defaults=false
--rm_default_cpu_vcores=2
--rm_default_memory=4G
--default_pool_max_queued=200
--default_pool_max_requests=-1
--default_pool_mem_limit=
--disable_pool_max_requests=false
--disable_pool_mem_limits=false
--fair_scheduler_allocation_path=/run/cloudera-scm-agent/process/2049-impala-IMPALAD/impala-conf/fair-scheduler.xml
--llama_site_path=/run/cloudera-scm-agent/process/2049-impala-IMPALAD/impala-conf/llama-site.xml
--require_username=false
--disable_admission_control=false
--log_mem_usage_interval=0
--authorization_policy_file=/user/impala/integration_impala-policy.ini
--authorization_policy_provider_class=org.apache.sentry.provider.file.LocalGroupResourceAuthorizationProvider
--authorized_proxy_user_config=
--authorized_proxy_user_config_delimiter=,
--kudu_master_hosts=
--server_name=server1
--abort_on_failed_audit_event=true
--abort_on_failed_lineage_event=true
--audit_event_log_dir=
--be_service_threads=64
--beeswax_port=21000
--cancellation_thread_pool_size=5
--default_query_options=
--fe_service_threads=64
--hs2_port=21050
--idle_query_timeout=3600
--idle_session_timeout=86400
--lineage_event_log_dir=/var/log/impalad/lineage
--local_nodemanager_url=
--log_query_to_file=true
--max_audit_event_log_file_size=5000
--max_lineage_log_file_size=5000
--max_profile_log_file_size=5000
--max_profile_log_files=10
--max_result_cache_size=100000
--profile_log_dir=
--query_log_size=25
--ssl_client_ca_certificate=
--ssl_private_key=BUNDLE-REDACTED 05:25:51.321954 26469 init.cc:212] Cpu Info:
  Model: Intel(R) Xeon(R)

Explorer
I am using Ubuntu 16.04 with kernel 4.4.0-81-generic and CDH 5.11. The Impala daemon and the catalog server are not starting. Even after reinstalling, they still do not start. Can you help me? Thanks in advance.

Contributor

You can also do the following; it should solve the problem:

 

Add the following to /usr/lib/cmf/service/impala/impala.sh, right above "set impala configuration directory":


export JAVA_TOOL_OPTIONS="-Xss2m"

Add the same export line to /etc/profile and /etc/bash.bashrc as well.
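
To double-check that the line really landed in all three files, something like the following sketch may help. The helper names `has_xss_export` and `check_xss_files` are hypothetical; the file paths in the usage note are the ones named in this post.

```shell
# Hypothetical helper: succeed if FILE contains an export of JAVA_TOOL_OPTIONS with an -Xss setting.
has_xss_export() {
  grep -Eq '^[[:space:]]*export[[:space:]]+JAVA_TOOL_OPTIONS=.*-Xss[0-9]+[kKmMgG]?' "$1"
}

# Report, for each file given, whether it carries the setting.
check_xss_files() {
  for f in "$@"; do
    if [ -f "$f" ] && has_xss_export "$f"; then
      echo "ok: $f"
    else
      echo "missing: $f"
    fi
  done
}
```

Usage: `check_xss_files /usr/lib/cmf/service/impala/impala.sh /etc/profile /etc/bash.bashrc`. As a separate sanity check, a JVM started with the variable set prints a "Picked up JAVA_TOOL_OPTIONS" notice on stderr, so running `java -version` in a fresh login shell should show it.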

Explorer
After doing this, the Impala daemon starts, but the catalog server still does not start on CDH 5.11. Please help.

Champion

@sri1993

 

Please look at my response in this thread. I think it's a known issue.

http://community.cloudera.com/t5/Cloudera-Manager-Installation/Impala-Catalog-Server-supervisor-perm...

 

Let me know if that helps.