Member since: 09-29-2014
Posts: 221
Kudos Received: 10
Solutions: 9
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 850 | 03-30-2022 08:56 PM
 | 1011 | 08-12-2021 10:40 AM
 | 2939 | 04-28-2021 01:30 AM
 | 2834 | 09-27-2016 08:16 PM
 | 2134 | 09-24-2016 11:46 AM
04-05-2022
07:43 AM
@araujo, do you have any suggestion for this case?
04-01-2022
04:14 AM
I'll give you the whole Sqoop job log so you can check more details (this is just an example; a Hive query behaves the same):
[root@host243 ~]# sqoop export --connect jdbc:mysql://10.37.144.6:3306/xaxsuatdb?characterEncoding=utf-8 --username root --password xaxs2016 --table customer_feature --export-dir "/user/hive/warehouse/penglin.db/label_cus_kpi_hightable_h" --input-fields-terminated-by '\001' --input-null-string '\\N' --input-null-non-string '\\N' --update-key CUSTOMER_BP,ORG_CODE,TAG_ID,VERSION --columns CUSTOMER_BP,ORG_CODE,CPMO_COP,TAG_ID,TAG_NAME,TAG_VALUE,VERSION,UPDATE_TIME --update-mode allowinsert -m 1;
Warning: /data/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/jars/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/jars/log4j-slf4j-impl-2.8.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
22/04/01 19:10:03 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7-cdh6.2.0
22/04/01 19:10:04 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
22/04/01 19:10:04 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
22/04/01 19:10:04 INFO tool.CodeGenTool: Beginning code generation
22/04/01 19:10:04 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customer_feature` AS t LIMIT 1
22/04/01 19:10:04 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `customer_feature` AS t LIMIT 1
22/04/01 19:10:04 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /data/cloudera/parcels/CDH/lib/hadoop-mapreduce
22/04/01 19:10:05 ERROR orm.CompilationManager: Could not rename /tmp/sqoop-root/compile/7255ac988b70c7d9b5eb963a6f4946f5/customer_feature.java to /root/./customer_feature.java. Error: Destination '/root/./customer_feature.java' already exists
22/04/01 19:10:05 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/7255ac988b70c7d9b5eb963a6f4946f5/customer_feature.jar
22/04/01 19:10:05 WARN manager.MySQLManager: MySQL Connector upsert functionality is using INSERT ON
22/04/01 19:10:05 WARN manager.MySQLManager: DUPLICATE KEY UPDATE clause that relies on table's unique key.
22/04/01 19:10:05 WARN manager.MySQLManager: Insert/update distinction is therefore independent on column
22/04/01 19:10:05 WARN manager.MySQLManager: names specified in --update-key parameter. Please see MySQL
22/04/01 19:10:05 WARN manager.MySQLManager: documentation for additional limitations.
22/04/01 19:10:05 INFO mapreduce.ExportJobBase: Beginning export of customer_feature
22/04/01 19:10:06 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
22/04/01 19:10:06 WARN mapreduce.ExportJobBase: IOException checking input file header: java.io.EOFException
22/04/01 19:10:06 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
22/04/01 19:10:06 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
22/04/01 19:10:06 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
22/04/01 19:10:07 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm127
22/04/01 19:10:07 INFO hdfs.DFSClient: Created token for hive: HDFS_DELEGATION_TOKEN owner=hive@DEV.ENN.CN, renewer=yarn, realUser=, issueDate=1648811407106, maxDate=1649416207106, sequenceNumber=176449, masterKeyId=2259 on ha-hdfs:nameservice1
22/04/01 19:10:07 INFO security.TokenCache: Got dt for hdfs://nameservice1; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:nameservice1, Ident: (token for hive: HDFS_DELEGATION_TOKEN owner=hive@DEV.ENN.CN, renewer=yarn, realUser=, issueDate=1648811407106, maxDate=1649416207106, sequenceNumber=176449, masterKeyId=2259)
22/04/01 19:10:07 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/hive/.staging/job_1648759620123_0052
22/04/01 19:10:09 INFO input.FileInputFormat: Total input files to process : 37
22/04/01 19:10:09 INFO input.FileInputFormat: Total input files to process : 37
22/04/01 19:10:09 INFO mapreduce.JobSubmitter: number of splits:2
22/04/01 19:10:09 INFO Configuration.deprecation: yarn.resourcemanager.zk-address is deprecated. Instead, use hadoop.zk.address
22/04/01 19:10:09 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
22/04/01 19:10:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1648759620123_0052
22/04/01 19:10:09 INFO mapreduce.JobSubmitter: Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:nameservice1, Ident: (token for hive: HDFS_DELEGATION_TOKEN owner=hive@DEV.ENN.CN, renewer=yarn, realUser=, issueDate=1648811407106, maxDate=1649416207106, sequenceNumber=176449, masterKeyId=2259)]
22/04/01 19:10:09 INFO conf.Configuration: resource-types.xml not found
22/04/01 19:10:09 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
22/04/01 19:10:09 INFO impl.YarnClientImpl: Submitted application application_1648759620123_0052
22/04/01 19:10:09 INFO mapreduce.Job: The url to track the job: http://host243.master.dev.cluster.enn.cn:8088/proxy/application_1648759620123_0052/
22/04/01 19:10:09 INFO mapreduce.Job: Running job: job_1648759620123_0052
22/04/01 19:10:17 INFO mapreduce.Job: Job job_1648759620123_0052 running in uber mode : false
22/04/01 19:10:17 INFO mapreduce.Job: map 0% reduce 0%
22/04/01 19:10:26 INFO mapreduce.Job: map 50% reduce 0%
22/04/01 19:10:38 INFO mapreduce.Job: map 59% reduce 0%
22/04/01 19:10:44 INFO mapreduce.Job: map 67% reduce 0%
22/04/01 19:10:50 INFO mapreduce.Job: map 75% reduce 0%
22/04/01 19:10:56 INFO mapreduce.Job: map 84% reduce 0%
22/04/01 19:11:02 INFO mapreduce.Job: map 92% reduce 0%
22/04/01 19:11:07 INFO mapreduce.Job: map 100% reduce 0%
22/04/01 19:11:07 INFO mapreduce.Job: Job job_1648759620123_0052 completed successfully
22/04/01 19:11:07 INFO mapreduce.Job: Counters: 34
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=504986
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=135301211
HDFS: Number of bytes written=0
HDFS: Number of read operations=113
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
HDFS: Number of bytes read erasure-coded=0
Job Counters
Launched map tasks=2
Other local map tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=106796
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=53398
Total vcore-milliseconds taken by all map tasks=53398
Total megabyte-milliseconds taken by all map tasks=109359104
Map-Reduce Framework
Map input records=1500414
Map output records=1500414
Input split bytes=3902
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=296
CPU time spent (ms)=35690
Physical memory (bytes) snapshot=930320384
Virtual memory (bytes) snapshot=5727088640
Total committed heap usage (bytes)=1557135360
Peak Map Physical memory (bytes)=534118400
Peak Map Virtual memory (bytes)=2866765824
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
22/04/01 19:11:07 INFO mapreduce.ExportJobBase: Transferred 129.0333 MB in 60.3414 seconds (2.1384 MB/sec)
22/04/01 19:11:07 INFO mapreduce.ExportJobBase: Exported 1500414 records.
As we can see, the Sqoop job finished successfully, but we can find this error in the container logs:
Log Type: container-localizer-syslog
Log Upload Time: Fri Apr 01 19:11:14 +0800 2022
Log Length: 3398
2022-04-01 19:10:18,487 WARN [main] org.apache.hadoop.security.LdapGroupsMapping: Exception while trying to get password for alias hadoop.security.group.mapping.ldap.bind.password:
java.io.IOException: Configuration problem with provider path.
at org.apache.hadoop.conf.Configuration.getPasswordFromCredentialProviders(Configuration.java:2272)
at org.apache.hadoop.conf.Configuration.getPassword(Configuration.java:2191)
at org.apache.hadoop.security.LdapGroupsMapping.getPassword(LdapGroupsMapping.java:719)
at org.apache.hadoop.security.LdapGroupsMapping.setConf(LdapGroupsMapping.java:616)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:77)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:137)
at org.apache.hadoop.security.Groups.<init>(Groups.java:106)
at org.apache.hadoop.security.Groups.<init>(Groups.java:102)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:451)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:352)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:314)
at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1973)
at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:743)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:693)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:604)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:461)
Caused by: java.io.FileNotFoundException: /var/run/cloudera-scm-agent/process/26878-yarn-NODEMANAGER/creds.localjceks (Permission denied)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider.getInputStreamForFile(LocalJavaKeyStoreProvider.java:83)
at org.apache.hadoop.security.alias.AbstractJavaKeyStoreProvider.locateKeystore(AbstractJavaKeyStoreProvider.java:321)
at org.apache.hadoop.security.alias.AbstractJavaKeyStoreProvider.<init>(AbstractJavaKeyStoreProvider.java:86)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider.<init>(LocalJavaKeyStoreProvider.java:58)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider.<init>(LocalJavaKeyStoreProvider.java:50)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider$Factory.createProvider(LocalJavaKeyStoreProvider.java:177)
at org.apache.hadoop.security.alias.CredentialProviderFactory.getProviders(CredentialProviderFactory.java:73)
at org.apache.hadoop.conf.Configuration.getPasswordFromCredentialProviders(Configuration.java:2253)
... 15 more
2022-04-01 19:10:18,723 INFO [main] org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer: Disk Validator: yarn.nodemanager.disk-validator is loaded.
2022-04-01 19:10:19,741 WARN [ContainerLocalizer Downloader] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
Log Type: prelaunch.err
Log Upload Time: Fri Apr 01 19:11:14 +0800 2022
Log Length: 0
Log Type: prelaunch.out
Log Upload Time: Fri Apr 01 19:11:14 +0800 2022
Log Length: 70
Setting up env variables
Setting up job resources
Launching container
Log Type: stderr
Log Upload Time: Fri Apr 01 19:11:14 +0800 2022
Log Length: 0
Log Type: stdout
Log Upload Time: Fri Apr 01 19:11:14 +0800 2022
Log Length: 0
Log Type: syslog
Log Upload Time: Fri Apr 01 19:11:14 +0800 2022
Log Length: 45172
Showing 4096 bytes of 45172 total.
ainer_e483_1648759620123_0052_01_000002/transaction-api-1.1.jar:/data/yarn/nm/usercache/hive/appcache/application_1648759620123_0052/container_e483_1648759620123_0052_01_000002/commons-jexl-2.1.1.jar
java.io.tmpdir: /data/yarn/nm/usercache/hive/appcache/application_1648759620123_0052/container_e483_1648759620123_0052_01_000002/tmp
user.dir: /data/yarn/nm/usercache/hive/appcache/application_1648759620123_0052/container_e483_1648759620123_0052_01_000002
user.name: hive
************************************************************/
2022-04-01 19:10:22,636 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2022-04-01 19:10:23,151 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2022-04-01 19:10:23,248 WARN [main] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2022-04-01 19:10:23,427 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: Paths:/user/hive/warehouse/penglin.db/label_cus_kpi_hightable_h/000005_0:0+1582218,/user/hive/warehouse/penglin.db/label_cus_kpi_hightable_h/000008_0:0+22415866,/user/hive/warehouse/penglin.db/label_cus_kpi_hightable_h/000014_0:0+23029717,/user/hive/warehouse/penglin.db/label_cus_kpi_hightable_h/000015_0:0+10901528,/user/hive/warehouse/penglin.db/label_cus_kpi_hightable_h/000017_0:0+17525005,/user/hive/warehouse/penglin.db/label_cus_kpi_hightable_h/000018_0:0+16289569,/user/hive/warehouse/penglin.db/label_cus_kpi_hightable_h/000036_0:0+43553385
2022-04-01 19:10:23,432 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.file is deprecated. Instead, use mapreduce.map.input.file
2022-04-01 19:10:23,432 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.start is deprecated. Instead, use mapreduce.map.input.start
2022-04-01 19:10:23,432 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: map.input.length is deprecated. Instead, use mapreduce.map.input.length
2022-04-01 19:11:04,627 INFO [Thread-14] org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
2022-04-01 19:11:04,671 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1648759620123_0052_m_000000_0 is done. And is in the process of committing
2022-04-01 19:11:04,715 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1648759620123_0052_m_000000_0' done.
2022-04-01 19:11:04,728 INFO [main] org.apache.hadoop.mapred.Task: Final Counters for attempt_1648759620123_0052_m_000000_0: Counters: 26
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=252493
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=135298087
HDFS: Number of bytes written=0
HDFS: Number of read operations=22
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
HDFS: Number of bytes read erasure-coded=0
Map-Reduce Framework
Map input records=1500414
Map output records=1500414
Input split bytes=778
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=231
CPU time spent (ms)=34070
Physical memory (bytes) snapshot=534118400
Virtual memory (bytes) snapshot=2866765824
Total committed heap usage (bytes)=788529152
Peak Map Physical memory (bytes)=534118400
Peak Map Virtual memory (bytes)=2866765824
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
2022-04-01 19:11:04,829 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
2022-04-01 19:11:04,829 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
2022-04-01 19:11:04,829 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete
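By the way, about the MySQLManager warning above: with --update-mode allowinsert, the connector generates INSERT ... ON DUPLICATE KEY UPDATE statements, so MySQL's own unique key decides insert vs. update, not the --update-key columns. A rough sketch of the statement shape (the values here are placeholders, not from my job):

# Hedged illustration only: the approximate statement shape Sqoop's MySQL
# connector emits in allowinsert mode; the UNIQUE/PRIMARY key on
# customer_feature (not --update-key) drives the insert/update decision.
mysql -h 10.37.144.6 -u root -p xaxsuatdb <<'SQL'
INSERT INTO customer_feature
  (CUSTOMER_BP, ORG_CODE, CPMO_COP, TAG_ID, TAG_NAME, TAG_VALUE, VERSION, UPDATE_TIME)
VALUES
  ('BP001', 'ORG01', 'COP01', 'T001', 'demo_tag', 'demo_value', '1', NOW())
ON DUPLICATE KEY UPDATE
  TAG_NAME = VALUES(TAG_NAME),
  TAG_VALUE = VALUES(TAG_VALUE),
  UPDATE_TIME = VALUES(UPDATE_TIME);
SQL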
04-01-2022
04:07 AM
Log Type: container-localizer-syslog
Log Upload Time: Thu Mar 31 02:24:54 +0800 2022
Log Length: 3720
2022-03-31 02:24:16,275 WARN [main] org.apache.hadoop.security.LdapGroupsMapping: Exception while trying to get password for alias hadoop.security.group.mapping.ldap.bind.password:
java.io.IOException: Configuration problem with provider path.
at org.apache.hadoop.conf.Configuration.getPasswordFromCredentialProviders(Configuration.java:2272)
at org.apache.hadoop.conf.Configuration.getPassword(Configuration.java:2191)
at org.apache.hadoop.security.LdapGroupsMapping.getPassword(LdapGroupsMapping.java:719)
at org.apache.hadoop.security.LdapGroupsMapping.setConf(LdapGroupsMapping.java:616)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:77)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:137)
at org.apache.hadoop.security.Groups.<init>(Groups.java:106)
at org.apache.hadoop.security.Groups.<init>(Groups.java:102)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:451)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:352)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:314)
at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1973)
at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:743)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:693)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:604)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:461)
Caused by: java.io.FileNotFoundException: /var/run/cloudera-scm-agent/process/26618-yarn-NODEMANAGER/creds.localjceks (Permission denied)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider.getInputStreamForFile(LocalJavaKeyStoreProvider.java:83)
at org.apache.hadoop.security.alias.AbstractJavaKeyStoreProvider.locateKeystore(AbstractJavaKeyStoreProvider.java:321)
at org.apache.hadoop.security.alias.AbstractJavaKeyStoreProvider.<init>(AbstractJavaKeyStoreProvider.java:86)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider.<init>(LocalJavaKeyStoreProvider.java:58)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider.<init>(LocalJavaKeyStoreProvider.java:50)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider$Factory.createProvider(LocalJavaKeyStoreProvider.java:177)
at org.apache.hadoop.security.alias.CredentialProviderFactory.getProviders(CredentialProviderFactory.java:73)
at org.apache.hadoop.conf.Configuration.getPasswordFromCredentialProviders(Configuration.java:2253)
... 15 more
2022-03-31 02:24:16,438 INFO [main] org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer: Disk Validator: yarn.nodemanager.disk-validator is loaded.
2022-03-31 02:24:17,272 WARN [ContainerLocalizer Downloader] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
2022-03-31 02:24:19,294 WARN [ContainerLocalizer Downloader] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
It shows up in container-localizer-syslog. As you know, every map/reduce task has several log files. If we open the YARN web UI, pick any job, and click one of its map/reduce tasks to check the task details, we can see the logs below:
container-localizer-syslog : Total file length is 3398 bytes.
prelaunch.err : Total file length is 0 bytes.
prelaunch.out : Total file length is 70 bytes.
stderr : Total file length is 1643 bytes.
stdout : Total file length is 0 bytes.
syslog : Total file length is 141307 bytes.
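By the way, the same container logs can be pulled from the command line instead of the web UI; a rough sketch, assuming log aggregation is enabled:

# Hedged sketch: fetch the aggregated container logs for the finished job;
# the application id is the one from the Sqoop run above.
yarn logs -applicationId application_1648759620123_0052 > app_logs.txt
# Newer Hadoop 3 releases can also filter to a single log file, e.g.:
# yarn logs -applicationId application_1648759620123_0052 -log_files container-localizer-syslog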
03-31-2022
07:10 PM
Actually, I don't know which user should access this file while running map/reduce (a Hive query or Sqoop; maybe there are also other programs).
03-31-2022
07:02 PM
Hi @araujo, please refer to the information below:
[appadmin@host21 ~]$ namei -l /run/cloudera-scm-agent/process/9295-yarn-RESOURCEMANAGER/creds.localjceks
f: /run/cloudera-scm-agent/process/9295-yarn-RESOURCEMANAGER/creds.localjceks
dr-xr-xr-x root root /
drwxr-xr-x root root run
drwxr-xr-x cloudera-scm cloudera-scm cloudera-scm-agent
drwxr-x--x root root process
drwxr-x--x yarn hadoop 9295-yarn-RESOURCEMANAGER
-rw-r----- yarn hadoop creds.localjceks
[appadmin@host21 ~]$ ls -ln /run/cloudera-scm-agent/process/9295-yarn-RESOURCEMANAGER/creds.localjceks
-rw-r----- 1 981 984 533 Mar 25 03:10 /run/cloudera-scm-agent/process/9295-yarn-RESOURCEMANAGER/creds.localjceks
The creds.localjceks owner is 981:984; the output below shows the yarn user ID and the hadoop group ID:
[root@host21 ~]# cat /etc/passwd | grep 981
solr:x:987:981:Solr:/var/lib/solr:/sbin/nologin
yarn:x:981:975:Hadoop Yarn:/var/lib/hadoop-yarn:/bin/bash
[root@host21 ~]#
[root@host21 ~]# cat /etc/group | grep hadoop
hadoop:x:984:hdfs,mapred,yarn
03-31-2022
04:05 PM
/run/cloudera-scm-agent/process/9506-IMPALA-impala-CATALOGSERVER-45e2ae1dbc69e00f769182717dd71aa8-ImpalaRoleDiagnosticsCollection/creds.localjceks
/run/cloudera-scm-agent/process/9478-hue-KT_RENEWER/creds.localjceks
/run/cloudera-scm-agent/process/9476-hue-HUE_SERVER/creds.localjceks
/run/cloudera-scm-agent/process/9471-impala-CATALOGSERVER/creds.localjceks
/run/cloudera-scm-agent/process/9462-impala-CATALOGSERVER/creds.localjceks
/run/cloudera-scm-agent/process/9456-sentry-SENTRY_SERVER/creds.localjceks
/run/cloudera-scm-agent/process/9455-oozie-OOZIE_SERVER/creds.localjceks
/run/cloudera-scm-agent/process/9454-hue-KT_RENEWER/creds.localjceks
/run/cloudera-scm-agent/process/9452-hue-HUE_SERVER/creds.localjceks
/run/cloudera-scm-agent/process/9448-hive-HIVEMETASTORE/creds.localjceks
/run/cloudera-scm-agent/process/9446-sentry-SENTRY_SERVER/creds.localjceks
/run/cloudera-scm-agent/process/9445-oozie-OOZIE_SERVER/creds.localjceks
/run/cloudera-scm-agent/process/9444-hue-KT_RENEWER/creds.localjceks
/run/cloudera-scm-agent/process/9442-hue-HUE_SERVER/creds.localjceks
/run/cloudera-scm-agent/process/9438-hive-HIVEMETASTORE/creds.localjceks
/run/cloudera-scm-agent/process/9437-hue-KT_RENEWER/creds.localjceks
/run/cloudera-scm-agent/process/9435-hue-HUE_SERVER/creds.localjceks
/run/cloudera-scm-agent/process/9429-impala-CATALOGSERVER/creds.localjceks
/run/cloudera-scm-agent/process/9424-oozie-OOZIE_SERVER/creds.localjceks
/run/cloudera-scm-agent/process/9420-hive-HIVEMETASTORE/creds.localjceks
/run/cloudera-scm-agent/process/9400-sentry-SENTRY_SERVER/creds.localjceks
/run/cloudera-scm-agent/process/9399-yarn-RESOURCEMANAGER/creds.localjceks
/run/cloudera-scm-agent/process/9388-yarn-JOBHISTORY/creds.localjceks
/run/cloudera-scm-agent/process/9413-hbase-REGIONSERVER/creds.localjceks
/run/cloudera-scm-agent/process/9411-hbase-MASTER/creds.localjceks
/run/cloudera-scm-agent/process/9377-hdfs-NAMENODE-nnRpcWait/creds.localjceks
/run/cloudera-scm-agent/process/9361-hdfs-NAMENODE/creds.localjceks
/run/cloudera-scm-agent/process/9351-HBaseShutdown/creds.localjceks
/run/cloudera-scm-agent/process/9343-hue-HUE_SERVER/creds.localjceks
/run/cloudera-scm-agent/process/9345-hue-KT_RENEWER/creds.localjceks
/run/cloudera-scm-agent/process/9339-hive-HIVEMETASTORE/creds.localjceks
/run/cloudera-scm-agent/process/9338-oozie-OOZIE_SERVER/creds.localjceks
/run/cloudera-scm-agent/process/9337-sentry-SENTRY_SERVER/creds.localjceks
/run/cloudera-scm-agent/process/9333-hue-KT_RENEWER/creds.localjceks
Every role has its own creds.localjceks, and the default permission is 640. I picked some roles' creds.localjceks files for you to check:
[root@host21 ~]# ls -l /run/cloudera-scm-agent/process/9478-hue-KT_RENEWER/creds.localjceks
-rw-r----- 1 hue hue 1501 Mar 25 04:11 /run/cloudera-scm-agent/process/9478-hue-KT_RENEWER/creds.localjceks
[root@host21 ~]# ls -l /run/cloudera-scm-agent/process/9471-impala-CATALOGSERVER/creds.localjceks
-rw-r----- 1 impala impala 533 Mar 25 04:01 /run/cloudera-scm-agent/process/9471-impala-CATALOGSERVER/creds.localjceks
[root@host21 ~]#
[root@host21 ~]# ls -l /run/cloudera-scm-agent/process/8788-hive-HIVEMETASTORE/creds.localjceks
-rw-r----- 1 hive hive 528 Mar 4 09:34 /run/cloudera-scm-agent/process/8788-hive-HIVEMETASTORE/creds.localjceks
[root@host21 ~]# ls -l /run/cloudera-scm-agent/process/9295-yarn-RESOURCEMANAGER/creds.localjceks
-rw-r----- 1 yarn hadoop 533 Mar 25 03:10 /run/cloudera-scm-agent/process/9295-yarn-RESOURCEMANAGER/creds.localjceks
When I run Hive SQL or Sqoop, the permission denied on creds.localjceks happens.
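A rough way to reproduce the localizer's view of the file (my own sketch, not from this thread) is to read it as the submitting user, since the container-localizer runs as that user (hive here), not as yarn:

# Hedged check: the NodeManager's creds.localjceks is yarn:hadoop mode 640,
# so a submitting user that is not in the hadoop group gets "Permission denied".
id hive    # is hive in the hadoop group?
sudo -u hive cat /var/run/cloudera-scm-agent/process/26878-yarn-NODEMANAGER/creds.localjceks > /dev/null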
03-31-2022
02:02 PM
[root@host243 ~]# id yarn
uid=979(yarn) gid=973(yarn) groups=973(yarn),982(hadoop),979(solr)
[root@host243 ~]#
[root@host243 ~]#
[root@host243 ~]# hdfs groups yarn
yarn : hadoop yarn
The OpenLDAP users were imported from the OS users, so I think the OpenLDAP users and groups are the same as the OS users/groups. There is just one thing I'd like to share with you: after integrating with OpenLDAP, I haven't deleted the OS users.
03-30-2022
08:56 PM
1 Kudo
It's done. After I set the storage policy to ALL_SSD and restarted all the services, this error disappeared.
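For reference, roughly the commands involved (a sketch; the path is just an example, not my actual one):

# Hedged sketch: set and verify an HDFS storage policy on a path.
hdfs storagepolicies -setStoragePolicy -path /user/hive/warehouse -policy ALL_SSD
hdfs storagepolicies -getStoragePolicy -path /user/hive/warehouse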
03-30-2022
01:31 PM
I followed https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_sg_ldap_grp_mappings.html#ldap_group_mapping to set up the OpenLDAP integration: 1. install OpenLDAP; 2. set the LDAP parameters per the documentation (sketched below); 3. restart all services.
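The parameters in step 2 are, roughly (placeholder values; just a sketch of the keys that document covers, set via the core-site.xml safety valve in CM):

# Hedged sketch of the LDAP group mapping keys, with placeholder values:
#   hadoop.security.group.mapping               = org.apache.hadoop.security.LdapGroupsMapping
#   hadoop.security.group.mapping.ldap.url      = ldap://ldap-host:389
#   hadoop.security.group.mapping.ldap.bind.user     = cn=admin,dc=example,dc=com
#   hadoop.security.group.mapping.ldap.bind.password = ********
#   hadoop.security.group.mapping.ldap.base          = dc=example,dc=com
# A quick way to confirm what the cluster actually resolves:
hdfs groups yarn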
03-30-2022
11:47 AM
As you know, this file is located in many paths (NameNode, DataNode, YARN, HBase), and it is created by CDH. Do you suggest I change the permissions of these paths? If I restart one of these roles, I think the file would be created again, and the permission would still be 700.
03-30-2022
10:48 AM
Hi, after integrating CDH with OpenLDAP, I found a WARNING in the container log like the one below: it tries to read the password file creds.localjceks and gets permission denied.
2022-03-31 00:53:13,420 WARN [main] org.apache.hadoop.security.LdapGroupsMapping: Exception while trying to get password for alias hadoop.security.group.mapping.ldap.ssl.keystore.password:
java.io.IOException: Configuration problem with provider path.
at org.apache.hadoop.conf.Configuration.getPasswordFromCredentialProviders(Configuration.java:2118)
at org.apache.hadoop.conf.Configuration.getPassword(Configuration.java:2037)
at org.apache.hadoop.security.LdapGroupsMapping.getPassword(LdapGroupsMapping.java:528)
at org.apache.hadoop.security.LdapGroupsMapping.setConf(LdapGroupsMapping.java:473)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.security.Groups.<init>(Groups.java:104)
at org.apache.hadoop.security.Groups.<init>(Groups.java:100)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:435)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:341)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:308)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:895)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:861)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:728)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:387)
Caused by: java.io.FileNotFoundException: /run/cloudera-scm-agent/process/9392-yarn-NODEMANAGER/creds.localjceks (Permission denied)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider.getInputStreamForFile(LocalJavaKeyStoreProvider.java:83)
at org.apache.hadoop.security.alias.AbstractJavaKeyStoreProvider.locateKeystore(AbstractJavaKeyStoreProvider.java:334)
at org.apache.hadoop.security.alias.AbstractJavaKeyStoreProvider.<init>(AbstractJavaKeyStoreProvider.java:88)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider.<init>(LocalJavaKeyStoreProvider.java:58)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider.<init>(LocalJavaKeyStoreProvider.java:50)
at org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider$Factory.createProvider(LocalJavaKeyStoreProvider.java:177)
at org.apache.hadoop.security.alias.CredentialProviderFactory.getProviders(CredentialProviderFactory.java:73)
at org.apache.hadoop.conf.Configuration.getPasswordFromCredentialProviders(Configuration.java:2098)
This warning doesn't affect the MapReduce job; I just want to know how to resolve it.
Labels: Apache Hadoop
03-25-2022
03:06 PM
Recently I set up a new CDH cluster with all-SSD disks. After the cluster went live, I found that the NameNode log always outputs WARNING messages like the ones below:
2022-03-26 06:00:57,688 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD], creationFallbacks=[DISK], replicationFallbacks=[DISK]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology
2022-03-26 06:00:57,688 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD], creationFallbacks=[DISK], replicationFallbacks=[DISK]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology.
I would like to know what exactly happened, so I enabled the DEBUG log level:
2022-03-26 05:56:50,837 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD], creationFallbacks=[DISK], replicationFallbacks=[DISK]}, newBlock=true)
2022-03-26 05:56:50,837 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.20.103:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:50,837 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to choose from local rack (location = /default); the second replica is not found, retry choosing randomly
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:827)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:715)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:622)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalStorage(BlockPlacementPolicyDefault.java:582)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:485)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:416)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:445)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:292)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:143)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:159)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2094)
    at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2673)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
2022-03-26 05:56:50,837 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD], creationFallbacks=[DISK], replicationFallbacks=[DISK]}, newBlock=true)
2022-03-26 05:56:50,837 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to choose remote rack (location = ~/default), fallback to local rack
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:827)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:689)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:494)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:416)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:465)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:445)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:292)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:143)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:159)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2094)
    at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2673)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
2022-03-26 05:56:50,837 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to choose remote rack (location = ~/default), fallback to local rack
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:827)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:689)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:503)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:416)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:465)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:445)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:292)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:143)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:159)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2094)
    at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2673)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
There is one piece of information that is very strange to me: "the node xxxx does not have enough space". Actually, this is a new cluster, and every node still has 8 TB of space.
2022-03-26 05:56:45,328 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.23.103:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:46,724 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.23.27:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:46,724 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.23.27:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:50,836 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.20.103:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:50,837 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.20.103:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:51,777 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.21.31:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:51,778 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.21.31:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:57,978 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.21.228:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
2022-03-26 05:56:57,978 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: The node 10.228.21.228:9866 does not have enough SSD space (required=268435456, scheduled=0, remaining=0).
Does anyone know how to handle this kind of error?
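For completeness, a hedged sketch of one common cause (my assumption, not a confirmed diagnosis): remaining=0 SSD on every node usually means the DataNode data dirs carry no storage-type tag, so they default to DISK and an ALL_SSD policy finds no SSD capacity.

# Hedged sketch: storage types are tagged per directory in
# dfs.datanode.data.dir, e.g. (placeholder paths):
#   [SSD]/data/1/dfs/dn,[SSD]/data/2/dfs/dn
# After restarting the DataNodes, inspect the effective setting
# (placeholder config path):
grep -A1 'dfs.datanode.data.dir' /etc/hadoop/conf/hdfs-site.xml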
03-23-2022
01:21 AM
1 Kudo
Oh, this is an issue from a long time ago. The root cause was that the new machines' charset was not UTF-8; just keep the charset of all machines as UTF-8, then it's OK.
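A rough sketch of how to check and set this, in case someone hits the same thing (assuming CentOS 7 with systemd):

# Hedged sketch: verify and persist a UTF-8 charset.
locale                                   # inspect current LANG / LC_* values
localectl set-locale LANG=en_US.UTF-8    # persist a UTF-8 locale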
03-22-2022
05:39 PM
I found some people using SSSD to resolve this issue: install SSSD on every machine, and the YARN service will look for the user in the local OS; if the OS doesn't have the user, it will look in OpenLDAP. I have tested this solution and it works fine. But I still don't want to install SSSD on every machine, so my question is still: why can HDFS, Hive, and Sentry work fine with OpenLDAP, but not YARN? What should I do?
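For reference, a rough sketch of how the SSSD path can be verified (my own check; getent goes through NSS, which is the same lookup the YARN container-executor performs):

# Hedged sketch: with SSSD in nsswitch.conf, an LDAP-only user must resolve
# through NSS on the node.
getent passwd jialong    # should print the LDAP user once SSSD is working
id jialong               # should show the user's groups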
03-22-2022
02:34 PM
Hi @vinayk, I have the same issue as yours: I integrated Hadoop with OpenLDAP, and HDFS, Hive, and Sentry work fine; I mean Sentry and HDFS can find the user in OpenLDAP. The only exception is YARN. When I test the MapReduce examples with a user that exists only in OpenLDAP, it gives me errors like the ones below:
main : run as user is jialong
main : requested yarn user is jialong
User jialong not found
As you know, it will be OK once we create this user at the OS level, but I don't want to create users at the OS level. How can I achieve that? Why do HDFS and Sentry work fine, but not YARN?
03-22-2022
01:33 PM
Hi everyone. I have finished integrating Hadoop with OpenLDAP and have tested Hive, Sentry, and HDFS; they work perfectly. But YARN can't find the user in OpenLDAP. When I run a MapReduce job at the OS level, it shows me the errors below:
main : run as user is jialong
main : requested yarn user is jialong
User jialong not found
My question is: what should I do so that YARN can get the OpenLDAP user? I don't want to create an OS user on every YARN machine.
Labels: Apache YARN
08-12-2021
10:40 AM
Let me give you more details about this CDH cluster. The original cluster is 5.14, with OS version CentOS 6.5 and RHEL 6 parcels. Recently I added new machines into this cluster, with OS version CentOS 7.6 and RHEL 7 parcels. All these errors happen only on the new RHEL 7 machines; the old DataNodes don't have these errors.
08-02-2021
02:30 PM
I have found that one of my CDH clusters has many errors on every DataNode; the error logs are below. Does anyone have experience with this kind of issue? Any advice would be appreciated.
2021-08-03 05:23:43,389 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061604_2849065475 src: /10.37.54.218:36088 dest: /10.37.54.218:1004
2021-08-03 05:23:43,700 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.218:36082, dest: /10.37.54.218:1004, bytes: 358, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-859199005_222, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061597_2849065468, duration: 59733778
2021-08-03 05:23:43,700 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061597_2849065468, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2021-08-03 05:23:43,833 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.218:36088, dest: /10.37.54.218:1004, bytes: 309, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-859199005_222, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061604_2849065475, duration: 200220559
2021-08-03 05:23:43,833 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061604_2849065475, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2021-08-03 05:23:44,044 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061619_2849065490 src: /10.37.54.15:59320 dest: /10.37.54.218:1004
2021-08-03 05:23:44,058 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.15:59320, dest: /10.37.54.218:1004, bytes: 112, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1165227557_139, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061619_2849065490, duration: 3752037
2021-08-03 05:23:44,058 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061619_2849065490, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2021-08-03 05:23:45,037 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061679_2849065550 src: /10.37.54.218:36108 dest: /10.37.54.218:1004
2021-08-03 05:23:45,185 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.218:36108, dest: /10.37.54.218:1004, bytes: 1415899, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1849481388_3452, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061679_2849065550, duration: 61038196
2021-08-03 05:23:45,185 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061679_2849065550, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2021-08-03 05:23:45,497 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Moved BP-2123011416-10.37.54.12-1457006347704:blk_3802213701_2741214333 from /10.37.54.13:44312, delHint=6a0ea409-35ad-42c5-956d-44a5b9bd58a6
2021-08-03 05:23:45,703 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061646_2849065517 src: /10.37.54.216:54728 dest: /10.37.54.218:1004
2021-08-03 05:23:45,714 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received BP-2123011416-10.37.54.12-1457006347704:blk_3910061646_2849065517 src: /10.37.54.216:54728 dest: /10.37.54.218:1004 of size 4786053
2021-08-03 05:23:45,998 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Moved BP-2123011416-10.37.54.12-1457006347704:blk_1842563008_775812314 from /10.37.54.13:50434, delHint=6a0ea409-35ad-42c5-956d-44a5b9bd58a6
2021-08-03 05:23:46,042 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: BlockSender.sendChunks() exception:
java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:608)
at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:223)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:605)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:789)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:736)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:551)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:148)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:103)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
at java.lang.Thread.run(Thread.java:745)
2021-08-03 05:23:46,043 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: BlockSender.sendChunks() exception:
java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:608)
at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:223)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:605)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:789)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:736)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:551)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:148)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:103)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
at java.lang.Thread.run(Thread.java:745)
2021-08-03 05:23:47,003 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061723_2849065594 src: /10.37.54.216:54770 dest: /10.37.54.218:1004
2021-08-03 05:23:47,018 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061724_2849065595 src: /10.37.54.216:54772 dest: /10.37.54.218:1004
2021-08-03 05:23:47,019 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.216:54772, dest: /10.37.54.218:1004, bytes: 4158, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1438538333_1, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061724_2849065595, duration: 1392081
2021-08-03 05:23:47,019 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061724_2849065595, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2021-08-03 05:23:47,048 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061725_2849065596 src: /10.37.54.216:54774 dest: /10.37.54.218:1004
2021-08-03 05:23:47,056 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.216:54774, dest: /10.37.54.218:1004, bytes: 69, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_1452909160_189, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061725_2849065596, duration: 7712861
2021-08-03 05:23:47,056 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061725_2849065596, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2021-08-03 05:23:47,371 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061731_2849065602 src: /10.37.54.218:36198 dest: /10.37.54.218:1004
2021-08-03 05:23:47,407 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.218:36198, dest: /10.37.54.218:1004, bytes: 314, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_466653976_222, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061731_2849065602, duration: 11069615
2021-08-03 05:23:47,407 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061731_2849065602, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2021-08-03 05:23:47,422 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-2123011416-10.37.54.12-1457006347704:blk_3910061732_2849065603 src: /10.37.54.218:36202 dest: /10.37.54.218:1004
2021-08-03 05:23:47,458 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.37.54.218:36202, dest: /10.37.54.218:1004, bytes: 17456, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_466653976_222, offset: 0, srvID: 44713da0-9f69-44ea-b6c0-8f7420a41f83, blockid: BP-2123011416-10.37.54.12-1457006347704:blk_3910061732_2849065603, duration: 9623611
2021-08-03 05:23:47,458 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-2123011416-10.37.54.12-1457006347704:blk_3910061732_2849065603, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2021-08-03 05:23:47,497 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Moved BP-2123011416-10.37.54.12-1457006347704:blk_2529434549_1466543157 from /10.37.54.13:39396, delHint=6a0ea409-35ad-42c5-956d-44a5b9bd58a6
05-05-2021
07:52 PM
In my previous experience, I have never set the port range; the default port range is 32768–65536. So my only question is: why can't the ports around 1000 be connected to? Could you give me some information?
04-28-2021
01:30 AM
This issue has been solved now, and the investigation went like this: when I got this issue from the development team, they told me some tasks would fail and asked me how to solve it. I then opened the YARN web UI to check the exact errors of this issue and found connection timeouts; that was the first view I got. So I was considering: why can't the port be connected? Maybe there is a firewall? Or maybe one machine has some problem, and the issue happens when a task is assigned to that machine? These were all assumptions, and after two days of checking, the answer was no: there is no firewall, and the issue happened randomly on every machine. Just last night, I found that if the connection port is near 1000, the job fails with a connection timeout, but if the port is near 30000+, there is no issue at all. So I went to check sysctl.conf and found the port range setting was "net.ipv4.ip_local_port_range = 1024 65000". Finally, I set the port range to "32768 65000", and the issue was solved.
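For anyone who needs them, the fix commands, roughly (a sketch):

# Hedged sketch: apply and persist the narrower ephemeral port range.
sysctl -w net.ipv4.ip_local_port_range="32768 65000"
echo 'net.ipv4.ip_local_port_range = 32768 65000' >> /etc/sysctl.conf
sysctl -p    # reload the persisted settings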
04-27-2021
03:02 PM
Is port 1983 the Application Master's port or not? I am not sure about that.
04-27-2021
03:01 PM
Log Type: syslog
Log Upload Time: Wed Apr 28 03:31:04 +0800 2021
Log Length: 132219
2021-04-28 03:27:36,319 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1618548626214_128739_000001
2021-04-28 03:27:36,530 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2021-04-28 03:27:36,530 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@5b218417)
2021-04-28 03:27:36,706 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config org.apache.hadoop.hive.ql.io.HiveFileFormatUtils$NullOutputCommitter
2021-04-28 03:27:36,708 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.hive.ql.io.HiveFileFormatUtils$NullOutputCommitter
2021-04-28 03:27:37,232 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-04-28 03:27:37,378 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
2021-04-28 03:27:37,379 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
2021-04-28 03:27:37,380 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
2021-04-28 03:27:37,381 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
2021-04-28 03:27:37,381 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
2021-04-28 03:27:37,385 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
2021-04-28 03:27:37,386 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
2021-04-28 03:27:37,386 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
2021-04-28 03:27:37,430 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://nameservice1:8020]
2021-04-28 03:27:37,449 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://nameservice1:8020]
2021-04-28 03:27:37,469 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://nameservice1:8020]
2021-04-28 03:27:37,481 INFO [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Emitting job history data to the timeline server is not enabled
2021-04-28 03:27:37,513 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
2021-04-28 03:27:37,673 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2021-04-28 03:27:37,724 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2021-04-28 03:27:37,725 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
2021-04-28 03:27:37,735 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1618548626214_128739 to jobTokenSecretManager
2021-04-28 03:27:37,855 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1618548626214_128739 because: not enabled; too much RAM;
2021-04-28 03:27:37,877 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1618548626214_128739 = 23256534. Number of splits = 7
2021-04-28 03:27:37,877 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1618548626214_128739 = 0
2021-04-28 03:27:37,877 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1618548626214_128739Job Transitioned from NEW to INITED
2021-04-28 03:27:37,878 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1618548626214_128739.
2021-04-28 03:27:37,905 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 100
2021-04-28 03:27:37,914 INFO [Socket Reader #1 for port 9115] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 9115
2021-04-28 03:27:37,951 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
2021-04-28 03:27:37,952 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2021-04-28 03:27:37,952 INFO [IPC Server listener on 9115] org.apache.hadoop.ipc.Server: IPC Server listener on 9115: starting
2021-04-28 03:27:37,953 INFO [main] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated MRClientService at dataware-14/10.39.58.19:9115
2021-04-28 03:27:38,009 INFO [main] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2021-04-28 03:27:38,015 INFO [main] org.apache.hadoop.security.authentication.server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2021-04-28 03:27:38,019 INFO [main] org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.mapreduce is not defined
2021-04-28 03:27:38,027 INFO [main] org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2021-04-28 03:27:38,072 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context mapreduce
2021-04-28 03:27:38,074 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context static
2021-04-28 03:27:38,077 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /mapreduce/*
2021-04-28 03:27:38,077 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2021-04-28 03:27:38,086 INFO [main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 41305
2021-04-28 03:27:38,086 INFO [main] org.mortbay.log: jetty-6.1.26.cloudera.4
2021-04-28 03:27:38,120 INFO [main] org.mortbay.log: Extract jar:file:/opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/jars/hadoop-yarn-common-2.6.0-cdh5.13.3.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_41305_mapreduce____2p8bem/webapp
2021-04-28 03:27:38,436 INFO [main] org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:41305
2021-04-28 03:27:38,437 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce started at 41305
2021-04-28 03:27:38,742 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2021-04-28 03:27:38,745 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: JOB_CREATE job_1618548626214_128739
2021-04-28 03:27:38,748 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 3000
2021-04-28 03:27:38,749 INFO [Socket Reader #1 for port 1983] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 1983
2021-04-28 03:27:38,753 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2021-04-28 03:27:38,753 INFO [IPC Server listener on 1983] org.apache.hadoop.ipc.Server: IPC Server listener on 1983: starting
2021-04-28 03:27:38,775 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
2021-04-28 03:27:38,775 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3
2021-04-28 03:27:38,775 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33
2021-04-28 03:27:38,848 INFO [main] org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider: Failing over to rm237
2021-04-28 03:27:38,877 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: maxContainerCapability: <memory:24576, vCores:14>
2021-04-28 03:27:38,877 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: queue: root.etl_core
2021-04-28 03:27:38,881 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper limit on the thread pool size is 500
2021-04-28 03:27:38,881 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: The thread pool initial size is 10
2021-04-28 03:27:38,889 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1618548626214_128739Job Transitioned from INITED to SETUP
2021-04-28 03:27:38,893 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP
2021-04-28 03:27:38,895 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1618548626214_128739Job Transitioned from SETUP to RUNNING
2021-04-28 03:27:38,970 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1618548626214_128739_m_000000 Task Transitioned from NEW to SCHEDULED
2021-04-28 03:27:38,988 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1618548626214_128739, File: hdfs://nameservice1:8020/user/hive/.staging/job_1618548626214_128739/job_1618548626214_128739_1.jhist
2021-04-28 03:27:38,994 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1618548626214_128739_m_000001 Task Transitioned from NEW to SCHEDULED
2021-04-28 03:27:39,013 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1618548626214_128739_m_000002 Task Transitioned from NEW to SCHEDULED
2021-04-28 03:27:39,032 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1618548626214_128739_m_000003 Task Transitioned from NEW to SCHEDULED
2021-04-28 03:27:39,049 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1618548626214_128739_m_000004 Task Transitioned from NEW to SCHEDULED
2021-04-28 03:27:39,077 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1618548626214_128739_m_000005 Task Transitioned from NEW to SCHEDULED
2021-04-28 03:27:39,095 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1618548626214_128739_m_000006 Task Transitioned from NEW to SCHEDULED
2021-04-28 03:27:39,097 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1618548626214_128739_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2021-04-28 03:27:39,097 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1618548626214_128739_m_000001_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2021-04-28 03:27:39,098 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1618548626214_128739_m_000002_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2021-04-28 03:27:39,098 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1618548626214_128739_m_000003_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2021-04-28 03:27:39,098 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1618548626214_128739_m_000004_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2021-04-28 03:27:39,098 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1618548626214_128739_m_000005_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2021-04-28 03:27:39,098 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1618548626214_128739_m_000006_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2021-04-28 03:27:39,099 INFO [Thread-53] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceRequest:<memory:6144, vCores:1>

Please help me check port 1983: every time the job fails, the retried connection is always to port 1983, and after several attempts the job fails with a connection timeout.

Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-28 03:27:59,247 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-14/10.39.58.19:1983. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-28 03:28:03,247 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-14/10.39.58.19:1983. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-28 03:28:07,248 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-14/10.39.58.19:1983. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-28 03:28:11,249 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-14/10.39.58.19:1983. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-28 03:28:15,250 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-14/10.39.58.19:1983. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-28 03:28:19,251 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-14/10.39.58.19:1983. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-28 03:28:23,253 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-14/10.39.58.19:1983. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-28 03:28:26,258 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.net.ConnectException: Call From dataware-17/10.39.58.15 to dataware-14:1983 failed on connection exception: java.net.ConnectException: Connection timed out; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1508)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:246)
at com.sun.proxy.$Proxy9.getTask(Unknown Source)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:132)
Caused by: java.net.ConnectException: Connection timed out
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:648)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744)
at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557)
at org.apache.hadoop.ipc.Client.call(Client.java:1480)
... 4 more
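When the timeout shows up, a simple reachability test from the failing task's host to the AM host and port can show whether it is a network-level problem (a minimal sketch; hostname and port are taken from the log above, and nc options can vary by distribution):

# From dataware-17, test the TCP connection the task keeps retrying
nc -vz dataware-14 1983
# If nc is not installed, bash's /dev/tcp trick works too (times out if unreachable)
timeout 5 bash -c 'cat < /dev/null > /dev/tcp/dataware-14/1983' && echo open || echo closed/filtered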
... View more
04-25-2021
08:55 AM
You are right, the connection is between two NodeManagers, and I assume dataware-3:1079 is the app master and the other one is a task; that is why I said the connection timeout is from a task to the AppMaster. Since this kind of error just happens randomly, about once per hour, it is really hard for me to find the root cause.
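Rather than guessing which side is the AppMaster, YARN can report it directly (a small sketch; the application ID is taken from the job log above, and the exact output fields can vary by version):

# Print the application report, which includes the AM Host and RPC Port fields;
# these should match the host:port the task keeps retrying against
yarn application -status application_1618548626214_99981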
... View more
04-25-2021
01:12 AM
Recently, MapReduce jobs have sometimes failed; the details are below. After checking the map tasks, the log looks like this:

Log Type: syslog
Log Upload Time: Sun Apr 25 13:54:17 +0800 2021
Log Length: 5507

2021-04-25 13:51:01,806 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2021-04-25 13:51:01,893 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2021-04-25 13:51:01,893 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2021-04-25 13:51:01,895 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2021-04-25 13:51:01,895 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1618548626214_99981, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@732c2a62)
2021-04-25 13:51:02,182 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2021-04-25 13:51:06,267 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-3/10.39.58.16:1079. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-25 13:51:10,268 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-3/10.39.58.16:1079. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-25 13:51:14,268 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-3/10.39.58.16:1079. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-25 13:51:18,268 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-3/10.39.58.16:1079. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-25 13:51:22,269 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-3/10.39.58.16:1079. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-25 13:51:26,270 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-3/10.39.58.16:1079. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-25 13:51:30,271 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-3/10.39.58.16:1079. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-25 13:51:34,272 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-3/10.39.58.16:1079. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-25 13:51:38,272 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-3/10.39.58.16:1079. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-25 13:51:42,272 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: dataware-3/10.39.58.16:1079. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-04-25 13:51:45,274 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.net.ConnectException: Call From dataware-14/10.39.58.19 to dataware-3:1079 failed on connection exception: java.net.ConnectException: Connection timed out; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1508)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:246)
at com.sun.proxy.$Proxy9.getTask(Unknown Source)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:132)
Caused by: java.net.ConnectException: Connection timed out
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:648)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744)
at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557)
at org.apache.hadoop.ipc.Client.call(Client.java:1480)
... 4 more
2021-04-25 13:51:45,275 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
2021-04-25 13:51:45,275 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
2021-04-25 13:51:45,275 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.

From the above log we can see the task's connection to the App Master timed out, but this error happens randomly. Who can give me some advice on this error? Thanks.
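Since the AM picks an ephemeral port at each launch, it is also worth checking on the AM host, while the job is still running, whether anything is actually listening on that port (a minimal sketch; port 1079 comes from the log above):

# On dataware-3, while the job is running:
ss -ltnp | grep 1079
# Older systems: netstat -ltnp | grep 1079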
... View more
Labels: Apache Hadoop
04-22-2021
09:10 AM
I think it makes no sense; even if you can create the VPC via the REST API, you still need to arrange the components manually.
... View more
04-15-2021
08:59 AM
Hi everyone, my company is a group with many subsidiary companies, so I am wondering whether the Cloudera Virtual Private Cluster solution is a good fit for me. All the data would be saved in the base cluster, and if a new project or business needs separate compute resources, I would just arrange a few machines as a new compute cluster, e.g. for Hive or Flink. This kind of architecture makes it quite clear for me to calculate the resources. But I don't have any experience with VPC; does anyone have this kind of experience? Could you share your experience or thoughts? Thanks.
... View more
Labels: Cloudera Data Platform (CDP)
01-14-2020
10:31 AM
Today all the Impala processes crashed suddenly, 21 Impala daemons in total. I restarted the Impala daemons and they ran normally for just a few seconds, then crashed again.
The INFO logs are like below:
I0114 18:51:33.220736 24116 TAcceptQueueServer.cpp:218] connection_setup_thread_pool_size is set to 2
I0114 18:51:33.959517 23216 ImpaladCatalog.java:200] Adding: TABLE:oyo_behavior_dw.dw_search_log_sort_response version: 7346 size: 79
I0114 18:51:33.959635 23216 ImpaladCatalog.java:200] Adding: CATALOG_SERVICE_ID version: 7346 size: 49
I0114 18:51:33.960319 23216 impala-server.cc:1578] Catalog topic update applied with version: 7346 new min catalog object version: 1366
Wrote minidump to /var/log/impala-minidumps/impalad/73c0cda6-b4cc-4a4c-f8913e88-4ad72a7c.dmp
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f09a2b9a2c0, pid=22656, tid=0x00007f08d9e8f700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libc.so.6+0x1502c0] __memmove_ssse3_back+0x50
#
# Core dump written. Default location: /var/log/impalad/core or core.22656
#
# An error report file with more information is saved as:
# /var/log/impalad/hs_err_pid22656.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Because I had no idea what happened, I clicked the restart button again; I don't remember how many times I restarted, because it always crashed after the restart.
This kind of issue has happened two times now; it seems it also happened one month ago. I am not sure whether it is a bug or not.
I have prepared the dump file and core file, but how can I upload them? Then maybe you will find the root cause, thanks.
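Until there is a place to upload them, a native backtrace can often be pulled locally from the core file with gdb (a minimal sketch; the impalad binary path is an assumption for a CDH parcel install, and the core file location is taken from the crash report above):

# Load the core dump against the matching impalad binary (path is an assumption)
gdb /opt/cloudera/parcels/CDH/lib/impala/sbin/impalad /var/log/impalad/core.22656
# then inside gdb:
#   (gdb) bt                   # backtrace of the crashing thread
#   (gdb) thread apply all bt  # backtraces of all threads
# The .dmp minidump can be resolved with breakpad's minidump_stackwalk, if available.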
Below is hs_err_pid22656.log:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f09a2b9a2c0, pid=22656, tid=0x00007f08d9e8f700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libc.so.6+0x1502c0] __memmove_ssse3_back+0x50
#
# Core dump written. Default location: /var/log/impalad/core or core.22656
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
--------------- T H R E A D ---------------
Current thread is native thread
siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL), si_addr: 0x0000000000000000
Registers:
RAX=0x000000000c7ec000, RBX=0x0000000000007f33, RCX=0x0000000000007f33, RDX=0x0000000000007f33
RSP=0x00007f08d9e8df68, RBP=0x00007f08d9e8df90, RSI=0x3131353538393300, RDI=0x000000000c7ec000
R8 =0x0000000000000000, R9 =0x0000000000000000, R10=0x0000000000002000, R11=0x0000000000000000
R12=0x3131353538393300, R13=0x0000000010b6c000, R14=0x000000000e2d8d00, R15=0x000000000ba54390
RIP=0x00007f09a2b9a2c0, EFLAGS=0x0000000000010206, CSGSFS=0x8348000000000033, ERR=0x0000000000000000
TRAPNO=0x000000000000000d
Top of Stack: (sp=0x00007f08d9e8df68)
0x00007f08d9e8df68: 00000000010ec03a 00007f3300000000
0x00007f08d9e8df78: 000000000c7ec000 00000000110b9010
0x00007f08d9e8df88: 000000000e2ec348 00007f08d9e8dfa0
0x00007f08d9e8df98: 0000000000b7c7f1 00007f08d9e8dfd0
0x00007f08d9e8dfa8: 0000000000b7c816 0000000b00000000
0x00007f08d9e8dfb8: 0000000000000001 000000000db2f290
0x00007f08d9e8dfc8: 000000000f691f00 00007f08d9e8e070
0x00007f08d9e8dfd8: 00000000012fb96d 00007f08d9e8e020
0x00007f08d9e8dfe8: 0000000010277fe8 0025848e00000000
0x00007f08d9e8dff8: 00004519c6189a00 0000000000000031
0x00007f08d9e8e008: 0000000011e634e0 0025848e00000000
0x00007f08d9e8e018: 00004519c6189a00 00007f08d9e8e070
0x00007f08d9e8e028: 00000000012bfbc0 000000080025848e
0x00007f08d9e8e038: 0000000010be45a0 000000000025848e
0x00007f08d9e8e048: 0000000000000ab0 0000000000000000
0x00007f08d9e8e058: 000000000ba54000 00000000012fb530
0x00007f08d9e8e068: 000000000e2d8d00 0000000010b6c000
0x00007f08d9e8e078: 00007f08cc2595ff 0000000000000003
0x00007f08d9e8e088: 000000000e2d8d00 000004000cbadf40
0x00007f08d9e8e098: 00007f08d9e8e1e0 000000000cbadf40
0x00007f08d9e8e0a8: 000000000e2d8e00 0000000010278000
0x00007f08d9e8e0b8: 0000000000000003 00000000d5298327
0x00007f08d9e8e0c8: 000000000fd2f010 0000000000000400
Instructions: (pc=0x00007f09a2b9a2c0)
0x00007f09a2b9a2a0: fa 90 00 00 00 73 19 48 01 d6 48 01 d7 4c 8d 1d
0x00007f09a2b9a2b0: 1c 44 03 00 49 63 14 93 49 8d 14 13 ff e2 0f 0b
0x00007f09a2b9a2c0: f3 0f 6f 06 49 89 f8 48 83 e7 f0 48 83 c7 10 49
0x00007f09a2b9a2d0: 89 f9 4d 29 c1 4c 29 ca 4c 01 ce 49 89 f1 49 83
Register to memory mapping:
RAX=0x000000000c7ec000 is an unknown value
RBX=0x0000000000007f33 is an unknown value
RCX=0x0000000000007f33 is an unknown value
RDX=0x0000000000007f33 is an unknown value
RSP=0x00007f08d9e8df68 is an unknown value
RBP=0x00007f08d9e8df90 is an unknown value
RSI=0x3131353538393300 is an unknown value
RDI=0x000000000c7ec000 is an unknown value
R8 =0x0000000000000000 is an unknown value
R9 =0x0000000000000000 is an unknown value
R10=0x0000000000002000 is an unknown value
R11=0x0000000000000000 is an unknown value
R12=0x3131353538393300 is an unknown value
R13=0x0000000010b6c000 is an unknown value
R14=0x000000000e2d8d00 is an unknown value
R15=0x000000000ba54390 is an unknown value
Stack: [0x00007f08d968f000,0x00007f08d9e90000], sp=0x00007f08d9e8df68, free space=8187k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libc.so.6+0x1502c0] __memmove_ssse3_back+0x50
C [impalad+0x77c7f1] impala::AggregateFunctions::StringValGetValue(impala_udf::FunctionContext*, impala_udf::StringVal const&)+0x21
C [impalad+0x77c816] impala::AggregateFunctions::StringValSerializeOrFinalize(impala_udf::FunctionContext*, impala_udf::StringVal const&)+0x16
C [impalad+0xefb96d] impala::AggFnEvaluator::SerializeOrFinalize(impala::Tuple*, impala::SlotDescriptor const&, impala::Tuple*, void*)+0x43d
VM state:not at safepoint (normal execution)
VM Mutex/Monitor currently owned by a thread: None
Heap:
PSYoungGen total 1835008K, used 343247K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
eden space 1572864K, 21% used [0x0000000740000000,0x0000000754f33e68,0x00000007a0000000)
from space 262144K, 0% used [0x00000007b0000000,0x00000007b0000000,0x00000007c0000000)
to space 262144K, 0% used [0x00000007a0000000,0x00000007a0000000,0x00000007b0000000)
ParOldGen total 4194304K, used 37858K [0x0000000640000000, 0x0000000740000000, 0x0000000740000000)
object space 4194304K, 0% used [0x0000000640000000,0x00000006424f8a50,0x0000000740000000)
Metaspace used 35676K, capacity 35974K, committed 36352K, reserved 1081344K
class space used 3879K, capacity 4001K, committed 4096K, reserved 1048576K
Card table byte_map: [0x00007f098d6fe000,0x00007f098e2ff000] byte_map_base: 0x00007f098a4fe000
Marking Bits: (ParMarkBitMap*) 0x00007f09a60860a0
Begin Bits: [0x00007f097f8e8000, 0x00007f09858e8000)
End Bits: [0x00007f09858e8000, 0x00007f098b8e8000)
Polling page: 0x00007f09a66ee000
CodeCache: size=245760Kb used=10577Kb max_used=12273Kb free=235182Kb
bounds [0x00007f098e6bf000, 0x00007f098f2cf000, 0x00007f099d6bf000]
total_blobs=3234 nmethods=2683 adapters=461
compilation: enabled
Compilation events (10 events):
Event: 49.534 Thread 0x000000000a8ca800 3238 3 org.apache.hadoop.hdfs.protocol.DatanodeInfo$DatanodeInfoBuilder::build (97 bytes)
Event: 49.534 Thread 0x000000000a8c9000 3239 3 com.google.protobuf.UnmodifiableLazyStringList$2::next (5 bytes)
Event: 49.534 Thread 0x000000000a887000 nmethod 3236 0x00007f098f100750 code [0x00007f098f100920, 0x00007f098f101128]
Event: 49.534 Thread 0x000000000a887000 3240 3 com.google.protobuf.UnmodifiableLazyStringList$2::next (13 bytes)
Event: 49.534 Thread 0x000000000a8c8000 nmethod 3235 0x00007f098eb21c10 code [0x00007f098eb21de0, 0x00007f098eb225e8]
Event: 49.534 Thread 0x000000000a887000 nmethod 3240 0x00007f098ea55bd0 code [0x00007f098ea55d40, 0x00007f098ea56048]
Event: 49.534 Thread 0x000000000a887000 3241 3 com.google.protobuf.LazyStringArrayList::get (6 bytes)
Event: 49.535 Thread 0x000000000a887000 nmethod 3241 0x00007f098ea55810 code [0x00007f098ea55980, 0x00007f098ea55b28]
Event: 49.535 Thread 0x000000000a8c9000 nmethod 3239 0x00007f098ea57350 code [0x00007f098ea574e0, 0x00007f098ea57848]
Event: 49.535 Thread 0x000000000a8ca800 nmethod 3238 0x00007f098eb47110 code [0x00007f098eb47280, 0x00007f098eb47648]
GC Heap History (8 events):
Event: 1.729 GC heap before
{Heap before GC invocations=1 (full 0):
PSYoungGen total 1835008K, used 534775K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
eden space 1572864K, 34% used [0x0000000740000000,0x0000000760a3dc70,0x00000007a0000000)
from space 262144K, 0% used [0x00000007b0000000,0x00000007b0000000,0x00000007c0000000)
to space 262144K, 0% used [0x00000007a0000000,0x00000007a0000000,0x00000007b0000000)
ParOldGen total 4194304K, used 0K [0x0000000640000000, 0x0000000740000000, 0x0000000740000000)
object space 4194304K, 0% used [0x0000000640000000,0x0000000640000000,0x0000000740000000)
Metaspace used 21009K, capacity 21174K, committed 21296K, reserved 1069056K
class space used 2350K, capacity 2410K, committed 2432K, reserved 1048576K
Event: 1.749 GC heap after
Heap after GC invocations=1 (full 0):
PSYoungGen total 1835008K, used 17868K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
eden space 1572864K, 0% used [0x0000000740000000,0x0000000740000000,0x00000007a0000000)
from space 262144K, 6% used [0x00000007a0000000,0x00000007a1173250,0x00000007b0000000)
to space 262144K, 0% used [0x00000007b0000000,0x00000007b0000000,0x00000007c0000000)
ParOldGen total 4194304K, used 16K [0x0000000640000000, 0x0000000740000000, 0x0000000740000000)
object space 4194304K, 0% used [0x0000000640000000,0x0000000640004000,0x0000000740000000)
Metaspace used 21009K, capacity 21174K, committed 21296K, reserved 1069056K
class space used 2350K, capacity 2410K, committed 2432K, reserved 1048576K
}
Event: 1.749 GC heap before
{Heap before GC invocations=2 (full 1):
PSYoungGen total 1835008K, used 17868K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
eden space 1572864K, 0% used [0x0000000740000000,0x0000000740000000,0x00000007a0000000)
from space 262144K, 6% used [0x00000007a0000000,0x00000007a1173250,0x00000007b0000000)
to space 262144K, 0% used [0x00000007b0000000,0x00000007b0000000,0x00000007c0000000)
ParOldGen total 4194304K, used 16K [0x0000000640000000, 0x0000000740000000, 0x0000000740000000)
object space 4194304K, 0% used [0x0000000640000000,0x0000000640004000,0x0000000740000000)
Metaspace used 21009K, capacity 21174K, committed 21296K, reserved 1069056K
class space used 2350K, capacity 2410K, committed 2432K, reserved 1048576K
Event: 1.780 GC heap after
Heap after GC invocations=2 (full 1):
PSYoungGen total 1835008K, used 0K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
eden space 1572864K, 0% used [0x0000000740000000,0x0000000740000000,0x00000007a0000000)
from space 262144K, 0% used [0x00000007a0000000,0x00000007a0000000,0x00000007b0000000)
to space 262144K, 0% used [0x00000007b0000000,0x00000007b0000000,0x00000007c0000000)
ParOldGen total 4194304K, used 16570K [0x0000000640000000, 0x0000000740000000, 0x0000000740000000)
object space 4194304K, 0% used [0x0000000640000000,0x000000064102eac0,0x0000000740000000)
Metaspace used 21009K, capacity 21174K, committed 21296K, reserved 1069056K
class space used 2350K, capacity 2410K, committed 2432K, reserved 1048576K
}
Event: 48.627 GC heap before
{Heap before GC invocations=3 (full 1):
PSYoungGen total 1835008K, used 597742K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
eden space 1572864K, 38% used [0x0000000740000000,0x00000007647bb868,0x00000007a0000000)
from space 262144K, 0% used [0x00000007a0000000,0x00000007a0000000,0x00000007b0000000)
to space 262144K, 0% used [0x00000007b0000000,0x00000007b0000000,0x00000007c0000000)
ParOldGen total 4194304K, used 16570K [0x0000000640000000, 0x0000000740000000, 0x0000000740000000)
object space 4194304K, 0% used [0x0000000640000000,0x000000064102eac0,0x0000000740000000)
Metaspace used 34984K, capacity 35258K, committed 35456K, reserved 1081344K
class space used 3827K, capacity 3935K, committed 3968K, reserved 1048576K
Event: 48.653 GC heap after
Heap after GC invocations=3 (full 1):
PSYoungGen total 1835008K, used 34295K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
eden space 1572864K, 0% used [0x0000000740000000,0x0000000740000000,0x00000007a0000000)
from space 262144K, 13% used [0x00000007b0000000,0x00000007b217df00,0x00000007c0000000)
to space 262144K, 0% used [0x00000007a0000000,0x00000007a0000000,0x00000007b0000000)
ParOldGen total 4194304K, used 16642K [0x0000000640000000, 0x0000000740000000, 0x0000000740000000)
object space 4194304K, 0% used [0x0000000640000000,0x0000000641040ad0,0x0000000740000000)
Metaspace used 34984K, capacity 35258K, committed 35456K, reserved 1081344K
class space used 3827K, capacity 3935K, committed 3968K, reserved 1048576K
}
Event: 48.653 GC heap before
{Heap before GC invocations=4 (full 2):
PSYoungGen total 1835008K, used 34295K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
eden space 1572864K, 0% used [0x0000000740000000,0x0000000740000000,0x00000007a0000000)
from space 262144K, 13% used [0x00000007b0000000,0x00000007b217df00,0x00000007c0000000)
to space 262144K, 0% used [0x00000007a0000000,0x00000007a0000000,0x00000007b0000000)
ParOldGen total 4194304K, used 16642K [0x0000000640000000, 0x0000000740000000, 0x0000000740000000)
object space 4194304K, 0% used [0x0000000640000000,0x0000000641040ad0,0x0000000740000000)
Metaspace used 34984K, capacity 35258K, committed 35456K, reserved 1081344K
class space used 3827K, capacity 3935K, committed 3968K, reserved 1048576K
Event: 48.723 GC heap after
Heap after GC invocations=4 (full 2):
PSYoungGen total 1835008K, used 0K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
eden space 1572864K, 0% used [0x0000000740000000,0x0000000740000000,0x00000007a0000000)
from space 262144K, 0% used [0x00000007b0000000,0x00000007b0000000,0x00000007c0000000)
to space 262144K, 0% used [0x00000007a0000000,0x00000007a0000000,0x00000007b0000000)
ParOldGen total 4194304K, used 37858K [0x0000000640000000, 0x0000000740000000, 0x0000000740000000)
object space 4194304K, 0% used [0x0000000640000000,0x00000006424f8a50,0x0000000740000000)
Metaspace used 34984K, capacity 35258K, committed 35456K, reserved 1081344K
class space used 3827K, capacity 3935K, committed 3968K, reserved 1048576K
}
Deoptimization events (10 events):
Event: 48.500 Thread 0x00000000116b0800 Uncommon trap: reason=bimorphic action=maybe_recompile pc=0x00007f098f2b9cf4 method=java.util.AbstractCollection.isEmpty()Z @ 1
Event: 48.501 Thread 0x00000000116b0800 Uncommon trap: reason=class_check action=maybe_recompile pc=0x00007f098edd45c4 method=java.io.DataOutputStream.write(I)V @ 5
Event: 48.504 Thread 0x00000000116b0800 Uncommon trap: reason=predicate action=maybe_recompile pc=0x00007f098f05b814 method=java.util.regex.Pattern$Slice.match(Ljava/util/regex/Matcher;ILjava/lang/CharSequence;)Z @ 21
Event: 48.565 Thread 0x00000000116b0800 Uncommon trap: reason=bimorphic action=maybe_recompile pc=0x00007f098ed93d30 method=java.io.DataOutputStream.writeShort(I)V @ 12
Event: 48.611 Thread 0x00000000116b0800 Uncommon trap: reason=bimorphic action=maybe_recompile pc=0x00007f098ed93d30 method=java.io.DataOutputStream.writeShort(I)V @ 12
Event: 48.748 Thread 0x00000000113a3000 Uncommon trap: reason=bimorphic action=maybe_recompile pc=0x00007f098ed93d30 method=java.io.DataOutputStream.writeShort(I)V @ 12
Event: 49.379 Thread 0x0000000010d9d800 Uncommon trap: reason=bimorphic action=maybe_recompile pc=0x00007f098ed93d30 method=java.io.DataOutputStream.writeShort(I)V @ 12
Event: 49.394 Thread 0x0000000011db3000 Uncommon trap: reason=unstable_if action=reinterpret pc=0x00007f098ecabce4 method=java.util.concurrent.ConcurrentHashMap.putVal(Ljava/lang/Object;Ljava/lang/Object;Z)Ljava/lang/Object; @ 98
Event: 49.417 Thread 0x0000000011db3000 Uncommon trap: reason=unstable_if action=reinterpret pc=0x00007f098f17840c method=java.lang.Long.getChars(JI[C)V @ 94
Event: 49.417 Thread 0x0000000011db3000 Uncommon trap: reason=unstable_if action=reinterpret pc=0x00007f098f15c058 method=java.lang.Long.getChars(JI[C)V @ 94
Classes redefined (0 events):
No events
Internal exceptions (10 events):
Event: 0.586 Thread 0x0000000005b00000 Exception <a 'java/lang/ArrayIndexOutOfBoundsException'> (0x000000074df5c958) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u181/11358/hotspot/src/share/vm/runtime/sharedRuntime.cpp, line 605]
Event: 0.847 Thread 0x0000000005b00000 Exception <a 'java/lang/NoSuchFieldError': method resolution failed> (0x0000000750246b38) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u181/11358/hotspot/src/share/vm/prims/methodHandles.cpp, line 1167]
Event: 0.848 Thread 0x0000000005b00000 Exception <a 'java/lang/NoSuchFieldError': method resolution failed> (0x00000007502541f8) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u181/11358/hotspot/src/share/vm/prims/methodHandles.cpp, line 1167]
Event: 1.939 Thread 0x0000000005b00000 Exception <a 'java/lang/NoClassDefFoundError': org/apache/hadoop/hbase/client/ScannerTimeoutException> (0x00000007483e1258) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u181/11358/hotspot/src/share/vm/classfile/systemDictionary.cpp, line 199]
Event: 1.939 Thread 0x0000000005b00000 Exception <a 'java/lang/NoSuchMethodError': setCaching> (0x00000007483e14b8) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u181/11358/hotspot/src/share/vm/prims/jni.cpp, line 1613]
Event: 1.945 Thread 0x0000000005b00000 Exception <a 'java/lang/NoSuchMethodError': java.lang.Object.lambda$static$0(Lorg/apache/hadoop/hbase/client/Row;Lorg/apache/hadoop/hbase/client/Row;)I> (0x00000007484ee2e0) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u181/11358/hotspot/src/sha
Event: 2.168 Thread 0x0000000005b00000 Exception <a 'java/lang/ClassNotFoundException': org/apache/impala/util/GlogAppenderBeanInfo> (0x00000007499aaae0) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u181/11358/hotspot/src/share/vm/classfile/systemDictionary.cpp, line 210]
Event: 2.169 Thread 0x0000000005b00000 Exception <a 'java/lang/ClassNotFoundException': org/apache/impala/util/GlogAppenderCustomizer> (0x00000007499eb348) thrown at [/HUDSON/workspace/8-2-build-linux-amd64/jdk8u181/11358/hotspot/src/share/vm/classfile/systemDictionary.cpp, line 210]
Event: 2.169 Thread 0x0000000005b00000 Implicit null exception at 0x00007f098eaf6cf6 to 0x00007f098eaf70b5
Event: 15.697 Thread 0x000000000fbdd000 Implicit null exception at 0x00007f098f21b80c to 0x00007f098f21b82e
... View more
Labels: Apache Impala
12-15-2019
05:39 AM
After monitoring the agent status for more than ten days, I think this issue has been resolved by your solution. It seems the issue was caused by the Impala logs: since I changed the Impala log level to WARN, it hasn't happened again. Thanks.
... View more
12-10-2019
01:20 AM
After changing the Impala log level to WARN, the frequency of the agent connectivity issue has been reduced, but it still happened on two servers. After stopping the agent and deleting the Impala logs on those two servers, the issue hasn't happened again there. I will continue to monitor the agent issue and feed back to you.
... View more
12-03-2019
11:18 PM
Basically these nodes just have DataNode, Impala, and NodeManager installed, nothing else. I have changed the Impala log level to WARN; if this issue still happens, I will change the DataNode and NodeManager log levels as well. I hope we can find the root cause. Any good or bad news will be fed back to you, thanks.
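To confirm whether log volume is really what is choking the agent, a quick disk check on a problem node can help (a minimal sketch; the log directories are the usual CDH defaults and may differ on your hosts):

# Total size of the Impala daemon logs
du -sh /var/log/impalad
# Any single log file over 500 MB under the common CDH log directories
find /var/log/impalad /var/log/hadoop-hdfs /var/log/hadoop-yarn -type f -size +500M -exec ls -lh {} \; 2>/dev/null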
... View more