05-24-2018
06:07 PM
@Harsh J: Could you please respond? This is a production cluster, and this error disrupts our workflows whenever it occurs.
05-24-2018
07:27 AM
I have. It gives me no information apart from the error code:
Log Upload Time: Thu May 24 18:56:02 +0530 2018
Log Length: 34607
2018-05-24 18:54:27,921 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1527163745858_0517_000001
2018-05-24 18:54:29,584 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2018-05-24 18:54:29,584 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@39d12b10)
2018-05-24 18:54:30,453 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: RM_DELEGATION_TOKEN, Service: 172.31.4.192:8032, Ident: (RM_DELEGATION_TOKEN owner=hue, renewer=oozie mr token, realUser=oozie, issueDate=1527168258855, maxDate=1527773058855, sequenceNumber=613, masterKeyId=2)
2018-05-24 18:54:33,015 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-05-24 18:54:33,042 WARN [main] org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2018-05-24 18:54:33,532 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config org.apache.oozie.action.hadoop.OozieLauncherOutputCommitter
2018-05-24 18:54:33,534 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.oozie.action.hadoop.OozieLauncherOutputCommitter
2018-05-24 18:54:33,649 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
2018-05-24 18:54:33,681 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
2018-05-24 18:54:33,683 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
2018-05-24 18:54:33,684 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
2018-05-24 18:54:33,684 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
2018-05-24 18:54:33,685 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
2018-05-24 18:54:33,686 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
2018-05-24 18:54:33,687 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
2018-05-24 18:54:33,821 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020]
2018-05-24 18:54:33,937 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020]
2018-05-24 18:54:34,022 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020]
2018-05-24 18:54:34,076 INFO [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Emitting job history data to the timeline server is not enabled
2018-05-24 18:54:34,225 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
2018-05-24 18:54:35,260 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2018-05-24 18:54:35,465 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2018-05-24 18:54:35,465 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
2018-05-24 18:54:35,495 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1527163745858_0517 to jobTokenSecretManager
2018-05-24 18:54:35,843 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1527163745858_0517 because: not enabled;
2018-05-24 18:54:35,941 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1527163745858_0517 = 0. Number of splits = 1
2018-05-24 18:54:35,941 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1527163745858_0517 = 0
2018-05-24 18:54:35,941 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1527163745858_0517Job Transitioned from NEW to INITED
2018-05-24 18:54:35,943 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1527163745858_0517.
2018-05-24 18:54:36,091 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 100
2018-05-24 18:54:36,176 INFO [Socket Reader #1 for port 43856] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 43856
2018-05-24 18:54:36,266 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
2018-05-24 18:54:36,284 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2018-05-24 18:54:36,326 INFO [main] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated MRClientService at ip-172-31-5-201.ap-south-1.compute.internal/172.31.5.201:43856
2018-05-24 18:54:36,329 INFO [IPC Server listener on 43856] org.apache.hadoop.ipc.Server: IPC Server listener on 43856: starting
2018-05-24 18:54:36,585 INFO [main] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2018-05-24 18:54:36,655 INFO [main] org.apache.hadoop.security.authentication.server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2018-05-24 18:54:36,679 INFO [main] org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.mapreduce is not defined
2018-05-24 18:54:36,691 INFO [main] org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2018-05-24 18:54:36,697 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context mapreduce
2018-05-24 18:54:36,697 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context static
2018-05-24 18:54:36,701 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /mapreduce/*
2018-05-24 18:54:36,701 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2018-05-24 18:54:36,761 INFO [main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 36747
2018-05-24 18:54:36,761 INFO [main] org.mortbay.log: jetty-6.1.26.cloudera.4
2018-05-24 18:54:36,898 INFO [main] org.mortbay.log: Extract jar:file:/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/jars/hadoop-yarn-common-2.6.0-cdh5.10.1.jar!/webapps/mapreduce to ./tmp/Jetty_0_0_0_0_36747_mapreduce____.wrg2hq/webapp
2018-05-24 18:54:38,183 INFO [main] org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:36747
2018-05-24 18:54:38,184 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce started at 36747
2018-05-24 18:54:39,423 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2018-05-24 18:54:39,516 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 3000
2018-05-24 18:54:39,523 INFO [Socket Reader #1 for port 37627] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 37627
2018-05-24 18:54:39,578 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2018-05-24 18:54:39,588 INFO [IPC Server listener on 37627] org.apache.hadoop.ipc.Server: IPC Server listener on 37627: starting
2018-05-24 18:54:40,346 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
2018-05-24 18:54:40,346 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3
2018-05-24 18:54:40,346 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33
2018-05-24 18:54:40,565 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at ip-172-31-4-192.ap-south-1.compute.internal/172.31.4.192:8030
2018-05-24 18:54:40,873 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: maxContainerCapability: <memory:25600, vCores:8>
2018-05-24 18:54:40,873 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: queue: root.users.hue
2018-05-24 18:54:40,886 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper limit on the thread pool size is 500
2018-05-24 18:54:40,887 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: The thread pool initial size is 10
2018-05-24 18:54:40,959 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1527163745858_0517Job Transitioned from INITED to SETUP
2018-05-24 18:54:40,991 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP
2018-05-24 18:54:41,012 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1527163745858_0517Job Transitioned from SETUP to RUNNING
2018-05-24 18:54:41,123 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1527163745858_0517_m_000000 Task Transitioned from NEW to SCHEDULED
2018-05-24 18:54:41,135 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-24 18:54:41,294 INFO [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceRequest:<memory:1024, vCores:1>
2018-05-24 18:54:41,439 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1527163745858_0517, File: hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/hue/.staging/job_1527163745858_0517/job_1527163745858_0517_1.jhist
2018-05-24 18:54:41,894 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:54:42,056 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1527163745858_0517: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:180224, vCores:35> knownNMs=6
2018-05-24 18:54:42,324 WARN [DataStreamer for file /user/hue/.staging/job_1527163745858_0517/job_1527163745858_0517_1_conf.xml] org.apache.hadoop.hdfs.DFSClient: Caught exception
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1281)
at java.lang.Thread.join(Thread.java:1355)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:951)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:689)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:878)
2018-05-24 18:54:42,382 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020]
2018-05-24 18:54:43,088 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1
2018-05-24 18:54:43,205 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1527163745858_0517_01_000002 to attempt_1527163745858_0517_m_000000_0
2018-05-24 18:54:43,206 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:54:43,297 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Job jar is not present. Not adding any jar to the list of resources.
2018-05-24 18:54:43,352 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf file on the remote FS is /user/hue/.staging/job_1527163745858_0517/job.xml
2018-05-24 18:54:43,654 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache archive (mapreduce.job.cache.archives) hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/hue/mysql-connector-java-5.0.8-bin.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/oozie/share/lib/lib_20170413135352/sqoop/mysql-connector-java-5.0.8-bin.jar This will be an error in Hadoop 2.0
2018-05-24 18:54:44,041 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #1 tokens and #1 secret keys for NM use for launching container
2018-05-24 18:54:44,041 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of containertokens_dob is 2
2018-05-24 18:54:44,041 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Putting shuffle token in serviceData
2018-05-24 18:54:45,097 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf: Task java-opts do not specify heap size. Setting task attempt jvm max heap size to -Xmx820m
2018-05-24 18:54:45,101 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2018-05-24 18:54:45,109 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1527163745858_0517: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:179200, vCores:34> knownNMs=6
2018-05-24 18:54:45,146 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1527163745858_0517_01_000002 taskAttempt attempt_1527163745858_0517_m_000000_0
2018-05-24 18:54:45,149 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1527163745858_0517_m_000000_0
2018-05-24 18:54:45,350 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1527163745858_0517_m_000000_0 : 13562
2018-05-24 18:54:45,352 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1527163745858_0517_m_000000_0] using containerId: [container_1527163745858_0517_01_000002 on NM: [ip-172-31-1-207.ap-south-1.compute.internal:8041]
2018-05-24 18:54:45,355 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2018-05-24 18:54:45,356 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1527163745858_0517_m_000000 Task Transitioned from SCHEDULED to RUNNING
2018-05-24 18:54:52,626 INFO [Socket Reader #1 for port 37627] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1527163745858_0517 (auth:SIMPLE)
2018-05-24 18:54:52,800 INFO [IPC Server handler 5 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1527163745858_0517_m_000002 asked for a task
2018-05-24 18:54:52,818 INFO [IPC Server handler 5 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1527163745858_0517_m_000002 given task: attempt_1527163745858_0517_m_000000_0
2018-05-24 18:55:08,465 INFO [Socket Reader #1 for port 37627] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1527163745858_0517 (auth:SIMPLE)
2018-05-24 18:55:20,711 INFO [Socket Reader #1 for port 37627] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1527163745858_0517 (auth:SIMPLE)
2018-05-24 18:55:20,793 INFO [IPC Server handler 3 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1527163745858_0517_m_000000_0 is : 1.0
2018-05-24 18:55:30,659 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1527163745858_0517_01_000002
2018-05-24 18:55:30,660 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:55:30,660 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1527163745858_0517_m_000000_0: Container killed on request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal
2018-05-24 18:55:30,662 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_0 TaskAttempt Transitioned from RUNNING to FAILED
2018-05-24 18:55:30,711 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_COMPLETED for container container_1527163745858_0517_01_000002 taskAttempt attempt_1527163745858_0517_m_000000_0
2018-05-24 18:55:30,719 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_1 TaskAttempt Transitioned from NEW to UNASSIGNED
2018-05-24 18:55:30,725 INFO [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 1 failures on node ip-172-31-1-207.ap-south-1.compute.internal
2018-05-24 18:55:30,726 INFO [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Added attempt_1527163745858_0517_m_000000_1 to list of failed maps
2018-05-24 18:55:31,660 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:55:31,688 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1527163745858_0517: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:71168, vCores:40> knownNMs=6
2018-05-24 18:55:32,700 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1
2018-05-24 18:55:32,700 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigning container Container: [ContainerId: container_1527163745858_0517_01_000003, NodeId: ip-172-31-1-207.ap-south-1.compute.internal:8041, NodeHttpAddress: ip-172-31-1-207.ap-south-1.compute.internal:8042, Resource: <memory:1024, vCores:1>, Priority: 5, Token: Token { kind: ContainerToken, service: 172.31.1.207:8041 }, ] to fast fail map
2018-05-24 18:55:32,700 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned from earlierFailedMaps
2018-05-24 18:55:32,700 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1527163745858_0517_01_000003 to attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:32,700 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:55:32,710 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf: Task java-opts do not specify heap size. Setting task attempt jvm max heap size to -Xmx820m
2018-05-24 18:55:32,710 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_1 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2018-05-24 18:55:32,725 INFO [ContainerLauncher #2] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1527163745858_0517_01_000003 taskAttempt attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:32,725 INFO [ContainerLauncher #2] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:32,991 INFO [ContainerLauncher #2] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1527163745858_0517_m_000000_1 : 13562
2018-05-24 18:55:32,993 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1527163745858_0517_m_000000_1] using containerId: [container_1527163745858_0517_01_000003 on NM: [ip-172-31-1-207.ap-south-1.compute.internal:8041]
2018-05-24 18:55:32,993 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_1 TaskAttempt Transitioned from ASSIGNED to RUNNING
2018-05-24 18:55:33,715 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1527163745858_0517: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:74240, vCores:41> knownNMs=6
2018-05-24 18:55:39,936 INFO [Socket Reader #1 for port 37627] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1527163745858_0517 (auth:SIMPLE)
2018-05-24 18:55:40,020 INFO [IPC Server handler 9 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1527163745858_0517_m_000003 asked for a task
2018-05-24 18:55:40,021 INFO [IPC Server handler 9 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1527163745858_0517_m_000003 given task: attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:51,261 INFO [Socket Reader #1 for port 37627] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1527163745858_0517 (auth:SIMPLE)
2018-05-24 18:55:51,314 INFO [IPC Server handler 22 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1527163745858_0517_m_000000_1 is : 0.0
2018-05-24 18:55:51,472 INFO [IPC Server handler 25 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1527163745858_0517_m_000000_1 is : 1.0
2018-05-24 18:55:51,562 INFO [IPC Server handler 23 on 37627] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Done acknowledgement from attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:51,595 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_1 TaskAttempt Transitioned from RUNNING to SUCCESS_FINISHING_CONTAINER
2018-05-24 18:55:51,595 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with attempt attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:51,603 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1527163745858_0517_m_000000 Task Transitioned from RUNNING to SUCCEEDED
2018-05-24 18:55:51,604 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1
2018-05-24 18:55:51,604 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1527163745858_0517Job Transitioned from RUNNING to COMMITTING
2018-05-24 18:55:51,621 INFO [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_COMMIT
2018-05-24 18:55:51,859 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Calling handler for JobFinishedEvent
2018-05-24 18:55:51,860 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1527163745858_0517Job Transitioned from COMMITTING to SUCCEEDED
2018-05-24 18:55:51,883 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: We are finishing cleanly so this is the last retry
2018-05-24 18:55:51,883 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify RMCommunicator isAMLastRetry: true
2018-05-24 18:55:51,883 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: RMCommunicator notified that shouldUnregistered is: true
2018-05-24 18:55:51,883 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify JHEH isAMLastRetry: true
2018-05-24 18:55:51,884 INFO [Thread-73] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: JobHistoryEventHandler notified that forceJobCompletion is true
2018-05-24 18:55:51,884 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Calling stop for all the services
2018-05-24 18:55:51,892 INFO [Thread-73] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopping JobHistoryEventHandler. Size of the outstanding queue size is 0
2018-05-24 18:55:51,968 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:55:52,261 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/hue/.staging/job_1527163745858_0517/job_1527163745858_0517_1.jhist to hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517-1527168259391-hue-oozie%3Alauncher%3AT%3Dsqoop%3AW%3DERP_new%2Dcopy%3AA%3Dsqoop%2Dd90b-1527168351857-1-0-SUCCEEDED-root.users.hue-1527168280941.jhist_tmp
2018-05-24 18:55:52,647 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517-1527168259391-hue-oozie%3Alauncher%3AT%3Dsqoop%3AW%3DERP_new%2Dcopy%3AA%3Dsqoop%2Dd90b-1527168351857-1-0-SUCCEEDED-root.users.hue-1527168280941.jhist_tmp
2018-05-24 18:55:52,677 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/hue/.staging/job_1527163745858_0517/job_1527163745858_0517_1_conf.xml to hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517_conf.xml_tmp
2018-05-24 18:55:52,980 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1527163745858_0517_01_000003
2018-05-24 18:55:52,981 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:55:52,981 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1527163745858_0517_m_000000_1:
2018-05-24 18:55:52,982 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1527163745858_0517_m_000000_1 TaskAttempt Transitioned from SUCCESS_FINISHING_CONTAINER to SUCCEEDED
2018-05-24 18:55:52,998 INFO [ContainerLauncher #3] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_COMPLETED for container container_1527163745858_0517_01_000003 taskAttempt attempt_1527163745858_0517_m_000000_1
2018-05-24 18:55:53,034 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517_conf.xml_tmp
2018-05-24 18:55:53,060 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517.summary_tmp to hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517.summary
2018-05-24 18:55:53,068 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517_conf.xml_tmp to hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517_conf.xml
2018-05-24 18:55:53,077 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517-1527168259391-hue-oozie%3Alauncher%3AT%3Dsqoop%3AW%3DERP_new%2Dcopy%3AA%3Dsqoop%2Dd90b-1527168351857-1-0-SUCCEEDED-root.users.hue-1527168280941.jhist_tmp to hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020/user/history/done_intermediate/hue/job_1527163745858_0517-1527168259391-hue-oozie%3Alauncher%3AT%3Dsqoop%3AW%3DERP_new%2Dcopy%3AA%3Dsqoop%2Dd90b-1527168351857-1-0-SUCCEEDED-root.users.hue-1527168280941.jhist
2018-05-24 18:55:53,109 INFO [Thread-73] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped JobHistoryEventHandler. super.stop()
2018-05-24 18:55:53,210 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Setting job diagnostics to
2018-05-24 18:55:53,211 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: History url is http://ip-172-31-4-192.ap-south-1.compute.internal:19888/jobhistory/job/job_1527163745858_0517
2018-05-24 18:55:53,237 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Waiting for application to be successfully unregistered.
2018-05-24 18:55:54,247 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Final Stats: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:0
2018-05-24 18:55:54,249 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Deleting staging directory hdfs://ip-172-31-4-192.ap-south-1.compute.internal:8020 /user/hue/.staging/job_1527163745858_0517
2018-05-24 18:55:54,278 INFO [Thread-73] org.apache.hadoop.ipc.Server: Stopping server on 37627
2018-05-24 18:55:54,386 INFO [IPC Server listener on 37627] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 37627
2018-05-24 18:55:54,390 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2018-05-24 18:55:54,495 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted
2018-05-24 18:55:54,515 INFO [Ping Checker] org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: TaskAttemptFinishingMonitor thread interrupted
2018-05-24 18:55:54,540 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Job end notification started for jobID : job_1527163745858_0517
2018-05-24 18:55:54,550 INFO [Thread-73] org.mortbay.log: Job end notification attempts left 0
2018-05-24 18:55:54,550 INFO [Thread-73] org.mortbay.log: Job end notification trying http://ip-172-31-4-192.ap-south-1.compute.internal:11000/oozie/callback?id=0000003-180524173424234-oozie-oozi-W@sqoop-d90b&status=SUCCEEDED
2018-05-24 18:55:54,612 INFO [Thread-73] org.mortbay.log: Job end notification to http://ip-172-31-4-192.ap-south-1.compute.internal:11000/oozie/callback?id=0000003-180524173424234-oozie-oozi-W@sqoop-d90b&status=SUCCEEDED succeeded
2018-05-24 18:55:54,612 INFO [Thread-73] org.mortbay.log: Job end notification succeeded for job_1527163745858_0517
2018-05-24 18:55:59,639 INFO [Thread-73] org.apache.hadoop.ipc.Server: Stopping server on 43856
2018-05-24 18:55:59,674 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2018-05-24 18:55:59,682 INFO [IPC Server listener on 43856] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 43856
2018-05-24 18:55:59,712 INFO [Thread-73] org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:0
Please help me fix this. I have been trying for quite some time.
... View more
05-21-2018
03:42 AM
1 Kudo
@sim6 I hope you have more than 3 data nodes. Generally, two types of "data missing" issues can occur, for many reasons: a. ReplicaNotFoundException b. BlockMissingException. If your issue is a BlockMissingException and you have backup data in your DR environment, then you are good; otherwise it might be a problem. For a ReplicaNotFoundException, please make sure all your datanodes are healthy and in a commissioned state. In fact, the namenode is supposed to handle this automatically whenever that data is accessed. If it doesn't, you can also try an HDFS rebalance (or) a NameNode restart, which may fix the issue, but you don't need to try those options unless a user reports an issue with the particular data. In your case no one has reported anything yet and you found it yourself, so you can ignore it for now.
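For reference, a quick way to check whether you actually have missing or corrupt blocks, and whether all datanodes are healthy, is the standard HDFS CLI (run these on the cluster, typically as the hdfs user):

```
# Summary of block health for the whole filesystem
hdfs fsck /

# List only the files that have missing or corrupt blocks
hdfs fsck / -list-corruptfileblocks

# Datanode health report, including decommissioning state
hdfs dfsadmin -report
```

If `fsck` reports the filesystem as HEALTHY and `dfsadmin -report` shows all datanodes live, the earlier exception was likely transient.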
... View more
10-19-2017
02:24 AM
In /etc/hosts on all nodes, put: ip_address FQDN shortname. For example: 10.10.1.230 name.domain.com name. The FQDN must come before the short name.
... View more
07-06-2017
02:50 AM
I have increased the heap size. It was set to the default of 256 MB, which I guess was causing the problem. I will revert if it keeps working alright 🙂 Thanks a lot for your response, it helped.
... View more
03-17-2017
04:01 AM
Well, the lock is stored in ZooKeeper, so you can check in ZooKeeper whether the lock exists and delete it if it does. But I would advise you not to do so. Locks exist for data integrity; if you remove them when you should not, it could lead to some "odd results". Maybe you could add some error handling to the workflow in order to "retry X times" before failing the whole workflow? You could also "communicate" better with the users in order to reduce the likelihood of your scheduled queries running concurrently with "user" queries.
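For completeness, here is a sketch of how you might inspect (and, at your own risk, remove) such a lock. The namespace path and table names below are illustrative and depend on your configuration (`hive.zookeeper.namespace`, default `hive_zookeeper_namespace`):

```
# In Hive, first see which locks exist and who holds them
# hive> SHOW LOCKS my_table;

# Open the ZooKeeper CLI against one of your ZK servers
zookeeper-client -server zk-host:2181

# Inside the CLI: list the lock znodes for a given database/table
ls /hive_zookeeper_namespace/default/my_table

# Delete a stale lock znode (dangerous; only if you are certain
# no running query still holds it)
rmr /hive_zookeeper_namespace/default/my_table/LOCK-SHARED-0000000000
```

As noted above, deleting locks manually risks data integrity; prefer retries or scheduling changes.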
... View more
11-03-2016
06:02 PM
@jss : Yes, hostname -f returns the FQDN as expected, but in AWS it is the private DNS, which is not pingable from outside the network. Only the public DNS is pingable from outside the network, and the public DNS is not the FQDN. What do you suggest in this case?
... View more
11-05-2016
02:10 AM
You cannot do this. This is not supported, as @Artem Ervits has already stated. Imagine what would happen to writes when clusters span multiple data centers. Remember, networks are assumed to be unreliable and unsecured. Now, I hate to say this, and please don't do it as it is unsupported, but Amazon offers VPC, which makes AWS an extension of your network using a VPN.
... View more
10-12-2016
11:17 AM
5 Kudos
Howdy, I'm just going to jump in and give you as much info as possible, so strap in. There's going to be a lot of (hopefully helpful) info. Before I get started, and I state this toward the end too, it's important to know that all of this info is general "big picture" stuff, and there are a ton of factors that go into speccing your cluster (use cases, future scaling considerations, etc.). I cannot stress this enough. That being said, let's dig in. I'm going to answer your questions in order.

1. In short, yes. We generally recommend bare metal ("node" = physical box) for production clusters. You can get away with using VMs on a hypervisor for development clusters or POCs, but that's not recommended for production clusters. If you don't have the resources for a bare metal cluster, it's generally a better idea to deploy in the cloud. For cloud-based clusters, I recommend Cloudera Director, which allows you to deploy cloud-based clusters that are configured with Hadoop performance in mind.

2. It's not simply a question of how many nodes, but what the specs of each node are. We have some good documentation here that explains best practices for speccing your cluster hardware. The number of nodes depends on what your workload will be like: how much data you'll be ingesting, how often you'll be ingesting it, and how much you plan on processing said data. That being said, Cloudera Manager makes it super easy to scale out as your workload grows. I would say the bare minimum is 5 nodes (2 masters, 3 workers). You can always scale out from there by adding additional worker and master nodes.

3 and 4. These can be answered with this nifty diagram (memory recommendations are RAM). It comes from our article on how to deploy clusters like a boss, which covers quite a bit; additional info on the graphic can be found toward the bottom of the article. If you look at the diagram, you'll notice a few things:

- The concept of master nodes, worker nodes, and edge nodes. Master nodes host master services like the NameNode service, ResourceManager, ZooKeeper, JournalNodes, etc. If the service keeps track of tasks, marks changes, or has the term "manager" in it, you usually want it on a master node. You can put a good number on single nodes because they don't do too much heavy lifting.
- The placement of DB-dependent services. Note that Cloudera Manager, Hue, and all servers that reference a metastore are installed on the master node with an RDBMS installed. You don't have to set it up this way, but it does make logical sense and is a little more tidy. You will have to consider adding a dedicated RDBMS server eventually, because having it installed on a master node with other servers can easily cause a bottleneck once you've scaled enough.
- The worker node(s). This diagram only has one worker node, but it's important to know that you should have at least three worker nodes for your cluster to function properly, as the default replication factor for HDFS is three. From there, you can add as many worker nodes as your workload dictates. At its base, you don't need many services on a worker node, but you do need a lot more memory, because these nodes are where data is stored in HDFS and where the heavy processing will be done.
- The edge node. It's specced similarly to, or even lower than, the master nodes, and is really only home to gateways and other services that communicate with the outside world. You could add these services to another master node, but it's nice to have one dedicated, especially if you plan on having folks access the cluster externally.

The article also has some good info on where to go with these services as you scale your cluster out further. One more note: if this is a proof-of-concept cluster, I recommend saving Sentry for when you put the cluster into production. When you do add it, note that it's a service that uses an RDBMS.

Some parting thoughts: when you're planning a cluster, it's important to stop and evaluate exactly what your goal is for said cluster. My recommendation is to start only with the services you need to get the job done. You can always add and activate services later through Cloudera Manager. If you need any info on each particular service and whether or not you really need it, check out the links to our official documentation: Hive, Pig (Apache documentation), ZooKeeper, HDFS, Hue, Oozie, Sqoop and Sqoop2, YARN, Sentry. And for that matter, you can search through our documentation here. While this info helps with general ideas and "big picture" topics, you need to consider a lot more info about your planned usage and vision to come up with an optimal setup. Use cases are vitally important to consider when speccing a cluster, especially for production. That being said, you're more than welcome to get in touch with one of our solutions architects to figure out the best configuration for your cluster. Here's a link to some more info on that. This is a lot of info, so feel free to take your time digesting it all. Let me know if you have any questions. 🙂 Cheers
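As a side note on the three-worker minimum mentioned above: you can confirm your cluster's effective replication factor and per-file replication from the command line. These are standard HDFS CLI commands; the paths are illustrative:

```
# Effective default replication factor (typically 3)
hdfs getconf -confKey dfs.replication

# Show per-file replication (the second column of the listing)
hdfs dfs -ls /user/hue

# Change replication for an existing path, e.g. down to 2,
# and wait (-w) for the change to complete
hdfs dfs -setrep -w 2 /user/hue/some_file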
... View more
10-09-2016
01:02 AM
This should help: http://www.yourtechchick.com/hadoop/no-databases-available-permissions-missing-error-hive-sentry/
... View more