Member since 11-21-2017
16 Posts
3 Kudos Received
1 Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
748 | 12-17-2019 12:16 AM |
12-17-2019 12:16 AM
Well, this is not really a problem with NiFi per se: NiFi was installed with Docker, so I kept using the Docker host as if it were the correct machine. Embarrassing...
12-16-2019 08:42 AM
Disclaimer: crosspost from https://stackoverflow.com/questions/59359158/apache-nifi-1-10-how-to-execute-a-python-script-as-a-processor
In NiFi <= v1.9 I used the following NiFi setup:
`GetTwitter` generates a JSON flowfile for every tweet gathered, and attached to it there used to be an `ExecuteStreamCommand` processor with the following parameters:
This allowed me to get the flowfile as input, analyse the content, and do something with it (in this specific case, put the content on Elasticsearch). You could reach the same result with something like `cat flowfile.json | python <scriptname>.py`.
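To make the difference concrete, here is a rough sketch outside of NiFi (with a hypothetical `process_tweet.py` standing in for my script) of what receiving the content on stdin looks like versus what the error below suggests is happening, i.e. the JSON being expanded onto the command line:

```bash
#!/usr/bin/env bash
# Hypothetical file and script names, for illustration only.
FLOWFILE=flowfile.json

# What the <= 1.9 setup effectively did: the flowfile content arrives on stdin.
cat "$FLOWFILE" | python process_tweet.py

# What the error below suggests is happening instead: the JSON content ends up
# on the command line, so bash tries to execute it and fails with
# "command not found", just like in the log.
bash -c "$(cat "$FLOWFILE")"
```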
But now in >= v1.10 it seems I cannot do the same. It gives me an error about parameters: it looks like it wants to pass the flowfile as a parameter when calling the command/script instead of, as it used to, passing the flowfile as input. It gives me the following error:
2019-12-16 14:28:44,953 ERROR [Timer-Driven Process Thread-2] o.a.n.p.standard.ExecuteStreamCommand ExecuteStreamCommand[id=ef1d15fb-016e-1000-3050-35fba1733f97] Transferring flow file StandardFlowFileRecord[uuid=c485d8aa-58e9-428c-84b1-b4acbd6275a7,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1576505352573-22, container=default, section=22], offset=663999, length=-1],offset=0,name=242802f9-5ffb-4a2c-90a5-27736459dfd0.json,size=0] to nonzero status. Executable command bash ended in an error: bash: line 1: $'{delete:status:id:1005156317567188992}\r': command not found
2019-12-16 14:28:44,962 ERROR [Timer-Driven Process Thread-2] o.a.n.p.standard.ExecuteStreamCommand ExecuteStreamCommand[id=ef1d15fb-016e-1000-3050-35fba1733f97] Transferring flow file StandardFlowFileRecord[uuid=5b617840-ba8e-47b3-9a27-06e8ec0f88bd,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1576505352573-22, container=default, section=22], offset=663999, length=-1],offset=0,name=eb9e236a-1ccb-4243-aa1c-e5de43746f80.json,size=0] to nonzero status. Executable command bash ended in an error: bash: line 1: $'{delete:status:id:1139832600422277120}\r': command not found
It seems it takes the flowfile JSON as a parameter instead of as input. Any ideas?
Tags:
- NiFi
Labels:
- Apache NiFi
01-11-2018 03:50 AM
spark-submit command
CONF_FILE=hdfs://ambari1.internal:8020/user/hadoop/local.normalizer.conf
${SPARK_DIR}/bin/spark-submit --conf spark.yarn.am.waitTime=110s --class mypackage.NormalizeJSON --master yarn --deploy-mode cluster ${WORK_DIR}/target/${JAR_VERSION} ${CONF_FILE}
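As a side note, the `Futures timed out after [100000 milliseconds]` in the logs matches the 100s default of `spark.yarn.am.waitTime`, even though the command above asks for 110s, so it may be worth double-checking that the setting really reaches the ApplicationMaster. A rough sketch of what I would try (the 300s value is purely illustrative):

```bash
# Raise the AM wait time well above the observed 100s timeout and resubmit;
# the effective value should then appear in the Environment tab of the Spark UI.
${SPARK_DIR}/bin/spark-submit \
  --conf spark.yarn.am.waitTime=300s \
  --class mypackage.NormalizeJSON \
  --master yarn --deploy-mode cluster \
  ${WORK_DIR}/target/${JAR_VERSION} ${CONF_FILE}
```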
01-11-2018 03:49 AM
nodemanager.log
2018-01-10 15:32:29,309 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(499)) - Memory usage of ProcessTree 15190 for container-id container_e07_1515594329635_0001_01_000001: 513.4 MB of 2.5 GB physical memory used; 3.7 GB of 5.3 GB virtual memory used
2018-01-10 15:32:29,553 WARN nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(249)) - Exit code from container container_e07_1515594329635_0001_01_000001 is : 10
2018-01-10 15:32:29,554 WARN nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(255)) - Exception from container-launch with container ID: container_e07_1515594329635_0001_01_000001 and exit code: 10
ExitCodeException exitCode=10:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:944)
at org.apache.hadoop.util.Shell.run(Shell.java:848)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1142)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:237)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:326)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:87)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2018-01-10 15:32:29,556 INFO nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(368)) - Exception from container-launch.
2018-01-10 15:32:29,557 INFO nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(368)) - Container id: container_e07_1515594329635_0001_01_000001
2018-01-10 15:32:29,557 INFO nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(368)) - Exit code: 10
2018-01-10 15:32:29,557 WARN launcher.ContainerLaunch (ContainerLaunch.java:handleContainerExitWithFailure(431)) - Container launch failed : Container exited with a non-zero exit code 10.
2018-01-10 15:32:29,563 INFO container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e07_1515594329635_0001_01_000001 transitioned from RUNNING to EXITED_WITH_FAILURE
2018-01-10 15:32:29,564 INFO launcher.ContainerLaunch (ContainerLaunch.java:cleanupContainer(541)) - Cleaning up container container_e07_1515594329635_0001_01_000001
2018-01-10 15:32:29,588 INFO nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(492)) - Deleting absolute path : /hadoop/yarn/local/usercache/root/appcache/application_1515594329635_0001/container_e07_1515594329635_0001_01_000001
2018-01-10 15:32:29,595 INFO container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e07_1515594329635_0001_01_000001 transitioned from EXITED_WITH_FAILURE to DONE
01-11-2018 02:24 AM
2018-01-10 12:13:32,405 INFO resourcemanager.ClientRMService (ClientRMService.java:getNewApplicationId(297)) - Allocated new applicationId: 1
2018-01-10 12:13:34,080 WARN rmapp.RMAppImpl (RMAppImpl.java:<init>(423)) - The specific max attempts: 0 for application: 1 is invalid, because it is out of the range [1, 2]. Use the global max attempts instead.
2018-01-10 12:13:34,081 INFO rmapp.RMAppImpl (RMAppImpl.java:transition(1105)) - Storing application with id application_1515582681419_0001
2018-01-10 12:13:34,082 INFO resourcemanager.ClientRMService (ClientRMService.java:submitApplication(615)) - Application with id 1 submitted by user root
2018-01-10 12:13:34,084 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(779)) - application_1515582681419_0001 State change from NEW to NEW_SAVING
2018-01-10 12:13:34,084 INFO recovery.RMStateStore (RMStateStore.java:transition(199)) - Storing info for app: application_1515582681419_0001
2018-01-10 12:13:34,095 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(779)) - application_1515582681419_0001 State change from NEW_SAVING to SUBMITTED
2018-01-10 12:13:34,096 INFO capacity.ParentQueue (ParentQueue.java:addApplication(360)) - Application added - appId: application_1515582681419_0001 user: root leaf-queue of parent: root #applications: 1
2018-01-10 12:13:34,097 INFO capacity.CapacityScheduler (CapacityScheduler.java:addApplication(744)) - Accepted application application_1515582681419_0001 from user: root, in queue: default
2018-01-10 12:13:34,097 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(779)) - application_1515582681419_0001 State change from SUBMITTED to ACCEPTED
2018-01-10 12:13:34,119 INFO resourcemanager.ApplicationMasterService (ApplicationMasterService.java:registerAppAttempt(679)) - Registering app attempt : appattempt_1515582681419_0001_000001
2018-01-10 12:13:34,120 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000001 State change from NEW to SUBMITTED
2018-01-10 12:13:34,131 INFO capacity.LeafQueue (LeafQueue.java:activateApplications(720)) - Application application_1515582681419_0001 from user: root activated in queue: default
2018-01-10 12:13:34,131 INFO capacity.LeafQueue (LeafQueue.java:addApplicationAttempt(744)) - Application added - appId: application_1515582681419_0001 user: root, leaf-queue: default #user-pending-applications: 0 #user-active-applications: 1 #queue-pending-applications: 0 #queue-active-applications: 1
2018-01-10 12:13:34,132 INFO capacity.CapacityScheduler (CapacityScheduler.java:addApplicationAttempt(773)) - Added Application Attempt appattempt_1515582681419_0001_000001 to scheduler from user root in queue default
2018-01-10 12:13:34,133 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000001 State change from SUBMITTED to SCHEDULED
2018-01-10 12:13:34,701 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(422)) - container_e05_1515582681419_0001_01_000001 Container Transitioned from NEW to ALLOCATED
2018-01-10 12:13:34,702 INFO scheduler.SchedulerNode (SchedulerNode.java:allocateContainer(152)) - Assigned container container_e05_1515582681419_0001_01_000001 of capacity <memory:1536, vCores:1> on host ambari2.internal:45454, which has 1 containers, <memory:1536, vCores:1> used and <memory:10496, vCores:3> available after allocation
2018-01-10 12:13:34,702 INFO allocator.AbstractContainerAllocator (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(89)) - assignedContainer application attempt=appattempt_1515582681419_0001_000001 container=container_e05_1515582681419_0001_01_000001 queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@6d756235 clusterResource=<memory:12032, vCores:4> type=OFF_SWITCH
2018-01-10 12:13:34,716 INFO security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:createAndGetNMToken(200)) - Sending NMToken for nodeId : ambari2.internal:45454 for container : container_e05_1515582681419_0001_01_000001
2018-01-10 12:13:34,724 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(422)) - container_e05_1515582681419_0001_01_000001 Container Transitioned from ALLOCATED to ACQUIRED
2018-01-10 12:13:34,724 INFO capacity.ParentQueue (ParentQueue.java:assignContainers(475)) - assignedContainer queue=root usedCapacity=0.12765957 absoluteUsedCapacity=0.12765957 used=<memory:1536, vCores:1> cluster=<memory:12032, vCores:4>
2018-01-10 12:13:34,725 INFO security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:clearNodeSetForAttempt(146)) - Clear node set for appattempt_1515582681419_0001_000001
2018-01-10 12:13:34,725 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:storeAttempt(1955)) - Storing attempt: AppId: application_1515582681419_0001 AttemptId: appattempt_1515582681419_0001_000001 MasterContainer: Container: [ContainerId: container_e05_1515582681419_0001_01_000001, NodeId: ambari2.internal:45454, NodeHttpAddress: ambari2.internal:8042, Resource: <memory:1536, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.100.25:45454 }, ]
2018-01-10 12:13:34,728 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000001 State change from SCHEDULED to ALLOCATED_SAVING
2018-01-10 12:13:34,731 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000001 State change from ALLOCATED_SAVING to ALLOCATED
2018-01-10 12:13:34,751 INFO amlauncher.AMLauncher (AMLauncher.java:run(266)) - Launching masterappattempt_1515582681419_0001_000001
2018-01-10 12:13:34,770 INFO amlauncher.AMLauncher (AMLauncher.java:launch(111)) - Setting up container Container: [ContainerId: container_e05_1515582681419_0001_01_000001, NodeId: ambari2.internal:45454, NodeHttpAddress: ambari2.internal:8042, Resource: <memory:1536, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.100.25:45454 }, ] for AM appattempt_1515582681419_0001_000001
2018-01-10 12:13:34,771 INFO security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:createAndGetAMRMToken(195)) - Create AMRMToken for ApplicationAttempt: appattempt_1515582681419_0001_000001
2018-01-10 12:13:34,774 INFO security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:createPassword(307)) - Creating password for appattempt_1515582681419_0001_000001
2018-01-10 12:13:34,985 INFO amlauncher.AMLauncher (AMLauncher.java:launch(132)) - Done launching container Container: [ContainerId: container_e05_1515582681419_0001_01_000001, NodeId: ambari2.internal:45454, NodeHttpAddress: ambari2.internal:8042, Resource: <memory:1536, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.100.25:45454 }, ] for AM appattempt_1515582681419_0001_000001
2018-01-10 12:13:34,986 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000001 State change from ALLOCATED to LAUNCHED
2018-01-10 12:13:35,701 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(422)) - container_e05_1515582681419_0001_01_000001 Container Transitioned from ACQUIRED to RUNNING
2018-01-10 12:15:18,860 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(422)) - container_e05_1515582681419_0001_01_000001 Container Transitioned from RUNNING to COMPLETED
2018-01-10 12:15:18,860 INFO scheduler.SchedulerNode (SchedulerNode.java:releaseContainer(220)) - Released container container_e05_1515582681419_0001_01_000001 of capacity <memory:1536, vCores:1> on host ambari2.internal:45454, which currently has 0 containers, <memory:0, vCores:0> used and <memory:12032, vCores:4> available, release resources=true
2018-01-10 12:15:18,861 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:rememberTargetTransitionsAndStoreState(1209)) - Updating application attempt appattempt_1515582681419_0001_000001 with final state: FAILED, and exit status: 10
2018-01-10 12:15:18,862 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000001 State change from LAUNCHED to FINAL_SAVING
2018-01-10 12:15:18,862 INFO integration.RMRegistryOperationsService (RMRegistryOperationsService.java:onContainerFinished(144)) - Container container_e05_1515582681419_0001_01_000001 finished, purging container-level records
2018-01-10 12:15:18,862 INFO integration.RMRegistryOperationsService (RMRegistryOperationsService.java:purgeRecordsAsync(198)) - records under / with ID container_e05_1515582681419_0001_01_000001 and policy container: {}
2018-01-10 12:15:18,870 INFO resourcemanager.ApplicationMasterService (ApplicationMasterService.java:unregisterAttempt(685)) - Unregistering app attempt : appattempt_1515582681419_0001_000001
2018-01-10 12:15:18,871 INFO security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:applicationMasterFinished(124)) - Application finished, removing password for appattempt_1515582681419_0001_000001
2018-01-10 12:15:18,872 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000001 State change from FINAL_SAVING to FAILED
2018-01-10 12:15:18,872 INFO rmapp.RMAppImpl (RMAppImpl.java:transition(1331)) - The number of failed attempts is 1. The max attempts is 2
2018-01-10 12:15:18,873 INFO capacity.CapacityScheduler (CapacityScheduler.java:doneApplicationAttempt(812)) - Application Attempt appattempt_1515582681419_0001_000001 is done. finalState=FAILED
2018-01-10 12:15:18,873 INFO resourcemanager.ApplicationMasterService (ApplicationMasterService.java:registerAppAttempt(679)) - Registering app attempt : appattempt_1515582681419_0001_000002
2018-01-10 12:15:18,873 INFO scheduler.AppSchedulingInfo (AppSchedulingInfo.java:clearRequests(124)) - Application application_1515582681419_0001 requests cleared
2018-01-10 12:15:18,873 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000002 State change from NEW to SUBMITTED
2018-01-10 12:15:18,873 INFO capacity.LeafQueue (LeafQueue.java:removeApplicationAttempt(795)) - Application removed - appId: application_1515582681419_0001 user: root queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2018-01-10 12:15:18,873 INFO capacity.LeafQueue (LeafQueue.java:activateApplications(720)) - Application application_1515582681419_0001 from user: root activated in queue: default
2018-01-10 12:15:18,873 INFO capacity.LeafQueue (LeafQueue.java:addApplicationAttempt(744)) - Application added - appId: application_1515582681419_0001 user: root, leaf-queue: default #user-pending-applications: 0 #user-active-applications: 1 #queue-pending-applications: 0 #queue-active-applications: 1
2018-01-10 12:15:18,873 INFO capacity.CapacityScheduler (CapacityScheduler.java:addApplicationAttempt(773)) - Added Application Attempt appattempt_1515582681419_0001_000002 to scheduler from user root in queue default
2018-01-10 12:15:18,877 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000002 State change from SUBMITTED to SCHEDULED
2018-01-10 12:15:19,860 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(422)) - container_e05_1515582681419_0001_02_000001 Container Transitioned from NEW to ALLOCATED
2018-01-10 12:15:19,860 INFO scheduler.SchedulerNode (SchedulerNode.java:allocateContainer(152)) - Assigned container container_e05_1515582681419_0001_02_000001 of capacity <memory:1536, vCores:1> on host ambari2.internal:45454, which has 1 containers, <memory:1536, vCores:1> used and <memory:10496, vCores:3> available after allocation
2018-01-10 12:15:19,861 INFO allocator.AbstractContainerAllocator (AbstractContainerAllocator.java:getCSAssignmentFromAllocateResult(89)) - assignedContainer application attempt=appattempt_1515582681419_0001_000002 container=container_e05_1515582681419_0001_02_000001 queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@2cf52214 clusterResource=<memory:12032, vCores:4> type=OFF_SWITCH
2018-01-10 12:15:19,862 INFO security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:createAndGetNMToken(200)) - Sending NMToken for nodeId : ambari2.internal:45454 for container : container_e05_1515582681419_0001_02_000001
2018-01-10 12:15:19,862 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(422)) - container_e05_1515582681419_0001_02_000001 Container Transitioned from ALLOCATED to ACQUIRED
2018-01-10 12:15:19,863 INFO security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:clearNodeSetForAttempt(146)) - Clear node set for appattempt_1515582681419_0001_000002
2018-01-10 12:15:19,863 INFO capacity.ParentQueue (ParentQueue.java:assignContainers(475)) - assignedContainer queue=root usedCapacity=0.12765957 absoluteUsedCapacity=0.12765957 used=<memory:1536, vCores:1> cluster=<memory:12032, vCores:4>
2018-01-10 12:15:19,863 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:storeAttempt(1955)) - Storing attempt: AppId: application_1515582681419_0001 AttemptId: appattempt_1515582681419_0001_000002 MasterContainer: Container: [ContainerId: container_e05_1515582681419_0001_02_000001, NodeId: ambari2.internal:45454, NodeHttpAddress: ambari2.internal:8042, Resource: <memory:1536, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.100.25:45454 }, ]
2018-01-10 12:15:19,863 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000002 State change from SCHEDULED to ALLOCATED_SAVING
2018-01-10 12:15:19,867 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000002 State change from ALLOCATED_SAVING to ALLOCATED
2018-01-10 12:15:19,867 INFO amlauncher.AMLauncher (AMLauncher.java:run(266)) - Launching masterappattempt_1515582681419_0001_000002
2018-01-10 12:15:19,870 INFO amlauncher.AMLauncher (AMLauncher.java:launch(111)) - Setting up container Container: [ContainerId: container_e05_1515582681419_0001_02_000001, NodeId: ambari2.internal:45454, NodeHttpAddress: ambari2.internal:8042, Resource: <memory:1536, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.100.25:45454 }, ] for AM appattempt_1515582681419_0001_000002
2018-01-10 12:15:19,870 INFO security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:createAndGetAMRMToken(195)) - Create AMRMToken for ApplicationAttempt: appattempt_1515582681419_0001_000002
2018-01-10 12:15:19,870 INFO security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:createPassword(307)) - Creating password for appattempt_1515582681419_0001_000002
2018-01-10 12:15:19,887 INFO amlauncher.AMLauncher (AMLauncher.java:launch(132)) - Done launching container Container: [ContainerId: container_e05_1515582681419_0001_02_000001, NodeId: ambari2.internal:45454, NodeHttpAddress: ambari2.internal:8042, Resource: <memory:1536, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.100.25:45454 }, ] for AM appattempt_1515582681419_0001_000002
2018-01-10 12:15:19,887 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000002 State change from ALLOCATED to LAUNCHED
2018-01-10 12:15:20,862 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(422)) - container_e05_1515582681419_0001_02_000001 Container Transitioned from ACQUIRED to RUNNING
2018-01-10 12:17:02,626 INFO rmcontainer.RMContainerImpl (RMContainerImpl.java:handle(422)) - container_e05_1515582681419_0001_02_000001 Container Transitioned from RUNNING to COMPLETED
2018-01-10 12:17:02,626 INFO scheduler.SchedulerNode (SchedulerNode.java:releaseContainer(220)) - Released container container_e05_1515582681419_0001_02_000001 of capacity <memory:1536, vCores:1> on host ambari2.internal:45454, which currently has 0 containers, <memory:0, vCores:0> used and <memory:12032, vCores:4> available, release resources=true
2018-01-10 12:17:02,627 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:rememberTargetTransitionsAndStoreState(1209)) - Updating application attempt appattempt_1515582681419_0001_000002 with final state: FAILED, and exit status: 10
2018-01-10 12:17:02,627 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000002 State change from LAUNCHED to FINAL_SAVING
2018-01-10 12:17:02,627 INFO integration.RMRegistryOperationsService (RMRegistryOperationsService.java:onContainerFinished(144)) - Container container_e05_1515582681419_0001_02_000001 finished, purging container-level records
2018-01-10 12:17:02,627 INFO integration.RMRegistryOperationsService (RMRegistryOperationsService.java:purgeRecordsAsync(198)) - records under / with ID container_e05_1515582681419_0001_02_000001 and policy container: {}
2018-01-10 12:17:02,632 INFO resourcemanager.ApplicationMasterService (ApplicationMasterService.java:unregisterAttempt(685)) - Unregistering app attempt : appattempt_1515582681419_0001_000002
2018-01-10 12:17:02,632 INFO security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:applicationMasterFinished(124)) - Application finished, removing password for appattempt_1515582681419_0001_000002
2018-01-10 12:17:02,632 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(809)) - appattempt_1515582681419_0001_000002 State change from FINAL_SAVING to FAILED
2018-01-10 12:17:02,632 INFO rmapp.RMAppImpl (RMAppImpl.java:transition(1331)) - The number of failed attempts is 2. The max attempts is 2
2018-01-10 12:17:02,633 INFO rmapp.RMAppImpl (RMAppImpl.java:rememberTargetTransitionsAndStoreState(1124)) - Updating application application_1515582681419_0001 with final state: FAILED
2018-01-10 12:17:02,634 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(779)) - application_1515582681419_0001 State change from ACCEPTED to FINAL_SAVING
2018-01-10 12:17:02,634 INFO recovery.RMStateStore (RMStateStore.java:transition(228)) - Updating info for app: application_1515582681419_0001
2018-01-10 12:17:02,634 INFO capacity.CapacityScheduler (CapacityScheduler.java:doneApplicationAttempt(812)) - Application Attempt appattempt_1515582681419_0001_000002 is done. finalState=FAILED
2018-01-10 12:17:02,634 INFO scheduler.AppSchedulingInfo (AppSchedulingInfo.java:clearRequests(124)) - Application application_1515582681419_0001 requests cleared
2018-01-10 12:17:02,634 INFO capacity.LeafQueue (LeafQueue.java:removeApplicationAttempt(795)) - Application removed - appId: application_1515582681419_0001 user: root queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2018-01-10 12:17:02,636 INFO rmapp.RMAppImpl (RMAppImpl.java:transition(1064)) - Application application_1515582681419_0001 failed 2 times due to AM Container for appattempt_1515582681419_0001_000002 exited with exitCode: 10
For more detailed output, check the application tracking page: http://ambari1.internal:8088/cluster/app/application_1515582681419_0001 Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e05_1515582681419_0001_02_000001
Exit code: 10
Container exited with a non-zero exit code 10. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
f time 1515583020000 ms
18/01/10 12:17:00 INFO JobScheduler: Total delay: 0.013 s for time 1515583020000 ms (execution: 0.009 s)
18/01/10 12:17:00 INFO KafkaRDD: Removing RDD 8 from persistence list
18/01/10 12:17:00 INFO BlockManager: Removing RDD 8
18/01/10 12:17:00 INFO ReceivedBlockTracker: Deleting batches:
18/01/10 12:17:00 INFO InputInfoTracker: remove old batch metadata: 1515583000000 ms
18/01/10 12:17:02 ERROR ApplicationMaster: Uncaught exception:
java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:423)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:282)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:768)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:766)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
18/01/10 12:17:02 INFO ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds])
18/01/10 12:17:02 INFO StreamingContext: Invoking stop(stopGracefully=false) from shutdown hook
18/01/10 12:17:02 INFO ReceiverTracker: ReceiverTracker stopped
18/01/10 12:17:02 INFO JobGenerator: Stopping JobGenerator immediately
18/01/10 12:17:02 INFO RecurringTimer: Stopped timer for JobGenerator after time 1515583020000
18/01/10 12:17:02 INFO JobGenerator: Stopped JobGenerator
18/01/10 12:17:02 INFO JobScheduler: Stopped JobScheduler
18/01/10 12:17:02 INFO ContextHandler: Stopped o.s.j.s.ServletContextHandler@6fd4adf6{/streaming,null,UNAVAILABLE,@Spark}
18/01/10 12:17:02 INFO ContextHandler: Stopped o.s.j.s.ServletContextHandler@4db63574{/streaming/batch,null,UNAVAILABLE,@Spark}
18/01/10 12:17:02 INFO ContextHandler: Stopped o.s.j.s.ServletContextHandler@5aa67d0f{/static/streaming,null,UNAVAILABLE,@Spark}
18/01/10 12:17:02 INFO StreamingContext: StreamingContext stopped successfully
18/01/10 12:17:02 INFO SparkContext: Invoking stop() from shutdown hook
18/01/10 12:17:02 INFO AbstractConnector: Stopped Spark@39c6ad48{HTTP/1.1,[http/1.1]}{0.0.0.0:0}
18/01/10 12:17:02 INFO SparkUI: Stopped Spark web UI at http://192.168.100.25:34356
18/01/10 12:17:02 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/01/10 12:17:02 INFO MemoryStore: MemoryStore cleared
18/01/10 12:17:02 INFO BlockManager: BlockManager stopped
18/01/10 12:17:02 INFO BlockManagerMaster: BlockManagerMaster stopped
18/01/10 12:17:02 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/01/10 12:17:02 INFO SparkContext: Successfully stopped SparkContext
18/01/10 12:17:02 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds])
18/01/10 12:17:02 INFO ApplicationMaster: Deleting staging directory hdfs://ambari1.internal:8020/user/root/.sparkStaging/application_1515582681419_0001
18/01/10 12:17:02 INFO ShutdownHookManager: Shutdown hook called
18/01/10 12:17:02 INFO ShutdownHookManager: Deleting directory /hadoop/yarn/local/usercache/root/appcache/application_1515582681419_0001/spark-58754dbc-fb87-4868-a4df-164f898d3ff5
Failing this attempt. Failing the application.
2018-01-10 12:17:02,637 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(779)) - application_1515582681419_0001 State change from FINAL_SAVING to FAILED
2018-01-10 12:17:02,637 INFO capacity.ParentQueue (ParentQueue.java:removeApplication(385)) - Application removed - appId: application_1515582681419_0001 user: root leaf-queue of parent: root #applications: 0
2018-01-10 12:17:02,638 INFO integration.RMRegistryOperationsService (RMRegistryOperationsService.java:onApplicationCompleted(119)) - Application application_1515582681419_0001 completed, purging application-level records
2018-01-10 12:17:02,638 INFO integration.RMRegistryOperationsService (RMRegistryOperationsService.java:purgeRecordsAsync(198)) - records under / with ID application_1515582681419_0001 and policy application: {}
2018-01-10 12:21:21,034 INFO scheduler.AbstractYarnScheduler (AbstractYarnScheduler.java:run(479)) - Release request cache is cleaned up
2018-01-10 13:21:20,999 INFO availability.MetricSinkWriteShardHostnameHashingStrategy (MetricSinkWriteShardHostnameHashingStrategy.java:findCollectorShard(42)) - Calculated collector shard ambari1.internal based on hostname: ambari1.internal
2018-01-10 14:21:18,164 INFO resourcemanager.ClientRMService (ClientRMService.java:getNewApplicationId(297)) - Allocated new applicationId: 2
2018-01-10 14:21:19,857 WARN rmapp.RMAppImpl (RMAppImpl.java:<init>(423)) - The specific max attempts: 0 for application: 2 is invalid, because it is out of the range [1, 2]. Use the global max attempts instead.
2018-01-10 14:21:19,857 INFO resourcemanager.ClientRMService (ClientRMService.java:submitApplication(615)) - Application with id 2 submitted by user root
2018-01-10 14:21:19,857 INFO rmapp.RMAppImpl (RMAppImpl.java:transition(1105)) - Storing application with id application_1515582681419_0002
2018-01-10 14:21:19,858 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(779)) - application_1515582681419_0002 State change from NEW to NEW_SAVING
2018-01-10 14:21:19,858 INFO recovery.RMStateStore (RMStateStore.java:transition(199)) - Storing info for app: application_1515582681419_0002
2018-01-10 14:21:19,863 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(779)) - application_1515582681419_0002 State change from NEW_SAVING to SUBMITTED
2018-01-10 14:21:19,863 INFO capacity.ParentQueue (ParentQueue.java:addApplication(360)) - Application added - appId: application_1515582681419_0002 user: root leaf-queue of parent: root #applications: 1
2018-01-10 14:21:19,863 INFO capacity.CapacityScheduler (CapacityScheduler.java:addApplication(744)) - Accepted application application_1515582681419_0002 from user: root, in queue: default
resourcemanager.log
01-11-2018 02:21 AM
Thank you guys for replying. I wanted to edit the question in order to add more logs there, but I don't think I can... So here you are:
hadoop-mapreduce.jobsummary.log
2018-01-10 12:11:23,189 INFO resourcemanager.RMAppManager$ApplicationSummary: appId=application_1515514117634_0006,name=iit.cnr.it.socialpipeline.NormalizeJSON,user=root,queue=default,state=FAILED,trackingUrl=http://ambari1.internal:8088/proxy/application_1515514117634_0006/,appMasterHost=N/A,startTime=1515521579503,finishTime=1515521786904,finalStatus=FAILED,memorySeconds=421688,vcoreSeconds=205,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=<memory:0\, vCores:0>,applicationType=SPARK
2018-01-10 12:11:23,189 INFO resourcemanager.RMAppManager$ApplicationSummary: appId=application_1515521947770_0001,name=iit.cnr.it.socialpipeline.NormalizeJSON,user=root,queue=default,state=FAILED,trackingUrl=http://ambari1.internal:8088/proxy/application_1515521947770_0001/,appMasterHost=N/A,startTime=1515522042260,finishTime=1515522252527,finalStatus=FAILED,memorySeconds=320665,vcoreSeconds=208,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=<memory:0\, vCores:0>,applicationType=SPARK
2018-01-10 12:11:23,190 INFO resourcemanager.RMAppManager$ApplicationSummary: appId=application_1515521947770_0002,name=iit.cnr.it.socialpipeline.NormalizeJSON,user=root,queue=default,state=FAILED,trackingUrl=http://ambari1.internal:8088/proxy/application_1515521947770_0002/,appMasterHost=N/A,startTime=1515581922574,finishTime=1515582130269,finalStatus=FAILED,memorySeconds=316332,vcoreSeconds=205,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=<memory:0\, vCores:0>,applicationType=SPARK
2018-01-10 12:17:02,638 INFO resourcemanager.RMAppManager$ApplicationSummary: appId=application_1515582681419_0001,name=iit.cnr.it.socialpipeline.NormalizeJSON,user=root,queue=default,state=FAILED,trackingUrl=http://ambari1.internal:8088/cluster/app/application_1515582681419_0001,appMasterHost=N/A,startTime=1515582814080,finishTime=1515583022633,finalStatus=FAILED,memorySeconds=317834,vcoreSeconds=206,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=<memory:0\, vCores:0>,applicationType=SPARK
2018-01-10 14:32:07,537 INFO resourcemanager.RMAppManager$ApplicationSummary: appId=application_1515582681419_0002,name=iit.cnr.it.socialpipeline.NormalizeJSON,user=root,queue=default,state=FAILED,trackingUrl=http://ambari1.internal:8088/cluster/app/application_1515582681419_0002,appMasterHost=N/A,startTime=1515590479857,finishTime=1515591127527,finalStatus=FAILED,memorySeconds=992297,vcoreSeconds=645,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=<memory:0\, vCores:0>,applicationType=SPARK
2018-01-10 15:00:56,548 INFO resourcemanager.RMAppManager$ApplicationSummary: appId=application_1515582681419_0003,name=iit.cnr.it.socialpipeline.NormalizeJSON,user=root,queue=default,state=FAILED,trackingUrl=http://ambari1.internal:8088/cluster/app/application_1515582681419_0003,appMasterHost=N/A,startTime=1515592208420,finishTime=1515592856545,finalStatus=FAILED,memorySeconds=1654602,vcoreSeconds=645,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=<memory:0\, vCores:0>,applicationType=SPARK
rm-audit.log
Container exited with a non-zero exit code 10. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
f time 1515583020000 ms
18/01/10 12:17:00 INFO JobScheduler: Total delay: 0.013 s for time 1515583020000 ms (execution: 0.009 s)
18/01/10 12:17:00 INFO KafkaRDD: Removing RDD 8 from persistence list
18/01/10 12:17:00 INFO BlockManager: Removing RDD 8
18/01/10 12:17:00 INFO ReceivedBlockTracker: Deleting batches:
18/01/10 12:17:00 INFO InputInfoTracker: remove old batch metadata: 1515583000000 ms
18/01/10 12:17:02 ERROR ApplicationMaster: Uncaught exception:
java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:423)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:282)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:768)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:766)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
18/01/10 12:17:02 INFO ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds])
18/01/10 12:17:02 INFO StreamingContext: Invoking stop(stopGracefully=false) from shutdown hook
18/01/10 12:17:02 INFO ReceiverTracker: ReceiverTracker stopped
18/01/10 12:17:02 INFO JobGenerator: Stopping JobGenerator immediately
18/01/10 12:17:02 INFO RecurringTimer: Stopped timer for JobGenerator after time 1515583020000
18/01/10 12:17:02 INFO JobGenerator: Stopped JobGenerator
18/01/10 12:17:02 INFO JobScheduler: Stopped JobScheduler
18/01/10 12:17:02 INFO ContextHandler: Stopped o.s.j.s.ServletContextHandler@6fd4adf6{/streaming,null,UNAVAILABLE,@Spark}
18/01/10 12:17:02 INFO ContextHandler: Stopped o.s.j.s.ServletContextHandler@4db63574{/streaming/batch,null,UNAVAILABLE,@Spark}
18/01/10 12:17:02 INFO ContextHandler: Stopped o.s.j.s.ServletContextHandler@5aa67d0f{/static/streaming,null,UNAVAILABLE,@Spark}
18/01/10 12:17:02 INFO StreamingContext: StreamingContext stopped successfully
18/01/10 12:17:02 INFO SparkContext: Invoking stop() from shutdown hook
18/01/10 12:17:02 INFO AbstractConnector: Stopped Spark@39c6ad48{HTTP/1.1,[http/1.1]}{0.0.0.0:0}
18/01/10 12:17:02 INFO SparkUI: Stopped Spark web UI at http://192.168.100.25:34356
18/01/10 12:17:02 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/01/10 12:17:02 INFO MemoryStore: MemoryStore cleared
18/01/10 12:17:02 INFO BlockManager: BlockManager stopped
18/01/10 12:17:02 INFO BlockManagerMaster: BlockManagerMaster stopped
18/01/10 12:17:02 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/01/10 12:17:02 INFO SparkContext: Successfully stopped SparkContext
18/01/10 12:17:02 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds])
18/01/10 12:17:02 INFO ApplicationMaster: Deleting staging directory hdfs://ambari1.internal:8020/user/root/.sparkStaging/application_1515582681419_0001
18/01/10 12:17:02 INFO ShutdownHookManager: Shutdown hook called
18/01/10 12:17:02 INFO ShutdownHookManager: Deleting directory /hadoop/yarn/local/usercache/root/appcache/application_1515582681419_0001/spark-58754dbc-fb87-4868-a4df-164f898d3ff5
Failing this attempt. Failing the application. APPID=application_1515582681419_0001
2018-01-10 15:25:31,389 WARN resourcemanager.RMAuditLogger: USER=root OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1515582681419_0002 failed 2 times due to AM Container for appattempt_1515582681419_0002_000002 exited with exitCode: 10
For more detailed output, check the application tracking page: http://ambari1.internal:8088/cluster/app/application_1515582681419_0002 Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e05_1515582681419_0002_02_000001
Exit code: 10
01-10-2018 07:18 AM
I am running a Spark Streaming application that simply reads messages from a Kafka topic, enriches them, and then writes the enriched messages to another Kafka topic. I have already tried it in standalone mode (both client and cluster deploy mode) and in YARN client mode, successfully. When I submit the application in YARN cluster mode it gives me the following messages:
18/01/10 12:13:34 INFO Client: Submitting application application_1515582681419_0001 to ResourceManager
18/01/10 12:13:34 INFO YarnClientImpl: Submitted application application_1515582681419_0001
18/01/10 12:13:35 INFO Client: Application report for application_1515582681419_0001 (state: ACCEPTED)
18/01/10 12:13:35 INFO Client:
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1515582814080
final status: UNDEFINED
tracking URL: http://ambari1.internal:8088/proxy/application_1515582681419_0001/
user: root
18/01/10 12:13:36 INFO Client: Application report for application_1515582681419_0001 (state: ACCEPTED)
18/01/10 12:13:37 INFO Client: Application report for application_1515582681419_0001 (state: ACCEPTED)
It stays stuck in ACCEPTED status for around 4-5 minutes, then exits with the following error message:
18/01/10 12:17:00 INFO InputInfoTracker: remove old batch metadata: 1515583000000 ms
18/01/10 12:17:02 ERROR ApplicationMaster: Uncaught exception:
java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:423)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:282)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:768)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:766)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
18/01/10 12:17:02 INFO ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds])
18/01/10 12:17:02 INFO StreamingContext: Invoking stop(stopGracefully=false) from shutdown hook
18/01/10 12:17:02 INFO ReceiverTracker: ReceiverTracker stopped
18/01/10 12:17:02 INFO JobGenerator: Stopping JobGenerator immediately
Funny fact: if I visit the page of the application, I can see that the SparkContext has been started and that it processes some messages.
Could anyone help me on this? PS: These are the resources of my YARN cluster:
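(For reference, a rough way to pull the full ApplicationMaster logs for a failed attempt — assuming the standard YARN CLI and that log aggregation is enabled; the application ID is the one from the output above:)

```bash
# Fetch the aggregated container logs for the failed application, including
# the ApplicationMaster stderr with the TimeoutException.
yarn logs -applicationId application_1515582681419_0001 | less

# The per-attempt logs are also reachable from the tracking URL printed by
# spark-submit (http://ambari1.internal:8088/proxy/application_1515582681419_0001/).
```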
11-24-2017 07:41 AM
I installed Cloudera Manager 5.13. While installing YARN and running it for the first time, I got the following error:
Error starting JobHistoryServer
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Error creating done directory: [hdfs://vmi150132.contaboserver.net:8020/user/history/done]
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.tryCreatingHistoryDirs(HistoryFileManager.java:682)
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.createHistoryDirs(HistoryFileManager.java:618)
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.serviceInit(HistoryFileManager.java:579)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.hs.JobHistory.serviceInit(JobHistory.java:95)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.serviceInit(JobHistoryServer.java:154)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.launchJobHistoryServer(JobHistoryServer.java:229)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:239)
Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=mapred, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:279)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:260)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:240)
[... and further stack trace lines]
So basically the problem is the permissions of the HDFS folders. Something like `sudo -u hdfs hdfs dfs -chmod -R 777 /` will fix the error. But my question is: wouldn't that be unsafe? Why doesn't Cloudera handle these permissions? Was this handled better in previous versions?
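For what it's worth, a more targeted alternative I considered instead of a blanket 777 — a rough sketch only, assuming the JobHistoryServer runs as the `mapred` user with the usual `hadoop` group and that only the directory from the error needs to exist:

```bash
# Create the JobHistoryServer "done" directory under /user/history and hand it
# to the mapred user, instead of opening the whole HDFS root with chmod -R 777.
sudo -u hdfs hdfs dfs -mkdir -p /user/history
sudo -u hdfs hdfs dfs -chown -R mapred:hadoop /user/history
```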
Labels:
- Cloudera Manager
- HDFS
- MapReduce
- YARN
11-21-2017 08:21 AM
1 Kudo
I didn't get why this is a problem for Cloudera. But after excluding it, it actually worked!
11-21-2017 08:13 AM
That works! And it is actually weird.
11-21-2017 07:33 AM
1 Kudo
I have Cloudera Manager 5.12.1. I installed the CSD following the first steps of this guide: https://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html, but after restarting Cloudera Manager, even though it automatically added the remote parcel repository URL (an effect of installing the CSD), I cannot find the new parcel in order to download it. I dug into the logs and found this warning:
4:18:57.328 PM WARN ReadWriteDatabaseTaskCallable
Error while executing CmfEntityManager task
java.lang.NullPointerException
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
at com.google.common.collect.Collections2.filter(Collections2.java:92)
at com.cloudera.parcel.components.ParcelDownloaderImpl$RepositoryInfo.getParcelsWithValidNames(ParcelDownloaderImpl.java:673)
at com.cloudera.parcel.components.ParcelDownloaderImpl$RepositoryInfo.getSortedParcels(ParcelDownloaderImpl.java:691)
at com.cloudera.parcel.components.ParcelDownloaderImpl.syncRemoteRepos(ParcelDownloaderImpl.java:368)
at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:438)
at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:433)
at com.cloudera.cmf.persist.ReadWriteDatabaseTaskCallable.call(ReadWriteDatabaseTaskCallable.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Maybe it cannot download the parcels? Why? I have no firewall or proxy active, and I am on CentOS 7.
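In case it helps anyone hitting the same NullPointerException in ParcelDownloaderImpl, here is a rough way to check whether one of the configured remote parcel repository URLs is unreachable or serves an empty listing (the URL below is only a placeholder; substitute the ones shown under Parcel Settings in Cloudera Manager):

```bash
# Every parcel repository is expected to serve a manifest.json at its base URL;
# a URL that 404s or returns nothing is a good candidate to remove and retest.
for repo in "https://archive.cloudera.com/spark2/parcels/latest/"; do  # placeholder list
  echo "== ${repo}"
  curl -sSI "${repo%/}/manifest.json" | head -n 1
done
```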
Labels:
- Cloudera Manager
- Spark
04-30-2016 11:55 PM
1 Kudo
I tried to install Ambari on Lubuntu 16.04. The Ambari version was 2.2.1.1. While registering an Ambari agent with the UI wizard, the following error occurred in /var/log/ambari-agent/ambari-agent.log:
INFO 2016-05-01 01:01:03,673 NetUtil.py:60 - Connecting to https://salvob1:8440/connection_info
INFO 2016-05-01 01:01:04,165 security.py:99 - SSL Connect being called.. connecting to the server
INFO 2016-05-01 01:01:04,792 security.py:60 - SSL connection established. Two-way SSL authentication is turned off on the server.
ERROR 2016-05-01 01:01:05,197 Controller.py:165 - Cannot register host with not supported os type, hostname=salvob2, serverOsType=ubuntu16, agentOsType=ubuntu16
INFO 2016-05-01 01:01:05,199 Controller.py:394 - Registration response from salvob1 was FAILED
I wonder if and when a fix for this is going to be available.
Labels:
- Apache Ambari
04-30-2016 11:48 PM
Thank you for the reply. I did some research, and it is important to add the right repo.list; this is a "better" guide. Also, I was installing Ambari on my Lubuntu 16.04. I was sure it was fully compatible and almost the same as the 14.04 Ubuntu version, but I found out that it fails to register a host with a "not recognised" OS. Here is the error log from /var/log/ambari-agent/ambari-agent.log:
INFO 2016-05-01 01:01:03,673 NetUtil.py:60 - Connecting to https://salvob1:8440/connection_info
INFO 2016-05-01 01:01:04,165 security.py:99 - SSL Connect being called.. connecting to the server
INFO 2016-05-01 01:01:04,792 security.py:60 - SSL connection established. Two-way SSL authentication is turned off on the server.
ERROR 2016-05-01 01:01:05,197 Controller.py:165 - Cannot register host with not supported os type, hostname=salvob2, serverOsType=ubuntu16, agentOsType=ubuntu16
INFO 2016-05-01 01:01:05,199 Controller.py:394 - Registration response from salvob1 was FAILED
I hope Hortonworks soon adds at least minimal support for this new OS.
04-30-2016 04:11 PM
I am trying to register a host with my Ambari server.
I am using Ubuntu, and the version of the server is 2.2.1.0-161, but when trying to register the host it fails with this error in the bootstrap output:
INFO:root:BootStrapping hosts ['salvob2'] using /usr/lib/python2.6/site-packages/ambari_server cluster primary OS: ubuntu16 with user 'root' sshKey File /var/run/ambari-server/bootstrap/1/sshKey password File null using tmp dir /var/run/ambari-server/bootstrap/1 ambari: salvob1; server_port: 8080; ambari version: 2.2.1.0; user_run_as: root
INFO:root:Executing parallel bootstrap
ERROR:root:ERROR: Bootstrap of host salvob2 fails because previous action finished with non-zero exit code (1)
ERROR MESSAGE: Connection to salvob2 closed.
STDOUT: Desired version (2.2.1.0) of ambari-agent package is not available. Repository has following versions of ambari-agent:2.2.1.0,2.2.1.0,
Connection to salvob2 closed.
INFO:root:Finished parallel bootstrap
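A quick, rough way to see which ambari-agent versions the configured apt repositories actually offer on the host being registered (assuming an Ubuntu host with the Ambari repository listed under /etc/apt/sources.list.d/):

```bash
# Refresh the package index and list the ambari-agent versions each configured
# repository advertises; the bootstrap expects 2.2.1.0 to be installable.
sudo apt-get update
apt-cache policy ambari-agent
apt-cache madison ambari-agent
```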
Labels:
- Apache Ambari