Created on 11-27-2014 07:04 AM - edited 09-16-2022 08:39 AM
I have Oryx running fine with local computation. However, when I switch to Hadoop, it throws a few fatal exceptions. Below I describe my local configuration, the commands I ran, and the output from the Computation Layer.
Configuration
- Hadoop 2.5.1 running in pseudo-distributed mode with YARN. The installation is exactly as explained on the Apache Foundation website, keeping the same names too (I ran the MapReduce example from there and it ran fine)
- Oryx built from source, pulled around a month ago, and it produced this: oryx-computation-1.0.1-SNAPSHOT.jar
- my conf file:
model=${als-model}
model.instance-dir=/home/christina/IdeaProjects/oryx_hadoop
#model.local-computation=true
#model.local-data=true
model.test-set-fraction=0.25
model.features=6
model.lambda=1
model.alpha=30
serving-layer.api.port=8093
computation-layer.api.port=8094
- conf file full path: /home/christina/IdeaProjects/oryx_hadoop/ComputationLayer.conf
- Hadoop installation dir: /usr/local/hadoop
Commands
(~ = /home/christina)
1) /usr/local/hadoop$ hadoop fs -mkdir -p ~/IdeaProjects/oryx_hadoop/00000/inbound
2) /usr/local/hadoop$ hadoop fs -copyFromLocal ~/IdeaProjects/oryx/00000/inbound/cleaned_taste_preferences_rated_last_month.csv ~/IdeaProjects/oryx_hadoop/00000/inbound
(this other location: ~/IdeaProjects/oryx/00000/inbound/cleaned_taste_preferences_rated_last_month.csv is where I keep the original file)
3) ~$ java -Dconfig.file=/home/christina/IdeaProjects/oryx_hadoop/ComputationLayer.conf -jar /home/christina/oryx/computation/target/oryx-computation-1.0.1-SNAPSHOT.jar -server -d64 -Xmx10240m -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:-UseGCOverheadLimit
Results
Thu Nov 27 14:14:07 GMT 2014 INFO Initializing ProtocolHandler ["http-nio-8094"]
Thu Nov 27 14:14:07 GMT 2014 INFO Using a shared selector for servlet write/read
Thu Nov 27 14:14:07 GMT 2014 INFO Starting service Tomcat
Thu Nov 27 14:14:07 GMT 2014 INFO Starting Servlet Engine: Apache Tomcat/7.0.56
Thu Nov 27 14:14:08 GMT 2014 INFO Computation Layer console available at http://christina-Precision-T1700:8094
Thu Nov 27 14:14:08 GMT 2014 INFO Starting ProtocolHandler ["http-nio-8094"]
Thu Nov 27 14:14:08 GMT 2014 SEVERE Unexpected error in execution
java.lang.ExceptionInInitializerError
at com.cloudera.oryx.common.servcomp.StoreUtils.listGenerationsForInstance(StoreUtils.java:50)
at com.cloudera.oryx.computation.PeriodicRunner.run(PeriodicRunner.java:173)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Not a directory: /etc/hadoop/conf
at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
at com.cloudera.oryx.common.servcomp.OryxConfiguration.findHadoopConfDir(OryxConfiguration.java:108)
at com.cloudera.oryx.common.servcomp.OryxConfiguration.configure(OryxConfiguration.java:73)
at com.cloudera.oryx.common.servcomp.OryxConfiguration.get(OryxConfiguration.java:50)
at com.cloudera.oryx.common.servcomp.Store.<init>(Store.java:71)
at com.cloudera.oryx.common.servcomp.Store.<clinit>(Store.java:57)
... 9 more
Thu Nov 27 14:14:26 GMT 2014 SEVERE Servlet.service() for servlet [index_jspx] in context with path [] threw exception [java.lang.NoClassDefFoundError: Could not initialize class com.cloudera.oryx.common.servcomp.Store] with root cause
java.lang.NoClassDefFoundError: Could not initialize class com.cloudera.oryx.common.servcomp.Store
at com.cloudera.oryx.common.servcomp.StoreUtils.listGenerationsForInstance(StoreUtils.java:50)
at com.cloudera.oryx.computation.PeriodicRunner.doGetState(PeriodicRunner.java:140)
at com.cloudera.oryx.computation.PeriodicRunner.access$000(PeriodicRunner.java:63)
at com.cloudera.oryx.computation.PeriodicRunner$1.call(PeriodicRunner.java:119)
at com.cloudera.oryx.computation.PeriodicRunner$1.call(PeriodicRunner.java:116)
at com.cloudera.oryx.common.ReloadingReference.doGet(ReloadingReference.java:128)
at com.cloudera.oryx.common.ReloadingReference.get(ReloadingReference.java:93)
at com.cloudera.oryx.computation.PeriodicRunner.getState(PeriodicRunner.java:125)
at com.cloudera.oryx.computation.web.index_jspx._jspService(index_jspx.java:134)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:503)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:421)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1070)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:611)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1736)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1695)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:745)
Is it just a mismatch between the Oryx version and Hadoop version? I read prior to this that Oryx has to be compiled for specific versions of Hadoop, but I can't seem to find where to check for this.
Any help will be greatly appreciated.
Kind regards,
Christina
Created 11-27-2014 08:47 AM
Caused by: java.lang.IllegalStateException: Not a directory: /etc/hadoop/conf
Is HADOOP_CONF_DIR set, and set to /usr/local/hadoop? That's what it's complaining about: it can't find the Hadoop config in a default location.
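A quick way to check this from the shell is sketched below. The function name check_hadoop_conf_dir is hypothetical, and "looks like a config dir" is assumed here to mean only that the directory exists and contains core-site.xml; it is not how Oryx itself validates the path.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Oryx or Hadoop): report whether a path
# looks like a Hadoop config directory. The only test applied is that the
# directory exists and contains core-site.xml.
check_hadoop_conf_dir() {
  local dir="$1"
  if [ -z "$dir" ]; then
    echo "HADOOP_CONF_DIR is not set"
    return 1
  fi
  if [ ! -d "$dir" ]; then
    echo "Not a directory: $dir"
    return 1
  fi
  if [ ! -f "$dir/core-site.xml" ]; then
    echo "No core-site.xml in: $dir"
    return 1
  fi
  echo "OK: $dir"
}

# Check the value the current shell would pass on to the Computation Layer.
check_hadoop_conf_dir "${HADOOP_CONF_DIR:-}" || true
```

Run it in the same shell session that will start the Computation Layer, since a variable exported in another terminal or script is not visible there.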
Created 11-27-2014 09:31 AM
Hi
Thanks for coming back. As you correctly suspected, no, it was not set. I have tried the following:
- modified hadoop-env.sh, where I hardcoded HADOOP_CONF_DIR to /usr/local/hadoop/etc/hadoop. This made no difference.
- modified ~/.bashrc to include export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop (since the -env.sh was not adding that /etc/hadoop termination anymore). This again made no difference, not even after a reboot, and printenv proved my values were being ignored for some reason (... surely user error, this works for the rest of the people on the Internet...).
- modified /etc/environment and set these:
HADOOP_INSTALL="/usr/local/hadoop"
HADOOP_CONF_DIR="/usr/local/hadoop/etc/hadoop"
HADOOP_MAPRED_HOME="/usr/local/hadoop"
HADOOP_COMMON_HOME="/usr/local/hadoop"
HADOOP_HDFS_HOME="/usr/local/hadoop"
YARN_HOME="/usr/local/hadoop"
Finally I was able to start the computation layer without getting the conf dir error.
Next I tried 2 things
1) Feeding the input file through the serving layer
Despite upload being successful, things halted with this:
Thu Nov 27 17:06:05 GMT 2014 INFO Completed MergeIDMappingStep in 27s
Thu Nov 27 17:06:05 GMT 2014 WARNING Unexpected exception while running step
com.cloudera.oryx.computation.common.JobException: Oryx-/home/christina/IdeaProjects/oryx_hadoop_ingest-0-MergeIDMappingStep failed in state FAILED
at com.cloudera.oryx.computation.common.JobStep.run(JobStep.java:200)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at com.cloudera.oryx.computation.common.ParallelStep$1.call(ParallelStep.java:85)
at com.cloudera.oryx.computation.common.ParallelStep$1.call(ParallelStep.java:81)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
But, on the plus side, I am able to see the failed MAPREDUCE jobs in yarn (it retried it), so something must be working better.
2) Pushing the data in through Hadoop.
I reran commands 1 and 2 from the Commands section of my previous message, this time after starting the computation and serving layers. The file was picked up, and the processing again died later on:
Thu Nov 27 17:24:15 GMT 2014 INFO Completed SplitTestStep in 0s
Thu Nov 27 17:24:15 GMT 2014 INFO Mapper memory: 1024
Thu Nov 27 17:24:15 GMT 2014 INFO Mappers have 787MB heap and can access 1024MB RAM
Thu Nov 27 17:24:15 GMT 2014 INFO Set mapreduce.map.java.opts to '-Xmx787m -XX:+UseCompressedOops -XX:+UseParallelGC -XX:+UseParallelOldGC'
Thu Nov 27 17:24:15 GMT 2014 INFO Reducer memory: 1024
Thu Nov 27 17:24:15 GMT 2014 INFO Reducers have 787MB heap and can access 1024MB RAM
Thu Nov 27 17:24:15 GMT 2014 INFO Set mapreduce.reduce.java.opts to '-Xmx787m -XX:+UseCompressedOops -XX:+UseParallelGC -XX:+UseParallelOldGC'
Thu Nov 27 17:24:15 GMT 2014 INFO Created pipeline configuration Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, file:/usr/local/hadoop/etc/hadoop/core-site.xml, file:/usr/local/hadoop/etc/hadoop/hdfs-site.xml, file:/usr/local/hadoop/etc/hadoop/mapred-site.xml, file:/usr/local/hadoop/etc/hadoop/yarn-site.xml, file:/usr/local/hadoop/etc/hadoop/core-site.xml, file:/usr/local/hadoop/etc/hadoop/hdfs-site.xml, file:/usr/local/hadoop/etc/hadoop/mapred-site.xml, file:/usr/local/hadoop/etc/hadoop/yarn-site.xml
Thu Nov 27 17:24:15 GMT 2014 INFO Will write output files to new path: hdfs://localhost:9000/home/christina/IdeaProjects/oryx_hadoop/00000/idMapping
Thu Nov 27 17:24:15 GMT 2014 INFO Waiting for Oryx-/home/christina/IdeaProjects/oryx_hadoop-0-MergeIDMappingStep to complete
Thu Nov 27 17:24:15 GMT 2014 INFO Connecting to ResourceManager at /0.0.0.0:8032
Thu Nov 27 17:24:15 GMT 2014 INFO Total input paths to process : 1
Thu Nov 27 17:24:15 GMT 2014 INFO DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 84003086
Thu Nov 27 17:24:15 GMT 2014 INFO number of splits:1
Thu Nov 27 17:24:15 GMT 2014 INFO Submitting tokens for job: job_1417106947444_0011
Thu Nov 27 17:24:15 GMT 2014 INFO Submitted application application_1417106947444_0011
Thu Nov 27 17:24:15 GMT 2014 INFO The url to track the job: http://christina-Precision-T1700:8088/proxy/application_1417106947444_0011/
Thu Nov 27 17:24:15 GMT 2014 INFO Running job "Oryx-/home/christina/IdeaProjects/oryx_hadoop-0-MergeIDMappingStep: Text(hdfs://localhost:9000/home/christina/IdeaProjects/or... ID=1 (1/1)"
Thu Nov 27 17:24:15 GMT 2014 INFO Job status available at: http://christina-Precision-T1700:8088/proxy/application_1417106947444_0011/
1 job failure(s) occurred:
Oryx-/home/christina/IdeaProjects/oryx_hadoop-0-MergeIDMappingStep: Text(hdfs://localhost:9000/home/christina/IdeaProjects/or... ID=1 (1/1)(1): Job failed!
Thu Nov 27 17:24:58 GMT 2014 INFO Completed MergeIDMappingStep in 43s
Thu Nov 27 17:24:58 GMT 2014 WARNING Unexpected exception while running step
com.cloudera.oryx.computation.common.JobException: Oryx-/home/christina/IdeaProjects/oryx_hadoop-0-MergeIDMappingStep failed in state FAILED
at com.cloudera.oryx.computation.common.JobStep.run(JobStep.java:200)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at com.cloudera.oryx.computation.common.ParallelStep$1.call(ParallelStep.java:85)
at com.cloudera.oryx.computation.common.ParallelStep$1.call(ParallelStep.java:81)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Thu Nov 27 17:24:58 GMT 2014 WARNING Unexpected error in execution
com.cloudera.oryx.computation.common.JobException: Oryx-/home/christina/IdeaProjects/oryx_hadoop-0-MergeIDMappingStep failed in state FAILED
at com.cloudera.oryx.computation.common.JobStep.run(JobStep.java:200)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at com.cloudera.oryx.computation.common.ParallelStep$1.call(ParallelStep.java:85)
at com.cloudera.oryx.computation.common.ParallelStep$1.call(ParallelStep.java:81)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Any ideas now? I am looking for some logs with more information. Perhaps it's some sort of out-of-memory exception somewhere?
Created 11-27-2014 09:45 AM
Tracked down some logs:
2014-11-27 17:31:19,885 ERROR [IPC Server handler 6 on 57154] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1417106947444_0015_m_000000_0 - exited : java.lang.RuntimeException: native snappy library not available: SnappyCompressor has not been loaded.
at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:69)
at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:132)
at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:148)
at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:163)
at org.apache.hadoop.mapred.IFile$Writer.<init>(IFile.java:115)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1583)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1462)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:700)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2014-11-27 17:31:19,885 INFO [IPC Server handler 6 on 57154] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1417106947444_0015_m_000000_0: Error: java.lang.RuntimeException: native snappy library not available: SnappyCompressor has not been loaded.
at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:69)
at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:132)
at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:148)
at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:163)
at org.apache.hadoop.mapred.IFile$Writer.<init>(IFile.java:115)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1583)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1462)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:700)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2014-11-27 17:31:19,886 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1417106947444_0015_m_000000_0: Error: java.lang.RuntimeException: native snappy library not available: SnappyCompressor has not been loaded.
at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:69)
at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:132)
at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:148)
at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:163)
at org.apache.hadoop.mapred.IFile$Writer.<init>(IFile.java:115)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1583)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1462)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:700)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2014-11-27 17:31:19,887 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1417106947444_0015_m_000000_0 TaskAttempt Transitioned from RUNNING to FAIL_CONTAINER_CLEANUP
2014-11-27 17:31:19,887 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_1417106947444_0015_01_000002 taskAttempt attempt_1417106947444_0015_m_000000_0
2014-11-27 17:31:19,887 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1417106947444_0015_m_000000_0
2014-11-27 17:31:19,897 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1417106947444_0015_m_000000_0 TaskAttempt Transitioned from FAIL_CONTAINER_CLEANUP to FAIL_TASK_CLEANUP
2014-11-27 17:31:19,898 INFO [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: TASK_ABORT
2014-11-27 17:31:19,905 WARN [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Could not delete hdfs://localhost:9000/tmp/crunch-839704455/p1/output/_temporary/1/_temporary/attempt_1417106947444_0015_m_000000_0
2014-11-27 17:31:19,906 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1417106947444_0015_m_000000_0 TaskAttempt Transitioned from FAIL_TASK_CLEANUP to FAILED
2014-11-27 17:31:19,910 INFO [Thread-51] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 1 failures on node christina-Precision-T1700
2014-11-27 17:31:19,911 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1417106947444_0015_m_000000_1 TaskAttempt Transitioned from NEW to UNASSIGNED
2014-11-27 17:31:19,912 INFO [Thread-51] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Added attempt_1417106947444_0015_m_000000_1 to list of failed maps
2014-11-27 17:31:20,130 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:11 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2014-11-27 17:31:20,133 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1417106947444_0015: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:6144, vCores:-1> knownNMs=1
2014-11-27 17:31:20,133 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=6144
2014-11-27 17:31:20,133 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold not met. completedMapsForReduceSlowstart 1
2014-11-27 17:31:21,147 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1417106947444_0015_01_000002
2014-11-27 17:31:21,148 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1
2014-11-27 17:31:21,148 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1417106947444_0015_m_000000_0: Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Created 11-27-2014 11:48 AM
First you need to figure out where your Hadoop config files are -- core-site.xml, etc. If you unpacked things in /usr/local/hadoop, then it's almost surely /usr/local/hadoop/conf. You have "etc" in your path but shouldn't, and that's the actual problem.
You don't need to set all these environment variables, just "export HADOOP_CONF_DIR=..." in your shell. You don't need to modify any scripts. hadoop-env.sh won't do anything.
Have you installed Snappy? you will need Snappy. I don't know if plain vanilla Apache Hadoop is able to configure and install it for you, although it's part of Hadoop. It's much easier to use a distribution, but your second problem appears to be down to not having Snappy set up.
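One way to see whether this Hadoop install can load native Snappy is sketched below. It assumes the hadoop launcher is on your PATH and that your Hadoop version provides the checknative subcommand; the wrapper name check_native_libs is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around "hadoop checknative -a", which reports which
# native libraries (snappy among them) this Hadoop install can load.
# Assumption: the checknative subcommand exists in the installed version.
check_native_libs() {
  if ! command -v hadoop >/dev/null 2>&1; then
    echo "hadoop is not on PATH" >&2
    return 2
  fi
  hadoop checknative -a
}

# Run the check; the line for snappy in its output is the one that matters here.
check_native_libs || echo "Native library check did not pass" >&2
```

If snappy is reported as unavailable, map tasks that compress their output with SnappyCodec can be expected to fail the way the log above shows.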
Created 11-28-2014 07:00 AM
Thanks again.
@srowen wrote:
First you need to figure out where your Hadoop config files are -- core-site.xml, etc. If you unpacked things in /usr/local/hadoop, then it's almost surely /usr/local/hadoop/conf. You have "etc" in your path but shouldn't, and that's the actual problem.
I thought that the conf dir went away a few versions of Hadoop ago, and that all the configuration files are now in (/usr/local/hadoop/)etc/hadoop. This is why I originally thought that my Oryx version was assuming I have an old Hadoop, because it was looking for a folder called conf.
But things are working now with the environment variables I have set, so that one is solved.
@srowen wrote:
Have you installed Snappy? you will need Snappy. I don't know if plain vanilla Apache Hadoop is able to configure and install it for you, although it's part of Hadoop. It's much easier to use a distribution, but your second problem appears to be down to not having Snappy set up.
I have now installed Snappy and things sort of work, I managed to run a generation to completion. In case someone runs into problems building and installing Snappy, please see the end of this message.
Now I have a different problem: the computations converge at iteration 2 and the MAP is abysmal, 0.006. Plus:
Fri Nov 28 14:46:51 GMT 2014 INFO Loading X and Y to test whether they have sufficient rank
Fri Nov 28 14:46:55 GMT 2014 INFO Matrix is not yet proved to be non-singular, continuing to load...
Fri Nov 28 14:46:55 GMT 2014 WARNING X or Y does not have sufficient rank; deleting this model and its results
I ran the process twice (deleted the previous results), and both runs were like this (MAP was slightly different the first time, but still around 0.00x).
Given that this is a random process, every now and again I expect things to be picked out badly. But with these particular parameters I ran about 20 in-memory simulations and the MAP was always above 0.11. Convergence happens between 25 and 60 iterations. With other lambdas, factors and alphas, I do get the occasional convergence at iteration 2 and the tiny MAP, but never twice in a row.
Should I open a new thread for this?
How to install Snappy
Make sure these 3 packages are installed first:
sudo apt-get install build-essential (needed for snappy)
sudo apt-get install autoconf (needed for snappy-hadoop)
sudo apt-get install libtool (needed for snappy-hadoop; without it you get the m4_pattern_allow error)
Then follow this: https://github.com/electrum/hadoop-snappy . How to build and install snappy is explained in the file called "INSTALL".
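Before starting the build, a quick check that the tools are actually on the PATH can save a failed run. This is only a sketch: the helper name missing_commands is hypothetical, and the mapping of commands to packages (gcc and make from build-essential, autoconf from autoconf, libtoolize from libtool) is my assumption about a typical Ubuntu setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each named command that is not on PATH,
# one per line. Prints nothing when every command is found.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Assumed mapping: gcc and make come from build-essential,
# autoconf from autoconf, libtoolize from libtool.
missing=$(missing_commands gcc make autoconf libtoolize)
if [ -n "$missing" ]; then
  echo "Missing build tools:" $missing
else
  echo "All build tools found"
fi
```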
Created 11-28-2014 08:13 AM
Thanks for the help, Sean, I have opened a new thread for the MAP problem.
Created 11-28-2014 08:27 AM
Hadoop still has config files for sure. They can end up wherever you want them to. I thought they were still at $HADOOP_HOME/conf in the vanilla Hadoop tarball, but I took a look at 2.5.2 and they are in fact at $HADOOP_HOME/etc/hadoop. In any event, if they're at /usr/local/hadoop/etc/hadoop in your installation, then that's what you set $HADOOP_CONF_DIR to: wherever they really are. This is one of Hadoop's standard environment variables. If you're up and running then this is working.
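As a rough sketch of "wherever they really are", the snippet below looks for core-site.xml in the two layouts mentioned in this thread and exports the first match. The function name find_hadoop_conf_dir is hypothetical, and /usr/local/hadoop is simply the install dir from the original post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the config directory under a Hadoop install
# root. It checks only the two layouts discussed in this thread,
# etc/hadoop first and then conf, and looks for core-site.xml in each.
find_hadoop_conf_dir() {
  local root="$1"
  local candidate
  for candidate in "$root/etc/hadoop" "$root/conf"; do
    if [ -f "$candidate/core-site.xml" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# /usr/local/hadoop is the install dir given in the original post.
if conf_dir=$(find_hadoop_conf_dir /usr/local/hadoop); then
  export HADOOP_CONF_DIR="$conf_dir"
  echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
else
  echo "No core-site.xml found under /usr/local/hadoop" >&2
fi
```

The export only affects the current shell and the processes started from it, so the Computation Layer has to be launched from that same session.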
Yes that sounds like about what you do to install snappy. They are libs that should be present on the cluster machines.