About anandi

anandi · ‎11-21-2018

Hey Jasper, great article! Thanks for sharing. Would you recommend using hive to spark. what about a similar article using spark? 😉

anandi · ‎09-01-2018

@ Steve Matison Thank you for posting the ES MPACK. Would you be able to share how to build our own custom MPACK. Maybe a follow on blog. Cheers Amit

anandi · ‎10-29-2017

Storm - Supervisor and Nimbus dropping immediately after start - Please advise on remediation. Thanks. HDP install on OpenStack - Centos 7.2 with Ambari 2.5.2.0: HDP-2.6.2.14 - Storm 1.1.0 Storm failure to start at install time with the following log: stderr: /var/lib/ambari-agent/data/errors-238.txt stderr: Traceback (most recent call last): File "/var/lib/ambari-agent/cache/common-services/STORM/0.9.1/package/scripts/service_check.py", line 79, in <module> ServiceCheck().execute() File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 329, in execute method(env) File "/var/lib/ambari-agent/cache/common-services/STORM/0.9.1/package/scripts/service_check.py", line 70, in service_check user=params.storm_user File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 166, in __init__ self.env.run() File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run self.run_action(resource, action) File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action provider_action() File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run tries=self.resource.tries, try_sleep=self.resource.try_sleep) File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner result = function(command, **kwargs) File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy) File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper result = _call(command, **kwargs_copy) File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call raise ExecutionFailed(err_msg, code, out, err) resource_management.core.exceptions.ExecutionFailed: Execution of 'storm jar /tmp/wordCount.jar storm.starter.WordCountTopology WordCountid1aaca8ef_date022917' returned 1. Running: /usr/jdk64/jdk1.8.0_112/bin/java -server -Ddaemon.name= -Dstorm.options= -Dstorm.home=/usr/hdp/2.6.2.14-5/storm -Dstorm.log.dir=/var/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= -cp /usr/hdp/2.6.2.14-5/storm/lib/asm-5.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/clojure-1.7.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/disruptor-3.3.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/kryo-3.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-api-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-core-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-slf4j-impl-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/minlog-1.3.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/objenesis-2.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/reflectasm-1.10.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/ring-cors-0.1.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/slf4j-api-1.7.21.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-core-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-rename-hack-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/zookeeper.jar:/usr/hdp/2.6.2.14-5/storm/lib/ambari-metrics-storm-sink.jar:/usr/hdp/2.6.2.14-5/storm/extlib/atlas-plugin-classloader-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib/storm-bridge-shim-0.8.0.2.6.2.14-5.jar org.apache.storm.daemon.ClientJarTransformerRunner org.apache.storm.hack.StormShadeTransformer /tmp/wordCount.jar /tmp/ea59a668bcca11e7ae97fa163eb0f425.jar 1330 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseBasicBolt to org/apache/storm/topology/base/BaseBasicBolt in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace 1337 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/tuple/Tuple to org/apache/storm/tuple/Tuple in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace 1338 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/BasicOutputCollector to org/apache/storm/topology/BasicOutputCollector in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace 1338 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/tuple/Values to org/apache/storm/tuple/Values in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace 1339 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/OutputFieldsDeclarer to org/apache/storm/topology/OutputFieldsDeclarer in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace 1339 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/tuple/Fields to org/apache/storm/tuple/Fields in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace 1340 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/drpc/LinearDRPCTopologyBuilder to org/apache/storm/drpc/LinearDRPCTopologyBuilder in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1341 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/IBasicBolt to org/apache/storm/topology/IBasicBolt in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1341 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/drpc/LinearDRPCInputDeclarer to org/apache/storm/drpc/LinearDRPCInputDeclarer in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1341 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/Config to org/apache/storm/Config in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1343 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/LocalDRPC to org/apache/storm/LocalDRPC in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1343 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/LocalCluster to org/apache/storm/LocalCluster in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1344 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/ILocalDRPC to org/apache/storm/ILocalDRPC in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1344 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/generated/StormTopology to org/apache/storm/generated/StormTopology in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1345 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/StormSubmitter to org/apache/storm/StormSubmitter in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1353 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseRichBolt to org/apache/storm/topology/base/BaseRichBolt in storm/starter/bolt/RollingCountBolt.class. please modify your code to use the new namespace 1354 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/OutputCollector to org/apache/storm/task/OutputCollector in storm/starter/bolt/RollingCountBolt.class. please modify your code to use the new namespace 1354 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/TopologyContext to org/apache/storm/task/TopologyContext in storm/starter/bolt/RollingCountBolt.class. please modify your code to use the new namespace 1358 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/utils/TimeCacheMap$ExpiredCallback to org/apache/storm/utils/TimeCacheMap$ExpiredCallback in storm/starter/bolt/SingleJoinBolt$ExpireCallback.class. please modify your code to use the new namespace 1358 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/generated/GlobalStreamId to org/apache/storm/generated/GlobalStreamId in storm/starter/bolt/SingleJoinBolt$ExpireCallback.class. please modify your code to use the new namespace 1358 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/utils/TimeCacheMap to org/apache/storm/utils/TimeCacheMap in storm/starter/bolt/SingleJoinBolt$ExpireCallback.class. please modify your code to use the new namespace 1374 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/spout/ISpout to org/apache/storm/spout/ISpout in storm/starter/clj/word_count$sentence_spout__$fn$reify__23.class. please modify your code to use the new namespace 1379 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/IBolt to org/apache/storm/task/IBolt in storm/starter/clj/word_count$split_sentence__$fn$reify__42.class. please modify your code to use the new namespace 1454 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/TopologyBuilder to org/apache/storm/topology/TopologyBuilder in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1454 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/testing/TestWordSpout to org/apache/storm/testing/TestWordSpout in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1455 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/IRichSpout to org/apache/storm/topology/IRichSpout in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1455 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/SpoutDeclarer to org/apache/storm/topology/SpoutDeclarer in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1455 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/IRichBolt to org/apache/storm/topology/IRichBolt in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1456 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/BoltDeclarer to org/apache/storm/topology/BoltDeclarer in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1456 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/InputDeclarer to org/apache/storm/topology/InputDeclarer in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1456 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/utils/Utils to org/apache/storm/utils/Utils in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1458 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/drpc/DRPCSpout to org/apache/storm/drpc/DRPCSpout in storm/starter/ManualDRPC.class. please modify your code to use the new namespace 1458 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/drpc/ReturnResults to org/apache/storm/drpc/ReturnResults in storm/starter/ManualDRPC.class. please modify your code to use the new namespace 1460 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseBatchBolt to org/apache/storm/topology/base/BaseBatchBolt in storm/starter/ReachTopology$CountAggregator.class. please modify your code to use the new namespace 1460 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/coordination/BatchOutputCollector to org/apache/storm/coordination/BatchOutputCollector in storm/starter/ReachTopology$CountAggregator.class. please modify your code to use the new namespace 1464 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/coordination/IBatchBolt to org/apache/storm/coordination/IBatchBolt in storm/starter/ReachTopology.class. please modify your code to use the new namespace 1467 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/testing/FeederSpout to org/apache/storm/testing/FeederSpout in storm/starter/SingleJoinExample.class. please modify your code to use the new namespace 1469 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseRichSpout to org/apache/storm/topology/base/BaseRichSpout in storm/starter/spout/RandomSentenceSpout.class. please modify your code to use the new namespace 1469 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/spout/SpoutOutputCollector to org/apache/storm/spout/SpoutOutputCollector in storm/starter/spout/RandomSentenceSpout.class. please modify your code to use the new namespace 1470 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/utils/Time to org/apache/storm/utils/Time in storm/starter/tools/NthLastModifiedTimeTracker.class. please modify your code to use the new namespace 1480 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseTransactionalBolt to org/apache/storm/topology/base/BaseTransactionalBolt in storm/starter/TransactionalGlobalCount$UpdateGlobalCount.class. please modify your code to use the new namespace 1480 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/transactional/ICommitter to org/apache/storm/transactional/ICommitter in storm/starter/TransactionalGlobalCount$UpdateGlobalCount.class. please modify your code to use the new namespace 1480 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/transactional/TransactionAttempt to org/apache/storm/transactional/TransactionAttempt in storm/starter/TransactionalGlobalCount$UpdateGlobalCount.class. please modify your code to use the new namespace 1482 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/testing/MemoryTransactionalSpout to org/apache/storm/testing/MemoryTransactionalSpout in storm/starter/TransactionalGlobalCount.class. please modify your code to use the new namespace 1482 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/transactional/TransactionalTopologyBuilder to org/apache/storm/transactional/TransactionalTopologyBuilder in storm/starter/TransactionalGlobalCount.class. please modify your code to use the new namespace 1482 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/transactional/partitioned/IPartitionedTransactionalSpout to org/apache/storm/transactional/partitioned/IPartitionedTransactionalSpout in storm/starter/TransactionalGlobalCount.class. please modify your code to use the new namespace 1491 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/BaseFunction to org/apache/storm/trident/operation/BaseFunction in storm/starter/trident/TridentReach$ExpandList.class. please modify your code to use the new namespace 1491 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/tuple/TridentTuple to org/apache/storm/trident/tuple/TridentTuple in storm/starter/trident/TridentReach$ExpandList.class. please modify your code to use the new namespace 1492 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/TridentCollector to org/apache/storm/trident/operation/TridentCollector in storm/starter/trident/TridentReach$ExpandList.class. please modify your code to use the new namespace 1492 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/CombinerAggregator to org/apache/storm/trident/operation/CombinerAggregator in storm/starter/trident/TridentReach$One.class. please modify your code to use the new namespace 1493 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/StateFactory to org/apache/storm/trident/state/StateFactory in storm/starter/trident/TridentReach$StaticSingleKeyMapState$Factory.class. please modify your code to use the new namespace 1493 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/IMetricsContext to org/apache/storm/task/IMetricsContext in storm/starter/trident/TridentReach$StaticSingleKeyMapState$Factory.class. please modify your code to use the new namespace 1493 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/State to org/apache/storm/trident/state/State in storm/starter/trident/TridentReach$StaticSingleKeyMapState$Factory.class. please modify your code to use the new namespace 1494 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/ReadOnlyState to org/apache/storm/trident/state/ReadOnlyState in storm/starter/trident/TridentReach$StaticSingleKeyMapState.class. please modify your code to use the new namespace 1494 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/map/ReadOnlyMapState to org/apache/storm/trident/state/map/ReadOnlyMapState in storm/starter/trident/TridentReach$StaticSingleKeyMapState.class. please modify your code to use the new namespace 1495 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/TridentTopology to org/apache/storm/trident/TridentTopology in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1495 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/TridentState to org/apache/storm/trident/TridentState in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1495 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/Stream to org/apache/storm/trident/Stream in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1495 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/builtin/MapGet to org/apache/storm/trident/operation/builtin/MapGet in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1496 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/QueryFunction to org/apache/storm/trident/state/QueryFunction in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1496 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/Function to org/apache/storm/trident/operation/Function in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1496 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/fluent/GroupedStream to org/apache/storm/trident/fluent/GroupedStream in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1497 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/builtin/Sum to org/apache/storm/trident/operation/builtin/Sum in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1499 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/testing/MemoryMapState$Factory to org/apache/storm/trident/testing/MemoryMapState$Factory in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1499 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/testing/MemoryMapState to org/apache/storm/trident/testing/MemoryMapState in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1499 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/testing/FixedBatchSpout to org/apache/storm/trident/testing/FixedBatchSpout in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1499 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/spout/IBatchSpout to org/apache/storm/trident/spout/IBatchSpout in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1500 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/builtin/Count to org/apache/storm/trident/operation/builtin/Count in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1500 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/builtin/FilterNull to org/apache/storm/trident/operation/builtin/FilterNull in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1501 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/Filter to org/apache/storm/trident/operation/Filter in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1503 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/ShellBolt to org/apache/storm/task/ShellBolt in storm/starter/WordCountTopology$SplitSentence.class. please modify your code to use the new namespace Running: /usr/jdk64/jdk1.8.0_112/bin/java -Ddaemon.name= -Dstorm.options= -Dstorm.home=/usr/hdp/2.6.2.14-5/storm -Dstorm.log.dir=/var/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/hdp/current/storm-client/lib -Dstorm.conf.file= -cp /usr/hdp/2.6.2.14-5/storm/lib/asm-5.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/clojure-1.7.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/disruptor-3.3.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/kryo-3.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-api-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-core-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-slf4j-impl-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/minlog-1.3.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/objenesis-2.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/reflectasm-1.10.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/ring-cors-0.1.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/slf4j-api-1.7.21.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-core-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-rename-hack-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/zookeeper.jar:/usr/hdp/2.6.2.14-5/storm/lib/ambari-metrics-storm-sink.jar:/usr/hdp/2.6.2.14-5/storm/extlib/atlas-plugin-classloader-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib/storm-bridge-shim-0.8.0.2.6.2.14-5.jar:/tmp/ea59a668bcca11e7ae97fa163eb0f425.jar:/usr/hdp/current/storm-supervisor/conf:/usr/hdp/2.6.2.14-5/storm/bin -Dstorm.jar=/tmp/ea59a668bcca11e7ae97fa163eb0f425.jar -Dstorm.dependency.jars= -Dstorm.dependency.artifacts={} storm.starter.WordCountTopology WordCountid1aaca8ef_date022917 948 [main] INFO o.a.s.StormSubmitter - Generated ZooKeeper secret payload for MD5-digest: -6085484404615721045:-8414510412923341525 1095 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2001ms (NOT MAX) 3099 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2002ms (NOT MAX) 5103 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2005ms (NOT MAX) 7110 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2013ms (NOT MAX) 9125 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2016ms (NOT MAX) 11144 [main] WARN o.a.s.u.NimbusClient - Ignoring exception while trying to get leader nimbus info from mst2-an05.field.hortonworks.com. will retry with a different seed host. java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused) at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:108) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.ThriftClient.<init>(ThriftClient.java:69) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.utils.NimbusClient.<init>(NimbusClient.java:128) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:84) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:58) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.StormSubmitter.getListOfKeysFromBlobStore(StormSubmitter.java:598) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.StormSubmitter.validateConfs(StormSubmitter.java:564) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.StormSubmitter.submitTopologyAs(StormSubmitter.java:210) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:390) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:162) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at storm.starter.WordCountTopology.main(WordCountTopology.java:77) [ea59a668bcca11e7ae97fa163eb0f425.jar:?] Caused by: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused) at org.apache.storm.security.auth.TBackoffConnect.retryNext(TBackoffConnect.java:64) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:56) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] ... 11 more Caused by: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused) at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:226) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] ... 11 more Caused by: java.net.ConnectException: Connection refused (Connection refused) at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_112] at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_112] at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_112] at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_112] at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_112] at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_112] at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:221) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] ... 11 more Exception in thread "main" org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [mst2-an05.field.hortonworks.com]. Did you specify a valid list of nimbus hosts for config nimbus.seeds? at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:112) at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:58) at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) at org.apache.storm.StormSubmitter.getListOfKeysFromBlobStore(StormSubmitter.java:598) at org.apache.storm.StormSubmitter.validateConfs(StormSubmitter.java:564) at org.apache.storm.StormSubmitter.submitTopologyAs(StormSubmitter.java:210) at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:390) at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:162) at storm.starter.WordCountTopology.main(WordCountTopology.java:77) stdout: 2017-10-29 17:02:12,776 - Stack Feature Version Info: Cluster Stack=2.6, Cluster Current Version=None, Command Stack=None, Command Version=2.6.2.14-5 -> 2.6.2.14-5 2017-10-29 17:02:12,815 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf 2017-10-29 17:02:12,817 - checked_call['hostid'] {} 2017-10-29 17:02:12,827 - checked_call returned (0, '1aaca8ef') 2017-10-29 17:02:12,827 - File['/tmp/wordCount.jar'] {'owner': 'storm', 'content': StaticFile('wordCount.jar')} 2017-10-29 17:02:12,831 - Writing File['/tmp/wordCount.jar'] because it doesn't exist 2017-10-29 17:02:12,833 - Changing owner for /tmp/wordCount.jar from 0 to storm 2017-10-29 17:02:12,833 - Execute['storm jar /tmp/wordCount.jar storm.starter.WordCountTopology WordCountid1aaca8ef_date022917'] {'logoutput': True, 'path': [u'/usr/hdp/current/storm-client/bin'], 'user': 'storm'} Running: /usr/jdk64/jdk1.8.0_112/bin/java -server -Ddaemon.name= -Dstorm.options= -Dstorm.home=/usr/hdp/2.6.2.14-5/storm -Dstorm.log.dir=/var/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= -cp /usr/hdp/2.6.2.14-5/storm/lib/asm-5.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/clojure-1.7.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/disruptor-3.3.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/kryo-3.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-api-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-core-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-slf4j-impl-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/minlog-1.3.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/objenesis-2.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/reflectasm-1.10.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/ring-cors-0.1.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/slf4j-api-1.7.21.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-core-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-rename-hack-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/zookeeper.jar:/usr/hdp/2.6.2.14-5/storm/lib/ambari-metrics-storm-sink.jar:/usr/hdp/2.6.2.14-5/storm/extlib/atlas-plugin-classloader-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib/storm-bridge-shim-0.8.0.2.6.2.14-5.jar org.apache.storm.daemon.ClientJarTransformerRunner org.apache.storm.hack.StormShadeTransformer /tmp/wordCount.jar /tmp/ea59a668bcca11e7ae97fa163eb0f425.jar 1330 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseBasicBolt to org/apache/storm/topology/base/BaseBasicBolt in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace 1337 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/tuple/Tuple to org/apache/storm/tuple/Tuple in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace 1338 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/BasicOutputCollector to org/apache/storm/topology/BasicOutputCollector in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace 1338 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/tuple/Values to org/apache/storm/tuple/Values in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace 1339 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/OutputFieldsDeclarer to org/apache/storm/topology/OutputFieldsDeclarer in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace 1339 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/tuple/Fields to org/apache/storm/tuple/Fields in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace 1340 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/drpc/LinearDRPCTopologyBuilder to org/apache/storm/drpc/LinearDRPCTopologyBuilder in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1341 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/IBasicBolt to org/apache/storm/topology/IBasicBolt in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1341 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/drpc/LinearDRPCInputDeclarer to org/apache/storm/drpc/LinearDRPCInputDeclarer in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1341 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/Config to org/apache/storm/Config in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1343 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/LocalDRPC to org/apache/storm/LocalDRPC in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1343 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/LocalCluster to org/apache/storm/LocalCluster in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1344 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/ILocalDRPC to org/apache/storm/ILocalDRPC in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1344 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/generated/StormTopology to org/apache/storm/generated/StormTopology in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1345 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/StormSubmitter to org/apache/storm/StormSubmitter in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace 1353 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseRichBolt to org/apache/storm/topology/base/BaseRichBolt in storm/starter/bolt/RollingCountBolt.class. please modify your code to use the new namespace 1354 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/OutputCollector to org/apache/storm/task/OutputCollector in storm/starter/bolt/RollingCountBolt.class. please modify your code to use the new namespace 1354 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/TopologyContext to org/apache/storm/task/TopologyContext in storm/starter/bolt/RollingCountBolt.class. please modify your code to use the new namespace 1358 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/utils/TimeCacheMap$ExpiredCallback to org/apache/storm/utils/TimeCacheMap$ExpiredCallback in storm/starter/bolt/SingleJoinBolt$ExpireCallback.class. please modify your code to use the new namespace 1358 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/generated/GlobalStreamId to org/apache/storm/generated/GlobalStreamId in storm/starter/bolt/SingleJoinBolt$ExpireCallback.class. please modify your code to use the new namespace 1358 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/utils/TimeCacheMap to org/apache/storm/utils/TimeCacheMap in storm/starter/bolt/SingleJoinBolt$ExpireCallback.class. please modify your code to use the new namespace 1374 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/spout/ISpout to org/apache/storm/spout/ISpout in storm/starter/clj/word_count$sentence_spout__$fn$reify__23.class. please modify your code to use the new namespace 1379 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/IBolt to org/apache/storm/task/IBolt in storm/starter/clj/word_count$split_sentence__$fn$reify__42.class. please modify your code to use the new namespace 1454 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/TopologyBuilder to org/apache/storm/topology/TopologyBuilder in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1454 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/testing/TestWordSpout to org/apache/storm/testing/TestWordSpout in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1455 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/IRichSpout to org/apache/storm/topology/IRichSpout in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1455 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/SpoutDeclarer to org/apache/storm/topology/SpoutDeclarer in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1455 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/IRichBolt to org/apache/storm/topology/IRichBolt in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1456 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/BoltDeclarer to org/apache/storm/topology/BoltDeclarer in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1456 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/InputDeclarer to org/apache/storm/topology/InputDeclarer in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1456 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/utils/Utils to org/apache/storm/utils/Utils in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace 1458 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/drpc/DRPCSpout to org/apache/storm/drpc/DRPCSpout in storm/starter/ManualDRPC.class. please modify your code to use the new namespace 1458 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/drpc/ReturnResults to org/apache/storm/drpc/ReturnResults in storm/starter/ManualDRPC.class. please modify your code to use the new namespace 1460 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseBatchBolt to org/apache/storm/topology/base/BaseBatchBolt in storm/starter/ReachTopology$CountAggregator.class. please modify your code to use the new namespace 1460 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/coordination/BatchOutputCollector to org/apache/storm/coordination/BatchOutputCollector in storm/starter/ReachTopology$CountAggregator.class. please modify your code to use the new namespace 1464 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/coordination/IBatchBolt to org/apache/storm/coordination/IBatchBolt in storm/starter/ReachTopology.class. please modify your code to use the new namespace 1467 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/testing/FeederSpout to org/apache/storm/testing/FeederSpout in storm/starter/SingleJoinExample.class. please modify your code to use the new namespace 1469 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseRichSpout to org/apache/storm/topology/base/BaseRichSpout in storm/starter/spout/RandomSentenceSpout.class. please modify your code to use the new namespace 1469 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/spout/SpoutOutputCollector to org/apache/storm/spout/SpoutOutputCollector in storm/starter/spout/RandomSentenceSpout.class. please modify your code to use the new namespace 1470 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/utils/Time to org/apache/storm/utils/Time in storm/starter/tools/NthLastModifiedTimeTracker.class. please modify your code to use the new namespace 1480 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseTransactionalBolt to org/apache/storm/topology/base/BaseTransactionalBolt in storm/starter/TransactionalGlobalCount$UpdateGlobalCount.class. please modify your code to use the new namespace 1480 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/transactional/ICommitter to org/apache/storm/transactional/ICommitter in storm/starter/TransactionalGlobalCount$UpdateGlobalCount.class. please modify your code to use the new namespace 1480 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/transactional/TransactionAttempt to org/apache/storm/transactional/TransactionAttempt in storm/starter/TransactionalGlobalCount$UpdateGlobalCount.class. please modify your code to use the new namespace 1482 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/testing/MemoryTransactionalSpout to org/apache/storm/testing/MemoryTransactionalSpout in storm/starter/TransactionalGlobalCount.class. please modify your code to use the new namespace 1482 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/transactional/TransactionalTopologyBuilder to org/apache/storm/transactional/TransactionalTopologyBuilder in storm/starter/TransactionalGlobalCount.class. please modify your code to use the new namespace 1482 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/transactional/partitioned/IPartitionedTransactionalSpout to org/apache/storm/transactional/partitioned/IPartitionedTransactionalSpout in storm/starter/TransactionalGlobalCount.class. please modify your code to use the new namespace 1491 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/BaseFunction to org/apache/storm/trident/operation/BaseFunction in storm/starter/trident/TridentReach$ExpandList.class. please modify your code to use the new namespace 1491 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/tuple/TridentTuple to org/apache/storm/trident/tuple/TridentTuple in storm/starter/trident/TridentReach$ExpandList.class. please modify your code to use the new namespace 1492 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/TridentCollector to org/apache/storm/trident/operation/TridentCollector in storm/starter/trident/TridentReach$ExpandList.class. please modify your code to use the new namespace 1492 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/CombinerAggregator to org/apache/storm/trident/operation/CombinerAggregator in storm/starter/trident/TridentReach$One.class. please modify your code to use the new namespace 1493 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/StateFactory to org/apache/storm/trident/state/StateFactory in storm/starter/trident/TridentReach$StaticSingleKeyMapState$Factory.class. please modify your code to use the new namespace 1493 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/IMetricsContext to org/apache/storm/task/IMetricsContext in storm/starter/trident/TridentReach$StaticSingleKeyMapState$Factory.class. please modify your code to use the new namespace 1493 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/State to org/apache/storm/trident/state/State in storm/starter/trident/TridentReach$StaticSingleKeyMapState$Factory.class. please modify your code to use the new namespace 1494 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/ReadOnlyState to org/apache/storm/trident/state/ReadOnlyState in storm/starter/trident/TridentReach$StaticSingleKeyMapState.class. please modify your code to use the new namespace 1494 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/map/ReadOnlyMapState to org/apache/storm/trident/state/map/ReadOnlyMapState in storm/starter/trident/TridentReach$StaticSingleKeyMapState.class. please modify your code to use the new namespace 1495 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/TridentTopology to org/apache/storm/trident/TridentTopology in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1495 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/TridentState to org/apache/storm/trident/TridentState in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1495 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/Stream to org/apache/storm/trident/Stream in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1495 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/builtin/MapGet to org/apache/storm/trident/operation/builtin/MapGet in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1496 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/QueryFunction to org/apache/storm/trident/state/QueryFunction in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1496 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/Function to org/apache/storm/trident/operation/Function in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1496 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/fluent/GroupedStream to org/apache/storm/trident/fluent/GroupedStream in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1497 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/builtin/Sum to org/apache/storm/trident/operation/builtin/Sum in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace 1499 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/testing/MemoryMapState$Factory to org/apache/storm/trident/testing/MemoryMapState$Factory in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1499 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/testing/MemoryMapState to org/apache/storm/trident/testing/MemoryMapState in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1499 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/testing/FixedBatchSpout to org/apache/storm/trident/testing/FixedBatchSpout in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1499 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/spout/IBatchSpout to org/apache/storm/trident/spout/IBatchSpout in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1500 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/builtin/Count to org/apache/storm/trident/operation/builtin/Count in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1500 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/builtin/FilterNull to org/apache/storm/trident/operation/builtin/FilterNull in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1501 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/Filter to org/apache/storm/trident/operation/Filter in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace 1503 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/ShellBolt to org/apache/storm/task/ShellBolt in storm/starter/WordCountTopology$SplitSentence.class. please modify your code to use the new namespace Running: /usr/jdk64/jdk1.8.0_112/bin/java -Ddaemon.name= -Dstorm.options= -Dstorm.home=/usr/hdp/2.6.2.14-5/storm -Dstorm.log.dir=/var/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/hdp/current/storm-client/lib -Dstorm.conf.file= -cp /usr/hdp/2.6.2.14-5/storm/lib/asm-5.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/clojure-1.7.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/disruptor-3.3.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/kryo-3.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-api-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-core-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-slf4j-impl-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/minlog-1.3.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/objenesis-2.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/reflectasm-1.10.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/ring-cors-0.1.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/slf4j-api-1.7.21.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-core-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-rename-hack-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/zookeeper.jar:/usr/hdp/2.6.2.14-5/storm/lib/ambari-metrics-storm-sink.jar:/usr/hdp/2.6.2.14-5/storm/extlib/atlas-plugin-classloader-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib/storm-bridge-shim-0.8.0.2.6.2.14-5.jar:/tmp/ea59a668bcca11e7ae97fa163eb0f425.jar:/usr/hdp/current/storm-supervisor/conf:/usr/hdp/2.6.2.14-5/storm/bin -Dstorm.jar=/tmp/ea59a668bcca11e7ae97fa163eb0f425.jar -Dstorm.dependency.jars= -Dstorm.dependency.artifacts={} storm.starter.WordCountTopology WordCountid1aaca8ef_date022917 948 [main] INFO o.a.s.StormSubmitter - Generated ZooKeeper secret payload for MD5-digest: -6085484404615721045:-8414510412923341525 1095 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2001ms (NOT MAX) 3099 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2002ms (NOT MAX) 5103 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2005ms (NOT MAX) 7110 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2013ms (NOT MAX) 9125 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2016ms (NOT MAX) 11144 [main] WARN o.a.s.u.NimbusClient - Ignoring exception while trying to get leader nimbus info from mst2-an05.field.hortonworks.com. will retry with a different seed host. java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused) at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:108) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.ThriftClient.<init>(ThriftClient.java:69) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.utils.NimbusClient.<init>(NimbusClient.java:128) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:84) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:58) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.StormSubmitter.getListOfKeysFromBlobStore(StormSubmitter.java:598) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.StormSubmitter.validateConfs(StormSubmitter.java:564) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.StormSubmitter.submitTopologyAs(StormSubmitter.java:210) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:390) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:162) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at storm.starter.WordCountTopology.main(WordCountTopology.java:77) [ea59a668bcca11e7ae97fa163eb0f425.jar:?] Caused by: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused) at org.apache.storm.security.auth.TBackoffConnect.retryNext(TBackoffConnect.java:64) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:56) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] ... 11 more Caused by: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused) at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:226) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] ... 11 more Caused by: java.net.ConnectException: Connection refused (Connection refused) at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_112] at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_112] at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_112] at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_112] at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_112] at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_112] at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:221) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5] ... 11 more Exception in thread "main" org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [mst2-an05.field.hortonworks.com]. Did you specify a valid list of nimbus hosts for config nimbus.seeds? at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:112) at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:58) at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) at org.apache.storm.StormSubmitter.getListOfKeysFromBlobStore(StormSubmitter.java:598) at org.apache.storm.StormSubmitter.validateConfs(StormSubmitter.java:564) at org.apache.storm.StormSubmitter.submitTopologyAs(StormSubmitter.java:210) at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:390) at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:162) at storm.starter.WordCountTopology.main(WordCountTopology.java:77) Command failed after 1 tries Tried stop all storm services and restart Error opening zip file or JAR manifest missing : /usr/hdp/current/storm-supervisor/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar lrwxrwxrwx. 1 storm storm 14 Oct 29 16:44 /usr/hdp/current/storm-nimbus/logs -> /var/log/storm [root@mst2-an05 ~]# cd /var/log/storm [root@mst2-an05 storm]# ll total 2012 -rw-r--r--. 1 storm hadoop 0 Oct 29 16:59 access-drpc.log -rw-r--r--. 1 storm hadoop 0 Oct 29 16:59 access-logviewer.log -rw-r--r--. 1 storm hadoop 0 Oct 29 16:59 access-ui.log -rw-r--r--. 1 storm hadoop 0 Oct 29 16:59 access-web-drpc.log -rw-r--r--. 1 storm hadoop 0 Oct 29 16:59 access-web-logviewer.log -rw-r--r--. 1 storm hadoop 93398 Oct 29 19:58 access-web-ui.log -rw-r--r--. 1 storm hadoop 2505 Oct 29 19:51 drpc.log -rw-r--r--. 1 storm hadoop 1836 Oct 29 19:51 drpc.out -rw-r--r--. 1 storm hadoop 2280 Oct 29 19:51 logviewer.log -rw-r--r--. 1 storm hadoop 1881 Oct 29 19:51 logviewer.out -rw-r--r--. 1 storm hadoop 2300 Oct 29 19:52 nimbus.out -rw-r--r--. 1 storm hadoop 2506 Oct 29 19:51 supervisor.out -rw-r--r--. 1 storm hadoop 1925972 Oct 29 19:52 ui.log -rw-r--r--. 1 storm hadoop 1854 Oct 29 19:51 ui.out [root@mst2-an05 storm]# cat nimbus.out Running: /usr/jdk64/jdk1.8.0_112/bin/java -server -Ddaemon.name=nimbus -Dstorm.options= -Dstorm.home=/usr/hdp/2.6.2.14-5/storm -Dstorm.log.dir=/var/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/hdp/current/storm-client/lib -Dstorm.conf.file= -cp /usr/hdp/2.6.2.14-5/storm/lib/asm-5.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/clojure-1.7.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/disruptor-3.3.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/kryo-3.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-api-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-core-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-slf4j-impl-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/minlog-1.3.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/objenesis-2.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/reflectasm-1.10.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/ring-cors-0.1.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/slf4j-api-1.7.21.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-core-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-rename-hack-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/zookeeper.jar:/usr/hdp/2.6.2.14-5/storm/lib/ambari-metrics-storm-sink.jar:/usr/hdp/2.6.2.14-5/storm/extlib/atlas-plugin-classloader-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib/storm-bridge-shim-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib-daemon/ojdbc6.jar:/usr/hdp/2.6.2.14-5/storm/extlib-daemon/ranger-plugin-classloader-0.7.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib-daemon/ranger-storm-plugin-shim-0.7.0.2.6.2.14-5.jar:/usr/hdp/current/storm-nimbus/conf -Xmx1024m -javaagent:/usr/hdp/current/storm-nimbus/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar=host=localhost,port=8649,wireformat31x=true,mode=multicast,config=/usr/hdp/current/storm-nimbus/contrib/storm-jmxetric/conf/jmxetric-conf.xml,process=Nimbus_JVM -Dlogfile.name=nimbus.log -DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector -Dlog4j.configurationFile=/usr/hdp/2.6.2.14-5/storm/log4j2/cluster.xml org.apache.storm.daemon.nimbus Error opening zip file or JAR manifest missing : /usr/hdp/current/storm-nimbus/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar Error occurred during initialization of VM agent library failed to init: instrument [root@mst2-an05 storm]# cat supervisor.out Running: /usr/jdk64/jdk1.8.0_112/bin/java -server -Ddaemon.name=supervisor -Dstorm.options= -Dstorm.home=/usr/hdp/2.6.2.14-5/storm -Dstorm.log.dir=/var/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/hdp/current/storm-client/lib -Dstorm.conf.file= -cp /usr/hdp/2.6.2.14-5/storm/lib/asm-5.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/clojure-1.7.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/disruptor-3.3.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/kryo-3.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-api-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-core-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-slf4j-impl-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/minlog-1.3.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/objenesis-2.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/reflectasm-1.10.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/ring-cors-0.1.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/slf4j-api-1.7.21.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-core-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-rename-hack-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/zookeeper.jar:/usr/hdp/2.6.2.14-5/storm/lib/ambari-metrics-storm-sink.jar:/usr/hdp/2.6.2.14-5/storm/extlib/atlas-plugin-classloader-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib/storm-bridge-shim-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib-daemon/ojdbc6.jar:/usr/hdp/2.6.2.14-5/storm/extlib-daemon/ranger-plugin-classloader-0.7.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib-daemon/ranger-storm-plugin-shim-0.7.0.2.6.2.14-5.jar:/usr/hdp/current/storm-supervisor/conf -Xmx256m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=56431 -javaagent:/usr/hdp/current/storm-supervisor/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar=host=localhost,port=8650,wireformat31x=true,mode=multicast,config=/usr/hdp/current/storm-supervisor/contrib/storm-jmxetric/conf/jmxetric-conf.xml,process=Supervisor_JVM -Dlogfile.name=supervisor.log -DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector -Dlog4j.configurationFile=/usr/hdp/2.6.2.14-5/storm/log4j2/cluster.xml org.apache.storm.daemon.supervisor.Supervisor Error opening zip file or JAR manifest missing : /usr/hdp/current/storm-supervisor/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar Error occurred during initialization of VM agent library failed to init: instrument

anandi · ‎09-27-2017

A Machine Learning Model learns from data. As you get new incremental data, the Machine Learning model needs to be upgraded. A Machine Learning Model factory ensures that as you have deployed model in production, continuous learning is also happening on incremental new data ingested in the Production environment. As deployed ML Model's performance decays, a new trained and serialized model needs to be deployed. An A/B test between the deployed model and the newly trained model can score them to evaluate the performance of the deployed model versus the incrementally trained model. In order to build a Machine Learning Model factory, we have to establish a robust road to production, first. The foundational framework is first to establish three environments: DEV, TEST and PROD. 1- DEV - A development environment where the Data Scientists have their own data puddle in order to perform data exploration, profile the data, develop the machine learning features from the data, build the model, train and test it on the limited subset and then commit to git to transport the code to the next stages. For the purpose of scaling and tuning the learning of the Machine Learning model, we establish a DEV Validation environment, where the model learning is scaled with as much historical data as possible and tuned. 2- TEST - The TEST environment is a pre-production environment where we running the machine learning models through integration tests and readying the move of the Machine Learning model to production in two branches: 2a - model deployment: where the trained serialized Machine Learning model is deployed in the production environment 2b - continuous training: where the Machine Learning model is going through continuous training on incremental data 3- PROD - The Production environment is where live data is ingested. In the production environment a deployment server is hosting the serialized trained model. The deployed model exposes a REST api to deliver predictions on live data queries. The ML model code is running in production ingesting incremental live data and getting continuously trained. The deployed model and the continuous training model performances are measured. If the deployed model is showing decay in prediction performance, then it is switched with a newer serialized version of the continuous training model. The model performance measure can be tracked by closing the loop with the users feedback and tracking True Positive, False Positive, True Negative and False Negative. This choreography of training and deploying machine learning models in production is the heart of the ML model factory. The road to production is depicting the journey of building Machine Learning models within the DEV/TEST/PROD environments.

anandi · ‎06-28-2017

Setting Up a Data Science Platform on HDP using Anaconda Building a Data Science Platform using Anaconda needs to be able to Launch PySpark jobs on the cluster Synchronize python libraries from vetted public repositories Isolate environments with specific dependencies to run production jobs using an older version of a package whilst simultaneously running new version of the package Launching notebooks and PySpark jobs using different kernels such as Python_2.7, Python_3.x, R, Scala Framework of the Data Science Platform Private Repo Server Edge Nodes Dev Test Prod Ansible Git Jenkins Building blocks of the Data Science Platform Anaconda Ansible Git Jenkins

anandi · ‎04-03-2017

Thanks for the comment Michael. I wrote these commands for hdp environments using standard python 2.7 where we can not do a pip install of snakebite. (i.e. hdp clusters are behind the firewall in secure zone with no pip download allowed)

anandi · ‎03-31-2017

Interacting with Hadoop HDFS using Python codes This post will go through the following: Introducing python “subprocess” module Running HDFS commands with Python Examples of HDFS commands from Python 1-Introducing python “subprocess” module The Python “subprocess” module allows us to: spawn new Unix processes connect to their input/output/error pipes obtain their return codes To run UNIX commands we need to create a subprocess that runs the command. The recommended approach to invoking subprocesses is to use the convenience functions for all use cases they can handle. Or we can use the underlying Popen interface can be used directly. 2-Running HDFS commands with Python We will create a Python function called run_cmd that will effectively allow us to run any unix or linux commands or in our case hdfs dfs commands as linux pipe capturing stdout and stderr and piping the input as list of arguments of the elements of the native unix or HDFS command. It is passed as a Python list rather than a string of characters as you don't have to parse or escape characters. # import the python subprocess module import subprocess def run_cmd(args_list): """ run linux commands """ # import subprocess print('Running system command: {0}'.format(' '.join(args_list))) proc = subprocess.Popen(args_list, stdout=subprocess.PIPE, stderr=subprocess.PIPE) s_output, s_err = proc.communicate() s_return = proc.returncode return s_return, s_output, s_err 3-Examples of HDFS commands from Python Run Hadoop ls command in Python (ret, out, err)= run_cmd(['hdfs', 'dfs', '-ls', 'hdfs_file_path']) lines = out.split('\n') Run Hadoop get command in Python (ret, out, err)= run_cmd(['hdfs', 'dfs', '-get', 'hdfs_file_path', 'local_path']) Run Hadoop put command in Python (ret, out, err)= run_cmd(['hdfs', 'dfs', '-put', 'local_file', 'hdfs_file_path']) Run Hadoop copyFromLocal command in Python (ret, out, err)= run_cmd(['hdfs', 'dfs', '-copyFromLocal', 'local_file', 'hdfs_file_path']) Run Hadoop copyToLocal command in Python (ret, out, err)= run_cmd(['hdfs', 'dfs', '-copyToLocal', 'hdfs_file_path', 'local_file']) hdfs dfs -rm -skipTrash /path/to/file/you/want/to/remove/permanently Run Hadoop remove file command in Python (ret, out, err)= run_cmd(['hdfs', 'dfs', '-rm', 'hdfs_file_path']) (ret, out, err)= run_cmd(['hdfs', 'dfs', '-rm', '-skipTrash', 'hdfs_file_path']) rm -r HDFS Command to remove the entire directory and all of its content from HDFS. Usage: hdfs dfs -rm -r <path> (ret, out, err)= run_cmd(['hdfs', 'dfs', '-rm', '-r', 'hdfs_file_path']) (ret, out, err)= run_cmd(['hdfs', 'dfs', '-rm', '-r', '-skipTrash', 'hdfs_file_path']) Check if a file exist in HDFS Usage: hadoop fs -test -[defsz] URI Options: -d: f the path is a directory, return 0. -e: if the path exists, return 0. -f: if the path is a file, return 0. -s: if the path is not empty, return 0. -z: if the file is zero length, return 0. Example: hadoop fs -test -e filename hdfs_file_path = '/tmpo' cmd = ['hdfs', 'dfs', '-test', '-e', hdfs_file_path] ret, out, err = run_cmd(cmd) print(ret, out, err) if ret: print('file does not exist') These simple but very powerful lines of code allow to interact with HDFS in a programmatic way and can be easily scheduled as part of schedule cron jobs.

anandi · ‎01-04-2017

@SBandaru If you are using spark with hdp, then you have to do following things. Add these entries in your $SPARK_HOME/conf/spark-defaults.conf spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041 (your installed HDP version) spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041 (your installed HDP version) create java-opts file in $SPARK_HOME/conf and add the installed HDP version in that file like -Dhdp.version=2.2.0.0-2041 (your installed HDP version) to know hdp version please run command hdp-select status hadoop-client in the cluster

anandi · ‎12-31-2016

Installing and Exploring Spark 2.0 with Jupyter Notebook and Anaconda Python in your laptop 1-Objective 2-Installing Anaconda Python 3-Checking Python Install 4-Installing Spark 5-Checking Spark Install 6-Launching Jupyter Notebook with PySpark 2.0.2 7-Exploring PySpark 2.0.2 a.Spark Session b.Read CSV i.Spark 2.0 and Spark 1.6 ii.Pandas c.Pandas DataFrames, Spark DataSets, DataFrames and RDDs d.Machine Learning Pipeline i.SciKit Learn ii.Spark MLLib, ML 8-Conclusion 1-Objective It is often useful to have python with the Jupyter notebook installed on your laptop in order to quickly develop and test some code ideas or to explore some data. Adding the ability to combine Apache Spark to this will also allow you to prototype ideas and exploratory data pipelines before hitting a Hadoop cluster and paying for Amazon Web Services. We leverage the power of the Python ecosystem with libraries such as Numpy (scientific computing library of high-level mathematical functions to operate on arrays and matrices), SciPy (SciPy library depends on NumPy, which provides convenient and fast N-dimensional array manipulation), Pandas (high performance data structure and data analysis library to build complex data transformation flows), Scikit-Learn (library that implements a range of machine learning, preprocessing, cross-validation and visualization algorithms), NLTK (Natural Language Tool Kit to process text data, libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries)… We also leverage the strengths of Spark including Spark-SQL, Spark-MLLib or ML. 2-Installing Anaconda Python We install Continuum’s Anaconda distribution by downloading the install script from the Continuum website. https://www.continuum.io/downloads The advantage of the Anaconda distribution is that lot of the essential python packages comes in bundled. You do not have to struggle with all the dependencies synchronization. We will use the following commands to download the install script. The command is to install Python version 3.5 HW12256:~ usr000$ wget http://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh If you wish to install Python 2.7, the following download is recommended. HW12256:~ usr000$ wget http://repo.continuum.io/archive/Anaconda2-4.2.0-Linux-x86_64.sh Accordingly, in the terminal, issue the following bash command to launch the install. Python 3.5 version HW12256:~ usr000$ bash Anaconda3-4.2.0-Linux-x86_64.sh Python 2.7 version HW12256:~ usr000$ bash Anaconda2-4.2.0-Linux-x86_64.sh In the following steps, we are using Python 3.5 as the base environment. 3-Checking Python Install In order to check the Python install, we issue the following commands in the terminal. HW12256:~ usr000$ which python /Users/usr000/anaconda/bin/python HW12256:~ usr000$ echo $PATH /Users/usr000/anaconda/bin:/usr/local/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin HW12256:~ usr000$ python --version Python 3.5.2 :: Anaconda 4.1.1 (x86_64) HW12256:~ usr000$ python Python 3.5.2 |Anaconda 4.1.1 (x86_64)| (default, Jul 2 2016, 17:52:12)[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwinType "help", "copyright", "credits" or "license" for more information. >>> import sys >>> print("Python version: {} ".format(sys.version))Python version: 3.5.2 |Anaconda 4.1.1 (x86_64)| (default, Jul 2 2016, 17:52:12)[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] >>> from datetime import datetime >>> print('current date and time: {}'.format(datetime.now()))current date and time: 2016-12-29 09:46:32.393985 >>> print('current date and time:{}'.format(datetime.now().strftime('%Y-%m-%d %H:%M:%S')))current date and time: 2016-12-29 09:51:33 >>> exit() Anaconda Python includes a package manager called ‘conda’ which can list and update the existing libraries available in the current system. HW12256:~ usr000$ conda info Current conda install: platform : osx-64 conda version : 4.2.12 conda is private : False conda-env version : 4.2.12 conda-build version : 0+unknown python version : 3.5.2.final.0 requests version : 2.10.0 root environment : /Users/usr000/anaconda (writable) default environment : /Users/usr000/anaconda envs directories : /Users/usr000/anaconda/envs package cache : /Users/usr000/anaconda/pkgs channel URLs : https://repo.continuum.io/pkgs/free/osx-64 https://repo.continuum.io/pkgs/free/noarch https://repo.continuum.io/pkgs/pro/osx-64 https://repo.continuum.io/pkgs/pro/noarch config file : None offline mode : False HW12256:~ usr000$ conda list 4-Installing Spark To install Spark, we download the pre-built spark tarball spark-2.0.2-bin-hadoop2.7.tgz from http://spark.apache.org/downloads.html and move to your target Spark directory. Untar the tarball in your chosen directory HW12256:bin usr000$ tar -xvfz spark-2.0.2-bin-hadoop2.7.tgz Create symlink to spark2 directory HW12256:bin usr000$ ln -s ~/bin/sparks/spark-2.0.2-bin-hadoop2.7 ~/bin/spark2 5-Checking Spark Install Check the directories created under Spark 2 HW12256:bin usr000$ ls -lru total 16drwxr-xr-x 5 usr000 staff 170 Dec 28 10:39 sparkslrwxr-xr-x 1 usr000 staff 50 Dec 28 10:39 spark2 -> /Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7lrwxr-xr-x 1 usr000 staff 51 May 23 2016 spark -> /Users/usr000/bin/sparks/spark-1.6.1-bin-hadoop2.6/HW12256:bin usr000$ cd spark2HW12256:spark2 usr000$ ls -lrutotal 112drwxr-xr-x@ 3 usr000 staff 102 Jan 1 1970 yarndrwxr-xr-x@ 24 usr000 staff 816 Jan 1 1970 sbindrwxr-xr-x@ 10 usr000 staff 340 Dec 28 10:30 pythondrwxr-xr-x@ 38 usr000 staff 1292 Jan 1 1970 licensesdrwxr-xr-x@ 208 usr000 staff 7072 Dec 28 10:30 jarsdrwxr-xr-x@ 4 usr000 staff 136 Jan 1 1970 examplesdrwxr-xr-x@ 5 usr000 staff 170 Jan 1 1970 datadrwxr-xr-x@ 9 usr000 staff 306 Dec 28 10:27 confdrwxr-xr-x@ 24 usr000 staff 816 Dec 28 10:30 bin-rw-r--r--@ 1 usr000 staff 120 Dec 28 10:25 RELEASE-rw-r--r--@ 1 usr000 staff 3828 Dec 28 10:25 README.mddrwxr-xr-x@ 3 usr000 staff 102 Jan 1 1970 R-rw-r--r--@ 1 usr000 staff 24749 Dec 28 10:25 NOTICE-rw-r--r--@ 1 usr000 staff 17811 Dec 28 10:25 LICENSEHW12256:spark2 usr000$ Running SparkPi example in local mode. Scala command # export SPARK_HOMEHW12256:spark2 usr000$ export SPARK_HOME=/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7HW12256:spark2 usr000$ echo $SPARK_HOME/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7# Run Spark PI example in ScalaHW12256:spark2 usr000$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --driver-memory 512m --executor-memory 512m --executor-cores 1 $SPARK_HOME/examples/jars/spark-examples*.jar 5Python commandHW12256:spark2 usr000$ ./bin/spark-submit --driver-memory 512m --executor-memory 512m --executor-cores 1 examples/src/main/python/pi.py 10Scala exampleHW12256:spark2 usr000$ export SPARK_HOME=/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7HW12256:spark2 usr000$ echo $SPARK_HOME/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7HW12256:spark2 usr000$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --driver-memory 512m --executor-memory 512m --executor-cores 1 $SPARK_HOME/examples/jars/spark-examples*.jar 5Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties16/12/29 11:40:53 INFO SparkContext: Running Spark version 2.0.216/12/29 11:40:53 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable...16/12/29 11:40:55 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 0.851288 sPi is roughly 3.139094278188556216/12/29 11:40:55 INFO SparkUI: Stopped Spark web UI at http://000.000.0.0:4040...16/12/29 11:40:55 INFO SparkContext: Successfully stopped SparkContext16/12/29 11:40:55 INFO ShutdownHookManager: Shutdown hook called16/12/29 11:40:55 INFO ShutdownHookManager: Deleting directory /private/var/folders/1r/8qylt4bj4h59b3h_1xq_nsw00000gp/T/spark-35b67f21-1d52-4dee-9c75-7e9d9c153adaHW12256:spark2 usr000$ Python example HW12256:spark2 usr000$ ./bin/spark-submit examples/src/main/python/pi.py 10 Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties16/12/29 11:27:33 INFO SparkContext: Running Spark version 2.0.216/12/29 11:27:33 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable...16/12/29 11:27:36 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool16/12/29 11:27:36 INFO DAGScheduler: Job 0 finished: reduce at /Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7/examples/src/main/python/pi.py:43, took 1.199257 sPi is roughly 3.13836016/12/29 11:27:36 INFO SparkUI: Stopped Spark web UI at http://http://000.000.0.0:404016/12/29 11:27:36 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!...16/12/29 11:27:36 INFO SparkContext: Successfully stopped SparkContext16/12/29 11:27:37 INFO ShutdownHookManager: Shutdown hook called16/12/29 11:27:37 INFO ShutdownHookManager: Deleting directory /private/var/folders/1r/8qylt4bj4h59b3h_1xq_nsw00000gp/T/spark-eb12faa9-b7ff-4556-9538-45ddcdc6797b16/12/29 11:27:37 INFO ShutdownHookManager: Deleting directory /private/var/folders/1r/8qylt4bj4h59b3h_1xq_nsw00000gp/T/spark-eb12faa9-b7ff-4556-9538-45ddcdc6797b/pyspark-ba9947c5-dbea-4edc-9c4c-c2c316e6caba Wordcount program using PySpark HW12256:spark2 usr000$ ./bin/pyspark Python 2.7.10 (default, Jul 30 2016, 19:40:32)[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwinType "help", "copyright", "credits" or "license" for more information.Using Spark's default log4j profile: org/apache/spark/log4j-defaults.propertiesSetting default log level to "WARN".To adjust logging level use sc.setLogLevel(newLevel).16/12/29 12:25:15 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicableWelcome to ____ __ / __/__ ___ _____/ /__ _\ \/ _ \/ _ `/ __/ '_/ /__ / .__/\_,_/_/ /_/\_\ version 2.0.2 /_/Using Python version 2.7.10 (default, Jul 30 2016 19:40:32)SparkSession available as 'spark'.>>> import os>>> print(os.getcwd())/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7>>> import re>>> from operator import add>>> wordcounts_in = sc.textFile('README.md').flatMap(lambda l: re.split('\W+', l.strip())).filter(lambda w: len(w)>0).map(lambda w: (w,1)).reduceByKey(add).map(lambda (a,b): (b,a)).sortByKey(ascending = False)/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/shuffle.py:58: UserWarning: Please install psutil to have better support with spilling/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/shuffle.py:58: UserWarning: Please install psutil to have better support with spilling>>> wordcounts_in.take(10)[(23, u'the'), (18, u'Spark'), (14, u'to'), (13, u'run'), (11, u'for'), (11, u'apache'), (11, u'spark'), (11, u'and'), (11, u'org'), (8, u'a')]>>> wordcounts_in = sc.textFile('README.md').flatMap(lambda l: re.split('\W+', l.strip())).filter(lambda w: len(w)>0).map(lambda w: (w,1)).reduceByKey(add).map(lambda (a,b): (b,a)).sortByKey(ascending = False).map(lambda (a,b): (b,a))/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/shuffle.py:58: UserWarning: Please install psutil to have better support with spilling/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/shuffle.py:58: UserWarning: Please install psutil to have better support with spilling>>> wordcounts_in.take(10) [(u'the', 23), (u'Spark', 18), (u'to', 14), (u'run', 13), (u'for', 11), (u'apache', 11), (u'spark', 11), (u'and', 11), (u'org', 11), (u'a', 8)]>>>exit() 6-Launching Jupyter Notebook with PySpark Launching Jupyter Notebook with Spark 1.6.*, we use to associate the --packages com.databricks:spark-csv_2.11:1.4.0 parameter in the command as the csv package was not natively part of Spark. HW12256:~ usr000$ PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS='notebook' PYSPARK_PYTHON=python3 /Users/usr000/bin/spark/bin/pyspark --packages com.databricks:spark-csv_2.11:1.4.0 In the case of Spark 2.0.*, we do not need to associate the spark-csv –packages parameter, as spark-csv is part of the standard Spark 2.0 library. HW12256:~ usr000$ PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS='notebook' PYSPARK_PYTHON=python3 /Users/usr000/bin/spark2/bin/pyspark 7-Exploring PySpark 2.0.2 We will explore the new features of Spark 2.0.2 using PySpark and contrasting where appropriate with previous version of spark and with pandas. In the case of the machine learning pipeline, we will contract Spark MLLib or ML with Scikit Learn. a.Spark Session Spark 2.0 introduces SparkSession. SparkSession is the single entry point for interacting with Spark functionality. It replaces and encapsulates the SQLContext, HiveContext and StreamingContext for a more unified access to the DataFrame and Dataset APIs. The SQLContext, HiveContext and StreamingContext still exist under the hood in Spark 2.0 for continuity purpose with the Spark legacy code. The Spark session has to be created when using spark-submit command. An example on how to do that: from pyspark.sql import SparkSession from pyspark import SparkContext from pyspark import SparkConf # from pyspark.sql import SQLContext spark = SparkSession\ .builder\ .appName("example-spark")\ .config("spark.sql.crossJoin.enabled","true")\ .getOrCreate()sc = SparkContext() # sqlContext = SQLContext(sc) When typing ‘pyspark’ at the terminal, python automatically creates the spark context sc. A SparkSession is automatically generated and available as 'spark'. Application name can be accessed using SparkContext. spark.sparkContext.appName# Configuration is accessible using RuntimeConfig:from py4j.protocol import Py4JErrortry: spark.conf.get("some.conf")except Py4JError as e: pass The following code outline the available spark context sc as well as the new spark session under the name "spark" which includes the previous sqlContext, HiveContext, StreamingContext under one unified single entry point. sqlContext, HiveContext, StreamingContext still exist to ensure continuity with legacy code in Spark. HW12256:spark2 usr000$ ./bin/pyspark Python 2.7.10 (default, Jul 30 2016, 19:40:32)[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwinType "help", "copyright", "credits" or "license" for more information.Using Spark's default log4j profile: org/apache/spark/log4j-defaults.propertiesSetting default log level to "WARN".To adjust logging level use sc.setLogLevel(newLevel).16/12/29 20:41:27 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicableWelcome to ____ __ / __/__ ___ _____/ /__ _\ \/ _ \/ _ `/ __/ '_/ /__ / .__/\_,_/_/ /_/\_\ version 2.0.2 /_/Using Python version 2.7.10 (default, Jul 30 2016 19:40:32)SparkSession available as 'spark'.>>> sc<pyspark.context.SparkContext object at 0x101e9c850>>>> sc._conf.getAll()[(u'spark.app.id', u'local-1483040488671'), (u'spark.sql.catalogImplementation', u'hive'), (u'spark.rdd.compress', u'True'), (u'spark.serializer.objectStreamReset', u'100'), (u'spark.master', u'local[*]'), (u'spark.executor.id', u'driver'), (u'spark.submit.deployMode', u'client'), (u'hive.metastore.warehouse.dir', u'file:/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7/spark-warehouse'), (u'spark.driver.port', u'57764'), (u'spark.app.name', u'PySparkShell'), (u'spark.driver.host', u'000.000.0.0')]>>> spark<pyspark.sql.session.SparkSession object at 0x102df9b50>>>> spark.sparkContext<pyspark.context.SparkContext object at 0x101e9c850>>>> spark.sparkContext.appNameu'PySparkShell'>>> from pyspark.sql.functions import *>>> spark.range(1, 7, 2).collect()16/12/29 20:58:32 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.016/12/29 20:58:32 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException[Row(id=1), Row(id=3), Row(id=5)] b.Read CSV We describe how to easily access csv files from spark and from pandas and load them into dataframe for data exploration, maniputation and mining. i.Spark 2.0 & Spark 1.6 We can create a spark dataframe directly from reading the csv file. In order to be compatible with previous format we have include a conditional switch in the format statement ## Spark 2.0 and Spark 1.6 compatible read csv#formatPackage = "csv" if sc.version > '1.6' else "com.databricks.spark.csv"df = sqlContext.read.format(formatPackage).options(header='true', delimiter = '|').load("s00_dat/dataframe_sample.csv")df.printSchema() ii.Pandas We can create the iris pandas dataframe from the existing dataset from sklearn. from sklearn.datasets import load_irisimport numpy as npimport pandas as pdimport matplotlib.pyplot as pltiris = load_iris()df = pd.DataFrame(iris.data, columns=iris.feature_names)df['species'] = pd.Categorical.from_codes(iris.target, iris.target_names) c.Dataframes i.Pandas DataFrames Pandas dataframes in conjunction with visualization libraries such as matplotlib and seaborn give us some nice insights into the data ii.Spark DataSets, Spark DataFrames and Spark RDDs Spark Dataframe and Spark RDDs are the fundamental data structure that allow us to manipulate and interact with the various Spark libraries. Spark DataSets are more relevant for Scala developpers and give the ability to create typed spark dataframe. d.Machine Learning i.SciKit Learn We demonstrate a random forest machine learning pipeline using scikit learn in the ipython notebook. ii.Spark MLLib, Spark ML We demonstrate a random forest machine learning pipeline using Spark MLlib and Spark ML 8-Conclusion Spark and Jupyter Notebook using the Anaconda Python distribution provide a very powerful development environment in your laptop. It allows quick exploration of data mining, machine learning, visualizations in a flexible and easy to use environment. We have described the installation of Jupyter Notebook, Spark. We have described few data processing pipeline as well as a machine learning classification using Random Forest.

anandi · ‎09-18-2016

Hi Mike, follow the following steps: 1- in the CLI where spark is installed, first export Hadoop conf export HADOOP_CONF_DIR= ~/etc/hadoop/conf (you may want to put it in your spark conf file: export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/etc/hadoop/conf}) 2- launch spark-shell val input = sc.textFile("hdfs:///....insert/your/hdfs/file/path...") input.count() //prints the nr of lines read ...

Online	Offline
Last Visited	‎03-12-2019 01:10 PM

Member Since	‎04-11-2016 09:25 AM
Last Visited	‎03-12-2019 01:10 PM
Posts	38
Kudos received	14

Cloudera Community

Re: Spark Job Failing "Could not find or load main...

Re: How to SFTP a generated file in spark 1.4.1

Re: How to check a correct install of spark? ( Whe...

Re: Sandbox HDP 2.5.0 - Spark 1.6.2 - Issues: GPLN...

Re: Sandbox HDP-2.5.0 Spark 2.0.0 - Spark Submit Y...

Re: Complex Json transformation using Hive functio...

Re: How To Install ELK Stack (6.3.2) in Ambari

Storm - Supervisor and Nimbus dropping immediately...

Machine Learning Model Factory and Road To Product...

Setting Up a Data Science Platform on HDP using An...

Re: Interacting with Hadoop HDFS using Python code...

Interacting with Hadoop HDFS using Python codes

Re: Spark Job Failing "Could not find or load main...

Installing and Exploring Spark 2.0 with Jupyter No...

Re: How to access file in HDFS from Spark-shell or...