Member since: 04-11-2016
Posts: 38
Kudos Received: 13
Solutions: 5
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 50000 | 01-04-2017 11:43 PM
 | 4088 | 09-05-2016 04:07 PM
 | 10664 | 09-05-2016 03:50 PM
 | 2447 | 08-30-2016 08:15 PM
 | 4055 | 08-30-2016 01:01 PM
11-21-2018
08:31 PM
Hey Jasper, great article! Thanks for sharing. Would you recommend moving from Hive to Spark? What about a similar article using Spark? 😉
09-01-2018
09:06 PM
@Steve Matison Thank you for posting the ES MPACK. Would you be able to share how to build our own custom MPACK? Maybe a follow-on blog. Cheers, Amit
10-29-2017
08:19 PM
Storm - Supervisor and Nimbus dropping immediately after start - please advise on remediation. Thanks.
HDP install on OpenStack - CentOS 7.2 with Ambari 2.5.2.0: HDP-2.6.2.14 - Storm 1.1.0.
Storm fails to start at install time with the following log (stderr: /var/lib/ambari-agent/data/errors-238.txt):
stderr:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/STORM/0.9.1/package/scripts/service_check.py", line 79, in <module>
ServiceCheck().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 329, in execute
method(env)
File "/var/lib/ambari-agent/cache/common-services/STORM/0.9.1/package/scripts/service_check.py", line 70, in service_check
user=params.storm_user
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 166, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run
tries=self.resource.tries, try_sleep=self.resource.try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'storm jar /tmp/wordCount.jar storm.starter.WordCountTopology WordCountid1aaca8ef_date022917' returned 1. Running: /usr/jdk64/jdk1.8.0_112/bin/java -server -Ddaemon.name= -Dstorm.options= -Dstorm.home=/usr/hdp/2.6.2.14-5/storm -Dstorm.log.dir=/var/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= -cp /usr/hdp/2.6.2.14-5/storm/lib/asm-5.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/clojure-1.7.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/disruptor-3.3.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/kryo-3.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-api-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-core-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-slf4j-impl-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/minlog-1.3.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/objenesis-2.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/reflectasm-1.10.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/ring-cors-0.1.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/slf4j-api-1.7.21.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-core-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-rename-hack-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/zookeeper.jar:/usr/hdp/2.6.2.14-5/storm/lib/ambari-metrics-storm-sink.jar:/usr/hdp/2.6.2.14-5/storm/extlib/atlas-plugin-classloader-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib/storm-bridge-shim-0.8.0.2.6.2.14-5.jar org.apache.storm.daemon.ClientJarTransformerRunner org.apache.storm.hack.StormShadeTransformer /tmp/wordCount.jar /tmp/ea59a668bcca11e7ae97fa163eb0f425.jar
1330 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseBasicBolt to org/apache/storm/topology/base/BaseBasicBolt in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace
1337 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/tuple/Tuple to org/apache/storm/tuple/Tuple in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace
1338 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/BasicOutputCollector to org/apache/storm/topology/BasicOutputCollector in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace
1338 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/tuple/Values to org/apache/storm/tuple/Values in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace
1339 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/OutputFieldsDeclarer to org/apache/storm/topology/OutputFieldsDeclarer in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace
1339 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/tuple/Fields to org/apache/storm/tuple/Fields in storm/starter/BasicDRPCTopology$ExclaimBolt.class. please modify your code to use the new namespace
1340 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/drpc/LinearDRPCTopologyBuilder to org/apache/storm/drpc/LinearDRPCTopologyBuilder in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace
1341 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/IBasicBolt to org/apache/storm/topology/IBasicBolt in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace
1341 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/drpc/LinearDRPCInputDeclarer to org/apache/storm/drpc/LinearDRPCInputDeclarer in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace
1341 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/Config to org/apache/storm/Config in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace
1343 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/LocalDRPC to org/apache/storm/LocalDRPC in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace
1343 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/LocalCluster to org/apache/storm/LocalCluster in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace
1344 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/ILocalDRPC to org/apache/storm/ILocalDRPC in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace
1344 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/generated/StormTopology to org/apache/storm/generated/StormTopology in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace
1345 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/StormSubmitter to org/apache/storm/StormSubmitter in storm/starter/BasicDRPCTopology.class. please modify your code to use the new namespace
1353 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseRichBolt to org/apache/storm/topology/base/BaseRichBolt in storm/starter/bolt/RollingCountBolt.class. please modify your code to use the new namespace
1354 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/OutputCollector to org/apache/storm/task/OutputCollector in storm/starter/bolt/RollingCountBolt.class. please modify your code to use the new namespace
1354 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/TopologyContext to org/apache/storm/task/TopologyContext in storm/starter/bolt/RollingCountBolt.class. please modify your code to use the new namespace
1358 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/utils/TimeCacheMap$ExpiredCallback to org/apache/storm/utils/TimeCacheMap$ExpiredCallback in storm/starter/bolt/SingleJoinBolt$ExpireCallback.class. please modify your code to use the new namespace
1358 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/generated/GlobalStreamId to org/apache/storm/generated/GlobalStreamId in storm/starter/bolt/SingleJoinBolt$ExpireCallback.class. please modify your code to use the new namespace
1358 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/utils/TimeCacheMap to org/apache/storm/utils/TimeCacheMap in storm/starter/bolt/SingleJoinBolt$ExpireCallback.class. please modify your code to use the new namespace
1374 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/spout/ISpout to org/apache/storm/spout/ISpout in storm/starter/clj/word_count$sentence_spout__$fn$reify__23.class. please modify your code to use the new namespace
1379 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/IBolt to org/apache/storm/task/IBolt in storm/starter/clj/word_count$split_sentence__$fn$reify__42.class. please modify your code to use the new namespace
1454 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/TopologyBuilder to org/apache/storm/topology/TopologyBuilder in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace
1454 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/testing/TestWordSpout to org/apache/storm/testing/TestWordSpout in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace
1455 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/IRichSpout to org/apache/storm/topology/IRichSpout in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace
1455 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/SpoutDeclarer to org/apache/storm/topology/SpoutDeclarer in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace
1455 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/IRichBolt to org/apache/storm/topology/IRichBolt in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace
1456 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/BoltDeclarer to org/apache/storm/topology/BoltDeclarer in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace
1456 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/InputDeclarer to org/apache/storm/topology/InputDeclarer in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace
1456 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/utils/Utils to org/apache/storm/utils/Utils in storm/starter/ExclamationTopology.class. please modify your code to use the new namespace
1458 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/drpc/DRPCSpout to org/apache/storm/drpc/DRPCSpout in storm/starter/ManualDRPC.class. please modify your code to use the new namespace
1458 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/drpc/ReturnResults to org/apache/storm/drpc/ReturnResults in storm/starter/ManualDRPC.class. please modify your code to use the new namespace
1460 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseBatchBolt to org/apache/storm/topology/base/BaseBatchBolt in storm/starter/ReachTopology$CountAggregator.class. please modify your code to use the new namespace
1460 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/coordination/BatchOutputCollector to org/apache/storm/coordination/BatchOutputCollector in storm/starter/ReachTopology$CountAggregator.class. please modify your code to use the new namespace
1464 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/coordination/IBatchBolt to org/apache/storm/coordination/IBatchBolt in storm/starter/ReachTopology.class. please modify your code to use the new namespace
1467 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/testing/FeederSpout to org/apache/storm/testing/FeederSpout in storm/starter/SingleJoinExample.class. please modify your code to use the new namespace
1469 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseRichSpout to org/apache/storm/topology/base/BaseRichSpout in storm/starter/spout/RandomSentenceSpout.class. please modify your code to use the new namespace
1469 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/spout/SpoutOutputCollector to org/apache/storm/spout/SpoutOutputCollector in storm/starter/spout/RandomSentenceSpout.class. please modify your code to use the new namespace
1470 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/utils/Time to org/apache/storm/utils/Time in storm/starter/tools/NthLastModifiedTimeTracker.class. please modify your code to use the new namespace
1480 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/topology/base/BaseTransactionalBolt to org/apache/storm/topology/base/BaseTransactionalBolt in storm/starter/TransactionalGlobalCount$UpdateGlobalCount.class. please modify your code to use the new namespace
1480 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/transactional/ICommitter to org/apache/storm/transactional/ICommitter in storm/starter/TransactionalGlobalCount$UpdateGlobalCount.class. please modify your code to use the new namespace
1480 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/transactional/TransactionAttempt to org/apache/storm/transactional/TransactionAttempt in storm/starter/TransactionalGlobalCount$UpdateGlobalCount.class. please modify your code to use the new namespace
1482 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/testing/MemoryTransactionalSpout to org/apache/storm/testing/MemoryTransactionalSpout in storm/starter/TransactionalGlobalCount.class. please modify your code to use the new namespace
1482 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/transactional/TransactionalTopologyBuilder to org/apache/storm/transactional/TransactionalTopologyBuilder in storm/starter/TransactionalGlobalCount.class. please modify your code to use the new namespace
1482 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/transactional/partitioned/IPartitionedTransactionalSpout to org/apache/storm/transactional/partitioned/IPartitionedTransactionalSpout in storm/starter/TransactionalGlobalCount.class. please modify your code to use the new namespace
1491 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/BaseFunction to org/apache/storm/trident/operation/BaseFunction in storm/starter/trident/TridentReach$ExpandList.class. please modify your code to use the new namespace
1491 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/tuple/TridentTuple to org/apache/storm/trident/tuple/TridentTuple in storm/starter/trident/TridentReach$ExpandList.class. please modify your code to use the new namespace
1492 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/TridentCollector to org/apache/storm/trident/operation/TridentCollector in storm/starter/trident/TridentReach$ExpandList.class. please modify your code to use the new namespace
1492 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/CombinerAggregator to org/apache/storm/trident/operation/CombinerAggregator in storm/starter/trident/TridentReach$One.class. please modify your code to use the new namespace
1493 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/StateFactory to org/apache/storm/trident/state/StateFactory in storm/starter/trident/TridentReach$StaticSingleKeyMapState$Factory.class. please modify your code to use the new namespace
1493 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/IMetricsContext to org/apache/storm/task/IMetricsContext in storm/starter/trident/TridentReach$StaticSingleKeyMapState$Factory.class. please modify your code to use the new namespace
1493 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/State to org/apache/storm/trident/state/State in storm/starter/trident/TridentReach$StaticSingleKeyMapState$Factory.class. please modify your code to use the new namespace
1494 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/ReadOnlyState to org/apache/storm/trident/state/ReadOnlyState in storm/starter/trident/TridentReach$StaticSingleKeyMapState.class. please modify your code to use the new namespace
1494 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/map/ReadOnlyMapState to org/apache/storm/trident/state/map/ReadOnlyMapState in storm/starter/trident/TridentReach$StaticSingleKeyMapState.class. please modify your code to use the new namespace
1495 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/TridentTopology to org/apache/storm/trident/TridentTopology in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace
1495 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/TridentState to org/apache/storm/trident/TridentState in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace
1495 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/Stream to org/apache/storm/trident/Stream in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace
1495 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/builtin/MapGet to org/apache/storm/trident/operation/builtin/MapGet in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace
1496 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/state/QueryFunction to org/apache/storm/trident/state/QueryFunction in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace
1496 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/Function to org/apache/storm/trident/operation/Function in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace
1496 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/fluent/GroupedStream to org/apache/storm/trident/fluent/GroupedStream in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace
1497 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/builtin/Sum to org/apache/storm/trident/operation/builtin/Sum in storm/starter/trident/TridentReach.class. please modify your code to use the new namespace
1499 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/testing/MemoryMapState$Factory to org/apache/storm/trident/testing/MemoryMapState$Factory in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace
1499 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/testing/MemoryMapState to org/apache/storm/trident/testing/MemoryMapState in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace
1499 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/testing/FixedBatchSpout to org/apache/storm/trident/testing/FixedBatchSpout in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace
1499 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/spout/IBatchSpout to org/apache/storm/trident/spout/IBatchSpout in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace
1500 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/builtin/Count to org/apache/storm/trident/operation/builtin/Count in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace
1500 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/builtin/FilterNull to org/apache/storm/trident/operation/builtin/FilterNull in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace
1501 [main] WARN o.a.s.h.DefaultShader - Relocating storm/trident/operation/Filter to org/apache/storm/trident/operation/Filter in storm/starter/trident/TridentWordCount.class. please modify your code to use the new namespace
1503 [main] WARN o.a.s.h.DefaultShader - Relocating backtype/storm/task/ShellBolt to org/apache/storm/task/ShellBolt in storm/starter/WordCountTopology$SplitSentence.class. please modify your code to use the new namespace
Running: /usr/jdk64/jdk1.8.0_112/bin/java -Ddaemon.name= -Dstorm.options= -Dstorm.home=/usr/hdp/2.6.2.14-5/storm -Dstorm.log.dir=/var/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/hdp/current/storm-client/lib -Dstorm.conf.file= -cp /usr/hdp/2.6.2.14-5/storm/lib/asm-5.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/clojure-1.7.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/disruptor-3.3.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/kryo-3.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-api-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-core-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-slf4j-impl-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/minlog-1.3.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/objenesis-2.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/reflectasm-1.10.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/ring-cors-0.1.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/slf4j-api-1.7.21.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-core-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-rename-hack-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/zookeeper.jar:/usr/hdp/2.6.2.14-5/storm/lib/ambari-metrics-storm-sink.jar:/usr/hdp/2.6.2.14-5/storm/extlib/atlas-plugin-classloader-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib/storm-bridge-shim-0.8.0.2.6.2.14-5.jar:/tmp/ea59a668bcca11e7ae97fa163eb0f425.jar:/usr/hdp/current/storm-supervisor/conf:/usr/hdp/2.6.2.14-5/storm/bin -Dstorm.jar=/tmp/ea59a668bcca11e7ae97fa163eb0f425.jar -Dstorm.dependency.jars= -Dstorm.dependency.artifacts={} storm.starter.WordCountTopology WordCountid1aaca8ef_date022917
948 [main] INFO o.a.s.StormSubmitter - Generated ZooKeeper secret payload for MD5-digest: -6085484404615721045:-8414510412923341525
1095 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2001ms (NOT MAX)
3099 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2002ms (NOT MAX)
5103 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2005ms (NOT MAX)
7110 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2013ms (NOT MAX)
9125 [main] WARN o.a.s.u.StormBoundedExponentialBackoffRetry - WILL SLEEP FOR 2016ms (NOT MAX)
11144 [main] WARN o.a.s.u.NimbusClient - Ignoring exception while trying to get leader nimbus info from mst2-an05.field.hortonworks.com. will retry with a different seed host.
java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:108) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.security.auth.ThriftClient.<init>(ThriftClient.java:69) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.utils.NimbusClient.<init>(NimbusClient.java:128) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:84) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:58) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.StormSubmitter.getListOfKeysFromBlobStore(StormSubmitter.java:598) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.StormSubmitter.validateConfs(StormSubmitter.java:564) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.StormSubmitter.submitTopologyAs(StormSubmitter.java:210) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:390) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:162) [storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at storm.starter.WordCountTopology.main(WordCountTopology.java:77) [ea59a668bcca11e7ae97fa163eb0f425.jar:?]
Caused by: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.storm.security.auth.TBackoffConnect.retryNext(TBackoffConnect.java:64) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:56) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
... 11 more
Caused by: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:226) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
... 11 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_112]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_112]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_112]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_112]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_112]
at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_112]
at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:221) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.0.2.6.2.14-5.jar:1.1.0.2.6.2.14-5]
... 11 more
Exception in thread "main" org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [mst2-an05.field.hortonworks.com]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:112)
at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:58)
at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268)
at org.apache.storm.StormSubmitter.getListOfKeysFromBlobStore(StormSubmitter.java:598)
at org.apache.storm.StormSubmitter.validateConfs(StormSubmitter.java:564)
at org.apache.storm.StormSubmitter.submitTopologyAs(StormSubmitter.java:210)
at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:390)
at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:162)
at storm.starter.WordCountTopology.main(WordCountTopology.java:77)
stdout:
2017-10-29 17:02:12,776 - Stack Feature Version Info: Cluster Stack=2.6, Cluster Current Version=None, Command Stack=None, Command Version=2.6.2.14-5 -> 2.6.2.14-5
2017-10-29 17:02:12,815 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2017-10-29 17:02:12,817 - checked_call['hostid'] {}
2017-10-29 17:02:12,827 - checked_call returned (0, '1aaca8ef')
2017-10-29 17:02:12,827 - File['/tmp/wordCount.jar'] {'owner': 'storm', 'content': StaticFile('wordCount.jar')}
2017-10-29 17:02:12,831 - Writing File['/tmp/wordCount.jar'] because it doesn't exist
2017-10-29 17:02:12,833 - Changing owner for /tmp/wordCount.jar from 0 to storm
2017-10-29 17:02:12,833 - Execute['storm jar /tmp/wordCount.jar storm.starter.WordCountTopology WordCountid1aaca8ef_date022917'] {'logoutput': True, 'path': [u'/usr/hdp/current/storm-client/bin'], 'user': 'storm'}
(stdout then repeats the same "Running:" command, DefaultShader relocation warnings, and NimbusLeaderNotFoundException stack trace already shown in stderr above)
Command failed after 1 tries
Tried stopping all Storm services and restarting. The daemons fail with:
Error opening zip file or JAR manifest missing : /usr/hdp/current/storm-supervisor/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar
lrwxrwxrwx. 1 storm storm 14 Oct 29 16:44 /usr/hdp/current/storm-nimbus/logs -> /var/log/storm
[root@mst2-an05 ~]# cd /var/log/storm
[root@mst2-an05 storm]# ll
total 2012
-rw-r--r--. 1 storm hadoop 0 Oct 29 16:59 access-drpc.log
-rw-r--r--. 1 storm hadoop 0 Oct 29 16:59 access-logviewer.log
-rw-r--r--. 1 storm hadoop 0 Oct 29 16:59 access-ui.log
-rw-r--r--. 1 storm hadoop 0 Oct 29 16:59 access-web-drpc.log
-rw-r--r--. 1 storm hadoop 0 Oct 29 16:59 access-web-logviewer.log
-rw-r--r--. 1 storm hadoop 93398 Oct 29 19:58 access-web-ui.log
-rw-r--r--. 1 storm hadoop 2505 Oct 29 19:51 drpc.log
-rw-r--r--. 1 storm hadoop 1836 Oct 29 19:51 drpc.out
-rw-r--r--. 1 storm hadoop 2280 Oct 29 19:51 logviewer.log
-rw-r--r--. 1 storm hadoop 1881 Oct 29 19:51 logviewer.out
-rw-r--r--. 1 storm hadoop 2300 Oct 29 19:52 nimbus.out
-rw-r--r--. 1 storm hadoop 2506 Oct 29 19:51 supervisor.out
-rw-r--r--. 1 storm hadoop 1925972 Oct 29 19:52 ui.log
-rw-r--r--. 1 storm hadoop 1854 Oct 29 19:51 ui.out
[root@mst2-an05 storm]# cat nimbus.out
Running: /usr/jdk64/jdk1.8.0_112/bin/java -server -Ddaemon.name=nimbus -Dstorm.options= -Dstorm.home=/usr/hdp/2.6.2.14-5/storm -Dstorm.log.dir=/var/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/hdp/current/storm-client/lib -Dstorm.conf.file= -cp /usr/hdp/2.6.2.14-5/storm/lib/asm-5.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/clojure-1.7.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/disruptor-3.3.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/kryo-3.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-api-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-core-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-slf4j-impl-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/minlog-1.3.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/objenesis-2.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/reflectasm-1.10.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/ring-cors-0.1.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/slf4j-api-1.7.21.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-core-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-rename-hack-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/zookeeper.jar:/usr/hdp/2.6.2.14-5/storm/lib/ambari-metrics-storm-sink.jar:/usr/hdp/2.6.2.14-5/storm/extlib/atlas-plugin-classloader-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib/storm-bridge-shim-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib-daemon/ojdbc6.jar:/usr/hdp/2.6.2.14-5/storm/extlib-daemon/ranger-plugin-classloader-0.7.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib-daemon/ranger-storm-plugin-shim-0.7.0.2.6.2.14-5.jar:/usr/hdp/current/storm-nimbus/conf -Xmx1024m -javaagent:/usr/hdp/current/storm-nimbus/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar=host=localhost,port=8649,wireformat31x=true,mode=multicast,config=/usr/hdp/current/storm-nimbus/contrib/storm-jmxetric/conf/jmxetric-conf.xml,process=Nimbus_JVM -Dlogfile.name=nimbus.log -DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector -Dlog4j.configurationFile=/usr/hdp/2.6.2.14-5/storm/log4j2/cluster.xml org.apache.storm.daemon.nimbus
Error opening zip file or JAR manifest missing : /usr/hdp/current/storm-nimbus/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar
Error occurred during initialization of VM
agent library failed to init: instrument
[root@mst2-an05 storm]# cat supervisor.out
Running: /usr/jdk64/jdk1.8.0_112/bin/java -server -Ddaemon.name=supervisor -Dstorm.options= -Dstorm.home=/usr/hdp/2.6.2.14-5/storm -Dstorm.log.dir=/var/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/hdp/current/storm-client/lib -Dstorm.conf.file= -cp /usr/hdp/2.6.2.14-5/storm/lib/asm-5.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/clojure-1.7.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/disruptor-3.3.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/kryo-3.0.3.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-api-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-core-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.6.2.14-5/storm/lib/log4j-slf4j-impl-2.8.2.jar:/usr/hdp/2.6.2.14-5/storm/lib/minlog-1.3.0.jar:/usr/hdp/2.6.2.14-5/storm/lib/objenesis-2.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/reflectasm-1.10.1.jar:/usr/hdp/2.6.2.14-5/storm/lib/ring-cors-0.1.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.6.2.14-5/storm/lib/slf4j-api-1.7.21.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-core-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/storm-rename-hack-1.1.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/lib/zookeeper.jar:/usr/hdp/2.6.2.14-5/storm/lib/ambari-metrics-storm-sink.jar:/usr/hdp/2.6.2.14-5/storm/extlib/atlas-plugin-classloader-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib/storm-bridge-shim-0.8.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib-daemon/ojdbc6.jar:/usr/hdp/2.6.2.14-5/storm/extlib-daemon/ranger-plugin-classloader-0.7.0.2.6.2.14-5.jar:/usr/hdp/2.6.2.14-5/storm/extlib-daemon/ranger-storm-plugin-shim-0.7.0.2.6.2.14-5.jar:/usr/hdp/current/storm-supervisor/conf -Xmx256m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=56431 -javaagent:/usr/hdp/current/storm-supervisor/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar=host=localhost,port=8650,wireformat31x=true,mode=multicast,config=/usr/hdp/current/storm-supervisor/contrib/storm-jmxetric/conf/jmxetric-conf.xml,process=Supervisor_JVM -Dlogfile.name=supervisor.log -DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector -Dlog4j.configurationFile=/usr/hdp/2.6.2.14-5/storm/log4j2/cluster.xml org.apache.storm.daemon.supervisor.Supervisor
Error opening zip file or JAR manifest missing : /usr/hdp/current/storm-supervisor/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar
Error occurred during initialization of VM
agent library failed to init: instrument
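For reference, a diagnostic/workaround sketch (assumptions on my part: the paths are the ones logged above, and the commonly reported fix for this symptom is dropping the jmxetric -javaagent flags; adjust to your cluster before applying):
# Verify whether the jar the -javaagent flag points at actually exists
ls -l /usr/hdp/current/storm-nimbus/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar
ls -l /usr/hdp/current/storm-supervisor/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar
# If it is missing, one workaround is to remove the -javaagent:...jmxetric... portion
# from nimbus.childopts and supervisor.childopts in Ambari (Storm > Configs),
# then restart Nimbus and the Supervisors.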
Labels: Apache Storm
09-27-2017
01:31 PM
1 Kudo
A Machine Learning model learns from data. As new incremental data arrives, the model needs to be updated. A Machine Learning Model factory ensures that while a model is deployed in production, continuous learning is also happening on the incremental new data ingested in the production environment. As the deployed ML model's performance decays, a newly trained and serialized model needs to be deployed. An A/B test between the deployed model and the newly trained model scores both, so the performance of the deployed model can be evaluated against the incrementally trained one.
In order to build a Machine Learning Model factory, we first have to establish a robust road to production. The foundational framework is to establish three environments: DEV, TEST and PROD.
1- DEV - A development environment where the Data Scientists have their own data puddle in order to perform data exploration, profile the data, develop the machine learning features from the data, build the model, train and test it on the limited subset, and then commit to git to transport the code to the next stages. For the purpose of scaling and tuning the learning of the model, we establish a DEV Validation environment, where the model learning is scaled with as much historical data as possible and tuned.
2- TEST - The TEST environment is a pre-production environment where we run the machine learning models through integration tests and ready the move of the model to production in two branches:
2a - model deployment: the trained, serialized Machine Learning model is deployed in the production environment
2b - continuous training: the Machine Learning model goes through continuous training on incremental data
3- PROD - The Production environment is where live data is ingested. In the production environment a deployment server hosts the serialized trained model. The deployed model exposes a REST API to deliver predictions on live data queries.
The ML model code runs in production, ingesting incremental live data and getting continuously trained.
The performance of both the deployed model and the continuously training model is measured. If the deployed model shows decay in prediction performance, it is swapped with a newer serialized version of the continuously trained model.
The model performance measure can be tracked by closing the loop with the users' feedback and tracking True Positives, False Positives, True Negatives and False Negatives. This choreography of training and deploying machine learning models in production is the heart of the ML model factory. The road to production depicts the journey of building Machine Learning models within the DEV/TEST/PROD environments.
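As a minimal sketch (all names here are hypothetical placeholders, not from any specific library; accuracy stands in for whichever metric you track), the promotion decision at the heart of the factory can be expressed as a champion/challenger comparison:
# Hypothetical champion/challenger promotion check (sketch)
from sklearn.metrics import accuracy_score

def should_promote(champion, challenger, X_holdout, y_holdout, margin=0.01):
    """Promote the continuously trained challenger only if it beats the deployed champion by a margin."""
    champ_score = accuracy_score(y_holdout, champion.predict(X_holdout))
    chall_score = accuracy_score(y_holdout, challenger.predict(X_holdout))
    return chall_score > champ_score + margin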
06-28-2017
08:58 PM
3 Kudos
Setting Up a Data Science Platform on HDP using Anaconda
A Data Science Platform built using Anaconda needs to be able to:
- Launch PySpark jobs on the cluster
- Synchronize Python libraries from vetted public repositories
- Isolate environments with specific dependencies, so a production job can run against an older version of a package whilst simultaneously running the new version of the package (see the conda sketch below)
- Launch notebooks and PySpark jobs using different kernels such as Python_2.7, Python_3.x, R, Scala
Framework of the Data Science Platform: Private Repo Server, Edge Nodes, Dev/Test/Prod, Ansible, Git, Jenkins
Building blocks of the Data Science Platform: Anaconda, Ansible, Git, Jenkins
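For example (a sketch, assuming a local Anaconda install; environment and package names are illustrative), conda environments provide the isolation described above:
# Create an isolated environment pinned to the older package version for production jobs
conda create -n prod_py27 python=2.7 pandas=0.19
# Create a second environment carrying the newer version for development
conda create -n dev_py35 python=3.5 pandas
# Activate one without touching the other's dependencies
source activate prod_py27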
04-03-2017
07:09 PM
Thanks for the comment, Michael. I wrote these commands for HDP environments using the standard Python 2.7, where we cannot do a pip install of snakebite (i.e., HDP clusters behind the firewall in a secure zone, with no pip downloads allowed).
03-31-2017
07:42 PM
2 Kudos
Interacting with Hadoop HDFS using Python codes
This post will go through the following:
1- Introducing the Python "subprocess" module
2- Running HDFS commands with Python
3- Examples of HDFS commands from Python
1-Introducing the Python "subprocess" module
The Python "subprocess" module allows us to:
- spawn new Unix processes
- connect to their input/output/error pipes
- obtain their return codes
To run UNIX commands we need to create a subprocess that runs the command. The recommended approach to invoking subprocesses is to use the convenience functions for all the use cases they can handle; alternatively, the underlying Popen interface can be used directly.
2-Running HDFS commands with Python
We will create a Python function called run_cmd that effectively allows us to run any Unix or Linux command, or in our case hdfs dfs commands, as a pipe, capturing stdout and stderr. The command is passed as a Python list of its elements rather than as a single string, so we do not have to parse or escape characters ourselves.
# import the python subprocess module
import subprocess

def run_cmd(args_list):
    """
    Run a linux command and return (return_code, stdout, stderr).
    """
    print('Running system command: {0}'.format(' '.join(args_list)))
    proc = subprocess.Popen(args_list, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    s_output, s_err = proc.communicate()
    s_return = proc.returncode
    return s_return, s_output, s_err
3-Examples of HDFS commands from Python
Run Hadoop ls command in Python
(ret, out, err)= run_cmd(['hdfs', 'dfs', '-ls', 'hdfs_file_path'])
lines = out.split('\n')
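One caveat worth flagging (an addition on my part; the post targets Python 2.7): on Python 3, communicate() returns bytes, so decode before splitting.
# Python 3 only: stdout/stderr come back as bytes (assuming UTF-8 output)
lines = out.decode('utf-8').split('\n')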
Run Hadoop get command in Python
(ret, out, err)= run_cmd(['hdfs', 'dfs', '-get', 'hdfs_file_path', 'local_path'])
Run Hadoop put command in Python
(ret, out, err)= run_cmd(['hdfs', 'dfs', '-put', 'local_file', 'hdfs_file_path'])
Run Hadoop copyFromLocal command in Python
(ret, out, err)= run_cmd(['hdfs', 'dfs', '-copyFromLocal', 'local_file', 'hdfs_file_path'])
Run Hadoop copyToLocal command in Python
(ret, out, err)= run_cmd(['hdfs', 'dfs', '-copyToLocal', 'hdfs_file_path', 'local_file'])
Run Hadoop remove file command in Python
Usage: hdfs dfs -rm -skipTrash /path/to/file/you/want/to/remove/permanently
(ret, out, err)= run_cmd(['hdfs', 'dfs', '-rm', 'hdfs_file_path'])
(ret, out, err)= run_cmd(['hdfs', 'dfs', '-rm', '-skipTrash', 'hdfs_file_path'])
Run Hadoop rm -r command in Python
HDFS command to remove an entire directory and all of its content from HDFS.
Usage: hdfs dfs -rm -r <path>
(ret, out, err)= run_cmd(['hdfs', 'dfs', '-rm', '-r', 'hdfs_file_path'])
(ret, out, err)= run_cmd(['hdfs', 'dfs', '-rm', '-r', '-skipTrash', 'hdfs_file_path'])
Check if a file exists in HDFS
Usage: hadoop fs -test -[defsz] URI
Options:
-d: if the path is a directory, return 0.
-e: if the path exists, return 0.
-f: if the path is a file, return 0.
-s: if the path is not empty, return 0.
-z: if the file is zero length, return 0.
Example:
hadoop fs -test -e filename
hdfs_file_path = '/tmpo'
cmd = ['hdfs', 'dfs', '-test', '-e', hdfs_file_path]
ret, out, err = run_cmd(cmd)
print(ret, out, err)
if ret:
    print('file does not exist')
These simple but very powerful lines of code allow us to interact with HDFS programmatically, and they can easily be scheduled as part of cron jobs.
01-04-2017
11:43 PM
@SBandaru If you are using Spark with HDP, then you have to do the following:
1- Add these entries in your $SPARK_HOME/conf/spark-defaults.conf (use your installed HDP version):
spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041
spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041
2- Create a java-opts file in $SPARK_HOME/conf and add the installed HDP version to that file, e.g. -Dhdp.version=2.2.0.0-2041
To find your HDP version, please run the command hdp-select status hadoop-client on the cluster.
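For step 2, a quick way to create the file from the shell (a sketch; the version string must be whatever hdp-select reports on your cluster):
# Discover the installed HDP build
hdp-select status hadoop-client
# Write it into the java-opts file (example value shown)
echo "-Dhdp.version=2.2.0.0-2041" > $SPARK_HOME/conf/java-opts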
12-31-2016
10:45 PM
1 Kudo
Installing and Exploring Spark 2.0 with Jupyter Notebook and Anaconda Python in your laptop
1-Objective
2-Installing Anaconda Python
3-Checking Python Install
4-Installing Spark
5-Checking Spark Install
6-Launching Jupyter Notebook with PySpark 2.0.2
7-Exploring PySpark 2.0.2
a.Spark Session
b.Read CSV
i.Spark 2.0 and Spark 1.6
ii.Pandas
c.Pandas DataFrames, Spark DataSets, DataFrames and RDDs
d.Machine Learning Pipeline
i.SciKit Learn
ii.Spark MLLib, ML
8-Conclusion
1-Objective
It is often useful to have Python with the Jupyter notebook installed on your laptop in order to quickly develop and test some code ideas or to explore some data. Adding Apache Spark to this setup also allows you to prototype ideas and exploratory data pipelines before hitting a Hadoop cluster and paying for Amazon Web Services.
We leverage the power of the Python ecosystem with libraries such as Numpy (scientific computing library of high-level mathematical functions to operate on arrays and matrices), SciPy (SciPy library depends on NumPy, which provides convenient and fast N-dimensional array manipulation), Pandas (high performance data structure and data analysis library to build complex data transformation flows), Scikit-Learn (library that implements a range of machine learning, preprocessing, cross-validation and visualization algorithms), NLTK (Natural Language Tool Kit to process text data, libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries)…
We also leverage the strengths of Spark including Spark-SQL, Spark-MLLib or ML.
2-Installing Anaconda Python
We install Continuum’s Anaconda distribution by downloading the install script from the Continuum website. https://www.continuum.io/downloads
The advantage of the Anaconda distribution is that a lot of the essential Python packages come bundled, so you do not have to struggle with synchronizing all the dependencies.
We will use the following command to download the install script for Python version 3.5.
HW12256:~ usr000$ wget http://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
If you wish to install Python 2.7, the following download is recommended.
HW12256:~ usr000$ wget http://repo.continuum.io/archive/Anaconda2-4.2.0-Linux-x86_64.sh
Accordingly, in the terminal, issue the following bash command to launch the install.
Python 3.5 version
HW12256:~ usr000$ bash Anaconda3-4.2.0-Linux-x86_64.sh
Python 2.7 version
HW12256:~ usr000$ bash Anaconda2-4.2.0-Linux-x86_64.sh
In the following steps, we are using Python 3.5 as the base environment.
3-Checking Python Install
In order to check the Python install, we issue the following commands in the terminal.
HW12256:~ usr000$ which python
/Users/usr000/anaconda/bin/python
HW12256:~ usr000$ echo $PATH
/Users/usr000/anaconda/bin:/usr/local/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
HW12256:~ usr000$ python --version
Python 3.5.2 :: Anaconda 4.1.1 (x86_64)
HW12256:~ usr000$ python
Python 3.5.2 |Anaconda 4.1.1 (x86_64)| (default, Jul 2 2016, 17:52:12)
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> print("Python version: {}".format(sys.version))
Python version: 3.5.2 |Anaconda 4.1.1 (x86_64)| (default, Jul 2 2016, 17:52:12)
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)]
>>> from datetime import datetime
>>> print('current date and time: {}'.format(datetime.now()))
current date and time: 2016-12-29 09:46:32.393985
>>> print('current date and time: {}'.format(datetime.now().strftime('%Y-%m-%d %H:%M:%S')))
current date and time: 2016-12-29 09:51:33
>>> exit()
Anaconda Python includes a package manager called ‘conda’ which can list and update the existing libraries available in the current system.
HW12256:~ usr000$ conda info
Current conda install:
platform : osx-64
conda version : 4.2.12
conda is private : False
conda-env version : 4.2.12
conda-build version : 0+unknown
python version : 3.5.2.final.0
requests version : 2.10.0
root environment : /Users/usr000/anaconda (writable)
default environment : /Users/usr000/anaconda
envs directories : /Users/usr000/anaconda/envs
package cache : /Users/usr000/anaconda/pkgs
channel URLs : https://repo.continuum.io/pkgs/free/osx-64
https://repo.continuum.io/pkgs/free/noarch
https://repo.continuum.io/pkgs/pro/osx-64
https://repo.continuum.io/pkgs/pro/noarch
config file : None
offline mode : False
HW12256:~ usr000$ conda list
4-Installing Spark
To install Spark, we download the pre-built spark tarball spark-2.0.2-bin-hadoop2.7.tgz from http://spark.apache.org/downloads.html and move to your target Spark directory.
Untar the tarball in your chosen directory
HW12256:bin usr000$ tar -xvzf spark-2.0.2-bin-hadoop2.7.tgz
Create symlink to spark2 directory
HW12256:bin usr000$ ln -s ~/bin/sparks/spark-2.0.2-bin-hadoop2.7 ~/bin/spark2
5-Checking Spark Install
Check the directories created under Spark 2
HW12256:bin usr000$ ls -lru
total 16
drwxr-xr-x  5 usr000 staff 170 Dec 28 10:39 sparks
lrwxr-xr-x  1 usr000 staff  50 Dec 28 10:39 spark2 -> /Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7
lrwxr-xr-x  1 usr000 staff  51 May 23  2016 spark -> /Users/usr000/bin/sparks/spark-1.6.1-bin-hadoop2.6/
HW12256:bin usr000$ cd spark2
HW12256:spark2 usr000$ ls -lru
total 112
drwxr-xr-x@   3 usr000 staff   102 Jan  1  1970 yarn
drwxr-xr-x@  24 usr000 staff   816 Jan  1  1970 sbin
drwxr-xr-x@  10 usr000 staff   340 Dec 28 10:30 python
drwxr-xr-x@  38 usr000 staff  1292 Jan  1  1970 licenses
drwxr-xr-x@ 208 usr000 staff  7072 Dec 28 10:30 jars
drwxr-xr-x@   4 usr000 staff   136 Jan  1  1970 examples
drwxr-xr-x@   5 usr000 staff   170 Jan  1  1970 data
drwxr-xr-x@   9 usr000 staff   306 Dec 28 10:27 conf
drwxr-xr-x@  24 usr000 staff   816 Dec 28 10:30 bin
-rw-r--r--@   1 usr000 staff   120 Dec 28 10:25 RELEASE
-rw-r--r--@   1 usr000 staff  3828 Dec 28 10:25 README.md
drwxr-xr-x@   3 usr000 staff   102 Jan  1  1970 R
-rw-r--r--@   1 usr000 staff 24749 Dec 28 10:25 NOTICE
-rw-r--r--@   1 usr000 staff 17811 Dec 28 10:25 LICENSE
HW12256:spark2 usr000$
Running SparkPi example in local mode.
Scala command
# export SPARK_HOME
HW12256:spark2 usr000$ export SPARK_HOME=/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7
HW12256:spark2 usr000$ echo $SPARK_HOME
/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7
# Run Spark PI example in Scala
HW12256:spark2 usr000$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --driver-memory 512m --executor-memory 512m --executor-cores 1 $SPARK_HOME/examples/jars/spark-examples*.jar 5
Python command
HW12256:spark2 usr000$ ./bin/spark-submit --driver-memory 512m --executor-memory 512m --executor-cores 1 examples/src/main/python/pi.py 10
Scala example
HW12256:spark2 usr000$ export SPARK_HOME=/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7
HW12256:spark2 usr000$ echo $SPARK_HOME
/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7
HW12256:spark2 usr000$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --driver-memory 512m --executor-memory 512m --executor-cores 1 $SPARK_HOME/examples/jars/spark-examples*.jar 5
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/12/29 11:40:53 INFO SparkContext: Running Spark version 2.0.2
16/12/29 11:40:53 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
...
16/12/29 11:40:55 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 0.851288 s
Pi is roughly 3.1390942781885562
16/12/29 11:40:55 INFO SparkUI: Stopped Spark web UI at http://000.000.0.0:4040
...
16/12/29 11:40:55 INFO SparkContext: Successfully stopped SparkContext
16/12/29 11:40:55 INFO ShutdownHookManager: Shutdown hook called
16/12/29 11:40:55 INFO ShutdownHookManager: Deleting directory /private/var/folders/1r/8qylt4bj4h59b3h_1xq_nsw00000gp/T/spark-35b67f21-1d52-4dee-9c75-7e9d9c153ada
HW12256:spark2 usr000$
Python example
HW12256:spark2 usr000$ ./bin/spark-submit examples/src/main/python/pi.py 10
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/12/29 11:27:33 INFO SparkContext: Running Spark version 2.0.2
16/12/29 11:27:33 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
...
16/12/29 11:27:36 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/12/29 11:27:36 INFO DAGScheduler: Job 0 finished: reduce at /Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7/examples/src/main/python/pi.py:43, took 1.199257 s
Pi is roughly 3.138360
16/12/29 11:27:36 INFO SparkUI: Stopped Spark web UI at http://000.000.0.0:4040
16/12/29 11:27:36 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
...
16/12/29 11:27:36 INFO SparkContext: Successfully stopped SparkContext
16/12/29 11:27:37 INFO ShutdownHookManager: Shutdown hook called
16/12/29 11:27:37 INFO ShutdownHookManager: Deleting directory /private/var/folders/1r/8qylt4bj4h59b3h_1xq_nsw00000gp/T/spark-eb12faa9-b7ff-4556-9538-45ddcdc6797b
16/12/29 11:27:37 INFO ShutdownHookManager: Deleting directory /private/var/folders/1r/8qylt4bj4h59b3h_1xq_nsw00000gp/T/spark-eb12faa9-b7ff-4556-9538-45ddcdc6797b/pyspark-ba9947c5-dbea-4edc-9c4c-c2c316e6caba
Wordcount program using PySpark
HW12256:spark2 usr000$ ./bin/pyspark
Python 2.7.10 (default, Jul 30 2016, 19:40:32)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/12/29 12:25:15 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.0.2
      /_/

Using Python version 2.7.10 (default, Jul 30 2016 19:40:32)
SparkSession available as 'spark'.
>>> import os
>>> print(os.getcwd())
/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7
>>> import re
>>> from operator import add
>>> wordcounts_in = sc.textFile('README.md').flatMap(lambda l: re.split('\W+', l.strip())).filter(lambda w: len(w)>0).map(lambda w: (w,1)).reduceByKey(add).map(lambda (a,b): (b,a)).sortByKey(ascending = False)
/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/shuffle.py:58: UserWarning: Please install psutil to have better support with spilling
/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/shuffle.py:58: UserWarning: Please install psutil to have better support with spilling
>>> wordcounts_in.take(10)
[(23, u'the'), (18, u'Spark'), (14, u'to'), (13, u'run'), (11, u'for'), (11, u'apache'), (11, u'spark'), (11, u'and'), (11, u'org'), (8, u'a')]
>>> wordcounts_in = sc.textFile('README.md').flatMap(lambda l: re.split('\W+', l.strip())).filter(lambda w: len(w)>0).map(lambda w: (w,1)).reduceByKey(add).map(lambda (a,b): (b,a)).sortByKey(ascending = False).map(lambda (a,b): (b,a))
/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/shuffle.py:58: UserWarning: Please install psutil to have better support with spilling
/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/shuffle.py:58: UserWarning: Please install psutil to have better support with spilling
>>> wordcounts_in.take(10)
[(u'the', 23), (u'Spark', 18), (u'to', 14), (u'run', 13), (u'for', 11), (u'apache', 11), (u'spark', 11), (u'and', 11), (u'org', 11), (u'a', 8)]
>>> exit()
6-Launching Jupyter Notebook with PySpark
When launching Jupyter Notebook with Spark 1.6.*, we used to add the --packages com.databricks:spark-csv_2.11:1.4.0 parameter to the command, as the csv package was not natively part of Spark.
HW12256:~ usr000$ PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS='notebook' PYSPARK_PYTHON=python3 /Users/usr000/bin/spark/bin/pyspark --packages com.databricks:spark-csv_2.11:1.4.0
In the case of Spark 2.0.*, we do not need the spark-csv --packages parameter, as csv support is part of the standard Spark 2.0 library.
HW12256:~ usr000$ PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS='notebook' PYSPARK_PYTHON=python3 /Users/usr000/bin/spark2/bin/pyspark
7-Exploring PySpark 2.0.2
We will explore the new features of Spark 2.0.2 using PySpark, contrasting where appropriate with previous versions of Spark and with pandas. In the case of the machine learning pipeline, we will contrast Spark MLLib or ML with Scikit-Learn.
a.Spark Session
Spark 2.0 introduces SparkSession. SparkSession is the single entry point for interacting with Spark functionality. It replaces and encapsulates the SQLContext, HiveContext and StreamingContext for more unified access to the DataFrame and Dataset APIs. The SQLContext, HiveContext and StreamingContext still exist under the hood in Spark 2.0 for continuity with legacy Spark code.
The Spark session has to be created explicitly when using the spark-submit command. An example of how to do that:
from pyspark.sql import SparkSession
from pyspark import SparkContext
from pyspark import SparkConf
# from pyspark.sql import SQLContext

spark = SparkSession \
    .builder \
    .appName("example-spark") \
    .config("spark.sql.crossJoin.enabled", "true") \
    .getOrCreate()

# Reuse the session's context rather than constructing a second SparkContext
sc = spark.sparkContext
# sqlContext = SQLContext(sc)
When typing 'pyspark' at the terminal, the shell automatically creates the Spark context sc, and a SparkSession is automatically generated and available as 'spark'. The application name can be accessed through the SparkContext:
spark.sparkContext.appName

# Configuration is accessible using RuntimeConfig:
from py4j.protocol import Py4JError

try:
    spark.conf.get("some.conf")
except Py4JError as e:
    pass
The following session outlines the available spark context sc as well as the new Spark session, available under the name "spark", which bundles the previous sqlContext, HiveContext and StreamingContext under one unified single entry point.
sqlContext, HiveContext, StreamingContext still exist to ensure continuity with legacy code in Spark.
HW12256:spark2 usr000$ ./bin/pyspark
Python 2.7.10 (default, Jul 30 2016, 19:40:32)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/12/29 20:41:27 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.0.2
      /_/

Using Python version 2.7.10 (default, Jul 30 2016 19:40:32)
SparkSession available as 'spark'.
>>> sc
<pyspark.context.SparkContext object at 0x101e9c850>
>>> sc._conf.getAll()
[(u'spark.app.id', u'local-1483040488671'), (u'spark.sql.catalogImplementation', u'hive'), (u'spark.rdd.compress', u'True'), (u'spark.serializer.objectStreamReset', u'100'), (u'spark.master', u'local[*]'), (u'spark.executor.id', u'driver'), (u'spark.submit.deployMode', u'client'), (u'hive.metastore.warehouse.dir', u'file:/Users/usr000/bin/sparks/spark-2.0.2-bin-hadoop2.7/spark-warehouse'), (u'spark.driver.port', u'57764'), (u'spark.app.name', u'PySparkShell'), (u'spark.driver.host', u'000.000.0.0')]
>>> spark
<pyspark.sql.session.SparkSession object at 0x102df9b50>
>>> spark.sparkContext
<pyspark.context.SparkContext object at 0x101e9c850>
>>> spark.sparkContext.appName
u'PySparkShell'
>>> from pyspark.sql.functions import *
>>> spark.range(1, 7, 2).collect()
16/12/29 20:58:32 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/12/29 20:58:32 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
[Row(id=1), Row(id=3), Row(id=5)]
b.Read CSV
We describe how to easily access csv files from Spark and from pandas and load them into dataframes for data exploration, manipulation and mining.
i.Spark 2.0 & Spark 1.6
We can create a spark dataframe directly from reading the csv file.
In order to be compatible with the previous format, we include a conditional switch in the format statement:

## Spark 2.0 and Spark 1.6 compatible read csv
formatPackage = "csv" if sc.version > '1.6' else "com.databricks.spark.csv"
df = sqlContext.read.format(formatPackage).options(header='true', delimiter='|').load("s00_dat/dataframe_sample.csv")
df.printSchema()
ii.Pandas
We can create the iris pandas dataframe from the dataset bundled with sklearn.

from sklearn.datasets import load_iris
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['species'] = pd.Categorical.from_codes(iris.target, iris.target_names)
c.Dataframes
i.Pandas DataFrames
Pandas dataframes, in conjunction with visualization libraries such as matplotlib and seaborn, give us some nice insights into the data; see the sketch below.
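For instance (a sketch building on the iris dataframe df created above; seaborn is assumed to be installed, which it is in the Anaconda distribution):
import seaborn as sns
import matplotlib.pyplot as plt

# Pairwise scatter plots colored by species give a quick view of class separability
sns.pairplot(df, hue='species')
plt.show()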
ii.Spark DataSets, Spark DataFrames and Spark RDDs
Spark DataFrames and Spark RDDs are the fundamental data structures that allow us to manipulate and interact with the various Spark libraries.
Spark DataSets are more relevant for Scala developers and give the ability to create typed Spark DataFrames. A small interoperability sketch follows.
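A small interoperability sketch (my assumptions: the SparkSession 'spark' from section a and the pandas iris dataframe df from section b are in scope; Spark SQL dislikes spaces and parentheses in column names, so the iris columns are renamed first):
# Rename the iris columns to Spark-friendly names and drop the Categorical dtype
pdf = df.copy()
pdf.columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
pdf['species'] = pdf['species'].astype(str)

sdf = spark.createDataFrame(pdf)   # pandas -> Spark DataFrame
sdf.printSchema()
sdf.show(5)

# Drop down to the underlying RDD for low-level transformations
print(sdf.rdd.map(lambda row: row['species']).distinct().collect())

back_to_pandas = sdf.toPandas()    # Spark DataFrame -> pandas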
d.Machine Learning
i.SciKit Learn
We demonstrate a random forest machine learning pipeline using scikit-learn in the IPython notebook; a minimal version is sketched below.
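A minimal version of that pipeline (a sketch on the iris data; hyperparameters are illustrative, and the import paths assume sklearn >= 0.18):
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, random_state=42)

# Train a random forest and score it on the held-out split
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
print(accuracy_score(y_test, rf.predict(X_test)))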
ii.Spark MLLib, Spark ML
We demonstrate a random forest machine learning pipeline using Spark MLlib and Spark ML; a corresponding sketch follows.
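And a corresponding Spark ML sketch (assumptions: the Spark DataFrame sdf built in the interoperability sketch above; numTrees and the split ratios are illustrative):
from pyspark.ml import Pipeline
from pyspark.ml.classification import RandomForestClassifier
from pyspark.ml.feature import StringIndexer, VectorAssembler

# Assemble the four numeric iris columns into a single feature vector
assembler = VectorAssembler(inputCols=['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], outputCol='features')
# Index the string label into a numeric label column
indexer = StringIndexer(inputCol='species', outputCol='label')
rf = RandomForestClassifier(labelCol='label', featuresCol='features', numTrees=20)

pipeline = Pipeline(stages=[assembler, indexer, rf])
train, test = sdf.randomSplit([0.7, 0.3], seed=42)
model = pipeline.fit(train)
model.transform(test).select('species', 'prediction').show(5)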
8-Conclusion
Spark and Jupyter Notebook using the Anaconda Python distribution provide a very powerful development environment on your laptop.
It allows quick exploration of data mining, machine learning and visualizations in a flexible and easy-to-use environment.
We have described the installation of Jupyter Notebook and Spark, a few data processing pipelines, as well as a machine learning classification using Random Forest.
09-18-2016
07:52 PM
1 Kudo
Hi Mike, follow these steps:
1- In the CLI where Spark is installed, first export the Hadoop conf: export HADOOP_CONF_DIR=/etc/hadoop/conf (you may want to put it in your spark conf file: export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/etc/hadoop/conf})
2- Launch spark-shell:
val input = sc.textFile("hdfs:///....insert/your/hdfs/file/path...")
input.count() // prints the nr of lines read
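For completeness, the same check from PySpark (a sketch, assuming the same HADOOP_CONF_DIR export; the placeholder path is yours to fill in):
# PySpark equivalent of the Scala snippet above
input = sc.textFile("hdfs:///....insert/your/hdfs/file/path...")
print(input.count())  # prints the number of lines read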