Member since 09-24-2015
38 Posts
41 Kudos Received
6 Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 4500 | 10-18-2017 01:27 PM |
 | 28981 | 04-22-2016 09:23 PM |
 | 1650 | 12-22-2015 02:41 PM |
 | 2290 | 12-22-2015 12:54 PM |
 | 4403 | 12-08-2015 03:44 PM |
12-08-2017
03:45 AM
@Joseph Niemiec How can I run the query "select * from table where date <= '2017-12-08'" against nested partitions, in the case where the table is partitioned by day, month, and year?
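For reference, here is how I understand the range condition would have to be spelled out over the three partition columns. This is only a sketch: it assumes the partition columns are integers named year, month, and day, and my_table is a placeholder table name, none of which comes from the actual schema.

```python
from datetime import date


def partition_filter(cutoff, year_col="year", month_col="month", day_col="day"):
    """Build a predicate equivalent to `date <= cutoff` over nested partitions.

    Assumes integer partition columns; the column names are hypothetical.
    """
    y, m, d = cutoff.year, cutoff.month, cutoff.day
    return (
        f"({year_col} < {y}"                                   # any earlier year
        f" OR ({year_col} = {y} AND {month_col} < {m})"        # same year, earlier month
        f" OR ({year_col} = {y} AND {month_col} = {m} AND {day_col} <= {d}))"  # same month, up to the day
    )


# my_table is a placeholder name.
query = "SELECT * FROM my_table WHERE " + partition_filter(date(2017, 12, 8))
print(query)
```

If the partition values are stored as zero-padded strings instead of integers, the literals would need quoting and the comparisons would become string comparisons.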
06-20-2016
11:15 PM
@Jagdish Saripella, @Joseph Niemiec
Have you figured out the problem? I ran into a similar issue. This is unlikely to be a data-skew issue.
12-22-2015
02:31 PM
As for multiple networks, you can multi-home the nodes so that you have a public network and a cluster-traffic network. Hardware vendors' reference architectures, such as Cisco's, are designed with the expectation that multi-homing will be configured. See https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html
04-14-2016
10:07 PM
Per the Storm documentation (http://storm.apache.org/releases/0.10.0/Setting-up-a-Storm-cluster.html):

"If you need support from external libraries or custom plugins, you can place such jars into the extlib/ and extlib-daemon/ directories. Note that the extlib-daemon/ directory stores jars used only by daemons (Nimbus, Supervisor, DRPC, UI, Logviewer), e.g., HDFS and customized scheduling libraries. Accordingly, two environmental variables STORM_EXT_CLASSPATH and STORM_EXT_CLASSPATH_DAEMON can be configured by users for including the external classpath and daemon-only external classpath."

This means that extlib-daemon should not be on the classpath for workers. This functionality was introduced with STORM-483 (https://github.com/apache/storm/commit/05306d5053ff91bd323c4b54cd246c9f928ca339), and supervisor.clj was supposed to be updated as follows:

      topo-classpath (if-let [cp (storm-conf TOPOLOGY-CLASSPATH)]
                       [cp]
                       [])
    - classpath (-> (current-classpath)
    + classpath (-> (worker-classpath)
                    (add-to-classpath [stormjar])
                    (add-to-classpath topo-classpath))
      top-gc-opts (storm-conf TOPOLOGY-WORKER-GC-CHILDOPTS)

However, I decompiled the Hortonworks storm-core jar, and the old version of the code that calls current-classpath still appears:

    public static final Var const__10 = RT.var((String)"backtype.storm.util", (String)"add-to-classpath");
    public static final Var const__11 = RT.var((String)"backtype.storm.util", (String)"current-classpath");
    ...
    v15 = new Object[1];
    v16 = stormjar;
    stormjar = null;
    v15[0] = v16;
    v17 = topo_classpath;
    topo_classpath = null;
    classpath = ((IFn)supervisor$fn__6546.const__10.getRawRoot()).invoke(
        ((IFn)supervisor$fn__6546.const__10.getRawRoot()).invoke(
            ((IFn)supervisor$fn__6546.const__11.getRawRoot()).invoke(),
            (Object)RT.vector((Object[])v15)),
        (Object)v17);

I believe that worker-classpath was designed to construct a classpath for a worker JVM that does not include daemon-specific locations (like extlib-daemon). However, since the HDP version does not call worker-classpath, the worker ends up inheriting the supervisor's classpath via the call to current-classpath. I checked storm-core-0.10.0.2.4.1.1-3.jar, which I believe is the latest HDP build, and it still does not call worker-classpath. This seems like a bug.
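To make the expected behavior concrete, here is a minimal sketch of the difference between inheriting the supervisor's classpath and using one without daemon-only entries. It is not Storm's implementation: strip_daemon_entries and the example paths are hypothetical, and I am assuming the worker classpath should simply omit anything under extlib-daemon.

```python
def strip_daemon_entries(classpath, daemon_dirs=("extlib-daemon",), sep=":"):
    """Return the classpath without entries located under a daemon-only directory.

    Hypothetical helper for illustration; it only models the intended effect
    of a worker classpath that excludes extlib-daemon.
    """
    kept = []
    for entry in classpath.split(sep):
        if not entry:
            continue  # skip empty segments such as a trailing separator
        parts = entry.replace("\\", "/").split("/")
        if any(d in parts for d in daemon_dirs):
            continue  # daemon-only location, should not reach the worker
        kept.append(entry)
    return sep.join(kept)


# Example paths are made up for illustration.
supervisor_cp = (
    "/opt/storm/lib/storm-core.jar:"
    "/opt/storm/extlib/custom-plugin.jar:"
    "/opt/storm/extlib-daemon/hdfs-lib.jar"
)

inherited = supervisor_cp                        # what the worker gets via current-classpath
expected = strip_daemon_entries(supervisor_cp)   # what I would expect a worker to get
print(expected)
```

With the inherited classpath, the worker JVM still sees the jar under extlib-daemon, which is the behavior I observe on HDP.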
10-02-2015
12:54 PM
I think we are going to try the route of having two HS2 instances: one with Ranger integration and one without. No matter what we do, we will end up with thousands of policies, as there are potentially thousands of different data sources being onboarded to the system. Most of these don't require anything more than HDFS security, so having to enter them into Hive for access is a management nightmare. By having a 'Ranger HS2 with doAs=false' and a 'non-Ranger HS2 with doAs=true', we can at least decide which tables require Ranger benefits, force ownership of that data to Hive, and create those policies as one-offs, rather than creating policies for every new data source that gets onboarded. Today, after a simple migration from ACLs, we had 17+ pages of policies in Ranger, which we all thought was crazy when most of them don't need Ranger benefits. Note that this count already reflects wildcards for resources and multiple users and groups per policy.