Member since: 06-13-2016
Posts: 34
Kudos Received: 9
Solutions: 0
12-26-2017
10:10 PM
Anyone find an answer to this? We're going to need to do something similar....
05-08-2017
07:50 PM
1 Kudo
In a nutshell, few were using it, and it was really just a slightly easier-to-use skin over other tools (primarily Oozie); as they attempted to add features, it started looking more and more like the underlying tools, so.... It will be interesting to see how the marketing/messaging shifts over the coming months. This related question and answer seem to suggest Atlas isn't going anywhere, and Atlas and Oozie were (I think) the primary underpinnings of Falcon: https://community.hortonworks.com/questions/97570/apache-falcon-in-hdp-30.html
04-04-2017
08:06 PM
I'll reach out to my account team, but it still seems odd to announce a component as deprecated when you can't point to an announced product that will be taking on its functionality. The whole point of deprecating is normally to give folks a heads-up they should start moving to the replacement....
04-03-2017
09:22 PM
1 Kudo
Documentation for HDP 2.6.0 was recently posted, and Falcon is marked as deprecated as of 2.6.0, with 3.0.0 listed as the Target Version for Removal (link to deprecation notice in release notes). There are no comments indicating where Falcon's functionality is moving, so I'm wondering: what are the plans for Falcon's functionality?...
03-22-2017
07:21 PM
Looks like openscoring also offers jpmml under a BSD license for a fee; see below. Unfortunately, it appears there's a gray area between "we just want to use the software" and "we want to redistribute proprietary software based on this code." The wording of the attached blurb from openscoring suggests they think "use" of AGPL code is fine, even though the FSF stance seems to be that the GNU AGPL is only compatible with the GPL: https://www.gnu.org/licenses/why-affero-gpl.en.html
03-17-2017
08:49 PM
hive.exec.scratchdir on this cluster is /tmp/hive. I don't know why the user appears to be exceeding quota on a personal directory.
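To figure out where the usage is actually coming from, these are the checks I'm running. The username and the per-user subdirectory under /tmp/hive are just my assumptions about how this cluster is laid out:
# Name/space quota and current usage on the user's home directory (example user)
hdfs dfs -count -q -h /user/theuser
# Size of that user's scratch area, assuming per-user subdirs under hive.exec.scratchdir
hdfs dfs -du -s -h /tmp/hive/theuser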
01-30-2017
08:38 PM
Using HDP 2.4.2, Ambari 2.2.2.0, I see nimbus.authorizer: org.apache.ranger.authorization.storm.authorizer.RangerStormAuthorizer
01-30-2017
07:47 PM
Do nimbus.supervisor.users and nimbus.admins need to be added manually even when Ranger is being used? I'm in a group that has the following permissions: Submit Topology, File Upload, Get Nimbus Conf, Get Cluster Info, File Download, Kill Topology, Rebalance, Activate, Deactivate, Get Topology Conf, Get Topology, Get User Topology, Get Topology Info. And 'Delegate Admin' is checked.
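For context, this is the kind of manual storm.yaml addition I'm trying to avoid if Ranger is supposed to cover it. A sketch only; the example users are placeholders, and I haven't confirmed this is actually needed alongside the Ranger plugin:
# Sketch only: example users are placeholders, not values from my cluster
nimbus.admins:
  - "stormadmin"
nimbus.supervisor.users:
  - "storm"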
01-30-2017
06:59 PM
I was suspecting configs were not correct because trying to run hortonworks' WordCountTopology sample was not working. Here is what I was seeing: -bash-4.1$ storm jar storm-starter-0.0.1-storm-0.9.0.1.jar storm.starter.WordCountTopology WordCount Running: /opt/java/hotspot/7/64_bit/jdk1.7.0_79/bin/java -client -Ddaemon.name= -Dstorm.options= -Dstorm.home=/usr/hdp/2.4.2.0-258/storm -Dstorm.log.dir=/var/hadoop/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/hdp/current/storm-client/lib -Dstorm.conf.file= -cp /usr/hdp/2.4.2.0-258/storm/lib/cheshire-5.3.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/log4j-core-2.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/hadoop-auth-2.7.1.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/storm/lib/clojure-1.6.0.jar:/usr/hdp/2.4.2.0-258/storm/lib/clj-stacktrace-0.2.7.jar:/usr/hdp/2.4.2.0-258/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.4.2.0-258/storm/lib/oncrpc-1.0.7.jar:/usr/hdp/2.4.2.0-258/storm/lib/jackson-core-2.3.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/clout-1.0.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/ring-servlet-1.3.0.jar:/usr/hdp/2.4.2.0-258/storm/lib/ring-json-0.3.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/kryo-2.21.jar:/usr/hdp/2.4.2.0-258/storm/lib/jline-0.9.94.jar:/usr/hdp/2.4.2.0-258/storm/lib/tigris-0.1.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/reflectasm-1.07-shaded.jar:/usr/hdp/2.4.2.0-258/storm/lib/tools.namespace-0.2.4.jar:/usr/hdp/2.4.2.0-258/storm/lib/ring-devel-1.3.0.jar:/usr/hdp/2.4.2.0-258/storm/lib/java.classpath-0.2.2.jar:/usr/hdp/2.4.2.0-258/storm/lib/javax.servlet-2.5.0.v201103041518.jar:/usr/hdp/2.4.2.0-258/storm/lib/compojure-1.1.3.jar:/usr/hdp/2.4.2.0-258/storm/lib/core.incubator-0.1.0.jar:/usr/hdp/2.4.2.0-258/storm/lib/ring-core-1.1.5.jar:/usr/hdp/2.4.2.0-258/storm/lib/gmetric4j-1.0.7.jar:/usr/hdp/2.4.2.0-258/storm/lib/ns-tracker-0.2.2.jar:/usr/hdp/2.4.2.0-258/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.4.2.0-258/storm/lib/commons-codec-1.6.jar:/usr/hdp/2.4.2.0-258/storm/lib/disruptor-2.10.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/asm-4.0.jar:/usr/hdp/2.4.2.0-258/storm/lib/zookeeper.jar:/usr/hdp/2.4.2.0-258/storm/lib/log4j-slf4j-impl-2.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/storm-core-0.10.0.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/storm/lib/tools.logging-0.2.3.jar:/usr/hdp/2.4.2.0-258/storm/lib/log4j-api-2.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/hiccup-0.3.6.jar:/usr/hdp/2.4.2.0-258/storm/lib/minlog-1.2.jar:/usr/hdp/2.4.2.0-258/storm/lib/slf4j-api-1.7.7.jar:/usr/hdp/2.4.2.0-258/storm/lib/jackson-dataformat-smile-2.3.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/ring-jetty-adapter-1.3.0.jar:/usr/hdp/2.4.2.0-258/storm/lib/clj-time-0.8.0.jar:storm-starter-0.0.1-storm-0.9.0.1.jar:/usr/hdp/current/storm-supervisor/conf:/usr/hdp/2.4.2.0-258/storm/bin -Dstorm.jar=storm-starter-0.0.1-storm-0.9.0.1.jar storm.starter.WordCountTopology WordCount
18:49:19.481 [main] INFO b.s.u.Utils - Using defaults.yaml from resources
18:49:19.551 [main] INFO b.s.u.Utils - Using storm.yaml from resources
18:49:19.611 [main] INFO b.s.u.Utils - Using defaults.yaml from resources
18:49:19.631 [main] INFO b.s.u.Utils - Using storm.yaml from resources
18:49:19.648 [main] INFO b.s.StormSubmitter - Generated ZooKeeper secret payload for MD5-digest: -6595191808170807148:-7705041539986139533
18:49:19.649 [main] INFO b.s.s.a.AuthUtils - Got AutoCreds []
18:49:19.664 [main] INFO b.s.u.StormBoundedExponentialBackoffRetry - The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
18:49:19.723 [main] WARN b.s.s.a.k.ClientCallbackHandler - Could not login: the client is being asked for a password, but the client code does not currently support obtaining a password from the user. Make sure that the client is configured to use a ticket cache (using the JAAS configuration setting 'useTicketCache=true)' and restart the client. If you still get this message after that, the TGT in the ticket cache has expired and must be manually refreshed. To do so, first determine if you are using a password or a keytab. If the former, run kinit in a Unix shell in the environment of the user who is running this client using the command 'kinit <princ>' (where <princ> is the name of the client's Kerberos principal). If the latter, do 'kinit -k -t <keytab> <princ>' (where <princ> is the name of the Kerberos principal, and <keytab> is the location of the keytab file). After manually refreshing your cache, restart this client. If you continue to see this message after manually refreshing your cache, ensure that your KDC host's clock is in sync with this host's clock.
18:49:19.725 [main] ERROR b.s.s.a.k.KerberosSaslTransportPlugin - Server failed to login in principal:javax.security.auth.login.LoginException: No password provided
javax.security.auth.login.LoginException: No password provided .... Now that I've tried passing the path to the client_jaas.conf on the command line, it seems things get a little further, and there is a different error: -bash-4.1$ storm jar storm-starter-0.0.1-storm-0.9.0.1.jar storm.starter.WordCountTopology WordCount -c java.security.auth.login.config=/etc/storm/conf/client_jaas.conf Running: /opt/java/hotspot/7/64_bit/jdk1.7.0_79/bin/java -client -Ddaemon.name= -Dstorm.options=java.security.auth.login.config%3D%2Fetc%2Fstorm%2Fconf%2Fclient_jaas.conf -Dstorm.home=/usr/hdp/2.4.2.0-258/storm -Dstorm.log.dir=/var/hadoop/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/hdp/current/storm-client/lib -Dstorm.conf.file= -cp /usr/hdp/2.4.2.0-258/storm/lib/cheshire-5.3.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/log4j-core-2.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/hadoop-auth-2.7.1.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/storm/lib/clojure-1.6.0.jar:/usr/hdp/2.4.2.0-258/storm/lib/clj-stacktrace-0.2.7.jar:/usr/hdp/2.4.2.0-258/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.4.2.0-258/storm/lib/oncrpc-1.0.7.jar:/usr/hdp/2.4.2.0-258/storm/lib/jackson-core-2.3.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/clout-1.0.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/ring-servlet-1.3.0.jar:/usr/hdp/2.4.2.0-258/storm/lib/ring-json-0.3.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/kryo-2.21.jar:/usr/hdp/2.4.2.0-258/storm/lib/jline-0.9.94.jar:/usr/hdp/2.4.2.0-258/storm/lib/tigris-0.1.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/reflectasm-1.07-shaded.jar:/usr/hdp/2.4.2.0-258/storm/lib/tools.namespace-0.2.4.jar:/usr/hdp/2.4.2.0-258/storm/lib/ring-devel-1.3.0.jar:/usr/hdp/2.4.2.0-258/storm/lib/java.classpath-0.2.2.jar:/usr/hdp/2.4.2.0-258/storm/lib/javax.servlet-2.5.0.v201103041518.jar:/usr/hdp/2.4.2.0-258/storm/lib/compojure-1.1.3.jar:/usr/hdp/2.4.2.0-258/storm/lib/core.incubator-0.1.0.jar:/usr/hdp/2.4.2.0-258/storm/lib/ring-core-1.1.5.jar:/usr/hdp/2.4.2.0-258/storm/lib/gmetric4j-1.0.7.jar:/usr/hdp/2.4.2.0-258/storm/lib/ns-tracker-0.2.2.jar:/usr/hdp/2.4.2.0-258/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.4.2.0-258/storm/lib/commons-codec-1.6.jar:/usr/hdp/2.4.2.0-258/storm/lib/disruptor-2.10.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/asm-4.0.jar:/usr/hdp/2.4.2.0-258/storm/lib/zookeeper.jar:/usr/hdp/2.4.2.0-258/storm/lib/log4j-slf4j-impl-2.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/storm-core-0.10.0.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/storm/lib/tools.logging-0.2.3.jar:/usr/hdp/2.4.2.0-258/storm/lib/log4j-api-2.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/hiccup-0.3.6.jar:/usr/hdp/2.4.2.0-258/storm/lib/minlog-1.2.jar:/usr/hdp/2.4.2.0-258/storm/lib/slf4j-api-1.7.7.jar:/usr/hdp/2.4.2.0-258/storm/lib/jackson-dataformat-smile-2.3.1.jar:/usr/hdp/2.4.2.0-258/storm/lib/ring-jetty-adapter-1.3.0.jar:/usr/hdp/2.4.2.0-258/storm/lib/clj-time-0.8.0.jar:storm-starter-0.0.1-storm-0.9.0.1.jar:/usr/hdp/current/storm-supervisor/conf:/usr/hdp/2.4.2.0-258/storm/bin -Dstorm.jar=storm-starter-0.0.1-storm-0.9.0.1.jar storm.starter.WordCountTopology WordCount
18:54:08.293 [main] INFO b.s.u.Utils - Using defaults.yaml from resources
18:54:08.362 [main] INFO b.s.u.Utils - Using storm.yaml from resources
18:54:08.421 [main] INFO b.s.u.Utils - Using defaults.yaml from resources
18:54:08.441 [main] INFO b.s.u.Utils - Using storm.yaml from resources
18:54:08.458 [main] INFO b.s.StormSubmitter - Generated ZooKeeper secret payload for MD5-digest: -6645485375203566088:-8607446551035289369
18:54:08.459 [main] INFO b.s.s.a.AuthUtils - Got AutoCreds []
18:54:08.474 [main] INFO b.s.u.StormBoundedExponentialBackoffRetry - The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
18:54:08.532 [main] INFO o.a.s.z.Login - successfully logged in.
18:54:08.782 [main] INFO b.s.u.StormBoundedExponentialBackoffRetry - The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
18:54:08.783 [main] INFO o.a.s.z.Login - successfully logged in.
18:54:08.895 [main] INFO b.s.u.StormBoundedExponentialBackoffRetry - The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
18:54:08.896 [main] INFO o.a.s.z.Login - successfully logged in.
18:54:09.015 [main] INFO b.s.u.StormBoundedExponentialBackoffRetry - The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
18:54:09.017 [main] INFO o.a.s.z.Login - successfully logged in.
18:54:09.137 [main] INFO b.s.u.StormBoundedExponentialBackoffRetry - The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
18:54:09.138 [main] INFO o.a.s.z.Login - successfully logged in.
18:54:09.261 [main] INFO b.s.u.StormBoundedExponentialBackoffRetry - The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
18:54:09.262 [main] INFO o.a.s.z.Login - successfully logged in.
Exception in thread "main" java.lang.RuntimeException: AuthorizationException(msg:fileUpload is not authorized)
at backtype.storm.StormSubmitter.submitJarAs(StormSubmitter.java:399)
at backtype.storm.StormSubmitter.submitTopologyAs(StormSubmitter.java:229)
at backtype.storm.StormSubmitter.submitTopology(StormSubmitter.java:271)
at backtype.storm.StormSubmitter.submitTopology(StormSubmitter.java:157)
at storm.starter.WordCountTopology.main(WordCountTopology.java:77)
Caused by: AuthorizationException(msg:fileUpload is not authorized)
at backtype.storm.generated.Nimbus$beginFileUpload_result$beginFileUpload_resultStandardScheme.read(Nimbus.java:13616)
at backtype.storm.generated.Nimbus$beginFileUpload_result$beginFileUpload_resultStandardScheme.read(Nimbus.java:13594)
at backtype.storm.generated.Nimbus$beginFileUpload_result.read(Nimbus.java:13536)
at org.apache.thrift7.TServiceClient.receiveBase(TServiceClient.java:78)
at backtype.storm.generated.Nimbus$Client.recv_beginFileUpload(Nimbus.java:462)
at backtype.storm.generated.Nimbus$Client.beginFileUpload(Nimbus.java:450)
at backtype.storm.StormSubmitter.submitJarAs(StormSubmitter.java:370)
... 4 more
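For anyone else following along, before the submit I'm also making sure the Kerberos side is sane; roughly this, with the principal and keytab as placeholders for whatever your environment uses:
# Check for a valid TGT for the submitting user
klist
# If there is no ticket (or it's expired), renew from the keytab (placeholder paths/principal)
kinit -kt /etc/security/keytabs/myuser.keytab myuser@EXAMPLE.COM
# Then submit, pointing the client at the client JAAS file as in the run above
storm jar storm-starter-0.0.1-storm-0.9.0.1.jar storm.starter.WordCountTopology WordCount -c java.security.auth.login.config=/etc/storm/conf/client_jaas.conf
The fileUpload denial itself looks like an authorization/policy question rather than a login problem, so the above only rules out the Kerberos piece on the client side.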
01-30-2017
06:36 PM
Based on additional digging since I posted this question, it looks like the HWX docs are wrong to refer to port 6667; 6627 appears to be the default and is what is currently set on my cluster. I'm still fuzzy on whether I should change java.security.auth.login.config in storm.yaml to point to client_jaas.conf rather than storm_jaas.conf, and whether that would mean the node where this change is made would ONLY function as a client.
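For the record, this is how I double-checked the port on our cluster; nothing fancy, just confirming what's configured and what's listening:
# What the cluster config actually has
grep nimbus.thrift.port /etc/storm/conf/storm.yaml
# On the nimbus host, confirm something is listening on that port
netstat -tln | grep 6627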
01-30-2017
04:07 PM
I've read the documentation at https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_secure-storm-ambari/content/ch_secure-storm-designating-node.html, but I am confused. The "Use an Existing Storm Node" section mentions creating a .storm directory for each user, but doesn't say whether anything should be put in the new directory. The next step talks about adding settings to /etc/storm/conf/storm.yaml, but all of those settings appear to already exist, and two have values that differ from what the docs suggest:

nimbus.thrift.port: current value 6627; docs suggest 6667
java.security.auth.login.config: current value '/usr/hdp/current/storm-supervisor/conf/storm_jaas.conf'; docs suggest "/etc/storm/conf/client_jaas.conf"

It's unclear to me whether I should change these settings to match the docs. The port difference could just be how it happens to be set on my cluster, or a different port might be used by clients vs. the Storm processes communicating with each other. If I make the java.security.auth.login.config change, will this node only be usable as a client?
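If it helps anyone else reading this, the interpretation I'm leaning toward (not something the docs confirm) is that the per-user .storm directory is meant to hold a client-side storm.yaml, so the node-wide /etc/storm/conf/storm.yaml stays untouched. Something like this, with the values copied from the comparison above rather than verified guidance:
# Guess at what would go in the per-user .storm directory (unconfirmed)
mkdir -p ~/.storm
cat > ~/.storm/storm.yaml <<'EOF'
nimbus.thrift.port: 6627
java.security.auth.login.config: '/etc/storm/conf/client_jaas.conf'
EOF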
Labels: Apache Storm
10-31-2016
02:58 PM
I thought those 2 settings pre-dated the introduction of ACID tables. I can understand the "External tables cannot be ACID tables..." part, but I would think those settings could be used to allow users to issue an "exclusive lock" on an external table to prevent reading from it through Hive while external jobs manipulate the underlying files....
10-27-2016
07:51 PM
From what I'm hearing from other sources, this answer was inaccurate and fails to take into consideration how our cluster is being used. I disagree with it being tagged "best answer".
09-07-2016
07:37 PM
Looks like the features I was after have been back-ported. It gets confusing when the primary documentation Hortonworks points users to is the Apache docs, which state that these features are in 1.6... I don't understand why HWX seems to be avoiding 1.6.
09-06-2016
07:38 PM
Any word on when Flume 1.6 is likely to be rolled into HDP?
I haven't switched to 2.5.0 yet, but was disappointed to see the release notes indicate Flume 1.5.2 is still the latest in HDP 2.5, even though Flume 1.6 has been out for over a year....
09-01-2016
09:57 PM
1 Kudo
We seem to be getting bitten by HIVE-10809, where Pig scripts using org.apache.hive.hcatalog.pig.HCatStorer are leaving behind empty _scratch directories with every run. Does anyone have a suggested workaround, or does someone in the HWX community working on Pig have any update on this JIRA? (It has a status of Patch Available, but hasn't been updated since May 2015.)
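The only stopgap we've come up with so far is a scheduled cleanup after the Pig runs. A rough sketch, with the warehouse path and the scratch-directory naming pattern as assumptions about what HCatStorer leaves behind on our cluster:
# List leftover scratch dirs under the table/partition location (path is an example)
hdfs dfs -ls /apps/hive/warehouse/mydb.db/mytable | grep -i _scratch
# Remove them once we're sure nothing is still writing (glob pattern is an assumption)
hdfs dfs -rm -r -skipTrash '/apps/hive/warehouse/mydb.db/mytable/_SCRATCH*'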
Labels: Apache Hive, Apache Pig
09-01-2016
05:47 PM
2 Kudos
I've been asked to set hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager and hive.support.concurrency = true, because a subset of users is concerned about dirty reads on an external table while an external job runs to consolidate small files within a partition, so they want to take an exclusive lock during the consolidation.... Anyone know of a reason I should be wary of the above settings? Is there potential for performance impacts for other jobs/users that have no need for the above settings? I guess another question would be: does "lock table" even work on an external table? Thx, -Vince
Labels: Apache Hive
09-01-2016
05:08 PM
I'm interested in the telecom use case, too.... We're dealing with hourly ingests that result in a number of small files we'd like regularly compacted...
08-23-2016
05:04 PM
I think our hive-env template is missing the section related to setting the heapsize.... We're reaching out to support for a fresh copy of that.
08-23-2016
04:44 PM
1 Kudo
Ambari's hive config includes an entry for "Metastore Heap Size" and the context help says this corresponds to hive.metastore.heapsize, but I can't find any other reference to this parameter in Apache or any hadoop vendor's documentation. Is this actually a parameter? Where is it set? The value we are setting in Ambari does not appear to be affecting the actual heap being used on our server. (Ambari 2.2.2.0)
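In case it's useful, this is how I've been checking what heap the metastore actually launched with. It's just a process grep, not an answer about where Ambari is supposed to wire the value in:
# Find the running metastore process and pull out its -Xmx setting
ps -ef | grep -i '[H]iveMetaStore' | tr ' ' '\n' | grep -i '^-Xmx'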
Labels: Apache Ambari, Apache Hive
08-16-2016
01:30 PM
1 Kudo
Unfortunately, my group was using external tables as an easier way to deal with quotas in a "multi-tenant" cluster and to impose some governance on Hive. (i.e., Most users/groups can only create external tables, and the files need to be landed in their assigned folder in HDFS. DBAs control internal tables in Hive.) Somewhere, we missed the "basic premise" that the data in external tables won't change....
08-15-2016
09:00 PM
1 Kudo
A user recently asked about locking hive tables to make sure reads are consistent, and that led me to the Apache documentation on hive transactions where I saw the following: External tables cannot be made ACID tables since the changes on external tables are beyond the control of the compactor (HIVE-13175). This leads me to wonder whether updated/comprehensive documentation exists on the differences between internal and external tables in hive. Traditionally, the explanation of the difference between the two has been that hive maintains both the data and metadata with internal tables, so dropping an internal table will drop the data and metadata, while dropping an external table will only drop the metadata, but otherwise, they're functionally equivalent. The note above regarding ACID/transactions suggests internal and external table capabilities/features are diverging.... Thoughts? Thanks in advance!
Tags: Data Processing, Hive
Labels: Apache Hive