Member since: 09-21-2015
Posts: 133
Kudos Received: 130
Solutions: 24
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 7146 | 12-17-2016 09:21 PM |
| | 4559 | 11-01-2016 02:28 PM |
| | 2265 | 09-23-2016 09:50 PM |
| | 3463 | 09-21-2016 03:08 AM |
| | 1817 | 09-19-2016 06:41 PM |
08-17-2016
01:43 AM
1 Kudo
@Randy Gelhausen NiFi JIRA to capture this idea: https://issues.apache.org/jira/browse/NIFI-2585
08-10-2016
04:18 PM
I found the answer. In the Spark interpreter menu there is a "zeppelin.spark.printREPLOutput" property which you can set to false.
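For reference, this is the property and the value that suppresses the REPL output, as it appears in the Spark interpreter settings:

```
zeppelin.spark.printREPLOutput = false
```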
07-16-2016
02:01 PM
2 Kudos
I don't have any real numbers to share, though I found this article on the topic interesting, and it aligns with what I've observed as well: https://www.maxcdn.com/blog/ssl-performance-myth/. There are relatively few cases where unsecured site-to-site is appropriate in comparison to secured site-to-site.
09-26-2016
09:05 PM
The NiFi team has identified an issue with Hive scripts causing this processor to hang. These Hive commands run MapReduce or Tez jobs that produce a lot of standard output, which is returned to the NiFi processor. If the amount of stdout or stderr returned gets large, the processor can hang. To prevent this from happening, we recommend adding the "-S" option to hive commands or "--silent=true" to beeline commands that are executed using the NiFi script processors.
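For illustration, this is where the flags go; the script name, host, and connection URL below are hypothetical:

```
# Hive CLI: -S (silent) suppresses the job progress output returned to NiFi
hive -S -f my_query.hql

# Beeline equivalent for HiveServer2 connections
beeline -u jdbc:hive2://hiveserver-host:10000 --silent=true -f my_query.hql
```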
07-07-2016
03:34 PM
Loading jars out of HDFS, as enabled by HBASE-1936, would be an alternative to copying the jars to the local filesystem on each node running HBase.
06-13-2016
07:00 PM
If you're looking for a way to just delete the S3 path from your code, and you're using PySpark, then the following will work:

```python
import os

# Shell out to the Hadoop CLI to remove the S3 path, bypassing the trash
cmd = "hdfs dfs -rm -r -skipTrash s3a://my-bucket/test_delete_me"
os.system(cmd)
```
06-13-2016
05:47 PM
Typically no. You're limited in the number of buckets you can create, whereas the number of objects, and thus prefixes, is effectively unlimited. The situation where you want different buckets is where you want to specify different bucket policies; e.g., for data lifecycle (versioning, automatic archival to Glacier), security, and environment (dev, test, prod).

The design of prefixes/key names/directories should then be guided by your access patterns, with the same sorts of considerations you have for organizing data in HDFS. Listings over prefixes and recursive listings can be slow, so if you're going to do listings, you'll want enough hierarchy or structure in your key names that those result sets don't get huge. If you only ever access specific keys, this is less of an issue.
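For illustration only, a hypothetical key layout along those lines, with separate buckets per environment and enough hierarchy that a listing over any one prefix stays small (bucket and key names are made up):

```
s3a://analytics-prod/clickstream/dt=2016-06-13/part-00000
s3a://analytics-prod/clickstream/dt=2016-06-13/part-00001
s3a://analytics-dev/clickstream/dt=2016-06-13/part-00000
```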
08-05-2016
07:30 AM
I ended up creating an additional source upstream that generates "tick" events at my specified interval, then joined the two RDDs. Every interval, the RDD element from the "tick" stream has a non-zero value.
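In case it is useful, here is a minimal sketch of that idea in PySpark Streaming; the batch interval, host/port, and the use of queueStream's default RDD as the "tick" source are assumptions for illustration, not the original code:

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName="tick-join-sketch")
ssc = StreamingContext(sc, 5)  # hypothetical 5-second batch interval

# "Tick" stream: with an empty queue, queueStream returns the default RDD
# every batch, so each interval carries one non-zero marker element.
ticks = ssc.queueStream([], default=sc.parallelize([("tick", 1)]))

# Hypothetical data stream, keyed the same way so the two can be joined.
data = ssc.socketTextStream("localhost", 9999).map(lambda line: ("tick", line))

# Every interval the joined RDD includes the marker from the "tick" stream.
data.join(ticks).pprint()

ssc.start()
ssc.awaitTermination()
```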
04-15-2016
03:22 AM
2 Kudos
Two answers:

- For a rolling window, look into "DistributedSetCache", as that allows the most recent X events to be looked up.
- For time chunking, this JIRA (which you also raised) resolves it: https://issues.apache.org/jira/browse/NIFI-1775
05-24-2016
09:43 AM
1 Kudo
Hi @nejm hadjmbarek, I'm posting my code, which is working fine:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class phoenix_hbase
{
    public static void main(String[] args) throws SQLException
    {
        @SuppressWarnings("unused")
        Statement stmt = null;
        ResultSet rset = null;

        try
        {
            // Load the Phoenix JDBC driver
            Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
        }
        catch (ClassNotFoundException e1)
        {
            System.out.println("Exception Loading Driver");
            e1.printStackTrace();
        }

        try
        {
            // 172.31.124.43 is the address of the VM; the host is not needed
            // if you are running the program from the VM itself
            Connection con = DriverManager.getConnection("jdbc:phoenix:172.31.124.43:2181:/hbase-unsecure");
            stmt = con.createStatement();

            PreparedStatement statement = con.prepareStatement("select * from javatest");
            rset = statement.executeQuery();
            while (rset.next())
            {
                System.out.println(rset.getString("mycolumn"));
            }
            statement.close();
            con.close();
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
    }
}
```