1973 Posts
1225 Kudos Received
124 Solutions
My Accepted Solutions
| Views | Posted |
|---|---|
| 1841 | 04-03-2024 06:39 AM |
| 2856 | 01-12-2024 08:19 AM |
| 1577 | 12-07-2023 01:49 PM |
| 2340 | 08-02-2023 07:30 AM |
| 3223 | 03-29-2023 01:22 PM |
06-12-2016
11:17 AM
It is more likely that principles and techniques from Spark and Flink will enhance MapReduce. Flink is faster, so it would be a better choice than Spark. Tez is a very powerful accelerator.
06-12-2016
03:45 AM
Would you use HBase, or an in-memory data grid like Apache Ignite or Apache Geode?
Labels: Apache HBase
06-11-2016
04:02 PM
The dataset has 600 columns of detailed information on customers, with many kinds of attributes. The data needs to be accessed interactively in reports and through web applications. Typical access is to a few hundred or a few thousand rows (plus summary information), selected by known 20-30 column chunks of related information. The use cases are interactive exploration of the data and extraction of these lists for use elsewhere.
06-11-2016
02:43 PM
If you have 600+ columns and you need to access 20-30 columns at a time, what is the optimal type of storage?
- Hive table stored as ORC with compression, vectorization and optimization; accessed with Tez and properly partitioned and bucketed
- HBase
- HBase with a Phoenix table
- Parquet file
- Avro file
- Accumulo
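To make the first option concrete, here is a minimal sketch that generates HiveQL for a partitioned, bucketed ORC table and a query that reads only a chosen subset of columns. The table name, column names, partition column, bucket count, and the ZLIB codec are all hypothetical choices for illustration, not taken from the actual dataset.

```python
def orc_table_ddl(table, columns, partition_col, bucket_col, buckets):
    """Build a HiveQL CREATE TABLE statement for a partitioned, bucketed ORC table.

    `columns` is a list of (name, hive_type) pairs. The partition column is
    declared in PARTITIONED BY, so it is left out of the main column list.
    """
    body = ",\n  ".join(
        f"`{name}` {hive_type}"
        for name, hive_type in columns
        if name != partition_col
    )
    return (
        f"CREATE TABLE `{table}` (\n  {body}\n)\n"
        f"PARTITIONED BY (`{partition_col}` STRING)\n"
        f"CLUSTERED BY (`{bucket_col}`) INTO {buckets} BUCKETS\n"
        "STORED AS ORC\n"
        'TBLPROPERTIES ("orc.compress"="ZLIB")'
    )


def projection_query(table, wanted, partition_col, partition_value):
    """Build a SELECT that names only the wanted columns and filters on the
    partition column, so a columnar format can skip the other columns."""
    cols = ", ".join(f"`{c}`" for c in wanted)
    return (
        f"SELECT {cols} FROM `{table}` "
        f"WHERE `{partition_col}` = '{partition_value}'"
    )


if __name__ == "__main__":
    # Hypothetical schema: one key plus 600 attribute columns.
    schema = [("customer_id", "BIGINT")] + [
        (f"attr_{i:03d}", "STRING") for i in range(600)
    ]
    print(orc_table_ddl("customers", schema, "region", "customer_id", 32))
    # One "bundle" of 25 related columns, read from a single partition.
    bundle = [f"attr_{i:03d}" for i in range(25)]
    print(projection_query("customers", bundle, "region", "EMEA"))
```

The point of naming the 20-30 columns explicitly is that ORC and Parquet store data by column, so a projection like this only has to read the requested columns rather than all 600.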
06-11-2016
01:16 PM
1 Kudo
Cool.
http://www.slideshare.net/hortonworks/deep-learning-with-hortonworks-and-apache-spark-hortonworks-technical-workshop
Deep Learning with Microsoft: http://www.slideshare.net/mlprague/xuedong-huang-deep-learning-and-intelligent-applications
Deep Learning 4J: http://www.slideshare.net/agibsonccc/brief-introduction-to-distributed-deep-learning
06-10-2016
11:36 PM
1 Kudo
I call Twitter with a filter on some terms, grab the key Twitter attributes, then call a filter service to remove profanities. http://www.purgomalum.com/service/plain?text= and http://www.purgomalum.com/service/json?text= both work and are free REST API services. For fun, I send the tweet as a search keyword to the Guardian using their API (you need to register for a key): http://content.guardianapis.com/search?order-by=newest&q=${tweet}&api-key=StuffNumbers
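Outside of the flow itself, the two REST calls can be sketched in plain Python. This is a rough sketch, not the actual flow: the function names are made up for illustration, it assumes the PurgoMalum JSON endpoint takes the same `text` query parameter as the plain one and returns the cleaned text in a `result` field, and the API key is a placeholder you would replace with your own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PURGOMALUM_JSON_URL = "http://www.purgomalum.com/service/json"
GUARDIAN_SEARCH_URL = "http://content.guardianapis.com/search"


def profanity_filter_url(text):
    """Build the profanity-filter request URL with the text URL-encoded."""
    return f"{PURGOMALUM_JSON_URL}?{urlencode({'text': text})}"


def guardian_search_url(tweet, api_key):
    """Build a Guardian search URL that uses the tweet text as the keyword."""
    params = {"order-by": "newest", "q": tweet, "api-key": api_key}
    return f"{GUARDIAN_SEARCH_URL}?{urlencode(params)}"


def clean_tweet(text, timeout=10):
    """Call the filter service and return the cleaned text.

    Assumption: the JSON response carries the cleaned text under "result".
    """
    with urlopen(profanity_filter_url(text), timeout=timeout) as resp:
        return json.load(resp)["result"]


def search_guardian(tweet, api_key, timeout=10):
    """Call the Guardian search endpoint and return the parsed JSON body."""
    with urlopen(guardian_search_url(tweet, api_key), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only builds the URLs; the network calls are left to the caller.
    print(profanity_filter_url("some tweet text"))
    print(guardian_search_url("some tweet text", "YOUR-API-KEY"))
```

URL-encoding the tweet matters here because tweets routinely contain spaces, `#`, and `&`, any of which would otherwise break the query string.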
06-10-2016
06:29 PM
Has anyone gotten TensorFlow or DeepLearning4J running on top of HDP 2.4? I am interested in the setup.
06-10-2016
03:59 PM
Looks like a good blog post. Looks like it will work fine.