Member since
09-18-2015
3274
Posts
1159
Kudos Received
426
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 45122 | 02-09-2016 06:13 PM |
02-06-2016
06:11 PM
10 Kudos
OLAP (Online Analytical Processing) is the technology behind many Business Intelligence (BI) applications. OLAP is a powerful technology for data discovery, including capabilities for limitless report viewing, complex analytical calculations, and predictive “what if” scenario (budget, forecast) planning. It performs multidimensional analysis of business data and provides the capability for complex calculations, trend analysis, and sophisticated data modeling. It is the foundation for many kinds of business applications for Business Performance Management, Planning, Budgeting, Forecasting, Financial Reporting, Analysis, Simulation Models, Knowledge Discovery, and Data Warehouse Reporting. OLAP enables end users to perform ad hoc analysis of data in multiple dimensions, thereby providing the insight and understanding they need for better decision making.
OLAP solutions
Open source
Apache Kylin
http://kylin.apache.org/
Apache Kylin™ is an open source distributed analytics engine designed to provide a SQL interface and multi-dimensional analysis (OLAP) on Hadoop supporting extremely large datasets, originally contributed by eBay Inc.
- Extremely Fast OLAP Engine at Scale: Kylin is designed to reduce query latency on Hadoop for 10+ billion rows of data
- ANSI SQL Interface on Hadoop: Kylin offers ANSI SQL on Hadoop and supports most ANSI SQL query functions
- Interactive Query Capability: Users can interact with Hadoop data via Kylin at sub-second latency, better than Hive queries for the same dataset
- MOLAP Cube: Users can define a data model and pre-build it in Kylin with more than 10 billion raw data records
- Seamless Integration with BI Tools: Kylin currently offers integration capability with BI tools like Tableau. Integration with Microstrategy and Excel is coming soon
Other Highlights:
- Job Management and Monitoring
- Compression and Encoding Support
- Incremental Refresh of Cubes
- Leverage HBase Coprocessor for query latency
- Approximate Query Capability for distinct Count (HyperLogLog)
- Easy Web interface to manage, build, monitor and query cubes
- Security capability to set ACL at Cube/Project Level
- Support LDAP Integration
Druid
http://druid.io/druid.html
Druid is an open source data store designed for OLAP queries on event data. The Druid page linked above gives a high-level overview of how Druid stores data and of the architecture of a Druid cluster. Its example data set is composed of three distinct components. If you are acquainted with OLAP terminology, the following concepts should be familiar.
- Timestamp column: We treat timestamp separately because all of our queries center around the time axis.
- Dimension columns: Dimensions are string attributes of an event, and the columns most commonly used in filtering the data. We have four dimensions in our example data set: publisher, advertiser, gender, and country. They each represent an axis of the data that we’ve chosen to slice across.
- Metric columns: Metrics are columns used in aggregations and computations. In our example, the metrics are clicks and price. Metrics are usually numeric values, and computations include operations such as count, sum, and mean. Metrics are also known as measures in standard OLAP terminology.
Commercial
AtScale
http://www.atscale.com/
AtScale turns your Hadoop cluster into a scale-out OLAP server. Now you can use your BI tool of choice, from Tableau to Microstrategy to Microsoft Excel, to connect to and query data in Hadoop, with no extra layers in between.
- Dynamic, virtual cubes present complex data as simple measures and dimensions
- Support for virtually any BI tool that can talk SQL or MDX
- Analyze billions of rows of data directly on your Hadoop cluster
- Eliminate need for costly data marts, extracts, and custom cubes
- Consistent metric definitions across all users, regardless of BI tool
Kyvos Insights
http://www.kyvosinsights.com/solution
The cubes Kyvos can build and run on Hadoop are orders of magnitude bigger than what could be built on traditional OLAP gear. Instead of getting rid of the granular level of detail that would ordinarily be summarized or aggregated in a traditional OLAP setup, Kyvos can build a specific dimension for each column or field, whether it’s an individual customer or an individual SKU (stock keeping unit).
Cloud option
With Altiscale Data Cloud, the AtScale Intelligence Platform runs on top of enterprise-grade Hadoop in the cloud, reducing time to value, lowering costs and eliminating implementation risk. Since Altiscale runs a complete Hadoop ecosystem for its customers, it also eliminates one of Hadoop’s greatest challenges: ongoing operational risk. This allows customers to focus on their business goals without losing time and effort to the ongoing burden of Hadoop management.
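To make the timestamp/dimension/metric vocabulary from the Druid description more concrete, here is a minimal Python sketch of an OLAP-style rollup. It is not Druid, Kylin, or any vendor API: the `rollup` function, the `EVENTS` rows, and all of their values are hypothetical and only reuse the column names mentioned above (publisher, advertiser, gender, country, clicks, price).

```python
from collections import defaultdict

# Hypothetical event rows (assumed values, not taken from any real data set):
# one timestamp column, four dimension columns, two metric columns.
EVENTS = [
    {"timestamp": "2011-01-01T01:00:00Z", "publisher": "a.example", "advertiser": "x.example",
     "gender": "Male", "country": "USA", "clicks": 1, "price": 0.5},
    {"timestamp": "2011-01-01T01:05:00Z", "publisher": "a.example", "advertiser": "x.example",
     "gender": "Female", "country": "USA", "clicks": 0, "price": 0.25},
    {"timestamp": "2011-01-01T02:00:00Z", "publisher": "b.example", "advertiser": "y.example",
     "gender": "Male", "country": "UK", "clicks": 1, "price": 1.0},
    {"timestamp": "2011-01-01T02:30:00Z", "publisher": "b.example", "advertiser": "x.example",
     "gender": "Female", "country": "USA", "clicks": 1, "price": 0.75},
]

DIMENSIONS = ("publisher", "advertiser", "gender", "country")
METRICS = ("clicks", "price")


def rollup(events, group_by):
    """Group events by the chosen dimensions and compute count, sum and mean per metric."""
    for dim in group_by:
        if dim not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dim}")

    # Bucket rows by the tuple of dimension values they share.
    groups = defaultdict(list)
    for event in events:
        key = tuple(event[dim] for dim in group_by)
        groups[key].append(event)

    # Aggregate each bucket: these are the "measures" in OLAP terms.
    result = {}
    for key, rows in groups.items():
        summary = {"count": len(rows)}
        for metric in METRICS:
            total = sum(row[metric] for row in rows)
            summary[f"sum_{metric}"] = total
            summary[f"mean_{metric}"] = total / len(rows)
        result[key] = summary
    return result


if __name__ == "__main__":
    # Slice along one axis (country), then along two (publisher and gender).
    for key, summary in rollup(EVENTS, ["country"]).items():
        print(key, summary)
    for key, summary in rollup(EVENTS, ["publisher", "gender"]).items():
        print(key, summary)
```

Engines such as Kylin precompute aggregates like these into cubes ahead of time, whereas this sketch scans every row on each call; it is only meant to show what slicing metrics across dimensions means.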
02-06-2016
08:46 PM
More information https://code.facebook.com/posts/938595492830104/osquery-introducing-query-packs/
03-16-2017
06:28 PM
Neeraj - I followed the original article and am having an issue. I noticed that once I add the group "Public" to the Ranger policies without adding an IP address in the policy condition, users are able to publish and consume from any host. This is what I did.
HDP Version: HDP-2.3.4.0-3485
-- Enabled the Kafka plugin in Ranger.
-- Restarted Ranger.
-- Created the following policies in Ranger (see the image). (Important: added group Public, left policy condition blank.)
-- Logged in to server 21 to produce and consume messages.
-- I was able to produce and consume messages from any server.
What we want is to secure our Kafka environment through Ranger by IP address. I understand that identifying the client user over a non-secure channel is not possible. I followed the following article to secure our Kafka environment. https://cwiki.apache.org/confluence/display/RANGER/Kafka+Plugin#KafkaPlugin-WhydowehavetospecifypublicusergrouponallpoliciesitemscreatedforauthorizingKafkaaccessovernon-securechannel
Please let me know what I am missing.
08-20-2018
09:20 PM
Hi Neeraj, Allowing read and write access for all users to the Phoenix SYSTEM tables is not really secure. Is there any solution to avoid it? Thanks, Helmi
08-23-2016
06:23 PM
Yes, I have heard of it and worked on a POC. When will Hadoop (HDP) be released with IGFS?
04-16-2016
09:13 PM
@Davide Vergari just released a custom Ambari service to install Apache Drill thru Ambari on HDP 2.4. Feel free to take a look at it.
11-30-2018
12:21 AM
Hi Neeraj, I am able to install Presto, but my queries are failing. Did you face a similar error before?
presto:default> show tables;
Query 20181130_001533_00002_5gf9c failed: 10.xxx.xx.xx: null
presto:default> exit
Caused by: org.apache.thrift.transport.TTransportException: 10.xxx.xx.xx:: null
at com.facebook.presto.hive.HiveMetastoreClientFactory.rewriteException(HiveMetastoreClientFactory.java:58)
at com.facebook.presto.hive.HiveMetastoreClientFactory.access$000(HiveMetastoreClientFactory.java:33)