Member since
09-21-2015
133
Posts
130
Kudos Received
24
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4817 | 12-17-2016 09:21 PM |
| | 2890 | 11-01-2016 02:28 PM |
| | 1180 | 09-23-2016 09:50 PM |
| | 2038 | 09-21-2016 03:08 AM |
| | 1126 | 09-19-2016 06:41 PM |
03-16-2016
10:36 PM
3 Kudos
There are a couple of things you should do:
1. Instead of configuring manually, enable Phoenix under the HBase settings in Ambari by clicking the slider. This puts all the jars and XML configs in the correct place.
2. Connect to sqlline using /usr/hdp/current/phoenix-client/bin/sqlline.py zookeeper_host:2181:/hbase-unsecure

In your example, you have a space instead of ":" between 2181 and /hbase-unsecure.
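Once connected, a quick sanity check is to query Phoenix's own metadata table (SYSTEM.CATALOG exists on any working install, so this should return rows even before you create tables):

```sql
-- List the tables Phoenix knows about; a fresh install will at least
-- show the SYSTEM tables (SYSTEM.CATALOG, SYSTEM.STATS, ...).
SELECT DISTINCT TABLE_NAME FROM SYSTEM.CATALOG;
```

If this returns results, your sqlline connection string and the Phoenix server-side jars are both working.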
03-09-2016
08:54 PM
2 Kudos
Hi Roy, please have a look at Apache Phoenix and its views feature. This will let you define a base set of columns (producer_id, timestamp, event_type, etc.) and also, within the same table, create additional logical views per record type. Your use case sounds similar to the product_metrics table and the specific mobile_product_metrics example given in the link above. Once your views are defined, you can query them to get metadata to apply to the records in your ingest queue. Phoenix views support issuing upsert statements to write new data. Regarding changing schemas: Phoenix views can be altered at will as your schemas change.
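As a rough sketch of the pattern (the column names beyond the example's producer_id/event_type are made up for illustration), the base table plus a per-record-type view might look like:

```sql
-- Base table holding all event records (hypothetical columns).
CREATE TABLE product_metrics (
    producer_id VARCHAR NOT NULL,
    ts          DATE NOT NULL,
    event_type  VARCHAR,
    CONSTRAINT pk PRIMARY KEY (producer_id, ts)
);

-- Logical view for one record type, adding columns specific to it.
CREATE VIEW mobile_product_metrics (carrier VARCHAR, device_type VARCHAR)
    AS SELECT * FROM product_metrics WHERE event_type = 'mobile';

-- Views accept upserts, and can be altered as the schema evolves.
UPSERT INTO mobile_product_metrics
    (producer_id, ts, event_type, carrier, device_type)
    VALUES ('p1', CURRENT_DATE(), 'mobile', 'carrier_x', 'phone');

ALTER VIEW mobile_product_metrics ADD os_version VARCHAR;
```

All record types share the one underlying HBase table, but each view exposes only its own logical schema.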
03-01-2016
04:07 PM
1 Kudo
You'll need to install rstudio-server manually. The Ambari r-service above only installs the R interpreter (and some R packages).
02-18-2016
05:12 PM
2 Kudos
While Phoenix does let you do joins, they are something of an anti-pattern for NoSQL engines. Ideally you would pre-join and denormalize your data with Spark, Hive, or Pig before storing it in Phoenix. If that is not an option, use the sort-merge hint recommended above and/or use a filter predicate that restricts the input size of your join, e.g.:

select * from (select * from customer where id = '123') a join events b on a.id = b.id

In the query above, you'd want id to be the first field in your rowkey/primary key. This lets the query execute as a range scan over _just_ the range of keys matching the specified id, instead of a full table scan. If you can't provide a restrictive predicate, Phoenix queries will still execute in parallel fashion, but north of several million rows, Hive or Spark queries over columnar compressed ORC or Parquet files will be significantly faster.

Phoenix also has integration with Spark, in case you need to read from live Phoenix tables to perform online analysis. Spark is much better suited for analytical queries than Phoenix, but keep in mind it will be a table scan unless you use predicate pushdown configurations.
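For reference, here is the kind of key layout that makes that range scan possible (hypothetical DDL, assuming id is your leading key column in both tables):

```sql
-- id leads the primary key, so "where id = '123'" scans only the
-- matching slice of the rowkey space instead of the whole table.
CREATE TABLE customer (
    id   VARCHAR NOT NULL,
    name VARCHAR,
    CONSTRAINT pk PRIMARY KEY (id)
);

CREATE TABLE events (
    id       VARCHAR NOT NULL,
    event_ts DATE NOT NULL,
    payload  VARCHAR,
    CONSTRAINT pk PRIMARY KEY (id, event_ts)
);

-- The sort-merge hint plus the restrictive subquery keeps the join
-- input small; both sides are already sorted on id by their keys.
SELECT /*+ USE_SORT_MERGE_JOIN */ *
FROM (SELECT * FROM customer WHERE id = '123') a
JOIN events b ON a.id = b.id;
```

Without a predicate like id = '123' on the leading key column, either side of the join degrades to a full table scan.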
02-17-2016
09:32 PM
1 Kudo
Hi @David Tam, for a working example using phoenix-spark to read/write HBase DataFrames, check out https://github.com/randerzander/HiveToPhoenix
02-17-2016
02:16 AM
3 Kudos
Hi Sunile, unfortunately you cannot directly manage "Phoenix table" permissions via Ranger. However, creating a Phoenix table creates an underlying HBase table of the same name. To answer your question directly: yes, setting Ranger permissions on those underlying HBase tables (as in Neeraj's demo) protects the data in your Phoenix tables. Note that Phoenix also uses a set of "SYSTEM" HBase tables (SYSTEM.CATALOG, SYSTEM.STATS, etc.) which you may also want to secure with Ranger policies.
01-28-2016
07:17 PM
2 Kudos
Hi @Landon Robinson, take a look at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_dataintegration/content/hive-jdbc-odbc-drivers.html

If you're writing Java code, the easiest to use is probably the provided JDBC driver. You can get the current version of the JDBC driver in one of two ways:
1. At /usr/hdp/current/hive-client/lib/hive-jdbc.jar on your edge/client node
2. here - you'll need to pick the jar corresponding to your HDP version (Ambari shows this under the Admin -> Stacks and Versions -> Versions tab, or you can display it on the CLI by typing "hdp-select", which shows the version number of each installed component).
12-11-2015
12:01 AM
1 Kudo
You'll need to use the ODBC Data Source Administrator tool to uncheck "Use Native Query" from your ODBC driver installation.
12-10-2015
06:39 PM
3 Kudos
NiFi works on the Raspberry Pi without any custom configuration. Follow the normal "Getting Started" steps here.
One thing to know: because of the large number of standard NARs, "service nifi start" will take a long time (5-10 minutes) to complete on the Pi. Once it is done, the web UI will function normally.
12-08-2015
06:30 AM
5 Kudos
The installation and quick-start steps now work with HDP 2.3.2 out of the box. Make sure you download the binary for HBase 1.1.