About dmueller1607

Abhishek_721 · ‎04-10-2020

Has this issue been resolved? I am also facing the same issue. Please suggest a solution.

wblake · ‎02-11-2020

Hello @mburgess, suppose you have an input string whose value is not known in advance, e.g. it could be "8" or "8.4", and you want to convert it into a number. Is there a way to convert the "8" to the int 8 and the "8.4" to the float 8.4? Currently I am only able to convert both to ints (8 and 8), or both to floats (8.0 and 8.4). For context, I am using a ValidateRecord to validate that the number is an int, so float values should not pass validation. This means that when an input is converted from a string into a number, I need to know whether it is a decimal or an integer. Are you able to please assist? Many thanks!
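
The distinction asked about here can be sketched in plain Java, outside of any NiFi processor: try to parse the string as an integer first and fall back to a decimal only if that fails. The class and method names below are hypothetical and only illustrate the idea; they are not a NiFi API.

```java
import java.math.BigDecimal;

// Hypothetical helper (not part of NiFi): keeps "8" an integer and "8.4" a decimal.
final class NumberParser {

    // Returns a Long for integer-looking input and a BigDecimal otherwise.
    // Throws NumberFormatException if the input is not numeric at all.
    static Number parse(String raw) {
        String s = raw.trim();
        try {
            return Long.valueOf(s);
        } catch (NumberFormatException notAnInteger) {
            return new BigDecimal(s);
        }
    }

    public static void main(String[] args) {
        for (String input : new String[] {"8", "8.4"}) {
            Number n = parse(input);
            System.out.println(input + " -> " + n + " (" + n.getClass().getSimpleName() + ")");
        }
    }
}
```

Note that with this approach "8.0" is treated as a decimal, because it does not parse as an integer; whether that is the desired behaviour depends on the validation rules.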

dmueller1607 · ‎11-25-2019

To answer my own question: Since I'm using multiple partitions for the Kafka topic, Spark uses more executors to process the data. Also Hive/Tez creates as many worker containers as the topic contains partitions.

dmueller1607 · ‎11-14-2018

I found the following Java-based solution, using the Dataset.filter method with a FilterFunction: https://spark.apache.org/docs/2.3.0/api/java/index.html?org/apache/spark/sql/Dataset.html

So, my code now looks like this:

    Dataset<Row> dsResult = sqlC.read()
        .format("org.apache.phoenix.spark")
        .option("table", tableName)
        .option("zkUrl", hbaseUrl).load()
        .where("OTHER_COLUMN = " + inputId)
        .filter(row -> {
            long readTime = row.getTimestamp(row.fieldIndex("TABLE_TS_COL")).getTime();
            long tsFrom = new Timestamp(sdf.parse(dateFrom).getTime()).getTime();
            long tsTo = new Timestamp(sdf.parse(dateTo).getTime()).getTime();
            return readTime >= tsFrom && readTime <= tsTo;
        });

dmueller1607 · ‎10-15-2018

Solved it: Phoenix arrays are 1-based, so the following query works:

    SELECT REGEXP_SPLIT(ROWKEY, ':')[1] as test, count(1)
    FROM "my_view"
    GROUP BY REGEXP_SPLIT(ROWKEY, ':')[1]
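
For anyone coming from Java, where arrays are 0-based, here is a small sketch of the index shift. The row key value and the helper name are made up for illustration, and returning null for an out-of-range index is just a choice made in this sketch.

```java
// Hypothetical illustration of 1-based element access on a 0-based Java array.
final class OneBasedIndex {

    // Returns the element at the given 1-based position, or null if out of range
    // (the null result is an assumption of this sketch).
    static String element(String[] parts, int oneBasedIndex) {
        if (oneBasedIndex < 1 || oneBasedIndex > parts.length) {
            return null;
        }
        return parts[oneBasedIndex - 1];
    }

    public static void main(String[] args) {
        // Made-up row key with ':' as separator, as in the query above.
        String[] parts = "user42:2018-10-15:click".split(":");
        // Position 1 here corresponds to parts[0] in Java.
        System.out.println(element(parts, 1));
    }
}
```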

dmueller1607 · ‎09-25-2018

The problem was solved after changing the MySQL Database URL from

    jdbc:mysql://xxxx.yyyy/hive?createDatabaseIfNotExist=true

to

    jdbc:mysql://xxxx.yyyy/hive?createDatabaseIfNotExist=true&serverTimezone=Europe/Berlin

I found the relevant information here: https://community.hortonworks.com/questions/218023/error-setting-up-hive-on-hdp-265timezone-on-mysql.html
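
If the URL is assembled in code rather than typed into a config field, the same change can be sketched as below. The helper is hypothetical: it only appends a query parameter with the right separator and does no URL encoding, which is enough for a value like Europe/Berlin.

```java
// Hypothetical helper: appends one query parameter to a JDBC URL.
final class JdbcUrlParams {

    // Uses '&' if the URL already has a query string, otherwise '?'.
    // No escaping is applied to key or value (assumption: they are URL-safe).
    static String withParam(String url, String key, String value) {
        String separator = url.contains("?") ? "&" : "?";
        return url + separator + key + "=" + value;
    }

    public static void main(String[] args) {
        String base = "jdbc:mysql://xxxx.yyyy/hive?createDatabaseIfNotExist=true";
        System.out.println(withParam(base, "serverTimezone", "Europe/Berlin"));
    }
}
```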

dmueller1607 · ‎09-14-2018

@Felix Albani Thank you for your help! Without the LIMIT clause, the Job works perfectly (and in parallel).

dmueller1607 · ‎09-03-2018

Thank you! Can you give me some details about this or do you have some helpful links?

kgautam · ‎08-09-2018

1. COUNT will result in a full table scan, hence the query is slow.
2. A WHERE on the primary key will be fast, as it does a lookup and not a scan.
3. A WHERE on any column apart from the primary key will result in an HBase full table scan.
4. Analyse the table once to speed up count queries. This will not affect a WHERE on non-primary-key columns.

arald · ‎08-07-2018

Have a look here: https://community.hortonworks.com/questions/88526/how-to-salt-row-key-in-hbase-table.html Basically, it says that your prefix should be defined in such a way that you can also calculate it at query time. In your (perhaps simplified) example, that might be prefix 000 for even numbers and prefix 001 for odd numbers.
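
A minimal sketch of such a computable prefix, assuming a numeric id and a ':' between prefix and id (both the key layout and the names are assumptions, not taken from the linked thread):

```java
// Hypothetical salting scheme: the prefix is derived from the id itself,
// so it can be recomputed when building the row key for a lookup.
final class SaltedKey {

    // Even ids get prefix 000, odd ids get prefix 001.
    static String prefixFor(long id) {
        return (id % 2 == 0) ? "000" : "001";
    }

    // Assumed key layout for this sketch: <prefix>:<id>
    static String rowKeyFor(long id) {
        return prefixFor(id) + ":" + id;
    }

    public static void main(String[] args) {
        System.out.println(rowKeyFor(42));
        System.out.println(rowKeyFor(7));
    }
}
```

Because the prefix follows from the id, a reader can rebuild the full row key and do a direct get instead of scanning every salt bucket.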

Member Since	‎04-24-2017 12:08 PM
Last Visited	‎11-25-2019 04:11 AM
Posts	106
Kudos received	13

Cloudera Community

Re: Spark Streaming / Hive + Kafka: Only one Worke...

Re: Filter a Phoenix Timestamp Column in SparkSQL ...

Re: Phoenix Query with Split operation on String (...

Re: Hive Metastore not starting in HDP 3.0

Re: HFile creation from Hive Table not working

Re: Reading external Hive table from Spark in Hado...

Re: Convert JSON Attribute to Number in NiFi workf...

Re: Spark SQL: Limit clause performance issues

Re: SparkSQL: Hive sub-query leads to full table s...

Re: Accessing HBase Table through Hive is very slo...

Re: Use HBase Shell Scan method to search in salte...