Member since: 01-03-2020
Posts: 11
Kudos Received: 0
Solutions: 0
02-26-2020
12:00 AM
In the schema definition for name and address, set "default": "null" instead of "default": null.
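A minimal sketch of the suggested change, assuming the schema is an Avro-style JSON record definition (the record name and field types here are assumptions for illustration; the schema is built as Python dicts so the serialized form is visible):

```python
import json

# Hypothetical Avro-style schema fragment. Per the suggestion above, the
# "default" for the name/address fields is the *string* "null" rather than
# the JSON literal null.
schema = {
    "type": "record",
    "name": "Person",  # assumed record name, not from the original post
    "fields": [
        {"name": "name", "type": ["null", "string"], "default": "null"},
        {"name": "address", "type": ["null", "string"], "default": "null"},
    ],
}

# Serialized, the field reads: {"name": "name", ..., "default": "null"}
# whereas the literal-null variant would serialize as: "default": null
print(json.dumps(schema["fields"][0]))
```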
01-31-2020
01:37 AM
@stevenmatison What to do in cases where one is not ingesting data? For example, someone may be trying to delete or add a daily partition of an external table in Hive, and running this statement for each flow file would be a waste of resources. With which processors would I be able to implement such a counter, and how do I manage to reset it every day?
01-31-2020
12:12 AM
@Wynner I do not know the time of the first flowfile. Sorry, but I couldn't understand why the solution would be different for different processors. Can you please suggest something for a processor like PutHive? "The processor can ignore all subsequent flowfiles. Does this mean the subsequent flow files can be removed?" Yes, the subsequent flowfiles for the day should be removed/ignored for that processor.
01-30-2020
01:51 AM
Any suggestions?
01-28-2020
06:06 AM
How can I schedule a NiFi processor to run only when it receives the first flow file of the day?
The processor can ignore all subsequent flowfiles.
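In NiFi, a marker like this would normally live in processor state (or be driven by a Wait/Notify pattern) so it survives restarts; as a language-agnostic sketch of the gating logic only, here is the idea in Python (the `DailyFirstGate` name and in-memory attribute are assumptions, not a NiFi API):

```python
from datetime import date

class DailyFirstGate:
    """Pass only the first event of each calendar day; ignore the rest.

    Illustration only: real NiFi logic would persist the last-seen day
    in processor state rather than an in-memory attribute.
    """

    def __init__(self):
        self._last_day = None  # calendar day of the last event that passed

    def should_process(self, today=None):
        today = today or date.today()
        if today != self._last_day:
            self._last_day = today  # first flowfile of this day: let it run
            return True
        return False  # subsequent flowfiles the same day are ignored

gate = DailyFirstGate()
```

Resetting "every day" then falls out naturally: the gate reopens as soon as the calendar day of an incoming flowfile differs from the stored one, so no explicit midnight reset is needed.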
Labels:
- Apache NiFi
01-28-2020
05:16 AM
@Wynner Can you please suggest how I can schedule a processor so that it runs only when it receives the first flowfile of the day?
01-24-2020
04:45 AM
@daisuke_baba were you able to resolve your issue?
01-15-2020
04:51 AM
@stevenmatison I had referred to that post, but unfortunately it is not working even when the table is in CSV/text format.
01-15-2020
12:10 AM
STACK: HDP 3.1.0.0, Hive 3.1.0
Trying to store timestamps like **2020-01-03T02:46:21.148+02:00** in an **ORC** hive table.
Storing the timestamps using the timestamp datatype gives NULL on querying, which is expected, as this is not the format Hive expects its timestamps to be in.
But as per the documentation, if we set the appropriate timestamp-format SerDe property, then Hive should be able to read the timezones.
*"On the table level, alternative timestamp formats can be supported by providing the format to the SerDe property "timestamp.formats" (as of release 1.2.0 with HIVE-9298). For example, yyyy-MM-dd'T'HH:mm:ss.SSS,yyyy-MM-dd'T'HH:mm:ss."*
However, despite running **ALTER TABLE <tablename> SET SERDEPROPERTIES ("timestamp.formats"="yyyy-MM-dd'T'HH:mm:ss.SSSXXXXX");**, the property seems to have no effect and Hive is still not able to read it. I tried multiple variations of the format, but none worked.
Creating the table as textfile/CSV doesn't help either.
To replicate -
create table default.test_tz (tt timestamp) stored as orc;
insert into default.test_tz values ("2020-01-03T02:46:21.148+02:00"), ("2017-02-16T11:24:29.000Z"), ("2017-02-16 11:24:29"), ("2019-06-15T15:43:19");
ALTER TABLE default.test_tz SET SERDEPROPERTIES ("timestamp.formats"="yyyy-MM-dd'T'HH:mm:ss.SSSXXXXX");
**Should I just store it as a string? What is the best practice in such cases?**
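If the values are stored as strings, or normalized before insert, the ISO-8601 offsets can be converted up front to Hive's default `yyyy-MM-dd HH:mm:ss.SSS` form. A minimal Python 3.7+ sketch of that normalization (the helper name is an assumption; zone-aware values are converted to UTC, naive ones kept as-is):

```python
from datetime import datetime, timezone

def to_hive_timestamp(ts: str) -> str:
    """Normalize an ISO-8601 timestamp, possibly with a zone offset or a
    trailing 'Z', to Hive's default 'yyyy-MM-dd HH:mm:ss.SSS' format."""
    # fromisoformat (Python 3.7+) does not accept 'Z', so map it to +00:00
    dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc)  # store everything in UTC
    # %f is microseconds (6 digits); keep the first three (milliseconds)
    return dt.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]

print(to_hive_timestamp("2020-01-03T02:46:21.148+02:00"))  # 2020-01-03 00:46:21.148
```

Values stored this way round-trip cleanly through the `timestamp` datatype without any SerDe property, at the cost of fixing the zone convention (here UTC) at write time.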
01-09-2020
02:21 AM
What should be done in a scenario where there is no possibility of a maintenance window? It is quite contradictory to what the documentation has to say: "All compactions are done in the background and do not prevent concurrent reads and writes of the data. After a compaction the system waits until all readers of the old files have finished and then removes the old files." https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Compactor
01-03-2020
03:36 AM
I have a Hive 3 transaction-enabled table which is being streamed into by the NiFi PutHive3Streaming processor. When I try to query the table (select count(*)), it fails with:
org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: <HDFS path to managed table>/<partition>/delta<...>/bucket_0000*
I could also observe compaction jobs being run for the table, which I believe is the reason for the delta file being deleted, but I cannot understand why the query is failing. Can we not query a table for which a compaction job is being run? I am on HDP-3.1.0.0 running Hive on Tez.
Labels:
- Apache Hive
- Apache NiFi