Member since
07-16-2020
08-12-2020
02:01 AM
1. To run Oozie on a time basis (weekly/daily/hourly), follow this document: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_oozie_cron.html
2. For the Oozie email action, follow this document: http://oozie.apache.org/docs/5.2.0/DG_EmailActionExtension.html
07-24-2020
01:21 PM
The row counts reflect the state of the partition or table the last time its stats were updated, either by "compute stats" in Impala (or "analyze" in Hive) or manually via an alter table statement. (There are other cases where stats are updated, e.g. Hive can gather them automatically, but those are a few examples.) One scenario where this could happen is if a partition was dropped since compute stats was last run. In general, the stats can be out of sync with the number of rows in the underlying table. We don't use them for answering queries, only for query optimization, so it's fine if they're a little inaccurate. If you want accurate counts, you can run queries like:
select count(*) from table;
select count(*) from table where business_date = "13/05/2020" and tec_execution_date = "13/05/2020 20:08";
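If the stale counts are a concern, the stats can be inspected and refreshed. This is only a rough sketch: `my_db.my_table` is a placeholder name, not the table from the original question, and the partition column in the Hive statement is assumed from the query above.

```sql
-- Impala: show the row counts currently recorded in the table stats
-- (a value of -1 means no stats have been computed for that partition).
SHOW TABLE STATS my_db.my_table;

-- Impala: recompute the stats so the recorded counts match the data.
COMPUTE STATS my_db.my_table;

-- Hive equivalent, assuming the table is partitioned by business_date.
ANALYZE TABLE my_db.my_table PARTITION (business_date) COMPUTE STATISTICS;
```

After the refresh, the recorded row counts should agree with `select count(*)` until the data changes again.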
07-16-2020
04:51 PM
To get output like the Hive page you linked to, you just need this:
describe formatted <TABLE_NAME> <COLUMN_NAME>;
That works in Hue. Can you clarify what output you would ideally like to see?