Created 02-13-2020 01:07 AM
Hello everyone,
I need some help, please.
When I execute the following statement in impala-shell (Cloudera CDH 5.16):
CREATE TABLE FCT_OC_ACTIVE_ALARMS_COUNTERS (
TIME TIMESTAMP NOT NULL ENCODING RLE,
OPERATIONCONTEXTID BIGINT NOT NULL ENCODING RLE,
SEVERITYID BIGINT NOT NULL ENCODING RLE,
ALARMTYPEID BIGINT NOT NULL,
PROBABLECAUSEID BIGINT NOT NULL,
SPECIFICPROBLEMSID BIGINT,
MANAGEDOBJECTNAME STRING NOT NULL,
YEARMONTH INT NOT NULL,
NEXTDELTATIME TIMESTAMP,
AOACKNOWLEDGED BIGINT,
AOHANDLED BIGINT,
AONOTHANDLED BIGINT,
AOOUTSTANDING BIGINT,
UPDATE_TIMESTAMP TIMESTAMP,
PRIMARY KEY(TIME, OPERATIONCONTEXTID, SEVERITYID, ALARMTYPEID, PROBABLECAUSEID, MANAGEDOBJECTNAME, SPECIFICPROBLEMSID, YEARMONTH)
) PARTITION BY RANGE (YEARMONTH) (
PARTITION VALUE = 202001
) STORED AS KUDU
it works perfectly. But when I execute the same statement in impala-shell on Cloudera CDH 6.3.2, I get the following error:
ERROR: ImpalaRuntimeException: Error creating Kudu table 'impala::ci_inta_fas_oss_installer_cdh6.FCT_OC_ACTIVE_ALARMS_COUNTERS'
CAUSED BY: ImpalaRuntimeException: Kudu PRIMARY KEY columns must be specified as the first columns in the table (expected leading columns ('time', 'operationcontextid', 'severityid', 'alarmtypeid', 'probablecauseid', 'managedobjectname', 'specificproblemsid', 'yearmonth') but found ('time', 'operationcontextid', 'severityid', 'alarmtypeid', 'probablecauseid', 'specificproblemsid', 'managedobjectname', 'yearmonth'))
Did I miss something in my configuration on CDH 6.3?
Created 02-13-2020 09:11 AM
You need to change either the order of the columns in your table definition or the order in the PRIMARY KEY clause so that the two match. In your statement, MANAGEDOBJECTNAME and SPECIFICPROBLEMSID appear in opposite orders in the two places.
The PRIMARY KEY clause ends with:
MANAGEDOBJECTNAME, SPECIFICPROBLEMSID, YEARMONTH
while the column list has:
SPECIFICPROBLEMSID BIGINT,
MANAGEDOBJECTNAME STRING NOT NULL,
YEARMONTH INT NOT NULL,
We made this stricter because Impala previously silently ignored the order of columns in the PRIMARY KEY clause, which can have really bad performance implications: https://issues.apache.org/jira/browse/IMPALA-8283
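For reference, here is a sketch of one way to make the two orders agree. It assumes you want to keep the PRIMARY KEY clause exactly as written and instead move MANAGEDOBJECTNAME ahead of SPECIFICPROBLEMSID in the column list. I have not run this against CDH 6.3.2, so treat it as a starting point:

```sql
-- Sketch only: same table as in the question, with MANAGEDOBJECTNAME moved
-- ahead of SPECIFICPROBLEMSID so that the leading columns appear in the same
-- order as the PRIMARY KEY clause.
CREATE TABLE FCT_OC_ACTIVE_ALARMS_COUNTERS (
  TIME TIMESTAMP NOT NULL ENCODING RLE,
  OPERATIONCONTEXTID BIGINT NOT NULL ENCODING RLE,
  SEVERITYID BIGINT NOT NULL ENCODING RLE,
  ALARMTYPEID BIGINT NOT NULL,
  PROBABLECAUSEID BIGINT NOT NULL,
  MANAGEDOBJECTNAME STRING NOT NULL,   -- moved up (was after SPECIFICPROBLEMSID)
  SPECIFICPROBLEMSID BIGINT,           -- moved down
  YEARMONTH INT NOT NULL,
  -- non-key columns follow the key columns
  NEXTDELTATIME TIMESTAMP,
  AOACKNOWLEDGED BIGINT,
  AOHANDLED BIGINT,
  AONOTHANDLED BIGINT,
  AOOUTSTANDING BIGINT,
  UPDATE_TIMESTAMP TIMESTAMP,
  PRIMARY KEY (TIME, OPERATIONCONTEXTID, SEVERITYID, ALARMTYPEID, PROBABLECAUSEID, MANAGEDOBJECTNAME, SPECIFICPROBLEMSID, YEARMONTH)
) PARTITION BY RANGE (YEARMONTH) (
  PARTITION VALUE = 202001
) STORED AS KUDU;
```

The alternative is to leave the column list alone and swap the two names inside the PRIMARY KEY clause. Either way, the key order you end up with is the one Kudu uses, so pick the order that matches how you query the table.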
Created 02-19-2020 08:59 AM
Excellent. This is the solution I've applied, but I was not able to explain why the original statement no longer works on CDH 6.
Thanks
Created 02-19-2020 09:26 AM
We made this stricter because it was easy to create tables with the wrong primary key order, which has performance consequences.
It was really a bug that we allowed creating tables with an unclear primary key order.