Member since
01-07-2020
64
Posts
1
Kudos Received
0
Solutions
11-03-2021
03:43 AM
Hi @balajip, I know how to create a UDF. My problem is that every time I restart Impala, the UDF is gone. Is there any way to keep the UDF after a restart, or do I have to create it every time?
10-19-2021
09:27 PM
@drgenious You didn't include a link to what you found that said "something about CDH", but based on your description I suspect that what you found was not about CDH (which stands for Cloudera's Distribution including Apache Hadoop) but about CDC, or change data capture. I will leave the question of how to copy the data from an RDBMS such as MySQL and publish it to a Kafka topic to other members of the community to answer.
10-12-2021
01:54 AM
Hello @drgenious, please check the link below [0]. [0] https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_metrics_impala_daemon_resource_pool.html#concept_gif_9en_yk
10-07-2021
10:45 AM
@dr If it's a managed table, you could get its size from the TABLE_PARAMS table, e.g.:
SELECT a.TBL_NAME AS `TABLE`, b.PARAM_VALUE AS `SIZE` FROM TABLE_PARAMS b INNER JOIN TBLS a WHERE a.TBL_ID=b.TBL_ID AND b.PARAM_KEY='totalSize';
You could change the query as you need it. However, if there are external tables, or the table stats are not generated regularly, you might not get correct data. You could also get the table size using HDFS file system commands:
hdfs dfs -du -s -h <path to the table location>
This will give you more accurate data.
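If you want to script the HDFS check, a minimal Python sketch could look like the one below. It assumes the `hdfs` client is on the PATH and that the command is run without `-h`, so that the first field of the output is the size in bytes. The function names and the sample path are hypothetical and only for illustration.

```python
import subprocess


def parse_du_summary(line: str) -> int:
    """Return the size in bytes from one line of `hdfs dfs -du -s` output.

    Assumes the first whitespace-separated field is the size in bytes,
    which is the case when the command is run without -h.
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty du output")
    return int(fields[0])


def table_size_bytes(table_location: str) -> int:
    """Run `hdfs dfs -du -s <location>` and return the reported size.

    Needs the hdfs client on the PATH; it is not called in this sketch.
    """
    result = subprocess.run(
        ["hdfs", "dfs", "-du", "-s", table_location],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_du_summary(result.stdout.strip())


# Illustrative sample line, not real cluster output.
sample = "1048576  3145728  /hypothetical/warehouse/my_table"
print(parse_du_summary(sample))
```

You could then compare the value returned by `table_size_bytes` with the `totalSize` parameter from the metastore to see whether the table stats are stale.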
09-28-2021
08:49 AM
2019-09-18 08:44:10 2020-08-05 13:15:48 2020-08-05 13:24:00 2020-10-15 18:29:34 2020-09-09 09:35:04 This is already in ascending order. Check the format: it is year/mm/dd.
09-24-2021
08:15 PM
1 Kudo
Hi @drgenious,
1) Where can I run these kinds of queries?
In CM -> Charts -> Chart Builder you can run a tsquery. Refer to this link: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_dg_chart_time_series_data.html
2) Where can I find attributes like category and clusterName in Cloudera?
In the Chart Builder text bar, write an incomplete query like:
SELECT get_file_info_rate
Below the text bar there is a Facets section. Click More and select any facet you want. For example, if you select clusterName, the clusterName is shown in each chart's title. Then you can complete your tsquery:
SELECT get_file_info_rate where clusterName=xxxxx
If you want to build Impala-related charts, I suggest first reviewing CM > Impala service > Charts Library; many charts are already there for common monitoring purposes. You can open any of the existing charts to learn how to construct the tsquery and then build your own charts. Another very good place to learn is CM > Charts > Chart Builder: on the right side you will see a "?" button. Click it to see many examples, and you can just click "try it".
Regards, Will
If the answer helps, please accept it as the solution and click thumbs up.
09-22-2021
10:54 PM
2 Kudos
1. For the total memory configured, you can check (Impala daemon memory * total number of daemons). These values should also be displayed at the top of the Impala admission control page, showing how much memory is allocated to Impala.
2. You can check other memory metrics from the cluster utilization report. Please note that memory consumed per pool is not currently captured in Impala metrics.
a) Max Allocated
Peak Allocation Time – The time when Impala reserved the maximum amount of memory for queries. Click the drop-down list next to the date and time and select View Impala Queries Running at the Time to see details about the queries.
Max Allocated – The maximum memory that was reserved by Impala for executing queries. If the percentage is high, consider increasing the number of hosts in the cluster.
Utilized at the Time – The amount of memory used by Impala for running queries at the time when maximum memory was reserved. Click View Time Series Chart to view a chart of peak memory allocations.
Histogram of Allocated Memory at Peak Allocation Time – Distribution of memory reserved per Impala daemon for executing queries at the time Impala reserved the maximum memory. If some Impala daemons have reserved memory close to the configured limit, consider adding more physical memory to the hosts.
b) Max Utilized
Peak Usage Time – The time when Impala used the maximum amount of memory for queries. Click the drop-down list next to the date and time and select View Impala Queries Running at the Time to see details about the queries.
Max Utilized – The maximum memory that was used by Impala for executing queries. If the percentage is high, consider increasing the number of hosts in the cluster.
Reserved at the Time – The amount of memory reserved by Impala at the time when it was using the maximum memory for executing queries. Click View Time Series Chart to view a chart of peak memory utilization.
Histogram of Utilized Memory at Peak Usage Time – Distribution of memory used per Impala daemon for executing queries at the time Impala used the maximum memory. If some Impala daemons are using memory close to the configured limit, consider adding more physical memory to the hosts.
[1] https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_cluster_util_custom.html#concept_jp4_4bh_hx
[2] https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_metrics_impala_daemon.html
[3] https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_metrics_impala_daemon_resource_pool.html
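As a rough illustration of point 1, the total configured memory is just the per-daemon memory limit multiplied by the number of daemons. The sketch below uses made-up numbers (64 GiB per daemon, 10 daemons) and a hypothetical function name; it also assumes every daemon has the same memory limit, which may not hold on your cluster.

```python
def total_impala_memory_gib(per_daemon_gib: float, daemon_count: int) -> float:
    """Total memory configured for Impala, in GiB.

    Assumes every Impala daemon is configured with the same memory limit.
    """
    if per_daemon_gib < 0 or daemon_count < 0:
        raise ValueError("memory and daemon count must be non-negative")
    return per_daemon_gib * daemon_count


# Hypothetical values for illustration only.
per_daemon_gib = 64
daemon_count = 10
total = total_impala_memory_gib(per_daemon_gib, daemon_count)
print(f"Total configured Impala memory: {total} GiB")
```

If your daemons have different limits, you would sum the individual limits instead of multiplying.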
08-23-2021
04:07 PM
Hi, can you use beeline, run the command below, and then recreate the table? set parquet.column.index.access=false; This should make Hive map the data in your files by column name instead of by the column index from your CREATE TABLE statement. Hope this works for you. Best Regards
08-09-2021
05:40 PM
Hi, what query are you using to read the data from the table? Can you attach its query profile and the coordinator logs so we can have a look? Regards, Chethan YM
07-18-2021
08:37 AM
All the Hive-related tables are stored under the "hive" database in MySQL. You can take a mysqldump of the hive database to prevent this from happening in the future. You can use a command like:
mysqldump -u root -p hive
Reference: https://www.sqlshack.com/how-to-backup-and-restore-mysql-databases-using-the-mysqldump-command/
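If you want to take this backup regularly, a small Python sketch like the one below could build the command with a dated output file. The helper names and the file naming scheme are hypothetical; it assumes `mysqldump` is on the PATH and uses its `--result-file` option to write the dump to a file. The sketch only prints the command; running it is left to you.

```python
import datetime
from typing import List


def backup_file_name(database: str, day: datetime.date) -> str:
    """Build a dated file name such as hive_backup_20210718.sql."""
    return f"{database}_backup_{day:%Y%m%d}.sql"


def build_dump_command(database: str, user: str, out_file: str) -> List[str]:
    """Return the mysqldump argument list for a single database.

    A bare -p makes mysqldump prompt for the password instead of
    taking it from the command line.
    """
    return [
        "mysqldump",
        "-u", user,
        "-p",
        f"--result-file={out_file}",
        database,
    ]


# Hypothetical usage: print the command rather than run it.
out_file = backup_file_name("hive", datetime.date.today())
print(" ".join(build_dump_command("hive", "root", out_file)))
```

To actually run it, you could pass the list to `subprocess.run(..., check=True)` from a terminal session so that the password prompt works.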