About drgenious

drgenious · ‎11-03-2021

Hi @balajip, I know how to create a UDF. My problem is that every time I restart Impala, the UDF is gone. Is there any way to keep the UDF after a restart, or do I have to create it every time?

ask_bill_brooks · ‎10-19-2021

@drgenious You didn't include a link to what you found that said "something about CDH", but based on your description I suspect that what you found was not about CDH (which stands for Cloudera's Distribution including Apache Hadoop) but about CDC, or change data capture. I will leave the question of how to copy data from an RDBMS such as MySQL and publish it to a Kafka topic for other members of the community to answer.

akriti · ‎10-12-2021

Hello @drgenious, please check the link below:

https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_metrics_impala_daemon_resource_pool.html#concept_gif_9en_yk

smruti · ‎10-07-2021

@dr If it's a managed table, you can get its size from the TABLE_PARAMS table in the metastore database, e.g.:

SELECT a.TBL_NAME AS `TABLE`, b.PARAM_VALUE AS `SIZE` FROM TABLE_PARAMS b INNER JOIN TBLS a ON a.TBL_ID = b.TBL_ID WHERE b.PARAM_KEY = 'totalSize';

You can adapt the query as you need it. However, if there are external tables, or the table stats are not generated regularly, you might not get correct data.

You can also get the table size using HDFS file system commands:

hdfs dfs -du -s -h <path to the table location>

This will give you more accurate data.
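The `totalSize` value returned by the metastore query is a raw byte count stored as a string, while `hdfs dfs -du -h` prints human-readable units. A minimal Python sketch for making the two easier to compare is shown here; the helper name `human_size` and the sample rows are hypothetical, and the rounding is not guaranteed to match the HDFS output exactly.

```python
def human_size(num_bytes: int) -> str:
    """Format a byte count using binary units (1 K = 1024 bytes)."""
    units = ["B", "K", "M", "G", "T", "P"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if size < 1024 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024


# Hypothetical (table name, totalSize) rows, shaped like the query result.
rows = [("sales", "1073741824"), ("events", "1536")]

for table, total_size in rows:
    # PARAM_VALUE is text in the metastore, so convert it before formatting.
    print(table, human_size(int(total_size)))
```

If the two numbers differ a lot for a given table, that usually points to stale table stats, as noted above.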

vijaysahu · ‎09-28-2021

2019-09-18 08:44:10
2020-08-05 13:15:48
2020-08-05 13:24:00
2020-10-15 18:29:34
2020-09-09 09:35:04

This is already in asc order. Check its format: year/mm/dd.

willx · ‎09-24-2021

Hi @drgenious,

1) Where can I run these kinds of queries?

In CM -> Charts -> Chart Builder you can run a tsquery. Refer to this link: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_dg_chart_time_series_data.html

2) Where can I find attributes like category and clusterName in Cloudera?

In the Chart Builder text bar, write an incomplete query like:

SELECT get_file_info_rate

Below the text bar there is a Facets section. Click More and select any facet you want. For example, if you select clusterName, the clusterName is shown in the chart's title. Then you can complete your tsquery:

SELECT get_file_info_rate where clusterName=xxxxx

If you want to build Impala-related charts, I suggest first reviewing CM > Impala service > Charts Library; many charts for common monitoring purposes are already there. You can open any of the existing charts to learn how its tsquery is constructed and then build your own charts.

Another very good place to learn is CM > Charts > Chart Builder. On the right side you will see a "?" button; click it to see many examples, and you can just click "try it".

Regards,
Will

If the answer helps, please accept as solution and click thumbs up.
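Outside the Chart Builder, the same tsquery text can also be passed to the Cloudera Manager REST API's `timeseries` endpoint as the `query` parameter. The sketch below only builds the request URL; the host name, port, and API version `v19` are assumptions for illustration, and `xxxxx` is kept as the placeholder cluster name from the post.

```python
from urllib.parse import urlencode


def timeseries_url(base_url: str, tsquery: str, api_version: str = "v19") -> str:
    """Build a CM timeseries URL with the tsquery URL-encoded."""
    query_string = urlencode({"query": tsquery})
    return f"{base_url}/api/{api_version}/timeseries?{query_string}"


# Hypothetical Cloudera Manager address; replace with your own.
url = timeseries_url(
    "http://cm-host.example.com:7180",
    "SELECT get_file_info_rate where clusterName=xxxxx",
)
print(url)
```

The request itself still needs Cloudera Manager credentials, which are left out of this sketch.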

ShankerSharma · ‎09-22-2021

1. For the total configured memory, multiply the Impala daemon memory by the total number of daemons. These values should also be displayed at the top of the Impala admission control page, showing how much memory is allocated to Impala.

2. You can check other memory metrics in the cluster utilization report. Please note that how much memory is consumed per pool is not currently captured in Impala metrics.

a) Max Allocated

Peak Allocation Time – The time when Impala reserved the maximum amount of memory for queries. Click the drop-down list next to the date and time and select View Impala Queries Running at the Time to see details about the queries.

Max Allocated – The maximum memory that was reserved by Impala for executing queries. If the percentage is high, consider increasing the number of hosts in the cluster.

Utilized at the Time – The amount of memory used by Impala for running queries at the time when maximum memory was reserved. Click View Time Series Chart to view a chart of peak memory allocations.

Histogram of Allocated Memory at Peak Allocation Time – Distribution of memory reserved per Impala daemon for executing queries at the time Impala reserved the maximum memory. If some Impala daemons have reserved memory close to the configured limit, consider adding more physical memory to the hosts.

b) Max Utilized

Peak Usage Time – The time when Impala used the maximum amount of memory for queries. Click the drop-down list next to the date and time and select View Impala Queries Running at the Time to see details about the queries.

Max Utilized – The maximum memory that was used by Impala for executing queries. If the percentage is high, consider increasing the number of hosts in the cluster.

Reserved at the Time – The amount of memory reserved by Impala at the time when it was using the maximum memory for executing queries. Click View Time Series Chart to view a chart of peak memory utilization.

Histogram of Utilized Memory at Peak Usage Time – Distribution of memory used per Impala daemon for executing queries at the time Impala used the maximum memory. If some Impala daemons are using memory close to the configured limit, consider adding more physical memory to the hosts.

[1] https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_cluster_util_custom.html#concept_jp4_4bh_hx
[2] https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_metrics_impala_daemon.html
[3] https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_metrics_impala_daemon_resource_pool.html
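The arithmetic in point 1, and the percentage asked about in the thread, can be sketched in a few lines of Python. The function names and the sample figures (64 GiB per daemon, 10 daemons, 480 GiB in use) are made up for illustration; real values would come from the Impala configuration and the utilization report.

```python
def total_impala_memory(daemon_mem_limit_gib: float, daemon_count: int) -> float:
    """Total configured memory = per-daemon memory * number of daemons."""
    return daemon_mem_limit_gib * daemon_count


def percent_used(used_gib: float, total_gib: float) -> float:
    """Share of the configured memory that is in use, as a percentage."""
    if total_gib <= 0:
        raise ValueError("total memory must be positive")
    return 100.0 * used_gib / total_gib


# Hypothetical cluster: 10 daemons with 64 GiB each, 480 GiB currently used.
total = total_impala_memory(daemon_mem_limit_gib=64, daemon_count=10)
print(f"{percent_used(480, total):.1f}% of configured Impala memory in use")
```

The same calculation works with either the Max Allocated or the Max Utilized figure as the numerator, depending on whether reserved or actually used memory is of interest.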

tarekabouzeid91 · ‎08-23-2021

Hi, can you use beeline, run the command below, and then recreate the table?

set parquet.column.index.access=false;

This should make Hive stop using the column positions from your CREATE TABLE statement to map the data in your files; instead it will use the column names. Hope this works for you.

Best Regards
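To see why mapping by position versus by name matters, here is a small conceptual Python sketch. It is not Hive code; the functions `read_by_index` and `read_by_name` and the sample schema are hypothetical, and only illustrate what goes wrong when the table's column order differs from the order in the file.

```python
def read_by_index(table_columns, row):
    """Position i in the table schema takes the value at position i in the file row."""
    return {
        name: (row[i] if i < len(row) else None)
        for i, name in enumerate(table_columns)
    }


def read_by_name(table_columns, file_columns, row):
    """Each table column takes the value of the file column with the same name."""
    by_name = dict(zip(file_columns, row))
    return {name: by_name.get(name) for name in table_columns}


# The file was written with one column order, the table declared with another.
file_columns = ["id", "name", "price"]
row = [1, "widget", 9.99]
table_columns = ["id", "price", "name"]

print(read_by_index(table_columns, row))
# {'id': 1, 'price': 'widget', 'name': 9.99}  (values land in the wrong columns)
print(read_by_name(table_columns, file_columns, row))
# {'id': 1, 'price': 9.99, 'name': 'widget'}
```

With positional mapping, a type mismatch like the string landing in `price` is the kind of situation that can surface as a schema error when the file is read.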

ChethanYM · ‎08-09-2021

Hi, what query are you using to read the data from the table? Can you attach its "query profile" and the coordinator logs so we can have a look?

Regards,
Chethan YM

asish · ‎07-18-2021

All the Hive-related tables are stored under the "hive" database in MySQL. You can take a MySQL dump of the hive database to prevent this from happening in the future. You can use a command like:

mysqldump -u root -p hive

Reference: https://www.sqlshack.com/how-to-backup-and-restore-mysql-databases-using-the-mysqldump-command/
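As written, the command prints the dump to standard output, so in practice it is usually sent to a file. A minimal Python sketch that wraps the same command and writes to a file via mysqldump's `--result-file` option might look like this; the function names and the output file name `hive_metastore_backup.sql` are assumptions, and the sketch defaults to a dry run that only prints the command.

```python
import subprocess


def build_dump_command(database: str, user: str, output_file: str) -> list:
    """Build the mysqldump argument list; -p makes mysqldump prompt for the password."""
    return ["mysqldump", "-u", user, "-p", f"--result-file={output_file}", database]


def run_dump(cmd: list, dry_run: bool = True) -> None:
    """Print the command in dry-run mode, otherwise execute it."""
    if dry_run:
        print(" ".join(cmd))
    else:
        # check=True raises CalledProcessError if mysqldump exits with an error.
        subprocess.run(cmd, check=True)


# Hypothetical output file name; "hive" and "root" come from the post above.
cmd = build_dump_command("hive", "root", "hive_metastore_backup.sql")
run_dump(cmd)  # dry run: only prints the command
```

Calling `run_dump(cmd, dry_run=False)` on a host with the MySQL client installed would perform the actual backup.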

Last Visited	‎06-26-2023 10:46 AM

Member Since	‎01-07-2020 06:44 AM
Posts	64
Kudos received	1

Cloudera Community

Re: Create function every time impala restart

Re: Ingest Mysql data to Kafka

Re: What is the difference between mem_reserved an...

Re: Find table's size in Hive metastore (MySQL)

Re: Impala can not understand sorting in tables

Re: Where can I run tsquery ?

Re: Find the % of the used memory in impala

Re: Parquet schema error

Re: Impala can not read the files stored in the hi...

Re: Backup MySQL tables schemas used in hive