Member since
10-28-2020
622
Posts
47
Kudos Received
40
Solutions
My Accepted Solutions
Views | Posted |
---|---|
1982 | 02-17-2025 06:54 AM |
6697 | 07-23-2024 11:49 PM |
1335 | 05-28-2024 11:06 AM |
1885 | 05-05-2024 01:27 PM |
1267 | 05-05-2024 01:09 PM |
01-24-2024
06:26 PM
Hello Smruti, I tried clearing the usercache dir on all datanodes as suggested. But the issue is resolved. Basically, it is an issue with the Tez session, and it happens in all jobs that involve Tez. Thanks, Narasimha.
01-18-2024
10:04 PM
Many thanks.
12-27-2023
09:48 AM
2 Kudos
Hive 3.0 introduced an option to automatically re-attempt a Hive query when its first run fails. A retry only makes sense if whatever caused the previous failure has been fixed, so this post discusses how to configure that fix once, without having to intervene after each failure.

The following Hive properties enable query re-execution. This should be enabled out of the box.

hive.query.reexecution.enabled=true;
hive.query.reexecution.strategies=overlay,reoptimize,recompile_without_cbo,reexecute_lost_am;

Re-execution strategies:

Overlay

With this strategy, we can set a Hive property that is applied only on re-execution. It works by adding a configuration subtree (reexec.overlay.*) as an overlay on top of the actual Hive settings:

set reexec.overlay.{hive_property}=new_value

Every Hive setting that has the prefix "reexec.overlay" will be set for all re-executions. For example, if our Hive queries fail with OOM while performing map joins, which can happen when the table stats are not correct, we could try disabling hive.auto.convert.join for the next attempt:

set reexec.overlay.hive.auto.convert.join=false;
set hive.query.reexecution.strategies=overlay;

Reoptimize

While a query runs, Hive records the actual number of rows passing through each operator. This information is used when the query is re-planned, which can produce a better query plan. Cases where this becomes essential include:
- Absence of statistics.
- Inaccurate statistics.
- Scenarios involving numerous joins.

To enable it, use:

set hive.query.reexecution.strategies=overlay,reoptimize
set hive.query.reexecution.stats.persist.scope=query

hive.query.reexecution.stats.persist.scope provides an option to persist the runtime stats at different levels:
- query: only used during the re-execution
- hiveserver: persisted in HS2 until it is restarted
- metastore: persisted in the metastore and loaded on HiveServer startup

Avoid setting it to "metastore" due to the bug discussed in HIVE-26978.

recompile_without_cbo

When CBO fails during the compilation phase, Hive falls back to the legacy optimizer, but in many cases it is unable to correctly recreate the AST. HIVE-25792 helps recompile the query without CBO in case it fails.

reexecute_lost_am

Re-executes the query if it failed because the Tez AM node was decommissioned.

Some relevant configurations:

Configuration | Default | Description |
---|---|---|
hive.query.reexecution.always.collect.operator.stats | false | Enable to gather runtime statistics on all queries. |
hive.query.reexecution.enabled | true | Feature enabler. |
hive.query.reexecution.max.count | 1 | Number of re-executions that may happen. |
hive.query.reexecution.stats.cache.batch.size | -1 | If runtime stats are stored in the metastore, the maximal batch size per round during load. |
hive.query.reexecution.stats.cache.size | 100 000 | Size of the runtime statistics cache. The unit is an OperatorStat entry; a query plan consists of ~100. |
hive.query.reexecution.stats.persist.scope | query | Scope in which runtime statistics are persisted. query: only used during the re-execution; hiveserver: persisted during the lifetime of the HiveServer; metastore: persisted in the metastore and loaded on HiveServer startup. |
hive.query.reexecution.strategies | overlay,reoptimize,recompile_without_cbo,reexecute_lost_am | Re-execution plugins; currently overlay and reoptimize are supported. |
runtime.stats.clean.frequency | 3600s | Frequency at which the timer task runs to remove outdated runtime stat entries. |
runtime.stats.max.age | 3days | Stat entries older than this are removed. |
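As a rough illustration of the overlay naming rule described above, here is a minimal Python sketch that renders the session-level `set` statements from a dictionary of overrides. The helper name `overlay_statements` and its parameters are hypothetical and not part of Hive; the only things taken from the post are the `reexec.overlay.` prefix and the `hive.query.reexecution.strategies` property.

```python
def overlay_statements(overrides, strategies=("overlay",)):
    """Render Hive `set` statements for re-execution overlays.

    `overrides` maps an ordinary Hive property name to the value it should
    take on re-execution only. This helper is a hypothetical convenience,
    not a Hive API: it only builds strings to paste into a Hive session.
    """
    # Each override is prefixed with "reexec.overlay." so that it is applied
    # on re-execution rather than on the first attempt.
    statements = [
        f"set reexec.overlay.{name}={value};" for name, value in overrides.items()
    ]
    # Select which re-execution strategies are active for the session.
    statements.append(
        "set hive.query.reexecution.strategies=" + ",".join(strategies) + ";"
    )
    return statements


if __name__ == "__main__":
    for line in overlay_statements({"hive.auto.convert.join": "false"}):
        print(line)
```

Run as a script, this prints the same two statements as the map-join example in the post; the generated lines would still need to be executed in Beeline or another Hive client.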
12-27-2023
06:53 AM
1 Kudo
@wert_1311 It should be in the following location in S3:
s3://<s3_bucket_name>/clusters/<environment_id>/<database_catalog_id>/warehouse/tablespace/external/hive/sys.db/logs/dt=YYYY-MM-DD/ns=<virtual_warehouse_id>/
The HiveServer2 log directory will be named app=hiveserver2/
You could also view the HiveServer2 pod logs using kubectl, e.g.:
kubectl logs hiveserver2-0 -n compute-1584641610-f14f -c hiveserver2
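To make the path layout above easier to reuse, here is a small Python sketch that fills in the placeholders for a given day. The function name and every example value are hypothetical, and it assumes the app=hiveserver2/ directory sits directly under the ns=<virtual_warehouse_id>/ directory, which the post implies but does not state outright.

```python
from datetime import date


def hs2_log_prefix(bucket, environment_id, database_catalog_id,
                   virtual_warehouse_id, day):
    """Build the S3 prefix for HiveServer2 logs on a given day.

    Assumption: app=hiveserver2/ is nested directly under ns=<id>/.
    """
    return (
        f"s3://{bucket}/clusters/{environment_id}/{database_catalog_id}"
        "/warehouse/tablespace/external/hive/sys.db/logs"
        f"/dt={day.isoformat()}/ns={virtual_warehouse_id}/app=hiveserver2/"
    )


if __name__ == "__main__":
    # All identifiers below are made-up placeholders.
    print(hs2_log_prefix("example-bucket", "env-example", "dbc-example",
                         "compute-example", date(2023, 12, 27)))
```

The resulting prefix could then be passed to whatever S3 listing tool is available in the environment.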
12-26-2023
10:49 PM
Hello @smruti, please let me know if there is any way we can have a call or meeting to review the script. In the meantime, I will try removing the -n options and check whether that works.
12-15-2023
11:09 AM
Cloudera's official statement on this subject can be found here. Cloudera supports various RDBMS options, each of which has multiple possible strategies for implementing HA. Cloudera cannot reasonably test and certify each strategy for each RDBMS. Cloudera expects HA solutions for the RDBMS to be transparent to Cloudera software; they are therefore not supported or debugged by Cloudera. It is the responsibility of the customer to provision, configure, and manage the RDBMS HA deployment so that Cloudera software behaves as it would when interfacing with a single, non-HA service.
11-13-2023
10:55 AM
@jayes Please make sure that you have set this property in "HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml" under the Hive on Tez configuration. I tried this and it works for me:
Beeline version 3.1.3000.7.1.7.2000-305 by Apache Hive
0: jdbc:hive2://c1649-node2.coelab.cloudera.c> set dfs.replication=1;
No rows affected (0.208 seconds)
0: jdbc:hive2://c1649-node2.coelab.cloudera.c> set hive.security.authorization.sqlstd.confwhitelist.append;
+----------------------------------------------------+
| set |
+----------------------------------------------------+
| hive.security.authorization.sqlstd.confwhitelist.append=mapred\..*|hive\..*|mapreduce\..*|spark\..*|dfs\..* |
+----------------------------------------------------+
1 row selected (0.109 seconds)
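The appended whitelist value shown in the output is a regular expression, so a property name can be checked against it offline before trying it in Beeline. Below is a minimal Python sketch; it assumes the whole property name must match the pattern (full-match semantics), and it only checks the appended part shown above, not the base whitelist that HiveServer2 also applies. The function name is hypothetical.

```python
import re

# Value of hive.security.authorization.sqlstd.confwhitelist.append as
# shown in the Beeline output above.
WHITELIST_APPEND = r"mapred\..*|hive\..*|mapreduce\..*|spark\..*|dfs\..*"


def is_whitelisted(property_name, pattern=WHITELIST_APPEND):
    """Return True if the whole property name matches the whitelist regex.

    Assumption: the name is matched in full, not as a substring.
    """
    return re.fullmatch(pattern, property_name) is not None


if __name__ == "__main__":
    for name in ("dfs.replication", "yarn.nodemanager.resource.memory-mb"):
        print(name, is_whitelisted(name))
```

Under these assumptions, `dfs.replication` matches the `dfs\..*` alternative, while a property starting with `yarn.` does not match any alternative in the appended list.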
10-25-2023
03:37 AM
Thanks for the assistance, @DianaTorres.
09-29-2023
08:59 PM
@Srinivas-M You may set these properties in a safety valve for core-site.xml:
CM UI > HDFS > Configuration > Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml
09-25-2023
12:36 AM
Thanks for your answer, but my problem is that I don't know which fields can be null (my query will have different columns in the WHERE clause).