Member since
03-06-2020
292
Posts
26
Kudos Received
20
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 162 | 10-25-2022 04:37 AM |
| | 227 | 09-24-2022 10:12 PM |
| | 354 | 09-07-2022 11:08 PM |
| | 235 | 09-07-2022 07:18 AM |
| | 196 | 09-05-2022 05:05 AM |
12-26-2022
04:51 AM
Hi,

1. Can you check whether the user has sudo/root permissions on these hosts?
2. Can you do an agent hard restart on these hosts and try again?
3. Recheck that the private key is the valid and correct one.

Regards, Chethan YM
12-18-2022
02:39 AM
Hi,

Can you create a separate user in the database for each service and retry? You should NOT use the root user for all databases. Refer to the doc below:

https://docs.cloudera.com/documentation/enterprise/latest/topics/cm_ig_extrnl_pstgrs.html#cmig_topic_5_6_2

Regards, Chethan YM
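A rough sketch of what per-service PostgreSQL users could look like (the user, password, and database names here are invented for illustration; the linked Cloudera doc has the authoritative commands):

```sql
-- One dedicated user and database per service, instead of reusing root everywhere.
CREATE USER hive_user WITH PASSWORD 'hive_password';
CREATE DATABASE metastore OWNER hive_user;
GRANT ALL PRIVILEGES ON DATABASE metastore TO hive_user;
```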
12-18-2022
02:23 AM
Hi,

Where are you redirecting the output to? A CSV file? What is the output from the impala-shell terminal if you do not redirect the output?

It looks like this is caused by the csv module when impala-shell uses it to export the data:

```python
# csv.writer expects a file handle to the input.
# cStringIO is used as the temporary buffer.
temp_buffer = StringIO()
writer = csv.writer(temp_buffer, delimiter=self.field_delim,
                    lineterminator='\n', quoting=csv.QUOTE_MINIMAL)
writer.writerows(rows)
```

It seems we cannot change this, since it would need to be modified at the code level:

https://github.com/apache/impala/blob/014c455aaaa38010ae706228f7b439c080c0bc7d/shell/shell_output.py...

Regards, Chethan YM
12-18-2022
02:02 AM
Hi,

1. Can you try with the -Doraoop.timestamp.string=false property?
2. Can you try the Sqoop type-mapping parameters as per the doc below and see?

https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_java_sql_timestamp:~:text=on%20all%20databases.-,7.2.8.%C2%A0Controlling%20type%20mapping,-Sqoop%20is%20preconfigured

Regards, Chethan YM
12-18-2022
01:57 AM
Hi,

1. Add the below configuration to HDFS from the CM UI (CM -> HDFS -> Configuration):

hadoop.proxyuser.hive.hosts = *
hadoop.proxyuser.hive.groups = *

2. If the above is already set, disable the DB notification API auth in both the Hive and Hive_on_Tez services, via the Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site:

Name: hive.metastore.event.db.notification.api.auth
Value: false

Restart the stale configs and check whether it works.

Regards, Chethan YM
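For reference, the safety-valve entry in step 2 corresponds to a hive-site.xml property of roughly this shape (the name and value are from the post above; the XML wrapper itself is illustrative):

```xml
<property>
  <name>hive.metastore.event.db.notification.api.auth</name>
  <value>false</value>
</property>
```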
11-18-2022
04:43 AM
1 Kudo
Hi,

It looks like the known unresolved issue OPSAPS-60161. Can you disable the canary health check?

https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_ht_hive_metastore_server.html#concept_6qo_fpn_yk

Regards, Chethan YM
11-02-2022
07:28 AM
Hi,

1. As per the document it is service downtime, so I think it is complete Impala service downtime. (However, I haven't seen the issue live.)
2. There is no metric/graph to check "inc_stats_size".
3. If 1 GB is insufficient, try "COMPUTE STATS" instead of "COMPUTE INCREMENTAL STATS".

Regards, Chethan YM
10-25-2022
04:47 AM
Hi @yassan,

I would like to let you know that the default value of the flag inc_stats_size_limit_bytes is set to 200 MB as a safety check, to prevent Impala from hitting the maximum limit for the table metadata. The error reported usually serves as an indication that 'COMPUTE INCREMENTAL STATS' should not be used on that particular table; consider splitting the table and using the regular 'COMPUTE STATS' statement if possible.

However, if you are not able to use the 'COMPUTE STATS' statement, you can try to increase the default limit via the flag inc_stats_size_limit_bytes. It should be set to less than the 1 GB limit, and the value is measured in bytes. Below are the steps:

1. CM > Impala Service > Configuration > search for "Impala Command Line Argument Advanced Configuration Snippet (Safety Valve)".
2. Add --inc_stats_size_limit_bytes= with the value in bytes. For example, to set 400 MB, enter 419430400 (400*1024*1024).
3. Save and restart the Impala service.

Note: If I answered your question please give a thumbs up and accept it as a solution.

Regards, Chethan YM
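As a quick sanity check on the byte arithmetic in step 2 above, a minimal sketch (the helper name is mine, not anything from Impala):

```python
def mb_to_bytes(mb: int) -> int:
    """Convert megabytes to bytes, e.g. for flags like --inc_stats_size_limit_bytes."""
    return mb * 1024 * 1024

# 400 MB expressed in bytes, the example value from the steps above.
print(mb_to_bytes(400))  # 419430400
```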
10-25-2022
04:37 AM
1 Kudo
Hi,

There is a KB article related to this issue; please review it below:

https://community.cloudera.com/t5/Customer/Permission-denied-when-accessing-Hive-tables-from-Spark-in/ta-p/310498

Regards, Chethan YM
10-25-2022
04:31 AM
Hi,

The alert message does not give much information to check. Review the HMS and CM Service Monitor logs related to this issue and provide the stack traces.

Regards, Chethan YM
10-25-2022
04:26 AM
Hi,

As per the Git source below, PauseTransitRunnable is the runnable which is scheduled to run at the configured interval; it checks all bundles to see if they should be paused, un-paused or started:

https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/service/PauseTransitService.java#:~:text=PauseTransitRunnable%20is%20the%20runnable%20which%20is%20scheduled%20to%20run%20at%20the%20configured%20interval%2C%20it%20checks%20all%20bundles

It also releases the lock. Are you running this sqoop-import from Oozie? If yes, try to rerun it outside of Oozie and check whether it still gets stuck, and review the corresponding NM and RM logs to see if anything is interrupting it.

Regards, Chethan YM
09-24-2022
10:12 PM
Hi,

There seems to be a UDF for surrogate keys (SK) in Hive. Have you tried it? Is it working?

https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/using-hiveql/topics/hive_surrogate_keys.html

Regards, Chethan YM
09-24-2022
10:03 PM
1 Kudo
Hi,

It looks like it is waiting to insert the data; it may finish after a few minutes. Did it work, or does it still hang for hours? Creating a table while uploading a CSV file usually takes more time.

Regards, Chethan YM
09-16-2022
03:16 AM
Hi,

Below are the suspected causes for this issue:

https://issues.apache.org/jira/browse/YARN-3055
https://issues.apache.org/jira/browse/YARN-2964

Yes, you can set that parameter at the workflow level and test.

Regards, Chethan YM
09-15-2022
02:53 AM
Hi @coco,

Can you follow the below steps in Hue, if you are running the job from Hue?

1. Log in to Hue.
2. Go to Workflows -> Editors -> Workflows.
3. Open the workflow to edit it.
4. On the left-hand pane, click 'Properties'.
5. Under the section 'Hadoop Job Properties', enter 'mapreduce.job.complete.cancel.delegation.tokens' in the Name box and 'true' in the Value box.
6. Save the workflow and submit it.

If you are running from the terminal, add the above property in the configuration section, then rerun the workflow and see if it helps. If this works, please accept it as a solution.

Regards, Chethan YM
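If you maintain the workflow XML directly rather than through Hue, the property from step 5 would live in the workflow's global configuration block, roughly like this (an illustrative fragment, not taken from the original post):

```xml
<global>
  <configuration>
    <property>
      <name>mapreduce.job.complete.cancel.delegation.tokens</name>
      <value>true</value>
    </property>
  </configuration>
</global>
```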
09-07-2022
11:20 PM
Hi @vz,

If both columns are strings, I think you can use concat or concat_ws. Can you check the articles below and see if they help?

https://community.cloudera.com/t5/Support-Questions/HIVE-Concatenates-all-columns-easily/td-p/180208
https://blog.fearcat.in/a?ID=01600-32e80587-5a71-411e-835b-ed905cb1b61a
https://stackoverflow.com/questions/51211278/concatenate-multiple-columns-into-one-in-hive

Note: If my reply answers your question please give a thumbs up and accept it as a solution.

Regards, Chethan YM
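A minimal sketch of both functions (the table and column names are invented for illustration):

```sql
-- concat() joins the values directly; concat_ws() takes a separator as its first argument.
SELECT concat(first_name, last_name)         AS joined,
       concat_ws('-', first_name, last_name) AS joined_with_dash
FROM   customers;
```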
09-07-2022
11:08 PM
1 Kudo
Hi @gocham,

In CDP 7.1.7 only Capacity Scheduler is supported; Fair Scheduler is not. Capacity Scheduler is the default and only supported scheduler, and you must transition from Fair Scheduler to Capacity Scheduler when upgrading your cluster to CDP Private Cloud Base. This is the related Jira from Cloudera: CLR-106983.

Note: If I answered your question please give a thumbs up and accept it as a solution.

Regards, Chethan YM
09-07-2022
07:18 AM
1 Kudo
Hi @Ramesh_hdp,

I do not think we have an option to check the number of users logged into Hive, but we can check how many connections are made to Hive.

1. You can refer to the article below:
https://community.cloudera.com/t5/Support-Questions/How-to-get-number-of-live-connections-with-HiveServer2-HDP-2/td-p/106284
2. If you enable the HiveServer2 UI, then under Active Sessions you can see which user connected from which IP.

Regards, Chethan YM
09-05-2022
05:16 AM
Hi,

Please review the documentation below:

https://docs.cloudera.com/HDPDocuments/DAS/DAS-1.4.5/index.html

As per this, it looks like it only works with Hive and needs a PostgreSQL DB.

Regards, Chethan YM
09-05-2022
05:05 AM
1 Kudo
Hi @Iga21207,

The way this works in catalogd is: when you run refresh commands, they are executed sequentially, and only once one completes does it move on to the next. They do not run in parallel, because this part of catalogd is single-threaded. The catalogd thread takes a lock in getCatalogObjects(). So while a refresh is still in progress and a new request comes in, the catalog throws the error on that table because it cannot get the lock; the lock is still held for the previous table on which the refresh command was running.

I am not sure of your CDH version; this may be resolved in a higher version of CDP/CDH.

Note: If I answered your question please give a thumbs up and accept it as a solution.

Regards, Chethan YM
09-05-2022
04:49 AM
Hi,

1. "Running the workflow for more than 7 days" means, does it run for the entire 7 days each time and then fail? Can you provide the script you are running?
2. Does the shell script work outside of Oozie without issues?
3. Please provide the complete error stack trace you are seeing.

Regards, Chethan YM
08-29-2022
05:21 AM
2 Kudos
Hi,

After you run the query you need to look at the query profile to analyse the complete memory usage. Look for "Per Node Peak Memory Usage" in the profile to understand how much memory each host or Impala daemon used to run this query.

From the snippet on your side, it looks like this query has a 3 GB max limit to run; this can be set at the session level or in the Impala admission control pool. If you provide the complete query profile, I think we can get more details.

Regards, Chethan YM
08-29-2022
05:07 AM
1 Kudo
Hi,

Yes, the Hive metastore is a component that stores all the structural information (metadata) of objects like tables and partitions in the warehouse, including column and column type information, etc.

Regards, Chethan YM

Note: If this answered your question please accept the reply as a solution.
08-29-2022
03:52 AM
Hi,

I do not think we have such a configuration to validate the data; we need to ensure that the data matches the table that we have created.

Regards, Chethan YM
08-29-2022
03:43 AM
1 Kudo
Hi,

There is no limit on the number of databases in the Hive metastore. As a good practice, we do not recommend creating tables with more than 100,000 partitions. In my opinion, you shouldn't have a problem with 2,000 tables. You can expect some performance issues if your total object count is greater than 500,000, but there is no hard limit on the number of Hive/Impala databases/tables that you can have in the cluster.

Regards, Chethan YM
08-16-2022
08:38 AM
1. Below is the document, which has some more details on the same:
https://impala.apache.org/docs/build/html/topics/impala_upsert.html
2. Please let us know what your concerns are.
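The linked page documents Impala's UPSERT statement for Kudu tables; a minimal illustration (the table and column names here are invented):

```sql
-- UPSERT inserts the row if the primary key is new,
-- and updates the non-key columns if the key already exists.
UPSERT INTO kudu_users (id, name) VALUES (1, 'alice');
```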
08-16-2022
08:25 AM
Hi @PNW,

If you want to disable the automatic restart of Cloudera services, visit the configuration page of the service you'd like to manage, then search for "Automatically Restart Process". You should see this option for each role within the service; uncheck it.

Regards, Chethan YM
07-06-2022
07:44 AM
Hello Syed,

Can you try the alternatives command to change the Python version? Check the link below:

https://medium.com/coderlist/how-to-change-default-python-version-on-linux-fedora-28-c22da18bdd6

Regards, Chethan YM
06-30-2022
07:31 AM
Hi,

The attached screenshot does not give many details. Check the application logs, HS2 logs, and metastore logs to find more information. Check whether you have the required resources in the cluster to run the query, whether there are any JVM pauses due to out-of-memory conditions, and whether the tables involved in this job are being written to by anyone else. Then try restarting Hive, rerun the job, and check again.

Regards, Chethan YM
06-23-2022
09:04 AM
Hello,

You need to enable trace-level logging for the ODBC driver and check the logs after reproducing the issue. Check whether there are any socket timeout errors in the trace logs; if yes, refer to the doc below to set the socket timeout:

https://docs.cloudera.com/documentation/other/connectors/impala-odbc/2-6-5/Cloudera-ODBC-Driver-for-Impala-Install-Guide.pdf

Regards, Chethan YM