Member since: 03-06-2020
Posts: 398
Kudos Received: 54
Solutions: 35
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 146 | 11-21-2024 10:12 PM
 | 1001 | 07-23-2024 10:52 PM
 | 1143 | 05-16-2024 12:27 AM
 | 3247 | 05-01-2024 04:50 AM
 | 1416 | 03-19-2024 09:23 AM
09-07-2022
11:20 PM
Hi @vz, if both columns are of string type, I think you can use concat or concat_ws. Can you check the articles below and see if they help?
https://community.cloudera.com/t5/Support-Questions/HIVE-Concatenates-all-columns-easily/td-p/180208
https://blog.fearcat.in/a?ID=01600-32e80587-5a71-411e-835b-ed905cb1b61a
https://stackoverflow.com/questions/51211278/concatenate-multiple-columns-into-one-in-hive
Note: If my reply answers your question, please give a thumbs up and accept it as a solution.
Regards, Chethan YM
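For example, a minimal Hive sketch of both functions, assuming two string columns named first_name and last_name in a hypothetical customers table (all names here are illustrative only):

```sql
-- concat() simply joins the values; concat_ws() joins them with a separator
-- and skips NULL arguments.
SELECT CONCAT(first_name, last_name)          AS joined,
       CONCAT_WS('-', first_name, last_name)  AS joined_with_separator
FROM customers;
```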
09-07-2022
11:08 PM
1 Kudo
Hi @gocham,
In CDP 7.1.7, Capacity Scheduler is the default and only supported scheduler; Fair Scheduler is not supported. You must transition from Fair Scheduler to Capacity Scheduler when upgrading your cluster to CDP Private Cloud Base. The related Cloudera Jira is CLR-106983.
Note: If I answered your question, please give a thumbs up and accept it as a solution.
Regards, Chethan YM
09-07-2022
07:18 AM
1 Kudo
Hi @Ramesh_hdp,
I do not think we have an option to check how many users are logged into Hive, but we can check how many connections are made to HiveServer2.
> You can refer to the article below:
https://community.cloudera.com/t5/Support-Questions/How-to-get-number-of-live-connections-with-HiveServer2-HDP-2/td-p/106284
> If you enable the HiveServer2 Web UI, then under Active Sessions you can see which user is connected from which IP.
Regards, Chethan YM
09-05-2022
05:16 AM
Hi,
Please review the documentation below:
https://docs.cloudera.com/HDPDocuments/DAS/DAS-1.4.5/index.html
As per this, it looks like DAS only works with Hive and needs a PostgreSQL database.
Regards, Chethan YM
09-05-2022
05:05 AM
1 Kudo
Hi @Iga21207,
Here is how it works in catalogd: when you run any REFRESH commands, they are executed sequentially, and only once one completes does catalogd move on to the next. They do not run in parallel, because this part of catalogd is a single-threaded operation; the catalogd thread takes a lock in getCatalogObjects(). So while earlier refreshes are still running (that is, they have not yet completed sequentially) and a new request comes in, the Catalog throws the error on that table because it cannot acquire the lock; the lock is still held for the previous table whose REFRESH is in progress. I am not sure of your CDH version; this may be resolved in a higher version of CDP/CDH.
Note: If I answered your question, please give a thumbs up and accept it as a solution.
Regards, Chethan YM
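As a hypothetical illustration (the table names below are not from the original post), two REFRESH statements submitted at roughly the same time from different sessions are still processed one after another by catalogd:

```sql
-- Session 1: a long-running refresh that holds the catalogd lock.
REFRESH sales_db.large_fact_table;

-- Session 2, issued while the first is still running: it queues behind the
-- lock, and in affected versions it can fail with a lock-acquisition error
-- on that table instead of simply waiting.
REFRESH sales_db.small_dim_table;
```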
09-05-2022
04:49 AM
Hi,
> When you say the workflow runs for more than 7 days, does it keep running for the entire 7 days and then fail? Can you provide the script you are running?
> Does the shell script work outside of Oozie without issues?
> Please provide the complete error stack trace you are seeing.
Regards, Chethan YM
08-29-2022
05:21 AM
2 Kudos
Hi,
After you run the query, you need to look at the query profile to analyse the memory usage. Look for "Per Node Peak Memory Usage" in the profile to understand how much memory each host or Impala daemon used to run this query. From the snippet you shared, it looks like this query has a 3 GB memory limit; this can be set at the session level or in the Impala admission control pool. If you provide the complete query profile, I think we can get more details.
Regards, Chethan YM
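For example, a minimal impala-shell sketch of where such a limit can come from and how to pull the profile afterwards (the table name is a placeholder, not from the original post):

```sql
-- Session-level cap: each Impala daemon may use at most 3 GB for this query.
SET MEM_LIMIT=3g;

-- Hypothetical query used only for illustration.
SELECT COUNT(*) FROM sales_db.orders;

-- Print the profile of the last statement; search it for
-- "Per Node Peak Memory Usage" to see actual per-host consumption.
PROFILE;
```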
08-29-2022
05:07 AM
1 Kudo
Hi,
Yes, the Hive metastore is a component that stores all the structural information (metadata) of objects like tables and partitions in the warehouse, including column and column type information, etc.
Note: If this answered your question, please accept the reply as a solution.
Regards, Chethan YM
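As a quick, hypothetical illustration of the kind of metadata the metastore serves back to Hive (the table name is only an example):

```sql
-- Columns, types, storage location, SerDe, and table parameters all come
-- from the Hive metastore.
DESCRIBE FORMATTED web_logs;

-- Partition metadata is tracked in the metastore as well.
SHOW PARTITIONS web_logs;
```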
08-29-2022
03:52 AM
Hi,
I do not think we have such a configuration to validate the data; we need to ensure that the data matches the table definition that we have created.
Regards, Chethan YM
08-29-2022
03:43 AM
1 Kudo
Hi,
There is no limit on the number of databases in the Hive metastore. As a good practice, we do not recommend creating tables with more than 10,000 partitions. In my opinion, you shouldn't have a problem with 2,000 tables. You can expect some performance issues if the total number of objects is greater than 500,000, and there is no hard limit on the number of Hive/Impala databases/tables that you can have in the cluster.
Regards, Chethan YM
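As a rough, hypothetical way to check how a cluster compares against those numbers (the database and table names are placeholders):

```sql
-- Lists every partition the metastore tracks for one table; the row count is
-- the partition count to compare against the guidance above.
SHOW PARTITIONS sales_db.events;

-- Lists databases and the tables in one database, useful for a quick tally of
-- how many objects the metastore is holding.
SHOW DATABASES;
SHOW TABLES IN sales_db;
```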