Member since 10-29-2015
128 Posts
31 Kudos Received
4 Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1991 | 06-27-2024 02:42 AM |
 | 3273 | 06-24-2022 09:06 AM |
 | 4634 | 01-19-2021 06:56 AM |
 | 59192 | 01-18-2016 06:59 PM |
10-17-2022
06:28 AM
Hi,

We are in the process of setting up a CDP 7.1.7 SP1 cluster. Knox is enabled and configured for all services, and it works fine for everything except Livy.

When we submit a Spark session through the Livy service via the Knox URL, Knox does not recognise the user the session is being submitted by and returns an "unauthorised" error. A log entry in the Knox gateway clearly shows that the user name is not detected. However, when we submit a Spark session directly via the Livy URL as the same user, it goes through. In that attempt, at the same place in the Knox gateway logs, there is a POST entry that clearly displays the name of the user who submitted the job. This confirms that something is not correct in the Knox / Livy configuration.

Digging further, we found that there is no entry for Livy in the simplified topology configuration. Not sure if this is the issue? The configuration is attached.

Secondly, we found that KNOX_DATA_DIR/services contains a folder called Livy with 3 version folders, each holding rewrite and services.xml. Honestly, I do not understand the significance of these folders. The article below mentions creating these files in the services folder, but we are not sure how exactly that would help.

Add custom service to existing descriptor in Apache Knox Proxy | CDP Private Cloud (cloudera.com)

Is there documentation that explains all the steps involved in configuring Livy to work with Knox, and the communication / interactions between these services? Any help in getting this setup right would be great. Additionally, how exactly do Knox topologies affect this?

Thanks, snm1523
06-30-2022
07:14 AM
Hello, Is there straightforward documentation available for a side-car migration of Oozie jobs (50+) from HDP to CDP PB? I know the properties file may need some modification to point to the CDP RM and the relevant servers. However, with 50+ workflows to migrate, I am finding it hard to understand and map the properties files. Thanks, snm1523
06-30-2022
06:56 AM
Thank you @araujo. Do we also get a field to enter these details while installing Cloudera Manager, on the page where we add custom repositories for Hadoop parcels? I don't remember one, hence asking. Thanks, snm1523
06-24-2022
09:06 AM
Thank you for the response. I was able to find a way to fix this.
06-16-2022
08:00 AM
Hi, I just completed setting up CDP Private Base in a POC environment. I was attempting AD (LDAP-based) integration so that users are authenticated via Active Directory. I am unsure whether there was a misconfiguration, but after restarting Cloudera Manager I can no longer log in with the local "Admin" account. Assuming AD integration had taken effect, I tried multiple AD accounts, but none of them work. Please help me re-enable the local admin account. Thanks, snm1523
06-16-2022
02:17 AM
Hello, I am in the process of setting up a CDP cluster (Private Base) and am at the step of configuring local repositories. The Cloudera repos are configured via our standard repository solution, Artifactory, since the servers will not have access to the public internet to reach the Cloudera archives. The URL given to me to access those repos has to be authenticated with a user ID and API key, so it ultimately looks something like this: https://<user>:<API Key>@Repo URL/
Questions are:
1. Is there a way to configure an authenticated YUM repository for the Cloudera Manager packages without entering the credentials in plain text in the base URL field of the YUM repo?
2. Once Cloudera Manager is installed, we can provide a custom repo link in Cloudera Manager to fetch the Cloudera Runtime parcels. At that stage, will Cloudera Manager accept environment variables in the URL, or does that again have to be plain text? Otherwise, is there any other way to set up an authenticated Cloudera Runtime repo for use in Cloudera Manager, so we don't have to provide these credentials in plain text?
Thanks, snm1523
04-08-2022
08:10 AM
1 Kudo
Hi All, I was able to test my script on my 10-node DEV cluster. Below are the results:
1. All HDP core services started / stopped okay.
2. None of the Hive Server Interactive instances started, so the Hive service was not marked as STARTED even though HMS and HS2 started okay.
3. None of the SPARK2_THRIFTSERVER instances started.
Can anyone share some thoughts on points 2 and 3? Thanks, snm1523
03-09-2022
07:29 AM
Additional info @steven-matison: I checked the API call for the SPARK2 service further and found a difference. Spark Thrift Server is reported as STARTED in the API output, but in the Ambari UI it is shown as stopped (screenshots of the Ambari UI and the API output were attached). Any thoughts / suggestions? Thanks, snm1523
03-09-2022
07:08 AM
@steven-matison Something like this. This time Spark, Kafka and Hive stopped as expected. However, YARN took longer to stop (its internal components were still stopping), and the moment the script saw the YARN service status as INSTALLED, it triggered HDFS, which then had to wait (a screenshot was attached). What I want is to ensure that the previous service has completely started / stopped before the next one is triggered. Not sure if that is even possible. Thanks, snm1523
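One possible way to wait for a service to finish, sketched under assumptions: when Ambari accepts a start/stop `PUT`, it normally answers with a request id, and that request can be polled under `/requests/<id>` until its `request_status` becomes `COMPLETED`. The function names below (`request_id`, `request_status`, `set_state_and_wait`) are hypothetical helpers. `HOST`, `CLUSTER`, `USER` and `PASS` are the same variables as in the script from this thread. The 10-second interval and the retry limit are arbitrary choices.

```shell
#!/usr/bin/env bash
# Sketch only: block until an Ambari start/stop request has really finished.

# Read the JSON answer of the PUT on stdin and print the numeric request id.
request_id() {
  grep -oE '"id"[[:space:]]*:[[:space:]]*[0-9]+' | head -n 1 | grep -oE '[0-9]+$'
}

# Read the JSON of a polled request on stdin and print its request_status.
request_status() {
  grep -oE '"request_status"[[:space:]]*:[[:space:]]*"[A-Z_]+"' \
    | head -n 1 | grep -oE '[A-Z_]+"$' | tr -d '"'
}

# Usage: set_state_and_wait SERVICE STATE [MAX_POLLS]
# STATE is INSTALLED (stop) or STARTED (start).
set_state_and_wait() {
  local service=$1 state=$2 tries=${3:-90} base id status
  base="http://${HOST}/api/v1/clusters/${CLUSTER}"
  id=$(curl -s -u "$USER:$PASS" -H 'X-Requested-By: ambari' -X PUT \
        -d "{\"RequestInfo\":{\"context\":\"Set $service to $state\"},\"Body\":{\"ServiceInfo\":{\"state\":\"$state\"}}}" \
        "$base/services/$service" | request_id)
  # No id: the service may already be in that state, or the call failed.
  [ -n "$id" ] || return 2
  while [ "$tries" -gt 0 ]; do
    status=$(curl -s -u "$USER:$PASS" \
      "$base/requests/$id?fields=Requests/request_status" | request_status)
    case $status in
      COMPLETED) return 0 ;;
      FAILED|ABORTED|TIMEDOUT) return 1 ;;
    esac
    tries=$((tries - 1))
    sleep 10
  done
  return 1
}
```

With this, a stop sequence could be written as `set_state_and_wait YARN INSTALLED && set_state_and_wait HDFS INSTALLED`, so HDFS is only triggered after the YARN request has completed.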
03-09-2022
07:01 AM
Thank you for the response @steven-matison. I have 3 different scenarios so far:
1. Kafka: at times one or another Kafka broker doesn't come up quickly.
2. Spark: the Spark Thrift Server never starts on the first attempt when bundled in an all-services script. However, if called individually (only SPARK), it works as expected.
3. Hive: we have around 6 Hive Interactive servers. HSI by nature takes a little long to start (they do start eventually though).
In all the scenarios above, the moment there is an API call to start or stop a service, the status in the ServiceInfo field of the API output changes to the target state (i.e. INSTALLED for a stop and STARTED for a start), while the underlying components are still busy starting / stopping. Since I am checking the status at the service level (see below), the condition passes and the script moves ahead. I end up with one service still starting / stopping while the next one has already been triggered.

    str=$(curl -s -u "$USER:$PASS" "http://${HOST}/api/v1/clusters/$CLUSTER/services/$1")
    # The service-level state flips to INSTALLED as soon as the stop is accepted
    if [[ $str == *"INSTALLED"* ]]
    then
      finished=1
      printf '\n%s Stopped...\n\n' "$1"
    fi

So far I have noticed this only for those 3 services. Hence, I am seeking suggestions on how to overcome this, or on whether there is a way to check the status of each component of each service. Hope I was able to explain this better. Thanks, snm1523
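A minimal sketch of a component-level check, assuming the Ambari REST API exposes per-host component states under `services/<SERVICE>/components` when queried with `fields=host_components/HostRoles/state`. The helper names `all_components_in_state` and `wait_for_components` are hypothetical. `HOST`, `CLUSTER`, `USER` and `PASS` are the variables from the snippet in this post. The JSON is scanned with `grep` rather than a JSON parser to avoid extra dependencies, which is a simplification.

```shell
#!/usr/bin/env bash
# Sketch only: succeed when every host component reports the wanted state.

# Read Ambari JSON on stdin. Returns 0 only if at least one "state" value is
# present and all of them equal $1 (for example STARTED or INSTALLED).
all_components_in_state() {
  local wanted=$1 states
  states=$(grep -oE '"state"[[:space:]]*:[[:space:]]*"[A-Z_]+"' \
    | grep -oE '"[A-Z_]+"$' | tr -d '"')
  [ -n "$states" ] || return 1
  # Fail if any extracted line differs from the wanted state.
  ! printf '%s\n' "$states" | grep -qvxF "$wanted"
}

# Usage: wait_for_components SERVICE STATE [MAX_POLLS]
wait_for_components() {
  local service=$1 wanted=$2 tries=${3:-60} url
  url="http://${HOST}/api/v1/clusters/${CLUSTER}/services/${service}/components?fields=host_components/HostRoles/state"
  while [ "$tries" -gt 0 ]; do
    if curl -s -u "$USER:$PASS" "$url" | all_components_in_state "$wanted"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 10
  done
  return 1
}
```

One caveat to verify on the cluster: client components (for example a Hive client) may stay in INSTALLED even while the service is started, so a STARTED check would probably need to exclude them, for instance by querying only the master and slave components by name.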