Member since: 10-29-2015
Posts: 72
Kudos Received: 10
Solutions: 2
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 832 | 01-19-2021 06:56 AM |
| | 40678 | 01-18-2016 06:59 PM |
04-13-2022
03:24 AM
Hello, I had heard that Hive LLAP would be made available out of the box with the release of CDP Private Cloud Base 7.1.7 SP1. I checked the release notes and found that HWC support in secure mode has been made available; however, I am unsure whether Hive LLAP has also been released. My understanding is that HWC enables users to access Hive warehouse (managed, transactional) data directly from Spark-like applications, while LLAP provides low latency for Hive queries, eliminating the need to deploy Impala. This is especially relevant for users migrating from HDP to CDP. It would be great if anyone could confirm whether LLAP is also available in SP1, along with any reference links confirming the same. Below are the links I have referred to:
- Apache Hive 3 architectural overview | CDP Private Cloud (cloudera.com)
- Cloudera Runtime 7.1.7 SP1 component versions | CDP Private Cloud Documentation
- Errata in Cloudera Runtime 7.1.7 SP1 | CDP Private Cloud
Thanks
snm1523
04-08-2022
08:10 AM
1 Kudo
Hi All, I was able to test my script on my 10-node DEV cluster. Below are the results:
1. All HDP core services started / stopped okay.
2. None of the HiveServer2 Interactive instances started, and hence the Hive service was not marked as STARTED, even though HMS and HS2 started okay.
3. None of the SPARK2_THRIFTSERVER instances started.
Can anyone share some thoughts on points 2 and 3?
Thanks
snm1523
03-09-2022
07:29 AM
Additional info @steven-matison: I checked further with an API call for the SPARK2 service and found a difference. The Spark Thrift Server is reported as STARTED in the API output; however, on the Ambari UI it is stopped. See the screenshots. Ambari UI: API Output: Any thoughts / suggestions? Thanks snm1523
03-09-2022
07:08 AM
@steven-matison Something like this. This time Spark, Kafka and Hive stopped as expected, but since YARN took longer to stop (its internal components were still stopping), the moment the script saw YARN's service status as INSTALLED it triggered HDFS, and HDFS was left waiting. Please refer to the screenshot. What I want is to ensure the previous service has completely started / stopped before the next one is triggered. Not sure if that is even possible. Thanks snm1523
03-09-2022
07:01 AM
Thank you for the response @steven-matison. I have 3 different scenarios so far:
1. Kafka: at times one or another Kafka broker doesn't come up quickly.
2. Spark: the Spark Thrift Server almost never starts on the first attempt when bundled in an all-services script. However, if called individually (only SPARK), it works as expected.
3. Hive: we have around 6 HiveServer2 Interactive instances. HSI by nature takes a little long to start (they do start eventually, though).
In all the scenarios above, the moment there is an API call to start or stop a service, the state in the ServiceInfo field of the API output changes to the requested value (i.e. INSTALLED for stop and STARTED for start), while the underlying components are still doing their work to start / stop. Since I am checking the status at the service level (reference below), the condition passes and the script moves ahead. Ultimately I end up with one service still starting / stopping while the next one is already triggered.

```bash
str=$(curl -s -u $USER:$PASS http://$HOST/api/v1/clusters/$CLUSTER/services/$1)
if [[ $str == *"INSTALLED"* ]]
then
    finished=1
    echo -e "\n$1 Stopped...\n"
fi
```

So far I have noticed this only for those 3 services. Hence, I am seeking suggestions on how to overcome this, or whether there is a way to check the status of each component of each service. Hope I was able to explain this better.
Thanks
snm1523
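A rough sketch of the component-level check I have in mind, assuming the Ambari v1 API exposes each component's state under /services/&lt;SERVICE&gt;/components (the grep-based parsing is illustrative only; jq would be more robust):

```bash
# Sketch: poll every component of a service until all of them report the
# target state, instead of trusting the service-level state. Assumes the
# same USER/PASS/HOST/CLUSTER variables as the snippet above.
function wait_components(){
  local service=$1 target=$2
  while true
  do
    # The response carries one "state" line per component; count the ones
    # that have not yet reached the target state.
    pending=$(curl -s -u $USER:$PASS \
      "http://$HOST/api/v1/clusters/$CLUSTER/services/$service/components?fields=ServiceComponentInfo/state" \
      | grep '"state"' | grep -vc "\"$target\"")
    if [[ $pending -eq 0 ]]
    then
      echo -e "\nAll $service components are $target\n"
      break
    fi
    sleep 10
  done
}

wait_components SPARK2 INSTALLED   # e.g. wait for a complete stop
```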
03-09-2022
01:28 AM
Thank you, Andre. I wasn't aware of that. I had actually used the steps mentioned in my reply during our NiFi installation, so I thought they might be useful in this case. Thanks Sunil
03-08-2022
07:41 AM
Hi, I am trying to perform a scripted start / stop of HDP services and their components. I need to do this service by service, because a few components (Hive Interactive, Spark Thrift Server, Kafka brokers) do not start / stop in time. Most of the time, even while HiveServer2 Interactive is still starting (visible in the Background Operations panel in Ambari), the command moves on to the next service, and ultimately HSI fails to start. Hence, I want to ensure the previous service has completely stopped / started before attempting to stop / start the next service in the list. To achieve this, below is the script I have written (this one stops services; I have a similar one to start them). However, when it reaches Hive, Spark or Kafka, it moves to the next service even though internal components like Hive Interactive, the Spark Thrift Server or the Kafka brokers have not finished starting / stopping. Most of the time the start of the Spark Thrift Server fails via this script, yet the same API call sent for Spark or Hive individually works as expected. Shell script for reference:

```bash
USER='admin'
PASS='admin'
CLUSTER='xxxxxx'
HOST='xxxxxx:8080'

function stop(){
  curl -s -u $USER:$PASS -H 'X-Requested-By: ambari' -X PUT \
    -d '{"RequestInfo": {"context" :"Stop '"$1"' via REST"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' \
    http://$HOST/api/v1/clusters/$CLUSTER/services/$1
  echo -e "\nWaiting for $1 to stop...\n"
  wait_for $1 "INSTALLED"
  maintOn $1
}

# Poll the service-level state until it matches the target, giving up after
# a fixed number of checks. (Named wait_for so it does not shadow the shell
# builtin `wait`.)
function wait_for(){
  finished=0
  check=0
  while [[ $finished -ne 1 ]]
  do
    str=$(curl -s -u $USER:$PASS http://$HOST/api/v1/clusters/$CLUSTER/services/$1)
    if [[ $str == *"$2"* ]]
    then
      finished=1
      echo -e "\n$1 Stopped...\n"
    fi
    check=$((check+1))
    if [[ $check -eq 3 && $finished -ne 1 ]]
    then
      echo -e "\n$1 failed to stop after 3 attempts. Exiting...\n"
      exit 1
    fi
    sleep 3
  done
}

function maintOn(){
  curl -u $USER:$PASS -i -H 'X-Requested-By: ambari' -X PUT \
    -d '{"RequestInfo":{"context":"Turn ON Maintenance Mode for '"$1"' via REST"},"Body":{"ServiceInfo":{"maintenance_state":"ON"}}}' \
    http://$HOST/api/v1/clusters/$CLUSTER/services/$1
}

stop AMBARI_INFRA_SOLR
stop AMBARI_METRICS
stop HDFS
stop HIVE
stop KAFKA
stop MAPREDUCE2
stop SPARK2
stop YARN
stop ZOOKEEPER
```

Any help / guidance would be great.
Thanks
snm1523
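A possible alternative, sketched under the assumption that the Ambari v1 API returns a request id (Requests/id) for each state-change PUT and exposes its progress at /requests/&lt;id&gt; as Requests/request_status; the grep/sed parsing is illustrative only:

```bash
# Sketch: capture the request id returned by the stop PUT and poll that
# request until Ambari reports the whole operation finished, so the wait
# tracks the underlying component operations rather than the service flag.
function stop_and_wait(){
  resp=$(curl -s -u $USER:$PASS -H 'X-Requested-By: ambari' -X PUT \
    -d '{"RequestInfo":{"context":"Stop '"$1"' via REST"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' \
    http://$HOST/api/v1/clusters/$CLUSTER/services/$1)
  req_id=$(echo "$resp" | grep -o '"id" *: *[0-9]*' | grep -o '[0-9]*' | head -1)
  if [[ -z $req_id ]]
  then
    echo -e "\n$1 returned no request id (already stopped?)\n"
    return
  fi
  while true
  do
    status=$(curl -s -u $USER:$PASS \
      http://$HOST/api/v1/clusters/$CLUSTER/requests/$req_id \
      | grep '"request_status"' | head -1 | sed 's/.*: *"\([A-Z_]*\)".*/\1/')
    echo "$1 request $req_id is $status"
    case $status in
      COMPLETED) break ;;
      FAILED|ABORTED|TIMEDOUT) echo -e "\n$1 request $req_id ended as $status\n"; exit 1 ;;
    esac
    sleep 10
  done
}
```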
03-08-2022
07:04 AM
Hi @CookieCream, the Apache NiFi Toolkit Guide should walk you through generating the certificates. Once you have followed it and generated the certs, you will have a new nifi.properties that includes the truststore- and keystore-related properties. I have not tried anything on macOS, but I did see that there are some macOS-specific instructions. Have a look. Thanks snm1523
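For a quick start, a sketch of the toolkit's standalone mode (hostnames, the client DN and the output directory below are placeholders):

```bash
# Sketch: tls-toolkit standalone mode generates a CA, a keystore/truststore
# pair plus an updated nifi.properties for each listed node, and a client
# certificate for browser access.
./bin/tls-toolkit.sh standalone \
  -n 'nifi01.example.com,nifi02.example.com' \
  -C 'CN=admin,OU=NIFI' \
  -o ./target
```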
03-02-2022
01:35 AM
Here's your actual error:

```
2022-02-27 23:08:20,716 WARN [main] o.a.nifi.security.util.SslContextFactory Some truststore properties are populated (./conf/truststore.p12, ********, PKCS12) but not valid
2022-02-27 23:08:20,717 ERROR [main] o.apache.nifi.controller.FlowController Unable to start the flow controller because the TLS configuration was invalid: The truststore properties are not valid
```

Ensure that you have generated certificates (self-signed SSL in your case, I assume) and added them to the truststore of your NiFi instance (./conf/truststore.p12 in your case; the default location is ./conf/). Also ensure that the truststore and keystore properties in nifi.properties are accurately updated. This should help, I guess. Thanks snm1523
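A quick sanity check (paths assume NiFi's default conf directory; keytool prompts for the store password):

```bash
# Sketch: list the security properties NiFi will read, then confirm the
# truststore file exists and really is a readable PKCS12 store.
grep 'nifi.security' ./conf/nifi.properties
keytool -list -keystore ./conf/truststore.p12 -storetype PKCS12
```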
03-01-2022
05:46 AM
Hi All, We are in the early stages of planning an HDP 3 to CDP 7.1.7 Private Cloud Base upgrade and would like to hear from community members who have attempted this, especially in a scenario where Isilon 8.2.2 (planned to be upgraded to 9.x) is used instead of the standard HDFS service. A few points I would certainly need suggestions on:
1. What were the key considerations in this project?
2. How were the Hive metastore and the actual replications and backups handled? We can't use Replication Manager due to some specific limitations:
- CDP Replication Manager does a pull and converts all the managed tables to external at the destination.
- Our network connectivity is set up in a way that supports push jobs instead of pull.
So we will need a custom backup, restore and replication solution.
3. Since Isilon doesn't integrate with Cloudera Manager on CDP, Dell has suggested using their InsightIQ solution to fetch metrics from Isilon.
Any further suggestions / points to consider that I may not have mentioned above would certainly be welcome. Please suggest.
Thanks
snm1523
- Tags:
- CDP
- HDP
- HDP to CDP
02-24-2022
04:21 AM
1 Kudo
Have you got a company-wide CA cert or an SSL cert created and added to the NiFi truststore? Also, if NiFi is clustered, ensure that each cluster node's certificate is added to every other node's truststore. Additionally, ensure that the keystore and truststore properties in nifi.properties are updated properly, i.e. the correct passwords and locations are entered. Thanks snm1523
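Roughly, the cross-trust step looks like this (aliases and file names are placeholders; repeat on every node for each peer's certificate):

```bash
# Sketch: import a peer node's certificate into this node's truststore,
# then confirm the entry is present.
keytool -importcert -alias nifi02 -file nifi02.crt -keystore ./conf/truststore.jks
keytool -list -keystore ./conf/truststore.jks | grep -i nifi02
```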
05-24-2021
09:22 PM
Hello Vos, Please share an email address so I can send you the doc. Thanks Snm1523
05-14-2021
08:01 AM
@tusharkathpal, I ran into another issue and am working on that. Will revert with these results by Monday.
05-14-2021
08:00 AM
Hello @vidanimegh, We are not changing any permissions for users (including me) on the default DB; they are just the defaults (whatever they get once created). We manage permissions using Sentry on each DB that is created. We have verified the permissions are all okay in Sentry, as users (including me) are able to see / query the tables in Impala via Hue, but not in Hive. Thanks snm1523
05-14-2021
04:48 AM
Hello @tusharkathpal, Thank you for the reply. I have verified that all the users are in the correct groups, and the same is true on all nodes. Please suggest what can be checked further. Thanks snm1523
05-10-2021
07:53 AM
Any suggestions please?
03-23-2021
04:01 AM
Hello All, We have around 22 databases, and their respective tables are accessible via Impala in Hue but not via Hive for 3 newly added users. We get the error below, which relates to database permissions via Sentry. However, this looks strange to me, since the permissions are managed at the DB level and are not specific to a service; if the permissions were not correct, we should not have been able to access the databases via Impala either. Error message:

```
Error while compiling statement: FAILED: SemanticException No valid privileges User xxxxxx does not have privileges for SWITCHDATABASE The required privileges: Server=server1->Db=*->Table=+->Column=*->action=select->grantOption=false;Server=server1->Db=*->Table=+->Column=*->action=insert->grantOption=false;Server=server1->Db=*->Table=+->Column=*->action=alter->grantOption=false;Server=server1->Db=*->Table=+->Column=*->action=create->grantOption=false;Server=server1->Db=*->Table=+->Column=*->action=drop->grantOption=false;Server=server1->Db=*->Table=+->Column=*->action=index->grantOption=false;Server=server1->Db=*->Table=+->Column=*->action=lock->grantOption=false;
```

Query that works fine in Impala but not in Hive:

```sql
select * from dbname.tablename limit 5;
```

Please suggest what can be checked / done to fix this.
Thanks
snm1523
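For context, this is the kind of check we can run to see what HiveServer2 reads from Sentry (connection string, user, group and role names below are placeholders):

```bash
# Sketch: connect to HiveServer2 as one of the affected users and dump the
# roles and grants Sentry exposes there; comparing this with what Impala
# shows tells us whether the two services see the same Sentry state.
beeline -u "jdbc:hive2://hs2-host.example.com:10000/default" -n affected_user \
  -e "SHOW CURRENT ROLES" \
  -e "SHOW ROLE GRANT GROUP affected_group" \
  -e "SHOW GRANT ROLE analyst_role"
```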
Labels:
- Apache Hive
- Apache Impala
- Cloudera Hue
02-08-2021
07:15 AM
Hello, We have a strange issue here. We are on CDH 6.3.0 and have Sentry in place for authorization. A user is trying to execute queries from Hue in the Impala editor and fails with authorization errors. However, the same query works fine via Hive and also via impala-shell; it fails only from the Hue editor. We have:
1. Refreshed metadata
2. Invalidated metadata
3. Verified permissions are in place in Sentry
4. Found no logs generated (I checked Hue, Sentry Server and Catalog Server)
Please suggest if any other place should also be checked. Example query: select * from DB.table limit 5; Kindly help to diagnose and resolve this issue. Thanks Snm1523
01-19-2021
06:56 AM
Thank you for the reply @smdas. After looking at the logs, I found that the port was reported to be in use. In reality it was not being used, so possibly a caching issue. We rebooted the server and were then able to restart the DataNode successfully. Thanks snm1523
01-11-2021
04:05 AM
Hello, Does anyone have any clue about the error below? The DataNode is not coming up, with this error in the logs. We are on Cloudera Hadoop (CDH) 6.3.0. RECEIVED SIGNAL 15: SIGTERM Thanks snm1523
Labels:
- Apache Hadoop
10-07-2020
05:12 AM
Hello @Elf, Yes, I was able to. I have prepared a quick guide detailing the steps to be taken, which in turn refers to the Apache Griffin documentation and other required pre-reqs. I'm unsure how to share it with you, since I am unable to attach it here. Thanks snm1523
10-07-2020
05:07 AM
Hello Shishir, Would you mind sharing how to migrate a standalone NiFi setup to cluster mode? Thanks snm1523
08-31-2020
04:25 AM
Thank you... This helped.
08-20-2020
07:22 AM
Hi All,
Need opinions / thoughts / best practices.
Has anyone tried integrating Apache Griffin (an open-source data quality solution for big data; more details at http://griffin.apache.org/#about_page) with deployed CDH clusters?
If yes, is any documentation available on how to get this done? What are the requirements / prereqs? Any recommendations on best practices? Any additional information, help or documentation (apart from the link shared) would be highly beneficial.
Please share your thoughts / feedback / comments.
Thanks
SNM1523
03-20-2020
09:25 AM
Hello, If we go to Clusters > Kudu > Actions and run the Kudu Rebalancer Tool, will it distribute data across Kudu as a whole, or only across the Kudu masters or the tablet servers? The goal is to distribute the currently uneven data evenly across all nodes. If the above does not work, is there any other way to distribute / balance data evenly across all Kudu nodes? Also, do we need any downtime while rebalancing is in progress, unlike with HDFS? Overview of our infrastructure: 3 Kudu masters, 9 tablet servers, Kudu 1.7.0-cdh5.16.2 / CM 5.16.2. Requesting some advice / assistance on the same. Thanks Snm1523
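For reference, a sketch of the CLI route (master addresses are placeholders; tool and flag availability depend on the kudu CLI version shipped with the release):

```bash
# Sketch: check cluster health first, then preview what the rebalancer would
# move before letting it run. The rebalancer redistributes tablet replicas
# across tablet servers (masters hold no table data) and runs with the
# cluster online.
kudu cluster ksck master1:7051,master2:7051,master3:7051
kudu cluster rebalance master1:7051,master2:7051,master3:7051 --report_only
```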
Labels:
- Apache Kudu
02-14-2020
03:04 AM
Hello, We are in the process of deploying CDP on Azure for our test system and would later move to production. However, we are stuck on the port requirements for this deployment. As per the guidance from Cloudera, we were initially given a huge list of 7k+ port numbers that need to be opened. That certainly would not pass our network restrictions, and hence, after further deep dives and discussions, we are currently at the point of getting only port 6015 opened. We have also requested a better explanation of why these 7k+ port numbers, or just 6015, are required. While we wait for a response, I thought I would seek suggestions from the community, in case anybody who has deployed CDP on Azure has come across this. It would be great if anyone could advise on the below:
1. What is the real need for 6015? I did not find any reference to this port in any of the community documentation by Cloudera.
2. Will we really need the list of 7k+ ports to be opened? Or does anybody have a quick list of the required ports that they can share?
3. What other best practices / precautions did you consider when deploying CDP on Azure or any other cloud provider?
Thanks
snm1523
Labels:
- Cloudera Data Platform (CDP)
01-23-2020
03:09 AM
Hello, While browsing through various articles related to Cloudera Manager, I came across one that talks about exporting Cloudera Manager configs using a script, in which a wget from "${SCM_URL}/cmf/exportCLI" does the trick. Being fairly new to working with APIs, I tried to read up on CM APIs but did not find much. It would be great if someone could help me understand exactly which configurations are exported by the mentioned URL. Apologies if this seems to be a basic question; I thought I would ask the community instead of creating confusion for myself. Thanks snm1523
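For anyone else looking, the CM API's /cm/deployment endpoint is one documented way to snapshot the full configuration; a minimal sketch (host, credentials and API version below are placeholders):

```bash
# Sketch: dump the whole deployment description -- clusters, services, roles,
# hosts, users and all non-default configuration values -- as one JSON file.
curl -s -u admin:admin \
  "http://cm-host.example.com:7180/api/v19/cm/deployment" \
  -o cm-deployment.json
```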
- Tags:
- api
- cloudera manager
Labels:
- Cloudera Manager
01-23-2020
03:04 AM
Hello @GangWar, Thank you for the reply. Contacting Cloudera is certainly the best option, and we would most likely go for it. However, the project is currently in discussion, and I was just trying to get views / thoughts from the community on such a scenario, in case anybody has ever come across it. Thanks snm1523