Member since 01-16-2018 | 553 Posts | 37 Kudos Received | 91 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 148 | 03-10-2023 07:36 AM |
| | 106 | 03-10-2023 07:17 AM |
| | 119 | 02-28-2023 09:04 PM |
| | 97 | 02-28-2023 08:53 PM |
| | 97 | 02-28-2023 08:43 PM |
03-10-2023
07:41 AM
Hello @mingtian Hope you are doing well. We wish to check if your Q concerning Balancer skipping any Region-Movement has been answered by us. If Yes, Kindly mark the Post as Solved. If No, Feel free to share any Q pertaining to the Post. Regards, Smarak
03-10-2023
07:36 AM
Hello @Ivoz Thanks for using Cloudera Community. Kindly refer to [1] for PowerScale Compatibility with the CDP Stack. As per [1], CDP v7.1.7 SP1 supports PowerScale 9.2 & 9.3. As such, Your team can proceed with 9.2 without any concerns. Regards, Smarak [1] Third-party filesystem support: Dell EMC PowerScale (cloudera.com)
03-10-2023
07:17 AM
1 Kudo
Hello @josr89 Thanks for using Cloudera Community. You have mentioned Hadoop 3.1.1 & I am assuming you aren't referring to any HDP/CDH (Legacy) or CDP Platforms. The Steps to recover the Knox Admin Password are more akin to a reset of the Knox Admin Password & involve re-provisioning the Certificates and Credentials. This is a fairly risky Operation & we would recommend performing the same with Cloudera Support. The Doc [1] covers the same for your reference. Regards, Smarak [1] Change the Master Secret (cloudera.com)
03-05-2023
09:29 PM
Hello @snm1523 Thanks for the Checks done so far. For SPNEGO, I was referring to [1]. This is based on the assumption that Kerberos is enabled for the Solr Service. If Yes, Kindly review [1]. Considering the AuthN is confirmed & the SPNEGO Check would confirm AuthZ, I am unable to identify additional factors that may cause such issues. As a Sanity Check, can you confirm whether the CLI [2] is working correctly for your team? Regards, Smarak [1] https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/security-how-to-guides/topics/cm-security-enable-web-auth-s19.html [2] https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/search-solrctl-reference/topics/search-solrctl-ref.html
03-03-2023
08:04 AM
Hello @AmitBIDWH We hope your Team's queries have been addressed by us. As such, We shall mark the Post as Solved now. If you have any further ask, Feel free to Update the Case likewise. Regards, Smarak
03-03-2023
08:04 AM
Hello @snm1523 Thanks for using Cloudera Community. Generally, I have observed such "Red Lines" when the User isn't authorised correctly. First, review the Ranger Permissions against your Username & confirm that the right privileges are granted. Second, confirm whether the Issue persists in Incognito/Private Browser mode to rule out any browser-specific concerns. Third, confirm that SPNEGO is set up correctly. If the above Checks don't yield the expected Outcome, You may prefer opening a Support Case for our Support folks to engage with your Team for a quicker resolution. Regards, Smarak
02-28-2023
09:04 PM
Hello @quangbilly79 Thanks for using Cloudera Community. The "Spark Master" refers to the Resource Manager responsible for allocating resources. Since you are using YARN, Your Team needs to use "--master yarn". The usage of "--master spark://<IP Address>:7077" is for a Spark Standalone Cluster, which isn't the Case for your team. To your Observation concerning the "Driver Instance" & "Worker Instance" being added via "Add Role Instance", there is no such Option as YARN is the Resource Manager, which allocates the resources for the Spark Driver & Executors. Review [1] for the usage of "--master" as well. Hope the above answers your Team's queries. Regards, Smarak [1] https://spark.apache.org/docs/latest/submitting-applications.html#launching-applications-with-spark-submit
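As a minimal sketch of the point above, the snippet below assembles a spark-submit command line that targets YARN instead of a Standalone master URL. The helper name `build_spark_submit` and the application path are hypothetical and used only for illustration; `--master` and `--deploy-mode` are standard spark-submit options.

```python
import shlex


def build_spark_submit(app_path, deploy_mode="client", extra_args=()):
    """Return a spark-submit argv list that uses YARN as the Spark Master.

    No spark://<host>:7077 URL is involved: YARN allocates the resources
    for the Driver and Executors.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", deploy_mode,
        app_path,
        *extra_args,
    ]


# Hypothetical application path, for illustration only.
cmd = build_spark_submit("/path/to/app.py")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` on a Host that has the Spark Gateway role, so that the YARN Client Configs are already in place.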
02-28-2023
08:53 PM
Hello @ighack Thanks for using Cloudera Community. In such Cases, Kindly review the StdOut & StdErr within the Directory shared in the 1st line of the Screenshot (Ending with "8464-hbase-MASTER"). These 2 files offer additional details into the JVM StartUp. Note that the Role Log is Useful if the Role was terminated by issues specific to HBase, while the StdOut & StdErr are Useful when OS/JVM concerns cause the Role StartUp issues. Regards, Smarak
02-28-2023
08:43 PM
Hello @Menyawy Thanks for using Cloudera Community. The Error [1] indicates your Team is using CDH v5.x with HBase v1.x. In this release, We have limited choices to ensure the "hbase:namespace" Table Region is assigned correctly, as shared below:
(I) Easier Yet Not-Always-Successful Option:
1) Shutdown HBase Service
2) echo "rmr /hbase" | hbase zkcli (From HMaster Node)
3) Start HBase Service
4) sudo -u hbase hbase hbck -fixAssignments
(II) Review whether any of the "hbase:namespace" Table Region files are reporting MissingBlockException. If Yes, We need to fix the MissingBlockException.
(III) The RegionServer Logs (Wherein the "hbase:namespace" Table Region is being assigned) may offer additional insight into why the RegionServer is Unable to assign the "hbase:namespace" Table Region successfully. Your Team may review the concerned RegionServer Logs as well.
There may be Other Options, yet such Inconsistencies are better handled in HBase 2.x [1] & your Team should plan for using HBase v2.x soon. Regards, Smarak [1] https://github.com/apache/hbase-operator-tools/blob/master/hbase-hbck2/README.md#master-startup-cannot-progress-in-holding-pattern-until-region-onlined
02-28-2023
08:36 AM
2 Kudos
Hello @BrianChan Thanks for using Cloudera Community. I believe CDSW isn't included in the Free Tier [0]. Your Team should review CML (Cloudera Machine Learning), which is available in CDP Public Cloud & CDP Private Cloud, with [1] offering details into the Difference between CDSW & CML. As noted in [1], CML expands the functionalities of CDSW further. CML is included in the "CDP Public Cloud" Trial [2] & in the "CDP Private Cloud Data Services" Trial [3]. Regards, Smarak [0] https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/installation/topics/cdpdc-setup-trial-cluster-using-wizard.html [1] https://docs.cloudera.com/machine-learning/1.3.4/product/topics/ml-cdsw-key-differences.html [2] https://www.cloudera.com/campaign/try-cdp-public-cloud.html [3] https://www.cloudera.com/downloads/cdp-private-cloud-trial.html
02-27-2023
10:59 PM
Hello @AmitBIDWH Thanks for using Cloudera Community. HBase offers Tools for Use-Cases such as Bulk-Loading, Exporting, Importing, CopyTable etc. Additionally, a few Tasks such as Compactions are managed implicitly by Customer. We would suggest reviewing [1] for any HBase concerns. To your Q concerning Create > Admin > Load, BulkLoad [2] is one of the Best Tools for Large-Scale Loading. Additionally, Your Team may use HBase Shell or REST APIs [3] for further activities. Kindly review & let us know if you have any concerns. Regards, Smarak [1] https://hbase.apache.org/book.html [2] https://hbase.apache.org/book.html#arch.bulk.load [3] https://hbase.apache.org/book.html#_rest
02-27-2023
10:54 PM
1 Kudo
Hello @bgkim Thanks for using Cloudera Community. To your Q, the Composite Primary Key would require using both A & B in the WHERE Clause as the Indexing is done collectively. As such, Your SELECT Query would ideally benefit from creating a Local Index on A & C. You may review [1], as Read-Heavy Use-Cases benefit from a Global Index, with a Penalty incurred during Writes. Additionally, Phoenix offers Covered Indexes & the Explain Plan helps confirm the Index Usage. Link [2] offers a few examples as well. With all recommendations, the Best Advice is always to review the Performance internally prior to implementing them in Production. Regards, Smarak [1] https://phoenix.apache.org/secondary_indexing.html [2] https://learn.microsoft.com/en-us/azure/hdinsight/hbase/apache-hbase-phoenix-performance
01-24-2023
11:30 PM
Hello @Ellengogo I am not able to find the exact Case reference as this Post is a few months old. Typically, such issues are caused by a Resource Constraint, wherein the Engine Pod (Created when a Workbench Session is started) gets terminated. If the issue is persistent, a Support Case would be ideal as we require a review of the Kubernetes Output pertaining to the Engine Pod along with the Resource Profile & other related artefacts. Regards, Smarak
01-18-2023
12:10 AM
Hello @Girija Since we haven't heard back from you, We shall mark the Post as Solved. If you happen to have any further ask, Feel free to Update the Post. In Summary, Internally, I wasn't able to replicate the issue being faced by you as I was able to create the ConfigSet using "_default" ConfigSet as baseConfig. Customer can use the below solrctl command to create a ConfigSet with Solr KeyTab: solrctl config --create Test_Config _default -p configSetProp.immutable=false Assuming the above Command fails, Running the solrctl command with "--trace" after "solrctl" & before "config" would print the trace logging & assist in troubleshooting the issue faced by your team. Regards, Smarak
01-17-2023
11:03 PM
Hello @pankshiv1809 Since we haven't heard from your side concerning the Post, We are marking the Post as Solved. If you have any further ask, Feel free to update the Post & we shall get back to you accordingly. Regards, Smarak
01-17-2023
10:46 PM
Hello @mingtian Note that Debug Logging won't make the Balancer perform Region Movement; rather, it confirms whether the Balancer is running yet not moving any Region owing to the Cost Factor. Example: I ensured 1 RegionServer didn't have any Region by moving its Regions away & triggered a Balancer run, which logged [1] & triggered a Region Movement (Note "Found a solution that moves 1 regions"). After the 1st Balancer run completed, I triggered a 2nd Balancer run, which logged [2], wherein the log reports "skipping load balancing". I believe your Team would see [2] i.e. the Balancer is skipping Load Balancing owing to the Cost Factor. As such, HBase is rejecting Region Movement because any new Region Movement is "Costlier" than the Current Region Placement. Tweaking the Cost Parameters in [3], including setting "hbase.master.loadbalance.bytable" to "true", should help trigger a Balancer run for your Team. Regards, Smarak [1] 2023-01-18 06:38:33,290 INFO org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Finished computing new moving plan. Computation took 95 ms to try 7200 different iterations. Found a solution that moves 1 regions; Going from a computed imbalance of 0.4961380973335763 to a new imbalance of 0.020487264673311183. 
funtionCost=RegionCountSkewCostFunction : (multiplier=500.0, imbalance=0.0); PrimaryRegionCountSkewCostFunction : (not needed); MoveCostFunction : (multiplier=7.0, imbalance=0.3333333333333333, need balance); ServerLocalityCostFunction : (multiplier=25.0, imbalance=0.0); RackLocalityCostFunction : (multiplier=15.0, imbalance=0.0); TableSkewCostFunction : (multiplier=35.0, imbalance=0.0); RegionReplicaHostCostFunction : (not needed); RegionReplicaRackCostFunction : (not needed); ReadRequestCostFunction : (multiplier=5.0, imbalance=1.0, need balance); WriteRequestCostFunction : (multiplier=5.0, imbalance=1.0, need balance); MemStoreSizeCostFunction : (multiplier=5.0, imbalance=0.0); StoreFileCostFunction : (multiplier=5.0, imbalance=0.0); [2] 2023-01-18 06:39:05,365 INFO org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Cluster wide - skipping load balancing because weighted average imbalance=0.013858568086431631 <= threshold(0.025). If you want more aggressive balancing, either lower hbase.master.balancer.stochastic.minCostNeedBalance from 0.025 or increase the relative multiplier(s) of the specific cost function(s). functionCost=RegionCountSkewCostFunction : (multiplier=500.0, imbalance=0.0); PrimaryRegionCountSkewCostFunction : (not needed); MoveCostFunction : (multiplier=7.0, imbalance=0.0); ServerLocalityCostFunction : (multiplier=25.0, imbalance=0.0); RackLocalityCostFunction : (multiplier=15.0, imbalance=0.0); TableSkewCostFunction : (multiplier=35.0, imbalance=0.0); RegionReplicaHostCostFunction : (not needed); RegionReplicaRackCostFunction : (not needed); ReadRequestCostFunction : (multiplier=5.0, imbalance=0.6685715976063684, need balance); WriteRequestCostFunction : (multiplier=5.0, imbalance=1.0, need balance); MemStoreSizeCostFunction : (multiplier=5.0, imbalance=0.0); StoreFileCostFunction : (multiplier=5.0, imbalance=0.0); [3] StochasticLoadBalancer (Apache HBase 3.0.0-alpha-4-SNAPSHOT API)
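To make the "Cost Factor" in the logs above concrete, here is a rough sketch that parses the functionCost string from [2] and recomputes the reported imbalance. It assumes the figure is the multiplier-weighted average of the per-function imbalances; that assumption is inferred from the numbers in [1] and [2], not taken from the HBase source, and the helper name `weighted_imbalance` is hypothetical.

```python
import re

# Matches entries such as "MoveCostFunction : (multiplier=7.0, imbalance=0.0".
# Entries reported as "(not needed)" carry no multiplier and are skipped.
COST_RE = re.compile(
    r"(\w+CostFunction) : \(multiplier=([\d.]+), imbalance=([\d.eE+-]+)"
)


def weighted_imbalance(function_cost):
    """Multiplier-weighted average of the per-function imbalances (assumed formula)."""
    entries = COST_RE.findall(function_cost)
    total_weight = sum(float(mult) for _, mult, _ in entries)
    weighted_sum = sum(float(mult) * float(imb) for _, mult, imb in entries)
    return weighted_sum / total_weight


# functionCost string copied from log [2] above.
LOG_2 = (
    "RegionCountSkewCostFunction : (multiplier=500.0, imbalance=0.0); "
    "PrimaryRegionCountSkewCostFunction : (not needed); "
    "MoveCostFunction : (multiplier=7.0, imbalance=0.0); "
    "ServerLocalityCostFunction : (multiplier=25.0, imbalance=0.0); "
    "RackLocalityCostFunction : (multiplier=15.0, imbalance=0.0); "
    "TableSkewCostFunction : (multiplier=35.0, imbalance=0.0); "
    "RegionReplicaHostCostFunction : (not needed); "
    "RegionReplicaRackCostFunction : (not needed); "
    "ReadRequestCostFunction : (multiplier=5.0, imbalance=0.6685715976063684, need balance); "
    "WriteRequestCostFunction : (multiplier=5.0, imbalance=1.0, need balance); "
    "MemStoreSizeCostFunction : (multiplier=5.0, imbalance=0.0); "
    "StoreFileCostFunction : (multiplier=5.0, imbalance=0.0);"
)

THRESHOLD = 0.025  # minCostNeedBalance value reported in log [2]

imbalance = weighted_imbalance(LOG_2)
skips_balancing = imbalance <= THRESHOLD
print(imbalance, skips_balancing)
```

Under this reading, only the Read/Write Request functions contribute in [2], and their multiplier of 5.0 is small next to RegionCountSkewCostFunction at 500.0, which is why the total stays below the threshold and the run is skipped.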
01-16-2023
09:09 PM
Hello @Ryan_2002 Thanks for engaging Cloudera Community. First of all, Thank You for the detailed description of the Problem. I believe your ask is Valid, yet reviewing the same over a Community Post isn't a suitable approach. Would it be feasible for you to engage Cloudera Support, so that our Team can work with you via a Screen-Sharing Session as well as Logs exchange, neither of which is feasible in Community? That would greatly expedite the review of your ask. Regards, Smarak
01-16-2023
01:57 AM
Hello @mingtian Hope you are doing well. We wish to follow-up with you & check if the DEBUG Logging assisted in confirming the reasoning for Balancer Algorithm deciding against Region-Movement. If Yes, Kindly let us know if your Q in the Post has been answered or any further Q remains. Regards, Smarak
01-16-2023
01:55 AM
Hello @pankshiv1809 Hope you are doing well. We wish to follow-up on the Post & confirm whether your Team was requesting information on Dynamic Allocation [1], which allows Spark to adjust resources based on Workload requirement. Regards, Smarak [1] Job Scheduling - Spark 3.3.1 Documentation (apache.org)
01-16-2023
01:54 AM
Hello @panb We hope your Query has been addressed by us & shall mark the Post as Resolved. In Summary, Your Team needs to meet the requirement as stated in [1], which doesn't differentiate by Processor Type & I believe your Team is referring to the Hygon Dhyana Processor. Note that the Hardware requirement shared is for CDP v7.1.8, as CDH isn't recommended now owing to End-Of-Life. As a Best Practice, I would suggest engaging with the Cloudera Account Team associated with the Customer to perform any due diligence with respect to Supportability & Best Practices prior to onboarding Use-Cases into any new Platform wherein Supportability is doubted by your Team. Regards, Smarak [1] Hardware Requirements | CDP Private Cloud (cloudera.com)
01-13-2023
01:59 AM
Hello @pankshiv1809 Thanks for using Cloudera Community. Based on your Post, I am assuming [1] would help i.e. Using Dynamic Allocation to allow Spark to adjust resources based on Workload requirement. Regards, Smarak [1] Job Scheduling - Spark 3.3.1 Documentation (apache.org)
01-13-2023
01:56 AM
Hello @panb Thanks for using Cloudera Community. As far as I am aware, your Team needs to meet the requirement as stated in [1], which doesn't differentiate by Processor Type & I believe your Team is referring to the Hygon Dhyana Processor. Note that the Hardware requirement shared is for CDP v7.1.8, as CDH isn't recommended now owing to End-Of-Life. Regards, Smarak [1] Hardware Requirements | CDP Private Cloud (cloudera.com)
01-13-2023
01:47 AM
Hello @mingtian Thanks for using Cloudera Community. Based on your Post, We would suggest enabling DEBUG Logging for HMaster (Via HMaster UI To Avoid Any Restart) & triggering the Balancer. Generally, the Balancer Algorithm may be deciding against running any Region-Alignment owing to the Cost Factor [1]. The HMaster Debug Log would print such Balancer information for your review, upon which the Params discussed in [1] can be tuned to force the Balancer, yet the Default Params are generally retained for most Use-Cases. Note that the Balancer's Job isn't merely to fit Equal Regions per RegionServer. The Balancer considers various Costs as defined by [1] before proceeding with Region-Alignment. Regards, Smarak [1] StochasticLoadBalancer (Apache HBase 3.0.0-alpha-4-SNAPSHOT API)
01-13-2023
01:35 AM
1 Kudo
Hello @Ryan_2002 Thanks for using Cloudera Community. To your Q, the Driver Cap is the Engine/Resource Profile & the Executor's Resource Usage is defined by the SparkSession or "spark-defaults.conf" file within the Project wherein the Workbench Session is being created. Your Team can review the Pods in the User's Namespace & see the same i.e. upon a Workbench Session Creation, an Engine Pod is started with "Limits" set to the Engine/Resource Profile Settings. After the SparkSession is initialised, additional Pods are generated within the User's Namespace based on the Executor Configs passed via the SparkSession or "spark-defaults.conf" file. You may configure the Executor Configs as per your usage, yet the same depends on the CML Workspace AutoScale Range & InstanceType. Say, an InstanceType supporting 8 vCPU & Executors requesting 8 vCPU won't work. Similarly, an AutoScale Max of 5 with Executors collectively requesting the full Resource Limit of 5 Nodes won't work. Hope the above helps answer your Post's queries. If Yes, Kindly mark the Post as Solved. If No, Feel free to share your concerns & we shall address accordingly. Regards, Smarak
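The sizing constraint above can be sketched as a simple feasibility check. This is only an illustration under a stated assumption: that part of each Node's vCPU is held back for system Pods and other overhead, which the Post does not quantify. The function names and the `reserved_vcpu` value of 1.0 are hypothetical, not CML settings.

```python
def executor_fits_node(node_vcpu, executor_vcpu, reserved_vcpu):
    """True if a single Executor's vCPU request fits on one Node.

    reserved_vcpu is a hypothetical per-Node share held back for overhead.
    """
    return executor_vcpu <= node_vcpu - reserved_vcpu


def executors_fit_workspace(node_vcpu, autoscale_max, executor_vcpu,
                            executor_count, reserved_vcpu):
    """True if all Executors can be placed within the AutoScale Max Node count."""
    per_node = int((node_vcpu - reserved_vcpu) // executor_vcpu)
    return executor_count <= per_node * autoscale_max


RESERVED = 1.0  # assumed overhead per Node, for illustration only

# An 8 vCPU InstanceType cannot host an Executor requesting all 8 vCPU.
print(executor_fits_node(8, 8, RESERVED))
# Smaller Executors fit, up to the capacity of the AutoScale Max of 5 Nodes.
print(executors_fit_workspace(8, 5, 2, 15, RESERVED))
```

The real allocatable capacity per Node should be read from the Kubernetes Node description rather than assumed.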
01-02-2023
10:03 PM
Hello @quangbilly79 Thanks for using Cloudera Community. Based on your Post, you may consider the "Kafka Gateway" as the Client for Kafka, which is set up on the Hosts wherein the same is added as per Cloudera Manager "Assign Roles". A Client/Gateway is familiar with the Service (Kafka in this Case) & all Client/Service Configs are available to the Client/Gateway without any manual intervention. Any changes made to the Service or Client Configs are pushed to the Service/Client Configuration by Cloudera Manager. Imagine a Scenario wherein you wish to run "hdfs dfs -ls" on an HDFS FileSystem. Simply running the Command won't work unless the Host wherein the Command "hdfs dfs -ls" is being run knows the Setup (HDFS FileSystem, NameNode, Port, Protocol). Review [1] for an Example. Adding an HDFS Gateway ensures the User doesn't need to manually configure a Client/Gateway, with Cloudera Manager doing the needful. The Kafka Gateway operates similarly. Else, the Customer needs to manually configure the Client/Gateway Setup. Hope the above answers your query concerning the Gateway Role. Regards, Smarak [1] https://www.ibm.com/docs/en/spectrum-scale-bda?topic=hdfs-clients-configuration
12-27-2022
10:30 PM
Hello @sachin_saju Thanks for using Cloudera Community. You have 2 asks in the Post: 1. How to configure different Storage Policies for Cold & Hot Data, 2. Applying different Compression Algorithms in 1 Column Family. For Q2, I believe the same isn't feasible i.e. the Compression Algorithm can only be set at CF level. Review [1] for the Compression Algorithm recommendation around Hot & Cold type data. For Q1, I assume you are referring to HDFS Storage Policy. If Yes, the same is configured uniformly i.e. I am not sure if we can apply different HDFS Storage Policies for different data within the same CF. In HBase, We generally recommend SSD [2] for WAL, else the HBase Data relies on the HDFS Storage Policy used. Alternatively, Use BackUp-Restore [3] for having a "Cold" Version of Data, which can be restored as per requirement. Regards, Smarak [1] https://hbase.apache.org/book.html#data.block.encoding.types [2] https://docs.cloudera.com/cdp-private-cloud-base/7.1.8/configuring-hbase/topics/hbase-configure-storage-policy-wal.html [3] https://hbase.apache.org/book.html#br.overview
12-27-2022
10:19 PM
1 Kudo
Hello @sachin_saju Thanks for using Cloudera Community. Your queries concerning the Read Path are discussed between a fellow Community User & myself in [1]. Kindly review the same & let us know if it answers the queries around the Read Path. In Summary, the Read Path relies on a Merge of BlockCache & MemStore prior to returning the Output to the End-User, thereby avoiding any Inconsistent Read. Refer to [2] for a few Diagrams that help explain the Read Merge Path. Concerning Doubt # 3, Our Community User asked a similar Q in [3]. I haven't reviewed this Use-Case internally around Hit/Miss Ratio in the UI to answer the same. Hence, I shall let our fellow HBase Engineers answer [3], which may answer your Q3 as well. Barring Q3, Let me know if your first 2 queries are addressed by [1] & [2]. Regards, Smarak [1] https://community.cloudera.com/t5/Support-Questions/Is-it-possible-for-inconsistent-read-in-Hbase-with-Memcache/m-p/359452#M238123 [2] https://nag-9-s.gitbook.io/hbase/hbase-architecture/hbase-read-merge [3] https://community.cloudera.com/t5/Support-Questions/Create-cache-miss-scenario-in-HBase-with-HDP-2-6-5/m-p/359795
12-25-2022
05:50 AM
Hello @Serhii This is an Old Post, yet I am answering the same for Community awareness as there are a few changes with recent CDP releases. CDP v7.1.6 [1] allows Accumulo to be installed via Cloudera Manager. The Installation is documented in [1] & requires a Separate Parcel to be installed before attempting to add Accumulo via Cloudera Manager. Having said that, the investment into Accumulo isn't on par with other similar counterparts, so Feel free to engage with the Cloudera Account Team for the Customer to review any long term engagement with Accumulo for meeting the Customer's Use-Case. Regards, Smarak [1] https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/opdb-accumulo-installation/topics/opdb-accumulo-install.html
12-23-2022
07:04 AM
Hello @Girija Internally, I wasn't able to replicate the issue being faced by you. I was able to create the ConfigSet using "_default" ConfigSet as baseConfig. I am assuming the issue is specific to your Environment & your team should use the CLI to better diagnose such issue. Your team can use the below solrctl command to create a ConfigSet with Solr KeyTab: solrctl config --create Test_Config _default -p configSetProp.immutable=false Assuming the above Command fails, Running the solrctl command with "--trace" after "solrctl" & before "config" would print the trace logging & assist in troubleshooting the issue faced by your team. Regards, Smarak
12-21-2022
07:04 AM
1 Kudo
Hello @SagarCapG Confirmed that Phoenix v5.1.0 has the Fix for "!primarykeys" to show the Primary Key linked with a Phoenix Table. Upon checking our Product Documentation, CDP v7.1.6 introduces Phoenix v5.1.0 [1]. As such, I am surprised your Team has Phoenix v5.0.0 with CDP v7.1.7, wherein the Official v7.1.7 Doc [2] says Phoenix v5.1.1.7.1.7.0-551 is used. Since the Issue is fixed in Phoenix v5.1.x & CDP v7.1.6 onwards ships Phoenix v5.1.x, Kindly engage Cloudera Support to review your Cluster & identify why CDP v7.1.7 is using Phoenix v5.0.0. Or, Upgrade to Phoenix v5.1.x (If Customer is managing Phoenix outside of CDP) to use the "!primarykeys" functionality. Regards, Smarak [1] What's New in Apache Phoenix | CDP Private Cloud (cloudera.com) [2] Cloudera Runtime component versions | CDP Private Cloud