Member since: 07-31-2013
Posts: 1924
Kudos Received: 462
Solutions: 311

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1542 | 07-09-2019 12:53 AM |
| | 9287 | 06-23-2019 08:37 PM |
| | 8049 | 06-18-2019 11:28 PM |
| | 8675 | 05-23-2019 08:46 PM |
| | 3473 | 05-20-2019 01:14 AM |
05-07-2019
05:24 PM
The simplest way is through Cloudera Hue. See http://gethue.com/new-apache-oozie-workflow-coordinator-bundle-editors/ That said, if you've attempted something and have run into issues, please add more details so the community can help you on specific topics.
05-07-2019
05:21 PM
It would help if you added some description of what you have found or attempted, instead of just a broad question. Which load balancer are you choosing to use? We have some sample HAProxy configs for Impala at https://www.cloudera.com/documentation/enterprise/latest/topics/impala_proxy.html#tut_proxy that can be repurposed for other components. Hue also offers its own pre-optimized load balancer as roles in Cloudera Manager that you can add and have set up automatically: https://www.cloudera.com/documentation/enterprise/latest/topics/hue_perf_tuning.html
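As a rough illustration of the kind of HAProxy config the linked Impala docs describe (hostnames and ports below are placeholders, not taken from this thread; see the docs for a production-ready version), a minimal TCP balancer for impala-shell traffic might look like:

```
# Minimal HAProxy sketch for balancing Impala shell clients (port 21000).
# Server names and hostnames are illustrative placeholders.
listen impala
    bind 0.0.0.0:21000
    mode tcp
    balance leastconn
    server impalad1 impalad-host-1.example.com:21000 check
    server impalad2 impalad-host-2.example.com:21000 check
```

JDBC/ODBC traffic (port 21050) would typically get a similar second `listen` block.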
05-05-2019
08:58 PM
> So if I want to fetch all defined mapreduce properties, can I use this API, or does it have any prerequisites?

Yes, you can. The default role config group almost always exists even if role instances do not, but if it doesn't (such as in a heavily API-driven install), you can create one before you fetch.

> Also, does it require any privileges to access this API?

A read-only user should also be able to fetch configs as a GET call over the API. However, if there are configs marked as secured (such as configs that carry passwords, etc.), then retrieving their values will require admin privileges; they will otherwise appear redacted.
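As a sketch of such a GET call (hostname, credentials, cluster, and service names are placeholders), fetching the full service-level config could look like:

```
# Fetch the YARN service config, including defaults, as a read-only user.
# Secured values will appear redacted unless the user has admin privileges.
curl -u readonly_user:password \
  "http://cm-host.example.com:7180/api/v15/clusters/MyClusterName/services/YARN-1/config?view=full"
```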
05-05-2019
06:08 PM
@priyanka2,

> But in the yarn, there is no role of type Gateway for my cluster.
> So is there any other way to fetch mapreduce properties?

There may still be a role config group for it. You can use the roleConfigGroups endpoint to access its configs, something like:

`curl -u auth:props -v http://cm-host.com:7180/api/v15/clusters/MyClusterName/services/YARN-1/roleConfigGroups/YARN-1-GATEWAY-BASE/config?view=full`

> Could you please explain what could be the reason for that?

The NodeManagers do not require MR client-side properties, only properties related to services they may need to contact and the MR shuffle service plugin configs. The NM is not involved in the MR app-side framework execution, so its mapred-site.xml carries only a subset, as you've observed.

@mikefisch, IIUC, you are looking for a way to assign roles to specific hosts? Use the POST call described here, for each service endpoint: https://cloudera.github.io/cm_api/apidocs/v19/path__clusters_-clusterName-_services_-serviceName-_roles.html Specifically, the roles list needs a structure that also requires a host reference ID, which you can grab from the cluster hosts endpoint prior to this step. There's also a simpler auto-assign feature available: https://cloudera.github.io/cm_api/apidocs/v19/path__clusters_-clusterName-_autoAssignRoles.html
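To illustrate the structure the roles POST call expects (the role type and host ID below are placeholders; check the linked API docs for the authoritative schema), the request body is a list of role objects, each carrying a role type and a host reference:

```
{
  "items": [
    {
      "type": "NODEMANAGER",
      "hostRef": { "hostId": "host-id-from-the-hosts-endpoint" }
    }
  ]
}
```

The `hostId` value comes from a prior GET against the cluster hosts endpoint, as noted above.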
04-10-2019
10:15 PM
As @Tomas79 explains, making that change will have no consequence whatsoever for your described problem, as these files are not deleted by the writer (the way regular service log files are). You'll need to delete older log files on your own, regardless of the maximum file size you specify for each rolled log. You can consider using something like logrotate on Linux to automate this.
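As a minimal sketch of such automation (the path and retention values are placeholders; adjust them to the actual directory in question), a logrotate stanza dropped into /etc/logrotate.d/ could look like:

```
# Illustrative logrotate config: keep four weekly rotations of old files
# and discard anything older. Path and schedule are placeholders.
/var/log/example-service/*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}
```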
04-10-2019
12:31 AM
1 Kudo
One possibility could be the fetch size (combined with some unexpectedly wide rows). Does lowering the result fetch size help? From http://sqoop.apache.org/docs/1.4.7/SqoopUserGuide.html#idp774390917888:

--fetch-size: Number of entries to read from database at once.

Also, do you always see it fail with the YARN memory kill (due to pmem exhaustion), or do you also occasionally observe an actual java.lang.OutOfMemoryError? If it is always the former, then another suspect would be some off-heap memory use by the JDBC driver, although I've not come across such a problem.
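For illustration, the option is passed on the import command line; the connect string, table, and credentials below are placeholders:

```
# Lower the JDBC fetch size so fewer (possibly very wide) rows
# are buffered in memory at once per read.
sqoop import \
  --connect jdbc:mysql://db-host.example.com/mydb \
  --username myuser -P \
  --table wide_table \
  --fetch-size 100
```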
04-09-2019
10:00 PM
1 Kudo
To add on: if you will not require audits or lineage at all for your cluster, you can also choose to disable their creation:

- Impala > Configuration > "Enable Impala Lineage Generation" (uncheck)
- Impala > Configuration > "Enable Impala Audit Event Generation" (uncheck)

If you are using Navigator with Cloudera Enterprise, then these audit and lineage files should be sent automatically to the Navigator services. If they are not passing through, it may be an indicator of a problem in the pipeline; please raise a support case if this is true.
04-03-2019
06:53 PM
Is the job submitted to the source cluster, or the destination? The DistCp job should only need to contact the NodeManagers of the cluster it runs on, but if it is submitted to the remote cluster, then those ports may need to be opened. The HDFS transfer part does not involve YARN service communication at all, so it is not expected to contact a NodeManager. It would be helpful if you could share some more logs leading up to the observed failure.
04-03-2019
02:10 AM
For CDH / CDK Kafka users, the command is already in your PATH as "kafka-consumer-groups".
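For example (the broker address and group name below are placeholders):

```
# List all consumer groups known to the cluster.
kafka-consumer-groups --bootstrap-server broker-1.example.com:9092 --list

# Show per-partition offsets and lag for one group.
kafka-consumer-groups --bootstrap-server broker-1.example.com:9092 \
  --describe --group my-consumer-group
```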
04-01-2019
06:55 PM
1 Kudo
Could you share the full log from this failure, both from the Oozie server (for the action ID) and from the action launcher job's map task? Port 8042 is the NodeManager HTTP port, useful for serving logs of live containers among other status details over REST. It is not directly used by DistCp in its functions, but MapReduce and Oozie diagnostics might be invoking it as part of a response to a failure, so it is a secondary symptom. Note, though, that running DistCp via Oozie requires you to provide appropriate configs that ensure delegation tokens for both kerberized clusters are acquired. Use "mapreduce.job.hdfs-servers" with a value such as "hdfs://namenode-cluster-1,hdfs://namenode-cluster-2" to influence this during the Oozie server's delegation token acquisition phase. This is only relevant if you use Kerberos on both clusters.
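As a sketch of where that property goes (action name, paths, and NameNode URIs below are placeholders; the schema version should match what your Oozie supports), a DistCp action in the workflow XML could look like:

```xml
<action name="copy-data">
    <distcp xmlns="uri:oozie:distcp-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <!-- Ensure delegation tokens are acquired for both kerberized clusters -->
            <property>
                <name>mapreduce.job.hdfs-servers</name>
                <value>hdfs://namenode-cluster-1,hdfs://namenode-cluster-2</value>
            </property>
        </configuration>
        <arg>hdfs://namenode-cluster-1/src/path</arg>
        <arg>hdfs://namenode-cluster-2/dest/path</arg>
    </distcp>
    <ok to="end"/>
    <error to="fail"/>
</action>
```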