Member since
01-27-2017
28
Posts
10
Kudos Received
3
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 13547 | 06-13-2017 12:21 PM
 | 2200 | 06-13-2017 12:13 PM
 | 9542 | 04-06-2017 02:13 PM
08-13-2019
12:43 AM
This can vary from case to case. In some cases we need to remove and replace the contents of the /opt/cloudera/parcels/CDH/lib/hadoop/etc/hadoop directory with a copy from another node of the same cluster, if this node was added from some other cluster. This is because the directory may not have been cleared before the server was added to the new cluster. The existing files from the old cluster may not work as expected in the new cluster, since the parameters in the configuration files can vary from cluster to cluster. So we need to forcefully remove the contents and add them back manually.
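For illustration only, the replacement could look roughly like this (a sketch, not an official procedure; good-node.example.com is a placeholder for any healthy node in the target cluster, and the parcel path may differ by CDH version):
# back up the stale config directory carried over from the old cluster
CONF_DIR=/opt/cloudera/parcels/CDH/lib/hadoop/etc/hadoop
mv "$CONF_DIR" "${CONF_DIR}.bak.$(date +%F)"
mkdir -p "$CONF_DIR"
# copy the equivalent directory from a node that is already healthy in this cluster
scp -r root@good-node.example.com:"$CONF_DIR"/. "$CONF_DIR"/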
... View more
02-21-2019
02:24 AM
I came across https://rapids.ai, a Spark framework, so maybe we have something that can run on GPUs. I believe GPUs are a good fit for big data; I had no idea earlier, but the performance in some cases is 5x faster, so it is worth running big data jobs on GPUs, though it needs some setup.
... View more
10-10-2018
10:34 AM
Hi Borg! I think it may be too late for this response 🙂 but let me try 🙂
> Are you asking how to change an active topic size or set the default for newly created topics?
An active topic size.
> Also in terms of "size" of the topic are you referring to partitions or messages?
GBytes 🙂
> Lastly, are you referring to topics being created via the command line or newly created topics from a client?
Any 🙂
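If "GBytes" means how much data an existing (active) topic is allowed to retain, one common approach is a per-topic retention override via kafka-configs. This is only a sketch: zk01.example.com and my-topic are placeholders, and note that retention.bytes applies per partition, not per topic.
# cap each partition of the topic at ~10 GB (10737418240 bytes)
kafka-configs --zookeeper zk01.example.com:2181 --alter --entity-type topics --entity-name my-topic --add-config retention.bytes=10737418240
# confirm the override took effect
kafka-configs --zookeeper zk01.example.com:2181 --describe --entity-type topics --entity-name my-topic
Depending on your Kafka packaging, the script may be named kafka-configs.sh.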
... View more
04-25-2018
09:07 AM
Did anyone find a solution for this issue? I have a similar issue: we are using CM 5.11 and installed Spark2 separately on another server. When we enable KMS on the cluster, Spark2 throws an Unknown Host exception.
... View more
12-07-2017
12:24 AM
Hi @Borg, Thank you for your answer. I have tried your suggestion, but unfortunately nothing changed compared to the things I tried previously. It simply ignores it and still creates a single RDD that exceeds the 2G limit. Is there any way to force it? Thanks
... View more
08-24-2017
12:38 PM
Hi desind, I see that you have the cluster-wide map and reduce memory set to 4G and 4G respectively. However, the parameter that you will need to change is PIG_HEAPSIZE=X; I would suggest increasing this and running the job again (see the sketch at the end of this post). For reference on changing properties [1]:

Pig Properties
Pig supports a number of Java properties that you can use to customize Pig behavior. You can retrieve a list of the properties using the help properties command. All of these properties are optional; none are required.

To specify Pig properties use one of these mechanisms:
- The pig.properties file (add the directory that contains the pig.properties file to the classpath)
- The -D command line option and a Pig property (pig -Dpig.tmpfilecompression=true)
- The -P command line option and a properties file (pig -P mypig.properties)
- The set command (set pig.exec.nocombiner true)

Note: The properties file uses standard Java property file format. The following precedence order is supported: pig.properties < -D Pig property < -P properties file < set command. This means that if the same property is provided using the -D command line option as well as the -P command line option and a properties file, the value of the property in the properties file will take precedence.

To specify Hadoop properties you can use the same mechanisms:
- The hadoop-site.xml file (add the directory that contains the hadoop-site.xml file to the classpath)
- The -D command line option and a Hadoop property (pig -Dmapreduce.task.profile=true)
- The -P command line option and a property file (pig -P property_file)
- The set command (set mapred.map.tasks.speculative.execution false)

The same precedence holds: hadoop-site.xml < -D Hadoop property < -P properties_file < set command. Hadoop properties are not interpreted by Pig but are passed directly to Hadoop. Any Hadoop property can be passed this way.

All properties that Pig collects, including Hadoop properties, are available to any UDF via the UDFContext object. To get access to the properties, you can call the getJobConf method.

[1] https://pig.apache.org/docs/r0.9.1/cmds.html#help

Thanks, Jordan
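As mentioned above, a rough sketch of bumping the Pig client heap before re-running the job (the 4096 MB value and script name are illustrative only, not a recommendation):
# PIG_HEAPSIZE is picked up by the pig launcher script and sets the client JVM heap size in MB
export PIG_HEAPSIZE=4096
# Pig/Hadoop properties can still be passed on the command line as described above
pig -Dpig.tmpfilecompression=true myscript.pig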
... View more
08-18-2017
12:31 PM
Hi desind, If you are using Cloudera Manager, these metrics can be viewed by clicking on the Flume service from the Cloudera Manager home page and then clicking on "Metric Details". This will be populated once you start using Flume to push data. Also, if you are looking for a narrower view of a particular agent, you can click on the agent link and Cloudera Manager will show you charts that detail the following: host network throughput, disk throughput, disk IOPS, and more. You can also build your own custom dashboard with charts that you specify by using the following: https://www.cloudera.com/documentation/enterprise/5-11-x/topics/cm_dg_dashboards.html - Jordan
... View more
07-25-2017
01:33 PM
Hi TomK, There are many options available to accomplish what you are looking for. Please see and review the link that I have provided below (a rough sketch of the LDAP settings involved follows at the end of this post): http://gethue.com/making-hadoop-accessible-to-your-employees-with-ldap/ Thanks, Jordan
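As a very rough illustration of the kind of settings the article walks through (the property names are standard hue.ini LDAP options, but every value below is a placeholder you would need to adapt; this is not a drop-in configuration):
[desktop]
[[ldap]]
ldap_url=ldap://ldap.example.com
search_bind_authentication=true
base_dn="dc=example,dc=com"
bind_dn="cn=hue-bind,dc=example,dc=com"
bind_password=changeme
[[[users]]]
user_filter="objectclass=person"
user_name_attr=sAMAccountName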
... View more
06-14-2017
12:24 AM
Thanks Borg, it happened after deleting a huge amount of data from HDFS. To resolve it for the meantime I deleted all snapshots and created a new one. It's working OK. Waiting for 5.11.1 🙂 Many thanks, Alon
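For reference, the cleanup described above maps to commands roughly like these (a sketch; /data and the snapshot names are placeholders):
# list directories that have snapshots enabled
hdfs lsSnapshottableDir
# drop the old snapshot and take a fresh one
hdfs dfs -deleteSnapshot /data snap-old
hdfs dfs -createSnapshot /data snap-new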
... View more
06-13-2017
12:36 PM
@mbigelow Try searching for "How to install CM over an existing CDH Cluster".
... View more
06-13-2017
11:52 AM
You will have to pay for any re-take of the exam. Is that the answer you were looking for? I am not completely sure what you are asking, can you clarify? From the Cloudera Certification page: "Candidates who fail an exam must wait a period of thirty calendar days, beginning the day after the failed attempt, before they may retake the same exam. You may take the exam as many times as you want until you pass, however, you must pay for each attempt; Cloudera offers no discounts for retake exams. Retakes are not allowed after the successful completion of a test."
... View more
06-09-2017
12:04 PM
Unfortunately, we do not release exact dates; however, you can review the current releases that are available here: https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_vd.html
... View more
06-08-2017
01:14 PM
Thanks for posting to our community. The OS will be CentOS and the database will be MySQL. As for any other questions, I would suggest reviewing the following from our site: https://www.cloudera.com/more/training/certification/faq.html#results https://www.cloudera.com/more/training/certification/cca-admin.html
... View more
06-06-2017
02:55 PM
1 Kudo
Thanks for the clarification and the provided information. I would suggest using the load balancer as stated in the previous post. As for the tooltip, thanks for pointing that out; I will relay this to our internal teams to review. Cheers
... View more
05-22-2017
11:20 AM
2 Kudos
Hi munna143, You may have to recreate the hash file for the parcel by completing the steps below:
1. Download the manually deleted parcel and add it to the parcel repo location /opt/cloudera/parcel-repo.
2. Create the hash file for the parcel:
$ sha1sum /opt/cloudera/parcel-repo/CDH-parcel-file.parcel | cut -d ' ' -f 1 > /opt/cloudera/parcel-repo/CDH-parcel-file.parcel.sha
3. Set the ownership of the files to cloudera-scm:
$ chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo/CDH-parcel-file.parcel /opt/cloudera/parcel-repo/CDH-parcel-file.parcel.sha
4. Delete the parcel through Cloudera Manager from the Parcels page.
Thanks, Jordan
... View more
05-15-2017
12:16 PM
Hello Igor, Thanks for your post; however, running a standalone Spark cluster is deprecated as of CDH 5.5.0: https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html#concept_y4x_ll4_rs__section_swl_1j3_ws There is some documentation that shows how this would work on earlier versions of CDH: https://www.cloudera.com/documentation/enterprise/5-9-x/topics/cdh_ig_spark_configure.html Thanks, Jordan
... View more
04-14-2017
09:52 AM
Jason, Please provide the contents of /opt/cloudera/parcels/CDH-x.cdhx/lib/hue/build/env/lib/python2.6/site-packages/hue.pth from the cluster (replacing x with your version). In a similar case the issue was that the paths were set incorrectly, resulting in the error that you see above. Jordan
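Something along these lines will print it (substitute your actual parcel version for the x placeholders in the path, as noted above):
cat /opt/cloudera/parcels/CDH-x.cdhx/lib/hue/build/env/lib/python2.6/site-packages/hue.pth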
... View more
04-06-2017
09:36 PM
Hi Jordan, Yes, Cloudera also recommended increasing the heap size, and after I did it a couple of weeks ago I have not seen any more crashes. It is rather surprising, though, that the default configuration causes crashes. That raises the question of how optimal or even acceptable the other parameters are and how to tune them. Thank you, Igor
... View more
03-30-2017
01:47 PM
Hi Ravi, This error is typical of a scenario where the proxy/Hue has been configured and times out the connection, according to its own internal timeout settings, before the destination server is able to respond to a request. I would suggest reviewing the configuration to ensure that your timeout values are sufficient to allow the destination server to respond (or reach its own internal timeout). To determine whether increasing Hue's configured timeouts is appropriate, check the following.

For older versions of Cloudera Manager this must be set as a safety valve:
1. From Cloudera Manager, click Hue service > Configuration.
2. Enter "Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini" in the search field.
3. Add the following:
[impala]
query_timeout_s=600
[beeswax]
server_conn_timeout=600

For newer versions of Cloudera Manager, this can be configured using the configuration options for Hue in Cloudera Manager: Hue service > Configuration > HiveServer2 and Impala Thrift Connection Timeout.
... View more