Member since: 01-27-2017
Posts: 28
Kudos Received: 10
Solutions: 3
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 13577 | 06-13-2017 12:21 PM
 | 2209 | 06-13-2017 12:13 PM
 | 9577 | 04-06-2017 02:13 PM
12-06-2017
01:58 PM
1 Kudo
Hi jmoriarty, You are running into the following: https://issues.apache.org/jira/browse/SPARK-5928 To resolve this you will need to increase the number of partitions you are using; by doing that you will decrease your partition size so it does not exceed 2GB. Try submitting your job via the command line with --conf spark.sql.shuffle.partitions=2000 and view the statistics in the Executors tab on the Spark History Server to see what sizes are being shown for the executors. Thanks, Jordan
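For illustration, a minimal spark-submit invocation with the increased shuffle partition count might look like the sketch below; the driver class, jar name, and master setting are hypothetical placeholders, not values from the original thread:

# Hypothetical job submission; class name and jar are placeholders.
# spark.sql.shuffle.partitions is raised so each shuffle partition stays under the 2GB limit.
spark-submit \
  --class com.example.MyJob \
  --master yarn \
  --conf spark.sql.shuffle.partitions=2000 \
  my-job.jar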
08-24-2017
12:38 PM
Hi desind, I see that you have the cluster-wide map and reduce memory set to 4G each. However, the parameter you will need to change is PIG_HEAPSIZE; I would suggest increasing it and running the job again (see the example below). For reference on changing properties [1]:

Pig Properties
Pig supports a number of Java properties that you can use to customize Pig behavior. You can retrieve a list of the properties using the help properties command. All of these properties are optional; none are required.

To specify Pig properties use one of these mechanisms:
- The pig.properties file (add the directory that contains the pig.properties file to the classpath)
- The -D command line option and a Pig property (pig -Dpig.tmpfilecompression=true)
- The -P command line option and a properties file (pig -P mypig.properties)
- The set command (set pig.exec.nocombiner true)

Note: The properties file uses standard Java property file format. The following precedence order is supported: pig.properties > -D Pig property > -P properties file > set command. This means that if the same property is provided using the -D command line option as well as the -P command line option and a properties file, the value of the property in the properties file will take precedence.

To specify Hadoop properties you can use the same mechanisms:
- The hadoop-site.xml file (add the directory that contains the hadoop-site.xml file to the classpath)
- The -D command line option and a Hadoop property (pig -Dmapreduce.task.profile=true)
- The -P command line option and a property file (pig -P property_file)
- The set command (set mapred.map.tasks.speculative.execution false)

The same precedence holds: hadoop-site.xml > -D Hadoop property > -P properties_file > set command. Hadoop properties are not interpreted by Pig but are passed directly to Hadoop. Any Hadoop property can be passed this way. All properties that Pig collects, including Hadoop properties, are available to any UDF via the UDFContext object. To get access to the properties, you can call the getJobConf method.

[1] https://pig.apache.org/docs/r0.9.1/cmds.html#help

Thanks, Jordan
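A minimal sketch of that change, assuming the job is launched from a shell on an edge node; the heap value and script name are illustrative, not values from the original thread:

# Give the local Pig JVM a larger heap (value in MB; 4096 is an illustrative figure).
export PIG_HEAPSIZE=4096

# Re-run the script; a Pig/Hadoop property can also be passed on the command line with -D.
pig -Dpig.tmpfilecompression=true myscript.pig   # myscript.pig is a placeholder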
08-18-2017
12:31 PM
Hi desind, If you are using Cloudera Manager, these metrics can be viewed by clicking on the Flume service from the Cloudera homepage and then clicking on "Metric Details". This will be populated once you start using Flume to push data. Also, if you are looking for a more focused view of a particular agent, you can click on the agent link and Cloudera Manager will show you charts that detail the following: host network throughput, disk throughput, disk IOPS, and more. You can also build your own custom dashboard with charts that you specify by using the following: https://www.cloudera.com/documentation/enterprise/5-11-x/topics/cm_dg_dashboards.html - Jordan
07-31-2017
09:44 AM
Hi fil, I have a few questions regarding your post: 1. Are you asking how to change the size of an active topic, or how to set the default for newly created topics? 2. In terms of the "size" of the topic, are you referring to partitions or messages? 3. Lastly, are you referring to topics created via the command line or topics newly created from a client? Thanks, Jordan
07-25-2017
01:33 PM
Hi TomK, There are many options available to accomplish what you are looking for. Please review the link that I have provided below: http://gethue.com/making-hadoop-accessible-to-your-employees-with-ldap/ Thanks, Jordan
06-13-2017
12:36 PM
@mbigelow Try searching for "How to install CM over an existing CDH Cluster"
06-13-2017
12:21 PM
3 Kudos
Hey rkkrishnaa, Has the edge node been assigned the "HDFS Gateway" role? You can confirm this by clicking on "Hosts" and expanding the roles. Cheers
06-13-2017
12:13 PM
Hey Alon, You are correct in assuming that the issue is related to the snapshots, specifically in /data. This is a bug that will be resolved in the coming 5.11.1 maintenance release. It was originally identified in HDFS-10797, and there was an attempt to fix it in HDFS-11515; however, HDFS-11661 later revealed that the original fix introduced memory pressure that can get quite high when the filesystem has tens or hundreds of millions of files while du is running. To resolve it you can try to find the offending .snapshot files and delete them, although this may prove to be difficult. I would suggest that you install the maintenance release that should be available in the coming weeks (*not guaranteed, as we do not give specific timelines for maintenance release dates*). You could also use an API call like (http://HOST.DOMAIN.COM:7180/api/v16/clusters/Cluster%201/services/HDFS-1/reports/hdfsUsageReport) to pull the statistics, or view them in the HDFS usage reports.
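As an illustration, that usage report can be pulled with a plain HTTP call against the Cloudera Manager API; the admin credentials below are placeholders, and the host, cluster, and service names must match your deployment:

# Hypothetical credentials; adjust host, cluster, and service names to your environment.
curl -u admin:admin \
  "http://HOST.DOMAIN.COM:7180/api/v16/clusters/Cluster%201/services/HDFS-1/reports/hdfsUsageReport"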
06-13-2017
11:58 AM
While it is "possible", the point of Cloudera Manager is to be the central point for your configurations and to manage the cluster. The best option would be to set up another cluster and import your data/config into the Cloudera Manager "managed" cluster. This has been answered in a previous thread; for review, here is the link: http://community.cloudera.com/t5/Cloudera-Manager-Installation/How-to-install-CM-over-an-existing-CDH-Cluster/td-p/18656
06-13-2017
11:52 AM
You will have to pay for any re-take of the exam. Is that the answer you were looking for? I am not completely sure what you are asking, can you clarify? From the Cloudera Certification page: "Candidates who fail an exam must wait a period of thirty calendar days, beginning the day after the failed attempt, before they may retake the same exam. You may take the exam as many times as you want until you pass, however, you must pay for each attempt; Cloudera offers no discounts for retake exams. Retakes are not allowed after the successful completion of a test."
06-09-2017
12:04 PM
Unfortunately, we do not release exact dates; however, you can review the current releases that are available here: https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_vd.html
06-08-2017
01:14 PM
Thanks for posting to our community. The OS will be CentOS and the database will be MySQL. As for any other questions, I would suggest reviewing the following from our site: https://www.cloudera.com/more/training/certification/faq.html#results https://www.cloudera.com/more/training/certification/cca-admin.html
06-08-2017
01:09 PM
Thank you for posting on our community. I understand that while navigating to CM > Administration > Alerts, you are getting the following error:

Server Error A server error has occurred. Send the following information to Cloudera. Path: http://clouderamanager.aws/cmf/alerts/config Version: Cloudera Express 5.10.0 (#85 built by jenkins on 20170120-1038 git: aa0b5cd5eceaefe2f971c13ab657020d96bb842a) java.lang.NullPointerException: at AlertData.java line 127 in com.cloudera.server.web.cmf.AlertData isParamSpecEnabled()

The above is a known issue and is fixed in CM 5.12, which has yet to be released. We are also planning to fix this in the upcoming bug-fix release CM 5.10.2. As a workaround you can visit the individual services and set up alerts there. For example, if you want to enable alerts for HDFS, go to CM > HDFS > Monitoring (under the CATEGORY filter) and configure/modify the appropriate alert configurations.
06-06-2017
02:55 PM
1 Kudo
Thanks for the clarification and the information provided. I would suggest using the load balancer as stated in the previous post. As for the tooltip, thanks for pointing that out; I will relay this to our internal teams for review. Cheers
05-31-2017
09:08 AM
Hello ahaeni, I have a couple of questions regarding your post: 1. What documentation are you following? 2. Are you utilizing Active Directory for authentication? If so, have you considered using LDAP external authentication and pointing Cloudera Manager to an Active Directory Global Catalog? 3. Could you provide the contents of the cloudera-scm-server log that shows the error? 4. In this case, would it be more beneficial to point to the load balancer and then allow the load balancer to decide which server to use for authentication? Thanks
05-22-2017
11:20 AM
2 Kudos
Hi munna143, You may have to recreate the hash file for the parcel by completing the steps below (a concrete example follows this post):
1. Download the manually deleted parcel and add it to the parcel repo location /opt/cloudera/parcel-repo.
2. Create the hash file from the parcel: $ sha1sum /opt/cloudera/parcel-repo/CDH-parcel-file.parcel | cut -d ' ' -f 1 > /opt/cloudera/parcel-repo/CDH-parcel-file.parcel.sha
3. Set the ownership of the files to cloudera-scm: $ chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo/CDH-parcel-file.parcel /opt/cloudera/parcel-repo/CDH-parcel-file.parcel.sha
4. Delete the parcel through Cloudera Manager from the Parcels page.
Thanks, Jordan
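Putting those steps together, a run might look like the sketch below; the parcel filename is a hypothetical example and should be replaced with the exact parcel present in your repo:

# Hypothetical parcel filename; substitute the parcel actually in /opt/cloudera/parcel-repo.
PARCEL=/opt/cloudera/parcel-repo/CDH-5.10.0-1.cdh5.10.0.p0.41-el7.parcel

# Regenerate the .sha hash file from the parcel contents.
sha1sum "$PARCEL" | cut -d ' ' -f 1 > "$PARCEL.sha"

# Make sure Cloudera Manager can read both files.
chown cloudera-scm:cloudera-scm "$PARCEL" "$PARCEL.sha"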
05-15-2017
12:16 PM
Hello Igor, Thanks for your post; however, running a standalone Spark cluster is deprecated as of CDH 5.5.0: https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html#concept_y4x_ll4_rs__section_swl_1j3_ws There is some documentation that shows how this would work on earlier versions of CDH: https://www.cloudera.com/documentation/enterprise/5-9-x/topics/cdh_ig_spark_configure.html Thanks, Jordan
05-15-2017
09:39 AM
Hello, Can you provide the versions of CDH/CM you are currently running? Have any other services/components been impacted after enabling KMS? Thanks, Jordan
04-14-2017
09:52 AM
Jason, Please provide the contents of /opt/cloudera/parcels/CDH-x.cdhx/lib/hue/build/env/lib/python2.6/site-packages/hue.pth from the cluster (replacing x with your version). In a similar case, the issue was that the paths were set incorrectly, resulting in the error that you see above. Jordan
04-10-2017
10:49 AM
Hey Chris,
Officially CDH does not support GPU offloading; however, there are some JIRAs that have been created to explore/brainstorm these possibilities. I have included them below:
https://issues.apache.org/jira/browse/SPARK-3785
https://issues.apache.org/jira/browse/SPARK-12620
I would also keep an eye on our Engineering Blog to see if there are any new updates on use cases regarding this.
http://blog.cloudera.com/
Thanks,
Jordan
04-06-2017
02:13 PM
Igor, The log snippet you provided may be consistent with an error related to the heap size of the Thrift server.
1. Go to CM > HBase > Configuration. In the left-hand pane, under "Scope", select "HBase Thrift Server". In the search box, search for "hbase.regionserver.thrift"; you should then see two properties with check boxes:
Enable HBase Thrift Server Compact Protocol (hbase.regionserver.thrift.compact)
Enable HBase Thrift Server Framed Transport (hbase.regionserver.thrift.framed)
Check both boxes and save the changes.
2. Increase the heap size of the Thrift Server to 2GB. Save the changes.
3. Go to CM > HBase > Instances. The Thrift Server role should now show "roles started with outdated configuration"; restart only the HBase Thrift roles.
From there I would monitor for a week and see whether it is still crashing.
*For configuring HBase with TLS/SSL: https://www.cloudera.com/documentation/enterprise/5-9-x/topics/cm_sg_ssl_hbase.html
Thanks, Jordan
03-30-2017
01:47 PM
Hi Ravi, This error is typical of a scenario where the proxy/Hue has been configured with its own internal timeout settings and has timed out the connection before the destination server has been able to respond to a request. I would suggest reviewing the configuration to ensure that your timeout values are sufficient to allow the destination server to respond (or reach its own internal timeout). To determine whether increasing Hue's configured timeouts is appropriate, check the following:
For older versions of Cloudera Manager this must be set as a safety valve:
1. From Cloudera Manager, click Hue service > Configuration.
2. Enter "Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini" in the search field.
3. Add the following:
[impala]
query_timeout_s=600
[beeswax]
server_conn_timeout=600
For newer versions of Cloudera Manager, this can be configured using the configuration options for Hue in Cloudera Manager: Hue service > Configuration > HiveServer2 and Impala Thrift Connection Timeout.