Member since: 01-27-2017
Posts: 28
Kudos Received: 10
Solutions: 3
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 13577 | 06-13-2017 12:21 PM
 | 2209 | 06-13-2017 12:13 PM
 | 9577 | 04-06-2017 02:13 PM
12-06-2017
01:58 PM
1 Kudo
Hi jmoriarty, You are running into the following: https://issues.apache.org/jira/browse/SPARK-5928 To resolve this you will need to increase the number of partitions you are using; by doing that you will decrease your partition size so it does not exceed 2GB. Try submitting your job via the command line with --conf spark.sql.shuffle.partitions=2000 and view the statistics in the Executors tab on the Spark History Server to see what sizes are being shown for the executors. Thanks, Jordan
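For illustration, a minimal spark-submit invocation with the increased shuffle partition count might look like the sketch below; the driver class, jar name, and master setting are hypothetical placeholders, not values from the original thread:

# Hypothetical job submission; class name and jar are placeholders.
# spark.sql.shuffle.partitions is raised so each shuffle partition stays under the 2GB limit.
spark-submit \
  --class com.example.MyJob \
  --master yarn \
  --conf spark.sql.shuffle.partitions=2000 \
  my-job.jar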
08-24-2017
12:38 PM
Hi desind, I see that you have the cluster-wide map and reduce memory set to 4G each. However, the parameter you will need to change is PIG_HEAPSIZE; I would suggest increasing it and running the job again (see the example below). For reference on changing properties [1]:

Pig Properties
Pig supports a number of Java properties that you can use to customize Pig behavior. You can retrieve a list of the properties using the help properties command. All of these properties are optional; none are required.

To specify Pig properties use one of these mechanisms:
- The pig.properties file (add the directory that contains the pig.properties file to the classpath)
- The -D command line option and a Pig property (pig -Dpig.tmpfilecompression=true)
- The -P command line option and a properties file (pig -P mypig.properties)
- The set command (set pig.exec.nocombiner true)

Note: The properties file uses standard Java property file format. The following precedence order is supported: pig.properties > -D Pig property > -P properties file > set command. This means that if the same property is provided using the -D command line option as well as the -P command line option and a properties file, the value of the property in the properties file will take precedence.

To specify Hadoop properties you can use the same mechanisms:
- The hadoop-site.xml file (add the directory that contains the hadoop-site.xml file to the classpath)
- The -D command line option and a Hadoop property (pig -Dmapreduce.task.profile=true)
- The -P command line option and a property file (pig -P property_file)
- The set command (set mapred.map.tasks.speculative.execution false)

The same precedence holds: hadoop-site.xml > -D Hadoop property > -P properties_file > set command. Hadoop properties are not interpreted by Pig but are passed directly to Hadoop. Any Hadoop property can be passed this way. All properties that Pig collects, including Hadoop properties, are available to any UDF via the UDFContext object. To get access to the properties, you can call the getJobConf method.

[1] https://pig.apache.org/docs/r0.9.1/cmds.html#help

Thanks, Jordan
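A minimal sketch of that change, assuming the job is launched from a shell on an edge node; the heap value and script name are illustrative, not values from the original thread:

# Give the local Pig JVM a larger heap (value in MB; 4096 is an illustrative figure).
export PIG_HEAPSIZE=4096

# Re-run the script; a Pig/Hadoop property can also be passed on the command line with -D.
pig -Dpig.tmpfilecompression=true myscript.pig   # myscript.pig is a placeholder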
08-18-2017
12:31 PM
Hi desind, If you are using Cloudera Manager, these metrics can be viewed by clicking on the Flume service from the Cloudera homepage and then clicking on "Metric Details". This will be populated once you start using Flume to push data. Also, if you are looking for a more focused view of a particular agent, you can click on the agent link and Cloudera Manager will show you charts that detail the following: host network throughput, disk throughput, disk IOPS, and more. You can also build your own custom dashboard with charts that you specify by using the following: https://www.cloudera.com/documentation/enterprise/5-11-x/topics/cm_dg_dashboards.html - Jordan
07-31-2017
09:44 AM
Hi fil, I have a few questions regarding your post: 1. Are you asking how to change the size of an active topic, or how to set the default for newly created topics? 2. In terms of the "size" of the topic, are you referring to partitions or messages? 3. Lastly, are you referring to topics created via the command line or topics newly created from a client? Thanks, Jordan
07-25-2017
01:33 PM
Hi TomK, There are many options available to accomplish what you are looking for. Please review the link that I have provided below: http://gethue.com/making-hadoop-accessible-to-your-employees-with-ldap/ Thanks, Jordan
06-13-2017
12:36 PM
@mbigelow Try searching for "How to install CM over an existing CDH Cluster"
06-13-2017
12:21 PM
3 Kudos
Hey rkkrishnaa, Has the edge node been assigned the "HDFS Gateway" role? You can confirm this by clicking on "Hosts" and expanding the roles. Cheers
06-13-2017
12:13 PM
Hey Alon, You are correct in assuming that the issue is related to the snapshots, specifically in /data. This is a bug that will be resolved in the coming 5.11.1 maintenance release. It was originally identified in HDFS-10797, and there was an attempt to fix it in HDFS-11515; however, HDFS-11661 later revealed that the original fix introduced memory pressure that can get quite high when the filesystem has tens or hundreds of millions of files while du is running. To resolve it you can try to find the offending .snapshot files and delete them, although this may prove to be difficult. I would suggest that you install the maintenance release that should be available in the coming weeks (*not guaranteed, as we do not give specific timelines for maintenance release dates*). You could also use an API call like (http://HOST.DOMAIN.COM:7180/api/v16/clusters/Cluster%201/services/HDFS-1/reports/hdfsUsageReport) to pull the statistics, or view them in the HDFS usage reports.
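As an illustration, that usage report can be pulled with a plain HTTP call against the Cloudera Manager API; the admin credentials below are placeholders, and the host, cluster, and service names must match your deployment:

# Hypothetical credentials; adjust host, cluster, and service names to your environment.
curl -u admin:admin \
  "http://HOST.DOMAIN.COM:7180/api/v16/clusters/Cluster%201/services/HDFS-1/reports/hdfsUsageReport"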
06-13-2017
11:58 AM
While it is "possible", the point of Cloudera Manager is to be the central point for your configurations and to manage the cluster. The best option would be to set up another cluster and import your data/config into the Cloudera Manager "managed" cluster. This has been answered in a previous thread; for review, here is the link: http://community.cloudera.com/t5/Cloudera-Manager-Installation/How-to-install-CM-over-an-existing-CDH-Cluster/td-p/18656
06-13-2017
11:52 AM
You will have to pay for any re-take of the exam. Is that the answer you were looking for? I am not completely sure what you are asking, can you clarify? From the Cloudera Certification page: "Candidates who fail an exam must wait a period of thirty calendar days, beginning the day after the failed attempt, before they may retake the same exam. You may take the exam as many times as you want until you pass, however, you must pay for each attempt; Cloudera offers no discounts for retake exams. Retakes are not allowed after the successful completion of a test."
06-09-2017
12:04 PM
Unfortunately, we do not release exact dates; however, you can review the current releases that are available here: https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_vd.html
06-08-2017
01:14 PM
Thanks for posting to our community. The OS will be CentOS and the database will be MySQL. As for any other questions, I would suggest reviewing the following from our site: https://www.cloudera.com/more/training/certification/faq.html#results https://www.cloudera.com/more/training/certification/cca-admin.html
06-08-2017
01:09 PM
Thank you for posting on our community. I understand that while navigating to CM > Administration > Alerts, you are getting the following error:

Server Error A server error has occurred. Send the following information to Cloudera. Path: http://clouderamanager.aws/cmf/alerts/config Version: Cloudera Express 5.10.0 (#85 built by jenkins on 20170120-1038 git: aa0b5cd5eceaefe2f971c13ab657020d96bb842a) java.lang.NullPointerException: at AlertData.java line 127 in com.cloudera.server.web.cmf.AlertData isParamSpecEnabled()

The above is a known issue and is fixed in CM 5.12, which has yet to be released. We are also planning to fix this in the upcoming bug-fix release CM 5.10.2. As a workaround you can visit the individual services and set up alerts there. For example, if you want to enable alerts for HDFS, go to CM > HDFS > Monitoring (under the CATEGORY filter) and configure/modify the appropriate alert configurations.
06-06-2017
02:55 PM
1 Kudo
Thanks for the clarification and the information provided. I would suggest using the load balancer as stated in the previous post. As for the tooltip, thanks for pointing that out; I will relay this to our internal teams for review. Cheers
05-31-2017
09:08 AM
Hello ahaeni, I have a couple of questions regarding your post: 1. What documentation are you following? 2. Are you utilizing Active Directory for authentication? If so, have you considered using LDAP external authentication and pointing Cloudera Manager to an Active Directory Global Catalog? 3. Could you provide the contents of the cloudera-scm-server log that shows the error? 4. In this case, would it be more beneficial to point to the load balancer and then allow the load balancer to decide which server to use for authentication? Thanks
05-22-2017
11:20 AM
2 Kudos
Hi munna143, You may have to recreate the hash file for the parcel by completing the steps below (a concrete example follows this post):
1. Download the manually deleted parcel and add it to the parcel repo location /opt/cloudera/parcel-repo.
2. Create the hash file from the parcel: $ sha1sum /opt/cloudera/parcel-repo/CDH-parcel-file.parcel | cut -d ' ' -f 1 > /opt/cloudera/parcel-repo/CDH-parcel-file.parcel.sha
3. Set the ownership of the files to cloudera-scm: $ chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo/CDH-parcel-file.parcel /opt/cloudera/parcel-repo/CDH-parcel-file.parcel.sha
4. Delete the parcel through Cloudera Manager from the Parcels page.
Thanks, Jordan
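Putting those steps together, a run might look like the sketch below; the parcel filename is a hypothetical example and should be replaced with the exact parcel present in your repo:

# Hypothetical parcel filename; substitute the parcel actually in /opt/cloudera/parcel-repo.
PARCEL=/opt/cloudera/parcel-repo/CDH-5.10.0-1.cdh5.10.0.p0.41-el7.parcel

# Regenerate the .sha hash file from the parcel contents.
sha1sum "$PARCEL" | cut -d ' ' -f 1 > "$PARCEL.sha"

# Make sure Cloudera Manager can read both files.
chown cloudera-scm:cloudera-scm "$PARCEL" "$PARCEL.sha"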
05-15-2017
12:16 PM
Hello Igor, Thanks for your post; however, running a standalone Spark cluster is deprecated as of CDH 5.5.0: https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html#concept_y4x_ll4_rs__section_swl_1j3_ws There is some documentation that shows how this would work on earlier versions of CDH: https://www.cloudera.com/documentation/enterprise/5-9-x/topics/cdh_ig_spark_configure.html Thanks, Jordan
05-15-2017
09:39 AM
Hello, Can you provide the versions of CDH/CM you are currently running? Have any other services/components been impacted after enabling KMS? Thanks, Jordan
04-14-2017
09:52 AM
Jason, Please provide the contents of /opt/cloudera/parcels/CDH-x.cdhx/lib/hue/build/env/lib/python2.6/site-packages/hue.pth from the cluster (replacing x with your version). In a similar case, the issue was that the paths were set incorrectly, resulting in the error that you see above. Jordan
04-10-2017
10:49 AM
Hey Chris,
Officially CDH does not support GPU offloading; however, there are some JIRAs that have been created to explore/brainstorm these possibilities. I have included them below:
https://issues.apache.org/jira/browse/SPARK-3785
https://issues.apache.org/jira/browse/SPARK-12620
I would also keep an eye on our Engineering Blog to see if there are any new updates on use cases regarding this.
http://blog.cloudera.com/
Thanks,
Jordan
04-06-2017
02:13 PM
Igor, The log snippet you provided may be consistent with an error related to the heap size of the Thrift server.
1. Go to CM > HBase > Configuration. In the left-hand pane, under "Scope", select "HBase Thrift Server". In the search box, search for "hbase.regionserver.thrift"; you should then see two properties with check boxes:
Enable HBase Thrift Server Compact Protocol (hbase.regionserver.thrift.compact)
Enable HBase Thrift Server Framed Transport (hbase.regionserver.thrift.framed)
Check both boxes and save the changes.
2. Increase the heap size of the Thrift Server to 2GB. Save the changes.
3. Go to CM > HBase > Instances. The Thrift Server role should now show "roles started with outdated configuration"; restart only the HBase Thrift roles.
From there I would monitor for a week and see whether it is still crashing.
*For configuring HBase with TLS/SSL: https://www.cloudera.com/documentation/enterprise/5-9-x/topics/cm_sg_ssl_hbase.html
Thanks, Jordan
03-30-2017
01:47 PM
Hi Ravi, This error is typical of a scenario where the proxy/Hue has been configured with its own internal timeout settings and has timed out the connection before the destination server has been able to respond to a request. I would suggest reviewing the configuration to ensure that your timeout values are sufficient to allow the destination server to respond (or reach its own internal timeout). To determine whether increasing Hue's configured timeouts is appropriate, check the following:
For older versions of Cloudera Manager this must be set as a safety valve:
1. From Cloudera Manager, click Hue service > Configuration.
2. Enter "Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini" in the search field.
3. Add the following:
[impala]
query_timeout_s=600
[beeswax]
server_conn_timeout=600
For newer versions of Cloudera Manager, this can be configured using the configuration options for Hue in Cloudera Manager: Hue service > Configuration > HiveServer2 and Impala Thrift Connection Timeout.