Member since 06-26-2013
354 Posts
68 Kudos Received
27 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4403 | 08-05-2016 10:36 AM |
| | 7126 | 06-02-2016 04:57 PM |
| | 7553 | 05-31-2016 03:47 PM |
| | 6388 | 04-11-2016 11:26 AM |
| | 12034 | 03-07-2016 02:04 PM |

03-15-2016 05:42 PM

Dear Cloudera Users,

We are pleased to announce the general availability of the Cloudera Connector Powered by Teradata 1.5. This release fixes a compatibility issue with CDH 5.5.0 and later. See the download page for more details.

For more details on new features and usage of Cloudera Connector Powered by Teradata, see:

- Release Notes Cloudera Connector Powered by Teradata version 1.5
- Cloudera Connector Powered by Teradata User Guide, version 1.5

As always, we welcome your feedback. Please send your comments and suggestions through our new community forums. You can also file bugs in the CDH project at issues.cloudera.org.
						
					
03-07-2016 02:04 PM
2 Kudos

Hi Alina,

Although Hive-on-Spark will definitely provide improved performance over MR for batch processing applications (e.g., ETL), that performance is not going to approach the interactive "BI" experience provided by Impala.

Here are some recent Impala performance test results: http://blog.cloudera.com/blog/2016/02/new-sql-benchmarks-apache-impala-incubating-2-3-uniquely-delivers-analytic-database-performance/

Although Hive-on-Spark is not included in those results, one would expect it to perform at levels similar to Hive-on-Tez, with the added advantage of supporting consolidation onto the Spark API.
						
					
02-19-2016 01:11 PM

Dear CDH Users,

We are pleased to announce the release of the Cloudera Distribution of Apache Kafka 2.0 for CDH 5.

Apache Kafka is a highly scalable, distributed, publish-subscribe messaging system. This release is based on Apache Kafka 0.9 and adds security features such as Kerberos authentication, wire encryption, and secure mirroring, along with a new consumer API, per-user throttling, and many other features and bug fixes that solidify Kafka as an enterprise production-grade component of the Hadoop ecosystem. Kafka 2.0 also ships with new management tooling in Cloudera Manager, for point-and-click configuration of each new capability.
   
New Features in Cloudera Distribution of Apache Kafka 2.0

- Kafka is rebased on Apache Kafka 0.9: http://archive.apache.org/dist/kafka/0.9.0.0/RELEASE_NOTES.html.
- Kerberos authentication of connections from clients and other brokers, including to ZooKeeper.
- Wire encryption of communications from clients and other brokers using SSL.
- A new client API for consumers (Java).
- A refactored, secure MirrorMaker to prevent data loss and improve reliability of cross-data center replication.
- Per-user quotas to throttle producer and consumer throughput in a multitenant cluster.
 
Requirements for Cloudera Distribution of Apache Kafka 2.0

- Cloudera Manager 5.5.3
- Any CDH 5.x release is supported.
 
   
Notable Issues Fixed in Cloudera Distribution of Apache Kafka 2.0

Notable fixes backported into Kafka 2.0:

- KAFKA-2799: WakeupException thrown in the followup poll() could lead to data loss
- KAFKA-2942: Inadvertent auto-commit when pre-fetching can cause message loss
- KAFKA-2878: Kafka broker throws OutOfMemory exception with invalid join group request
- KAFKA-2882: Add constructor cache for Snappy and LZ4 Output/Input stream in Compressor.java
- KAFKA-2913: GroupMetadataManager unloads all groups in removeGroupsForPartitions
- KAFKA-2880: Fetcher.getTopicMetadata NullPointerException when broker cannot be reached
- KAFKA-2950: Fix performance regression in the producer
- KAFKA-2973: Fix leak of child sensors on remove
- KAFKA-2978: Consumer stops fetching when consumed and fetch positions get out of sync
- KAFKA-2988: Change default configuration of the log cleaner
- KAFKA-3012: Avoid reserved.broker.max.id collisions on upgrade

All backported fixes can be viewed in the git release notes here.
 
We look forward to you trying Kafka 2.0! For more information, please use the links below:

- Install or upgrade Kafka
- Review the documentation
- Review the Release Notes

As always, we welcome your feedback. Please send your comments and suggestions through our community forums.
						
					
02-18-2016 04:01 PM

Cloudera recommends the Fair Scheduler. Here is some background:

http://blog.cloudera.com/blog/2016/01/untangling-apache-hadoop-yarn-part-3/
						
					
12-10-2015 10:11 AM

We are pleased to announce the release of the Cloudera Distribution of Apache Kafka 1.4.0 for CDH 5. Apache Kafka is a distributed publish-subscribe messaging system. This release is based on Apache Kafka 0.8.2, adds support for distribution as a package as well as a parcel, and includes fixes for key issues.
   
New Features:

- Cloudera Distribution of Apache Kafka 1.4 is now distributed via native packages as well as a parcel
 
 
Notable Fixes:

- KAFKA-2633: Default logging from tools to Stderr.
- KAFKA-1664: Kafka does not properly parse multiple ZK nodes with non-root chroot.
- KAFKA-2477: Fix a race condition between log append and fetch that causes OffsetOutOfRangeException.
- KAFKA-2024: Cleaner can generate unindexable log segments.
- KAFKA-2118: Cleaner cannot clean after shutdown during replaceSegments.
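
For readers unfamiliar with the connect-string shape behind KAFKA-1664: a ZooKeeper connection string lists several `host:port` pairs separated by commas, with an optional chroot path appended once at the end. Below is a minimal Python sketch of splitting such a string. It is an illustration only, not Kafka's own parsing code; `split_zk_connect` and the host names are hypothetical.

```python
def split_zk_connect(connect):
    """Split 'h1:2181,h2:2181/chroot' into (list of host:port, chroot path)."""
    # The chroot, if present, starts at the first '/' and applies to every host.
    hosts_part, slash, chroot = connect.partition("/")
    hosts = [h.strip() for h in hosts_part.split(",") if h.strip()]
    # With no chroot suffix, the client is rooted at '/'.
    return hosts, (slash + chroot) if slash else "/"


hosts, chroot = split_zk_connect("zk1:2181,zk2:2181,zk3:2181/kafka")
print(hosts)   # ['zk1:2181', 'zk2:2181', 'zk3:2181']
print(chroot)  # /kafka
```

The point of the sketch is that the chroot belongs to the whole ensemble rather than to the last host in the list.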
 
 
We look forward to you trying it out, using the information below:

- Install or upgrade Kafka
- Review the Documentation
- Review the Release Notes

As always, we welcome your feedback. Please send your comments and suggestions through our community forums.
						
					
12-04-2015 08:35 AM

It may help if you describe your use case here, that is, your goal with this operation. There may be several ways to reach that goal.
						
					
12-03-2015 02:26 PM

Bhaskar,

The answer is "yes" (hat tip to John Russell) because HDFS is capable of locating data blocks on any data node, even with a replication factor of 1.

However, be careful: if you distribute your partitions/Parquet files across the cluster too finely, performance can suffer. Performance will be better and more predictable when your query has fewer blocks to find.
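
A rough way to see why finer-grained layouts mean more blocks to locate: every non-empty file occupies at least one HDFS block, so splitting the same data into many small Parquet files multiplies the block count. The sketch below is illustrative only. The 128 MiB block size and the file sizes are assumptions, and `block_count` is a hypothetical helper, not part of any Hadoop API.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # assumed HDFS block size (128 MiB)


def block_count(file_sizes, block_size=BLOCK_SIZE):
    """Total HDFS blocks needed for files of the given sizes (in bytes)."""
    # Each non-empty file needs ceil(size / block_size) blocks (integer arithmetic).
    return sum((size + block_size - 1) // block_size for size in file_sizes if size > 0)


one_gib = 1024 * 1024 * 1024
few_large = [one_gib]                  # one 1 GiB file
many_small = [one_gib // 1024] * 1024  # the same data as 1024 files of 1 MiB each

print(block_count(few_large))   # 8
print(block_count(many_small))  # 1024
```

Under these assumptions the same gigabyte costs 8 blocks as one file and 1024 blocks as small files, which is the kind of difference the advice above is warning about.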
						
					
12-01-2015 03:57 PM
1 Kudo

Today, we’re pleased to announce the availability of a Cloudera QuickStart Docker image!

If you or your organization is using Docker, this image may provide the ideal lightweight, disposable environment for learning and exploring new technology, playing with new ideas, and doing continuous integration before testing at scale. (However, Cloudera recommends using a more realistic test environment before moving to production.)

More details and docs here:

http://blog.cloudera.com/blog/2015/12/docker-is-the-new-quickstart-option-for-apache-hadoop-and-cloudera/
						
					
10-13-2015 01:56 PM

Have you cleared your browser cache and retried recently? Please do that and confirm.
						
					
09-23-2015 07:21 AM

Dear CDH, Cloudera Manager, Impala, and Cloudera Navigator users,

We are pleased to announce the release of Cloudera Enterprise 5.4.7 (CDH 5.4.7, Cloudera Manager 5.4.7, and Cloudera Navigator 2.3.7).

Cloudera Enterprise 5.4.7

This release fixes key bugs and includes the following:
 
 
CDH fixes for the following issues:

- The Spooling directory source dies when encountering zero-byte files.
- The VolumeScanner thread exits with an exception if there is no block pool to be scanned but there are suspicious blocks.
- ArrayIndexOutOfBoundsException in CellComparator#getMinimumMidpointArray.
- Lateral view on top of a view throws a RuntimeException.
- java.lang.IndexOutOfBoundsException when union all with if function.

For a full list of upstream JIRAs fixed in CDH 5.4.7, see the issues fixed section of the Release Notes.
   
 
 
Cloudera Manager fixes for the following issues:

- The ZooKeeper jute.maxbuffer property is emitted into zoo.cfg instead of in the JVM arguments.
- Using the "create a user" API call, a user who normally could not create users is able to create a read-only user account.
- Attempting to set the listening_hostname property in the Agent's config.ini file (which is not normally necessary) changes the Agent's host ID to use this hostname, instead of the normal value.
- On Kerberized clusters, Cloudera Manager is monitoring the wrong process as the DataNode.

For a full list of issues fixed in Cloudera Manager 5.4.7, see the issues fixed section of the Release Notes.
 
 
There are no updates in Cloudera Navigator 2.3.7.

We look forward to you trying it out, using the information below:

- Download Cloudera Enterprise
- View the documentation

As always, we are happy to hear your feedback. Please send your comments and suggestions to the user group or through our community forums. You can also file bugs through our external JIRA projects on issues.cloudera.org.
						
					