03-15-2016
05:42 PM
Dear Cloudera Users,
We are pleased to announce the general availability of the Cloudera Connector Powered by Teradata 1.5. This release fixes a compatibility issue with CDH 5.5.0 and later. See the download page for more details.
For more details on new features and usage of Cloudera Connector Powered by Teradata, see:
Release Notes Cloudera Connector Powered by Teradata version 1.5
Cloudera Connector Powered by Teradata User Guide, version 1.5
As always, we welcome your feedback. Please send your comments and suggestions through our new community forums. You can also file bugs in the CDH project at issues.cloudera.org.
03-07-2016
02:04 PM
Hi Alina,
Although Hive-on-Spark will definitely provide improved performance over MR for batch processing applications (e.g., ETL), that performance is not going to approach the interactive "BI" experience provided by Impala.
Here are some recent Impala performance testing results: http://blog.cloudera.com/blog/2016/02/new-sql-benchmarks-apache-impala-incubating-2-3-uniquely-delivers-analytic-database-performance/
Hive-on-Spark is not included in those results, but one would expect it to perform at levels similar to Hive-on-Tez, with the added advantage of supporting consolidation onto the Spark API.
02-19-2016
01:11 PM
Dear CDH Users,
We are pleased to announce the release of the Cloudera Distribution of Apache Kafka 2.0 for CDH 5.
Apache Kafka is a highly scalable, distributed, publish-subscribe messaging system. This release is based on Apache Kafka 0.9 and adds security features such as Kerberos authentication, wire encryption, and secure mirroring, along with a new consumer API, per-user throttling, and many other features and bug fixes that solidify Kafka as an enterprise production-grade component of the Hadoop ecosystem. Kafka 2.0 also ships with new management tooling in Cloudera Manager, for point-and-click configuration of each new capability.
New Features in Cloudera Distribution of Apache Kafka 2.0
Kafka is rebased on Apache Kafka 0.9: http://archive.apache.org/dist/kafka/0.9.0.0/RELEASE_NOTES.html.
Kerberos authentication of connections from clients and other brokers, including to ZooKeeper.
Wire encryption of communications from clients and other brokers using SSL.
A new client API for consumers (Java).
A refactored, secure MirrorMaker to prevent data loss and improve reliability of cross-data center replication.
Per-user quotas to throttle producer and consumer throughput in a multitenant cluster.
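As a rough illustration of the Kerberos and SSL items above, the sketch below assembles the client-side settings a Kafka 0.9 client would typically need and renders them in `.properties` form. The property names are standard Apache Kafka 0.9 client settings; the broker address, truststore path, password, and the helper names `secure_client_properties` and `render_properties` are placeholders invented for this example, so adjust them to your cluster.

```python
def secure_client_properties(bootstrap_servers, truststore_path,
                             truststore_password, service_name="kafka"):
    """Return Kafka client settings for Kerberos authentication over SSL.

    All argument values are placeholders; only the property names come
    from Apache Kafka 0.9.
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        # SASL_SSL = Kerberos (SASL/GSSAPI) authentication plus wire encryption.
        "security.protocol": "SASL_SSL",
        "sasl.kerberos.service.name": service_name,
        "ssl.truststore.location": truststore_path,
        "ssl.truststore.password": truststore_password,
    }


def render_properties(props):
    """Render a dict as sorted key=value lines, one setting per line."""
    return "\n".join(f"{key}={value}" for key, value in sorted(props.items()))


if __name__ == "__main__":
    props = secure_client_properties(
        "broker1.example.com:9093",        # hypothetical broker host and port
        "/path/to/client.truststore.jks",  # hypothetical truststore path
        "changeit",                        # hypothetical password
    )
    print(render_properties(props))
```

A Kerberos client also needs a JAAS login configuration, which is passed to the JVM separately and is not shown here.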
Requirements for Cloudera Distribution of Apache Kafka 2.0
Cloudera Manager 5.5.3
Any CDH 5.x release is supported.
Notable Issues Fixed in Cloudera Distribution of Apache Kafka 2.0
Notable fixes backported into Kafka 2.0:
KAFKA-2799: WakeupException thrown in the followup poll() could lead to data loss
KAFKA-2942: Inadvertent auto-commit when pre-fetching can cause message loss
KAFKA-2878: Kafka broker throws OutOfMemory exception with invalid join group request
KAFKA-2882: Add constructor cache for Snappy and LZ4 Output/Input stream in Compressor.java
KAFKA-2913: GroupMetadataManager unloads all groups in removeGroupsForPartitions
KAFKA-2880: Fetcher.getTopicMetadata NullPointerException when broker cannot be reached
KAFKA-2950: Fix performance regression in the producer
KAFKA-2973: Fix leak of child sensors on remove
KAFKA-2978: Consumer stops fetching when consumed and fetch positions get out of sync
KAFKA-2988: Change default configuration of the log cleaner
KAFKA-3012: Avoid reserved.broker.max.id collisions on upgrade
All backported fixes can be viewed in the git release notes here.
We look forward to you trying Kafka 2.0! For more information, please use the links below:
Install or upgrade Kafka
Review the documentation
Review the Release Notes
As always, we welcome your feedback. Please send your comments and suggestions through our community forums.
02-18-2016
04:01 PM
The Fair Scheduler is recommended by Cloudera. Here is some background:
http://blog.cloudera.com/blog/2016/01/untangling-apache-hadoop-yarn-part-3/
12-10-2015
10:11 AM
We are pleased to announce the release of the Cloudera Distribution of Apache Kafka 1.4.0 for CDH 5. Apache Kafka is a distributed publish-subscribe messaging system. This release is based on Apache Kafka 0.8.2, adds support for distribution as a package as well as a parcel, and includes fixes for key issues.
New Features:
Cloudera Distribution of Apache Kafka 1.4 is now distributed via native packages as well as a parcel
Notable Fixes:
KAFKA-2633: Default logging from tools to Stderr.
KAFKA-1664: Kafka does not properly parse multiple ZK nodes with non-root chroot.
KAFKA-2477: Fix a race condition between log append and fetch that causes OffsetOutOfRangeException.
KAFKA-2024: Cleaner can generate unindexable log segments.
KAFKA-2118: Cleaner can not clean after shutdown during replaceSegments.
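For readers unfamiliar with the connect-string format behind KAFKA-1664, here is a minimal sketch of how a ZooKeeper connect string with several nodes and a non-root chroot is structured. This is not Kafka's own parsing code; the function name `split_zk_connect` and the host names are made up for illustration, and returning "/" when no chroot is given is a choice made for this example.

```python
def split_zk_connect(connect):
    """Split a ZooKeeper connect string into (hosts, chroot).

    The chroot, if any, is written once after the last host:port pair
    and applies to every host in the list.
    """
    slash = connect.find("/")
    if slash == -1:
        # No chroot given: treat the ZooKeeper root as the chroot.
        host_part, chroot = connect, "/"
    else:
        host_part, chroot = connect[:slash], connect[slash:]
    hosts = [h.strip() for h in host_part.split(",") if h.strip()]
    return hosts, chroot


if __name__ == "__main__":
    # Hypothetical three-node ensemble with Kafka data under /kafka.
    print(split_zk_connect("zk1:2181,zk2:2181,zk3:2181/kafka"))
```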
We look forward to you trying it out, using the information below:
Install or upgrade Kafka
Review the Documentation
Review the Release Notes
As always, we welcome your feedback. Please send your comments and suggestions through our community forums.
12-04-2015
08:35 AM
It may help if you describe your use case and what you are trying to achieve with this operation. There may be several ways to reach that goal.
12-03-2015
02:26 PM
Bhaskar,
The answer is "yes" (hat tip to John Russell) because HDFS is capable of locating data blocks on any data node, even with a replication factor of 1.
However, you need to be careful: if you distribute your partitions/Parquet files across the cluster at too fine a granularity, performance can suffer. Performance will be better and more predictable with fewer blocks for your query to find.
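To make the point about block counts concrete, the sketch below compares how many HDFS blocks the same amount of data occupies when written as many small files versus a few large ones. It assumes a 128 MiB block size, which is a common default but may differ on your cluster; the function names and file sizes are invented for this example.

```python
MIB = 1024 * 1024
ASSUMED_BLOCK_SIZE = 128 * MIB  # assumption: check dfs.blocksize on your cluster


def block_count(file_size_bytes, block_size_bytes=ASSUMED_BLOCK_SIZE):
    """Number of HDFS blocks one file occupies (0 for an empty file)."""
    if file_size_bytes <= 0:
        return 0
    # Integer ceiling division: a partly filled last block still counts.
    return -(-file_size_bytes // block_size_bytes)


def total_blocks(file_sizes, block_size_bytes=ASSUMED_BLOCK_SIZE):
    """Total blocks across a collection of files."""
    return sum(block_count(size, block_size_bytes) for size in file_sizes)


if __name__ == "__main__":
    many_small = [1 * MIB] * 1024   # 1 GiB written as 1024 files of 1 MiB
    few_large = [256 * MIB] * 4     # 1 GiB written as 4 files of 256 MiB
    print(total_blocks(many_small), total_blocks(few_large))
```

Each file occupies at least one block, so the small-file layout forces a query to locate far more blocks for the same volume of data.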
12-01-2015
03:57 PM
Today, we’re pleased to announce the availability of a Cloudera QuickStart Docker image!
If you or your organization is using Docker, this image may provide the ideal lightweight, disposable environment for learning and exploring new technology, playing with new ideas, and for doing continuous integration before testing at scale. (However, Cloudera recommends using a more realistic test environment before moving to production.)
More details/docs here:
http://blog.cloudera.com/blog/2015/12/docker-is-the-new-quickstart-option-for-apache-hadoop-and-cloudera/
10-13-2015
01:56 PM
Have you cleared your browser cache and retried recently? Please do that and confirm.
09-23-2015
07:21 AM
Dear CDH, Cloudera Manager, Impala, and Cloudera Navigator users,
We are pleased to announce the release of Cloudera Enterprise 5.4.7 (CDH 5.4.7, Cloudera Manager 5.4.7, and Cloudera Navigator 2.3.7).
Cloudera Enterprise 5.4.7
This release fixes key bugs and includes the following.
CDH fixes for the following issues:
The Spooling directory source dies when encountering zero-byte files.
The VolumeScanner thread exits with an exception if there is no block pool to be scanned but there are suspicious blocks.
ArrayIndexOutOfBoundsException in CellComparator#getMinimumMidpointArray.
Lateral view on top of a view throws a RuntimeException.
java.lang.IndexOutOfBoundsException when using UNION ALL with the if function.
For a full list of upstream JIRAs fixed in CDH 5.4.7, see the issues fixed section of the Release Notes.
Cloudera Manager fixes for the following issues:
The ZooKeeper jute.maxbuffer property is emitted into zoo.cfg instead of in the JVM arguments.
Using the "create a user" API call, a user who normally could not create users is able to create a read-only user account.
Attempting to set the listening_hostname property in the Agent's config.ini file (which is not normally necessary) changes the Agent's host ID to use this hostname, instead of the normal value.
On Kerberized clusters, Cloudera Manager is monitoring the wrong process as the DataNode.
For a full list of issues fixed in Cloudera Manager 5.4.7, see the issues fixed section of the Release Notes.
There are no updates in Cloudera Navigator 2.3.7.
We look forward to you trying it out, using the information below:
Download Cloudera Enterprise
View the documentation
As always, we are happy to hear your feedback. Please send your comments and suggestions to the user group or through our community forums. You can also file bugs through our external JIRA projects on issues.cloudera.org.