Member since 06-26-2013
354 Posts
68 Kudos Received
27 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3891 | 08-05-2016 10:36 AM |
| | 6275 | 06-02-2016 04:57 PM |
| | 6443 | 05-31-2016 03:47 PM |
| | 5460 | 04-11-2016 11:26 AM |
| | 11120 | 03-07-2016 02:04 PM |
05-05-2016
09:02 AM
Hi,
The Cloudera "documentation" you reference here is actually an 8-year-old blog post. I would defer to the more current docs.
04-18-2016
07:34 AM
Hey jkestelyn... Thank you very much.. It worked fine 🙂
04-15-2016
08:25 AM
All,
Cloudera's docs have a new look and improved performance & usability. Read the details here:
http://blog.cloudera.com/blog/2016/04/check-out-those-new-and-improved-cloudera-docs/
We look forward to your feedback!
04-12-2016
02:46 PM
The organizers of HBaseCon, the conference for the Apache HBase community, have published the agenda for the conference (May 24, 2016, in San Francisco), and once again the impressive geographical and use-case diversity of HBase is on full display.
See full agenda/register here:
http://hbasecon.com
04-07-2016
04:30 PM
A new release of the RecordService beta (0.3.0) is now available; see details here:
http://community.cloudera.com/t5/Beta-Releases-Apache-Kudu/ANNOUNCE-RecordService-0-3-0-Released/m-p/39488#M190
04-07-2016
04:21 PM
Cloudera Enterprise 5.7 is now generally available (comprising CDH 5.7, Cloudera Manager 5.7, and Cloudera Navigator 2.6).
Cloudera is excited to announce the general availability of Cloudera Enterprise 5.7! Main highlights of this release include production-ready Hive-on-Spark functionality, which will help users accelerate their use of Apache Spark as a data processing standard; 2x performance gains for Apache Impala (incubating); easier cluster configuration and utilization reporting; and end-to-end encryption for Apache Spark data.
The release also contains a long list of incremental improvements across the stack, in addition to the usual hundreds of bug fixes (some of which were uncovered during our multi-dimensional hardening/QA process). Here is a partial list of those improvements (see the Release Notes for a full list):
Performance & Scale
Hive-on-Spark GA (graduates from Cloudera Labs)
2x performance gains for Impala: Better join ordering and cardinality estimation, faster query startup, codegen and code optimizations, more
Support for the Apache HBase WAL on SSD
Support for the HBase-Spark module (graduates from Cloudera Labs)
Dramatic performance improvement for backups/DR
Usability & Management
New per-tenant cluster utilization reporting for YARN and Impala
Support for portable, scriptable, and versionable cluster configuration
New SQL formatting in HUE query editor
Security & Governance
Improved Apache Sentry HDFS sync feature
Encryption over the wire/on disk for Spark data
Support for Kerberos and LDAP auth on the same HiveServer2 instance
New “business views” for data lineage; new managed/secure metadata within Cloudera Navigator
New or Updated Open Source Components
Apache Spark 1.6 (including support for Spark SQL and DataFrames in PySpark, the spark.ml package, and the Pipelines API)
Apache HBase 1.2
Apache Impala (incubating) 2.5
Apache Kafka 0.9 (separate install)
New or Updated Platform Support
RHEL/CentOS/OEL 7.2
SLES 11 SP4
Debian 7.8
JDK 7_80 and JDK 8_60
Over the next few weeks, we’ll publish blog posts that cover some of these features in detail. In the meantime:
Install Cloudera Enterprise 5.7
Explore documentation
As always, we value your feedback; please provide any comments and suggestions through our community forums. You can also file bugs via issues.cloudera.org.
04-05-2016
01:15 AM
Nope, it has not been resolved, I got exactly the same behaviour yesterday right after registration, i.e. at first login. May I add, btw, that the registration process for this community is among the worst I've seen. I understand that you really like to feed your CRM system with as much information as possible about future and current customers, but it also presents a roadblock to have to answer that many questions before being able to register.
03-15-2016
05:42 PM
Dear Cloudera Users,
We are pleased to announce the general availability of the Cloudera Connector Powered by Teradata 1.5. This release fixes a compatibility issue with CDH 5.5.0 and later. See the download page for more details.
For more details on new features and usage of Cloudera Connector Powered by Teradata, see:
Release Notes Cloudera Connector Powered by Teradata version 1.5
Cloudera Connector Powered by Teradata User Guide, version 1.5
As always, we welcome your feedback. Please send your comments and suggestions through our new community forums. You can also file bugs in the CDH project at issues.cloudera.org.
02-19-2016
01:11 PM
Dear CDH Users,
We are pleased to announce the release of the Cloudera Distribution of Apache Kafka 2.0 for CDH 5.
Apache Kafka is a highly scalable, distributed, publish-subscribe messaging system. This release is based on Apache Kafka 0.9, and adds security features such as Kerberos authentication, wire encryption, secure mirroring, a new consumer API, per-user throttling, and many other features and bug fixes that solidify Kafka as an enterprise production-grade component of the Hadoop ecosystem. Kafka 2.0 also ships with new management tooling in Cloudera Manager, for point-and-click configuration of each new capability.
New Features in Cloudera Distribution of Apache Kafka 2.0
Kafka is rebased on Apache Kafka 0.9: http://archive.apache.org/dist/kafka/0.9.0.0/RELEASE_NOTES.html.
Kerberos authentication of connections from clients and other brokers, including to ZooKeeper.
Wire encryption of communications from clients and other brokers using SSL.
A new client API for consumers (Java).
A refactored, secure MirrorMaker to prevent data loss and improve reliability of cross-data center replication.
Per-user quotas to throttle producer and consumer throughput in a multitenant cluster.
Requirements for Cloudera Distribution of Apache Kafka 2.0
Cloudera Manager 5.5.3
Any CDH 5.x release is supported.
Notable Issues Fixed in Cloudera Distribution of Apache Kafka 2.0
Notable fixes backported into Kafka 2.0:
KAFKA-2799: WakupException thrown in the followup poll() could lead to data loss
KAFKA-2942: Inadvertent auto-commit when pre-fetching can cause message loss
KAFKA-2878: Kafka broker throws OutOfMemory exception with invalid join group request
KAFKA-2882: Add constructor cache for Snappy and LZ4 Output/Input stream in Compressor.java
KAFKA-2913: GroupMetadataManager unloads all groups in removeGroupsForPartitions
KAFKA-2880: Fetcher.getTopicMetadata NullPointerException when broker cannot be reached
KAFKA-2950: Fix performance regression in the producer
KAFKA-2973: Fix leak of child sensors on remove
KAFKA-2978: Consumer stops fetching when consumed and fetch positions get out of sync
KAFKA-2988: Change default configuration of the log cleaner
KAFKA-3012: Avoid reserved.broker.max.id collisions on upgrade
All backported fixes can be viewed in the git release notes here.
We look forward to you trying Kafka 2.0! For more information, please use the links below:
Install or upgrade Kafka
Review the documentation
Review the Release Notes
As always, we welcome your feedback. Please send your comments and suggestions through our community forums.
12-29-2015
03:34 PM
Impala does not control the physical locations of the HDFS blocks underlying Impala tables. The tables in Impala are backed by files on HDFS, and those files are chopped into blocks and distributed according to your HDFS configuration; for all practical purposes, though, the blocks are distributed round-robin among the data nodes (grossly simplified). Impala queries typically run on all data nodes that store data relevant to answering a particular query, so given a fixed amount of data, you can indirectly control Impala's degree of (inter-node) parallelism by changing the HDFS block size. More blocks == more parallelism. If you are interested in learning about Impala, you may also find the CIDR paper useful: http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf
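To make the "more blocks == more parallelism" point concrete, here is a rough back-of-the-envelope sketch. It is not an Impala or HDFS API: the helper name `estimated_scan_parallelism` and the cluster size of 10 data nodes are assumptions for illustration, and it takes the simplified round-robin picture above at face value (blocks spread evenly over distinct nodes, replication ignored).

```python
def estimated_scan_parallelism(file_size_bytes, block_size_bytes, num_data_nodes):
    """Hypothetical helper: rough upper bound on inter-node scan parallelism.

    Assumes each block lands on a different data node until every node
    holds one (the simplified round-robin model), and ignores replication.
    """
    # Ceiling division: a partially filled last block still counts as a block.
    num_blocks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    # A scan cannot use more nodes than there are blocks, or than exist.
    return num_blocks, min(num_blocks, num_data_nodes)


MB = 1024 * 1024
GB = 1024 * MB

# Same 1 GB of data, three different HDFS block sizes, 10 data nodes (assumed).
for block_mb in (256, 128, 64):
    blocks, nodes = estimated_scan_parallelism(1 * GB, block_mb * MB, num_data_nodes=10)
    print(f"block size {block_mb} MB: {blocks} blocks, up to {nodes} nodes scanning")
```

Under these assumptions, shrinking the block size raises the number of nodes that can take part in the scan until every data node holds at least one block; past that point, smaller blocks add no further inter-node parallelism.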