Member since
07-20-2016
14
Posts
13
Kudos Received
0
Solutions
07-24-2019
02:25 PM
2 Kudos
Cloudera is delighted to announce the release of Cloudera Data Science Workbench 1.6.0. Some major features shipping with this release include:
Bring Your Own Editor
Cloudera Data Science Workbench enables team collaboration on an end-to-end data science workflow, from data exploration and data engineering to model development and deployment in production. This can involve collaboration among data engineers, data scientists and ML engineers who often have different editor and IDE preferences. With version 1.6, diverse teams can now tap into the benefits of self-service data science for the enterprise with CDSW, all while working within their most familiar or preferred IDE. This feature supports both third-party IDEs such as PyCharm that run on your local machine and browser-based IDEs such as Jupyter and RStudio. For details, see Editors.
Expanded Support for Distributed Machine Learning
Cloudera Data Science Workbench 1.6 allows you to run distributed workloads with frameworks such as TensorFlowOnSpark, H2O, XGBoost, and so on. This is similar to what you can already do with Spark workloads that run on the attached CDH/HDP cluster. For a simple example, see Running Distributed ML Workloads on YARN.
Multiple CDSW Deployments Per-Cloudera Manager Instance
You can now have multiple Cloudera Data Science Workbench CSD deployments associated with a single instance of Cloudera Manager.
For the complete list of new features, changes, and bug fixes shipping with this release, please see the Release Notes.
For more information on downloading, installing, and using Cloudera Data Science Workbench, see the links below:
Download Cloudera Data Science Workbench
Product Overview
Installation Guide
Quickstart Guide
Release Notes
As always, we welcome your feedback. Please send your comments and suggestions on our community forums.
12-17-2018
05:26 PM
1 Kudo
We are pleased to announce the release of Cloudera Data Science Workbench 1.4.3.
This release adds new customizations and fixes some critical bugs, including a permanent fix for TSB-346: Risk of Data Loss on Cloudera Data Science Workbench Shutdown and Restart. Please read the Release Notes carefully before you start upgrading or perform a shutdown/restart operation on any previous version of CDSW.
For a complete list of new features and bug fixes in this release, please see the Release Notes.
For more information on downloading, installing and using Cloudera Data Science Workbench, see:
Download Cloudera Data Science Workbench
Cloudera Data Science Workbench Overview
Getting Started with Cloudera Data Science Workbench
Cloudera Data Science Workbench Release Notes
Installing and Upgrading Cloudera Data Science Workbench
As always, we welcome your feedback. Please send your comments and suggestions on our community forums.
11-28-2018
08:56 PM
We are pleased to announce the general availability of Cloudera Enterprise 5.16, the modern platform for machine learning and analytics optimized for the cloud. This release delivers a number of enhancements focusing on ease of administration, improved stability and performance at increased scale.
The Cloudera Enterprise 5.16 release includes updated versions of many of our platform components, including:
CDH
Cloudera Manager
Apache Impala
Cloudera Navigator
Apache Kudu
Apache Sentry
5.16 adds new capabilities across four key themes:
Easier Administration:
Sentry adds CREATE permission and user-level ownership of tables. This enables securely sharing a single sandbox database among many users, and eliminates the administrative overhead of creating separate databases, roles and groups to preserve privacy for one person or a small group.
Customized upgrade guide puts all the steps needed for your upgrade in one document.
Scale and Performance:
Navigator handles larger volumes of data with more selective HDFS event and metadata capture.
New Kudu tablet rebalancing tool delivers more consistent performance and resource usage (without administrator intervention).
Enterprise Quality and Stability:
New Cloudera Manager health checks for Impala, Kafka, Kudu and Navigator.
Impala metadata handling improvements and REFRESH METADATA permission.
Platform Support:
Java OpenJDK
Oracle Enterprise Linux 7.5
Please note: The first release of Cloudera Enterprise 5.16 is numbered 5.16.1.
Additional information is available in the documentation and the Release Notes.
As always, we'd love your feedback and remain committed to your success! Please provide any comments and suggestions through our community forums.
10-12-2018
03:36 PM
1 Kudo
We are pleased to announce the release of Cloudera Data Science Workbench 1.4.2.
This release adds support for RHEL/CentOS 7.5 and fixes some critical bugs, including TSB-346: Risk of Data Loss on Cloudera Data Science Workbench Shutdown and Restart. Please read the TSB and Release Notes carefully before you start upgrading or perform a shutdown/restart operation on any previous version of CDSW.
Note that Cloudera Data Science Workbench 1.4.2 is the next official maintenance release after Cloudera Data Science Workbench 1.4.0. Version 1.4.1 is no longer publicly available.
For more information on downloading, installing and using Cloudera Data Science Workbench, see:
Download Cloudera Data Science Workbench
Cloudera Data Science Workbench Overview
Getting Started with Cloudera Data Science Workbench
Cloudera Data Science Workbench Release Notes
Installing and Upgrading Cloudera Data Science Workbench
As always, we welcome your feedback. Please send your comments and suggestions on our community forums.
07-27-2018
12:51 PM
We are pleased to announce the release of Cloudera Data Science Workbench 1.3.1.
This release includes fixes for some critical bugs. Details are in the Release Notes.
For more information on downloading, installing and using Cloudera Data Science Workbench, please see the links below:
Download Cloudera Data Science Workbench
Cloudera Data Science Workbench Overview
Getting Started with Cloudera Data Science Workbench
Cloudera Data Science Workbench Release Notes
Installing Cloudera Data Science Workbench
As always, we welcome your feedback. Please send your comments and suggestions on our community forums.
06-22-2018
04:32 PM
Cloudera is delighted to announce the release of Cloudera Data Science Workbench 1.4.0. With this release, Cloudera Data Science Workbench extends the machine learning platform experience from research to production. Data scientists can now build, train, and deploy models in a unified workflow with two new key capabilities: experiments and models.
Experiments:
Experiments let data scientists train, compare, and reproduce versioned models. With this feature, data scientists can run a batch job that will:
create a snapshot of model code, dependencies, and configuration parameters necessary to train the model
build and execute the training run in an isolated container
track model metrics, performance, and any model artifacts the user specifies
Models:
Models let data scientists build, deploy, and manage models as REST APIs to serve predictions. With this feature, data scientists can simply select a Python or R function within a project file, and Cloudera Data Science Workbench will:
create a snapshot of model code, saved model parameters, and dependencies
build an immutable executable container with the trained model and serving code
add a REST endpoint that automatically accepts input parameters matching the function signature, and that returns a data structure matching the function’s return type
save the built model container, along with metadata like who built or deployed it
deploy and start a specified number of model API replicas, automatically load balanced
let the user document, test, and share the model
In addition, Cloudera Data Science Workbench 1.4 includes security enhancements that help automate user administration.
Simplified user administration:
Previous CDSW releases offered LDAP and SAML authentication but allowed every user to log in. The consequence was user sprawl and unintended license consumption. Site administrators had to be manually entitled in the Cloudera Data Science Workbench UI.
With version 1.4 you can now designate LDAP and SAML groups for both users and administrators. With automatic synchronisation, the ability to log in or administer CDSW now depends on a user’s group membership. These groups can be assigned in your existing centralised LDAP/SAML authentication system. We’ve added two new properties to CDSW:
LDAP/SAML User Groups - Groups whose users can log in to CDSW
LDAP/SAML Admin Groups - Groups whose users are automatically made site administrators in CDSW
For a complete list of new features and bug fixes in this release, please see the Release Notes.
For more information on downloading, installing and using Cloudera Data Science Workbench, please see the links below:
Download Cloudera Data Science Workbench
Product Overview
Installation Guide
Quickstart Guide
Release Notes
As always, we welcome your feedback. Please send your comments and suggestions on our community forums.
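To make the Models feature in this post more concrete, here is a minimal sketch of the kind of Python function a data scientist might select for deployment. Everything in it is an assumption for illustration: the function name `predict`, the input key `x`, the single-dict calling convention, and the hard-coded coefficients are hypothetical and are not taken from the CDSW documentation. The last lines only simulate the JSON round trip that a REST endpoint would perform.

```python
import json

# Hypothetical coefficients standing in for a trained model's saved parameters.
SLOPE = 2.5
INTERCEPT = 1.0


def predict(args):
    """Return a prediction for one JSON-decoded request (assumed to be a dict)."""
    x = float(args["x"])  # "x" is an illustrative input name, not a CDSW convention
    return {"prediction": SLOPE * x + INTERCEPT}


# Simulate what a REST endpoint would do: decode JSON, call the function, encode the result.
request_body = json.dumps({"x": 4})
response_body = json.dumps(predict(json.loads(request_body)))
print(response_body)  # {"prediction": 11.0}
```

Keeping the function free of side effects and returning a JSON-serializable value makes it easy to test locally before deploying it. Check the Release Notes and product documentation for the calling convention your CDSW version actually expects.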
06-15-2018
09:29 PM
5 Kudos
Cloudera is delighted to announce availability of the next iteration of our modern platform for machine learning and analytics, optimized for cloud. This release continues to demonstrate innovation and a commitment to enterprise-grade quality.
Included in the Cloudera Enterprise 5.15 release is support for new versions of many of our platform components:
CDH 5.15.0
Cloudera Manager 5.15.0
Director 2.8
Navigator 2.14
Cloudera Data Science Workbench (CDSW) 1.4
Apache Kafka CDK 3.1.0, based on the upstream version 1.0.1
Navigator Encrypt 3.15.0
Key Trustee Server 5.15.0
Of special note, 5.15 adds new capabilities aligned to our machine learning, analytics, and cloud focus.
Machine Learning:
Track and move models from research to production more easily: launch and compare versioned experiments, and deploy and manage versioned models as microservices (REST APIs).
Analytics:
Apache Kudu now supports the decimal column type with fixed scale and precision suitable for financial and other arithmetic.
Kudu also has a new replica management scheme that allows for much faster recovery of tablets in scenarios where one tablet server goes down and then returns shortly afterward. The new scheme also provides substantially better overall stability on clusters with frequent server failures.
Apache Impala has new RPC functionality. This makes clusters more stable and lays the foundation for running on larger clusters.
New Impala stats sampling and extrapolation will allow users to collect table stats using fewer resources and less time by using a sample of the data.
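As a short illustration of why a decimal column type with fixed scale and precision matters for financial arithmetic, the sketch below compares binary floating point with fixed-scale decimals. It uses Python's standard `decimal` module, not any Kudu API, and the scale of 2 (cents) is an illustrative assumption.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 exactly, so repeated sums drift.
float_total = sum([0.10] * 3)

# Fixed-scale decimals keep exact cents; a scale of 2 is an illustrative choice.
CENT = Decimal("0.01")
decimal_total = sum([Decimal("0.10")] * 3, Decimal("0")).quantize(CENT)

print(float_total == 0.30)               # False
print(decimal_total == Decimal("0.30"))  # True
```

The same reasoning applies to any store that offers a fixed-precision decimal type: sums and comparisons of monetary values stay exact instead of accumulating rounding error.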
Cloud:
Altus encryption at rest and in motion which covers AWS S3 data and logs, AWS EBS data and root volumes, TLS for web traffic and Impala, and Kerberos for RPC (data movement)
Simplified cluster provisioning in Cloudera Director.
BDR replication to Microsoft ADLS for HDFS and Hive, plus more secure cloud credential handling for both ADLS and AWS S3.
And more!
Apache Spark 2.3 is also now available separately from CDH 5.15.0 and includes:
Spark lineage support, which can be used with Navigator in CM 5.15 for metadata and transformation analysis and better regulatory compliance.
Vectorized PySpark UDF support which improves PySpark performance
History Server Scalability with a UI which can show applications at start/restart much faster than before, even if there are a lot of applications
Apache Parquet timestamp read side adjustment, so that Spark can read timestamps written by Impala
Additional information is available in the documentation and the Release Notes.
As always, we'd love your feedback and remain committed to your success! Please provide any comments and suggestions through our community forums.
06-07-2018
08:23 PM
Cloudera is pleased to announce the release of CDK 3.1.0 powered by Apache Kafka. Apache Kafka is a highly scalable, distributed, publish-subscribe messaging system. CDK 3.1.0 is a minor release based on Apache Kafka 1.0.1.
Notable Issues Fixed in CDK 3.1.0:
[KAFKA-5970] - Deadlock due to locking of DelayedProduce and group.
[KAFKA-6739] - Down-conversion fails for records with headers.
[KAFKA-6185] - Selector memory leak with high likelihood of OOM in case of down conversion.
[KAFKA-6134] - High memory usage on controller during partition reassignment.
[KAFKA-6042] - Kafka Request Handler deadlocks and brings down the cluster.
[KAFKA-6003] - Replication Fetcher thread for a partition with no data fails to start.
All backported fixes can be viewed in the git release notes here or on our website under the Issues fixed section.
We look forward to you trying CDK 3.1.0. For more information, please use the links below:
Install or upgrade Kafka
Review the documentation
Review the Release Notes
As always, we welcome your feedback. Please send your comments and suggestions through our community forums.
01-26-2018
04:16 PM
Cloudera is pleased to announce the release of Cloudera Data Science Workbench 1.3.0.
This release includes fixes for some critical bugs. Details are in the Release Notes.
For more information on downloading, installing and using Cloudera Data Science Workbench, please see the links below:
Download Cloudera Data Science Workbench
Cloudera Data Science Workbench Overview
Getting Started with Cloudera Data Science Workbench
Cloudera Data Science Workbench Release Notes
Installing Cloudera Data Science Workbench
As always, we welcome your feedback. Please send your comments and suggestions on our community forums.
01-26-2018
04:11 PM
Cloudera is pleased to announce that Cloudera Enterprise 5.14 is now generally available (GA). Our C5.14 release is delivering on the promises we’ve made to our customers, as Cloudera continues to innovate and set the pace for big data and machine learning platforms. This version brings to life key SDX components and improves the security profile of our platform, amongst many other product enhancements.
Here are some selected highlights of CDH 5.14. As usual, there are also a number of quality enhancements, bug fixes, and other improvements across the stack. Here is a partial list of what’s included (see the Release Notes for a full list):
Core Platform:
New data catalog and self-service discovery features in Cloudera Navigator enable searching and grouping of business metadata definitions for admins and users.
Director support for Active Directory provides a tighter integration for authentication of users as clusters are provisioned and managed.
Key Management has added encryption key migration so that customers can easily migrate existing clusters to the new HSM KMS for key management.
Data Science and Engineering:
Altus Data Engineering now supports PySpark, providing even more choice to data scientists and data engineers building models and data pipelines.
The Altus SDK (Java) presents a direct interface for analytics application developers who want to tie in data engineering functions via Cloudera’s platform-as-a-service.
Analytic DB:
Navigator Optimizer now has easier SQL Workload Migration with a migration page to show status “at-a-glance”, gauge the effort to migrate, and represent artifacts to kick-start migration projects.
Hue offers a new Impala Query Browser to see recently executed SQL statements and speed up ad hoc analytics.
Hue also now has an ADLS Browser to find data stored in the Microsoft Azure Data Lake Store.
Kudu now has run-time filter support and the ability to add data directories to an existing tablet server.
Additional information is available in the documentation. As always, we value your feedback; please provide any comments and suggestions through our community forums. You can also file bugs via issues.cloudera.org.