Member since 10-04-2021 | 4 Posts | 2 Kudos Received | 0 Solutions
09-29-2023
12:00 PM
1 Kudo
Hey Cloudera Community! We're thrilled to announce our latest addition to the Cloudera AMP catalog: Text Summarization and more with Amazon Bedrock! If you've been keen on tapping into the power of foundation models from leading AI companies like Amazon, this AMP is crafted just for you.
What are AMPs?
Applied ML Prototypes (AMPs) are pre-packaged ML projects designed for easy deployment within Cloudera Machine Learning (CML). AMPs enable data scientists to go from an idea to a fully working ML use case in a fraction of the time.
What's Inside:
Guided Setup: Navigate effortlessly through the setup process and start using the Amazon Bedrock API in no time.
Highlighted Models: Dive deep with two powerful models:
amazon.titan-tg1-large
anthropic.claude-v2 (supports a whopping 100K tokens in the prompt!)
Flexibility in Action: Customize the instruction prompts as needed to direct the models toward diverse text generation tasks.
API Guidance: Benefit from clear directions on prompt formats and the request API schema for hassle-free model calls.
Why Dive In?
Harness the combined power of Cloudera and Amazon Bedrock. Whether it's the expansive range of foundation models or the adaptability of instruction prompts, this AMP is your next step in advanced ML implementation.
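To give a feel for the prompt formats and request schema mentioned above, here is a minimal sketch of how the two highlighted models might be called. It is not taken from the AMP itself: the helper names (`build_request_body`, `extract_text`, `summarize`), the default instruction, and the `us-east-1` region are assumptions for illustration, and the live call needs `boto3` plus AWS credentials with Bedrock access.

```python
import json

TITAN_MODEL_ID = "amazon.titan-tg1-large"
CLAUDE_MODEL_ID = "anthropic.claude-v2"


def build_request_body(model_id: str, instruction: str, text: str, max_tokens: int = 512) -> str:
    """Return the JSON request body in the shape each model family expects."""
    prompt = f"{instruction}\n\n{text}"
    if model_id.startswith("anthropic."):
        # Claude v2 text completions use Human/Assistant turn markers.
        payload = {
            "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
            "max_tokens_to_sample": max_tokens,
            "temperature": 0.0,
        }
    elif model_id.startswith("amazon.titan"):
        # Titan text models take the prompt as inputText plus a config object.
        payload = {
            "inputText": prompt,
            "textGenerationConfig": {"maxTokenCount": max_tokens, "temperature": 0.0},
        }
    else:
        raise ValueError(f"No request format known for model: {model_id}")
    return json.dumps(payload)


def extract_text(model_id: str, response_json: dict) -> str:
    """Pull the generated text out of a parsed response body."""
    if model_id.startswith("anthropic."):
        return response_json["completion"]
    return response_json["results"][0]["outputText"]


def summarize(model_id: str, text: str,
              instruction: str = "Summarize the following text in three sentences.",
              region: str = "us-east-1") -> str:
    """Call Bedrock; requires boto3 and valid AWS credentials (assumed, not shown)."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId=model_id,
        body=build_request_body(model_id, instruction, text),
        contentType="application/json",
        accept="application/json",
    )
    return extract_text(model_id, json.loads(response["body"].read()))
```

Changing the `instruction` argument is the sketch's stand-in for customizing the instruction prompt: the same call can be pointed at translation, rewriting, or question answering instead of summarization.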
10-17-2022
02:33 PM
Do you love Machine Learning? Do you think your code is so sublime that it should serve as a template for the rest of the ML community? Do you like winning prizes? If you answered yes to these questions, come join Cloudera's Applied ML Prototype Hackathon, where you will have free rein to create an ML project that may be integrated into Cloudera Machine Learning (CML) as an Applied ML Prototype. Cloudera's Applied Machine Learning Prototypes (AMPs) are fully built, end-to-end data science solutions that allow data scientists to go from an idea to a fully working machine learning use case in a fraction of the time. Accessible with a single click from Cloudera Machine Learning or via public GitHub repositories, AMPs provide an end-to-end framework for building, deploying, and monitoring business-ready ML applications. Winning entrants will receive a cash prize, and their projects will be reviewed by Cloudera Fast Forward Labs and added to the AMP Catalog. So if you have a project that you would love to share with the community, are looking to differentiate your resume from the masses, and/or could use some extra cash, then sign up for your chance to win!
02-22-2022
06:43 AM
1 Kudo
Cloudera Machine Learning now offers code snippets for connecting to data sources available within the CDP Environment. Administrators can configure custom Spark, Hive, or Impala Virtual Warehouse data connections manually, or they can use CML's autodetection to discover and configure all connections from the same CDP Environment. Data scientists can then access the preconfigured Data Connections from their ML Projects.
Data Connection and Snippet support simplifies getting started on ML Projects. The first time users create a session in a new project, they are offered code snippets that connect to their selected data store. Users no longer need to look up connection boilerplate in the documentation or copy example code from other projects; they can initiate the connection through CML's connection library and immediately start solving their business problems.
To learn more, you can read the How To article in the Cloudera Docs.
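For readers who have not seen one, a generated snippet looks roughly like the sketch below. It assumes the connection library is importable as `cml.data_v1` and exposes `get_connection` and `get_pandas_dataframe`, which only holds inside a CML session; the wrapper function `query_data_connection` and the connection name in the usage note are hypothetical. The optional `get_connection` argument exists only so the flow can be exercised outside CML.

```python
from typing import Any, Callable, Optional


def query_data_connection(
    connection_name: str,
    sql: str,
    get_connection: Optional[Callable[[str], Any]] = None,
) -> Any:
    """Run a SQL statement over a preconfigured CML Data Connection."""
    if get_connection is None:
        # Assumption: this module is only available inside a CML session.
        import cml.data_v1 as cmldata

        get_connection = cmldata.get_connection
    conn = get_connection(connection_name)
    # For Hive/Impala connections this is assumed to return a pandas DataFrame.
    return conn.get_pandas_dataframe(sql)
```

Inside a session, a call such as `query_data_connection("default-impala", "SHOW DATABASES")` would list the databases visible through that connection, where `default-impala` is a placeholder for whatever name your administrator configured.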
11-11-2021
08:52 PM
I wanted to quickly share that we’ve released FIVE new Applied ML Prototypes (AMPs) in Cloudera Machine Learning (CML) and Cloudera Data Science Workbench (CDSW)! These AMPs solve a wide range of problems for data scientists and help jumpstart ML / AI projects, enabling them to deliver greater value faster across their organization. Here’s an overview of what was released:
Summarization — This project demonstrates four automatic summarization models, including extractive and abstractive techniques.
Why you should care: Summarization lets users quickly extract the important information from larger bodies of text. This is useful for any organization looking to accelerate research or competitive analysis, or that needs to process and understand volumes of text too large for a person to read in a reasonable time.
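To make the extractive side concrete, here is a small frequency-based sketch. It is not one of the four models in the AMP; it only illustrates the general idea that an extractive summarizer selects existing sentences rather than writing new ones. The stopword list and scoring rule are simplifications chosen for the example.

```python
import re
from collections import Counter

# A deliberately tiny stopword list; real systems use a much larger one.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "was", "it", "for", "on"}


def split_sentences(text: str) -> list:
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]


def content_words(text: str) -> list:
    """Lowercased word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def extractive_summary(text: str, num_sentences: int = 2) -> str:
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence: str) -> float:
        tokens = content_words(sentence)
        # Average word frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)
```

An abstractive model, by contrast, generates new sentences and therefore needs a trained sequence-to-sequence model rather than a scoring rule like this one.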
AutoML with TPOT - This project enables automated machine learning on a sample of credit card fraud data.
Why you should care: AutoML has the potential to accelerate many repetitive processes in the ML model development lifecycle. While many proprietary AutoML tools make it difficult to tweak or adjust how a model is built, TPOT, an open-source library for AutoML, makes it easy to get the most from AutoML without compromising flexibility and customization.
Train Gensim’s Word2Vec - This project demonstrates how to train Word2Vec for a non-language use case to learn embeddings for products on an e-commerce website.
Why you should care: Word embedding is a very popular natural language processing (NLP) technique that captures the meaning of a word from the context in which it appears. In NLP, this lets users extract more accurate meaning from words and leads to more accurate text analysis; the same idea carries over to non-language sequences, such as the products in this AMP.
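The non-language trick is to treat each shopping session as a "sentence" and each product ID as a "word". The sketch below shows that preparation step and a possible training call; it is an illustration rather than the AMP's code. The function names, the event format, and the hyperparameters are assumptions, and `train_product_embeddings` needs gensim 4 or later installed.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def sessions_to_sentences(events: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group (session_id, product_id) events into per-session product lists."""
    by_session: Dict[str, List[str]] = defaultdict(list)
    for session_id, product_id in events:
        by_session[session_id].append(product_id)
    # A session with a single product carries no co-occurrence signal.
    return [products for products in by_session.values() if len(products) > 1]


def train_product_embeddings(sentences: List[List[str]], vector_size: int = 32):
    """Train skip-gram embeddings over product sequences (requires gensim >= 4)."""
    from gensim.models import Word2Vec  # imported lazily; not needed for the prep step

    return Word2Vec(
        sentences=sentences,
        vector_size=vector_size,
        window=5,
        min_count=1,
        sg=1,        # skip-gram
        workers=1,
        seed=42,
    )
```

After training, `model.wv.most_similar("some-product-id")` would return the products that tend to appear in the same sessions, which is the basis for a simple "similar items" feature.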
Getting Started with the CML API — This project demonstrates how to work with the API.
Why you should care: In addition to the CML UI, the new API v2 release lets you interact with your models programmatically. For users, this means greater flexibility across production environments for interacting with and retraining models, so they can maintain more ML projects effectively.
TensorBoard as a CML Application — TensorBoard is a tool that provides the measurements and visualizations needed to help inspect, debug, and iterate during the machine learning workflow.
Why you should care: TensorBoard makes it easier to track the complex process of developing an ML model. This AMP demonstrates how to run TensorBoard within Cloudera Machine Learning (CML) via the Application feature, an example that will be easy to repurpose for any CML project.
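As a rough idea of what running TensorBoard as a CML Application involves, the sketch below builds the launch command. It is not the AMP's script. It assumes that CML hands the application its port through the `CDSW_APP_PORT` environment variable and expects it to listen on `127.0.0.1`; the function names and the `logs` directory are hypothetical.

```python
import os
import subprocess
from typing import List, Mapping, Optional


def tensorboard_command(logdir: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Build the TensorBoard CLI invocation for a CML Application."""
    env = os.environ if env is None else env
    # Assumption: CML supplies the port here; 6006 is TensorBoard's usual default.
    port = env.get("CDSW_APP_PORT", "6006")
    return ["tensorboard", "--logdir", logdir, "--host", "127.0.0.1", "--port", str(port)]


def launch_tensorboard(logdir: str = "logs") -> None:
    """Start TensorBoard and block until it exits (requires tensorboard installed)."""
    subprocess.run(tensorboard_command(logdir), check=True)
```

An application script would then just call `launch_tensorboard("logs")`, pointing at whatever directory the training jobs write their event files to.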