Member since 11-09-2020 | 3 Posts | 1 Kudos Received | 0 Solutions
01-08-2021 07:32 AM
Text classification is a ubiquitous capability with a wealth of use cases, including sentiment analysis, topic assignment, document identification, article recommendation, and more. But collecting enough annotated examples to train traditional classifiers can be costly. Instead, we take a look at a classic technique that can perform text classification with few or even zero training examples: text embeddings. Recent advances have significantly increased the quality of document embeddings, and in our newest write-up on Few Shot Text Classification we cover how to use them for topic classification, best practices for using them, and potential limitations. Follow the links in the report to find code snippets so you can try the method yourself, and build your own demo to see it in action!
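To make the zero-example idea concrete, here is a minimal sketch (not the report's code): embed the document and each candidate label name in the same vector space, then pick the label whose embedding is closest by cosine similarity. The `TOY_VECTORS` table, `embed`, `cosine`, and `classify` are all hypothetical names made up for illustration; a real system would swap the hand-written table for a pretrained embedding model.

```python
import math

# Hypothetical 3-dimensional word vectors, hand-written for illustration only.
# A real system would use a pretrained embedding model instead.
TOY_VECTORS = {
    "goal":       [0.9, 0.1, 0.0],
    "match":      [0.8, 0.2, 0.1],
    "team":       [0.7, 0.1, 0.2],
    "sports":     [1.0, 0.0, 0.0],
    "election":   [0.1, 0.9, 0.0],
    "vote":       [0.0, 0.8, 0.1],
    "politics":   [0.0, 1.0, 0.0],
    "chip":       [0.1, 0.0, 0.9],
    "software":   [0.0, 0.1, 0.8],
    "technology": [0.0, 0.0, 1.0],
}


def embed(text):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(text, labels):
    """Return the label whose embedding is most similar to the text's."""
    doc = embed(text)
    return max(labels, key=lambda label: cosine(doc, embed(label)))


LABELS = ["sports", "politics", "technology"]
print(classify("the team scored a late goal in the match", LABELS))
print(classify("new software for the chip", LABELS))
```

No labeled training examples are used here: the only supervision is the label names themselves, which is why the quality of the embedding space matters so much.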
01-04-2021 03:58 AM
Cloudera Fast Forward is happy to share recent updates of two of our older reports, now freely available to all!

Semantic Image Search

We recently revisited the topic of semantic search on image data. We’ve previously studied applications of deep learning to images in our early report Deep Learning for Image Analysis, and our more recent update, Deep Learning for Image Analysis: 2019 edition. In our most recent research cycle, we explored two critical requirements of semantic search at scale. First, we wrote a review of strategies for creating semantic representations of images (including supervised, self-supervised, and unsupervised methods). Second, we provided an implementation of semantic search using fast approximate nearest neighbor search (with FAISS). We have released an updated version of the ConvNet Playground App, and a set of scripts and tutorials for implementing semantic image search on the Cloudera Machine Learning platform.

Federated Learning

Two years ago we wrote a research report about Federated Learning. We’re pleased to make the report freely available to everyone. You can read it online here: Federated Learning.

In the time since, Federated Learning has only grown in relevance. Numerous startups have cropped up (and some have disappeared through acquisition) with Federated Learning as their core technology. Google continues to promote the technology, including for non-machine learning use cases, as in Federated Analytics: Collaborative Data Science without Data Collection. This year saw (what we believe to be) the first conferences with a heavy focus on federated learning, The Federated Learning Conference and the OpenMined Privacy Conference, as well as dedicated workshops at high-profile machine learning conferences like ICML and NeurIPS. OpenMined continues to build a strong community around private machine learning, creating courses and open source tools to lower the barrier to entry to federated learning and related privacy-enhancing techniques. Alongside those, TensorFlow Federated, IBM’s federated learning library and flower.dev are extending the tooling ecosystem.

Federated Learning is no panacea. In a privacy setting, decentralized data simply presents a different attack surface from centralized data. Not all applications require or benefit from federation. However, it is an important tool in the private machine learning toolkit.
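For the semantic image search part of this post, the retrieval step can be sketched roughly as follows. This is an exact, brute-force cosine search in plain Python, shown only to illustrate what is being computed; it is not the FAISS-based implementation from the report, and approximate nearest neighbor indexes exist precisely to avoid scanning every vector like this at scale. The file names, vectors, and the `search` helper are hypothetical; in practice the vectors would come from a trained image model.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, index, k=2):
    """Return the k (name, score) pairs most similar to the query, best first."""
    scored = [(name, cosine(query, vector)) for name, vector in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical image embeddings, hand-written for illustration only.
IMAGE_INDEX = {
    "beach.jpg":  [0.9, 0.1, 0.0],
    "harbor.jpg": [0.7, 0.3, 0.1],
    "forest.jpg": [0.1, 0.9, 0.2],
    "city.jpg":   [0.0, 0.2, 0.9],
}

# Hypothetical embedding of the query image.
query = [0.8, 0.2, 0.0]
for name, score in search(query, IMAGE_INDEX):
    print(name, round(score, 3))
```

The brute-force scan costs one similarity computation per indexed image per query, which is the cost an approximate index trades a little accuracy to reduce.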
11-25-2020 07:18 AM
Cloudera Fast Forward is pleased to share our two latest applied machine learning research reports, on Meta-Learning and Structural Time Series. Read on here for more on each report, check out our blog post to read about our evolving research process, or head on over to our Research Roundup to view our latest research webinar on demand!

Meta-Learning

In contrast to how humans learn, deep learning algorithms need vast amounts of data and compute and may still struggle to generalize. Humans adapt quickly because they draw on knowledge acquired from prior experience when faced with new problems. In this webinar we will explain how meta-learning can leverage previous knowledge acquired from data to solve novel tasks quickly and more efficiently at test time. Our report, Meta-Learning, is freely available online, and accompanied by code that applies the technique to an image dataset.

Read the report to learn:
- when you should think about meta-learning and lessons to apply in your data science practice
- how meta-learning helps models to generalize to new circumstances or classes during inference
- a foundational approach to the kind of problems it can help us solve, along with our experimental results

Structural Time Series

Time series data is ubiquitous, and forecasting has a long history. Generalized additive models give us a simple, flexible and interpretable means for modeling some kinds of time series, especially where there is seasonality. We look at the benefits and trade-offs of taking a curve-fitting approach to time series, and demonstrate its use via Facebook’s Prophet library on a demand forecasting problem. Our report, Structural Time Series, is freely available online, and accompanied by code applying the techniques discussed to forecasting electricity demand in California.

Read the report to learn:
- how capturing the uncertainty in time series allows us to ask better questions
- the importance of baseline models, and how to develop models iteratively
- the trade-offs of a curve-fitting approach to time series
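To illustrate the point about baseline models, here is a minimal sketch of one common baseline for seasonal data: a seasonal-naive forecast, which predicts each future step with the value observed one season earlier, scored with mean absolute error. This is an assumption about what a reasonable baseline might look like, not the report's code, and the demand numbers are made up. Any more elaborate model (for example a curve-fitting one) would then need to beat this error to justify its complexity.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast each future step with the value observed one season earlier."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    # Repeat the last observed season for as many steps as requested.
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical demand readings with a 4-step seasonal pattern (made-up data).
history = [10, 20, 30, 20, 11, 21, 31, 21]
actual_next = [12, 22, 32, 22]

forecast = seasonal_naive_forecast(history, season_length=4, horizon=4)
print(forecast)
print(mean_absolute_error(actual_next, forecast))
```

Developing iteratively then means starting from a baseline like this one and keeping each added component (trend, extra seasonalities, regressors) only if it improves on the previous error.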