Created on 08-28-2024 05:40 PM - edited on 09-03-2024 12:35 AM by VidyaSargur
Cloudera Machine Learning (CML) is a platform designed to help organizations build, deploy, and manage machine learning models at scale. It is part of Cloudera’s suite of enterprise data platforms and solutions, focusing on providing a robust environment for data scientists, analysts, and engineers to collaborate on end-to-end machine learning workflows.
PyGurobi is a Python interface for the Gurobi Optimizer, a powerful and widely used solver for mathematical optimization problems. Gurobi is known for its high performance in solving a variety of optimization problems, including linear programming (LP), quadratic programming (QP), mixed-integer programming (MIP), and others.
In this tutorial, you will use PyGurobi on CML to optimize product prices and maximize enterprise revenue.
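To make the goal concrete, the following is a minimal sketch of price optimization for two competing products. It is not the tutorial's Gurobi model: it assumes a linear demand for each product and solves the resulting concave quadratic problem in closed form with plain Python. The coefficient names (`a1`, `b1`, `c1`, and so on) and their values are hypothetical and chosen only for illustration.

```python
# Hypothetical linear demand model (an assumption, not taken from the tutorial):
#   n1 = a1 - b1*p1 + c1*p2   (demand for product 1)
#   n2 = a2 + c2*p1 - b2*p2   (demand for product 2)
# A higher competitor price raises demand, a higher own price lowers it.
demand = {"a1": 100.0, "b1": 1.0, "c1": 0.2,
          "a2": 80.0, "b2": 0.8, "c2": 0.2}


def revenue(p1, p2, d):
    """Total revenue p1*n1 + p2*n2 under the assumed linear demand."""
    n1 = d["a1"] - d["b1"] * p1 + d["c1"] * p2
    n2 = d["a2"] + d["c2"] * p1 - d["b2"] * p2
    return p1 * n1 + p2 * n2


def optimal_prices(d):
    """Solve grad(revenue) = 0, a 2x2 linear system, by Cramer's rule."""
    c = d["c1"] + d["c2"]
    det = 4 * d["b1"] * d["b2"] - c * c
    # The Hessian [[-2*b1, c], [c, -2*b2]] must be negative definite,
    # otherwise the stationary point is not a maximum.
    if d["b1"] <= 0 or det <= 0:
        raise ValueError("revenue is not strictly concave for these coefficients")
    p1 = (2 * d["b2"] * d["a1"] + c * d["a2"]) / det
    p2 = (2 * d["b1"] * d["a2"] + c * d["a1"]) / det
    return p1, p2


p1_star, p2_star = optimal_prices(demand)
best_revenue = revenue(p1_star, p2_star, demand)
print(f"prices: {p1_star:.2f}, {p2_star:.2f}  revenue: {best_revenue:.2f}")
```

A solver such as Gurobi becomes useful once the problem has bounds, integrality requirements, or other constraints, where a closed-form solution like this one no longer applies.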
The following are required to reproduce this example:
Supporting code for reproducing the tutorial can be found in this Git repository.
Editor: JupyterLab
Kernel: Python 3.10
Edition: Standard
Version: 2024.05
Enable Spark: Spark 3.2 or above
Resource Profile: 2 CPU / 4 GB Mem / 0 GPU
Runtime Image: docker.repository.cloudera.com/cloudera/cdsw/ml-runtime-jupyterlab-python3.10-standard:2024.05.1-b8
pip3 install -r requirements.txt
Run notebook ```00_datagen_iceberg_pyspark.ipynb``` and observe the following:
Run notebook ```01_price_optimization_with_competing_products.ipynb``` and observe the following:
Run notebook ```02_price_optimization_model_deployment.ipynb``` and observe the following:
Navigate back to the CML workspace, where a new project named ```CML Project for Optimization Model``` has been created. Open it and confirm that a new endpoint appears in the Model Deployments section.
Open the model deployment and, once the deployment has completed, enter the following sample payload in the Test Request window. Observe the output response.
Test Input:
{"p[1]": [354,353,352,351,354,353,312,311,314,313,352,351], "p[2]": [110,120,320,220,101,100,101,260,355,140,300,299], "n[1]": [54,53,112,151,154,153,52,51,4,53,92,71]}
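The same payload can also be prepared programmatically. The sketch below only builds and validates the request body; it does not call the endpoint. The `accessKey`/`request` envelope and the placeholder key are assumptions, so check them against the sample code shown for your own model deployment. `build_model_request` is a hypothetical helper name.

```python
import json


def build_model_request(access_key, observations):
    """Wrap the observations in an accessKey/request envelope (assumed shape)."""
    # Every series should describe the same observations, so lengths must match.
    lengths = {len(values) for values in observations.values()}
    if len(lengths) != 1:
        raise ValueError("all series must have the same number of observations")
    return json.dumps({"accessKey": access_key, "request": observations})


# Sample payload copied from the Test Input above.
sample_input = {
    "p[1]": [354, 353, 352, 351, 354, 353, 312, 311, 314, 313, 352, 351],
    "p[2]": [110, 120, 320, 220, 101, 100, 101, 260, 355, 140, 300, 299],
    "n[1]": [54, 53, 112, 151, 154, 153, 52, 51, 4, 53, 92, 71],
}

# "<your-access-key>" is a placeholder, not a real key.
request_body = build_model_request("<your-access-key>", sample_input)
print(request_body)
```

Checking the series lengths before sending avoids a round trip to the endpoint for a payload the model could not use.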
Sample Test Output:
{
  "model_deployment_crn": "crn:cdp:ml:us-west-1:558bc1d2-8867-4357-8524-311d51259233:workspace:f76bd7eb-adde-43eb-9bd9-e16ec2cb0238/c152a438-6449-465e-8685-e1cc0b9988fa",
  "prediction": {
    "data": {
      "n[1]": [54, 53, 112, 151, 154, 153, 52, 51, 4, 53, 92, 71],
      "p[1]": [354, 353, 352, 351, 354, 353, 312, 311, 314, 313, 352, 351],
      "p[2]": [110, 120, 320, 220, 101, 100, 101, 260, 355, 140, 300, 299]
    },
    "optimal prices": [400, 300],
    "optimal product quantities": [80, 120],
    "total revenue": 68032.83
  },
  "uuid": "e6700d88-f4e7-4705-988b-89e9c8092194"
}
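A response of this shape can be read with the standard `json` module. The sketch below uses a trimmed copy of the sample output and a hypothetical helper, `summarize_prediction`, to pull out the optimal prices, quantities, and reported revenue. It also recomputes revenue from the displayed prices and quantities for comparison.

```python
import json


def summarize_prediction(response_text):
    """Extract the optimization results from a model response."""
    prediction = json.loads(response_text)["prediction"]
    prices = prediction["optimal prices"]
    quantities = prediction["optimal product quantities"]
    return {
        "prices": prices,
        "quantities": quantities,
        "reported_revenue": prediction["total revenue"],
        # Revenue implied by the displayed values: sum of price * quantity.
        "revenue_from_displayed_values": sum(
            p * q for p, q in zip(prices, quantities)
        ),
    }


# Trimmed copy of the sample output above (echoed input data omitted).
sample_response = """
{
  "prediction": {
    "optimal prices": [400, 300],
    "optimal product quantities": [80, 120],
    "total revenue": 68032.83
  },
  "uuid": "e6700d88-f4e7-4705-988b-89e9c8092194"
}
"""

summary = summarize_prediction(sample_response)
print(summary)
```

For the sample output, the recomputed value is 68000 while the reported total revenue is 68032.83. One possible explanation is that the displayed prices and quantities are rounded, but the output alone does not confirm that.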
In this tutorial, you used PyGurobi in Cloudera Machine Learning to maximize product revenue by identifying optimal prices and sales quantities for two products.
The PyGurobi library allows you to solve complex linear and nonlinear programming problems such as the one above. Cloudera on Cloud provides the tooling necessary to use libraries such as PyGurobi in an enterprise setting. With CML, you can leverage Spark on Kubernetes, Runtime Add-Ons, Iceberg, Python, MLflow, and more to install libraries and containerize workloads and machine learning models at scale, without any custom installations.
Here are some useful articles about Cloudera Machine Learning (CML) that can help you better understand its features and capabilities: