About aakulov

aakulov · ‎03-06-2024

Earlier this year, Cloudera Machine Learning (CML) added a new way to accelerate GenAI projects: tapping into Hugging Face Spaces and deploying these projects right inside CML with just a few clicks. With over 6,500 Spaces as of this writing, the Hugging Face community is still growing rapidly and provides a convenient platform for practitioners and organizations to share their work in areas ranging from classical machine learning to the latest GenAI research. In this article you will learn how to enable and use this feature to accelerate your own ML projects.

The default Hugging Face Spaces AMP catalog is enabled for all CML Public Cloud workspaces starting from version 2.0.43-b208. To let users launch external Hugging Face AMPs, additional steps are necessary (see the end of this article).

Steps to Deploy Hugging Face Space AMP

Let's dive right in and see how simple it is to deploy a Hugging Face AMP:

1. Click AMPs in the left sidebar of your ML Workspace. If you don't see this option, AMPs have not been enabled by your administrator.
2. Click the Hugging Face tab to narrow the view down to HF AMPs only.
3. On the "Can you run it? LLM version" card, click Deploy.
4. Read through the details of the AMP and the disclosure message. You can also navigate to the HF Space's official GitHub page if you wish.
5. Click Configure & Deploy. This particular HF Space answers the question of whether a given LLM can run on a particular hardware spec.
6. On the next screen, note the environment variables that can be passed down to the project. You can leave these at their default values here.
7. Leave the rest of the settings unchanged and click Launch Project.

At this point CML kicks off the steps required to launch this Hugging Face Space, namely installing dependencies and launching an application. After the steps are completed, the AMP is fully deployed. Click Applications in the left sidebar to see the deployed gradio app.
Clicking on the app's card (Application to serve UI) will take you to the app's UI, opened in a new tab of your web browser.

What happened in the background?

Applied ML Prototypes (AMPs) are packaged projects that include execution steps that CML can understand and perform. The owner of a project defines .project-metadata.yaml in their project repository to tell CML which steps to perform (run code, schedule a job, deploy a model, etc.). In the case of Hugging Face Spaces, this metadata is injected on the fly by CML as the project is being spun up. Two steps are executed with Hugging Face Space AMPs:

1. Install the dependencies that a given HF Space requires.
2. Deploy an Application (gradio or streamlit) if one is present in the HF Space.

Once a Hugging Face AMP is launched in CML, users can treat it like any other local project: reviewing the code, making changes, breaking things, and learning as they go. The goal is to accelerate innovation in the enterprise and adapt open projects to meet the requirements of specific customer use cases.

Enable Deployment of External HF Spaces

While Hugging Face Spaces AMPs are a Tech Preview feature, a setting must be enabled in the ML Workspace to make deployment of external Spaces available to users. For this you will need the MLAdmin role in the workspace, or you can work with your workspace administrator through the following steps:

1. Inside the ML Workspace, navigate to Site Administration.
2. Go to the Settings tab.
3. In the Feature Flags section, check the box next to "Allow users to deploy external Hugging Face Space".

This setting takes effect immediately. Once it is enabled, users can not only deploy Hugging Face Spaces AMPs from the existing catalog but also point to any Hugging Face Space and start working with it as a project within CML. In Tech Preview this supports gradio and streamlit applications only.
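Since Tech Preview supports gradio and streamlit applications only, it can help to check a Space's declared SDK before pointing CML at it. The following is a minimal sketch, not CML's actual logic: it assumes the Space declares its SDK through an `sdk:` key in the YAML front matter of its README.md (the usual Hugging Face Spaces convention), and the function names `read_space_sdk` and `is_deployable_in_tech_preview` are hypothetical helpers introduced here for illustration.

```python
# Hypothetical helper names; this is an illustration, not how CML performs the check.
SUPPORTED_SDKS = {"gradio", "streamlit"}  # per the article: Tech Preview supports only these


def read_space_sdk(readme_text):
    """Return the `sdk` value from a README's YAML front matter, or None if absent."""
    lines = readme_text.splitlines()
    # Front matter must open with a '---' line at the very top of the file.
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter of the front matter
            break
        key, sep, value = stripped.partition(":")
        # Match the exact key so that e.g. `sdk_version` is not mistaken for `sdk`.
        if sep and key.strip() == "sdk":
            return value.strip().strip("'\"").lower() or None
    return None


def is_deployable_in_tech_preview(readme_text):
    """True when the declared SDK is one of the application types named in the article."""
    return read_space_sdk(readme_text) in SUPPORTED_SDKS


if __name__ == "__main__":
    sample = "---\ntitle: Demo\nsdk: gradio\nsdk_version: 4.0.0\n---\n# Demo Space\n"
    print(read_space_sdk(sample), is_deployable_in_tech_preview(sample))
```

The parser is deliberately tiny and only reads flat `key: value` lines; a real YAML library would be the safer choice for anything beyond a quick pre-check.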
Iterate Faster with CML

At Cloudera we strive to give customers options, from deployment models on-prem or in the cloud to using external or internally hosted Large Language Models. The introduction of the Hugging Face Spaces integration in CML will significantly accelerate customers' Machine Learning projects, especially those focused on Generative AI.

aakulov · ‎11-29-2023

Hello @Timo , Some WebUIs support this, but most do not. For example, Hue can have a custom banner and the steps to make that work are here: https://docs.cloudera.com/cdp-private-cloud-base/7.1.9/administering-hue/topics/hue-add-custom-web-ui-banner.html. Hope that helps, Alex

aakulov · ‎10-27-2023

Hi @AndreaCavenago , In a Data Engineering Data Hub, the difference between a Worker group and a Compute group is the presence of the HDFS service on the Worker nodes (see here). For up-scaling the node type, I would recommend starting with the Compute group, as that group is meant to be ephemeral and to come online and go offline dynamically. Do the Compute group first and then evaluate whether you really need to scale up the Worker group next. And remember that you can do the scale-up operation from the UI as well as through the CLI. With the CLI you can target a specific group with the --group <name> parameter (see here). Hope that helps. Kind regards, Alex

aakulov · ‎07-07-2023

Hi @felix_ , First, you may want to check out the troubleshooting tips for DAS. In particular, there is a reference to adding a user who wants to see all queries to the admin list under the DAS configuration in Cloudera Manager. Second, do you have Hive impersonation enabled or disabled on this cluster? If you are using Ranger for authorization, then impersonation should be turned off. Finally, I would advise your users to begin switching to Hue as the query and analysis UI. That is where all future engineering effort from Cloudera will go. Hope this points you in the right direction. - Alex

aakulov · ‎03-21-2023

Ok, and where are you installing it? On a server or a home computer? What is the size of hardware you have (cpu, memory, disk)? Is it large enough to meet the minimum hardware requirements?

aakulov · ‎03-20-2023

So, the logs are empty. What was the operation that led to this state? Was it just a routine restart? Was there an update of CDH? What version of Cloudera Manager and CDH/CDP are you on right now?

aakulov · ‎02-23-2023

I would suggest working with Cloudera support on this, as they would be best suited for analyzing logs and suggesting next steps.

aakulov · ‎02-12-2023

Hi @suryawanshinp , It looks like the error you list in point (1) may be caused by your external PostgreSQL database being configured with an IP address instead of a hostname. Please consult the documentation here: https://docs.cloudera.com/cdp-private-cloud-data-services/1.5.0/installation/topics/cdppvc-installation-external-db-setup.html Kind regards, Alex

aakulov · ‎01-25-2023

There is not enough detail here to be able to provide any kind of answer. Please open a Cloudera Support case and upload the logs to that case in order to get the best solution for the issue. Regards, Alex

aakulov · ‎05-20-2022

It looks like there are different types of AWS load balancers, and the ones that can handle TCP sessions are the Network Load Balancer (NLB) and the Classic Load Balancer. Which one are you using? I believe stickiness also works for NLB, but do validate with your AWS team.

The session timeout is two-fold:

1. LB session timeout. These are settings inside the load balancer and have nothing to do with Impala itself. The recommendation is to set this timeout to 12 hours, or 6 hours at a minimum, for both client and server timeouts.
2. Hue session timeout (idle_session_timeout). This is how long Hue will keep the connection to Impala alive. The default may be 15 minutes. If no queries run in this amount of time and you then run a new query, you'll get the error "Results have expired", and Hue will need to start a new session. Also, if your query takes longer than idle_session_timeout, you'll definitely need to increase the timeout setting. A 1-hour timeout may be appropriate here.

Regards, Alex
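The two timeouts described in this reply can be reasoned about with a small model. This is a simplified sketch of the rules of thumb stated above, not how Hue, Impala, or the load balancer evaluates sessions internally; the function names and the "too low" / "acceptable" / "recommended" labels are assumptions made for illustration.

```python
from datetime import timedelta


def results_expired(idle_gap, idle_session_timeout):
    """Model of the 'Results have expired' case: the gap between two
    queries is longer than Hue's idle_session_timeout."""
    return idle_gap > idle_session_timeout


def needs_longer_timeout(query_duration, idle_session_timeout):
    """A query that runs longer than idle_session_timeout calls for a larger value."""
    return query_duration > idle_session_timeout


def rate_lb_timeout(lb_timeout,
                    minimum=timedelta(hours=6),
                    recommended=timedelta(hours=12)):
    """Classify a load balancer timeout against the 6-hour minimum
    and the 12-hour recommendation from the reply above."""
    if lb_timeout < minimum:
        return "too low"
    if lb_timeout < recommended:
        return "acceptable"
    return "recommended"


if __name__ == "__main__":
    default_timeout = timedelta(minutes=15)  # the reply says the default *may* be 15 minutes
    print(results_expired(timedelta(minutes=20), default_timeout))
    print(results_expired(timedelta(minutes=20), timedelta(hours=1)))
    print(rate_lb_timeout(timedelta(hours=1)))
```

With a 20-minute pause between queries, the 15-minute value triggers the expired-results case in this model, while the suggested 1-hour value does not.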

Member Since	‎02-27-2020 04:13 PM
Last Visited	‎09-05-2024 02:11 AM
Posts	173
Kudos received	42

Cloudera Community

Re: Changing Colours or adding a banner to WebUIs

Re: CDP Public Cloud - Resizing of Worker/Compute ...

Re: How to collect queries submitted by other user...

Re: After CDH is configured, the agent service can start, but the server service cannot start

Re: How to increase timeout definition?

🤗 Hugging Face Spaces AMPs Accelerate ML Projects

Re: CDP data warehousing catalog creation error

Re: ERROR- HUE - Results have expired, rerun the q...