Deploy own service on Cloudera

Explorer

Hi,

I am trying to install a custom service on Cloudera. This service requires some libraries in order to run.

I have created the CSD's SDL and control.sh files and packaged them into a jar.

My problem is that I do not understand where to put the service's libraries in Cloudera. Should they be bundled in the CSD jar itself, or should they be installed as a parcel? If as a parcel, how do I install it? I could not find much documentation on installing or deploying a local parcel in Cloudera Manager.

Please help.

3 ACCEPTED SOLUTIONS

Moderator

Hello @NumeroUnoNU,

I ran the "alternatives --list" command on a cluster node and noticed a "hadoop-conf" item, which points to the directory that contains hdfs-site.xml.

You can also discover it with: "/usr/sbin/alternatives --display hadoop-conf".

This led me to search for "/var/lib/alternatives/hadoop-conf", and I found this Community Article reply, which I believe answers your question.

In short, if you have e.g. gateway roles deployed for HDFS on a node, you will find the up-to-date hdfs-site.xml in the /etc/hadoop/conf folder...
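
As a rough illustration of the lookup described above, the sketch below follows the symlinks behind the configuration directory and checks that hdfs-site.xml is present. The default path /etc/hadoop/conf comes from this reply; the helper name find_client_conf is hypothetical, and it assumes a gateway role (or another HDFS role) is actually deployed on the node.

```python
import os


def find_client_conf(conf_dir="/etc/hadoop/conf", filename="hdfs-site.xml"):
    """Return the resolved path of `filename` inside `conf_dir`, or None.

    `conf_dir` is typically a symlink managed by `alternatives`, so it is
    resolved to the real directory first.
    """
    real_dir = os.path.realpath(conf_dir)
    candidate = os.path.join(real_dir, filename)
    return candidate if os.path.isfile(candidate) else None


# Prints a path on a node with the client config deployed, otherwise None.
print(find_client_conf())
```

A control script could call such a helper and fail early with a clear message when it returns None.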

 

We have diverged a little from the original topic of this thread. To make the conversation easier for future visitors to read, would you mind opening a new thread for each major topic, please?

If the above information helped you, please let us know by pressing the "Accept as Solution" button.

Best regards:

Ferenc


Ferenc Erdelyi, Technical Solutions Manager

Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.


11 REPLIES

Explorer

Any help here please?

Moderator

Hello @NumeroUnoNU,

Thank you for your question about how to build a CSD. Have you seen the GitHub repo [1], along with its instructions on how to build and deploy your custom solution?

Please note the side menu on the right, which links to the individual instructions, e.g. how to build a parcel and how to create control scripts.

Please let us know whether you found the information you were looking for in our GitHub repo.

Thank you:

Ferenc

[1] https://github.com/cloudera/cm_ext/wiki/CSD-Overview


Explorer

Thanks Bender, I will take a look.

Explorer

This works out. Thanks.

One more question: is there an environment variable that I can use in my control script to get the WebHDFS URL?

Moderator

Hello @NumeroUnoNU,

Thank you for confirming that the GitHub repo answered your question.

Regarding WebHDFS, once it is enabled I would read the hdfs-site.xml config file to get the addresses of the NameNodes and DataNodes. The Apache Hadoop WebHDFS documentation describes in more detail how the URIs are composed.
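
To make this concrete, here is a minimal sketch that reads the NameNode HTTP address from an hdfs-site.xml document and builds a WebHDFS URL from it. It assumes the address is stored under the standard dfs.namenode.http-address property and that WebHDFS is served over plain HTTP using the /webhdfs/v1/<path>?op=... layout from the Apache documentation. The host name, port, and helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def load_hadoop_conf(xml_text):
    """Return the <property> name/value pairs of a Hadoop *-site.xml as a dict."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            conf[name.strip()] = (prop.findtext("value") or "").strip()
    return conf


def webhdfs_url(http_address, path, op="LISTSTATUS"):
    """Compose a WebHDFS REST URL for `path` on the given host:port."""
    return "http://{}/webhdfs/v1/{}?op={}".format(http_address, path.lstrip("/"), op)


# Hypothetical sample content; a real script would read the deployed file instead.
SAMPLE = """<configuration>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>nn1.example.com:9870</value>
  </property>
</configuration>"""

conf = load_hadoop_conf(SAMPLE)
print(webhdfs_url(conf["dfs.namenode.http-address"], "/tmp"))
# prints http://nn1.example.com:9870/webhdfs/v1/tmp?op=LISTSTATUS
```

On a cluster with TLS or NameNode high availability the property names and URL scheme differ, so treat this only as a starting point.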

 

Please let me know if it addresses your enquiry.

 

Kind regards:
Ferenc

 


Explorer

Hi Bender,

 

The scenario is that I have deployed my service on Cloudera. How will it access the hdfs-site.xml file? Are you saying that this file should be bundled with my parcel?

What if the HDFS NameNode host changes?

Moderator

Hello @NumeroUnoNU,

Cloudera Manager takes care of the client configuration files [1]. It makes sure that the latest configuration is deployed to every node where the related service is deployed or where a gateway role for that service is configured.

On a node where e.g. a DataNode role is running, you will find the client configs under this folder: /var/run/cloudera-scm-agent/process/[largest number]...[Service name].../

The up-to-date configs are always in the folder whose name starts with the largest number.
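
The "largest number" rule can be sketched as follows. The directory names used in the demo (a number, a dash, then the service and role) are an assumption made for illustration, as is the helper name latest_process_dir. The numbers are compared as integers because a plain string sort would rank "98" above "120".

```python
import os
import re
import tempfile


def latest_process_dir(base, service_hint):
    """Return the sub-directory of `base` with the largest numeric prefix
    whose name contains `service_hint`, or None if nothing matches."""
    best = None
    for name in os.listdir(base):
        match = re.match(r"(\d+)-", name)
        if not match or service_hint not in name:
            continue
        if not os.path.isdir(os.path.join(base, name)):
            continue
        number = int(match.group(1))
        if best is None or number > best[0]:
            best = (number, name)
    return os.path.join(base, best[1]) if best else None


# Demo against a throwaway directory instead of the real agent folder.
demo_base = tempfile.mkdtemp()
for demo_name in ("98-hdfs-DATANODE", "120-hdfs-DATANODE", "121-yarn-NODEMANAGER"):
    os.mkdir(os.path.join(demo_base, demo_name))

print(os.path.basename(latest_process_dir(demo_base, "hdfs")))
# prints 120-hdfs-DATANODE
```

On a real node the base directory would be the process folder named in the reply above.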

 

Hope this helps!

 

Kind regards:
Ferenc

 

 

[1] https://docs.cloudera.com/documentation/enterprise/5-16-x/topics/cm_mc_client_config.html


Explorer

Hi Ferenc,

This is a very basic question, so it might sound silly. For my client to use webhdfs://, it needs to know the HDFS host:port. From your explanation, does the client need to parse hdfs-site.xml to get the URL?

Or should it just include hadoop.home.dir? I think the client will still need the HDFS URL in that case.

Moderator

Hello @NumeroUnoNU,

Yes, you either parse the contents of hdfs-site.xml yourself or use the HDFS Client, in which case you do not need to worry about the implementation details. I quickly searched for an explanation of what the HDFS Client is [1]. If you go for parsing, make sure you do not reference a single NameNode (NN) directly; otherwise, your script has to be prepared to handle a NameNode failover.
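
For the parsing route, one way to avoid hard-coding a single NameNode is to collect every NameNode HTTP address and let the script try each in turn. The sketch below assumes the usual HA property names (dfs.nameservices, dfs.ha.namenodes.<nameservice>, dfs.namenode.http-address.<nameservice>.<namenode>) and takes an already-parsed dict of properties. The function name and sample hosts are hypothetical, and probing which NameNode is currently active is left out.

```python
def _split_csv(value):
    """Split a comma-separated config value into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def namenode_http_addresses(conf):
    """List NameNode HTTP addresses from a dict of hdfs-site.xml properties.

    Returns one address per NameNode for HA setups and falls back to the
    single non-HA address when no HA entries are found.
    """
    addresses = []
    for ns in _split_csv(conf.get("dfs.nameservices", "")):
        for nn in _split_csv(conf.get("dfs.ha.namenodes." + ns, "")):
            addr = conf.get("dfs.namenode.http-address.{}.{}".format(ns, nn))
            if addr:
                addresses.append(addr)
    if not addresses and conf.get("dfs.namenode.http-address"):
        addresses.append(conf["dfs.namenode.http-address"])
    return addresses


# Hypothetical HA configuration with two NameNodes.
ha_conf = {
    "dfs.nameservices": "ns1",
    "dfs.ha.namenodes.ns1": "nn1,nn2",
    "dfs.namenode.http-address.ns1.nn1": "nn1.example.com:9870",
    "dfs.namenode.http-address.ns1.nn2": "nn2.example.com:9870",
}
print(namenode_http_addresses(ha_conf))
# prints ['nn1.example.com:9870', 'nn2.example.com:9870']
```

A caller would then send its WebHDFS request to each address until one succeeds, which keeps the script working after a failover. Using the HDFS Client avoids this bookkeeping altogether.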

 

Kind regards:

Ferenc

 

[1] https://stackoverflow.com/questions/43221993/what-does-client-exactly-mean-for-hadoop-hdfs

