Created 06-12-2020 06:45 AM
Hi,
I am trying to install a custom service on Cloudera. This service requires some libraries to run.
I have created the CSD's sdl and control.sh files and packaged them into a jar.
The issue is that I am not sure where to put the service libraries in Cloudera. Should they be bundled in the CSD jar itself, or should they be installed as a parcel? If as a parcel, how do I install it? I do not see much documentation on installing/deploying a local parcel on Cloudera Manager.
Please help.
Created 06-25-2020 06:03 AM
Hello @NumeroUnoNU ,
I've run the "alternatives --list" command on a cluster node and noticed that there is a "hadoop-conf" item, which points to a directory containing hdfs-site.xml.
You can also discover it with: "/usr/sbin/alternatives --display hadoop-conf".
This led me to search for "/var/lib/alternatives/hadoop-conf", and I found this Community Article reply, which I believe answers your question.
In short, if you have e.g. gateway roles deployed for HDFS on a node, you will find the up-to-date hdfs-site.xml in the /etc/hadoop/conf folder.
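For illustration, here is a minimal sketch of how a control script could pick up that location; the awk filter on the alternatives output and the /etc/hadoop/conf fallback are assumptions and may need adjusting for your cluster:

# Resolve the directory the hadoop-conf alternative currently points to
HADOOP_CONF_DIR=$(/usr/sbin/alternatives --display hadoop-conf | awk '/link currently points to/ {print $NF}')
# Fall back to the conventional client config location if nothing was found
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/etc/hadoop/conf}
echo "Using ${HADOOP_CONF_DIR}/hdfs-site.xml"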
We have diverged a little from the original topic of this thread. To make the conversation easier to read for future visitors, would you mind opening a new thread for each major topic, please?
Please let us know if the above information helped you by pressing the "Accept as Solution" button.
Best regards:
Ferenc
Created 06-14-2020 08:04 AM
Any help here please?
Created on 06-15-2020 01:35 AM - edited 06-15-2020 07:19 AM
Hello @NumeroUnoNU ,
thank you for raising your enquiry about how to build a CSD. Have you seen the GitHub repo [1] with the instructions on how to build and deploy your custom solution?
Please note the side menu on the right, which links to instructions on, for example, how to build a parcel and how to create control scripts.
Please let us know whether you found the information you were looking for on our GitHub repo!
Thank you:
Ferenc
[1] https://github.com/cloudera/cm_ext/wiki/CSD-Overview
Created 06-15-2020 06:42 AM
Thanks Bender, I will take a look.
Created 06-20-2020 01:17 AM
This works out. Thanks.
One more question: is there an environment variable that I can use in my control script to get the WebHDFS URL?
Created 06-22-2020 02:30 AM
Hello @NumeroUnoNU ,
thank you for confirming that the GitHub repo covers your enquiries.
Regarding WebHDFS, I would use the hdfs-site.xml config file to get the URLs of the NameNodes and DataNodes after you've enabled WebHDFS. The Apache Hadoop WebHDFS documentation describes further how the URIs are composed.
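As a minimal sketch (assuming a non-HA, non-TLS cluster), a control script could look up the NameNode HTTP address and compose a WebHDFS request like this; the /tmp path is just an example:

# Ask the Hadoop client for the NameNode HTTP address (host:port)
NN_HTTP=$(hdfs getconf -confKey dfs.namenode.http-address)
# WebHDFS URIs have the form http://<HOST>:<HTTP_PORT>/webhdfs/v1/<PATH>?op=...
curl -s "http://${NN_HTTP}/webhdfs/v1/tmp?op=LISTSTATUS"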
Please let me know if it addresses your enquiry.
Kind regards:
Ferenc
Created 06-24-2020 06:17 AM
Hi Bender,
The scenario here is that I have deployed my service on Cloudera. Now, how will it access the hdfs-site.xml file? Are you saying that this file should be bundled with my parcel?
What if the HDFS NameNode host changes?
Created 06-24-2020 06:45 AM
Hello @NumeroUnoNU ,
Cloudera Manager takes care of the client configuration files [1]. It makes sure that the latest configurations are deployed to all nodes where the related services are deployed or where gateway roles for that service are configured.
On a node where e.g. a DataNode role is running, you will find the client configs under this folder: /var/run/cloudera-scm-agent/process/[largest number]...[Service name].../
The up-to-date configs are always in the folder whose name starts with the largest number.
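For example, a script could pick the most recent process directory like this; the "*-hdfs-DATANODE" pattern is only an assumption and depends on which role's configs you want:

# List the role's process directories in natural (numeric) order and take the last one
LATEST_DIR=$(ls -vd /var/run/cloudera-scm-agent/process/*-hdfs-DATANODE 2>/dev/null | tail -1)
echo "Latest hdfs-site.xml: ${LATEST_DIR}/hdfs-site.xml"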
Hope this helps!
Kind regards:
Ferenc
[1] https://docs.cloudera.com/documentation/enterprise/5-16-x/topics/cm_mc_client_config.html
Created 06-24-2020 06:59 AM
Hi Ferenc,
A very basic question; I might sound silly. For my client to use webhdfs://, it needs to know the HDFS host:port. From your explanation, does the client need to parse hdfs-site.xml to get the URL?
Or should it just include hadoop.home.dir? But I think in that case too, the client will need the HDFS URL.
Created 06-24-2020 08:55 AM
Hello @NumeroUnoNU ,
yes, you either parse the contents of hdfs-site.xml or you use the HDFS client, so that you do not need to worry about implementation details. I've found an explanation of what the HDFS client is [1]. If you go for the parsing approach, make sure you are not referencing a specific NameNode host directly; otherwise your script has to handle a NameNode failover itself.
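As a rough sketch of the "let the client resolve it" approach, you could query the configuration with hdfs getconf instead of parsing the XML yourself; the property names below assume an HA-enabled cluster and are illustrative only:

# Resolve the HA nameservice and list the HTTP address of each NameNode
NAMESERVICE=$(hdfs getconf -confKey dfs.nameservices)
NN_IDS=$(hdfs getconf -confKey "dfs.ha.namenodes.${NAMESERVICE}")
for ID in ${NN_IDS//,/ }; do
  hdfs getconf -confKey "dfs.namenode.http-address.${NAMESERVICE}.${ID}"
done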
Kind regards:
Ferenc
[1] https://stackoverflow.com/questions/43221993/what-does-client-exactly-mean-for-hadoop-hdfs