Created 08-19-2019 07:05 AM
Hi,
With what seem to be thousands of settings for HDFS, Hive, CM, Hue, Spark, etc., I was wondering if there is a way to see the actual settings in use for every application in the CM/Hadoop suite.
I know you can see some of them in config files and some of them in CM's interface, but not every setting appears in both.
Defaults, for example, don't show up anywhere, because sometimes the code sets them. CM also doesn't cover all the parameters, since (as I've been told) it stores Safety Valve settings in a special location.
I would like to see what is resident in memory and actually being used. You may add a setting and find it doesn't take effect, either because you put it in the wrong place or because it is overridden by the same setting somewhere else.
Is there a way to actually see what is being used for any/all parameters in the suite?
Thanks!
Created 08-19-2019 07:32 AM
Hi pollard,
most services expose a "/conf" servlet in their WebUI which gives you the most complete set of actually used parameters. This should be the most promising source of truth. Impala has a similar Servlet with path "/varz".
The instance process view of Cloudera Manager shows you the actually distributed config files - which often helps a lot but does not include default values. You can reach it from a service (e.g. Impala) by clicking on "Instances" -> (the instance you want to see, e.g. Impala Catalog Server on a node) -> Processes
Regards,
Benjamin
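As a rough illustration of working with a "/conf" servlet from a script, here is a minimal Python sketch. It assumes the servlet returns the usual Hadoop-style XML (a `configuration` element holding `property` entries with `name`, `value` and, where present, `source`). The function name and the sample document are made up for illustration; the real output of your service may differ.

```python
import xml.etree.ElementTree as ET


def parse_conf_xml(xml_text):
    """Return {name: (value, source)} from a Hadoop-style /conf XML document.

    Assumption: each setting is a <property> element with <name>, <value>
    and optionally <source> children.
    """
    root = ET.fromstring(xml_text)
    settings = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is None:
            continue  # skip malformed entries without a name
        value = prop.findtext("value", default="")
        source = prop.findtext("source", default="")
        settings[name] = (value, source)
    return settings


# Hypothetical sample, shaped like the output of a /conf servlet.
SAMPLE = """<configuration>
  <property><name>dfs.replication</name><value>3</value><source>hdfs-site.xml</source></property>
  <property><name>dfs.blocksize</name><value>134217728</value><source>hdfs-default.xml</source></property>
</configuration>"""

for name, (value, source) in sorted(parse_conf_xml(SAMPLE).items()):
    print(f"{name} = {value}  (from {source})")
```

The `source` column is the interesting part here: it hints at whether a value came from a distributed config file or from a built-in default.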
Created 08-19-2019 12:24 PM
Hey Ben,
>most services expose a "/conf" servlet in their WebUI which gives you the most complete set of actually used parameters. This should be the most promising source of truth. Impala has a similar Servlet with path "/varz".
I've been looking for this /varz path for Impala and cannot find the correct path. Can you give me the port and path that it should be on? Unless it's a name that is given during setup and it's different than "/varz"???
> The instance process view of Cloudera Manager shows you the actually distributed config files - which often helps a lot but does not include default values. You can reach it from a service (e.g. Impala) by clicking on "Instances" -> (the instance you want to see, e.g. Impala Catalog Server on a node) -> Processes
I've seen this but I don't see anything that comes close to what I need. This is the list of Impala related config files:
impala-conf/fair-scheduler.xml
impala-conf/llama-site.xml
impala-conf/sentry-site.xml
impala-conf/log4j.properties
impala.keytab
impala-conf/.htpasswd
impala-conf/impalad_flags
The only thing that comes close is the impala-conf/impalad_flags file, and that has no link.
Thanks!
Created 08-20-2019 12:29 AM
Hi @pollard,
Looking at the Impala configs, you can find the /varz servlet on the debug WebUI of the Impala Statestore, Catalog Server and Daemon. When you use the default ports, these should be:
On the Impala Daemon, you also have a servlet for Hadoop vars:
Besides these servlets, Impala also prints its flags (which you were asking about in the second paragraph) during startup in the INFO logs of each service. This may help if your debug WebUIs are disabled for security reasons.
For instance, for the Impala daemon: /var/log/impalad/impalad.INFO
I0819 17:43:55.279785 18999 logging.cc:156] Flags (see also /varz are on debug webserver):
--catalog_service_port=26000
--catalog_topic_mode=full
[…]
--symbolize_stacktrace=false
--v=1
--vmodule=
For other services, the /conf servlets at WebUIs or the Cloudera Manager configs (see other reply) mostly apply.
If this or the first answer was helpful to you, please set it as accepted solution.
Regards, Benjamin
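As a small sketch of how those /varz addresses could be assembled in Python: the port numbers below are the commonly documented defaults for the Impala debug web UI (25000 for the daemon, 25010 for the statestore, 25020 for the catalog server). They are an assumption here, as are the role keys, the helper name and the example host name, so check your own cluster's settings before relying on them.

```python
# Assumed default debug web UI ports; confirm them for your cluster.
DEFAULT_DEBUG_PORTS = {
    "impalad": 25000,
    "statestored": 25010,
    "catalogd": 25020,
}


def varz_url(host, role, port=None, scheme="http"):
    """Build the /varz URL for one Impala role on one host.

    `role` is one of the keys of DEFAULT_DEBUG_PORTS (names chosen for
    this sketch); pass `port` explicitly if the default was changed.
    """
    if port is None:
        port = DEFAULT_DEBUG_PORTS[role]
    return f"{scheme}://{host}:{port}/varz"


# Hypothetical host name, for illustration only.
for role in DEFAULT_DEBUG_PORTS:
    print(varz_url("node1.example.com", role))
```

If the debug web UI is served over TLS, pass `scheme="https"`.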
Created 08-20-2019 07:44 AM
Hey Ben,
One key thing you mentioned is that the INFO logs write out the parameter settings. I'm not sure whether that includes defaults, as I have not researched it yet, but it is extremely useful. I may not have enough time to figure out why the servlets/services are not running or not available, and being able to see the parameter settings this way is more than sufficient.
For anyone needing a quick way to see the parameters separated from the rest of the log information, you can use this grep pattern I put together. It hasn't been thoroughly tested but works so far:
grep -E '^--\w+=' /var/log/impalad/impalad.INFO
This command extracted 229 parameters from our INFO log file. Sounds like a pretty complete list???
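For scripting, the same idea can be sketched in Python so that the flags end up in a dictionary rather than as raw lines. This is only a sketch: the function name is made up, and the sample text just mirrors the log excerpt quoted earlier in the thread. Flags with an empty value (such as `--vmodule=`) are kept, and if a log contains several startups, the last value seen for a flag wins.

```python
import re

# A flag line looks like "--name=value"; the value may be empty.
FLAG_LINE = re.compile(r"^--(\w+)=(.*)$")


def parse_flags(lines):
    """Return {flag_name: value} for every "--name=value" line."""
    flags = {}
    for line in lines:
        match = FLAG_LINE.match(line.rstrip("\r\n"))
        if match:
            flags[match.group(1)] = match.group(2)
    return flags


# Sample shaped like the excerpt from impalad.INFO shown above.
SAMPLE_LOG = """\
I0819 17:43:55.279785 18999 logging.cc:156] Flags (see also /varz are on debug webserver):
--catalog_service_port=26000
--catalog_topic_mode=full
--symbolize_stacktrace=false
--v=1
--vmodule=
"""

for name, value in sorted(parse_flags(SAMPLE_LOG.splitlines()).items()):
    print(f"{name}={value}")
```

To run it against a real log, pass an open file instead, e.g. `parse_flags(open("/var/log/impalad/impalad.INFO"))`.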
Created 08-19-2019 10:03 AM
Hi pollard,
Did you try using the Cloudera Manager REST API to do that?
Regards,
Bart
Created 08-20-2019 10:37 AM
Not yet but I have a good bit of experience writing Python code using the CM API. I'll look into it. If there's anything useful, it might be helpful to post my findings here...
Created 08-26-2019 08:56 AM
Just a note about the CM API:
From what I can tell, the API doesn't bring back any more parameters than what you can see in the Configuration tab for each app/service.
Being able to see them in the INFO logs was exactly what was needed. However, it would be nice to be able to use the API to get the same info that the INFO logs provide. I could see some automation opportunities in the future...
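One automation along these lines, sketched in Python under stated assumptions: given the settings you intended to apply and the settings a service actually reports (for example parsed from its INFO log or a /conf servlet), list every intended setting that did not take effect. The function name and the sample values are hypothetical.

```python
def find_not_applied(intended, effective):
    """Return {name: (intended_value, effective_value)} for each mismatch.

    A setting that is missing from `effective` is reported with None as
    its effective value.
    """
    problems = {}
    for name, want in intended.items():
        got = effective.get(name)
        if got != want:
            problems[name] = (want, got)
    return problems


# Hypothetical values, for illustration only.
intended = {"v": "1", "catalog_topic_mode": "minimal", "my_custom_flag": "x"}
effective = {"v": "1", "catalog_topic_mode": "full", "symbolize_stacktrace": "false"}

for name, (want, got) in sorted(find_not_applied(intended, effective).items()):
    print(f"{name}: intended {want!r}, effective {got!r}")
```

Only the intended settings are checked, so the many defaults a service reports do not show up as noise.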