Member since
07-11-2019
102
Posts
4
Kudos Received
9
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 18200 | 12-13-2019 12:03 PM
 | 4273 | 12-09-2019 02:42 PM
 | 3127 | 11-26-2019 01:21 PM
 | 1432 | 08-27-2019 03:03 PM
 | 2729 | 08-14-2019 07:33 PM
11-25-2019
03:31 PM
Attempting to add a client node to the cluster via Ambari (v2.7.3.0, HDP 3.1.0.0-78) and seeing an odd error
stderr:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 38, in <module>
BeforeAnyHook().execute()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
method(env)
File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 31, in hook
setup_users()
File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/shared_initialization.py", line 51, in setup_users
fetch_nonlocal_groups = params.fetch_nonlocal_groups,
File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
self.env.run()
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/ambari-agent/lib/resource_management/core/providers/accounts.py", line 90, in action_create
shell.checked_call(command, sudo=True)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
result = function(command, **kwargs)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
2019-11-25 13:07:58,000 - Reporting component version failed
Traceback (most recent call last):
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 363, in execute
self.save_component_version_to_structured_out(self.command_name)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 223, in save_component_version_to_structured_out
stack_select_package_name = stack_select.get_package_name()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 109, in get_package_name
package = get_packages(PACKAGE_SCOPE_STACK_SELECT, service_name, component_name)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 223, in get_packages
supported_packages = get_supported_packages()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages
raise Fail("Unable to query for supported packages using {0}".format(stack_selector_path))
Fail: Unable to query for supported packages using /usr/bin/hdp-select
stdout:
2019-11-25 13:07:57,644 - Stack Feature Version Info: Cluster Stack=3.1, Command Stack=None, Command Version=None -> 3.1
2019-11-25 13:07:57,651 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2019-11-25 13:07:57,652 - Group['livy'] {}
2019-11-25 13:07:57,654 - Group['spark'] {}
2019-11-25 13:07:57,654 - Group['ranger'] {}
2019-11-25 13:07:57,654 - Group['hdfs'] {}
2019-11-25 13:07:57,654 - Group['zeppelin'] {}
2019-11-25 13:07:57,655 - Group['hadoop'] {}
2019-11-25 13:07:57,655 - Group['users'] {}
2019-11-25 13:07:57,656 - User['yarn-ats'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-11-25 13:07:57,658 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-11-25 13:07:57,659 - Modifying user hive
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
2019-11-25 13:07:57,971 - The repository with version 3.1.0.0-78 for this command has been marked as resolved. It will be used to report the version of the component which was installed
2019-11-25 13:07:58,000 - Reporting component version failed
Traceback (most recent call last):
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 363, in execute
self.save_component_version_to_structured_out(self.command_name)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 223, in save_component_version_to_structured_out
stack_select_package_name = stack_select.get_package_name()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 109, in get_package_name
package = get_packages(PACKAGE_SCOPE_STACK_SELECT, service_name, component_name)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 223, in get_packages
supported_packages = get_supported_packages()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages
raise Fail("Unable to query for supported packages using {0}".format(stack_selector_path))
Fail: Unable to query for supported packages using /usr/bin/hdp-select
Command failed after 1 tries
The problem appears to be
resource_management.core.exceptions.ExecutionFailed: Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd
caused by
2019-11-25 13:07:57,659 - Modifying user hive
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
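(Sanity-checking that error: usermod returning 6 means the named user or group doesn't exist, and usermod only consults local files. Something like the following, run on the failing node, should show whether 'hive' exists locally or only via a directory service:)
grep '^hive:' /etc/passwd    # usermod only looks here
getent passwd hive           # would also catch LDAP/SSSD-provided users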
Though, when running
[root@HW001 .ssh]# /usr/bin/hdp-select versions
3.1.0.0-78
from the ambari server node, I can see the command runs.
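(Though since the Fail was raised on the client node being added, the same check presumably needs to run there, e.g.:)
[root@client001~]# ls -l /usr/bin/hdp-select    # is the selector even installed on the new node yet?
[root@client001~]# /usr/bin/hdp-select versions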
Looking at what the hook script is trying to run/access, I see
[root@client001~]# ls -lha /var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py
-rw-r--r-- 1 root root 1.2K Nov 25 10:51 /var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py
[root@client001~]# ls -lha /var/lib/ambari-agent/data/command-632.json
-rw------- 1 root root 545K Nov 25 13:07 /var/lib/ambari-agent/data/command-632.json
[root@client001~]# ls -lha /var/lib/ambari-agent/cache/stack-hooks/before-ANY
total 0
drwxr-xr-x 4 root root  34 Nov 25 10:51 .
drwxr-xr-x 8 root root 147 Nov 25 10:51 ..
drwxr-xr-x 2 root root  34 Nov 25 10:51 files
drwxr-xr-x 2 root root 188 Nov 25 10:51 scripts
[root@client001~]# ls -lha /var/lib/ambari-agent/data/structured-out-632.json
ls: cannot access /var/lib/ambari-agent/data/structured-out-632.json: No such file or directory
[root@client001~]# ls -lha /var/lib/ambari-agent/tmp
total 96K
drwxrwxrwt  3 root root 4.0K Nov 25 13:06 .
drwxr-xr-x 10 root root  267 Nov 25 10:50 ..
drwxr-xr-x  6 root root 4.0K Nov 25 13:06 ambari_commons
-rwx------  1 root root 1.4K Nov 25 13:06 ambari-sudo.sh
-rwxr-xr-x  1 root root 1.6K Nov 25 13:06 create-python-wrap.sh
-rwxr-xr-x  1 root root 1.6K Nov 25 10:50 os_check_type1574715018.py
-rwxr-xr-x  1 root root 1.6K Nov 25 11:12 os_check_type1574716360.py
-rwxr-xr-x  1 root root 1.6K Nov 25 11:29 os_check_type1574717391.py
-rwxr-xr-x  1 root root 1.6K Nov 25 13:06 os_check_type1574723161.py
-rwxr-xr-x  1 root root  16K Nov 25 10:50 setupAgent1574715020.py
-rwxr-xr-x  1 root root  16K Nov 25 11:12 setupAgent1574716361.py
-rwxr-xr-x  1 root root  16K Nov 25 11:29 setupAgent1574717392.py
-rwxr-xr-x  1 root root  16K Nov 25 13:06 setupAgent1574723163.py
Notice the "ls: cannot access /var/lib/ambari-agent/data/structured-out-632.json: No such file or directory" line above. Not sure if this is normal, though.
Anyone know what could be causing this, or have any debugging hints from this point?
Labels:
- Apache Ambari
11-01-2019
03:03 PM
Is there a difference between installing Ambari via the Apache docs vs the Hortonworks docs? I assume that the end result is exactly the same, since Hortonworks labels the distribution in its docs as "Apache" and the repo it instructs you to add uses apache.org as the package URL:

[root@HW001 ~]# yum info ambari-server
Installed Packages
Name        : ambari-server
Arch        : x86_64
Version     : 2.7.3.0
Release     : 139
Size        : 418 M
Repo        : installed
From repo   : ambari-2.7.3.0
Summary     : Ambari Server
URL         : http://www.apache.org
License     : (c) Apache Software Foundation
Description : Maven Recipe: RPM Package.

However, the installation instructions that the Ambari project site links to are different and only involve building from source via Maven (they seem to nowhere mention installing via a package manager), which gives me pause as to whether these are exactly the same. Could anyone with more experience here explain this a bit more? Is the underlying code any different when getting the same Ambari version from Hortonworks vs building from source via the Apache docs? I ask because of the difference in installation method despite both being marketed as "Apache".
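(For reference, one way to confirm exactly which repo definition the package came from; the repo file name below is a guess and may differ on your system:)
[root@HW001 ~]# cat /etc/yum.repos.d/ambari.repo
[root@HW001 ~]# yum repolist -v | grep -i -A 3 ambari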
Labels:
- Apache Ambari
09-24-2019
05:59 PM
Had only done sudo -u postgres /usr/bin/pg_ctl -D $PGDATA reload from https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-installation/content/configuring_postgresql_for_ranger.html, so I think restarting the service is what helped (honestly, I did many other things as well, so it's hard to tell which did the trick). For others finding this, a hint that the service should have been restarted can be found in the docs here: https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-installation/content/install-postgres.html
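(For context, the difference as I understand it, assuming a systemd-managed postgresql service:)
sudo -u postgres /usr/bin/pg_ctl -D $PGDATA reload   # re-reads config such as pg_hba.conf without dropping connections
sudo systemctl restart postgresql                    # full restart of the server processes; what seems to have helped here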
09-20-2019
03:20 PM
Attempting to install HDP 3.1.0 via Ambari 2.7.3 using an existing postgresql DB after following the docs here (https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-installation/content/configuring_postgresql_for_ranger.html) and entering the commands below:
----------
dbname=hive
postgres=postgres
user=hive
passwd=hive
echo "CREATE DATABASE $dbname;" | sudo -u $postgres psql -U postgres
echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u $postgres psql -U postgres
echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u $postgres psql -U postgres
echo "alter user $postgres superuser;" | sudo -u $postgres psql -U postgres
dbname=oozie
postgres=postgres
user=oozie
passwd=oozie
echo "CREATE DATABASE $dbname;" | sudo -u $postgres psql -U postgres
echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u $postgres psql -U postgres
echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u $postgres psql -U postgres
echo "alter user $postgres superuser;" | sudo -u $postgres psql -U postgres
dbname=ranger
postgres=postgres
user=rangeradmin
passwd=ranger
echo "CREATE DATABASE $dbname;" | sudo -u $postgres psql -U postgres
echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u $postgres psql -U postgres
echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u $postgres psql -U postgres
echo "alter user $user superuser;" | sudo -u $postgres psql -U postgres
dbname=rangerkms
postgres=postgres
user=rangerkms
passwd=ranger
echo "CREATE DATABASE $dbname;" | sudo -u $postgres psql -U postgres
echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u $postgres psql -U postgres
echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u $postgres psql -U postgres
echo "alter user $user superuser;" | sudo -u $postgres psql -U postgres
dbname=superset
postgres=postgres
user=superset
passwd=superset
echo "CREATE DATABASE $dbname;" | sudo -u $postgres psql -U postgres
echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u $postgres psql -U postgres
echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u $postgres psql -U postgres
echo "alter user $user superuser;" | sudo -u $postgres psql -U postgres
----------
This was done based on the docs here: https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/administering-ambari/content/amb_using_hive_with_postgresql.html
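(As an aside, the five command blocks above all follow the same pattern and could be factored into a loop like the sketch below; note that the original first two blocks ran the final ALTER on the postgres superuser rather than the service user.)
for spec in hive:hive:hive oozie:oozie:oozie ranger:rangeradmin:ranger rangerkms:rangerkms:ranger superset:superset:superset; do
  IFS=: read -r dbname user passwd <<< "$spec"    # fields are db:user:password
  echo "CREATE DATABASE $dbname;" | sudo -u postgres psql -U postgres
  echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u postgres psql -U postgres
  echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u postgres psql -U postgres
  echo "ALTER USER $user SUPERUSER;" | sudo -u postgres psql -U postgres
done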
However, when doing the connection tests in the Ambari installation phase that check the databases for the services to be installed, I get the error
Error injecting constructor, java.lang.RuntimeException: org.postgresql.util.PSQLException: FATAL: no pg_hba.conf entry for host "<some host>", user "<some service user>", database "<some service user>", SSL off
for the hive DB, and I assume it would be the same for the druid and superset DBs as well if Ambari had provided a "test connection" button for those.
My question is: what is the problem here? The docs don't seem to indicate that anything more should be done (unlike with the docs for installing ranger), so what should be done?
Currently, my thought is to do something like what was done for ranger:
[root@HW001 ~]# echo "local all postgres,rangeradmin,rangerlogger,hive,oozie,ranger,rangerkms,superset,druid trust" >> /var/lib/pgsql/data/pg_hba.conf
[root@HW001 ~]# echo "host all postgres,rangeradmin,rangerlogger,hive,oozie,ranger,rangerkms,superset,druid 0.0.0.0/0 trust" >> /var/lib/pgsql/data/pg_hba.conf
[root@HW001 ~]# echo "host all postgres,rangeradmin,rangerlogger,hive,oozie,ranger,rangerkms,superset,druid ::/0 trust" >> /var/lib/pgsql/data/pg_hba.conf
[root@HW001 ~]# cat /var/lib/pgsql/data/pg_hba.conf
# TYPE  DATABASE   USER      ADDRESS       METHOD

# "local" is for Unix domain socket connections only
local   all        postgres                peer
# IPv4 local connections:
host    all        postgres  127.0.0.1/32  ident
# IPv6 local connections:
host    all        postgres  ::1/128       ident
# Allow replication connections from localhost, by a user with the
# replication privilege.
#local   replication  postgres                peer
#host    replication  postgres  127.0.0.1/32  ident
#host    replication  postgres  ::1/128       ident

local all ambari,mapred md5
host all ambari,mapred 0.0.0.0/0 md5
host all ambari,mapred ::/0 md5
local all postgres,rangeradmin,rangerlogger,hive,oozie,ranger,rangerkms,superset,druid trust
host all postgres,rangeradmin,rangerlogger,hive,oozie,ranger,rangerkms,superset,druid 0.0.0.0/0 trust
host all postgres,rangeradmin,rangerlogger,hive,oozie,ranger,rangerkms,superset,druid ::/0 trust
but I'm not sure if there is something else I'm missing here, or something else I should be seeing that I'm not. Is this the correct thing to do, or is there a better approach?
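(If trust seems too permissive, a variant of the same idea would use md5 password auth like the existing ambari,mapred lines, followed by a reload, which is enough for pg_hba.conf changes:)
echo "host all hive,oozie,rangeradmin,rangerkms,superset 0.0.0.0/0 md5" >> /var/lib/pgsql/data/pg_hba.conf
sudo -u postgres /usr/bin/pg_ctl -D /var/lib/pgsql/data reload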
Labels:
- Apache Ambari
- Apache Hive
- Apache Ranger
08-27-2019
03:03 PM
Found the answer here: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/running-spark-applications/content/running_sample_spark_2_x_applications.html
The binaries appear to be in /usr/hdp/current/spark2-client/bin. Note, though, that the right way to set SPARK_HOME seems to be /usr/hdp/current/spark2-client.
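So something like the following should work (untested sketch):
export SPARK_HOME=/usr/hdp/current/spark2-client
export PATH="$SPARK_HOME/bin:$PATH"
spark-submit --version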
08-27-2019
02:58 PM
Using HDP 3.1 and unable to run spark2 despite the clients being installed on all nodes (via Ambari), e.g.
(venv) ➜ ~ spark
zsh: spark: command not found...
zsh: command not found: spark
(venv) ➜ ~ spark2
zsh: spark2: command not found...
zsh: command not found: spark2
Checking the filesystem, nothing seems to be related directly to any spark binaries:
(venv) ➜ ~ find / -name spark 2>&1 | grep -v "Permission denied"
/home/spark
/var/lib/smartsense/hst-agent/resources/collection-scripts/spark
/var/log/spark
/var/spool/mail/spark
/tmp/hadoop-unjar3014181574139383154/org/apache/hadoop/hive/ql/parse/spark
/tmp/hadoop-unjar3014181574139383154/org/apache/hadoop/hive/ql/optimizer/spark
/tmp/hadoop-unjar3014181574139383154/org/apache/hadoop/hive/ql/exec/spark
/tmp/hadoop-unjar3014181574139383154/org/apache/hadoop/hive/common/jsonexplain/spark
/tmp/hadoop-unjar3014181574139383154/org/apache/hive/spark
/tmp/hadoop-unjar3014181574139383154/biz/k11i/xgboost/spark
/usr/hdp/3.1.0.0-78/spark2/examples/src/main/java/org/apache/spark
/usr/hdp/3.1.0.0-78/spark2/examples/src/main/scala/org/apache/spark
/usr/hdp/3.1.0.0-78/oozie/share/lib/spark
Anyone know where the spark binaries are on any given node?
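(Perhaps a better search would target the launcher scripts themselves rather than directories named spark:)
find / \( -name spark-submit -o -name spark-shell \) 2>/dev/null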
08-26-2019
06:54 PM
Problem appears to be related to "How to properly change uid for HDP / ambari-created user?" and the fact that having a user exist on a node, with an hdfs://user/<username> directory with correct permissions (as I was led to believe from a Hortonworks forum post), is not sufficient for the user to be acknowledged as "existing" on the cluster. Running the hadoop jar command as a different user (in this case, the Ambari-created hdfs user) that exists on all cluster nodes (even though Ambari created this user with different uids across nodes; IDK if this is a problem) and has an hdfs://user/hdfs dir, I found that the h2o jar ran as expected. Will look into this a bit more before posting as an answer. Basically, I will need to find more clarification as to when HDP considers a user to "exist" on a cluster.
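(To check the uid mismatch directly, a quick loop like this shows whether a user's uid agrees across nodes; the hostnames are placeholders:)
for h in HW001 HW002 HW003; do
  ssh "$h" 'echo -n "$(hostname): "; id hdfs'   # compare uid= across nodes
done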
08-26-2019
06:29 PM
Looking at the docs for installing NiFi on HDP 3.1 via management pack, and looking at the list of repository locations here for HDF 3.4. Was wondering if the HDF management pack version needs to be the same as the HDP version for correct installation (and if not, how to tell whether the versions are compatible with each other)?
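(For reference, the management pack install step from the HDF docs looks roughly like this; the tarball URL is a placeholder to fill in from the HDF 3.4 repository list linked above:)
ambari-server install-mpack \
  --mpack=<hdf-ambari-mpack tar.gz URL from the HDF 3.4 repository list> \
  --verbose
ambari-server restart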
08-15-2019
02:40 AM
@Geoffrey Shelton Okot What about if the cluster is not using kerberos (eg. hadoop.security.authentication=local)?
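(For reference, the effective setting can be read on any node that has the hadoop client configs in place:)
hdfs getconf -confKey hadoop.security.authentication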
08-14-2019
07:33 PM
Here is the answer that I was told from discussion on the apache hadoop mailing list:

I think access time refers to the POSIX atime attribute for files, the "time of last access" as described here for instance [1]. While HDFS keeps a correct modification time (mtime), which is important, easy, and cheap, it only keeps a very low-resolution sense of last access time, which is less important, and expensive to monitor and record, as described here [2] and here [3]. It doesn't even expose this low-rez atime value in the hadoop fs -stat command; you need to use Java if you want to read it from HDFS apis.

However, to have a conforming NFS api, you must present atime, and so the HDFS NFS implementation does. But first you have to configure it on. The documentation says that the default value is 3,600,000 milliseconds (1 hour), but many sites have been advised to turn it off entirely by setting it to zero, to improve HDFS overall performance. See for example here ([4], section "Don't let Reads become Writes"). So if your site has turned off atime in HDFS, you will need to turn it back on to fully enable NFS. Alternatively, you can maintain optimum efficiency by mounting NFS with the "noatime" option, as described in the document you reference.

I don't know where the nfs3 daemon log file is, but it is almost certainly on the server node where you've configured the NFS service to be served from. Log into it and check under /var/log, e.g. with find /var/log -name '*nfs3*' -print

[1] https://www.unixtutorial.org/atime-ctime-mtime-in-unix-filesystems
[2] https://issues.apache.org/jira/browse/HADOOP-1869
[3] https://superuser.com/questions/464290/why-is-cat-not-changing-the-access-time
[4] https://community.hortonworks.com/articles/43861/scaling-the-hdfs-namenode-part-4-avoiding-performa.html
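(Summarizing the actionable part, with the property name from the HDFS docs and a placeholder gateway host:)
# hdfs-site.xml: 0 turns atime tracking off; 3600000 ms (1 hour) is the documented default
#   <property>
#     <name>dfs.namenode.accesstime.precision</name>
#     <value>3600000</value>
#   </property>
# alternatively, leave atime off in HDFS and mount the NFS gateway with noatime:
mount -t nfs -o vers=3,proto=tcp,nolock,noatime <nfs_gateway_host>:/ /mnt/hdfs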