Member since: 07-11-2019
Posts: 102
Kudos Received: 4
Solutions: 9
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 24148 | 12-13-2019 12:03 PM |
| | 6068 | 12-09-2019 02:42 PM |
| | 4609 | 11-26-2019 01:21 PM |
| | 2110 | 08-27-2019 03:03 PM |
| | 4260 | 08-14-2019 07:33 PM |
09-24-2019
05:59 PM
I had only done sudo -u postgres /usr/bin/pg_ctl -D $PGDATA reload from https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-installation/content/configuring_postgresql_for_ranger.html so I think restarting the service is what helped (honestly, I did many other things, so it's hard to tell which did the trick). For others finding this, a hint that the service should have been restarted can be found in the docs here: https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-installation/content/install-postgres.html
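For reference, the reload I had run versus the restart that (I think) fixed it, as a rough sketch (the systemd unit name is my assumption and may differ by install, e.g. postgresql-9.6):
# reload only re-reads pg_hba.conf / postgresql.conf without a full restart
sudo -u postgres /usr/bin/pg_ctl -D $PGDATA reload
# full restart of the service (unit name is an assumption)
sudo systemctl restart postgresql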
09-20-2019
03:20 PM
I am attempting to install HDP 3.1.0 via Ambari 2.7.3 using an existing PostgreSQL DB, after following the docs here (https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-installation/content/configuring_postgresql_for_ranger.html) and entering the commands below:
----------
dbname=hive
postgres=postgres
user=hive
passwd=hive
echo "CREATE DATABASE $dbname;" | sudo -u $postgres psql -U postgres
echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u $postgres psql -U postgres
echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u $postgres psql -U postgres
echo "alter user $postgres superuser;" | sudo -u $postgres psql -U postgres
dbname=oozie
postgres=postgres
user=oozie
passwd=oozie
echo "CREATE DATABASE $dbname;" | sudo -u $postgres psql -U postgres
echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u $postgres psql -U postgres
echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u $postgres psql -U postgres
echo "alter user $postgres superuser;" | sudo -u $postgres psql -U postgres
dbname=ranger
postgres=postgres
user=rangeradmin
passwd=ranger
echo "CREATE DATABASE $dbname;" | sudo -u $postgres psql -U postgres
echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u $postgres psql -U postgres
echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u $postgres psql -U postgres
echo "alter user $user superuser;" | sudo -u $postgres psql -U postgres
dbname=rangerkms
postgres=postgres
user=rangerkms
passwd=ranger
echo "CREATE DATABASE $dbname;" | sudo -u $postgres psql -U postgres
echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u $postgres psql -U postgres
echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u $postgres psql -U postgres
echo "alter user $user superuser;" | sudo -u $postgres psql -U postgres
dbname=superset
postgres=postgres
user=superset
passwd=superset
echo "CREATE DATABASE $dbname;" | sudo -u $postgres psql -U postgres
echo "CREATE USER $user WITH PASSWORD '$passwd';" | sudo -u $postgres psql -U postgres
echo "GRANT ALL PRIVILEGES ON DATABASE $dbname TO $user;" | sudo -u $postgres psql -U postgres
echo "alter user $user superuser;" | sudo -u $postgres psql -U postgres
----------
This was done based on the docs here: https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/administering-ambari/content/amb_using_hive_with_postgresql.html
However, when running the connection tests in the Ambari installation phase that check the databases for the services to be installed, I get the error
Error injecting constructor, java.lang.RuntimeException: org.postgresql.util.PSQLException: FATAL: no pg_hba.conf entry for host "<some host>", user "<some service user>", database "<some service user>", SSL off
for the hive DB, and I assume it would be the same for the druid and superset DBs as well if Ambari had provided a "test connection" button for those.
My question is: what is the problem here? The docs don't seem to indicate that anything more should be done (unlike with the docs for installing ranger), so what should be done?
Currently, my thought is to do something like what was done for ranger:
[root@HW001 ~]# echo "local all postgres,rangeradmin,rangerlogger,hive,oozie,ranger,rangerkms,superset,druid trust" >> /var/lib/pgsql/data/pg_hba.conf
[root@HW001 ~]# echo "host all postgres,rangeradmin,rangerlogger,hive,oozie,ranger,rangerkms,superset,druid 0.0.0.0/0 trust" >> /var/lib/pgsql/data/pg_hba.conf
[root@HW001 ~]# echo "host all postgres,rangeradmin,rangerlogger,hive,oozie,ranger,rangerkms,superset,druid ::/0 trust" >> /var/lib/pgsql/data/pg_hba.conf
[root@HW001 ~]# cat /var/lib/pgsql/data/pg_hba.conf
# TYPE  DATABASE  USER  ADDRESS  METHOD
# "local" is for Unix domain socket connections only
local   all   postgres   peer
# IPv4 local connections:
host    all   postgres   127.0.0.1/32   ident
# IPv6 local connections:
host    all   postgres   ::1/128        ident
# Allow replication connections from localhost, by a user with the
# replication privilege.
#local   replication   postgres                  peer
#host    replication   postgres   127.0.0.1/32   ident
#host    replication   postgres   ::1/128        ident
local   all   ambari,mapred   md5
host    all   ambari,mapred   0.0.0.0/0   md5
host    all   ambari,mapred   ::/0        md5
local   all   postgres,rangeradmin,rangerlogger,hive,oozie,ranger,rangerkms,superset,druid   trust
host    all   postgres,rangeradmin,rangerlogger,hive,oozie,ranger,rangerkms,superset,druid   0.0.0.0/0   trust
host    all   postgres,rangeradmin,rangerlogger,hive,oozie,ranger,rangerkms,superset,druid   ::/0        trust
but I'm not sure if there is something else I'm missing here or something else I should be seeing that I am not. Is this the correct thing to do, or should I be doing something else?
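As a sanity check, I assume something like the following would reproduce the rejection outside of Ambari, and that PostgreSQL would need to re-read pg_hba.conf after the edit (HW001 is just standing in for the DB host):
# hypothetical manual check from another host; HW001 stands in for the DB host
psql -h HW001 -U hive -d hive
# make PostgreSQL re-read pg_hba.conf after the new entries are appended
sudo -u postgres /usr/bin/pg_ctl -D /var/lib/pgsql/data reload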
Labels:
- Apache Ambari
- Apache Hive
- Apache Ranger
08-27-2019
03:03 PM
Found the answer here: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/running-spark-applications/content/running_sample_spark_2_x_applications.html The binaries appear to be in /usr/hdp/current/spark2-client/bin. Note that the right way to refer to SPARK_HOME seems to be /usr/hdp/current/spark2-client.
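A minimal sketch of how that gets used (exporting SPARK_HOME this way is my assumption of the usual convention, not something the doc mandates):
# point SPARK_HOME at the versionless client symlink, not the bin directory
export SPARK_HOME=/usr/hdp/current/spark2-client
# the launcher scripts live under bin/
$SPARK_HOME/bin/spark-shell --version
$SPARK_HOME/bin/spark-submit --version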
08-27-2019
02:58 PM
Using HDP 3.1 and unable to run spark2 despite the clients being installed on all nodes (via Ambari), e.g.:
(venv) ➜ ~ spark
zsh: spark: command not found...
zsh: command not found: spark
(venv) ➜ ~ spark2
zsh: spark2: command not found...
zsh: command not found: spark2
Checking the filesystem, nothing seems to be related directly to any spark binaries:
(venv) ➜ ~ find / -name spark 2>&1 | grep -v "Permission denied"
/home/spark
/var/lib/smartsense/hst-agent/resources/collection-scripts/spark
/var/log/spark
/var/spool/mail/spark
/tmp/hadoop-unjar3014181574139383154/org/apache/hadoop/hive/ql/parse/spark
/tmp/hadoop-unjar3014181574139383154/org/apache/hadoop/hive/ql/optimizer/spark
/tmp/hadoop-unjar3014181574139383154/org/apache/hadoop/hive/ql/exec/spark
/tmp/hadoop-unjar3014181574139383154/org/apache/hadoop/hive/common/jsonexplain/spark
/tmp/hadoop-unjar3014181574139383154/org/apache/hive/spark
/tmp/hadoop-unjar3014181574139383154/biz/k11i/xgboost/spark
/usr/hdp/3.1.0.0-78/spark2/examples/src/main/java/org/apache/spark
/usr/hdp/3.1.0.0-78/spark2/examples/src/main/scala/org/apache/spark
/usr/hdp/3.1.0.0-78/oozie/share/lib/spark
Does anyone know where the spark binaries are on any given node?
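In case it helps, I assume a search for the actual launcher scripts (rather than directories named spark) would look something like:
# search for the spark2 launcher scripts instead of directories named "spark"
find /usr/hdp -name 'spark-submit' -o -name 'spark-shell' 2>/dev/null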
08-26-2019
06:29 PM
I am looking at the docs for installing NiFi on HDP 3.1 via a management pack, and at the list of repository locations here for HDF 3.4. I was wondering whether the HDF management pack version needs to be the same as the HDP version for a correct installation (and if not, how to tell whether the versions are compatible with each other)?
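For context, the installation step I have in mind is the usual mpack install on the Ambari server; a sketch, with the URL as a placeholder for whichever HDF 3.4.x mpack turns out to be the right one:
# install the HDF management pack on the Ambari server node (URL is a placeholder)
sudo ambari-server install-mpack \
  --mpack=http://example.com/hdf-ambari-mpack-<version>.tar.gz \
  --verbose
# restart Ambari so the new stack definitions are picked up
sudo ambari-server restart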
08-14-2019
07:33 PM
Here is the answer that I was told from discussion on the apache hadoop mailing list:
I think access time refers to the POSIX atime attribute for files, the "time of last access" as described here for instance [1]. While HDFS keeps a correct modification time (mtime), which is important, easy and cheap, it only keeps a very low-resolution sense of last access time, which is less important, and expensive to monitor and record, as described here [2] and here [3]. It doesn't even expose this low-rez atime value in the hadoop fs -stat command; you need to use Java if you want to read it from HDFS apis.
However, to have a conforming NFS api, you must present atime, and so the HDFS NFS implementation does. But first you have to configure it on. The documentation says that the default value is 3,600,000 milliseconds (1 hour), but many sites have been advised to turn it off entirely by setting it to zero, to improve HDFS overall performance. See for example here ([4], section "Don't let Reads become Writes"). So if your site has turned off atime in HDFS, you will need to turn it back on to fully enable NFS. Alternatively, you can maintain optimum efficiency by mounting NFS with the "noatime" option, as described in the document you reference.
I don't know where the nfs3 daemon log file is, but it is almost certainly on the server node where you've configured the NFS service to be served from. Log into it and check under /var/log, e.g. with find /var/log -name '*nfs3*' -print
[1] https://www.unixtutorial.org/atime-ctime-mtime-in-unix-filesystems
[2] https://issues.apache.org/jira/browse/HADOOP-1869
[3] https://superuser.com/questions/464290/why-is-cat-not-changing-the-access-time
[4] https://community.hortonworks.com/articles/43861/scaling-the-hdfs-namenode-part-4-avoiding-performa.html
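To make the "noatime" alternative concrete, I assume the mount would look something like this (the gateway hostname and mount point are placeholders; the options are the usual ones for the HDFS NFS gateway plus noatime):
# mount the HDFS NFS gateway with noatime so reads do not trigger atime updates
sudo mount -t nfs -o vers=3,proto=tcp,nolock,noatime nfsgateway-host:/ /mnt/hdfs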
07-31-2019
11:28 PM
The issue was that the --target-dir path included some variables at the start of the path and ended up looking like //some/hdfs/path, and the "empty" // was confusing sqoop.
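In other words, a hypothetical snippet like this reproduces the bad path when the leading variable is unset or empty (the variable names are made up for illustration):
# hypothetical illustration: base_dir never got set, so the leading segment is empty
base_dir=""
target_dir="/${base_dir}/some/hdfs/path"
echo "$target_dir"
# prints //some/hdfs/path, and that empty "//" segment is what confused sqoop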
07-31-2019
11:12 PM
Trying to import data from an oracle DB and getting the error
....
19/07/31 13:07:10 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-myuser/compile/375d3de163797c05cd7b480fddcfe58c/QueryResult.jar
19/07/31 13:07:10 ERROR tool.ImportTool: Import failed: org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "null"
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3281)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3301)
....
My sqoop command looks like...
sqoop import \
-Dmapreduce.map.memory.mb=3144 -Dmapreduce.map.java.opts=-Xmx1048m \
-Dyarn.app.mapreduce.am.log.level=DEBUG \
-Dmapreduce.map.log.level=DEBUG \
-Dmapreduce.reduce.log.level=DEBUG \
-Dmapred.job.name="Ora import table $tablename" \
-Djava.security.egd=file:///dev/urandom \
-Djava.security.egd=file:///dev/urandom \
-Doraoop.timestamp.string=false \
-Dmapreduce.map.max.attempts=10 \
$oracle_cnxn_str \
--as-parquetfile \
--target-dir /some/hdfs/path \
-query "$sqoop_query" \
--split-by $splitby \
--where "1=1" \
--num-mappers 12 \
--delete-target-dir
Not sure what to make of this error message. Any debugging suggestions or fixes?
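One shell-level check I assume would help in cases like this (not sqoop-specific) is tracing the command so every expanded argument is printed:
# turn on shell tracing so every expanded argument of the sqoop command is printed;
# malformed values (e.g. a target dir resolving to "//some/hdfs/path") show up directly
set -x
# ... run the sqoop import command from above here ...
set +x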
Labels:
- Apache Sqoop
07-29-2019
11:54 PM
@Jay Kumar SenSharma I understand using SSSD for cluster-wide users with LDAP, but my question has more to do with...
"Ambari in any case is not responsible for creating user/groups for those ambari UI users in any node. For example you will see "admin" user in ambari but you wont see any such user on ambari server host or on any other node."
What I was more interested in was, given the above, what is the point of these ambari users / groups? What is the context they are intended to be used in? I would think they could be used for adding ACL-like permissions to folders in the Ambari Files View or something, but that does not seem to be the case, so I'm not sure what the point of them is.
Note: I previously used MapR Hadoop, which did operate in a similar way to this (users of HDFS needed to exist across all nodes, and the MapR mgmt UI allowed ACL-like permissions on HDFS volumes based on users and groups), so that's my frame of reference.
07-29-2019
09:16 PM
Having a problem with HDFS NFS, addressed on another site, where it is recommended to set hdfs-site.xml like...
<property>
  <name>dfs.namenode.accesstime.precision</name>
  <value>3600000</value>
  <description>
    The access time for HDFS file is precise upto this value. The default value is 1 hour.
    Setting a value of 0 disables access times for HDFS.
  </description>
</property>
I am confused about what exactly "access times for HDFS" means / is. Looking at the hadoop docs, I was still not able to determine it. Could someone give a better understanding of what this is doing? Also, where is the nfs3 daemon log file?
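For what it's worth, I assume the effective value on the cluster can be checked with something like:
# print the effective value of the access-time precision setting
hdfs getconf -confKey dfs.namenode.accesstime.precision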
Labels:
- Apache Hadoop