Member since: 07-11-2019
Posts: 102
Kudos Received: 4
Solutions: 9
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 17694 | 12-13-2019 12:03 PM
 | 4136 | 12-09-2019 02:42 PM
 | 3001 | 11-26-2019 01:21 PM
 | 1389 | 08-27-2019 03:03 PM
 | 2605 | 08-14-2019 07:33 PM
08-27-2019
03:03 PM
Found the answer here: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/running-spark-applications/content/running_sample_spark_2_x_applications.html The binaries appear to be in /usr/hdp/current/spark2-client/bin, though note that the right way to set SPARK_HOME seems to be /usr/hdp/current/spark2-client.
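For reference, a minimal sketch of running the bundled SparkPi example with that SPARK_HOME; the exact examples jar name varies with the HDP/Spark version, so treat the jar path as an assumption:

export SPARK_HOME=/usr/hdp/current/spark2-client
# SparkPi example shipped with the spark2-client package (jar name varies per install)
$SPARK_HOME/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  $SPARK_HOME/examples/jars/spark-examples*.jar 10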
08-26-2019
06:54 PM
The problem appears to be related to "How to properly change uid for HDP / ambari-created user?" and the fact that having a user exist on one node and have a hdfs://user/<username> directory with correct permissions (as I was led to believe from a Hortonworks forum post) is not sufficient for the user to be acknowledged as "existing" on the cluster. Running the hadoop jar command as a different user (in this case, the Ambari-created hdfs user) that exists on all cluster nodes (even though Ambari created this user with different uids across nodes; I don't know if this is a problem) and that has a hdfs://user/hdfs dir, I found that the h2o jar ran as expected. Will look into this a bit more before posting as an answer; basically I think I need a bit more clarification as to when HDP considers a user to "exist" on a cluster.
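For anyone following along, a hedged sketch of what I mean by making a user "exist": create the OS account on every node and give it an HDFS home directory ("someuser" is a placeholder, and whether matching uids across nodes matters is exactly what I'm still unsure about):

# run on every cluster node (e.g. via clush/pdsh)
useradd someuser
# create the HDFS home dir as the hdfs superuser
sudo -u hdfs hdfs dfs -mkdir -p /user/someuser
sudo -u hdfs hdfs dfs -chown someuser:hdfs /user/someuser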
08-26-2019
06:34 PM
@rvillanueva HDF and HDP versions can be different in a cluster; they need not be exactly the same. For example, see https://supportmatrix.hortonworks.com/: click on "HDP 3.1" (or on the desired HDF version, such as HDF 3.4.1.1) and you will find the compatibility matrix with Ambari and HDF versions.
07-31-2019
11:28 PM
The issue was that the --target-dir path included some variables at the start of the path; one of them was empty, so the path ended up looking like //some/hdfs/path, and the "empty" // was confusing sqoop.
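To illustrate (hypothetical variable name, not my actual script), an empty variable at the front of the path is enough to produce the double slash:

PREFIX=""                          # unset or empty by mistake
echo "/${PREFIX}/some/hdfs/path"   # prints //some/hdfs/path, which is what sqoop received as --target-dir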
08-14-2019
07:33 PM
Here is the answer that I was told from discussion on the apache hadoop mailing list:

I think access time refers to the POSIX atime attribute for files, the "time of last access" as described here for instance [1]. While HDFS keeps a correct modification time (mtime), which is important, easy and cheap, it only keeps a very low-resolution sense of last access time, which is less important, and expensive to monitor and record, as described here [2] and here [3]. It doesn't even expose this low-resolution atime value in the hadoop fs -stat command; you need to use Java if you want to read it from the HDFS APIs.

However, to have a conforming NFS API, you must present atime, and so the HDFS NFS implementation does. But first you have to configure it on. The documentation says that the default value is 3,600,000 milliseconds (1 hour), but many sites have been advised to turn it off entirely by setting it to zero, to improve HDFS overall performance. See for example here ([4], section "Don't let Reads become Writes"). So if your site has turned off atime in HDFS, you will need to turn it back on to fully enable NFS. Alternatively, you can maintain optimum efficiency by mounting NFS with the "noatime" option, as described in the document you reference.

I don't know where the nfs3 daemon log file is, but it is almost certainly on the server node where you've configured the NFS service to be served from. Log into it and check under /var/log, e.g. with find /var/log -name '*nfs3*' -print

[1] https://www.unixtutorial.org/atime-ctime-mtime-in-unix-filesystems
[2] https://issues.apache.org/jira/browse/HADOOP-1869
[3] https://superuser.com/questions/464290/why-is-cat-not-changing-the-access-time
[4] https://community.hortonworks.com/articles/43861/scaling-the-hdfs-namenode-part-4-avoiding-performa.html
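Concretely, a hedged sketch of the two options described above (the property name comes from hdfs-default.xml; the gateway hostname and mount point below are placeholders):

# Option 1: re-enable atime tracking in HDFS by setting
# dfs.namenode.accesstime.precision (milliseconds) in hdfs-site.xml to a
# non-zero value, e.g. the default 3600000, then restart the NameNode.
#
# Option 2: leave atime off in HDFS and mount the NFS gateway export with noatime:
mount -t nfs -o vers=3,proto=tcp,nolock,noatime nfsgateway.example.com:/ /mnt/hdfs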
07-30-2019
12:52 AM
@Reed Villanueva Regarding your query:

1. What is the point of these ambari users / groups?
Ambari-level administrators can assign user and group access to Ambari-, Cluster-, Host-, Service-, and User- (view-only) level permissions. Access levels allow administrators to categorize cluster users and groups based on the permissions that each level includes. The permissions that an Ambari-level administrator assigns to each user or group define that user's or group's role. The roles, and what the holder of each role can do, are described in the table in the following doc: https://docs.hortonworks.com/HDPDocuments/Ambari-2.7.3.0/administering-ambari/content/amb_roles_and_authorizations.html

2. What is the context they are intended to be used in?
When a user wants to log in to the Ambari UI, or to a specific view such as the File View or Hive View, the users created in the Ambari DB (listed in the "users" table) can perform actions according to their assigned roles. For local users, Ambari authenticates them using the password stored in the "users" table. For LDAP users, authentication is done at the LDAP level, because Ambari does not store the LDAP-synced users' passwords in its DB.
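As a side note, a hedged sketch for checking which users Ambari currently knows about via its REST API (the host, port and credentials below are placeholders):

# list users known to Ambari (both local and LDAP-synced)
curl -u admin:admin -H 'X-Requested-By: ambari' http://ambari.example.com:8080/api/v1/users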
07-25-2019
09:07 PM
Found the solution in this other community post about installing sqoop drivers in HDP: http://community.hortonworks.com/answers/50556/view.html The correct location for sqoop drivers on your calling client node is /usr/hdp/current/sqoop-client/lib/ **Note that the post references a $SQOOP_HOME env var, but my installation does not have such a var on any of the nodes. Anyone know if this indicates a problem?
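For anyone else doing this, a hedged sketch of what that looks like in practice (the MySQL connector jar is just an example driver; the filename and connection string are placeholders):

# copy the JDBC driver jar to the sqoop client lib dir on the node you run sqoop from
cp mysql-connector-java-5.1.47.jar /usr/hdp/current/sqoop-client/lib/
# quick sanity check that sqoop can now load the driver
sqoop list-databases --connect jdbc:mysql://db.example.com/ --username someuser -P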
07-26-2019
09:10 PM
Think I found the problem. TLDR: firewalld (the nodes are running CentOS 7) was still running, when it should be disabled on HDP clusters. From another community post: "For Ambari to communicate during setup with the hosts it deploys to and manages, certain ports must be open and available. The easiest way to do this is to temporarily disable iptables", as follows:
systemctl disable firewalld
service firewalld stop
So apparently iptables and firewalld need to be disabled across the cluster (supporting docs can be found here; I had only disabled them on the Ambari installation node). After stopping these services across the cluster (I recommend using clush, see the sketch below), I was able to run the upload job without incident.
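A hedged sketch of doing that with clush (assumes ClusterShell is configured with an "all nodes" group so that -a works):

# stop and disable firewalld on every node; -b gathers identical output across nodes
clush -a -b 'systemctl stop firewalld && systemctl disable firewalld'
clush -a -b 'systemctl is-active firewalld'   # should report inactive everywhere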
07-27-2019
12:33 AM
TLDR: the nfs gateway service was already running (by default, apparently), and the process that I thought was blocking the hadoop nfs3 service from starting (jsvc.exec) was, I'm assuming, part of that already-running service. What made me suspect this was that the process also stopped when shutting down the cluster, plus the fact that it was using the port I needed for nfs. I confirmed this by following the verification steps in the docs and seeing that my output was similar to what should be expected.

[root@HW02 ~]# rpcinfo -p hw02
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    1   udp   4242  mountd
    100005    2   udp   4242  mountd
    100005    3   udp   4242  mountd
    100005    1   tcp   4242  mountd
    100005    2   tcp   4242  mountd
    100005    3   tcp   4242  mountd
    100003    3   tcp   2049  nfs
[root@HW02 ~]# showmount -e hw02
Export list for hw02:
/ *

Another thing that could have told me that the jsvc process was part of an already running hdfs nfs service would have been checking the process info...

[root@HW02 ~]# ps -feww | grep jsvc
root      61106  59083  0 14:27 pts/2    00:00:00 grep --color=auto jsvc
root     163179      1  0 12:14 ?        00:00:00 jsvc.exec -Dproc_nfs3 -outfile /var/log/hadoop/root/hadoop-hdfs-root-nfs3-HW02.ucera.local.out -errfile /var/log/hadoop/root/privileged-root-nfs3-HW02.ucera.local.err -pidfile /var/run/hadoop/root/hadoop-hdfs-root-nfs3.pid -nodetach -user hdfs -cp /usr/hdp/3.1.0.0-78/hadoop/conf:
...
and seeing jsvc.exec -Dproc_nfs3 ... to get the hint that jsvc (which apparently is for running java apps on linux) was being used to run the very nfs3 service I was trying to start. And for anyone else with this problem, note that I did not stop all the services that the docs want you to stop, since I'm using CentOS 7:

[root@HW01 /]# service nfs status
Redirecting to /bin/systemctl status nfs.service
● nfs-server.service - NFS server and services
   Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
[root@HW01 /]# service rpcbind status
Redirecting to /bin/systemctl status rpcbind.service
● rpcbind.service - RPC bind service
   Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2019-07-19 15:17:02 HST; 6 days ago
 Main PID: 2155 (rpcbind)
   CGroup: /system.slice/rpcbind.service
           └─2155 /sbin/rpcbind -w
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
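If anyone wants a quicker way to spot this than running the full verification steps, a hedged sketch: check which process already owns the NFS/mountd ports shown above (2049, 4242) before trying to start the gateway yourself:

ss -lntup | grep -E ':2049|:4242'
# or, with the older net-tools:
netstat -tulpn | grep -E ':2049|:4242'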
07-24-2019
12:57 AM
See comments / discussion of accepted answer for the steps that ultimately solved the problem.