Member since: 07-11-2019
Posts: 102
Kudos Received: 4
Solutions: 9
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 18196 | 12-13-2019 12:03 PM |
| | 4273 | 12-09-2019 02:42 PM |
| | 3126 | 11-26-2019 01:21 PM |
| | 1431 | 08-27-2019 03:03 PM |
| | 2728 | 08-14-2019 07:33 PM |
07-25-2019
09:28 PM
@Jay Kumar SenSharma Running your prescribed tests across all the cluster nodes, I get the following:

[root@HW01 groups.d]# clush -ab hostname -f
---------------
HW01
---------------
HW01.ucera.local
---------------
HW02
---------------
HW02.ucera.local
---------------
HW03
---------------
HW03.ucera.local
---------------
HW04
---------------
HW04.ucera.local

[root@HW01 groups.d]# clush -ab cat /etc/hosts
---------------
HW[01-04] (4)
---------------
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.18.4.46 HW01.ucera.local
172.18.4.47 HW02.ucera.local
172.18.4.48 HW03.ucera.local
172.18.4.49 HW04.ucera.local
[root@HW01 groups.d]#
[root@HW01 groups.d]#
[root@HW01 groups.d]#
[root@HW01 groups.d]# clush -ab ping hw04.ucera.local
^CHW03: Killed by signal 2.
HW04: Killed by signal 2.
HW02: Killed by signal 2.
Warning: Caught keyboard interrupt!
---------------
HW01
---------------
PING HW04.ucera.local (172.18.4.49) 56(84) bytes of data.
64 bytes from HW04.ucera.local (172.18.4.49): icmp_seq=1 ttl=64 time=0.231 ms
64 bytes from HW04.ucera.local (172.18.4.49): icmp_seq=2 ttl=64 time=0.300 ms
---------------
HW02
---------------
PING HW04.ucera.local (172.18.4.49) 56(84) bytes of data.
64 bytes from HW04.ucera.local (172.18.4.49): icmp_seq=1 ttl=64 time=0.177 ms
---------------
HW03
---------------
PING HW04.ucera.local (172.18.4.49) 56(84) bytes of data.
64 bytes from HW04.ucera.local (172.18.4.49): icmp_seq=1 ttl=64 time=0.260 ms
---------------
HW04
---------------
PING HW04.ucera.local (172.18.4.49) 56(84) bytes of data.
64 bytes from HW04.ucera.local (172.18.4.49): icmp_seq=1 ttl=64 time=0.095 ms
Keyboard interrupt (HW[01-04] did not complete).
[root@HW01 groups.d]#
[root@HW01 groups.d]#
[root@HW01 groups.d]#
[root@HW01 groups.d]# clush -ab "telnet hw04.ucera.local 50075"
HW01: telnet: connect to address 172.18.4.49: No route to host
HW02: telnet: connect to address 172.18.4.49: No route to host
HW03: telnet: connect to address 172.18.4.49: No route to host
HW04: Killed by signal 2.
Warning: Caught keyboard interrupt!
---------------
HW[01-03] (3)
---------------
Trying 172.18.4.49...
---------------
HW04
---------------
Trying 172.18.4.49...
Connected to hw04.ucera.local.
Escape character is '^]'.
clush: HW[01-03] (3): exited with exit code 1
Keyboard interrupt.

So other than telnet not being able to connect from HW[01-03], things seem to be OK. Will continue looking into why telnet can't connect.
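Since ping to hw04 works from every node but TCP connections to port 50075 are rejected with "No route to host", one thing worth ruling out (a hedged guess on my part, not confirmed yet) is a host firewall on HW04; on CentOS 7 that would typically be firewalld:

[root@HW04 ~]# systemctl status firewalld
[root@HW04 ~]# firewall-cmd --list-all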
07-25-2019
09:07 PM
Found the solution in this other community post about installing Sqoop drivers in HDP: http://community.hortonworks.com/answers/50556/view.html The correct location for Sqoop drivers on the calling client node is /usr/hdp/current/sqoop-client/lib/. **Note that the post references a $SQOOP_HOME env var, but my installation does not have such a var on any of the nodes. Anyone know if this indicates a problem?
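For anyone else hitting this, a sketch of the copy step, assuming the Oracle driver jar is named ojdbc7.jar (the actual jar name and source path depend on your Oracle/JDK version), run on the calling client node:

# copy the Oracle JDBC driver into the Sqoop client lib dir (jar name and source path are assumptions)
cp /path/to/ojdbc7.jar /usr/hdp/current/sqoop-client/lib/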
07-25-2019
09:04 PM
Intend to call sqoop on a client node and need to be able to connect to an Oracle DB. Currently seeing this error:

19/07/25 10:48:55 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.OracleDriver
java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.OracleDriver

When looking into this, it appears I need to install the proper JDBC driver for Sqoop to use. However, the only docs I can find on installing Sqoop drivers are for Cloudera CDP, and following them runs into problems where certain expected directories do not exist on my cluster (particularly /var/lib/sqoop), and I am assuming that they should already be there and not need to be manually created. Are there any docs on doing this for HDP? Or docs for doing this just through the Ambari UI?
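In the meantime, a hedged sanity check (the path below is the standard HDP client layout, assumed here) to see whether an Oracle JDBC jar is already visible to the Sqoop client:

# list any Oracle JDBC jars already in the Sqoop client lib dir (path assumed)
ls /usr/hdp/current/sqoop-client/lib/ | grep -i ojdbc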
Labels:
- Apache Sqoop
07-25-2019
08:22 PM
@Jay Kumar SenSharma Think I have determined the source of the error. Was previously getting an error of "Unauthorized connection for super-user: root from IP" from the Files View when attempting to upload files. In order to fix this, I set

hadoop.proxyuser.root.groups=*
hadoop.proxyuser.root.hosts=*

per http://community.hortonworks.com/answers/70794/view.html. This initially appeared to work, but suddenly stopped working. Am finding that the Ambari HDFS configs that I had previously set are continually reverting back to the default config group (back from the configs that I had made to address the original problem). The dashboard config history shows this... yet the HDFS service says it is using "version 1" (the initial configs), and Configs > Advanced > core-site shows... implying that the custom changes are in fact not being used. This is a bit confusing to me. Need to continue debugging the situation. Any suggestions?
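To confirm which values are actually deployed on disk (as opposed to what the UI reports), a hedged check on a cluster node, assuming the standard HDP client config path:

# grep the live core-site.xml for the proxyuser.root properties (path assumed for HDP)
grep -A 1 'proxyuser.root' /etc/hadoop/conf/core-site.xml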
07-25-2019
01:53 AM
@Jay Kumar SenSharma Checking the /etc/hosts files on the hw[01-03] cluster nodes, there was no hw04 FQDN entry (note that these were the original nodes; hw04 was added later via Ambari). I have added it now, as well as on hw04 itself. But I was able to ping hw04 from all the other nodes regardless. However, I was not able to run telnet hw04.ucera.local 50075 from any of the nodes. Not sure what this means or how to fix it, though? ** Also interesting that Ambari does not automatically add the new FQDN to the /etc/hosts file of the existing nodes when another node is added (since it already has root access). So it seems that any time an admin wants to add more nodes, they would need to go through all the cluster nodes and add this manually. Any specific reason for this design?
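For reference, a sketch of the line added to /etc/hosts on hw[01-03] (format mirrors the existing entries, IP taken from the /etc/hosts output shown earlier on this page):

# appended to /etc/hosts on hw01-hw03 (already present on hw04)
172.18.4.49 HW04.ucera.local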
07-25-2019
01:06 AM
Trying to upload a simple CSV file into HDFS via the Ambari Files View, getting the error below:

java.net.NoRouteToHostException: hw04.ucera.local:50075: No route to host (Host unreachable)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket....

For reference, here are the hosts on the cluster and the services on the HW04 node:

Accumulo Client, DataNode, HBase Client, HDFS Client, Hive Client, HST Agent, Infra Solr Client, Log Feeder, MapReduce2 Client, Metrics Monitor, Oozie Client, Pig Client, Spark2 Client, Sqoop Client, Tez Client, YARN Client, ZooKeeper Client

Not sure what the error means. Any debugging suggestions or fixes?
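Since 50075 is the DataNode HTTP port that appears in the error, one hedged first check (commands assumed available on the node) is to confirm from hw04 itself that the DataNode is actually listening there:

[root@HW04 ~]# netstat -ltnp | grep 50075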
Labels:
- Apache Ambari
- Apache Hadoop
07-24-2019
01:26 PM
Attempting to use / start up the HDFS NFS gateway following the docs (ignoring the instruction to stop the rpcbind service, and not starting the Hadoop portmap service, given that the OS is neither SLES 11 nor RHEL 6.2), but running into an error when starting the hdfs nfs3 service:

[root@HW02 ~]# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

[root@HW02 ~]# service nfs status
Redirecting to /bin/systemctl status nfs.service
Unit nfs.service could not be found.

[root@HW02 ~]# service nfs stop
Redirecting to /bin/systemctl stop nfs.service
Failed to stop nfs.service: Unit nfs.service not loaded.

[root@HW02 ~]# service rpcbind status
Redirecting to /bin/systemctl status rpcbind.service
● rpcbind.service - RPC bind service
   Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2019-07-23 13:48:54 HST; 28s ago
  Process: 27337 ExecStart=/sbin/rpcbind -w $RPCBIND_ARGS (code=exited, status=0/SUCCESS)
 Main PID: 27338 (rpcbind)
   CGroup: /system.slice/rpcbind.service
           └─27338 /sbin/rpcbind -w

Jul 23 13:48:54 HW02.ucera.local systemd[1]: Starting RPC bind service...
Jul 23 13:48:54 HW02.ucera.local systemd[1]: Started RPC bind service.

[root@HW02 ~]# hdfs nfs3
19/07/23 13:49:33 INFO nfs3.Nfs3Base: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting Nfs3
STARTUP_MSG:   host = HW02.ucera.local/172.18.4.47
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 3.1.1.3.1.0.0-78
STARTUP_MSG:   classpath = /usr/hdp/3.1.0.0-78/hadoop/conf:/usr/hdp/3.1.0.0-78/hadoop/lib/jersey-server-1.19.jar:/usr/hdp/3.1.0.0-78/hadoop/lib/ranger-hdfs-plugin-shim-1.2.0.3.1.0.0-78.jar: ... <a bunch of other jars> ...
STARTUP_MSG:   build = git@github.com:hortonworks/hadoop.git -r e4f82af51faec922b4804d0232a637422ec29e64; compiled by 'jenkins' on 2018-12-06T12:26Z
STARTUP_MSG:   java = 1.8.0_112
************************************************************/
19/07/23 13:49:33 INFO nfs3.Nfs3Base: registered UNIX signal handlers for [TERM, HUP, INT]
19/07/23 13:49:33 INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
19/07/23 13:49:33 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
19/07/23 13:49:33 INFO impl.MetricsSystemImpl: Nfs3 metrics system started
19/07/23 13:49:33 INFO oncrpc.RpcProgram: Will accept client connections from unprivileged ports
19/07/23 13:49:33 INFO security.ShellBasedIdMapping: Not doing static UID/GID mapping because '/etc/nfs.map' does not exist.
19/07/23 13:49:33 INFO nfs3.WriteManager: Stream timeout is 600000ms.
19/07/23 13:49:33 INFO nfs3.WriteManager: Maximum open streams is 256
19/07/23 13:49:33 INFO nfs3.OpenFileCtxCache: Maximum open streams is 256
19/07/23 13:49:34 INFO nfs3.DFSClientCache: Added export: / FileSystem URI: / with namenodeId: -1408097406
19/07/23 13:49:34 INFO nfs3.RpcProgramNfs3: Configured HDFS superuser is
19/07/23 13:49:34 INFO nfs3.RpcProgramNfs3: Delete current dump directory /tmp/.hdfs-nfs
19/07/23 13:49:34 INFO nfs3.RpcProgramNfs3: Create new dump directory /tmp/.hdfs-nfs
19/07/23 13:49:34 INFO nfs3.Nfs3Base: NFS server port set to: 2049
19/07/23 13:49:34 INFO oncrpc.RpcProgram: Will accept client connections from unprivileged ports
19/07/23 13:49:34 INFO mount.RpcProgramMountd: FS:hdfs adding export Path:/ with URI: hdfs://hw01.ucera.local:8020/
19/07/23 13:49:34 INFO oncrpc.SimpleUdpServer: Started listening to UDP requests at port 4242 for Rpc program: mountd at localhost:4242 with workerCount 1
19/07/23 13:49:34 ERROR mount.MountdBase: Failed to start the TCP server.
org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:4242
        at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
        at org.apache.hadoop.oncrpc.SimpleTcpServer.run(SimpleTcpServer.java:89)
        at org.apache.hadoop.mount.MountdBase.startTCPServer(MountdBase.java:83)
        at org.apache.hadoop.mount.MountdBase.start(MountdBase.java:98)
        at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startServiceInternal(Nfs3.java:56)
        at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startService(Nfs3.java:69)
        at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:79)
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:433)
        at sun.nio.ch.Net.bind(Net.java:425)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:391)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:315)
        at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
19/07/23 13:49:34 INFO util.ExitUtil: Exiting with status 1: org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:4242
19/07/23 13:49:34 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down Nfs3 at HW02.ucera.local/172.18.4.47
************************************************************/

Not sure how to interpret any of the errors seen here (and have not installed any packages like nfs-utils, assuming that Ambari would have installed all needed packages when the cluster was initially installed). Any debugging suggestions or solutions for what to do about this?

** UPDATE: After looking at the error, I can see "Caused by: java.net.BindException: Address already in use", and looking into what is already using port 4242:

[root@HW02 ~]# netstat -ltnp | grep 4242
tcp        0      0 0.0.0.0:4242            0.0.0.0:*               LISTEN      98067/jsvc.exec

Not sure what this is. Does this have any known HDP-related significance? Is this safe to delete?
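Following up on the UPDATE: a hedged way to see what that jsvc.exec process actually is (PID 98067 taken from the netstat output above):

[root@HW02 ~]# ps -fp 98067

If it turns out to be an NFS gateway process that Ambari already manages on this host, that would explain the bind conflict on the mountd port 4242.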
Labels:
- Apache Hadoop
07-24-2019
12:57 AM
See the comments / discussion on the accepted answer for the steps that ultimately solved the problem.
07-24-2019
12:56 AM
That fixed it, thank you. Would you mind explaining what you think could have gone wrong with the original view for future reference (the files-view.log was just empty)? I assume I can just delete the AUTO_FILES_INSTANCE files view that came with the initial Ambari instance, correct?
07-23-2019
11:37 PM
@Jay Kumar SenSharma Did the above (then logged out and back into Ambari) and it did not change the results. Furthermore, I had been using the Files View in Ambari fine until recently (not sure what has changed, and I can see no alerts in the HDFS section in Ambari) without the admin user having an HDFS user directory. For my own info, could you explain what you had thought the problem was and how adding an admin HDFS user dir would have helped there?
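For anyone following along, a hedged sketch of the step being discussed (creating an HDFS home directory for the Ambari admin user; the exact user and group may differ on your cluster):

# run as root on a node with the HDFS client; creates and chowns /user/admin
sudo -u hdfs hdfs dfs -mkdir -p /user/admin
sudo -u hdfs hdfs dfs -chown admin:hdfs /user/admin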