Member since 06-07-2017 · 6 Posts · 0 Kudos Received · 0 Solutions
07-20-2017
09:17 AM
Hi all,

My colleague has run into an issue with HDFS where it is reporting incorrect capacity values. He added a new datanode to the cluster with disks in a slightly different layout, and ever since, HDFS has been reporting the wrong numbers. The calculation for configured capacity is wrong, and it shows up wrong both in the web UI and on the HDFS command line. The calculations are explained here: https://community.hortonworks.com/articles/98936/details-of-the-output-hdfs-dfsadmin-report.html

Here is the setup:

1) NODE 1-3 (original servers), RHEL 7.3 - virtual machines with 1 x 600 GB disk, which holds both the OS and HDFS.
2) NODE 4 (new node), RHEL 7.2 - virtual machine with 4 disks, three of which are 300 GB. Each of the three extra disks is mounted for HDFS only, on /data1/, /data2/, /data3/, and HDFS is configured to pick these drives up.

To summarize: with NODE 1-3 only, HDFS was working fine. After adding NODE 4, the reported capacity is wrong but the cluster keeps functioning. I'm not sure what will happen when the disk space limit is hit. The reported sizes are:

Configured Capacity: 7.36 TB
DFS Used: 1.16 TB
Non DFS Used: 2.79 TB
DFS Remaining: 3.36 TB

I've tried restarting services and various other things, but nothing works. The JDK version is the same on all nodes. The only difference I could see was the OS version (which I know should be the same). Any thoughts and suggestions are much appreciated!

Cheers
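Edit: for anyone looking at this later, these are the commands I've been using to compare what HDFS thinks it has against what the OS reports on the new node. Just a sketch, assuming the standard HDFS 2.x CLI and the usual property names (dfs.datanode.data.dir, dfs.datanode.du.reserved); the mount points are the ones from my setup above:

# Per-datanode breakdown of Configured Capacity / DFS Used / Non DFS Used / DFS Remaining
hdfs dfsadmin -report

# Which directories the datanodes register as HDFS storage
hdfs getconf -confKey dfs.datanode.data.dir

# Per-disk space reserved for non-DFS use, which feeds into the capacity maths
hdfs getconf -confKey dfs.datanode.du.reserved

# What the OS actually sees on NODE 4 (run on the node itself)
df -h /data1 /data2 /data3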
06-26-2017
11:40 PM
Hey all, I have a Spark job which works fine via spark-submit but fails when executed through the Livy REST API with the following exception:

"User class threw exception: org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'some_database' not found"

The thing is, the job already works. I've tried modifying all sorts of stuff in the Livy conf and in the REST API call (changing the X-Requested-By user, changing the proxyUser, enabling/disabling various proxyUser settings, etc.) to get it to work. I've followed all the logs from Livy (which doesn't seem to log much) through to YARN, HDFS and the Spark logs. The only thing that gives any kind of useful output is the YARN log, which tells me I had an exception.

Any ideas? I don't see why Livy isn't just picking up the Spark/Scala defaults and running the job as whatever user I specify.
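For reference, this is roughly the shape of the call I'm making; a sketch only, with placeholder host, jar, class and user names (not our real ones), against the standard Livy /batches endpoint. My current guess, which I have not confirmed, is that the session Livy starts does not see hive-site.xml the way spark-submit does, so I've been trying to ship it explicitly:

curl -X POST http://livy-host:8998/batches \
  -H 'Content-Type: application/json' \
  -H 'X-Requested-By: admin' \
  -d '{
        "file": "hdfs:///jobs/my-spark-job.jar",
        "className": "com.example.MyJob",
        "proxyUser": "some_user",
        "files": ["/etc/spark/conf/hive-site.xml"]
      }'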
06-14-2017
10:05 AM
Interesting - so basically it sounds like the Ambari installer is not actually inspecting the memory available on the machines in the cluster (in my case, just a single node) and adjusting the default parameters for each service accordingly. Because of this, the defaults it provides for things like heap size cause the ambari-server to crash, and the installation fails before it finishes. I was hoping Ambari was more aware of its available hardware resources.

This is the second case in which I've seen the ambari-server break itself because of poor automation. If you install the Ambari server using MariaDB from, say, the Red Hat repository and then install a service which forces the installation of MySQL (such as Hive), it will force-install MySQL Community over MariaDB, removing Ambari's own database.

Also, during the automated installation I don't believe it's possible to automatically force a service into maintenance mode after it is installed, to save on resources (example API call below). That would be a good option/feature if you don't yet know which services you need and just want a basic, simple cluster.

Thanks for the link - nice read.
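To illustrate the maintenance-mode point: as far as I can tell you can only do this per service through the Ambari REST API after the install finishes, not declare it up front in the wizard. A rough sketch with placeholder credentials, host and cluster name:

curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
  -d '{"RequestInfo":{"context":"Turn on maintenance mode for HBase"},"Body":{"ServiceInfo":{"maintenance_state":"ON"}}}' \
  http://ambari-host:8080/api/v1/clusters/mycluster/services/HBASE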
06-14-2017
09:33 AM
Hey all, I've been trying to deploy an all-in-one HDP machine using Ambari on AWS, similar to the sandbox, using the latest versions of Ambari and HDP. I've found that if I use a Large instance on AWS (8 GB of RAM and 2 vCPUs/cores), the ambari-server crashes every time during deployment, breaking the installation and requiring a total rebuild. However, with an XL instance (16 GB of RAM and 4 vCPUs/cores) the installation works fine every time and I get no errors. I have tested this 3 times, doing installations on both servers side-by-side (XL vs Large instances).

The services installed on the single node are: HDFS, YARN + MapReduce2, Hive, HBase, Pig, Sqoop, Oozie, ZooKeeper, Flume, Ambari Infra, Ambari Metrics, Kafka, SmartSense, Spark and Spark2.

What is strange is that I can't find any reference to the hardware requirements anywhere, and you can run most of these services in the sandbox with less memory and fewer cores. Hortonworks and most big data trainers drill into you that Hadoop will 'run on all commodity hardware', but these hardware constraints seem to suggest otherwise. Can anyone shed any light on this?

Cheers,

- Calvin
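For anyone else hitting this, the next thing I plan to try on the 8 GB instance is pinning the Ambari server JVM heap rather than relying on the defaults. A sketch only, assuming the usual ambari-env.sh location and that heap pressure is actually what kills the server (I haven't confirmed that; the values are examples):

# /var/lib/ambari-server/ambari-env.sh
export AMBARI_JVM_ARGS="$AMBARI_JVM_ARGS -Xms512m -Xmx2048m"

# then restart for the new heap settings to take effect
ambari-server restart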
06-07-2017
01:11 PM
Hey all, I'm trying to build an all-in-one HDP appliance using HDP 2.6.x and Ambari 2.5.0.3, repos here:

#VERSION_NUMBER=2.5.0.3-7

[ambari-2.5.0.3]
name=ambari Version - ambari-2.5.0.3
baseurl=http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.5.0.3
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.5.0.3/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
[HDP-2.6.0.3]
name=HDP Version - HDP-2.6.0.3
baseurl=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.0.3
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.0.3/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
[HDP-UTILS-1.1.0.21]
name=HDP-UTILS Version - HDP-UTILS-1.1.0.21
baseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.21/repos/centos7
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.6.0.3/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
When configuring MariaDB/MySQL, I ran into an issue several times where ambari-server setup kept inserting the hostname 'digeroo.ha247.co.uk' everywhere. Later on, after I fixed the MySQL errors, I got the SAME issue when trying to set up the agents automatically via SSH:

Host registration aborted. Ambari Agent host cannot reach Ambari Server 'digeroo.ha247.co.uk:8080'. Please check the network connectivity between the Ambari Agent host and the Ambari Server

What is quite funny here is that my server does not have this FQDN; I have no idea where it is referenced or why it is even in the code. It is actually a security concern: ha247.co.uk is a real domain, so its owners could spin up something listening on port 8080 and intercept communications from my servers if I'm not careful (and the same goes for anyone else who hits this issue). I'm guessing someone from ha247 (maybe it's a Hortonworks customer) forgot to remove references to it when checking in some fixes.

Can anyone elaborate? Why is Ambari trying to register my agents against a non-existent ambari-server which I have not specified anywhere? I can't see the server name in any obvious config file.

Cheers,

- Calvin

Update: I configured the Ambari agent manually (which worked fine), but when installing services it ALSO refers to this random FQDN:

WARNING 2017-06-07 07:43:09,584 FileCache.py:184 - Error occurred during cache update. Error tolerate setting is set to true, so ignoring this error and continuing with current cache. Error details: Can not download file from url http://digeroo.ha247.co.uk:8080/resources//stacks/HDP/2.1/services/SMARTSENSE/package/.hash : <urlopen error [Errno 111] Connection refused>

This is very annoying; I ended up putting this random FQDN in the hosts file to try and fix the issue.
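For anyone debugging the same thing, these are the places I've been checking for where the stray FQDN might be coming from. A sketch assuming stock Ambari file locations; none of these has been confirmed as the actual source of 'digeroo.ha247.co.uk':

# What the OS and Python report as the FQDN (Ambari uses socket.getfqdn() by default)
hostname -f
python -c 'import socket; print(socket.getfqdn())'

# Which server the agent is pointed at
grep -A2 '^\[server\]' /etc/ambari-agent/conf/ambari-agent.ini

# Any stale references on disk
grep -r 'digeroo.ha247.co.uk' /etc/ambari-agent /etc/ambari-server /etc/hosts 2>/dev/null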