Member since: 07-30-2019
Posts: 181
Kudos Received: 205
Solutions: 51

My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
| | 4959 | 10-19-2017 09:11 PM |
| | 1591 | 12-27-2016 06:46 PM |
| | 1237 | 09-01-2016 08:08 PM |
| | 1179 | 08-29-2016 04:40 PM |
| | 3013 | 08-24-2016 02:26 PM |
06-06-2016
09:03 PM
3 Kudos
@Timothy Spann Compression can improve the performance of Hive queries by reducing the amount of data that has to be read from disk. Even if a query returns only a small number of rows, it may still have to scan a large amount of data to produce them. There is an inflection point, typically with small data sets, where reading uncompressed data costs less than decompressing it; there you might want to skip compression. When larger amounts of data must be read to satisfy a query, though, compression can provide real performance gains.
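As a rough sketch, one way to get compressed storage is ORC with ZLIB compression at table-creation time (the table names sales_orc and sales_raw here are hypothetical):

hive -e "CREATE TABLE sales_orc
  STORED AS ORC TBLPROPERTIES ('orc.compress'='ZLIB')
  AS SELECT * FROM sales_raw;"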
06-06-2016
08:57 PM
If the username in AD is the same (e.g. ambari@EXAMPLE.COM), then SSSD integration will use the AD account instead of the local account. Ideally, you'd already have SSSD set up before doing the Ambari installation. If you're using customized service account names (e.g. my_hive, somecustomuser), then you'd need to modify the sudo entries for the "Customizable Users" to account for this.
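For example, if Hive runs as my_hive, the corresponding "Customizable Users" sudo rule on each node would need to reference the custom name (a minimal sketch, abbreviated to the two example account names from above):

# Ambari Customizable Users
ambari ALL=(ALL) NOPASSWD:SETENV: /bin/su my_hive *, /bin/su somecustomuser *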
06-06-2016
08:50 PM
1 Kudo
@Scott Shaw One of the commands that you grant to the ambari user via sudo is the adduser command. This allows the ambari user to create the service accounts on each node of the cluster. All you need to do is install and start the ambari agent on each node (which you can do as the ambari user once the sudo rules are in place).
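A sketch of that last step, assuming RHEL/CentOS nodes with the Ambari repo already configured:

# run on each cluster node as the ambari user, once the sudo rules are in place
sudo yum install -y ambari-agent
sudo ambari-agent start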
06-03-2016
06:33 PM
1 Kudo
@Pardeep Gorla Typically, in order to set up an SMTP proxy, you'll need to set up your own mail server to send emails to the external SMTP server on behalf of the domain. You would need to install something like Postfix to handle the mail. Once your local mail server is successfully forwarding email to the main server, point the mail server setting in Ambari Alerts at the local server. This article has some information to get you started on configuring a Postfix email proxy.
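A minimal Postfix relay sketch (smtp.example.com is a placeholder for your external SMTP server):

# /etc/postfix/main.cf -- forward all outbound mail to the external server
relayhost = [smtp.example.com]:25

# apply the change
service postfix restart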
06-03-2016
03:19 PM
@PJ Moutrie Have you verified that the firewall is open on the NiFi nodes?
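A quick way to check on a node, assuming firewalld and NiFi's default HTTP port of 8080 (your port may differ):

# list the ports the firewall currently allows
firewall-cmd --list-ports

# test reachability of the NiFi port from another host (hostname is a placeholder)
nc -zv nifi-node.example.com 8080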
06-01-2016
07:45 PM
1 Kudo
@Sri Bandaru If all you need to do is automate grabbing the ticket, then you can set up a keytab file and have the user's login script run kinit automatically, with something similar to the following:

> ktutil
ktutil: addent -password -p username@DOMAIN.COM -k 1 -e rc4-hmac
Password for username@DOMAIN.COM: [enter your password]
ktutil: addent -password -p username@DOMAIN.COM -k 1 -e aes256-cts
Password for username@DOMAIN.COM: [enter your password]
ktutil: wkt username.keytab
ktutil: quit
> mkdir /home/username/keytabs
> chmod 700 /home/username/keytabs
> mv username.keytab /home/username/keytabs
> chmod 600 /home/username/keytabs/username.keytab
> echo "kinit -kt /home/username/keytabs/username.keytab username@DOMAIN.COM" >> /home/username/.bash_profile This will create a keytab for the user, move it into a secure directory, and automatically get a ticket when the user logs in with a bash shell. If you are trying to automate the use of a ticket from the desktop, then you can use a similar method. You will have to install something like the Oracle JDK to get a kinit tool, but you can create the keytab on a Linux machine and copy it to the windows system. Obviously, whatever tool you are trying to use (SAS, etc.) will need to be able to pass the Kerberos ticket to the cluster for authentication.
05-31-2016
07:46 PM
1 Kudo
@Davide Ferrari If you are referring to the database Ambari uses to store its configuration info, then you'll need to re-run Ambari setup to point at the new MySQL address:

[root@sandbox nifi_demo]# ambari-server setup
Using python /usr/bin/python2
Setup ambari-server
Checking SELinux...
SELinux status is 'disabled'
Customize user account for ambari-server daemon [y/n] (n)? n
Adjusting ambari-server permissions and ownership...
Checking firewall status...
Checking JDK...
Do you want to change Oracle JDK [y/n] (n)? n
Completing setup...
Configuring database...
Enter advanced database configuration [y/n] (n)? y
Configuring database...
==============================================================================
Choose one of the following options:
[1] - PostgreSQL (Embedded)
[2] - Oracle
[3] - MySQL
[4] - PostgreSQL
[5] - Microsoft SQL Server (Tech Preview)
[6] - SQL Anywhere
==============================================================================
Enter choice (1): 3
Hostname (localhost):
The MySQL database that is restarted by Ambari is most likely the Hive Metastore database. You will need to change the configuration for the Hive database to an "Existing MySQL" database. Shutdown Hive and repoint the database to the new HA configuration.
05-18-2016
09:53 PM
1 Kudo
@Nicola Marangoni What user is your Ambari view server running as? It looks like "ambari-server" in your configs, is that correct? If so, then you need to add the following parameters to the custom core-site.xml in HDFS configs:

hadoop.proxyuser.ambari-server.groups=*
hadoop.proxyuser.ambari-server.hosts=*

This assumes that you have run ambari-server setup and changed the user that the server runs as to "ambari-server". If you haven't done that, then Ambari is still running as the root user and your proxy user settings will not work. If your Ambari view server runs as root, then you need to change your settings to:

auth=KERBEROS;proxyuser=root
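In that case, the matching core-site.xml entries would reference root instead (same pattern, different user):

hadoop.proxyuser.root.groups=*
hadoop.proxyuser.root.hosts=*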
05-17-2016
01:05 PM
8 Kudos
Virtual memory swapping can have a large impact on the performance of a Hadoop system. Because of the memory requirements of YARN containers and the processes running on the nodes in a cluster, swapping processes out of memory to disk can cause serious performance problems. As such, the historical recommendation for a Hadoop system has been to disable swap altogether by setting the swappiness (the kernel's propensity to swap out a process) to 0. With newer versions of the Linux kernel, however, a swappiness of 0 makes Out Of Memory (OOM) situations more likely, in which the kernel indiscriminately kills important processes to reclaim physical memory. To keep the system from swapping processes too frequently, while still allowing emergency swapping instead of process kills, the recommendation is now to set swappiness to 1 on Linux systems. This still allows swapping, but with the least possible aggressiveness (for comparison, the default swappiness is 60).

To change the swappiness on a running machine:

echo "1" > /proc/sys/vm/swappiness

To ensure the swappiness is set appropriately after a reboot:

echo "vm.swappiness=1" >> /etc/sysctl.conf
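To confirm both changes took effect (a quick check, nothing Hadoop-specific):

# running value -- should print 1
cat /proc/sys/vm/swappiness

# persisted value for the next boot
grep vm.swappiness /etc/sysctl.conf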
05-15-2016
05:07 PM
@ida ida There are a couple of ways to accomplish this; I'd recommend starting with Sqoop. It is a tool designed specifically to extract data from an RDBMS and load it into Hadoop. This tutorial should help you get started.
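A minimal Sqoop import sketch (the JDBC URL, credentials, and table name are all placeholders):

sqoop import \
  --connect jdbc:mysql://dbhost.example.com/mydb \
  --username myuser -P \
  --table customers \
  --target-dir /user/ida/customers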