Member since: 02-09-2016
Posts: 559
Kudos Received: 422
Solutions: 98
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2111 | 03-02-2018 01:19 AM
 | 3426 | 03-02-2018 01:04 AM
 | 2341 | 08-02-2017 05:40 PM
 | 2334 | 07-17-2017 05:35 PM
 | 1694 | 07-10-2017 02:49 PM
08-23-2016
01:45 PM
1 Kudo
@Yukti Agrawal Where is myFile.txt located, and what are its file permissions? Most likely the hdfs user does not have read access to the file, or to the directory containing it. An alternative is to run the entire script as the hdfs user; either way, the permissions mentioned above still need to be fixed. You have a couple of options:
1. su - hdfs -c <your script name>
2. sudo -u hdfs <your script name>
Many systems have sudo available, but you must be configured with access to run commands via sudo. If you don't have sudo access, option 1 will work for you, although you need to know the hdfs user's password. If you feel my answer was helpful, don't forget to accept it; that helps other people find solutions.
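As a minimal sketch (the file path and script name below are placeholders, not from the original question):
# Check ownership and permissions on the file and its parent directory
ls -l /path/to/myFile.txt
ls -ld /path/to
# If the hdfs user only needs read access, this is one way to grant it
chmod o+r /path/to/myFile.txt
# Then run the script as the hdfs user using either option
su - hdfs -c /path/to/yourscript.sh
sudo -u hdfs /path/to/yourscript.sh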
08-22-2016
10:00 PM
@Binu Mathew I am curious how https://pig.apache.org/docs/r0.9.1/api/org/apache/pig/piggybank/storage/CSVExcelStorage.html would compare to the Pig RegEx approach.
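For reference, a minimal sketch of the CSVExcelStorage approach (the jar path, input file, and field names are placeholders; run locally to compare against the RegEx version):
pig -x local <<'EOF'
REGISTER /path/to/piggybank.jar;
-- CSVExcelStorage handles quoted fields and embedded delimiters, which a RegEx loader must hand-roll
data = LOAD 'input.csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'YES_MULTILINE') AS (id:chararray, name:chararray);
DUMP data;
EOF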
08-22-2016
06:55 PM
@Sunile Manjee Have you seen this article for tuning: https://community.hortonworks.com/articles/38591/hadoop-and-ldap-usage-load-patterns-and-tuning.html
This paper also provides good background on the performance scaling of LDAP: http://researchweb.watson.ibm.com/people/d/dverma/papers/sigmetrics2001.pdf
08-20-2016
10:42 PM
@gkeys This is a great article and filled with helpful tips!
08-20-2016
12:46 PM
@zkfs Ambari is a management service that knows how to install and manage each of the HDP components. Ambari itself does not contain the packages or software for HDP; it uses repositories to fetch those packages as needed. Each component of HDP (YARN, HDFS, Ranger, Knox, Hive, etc.) consists of a number of RPM packages. When you use Ambari to install HDFS, it has to get all of the required packages for HDFS. If you were to install HDP without Ambari, you would still have to install all of those packages by hand. This link lists the packages needed for each HDP component: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_installing_manually_book/content/ch_getting_ready_chapter.html
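If you want to see those packages for yourself, you can ask the package manager directly (RHEL/CentOS commands; exact package names vary by HDP version):
# List HDP-related packages offered by the configured repositories
yum list available | grep -i hadoop
# Show the HDFS RPMs already installed on a node
rpm -qa | grep hadoop-hdfs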
08-19-2016
06:30 PM
1 Kudo
@jknulst You should be using NAT for the network setting. The piece you need to check is Port Forwarding, which you can find under Network -> Advanced -> Port Forwarding. In my screenshot, port forwarding is set to 2200<->22 because I'm using Vagrant with my sandbox; by default the Sandbox uses 2222<->22. I have provided a couple of screenshots:
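If you prefer the command line, the same rule can be created with VBoxManage (the VM name here is an assumption, and the VM must be powered off for modifyvm):
# Forward host port 2222 to guest port 22 on NAT adapter 1
VBoxManage modifyvm "Hortonworks Sandbox" --natpf1 "ssh,tcp,,2222,,22"
# Then connect to the sandbox through the forwarded port
ssh -p 2222 root@127.0.0.1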
08-19-2016
12:55 PM
@Yukti Agrawal You have the core of what you need to do this via a shell script; you just need to add some more logic. Why not prompt for the directory path, the owner/group, and the quota? You can check the input for a special value like "-done-" to let the script know you are finished.
while [ "$directory" != "-done-" ]
do
  echo "To exit, enter -done- or press ^C"
  read -p "Enter a directory : " directory
  # Stop immediately once the user enters -done-, instead of prompting for the rest
  if [ "$directory" = "-done-" ]; then
    break
  fi
  read -p "Enter permissions (ie: 755) : " permissions
  read -p "Enter ownership (ie: user:group) : " ownership
  read -p "Enter quota (ie: 1024m) : " quota
  # Print the commands for review; remove the echo to actually run them
  echo "
  hdfs dfs -mkdir $directory
  hdfs dfs -chown $ownership $directory
  hdfs dfs -chmod $permissions $directory
  hdfs dfsadmin -setSpaceQuota $quota $directory"
done
You would still have to enter the info for each directory, which may not be ideal. An alternative would be to read the input from a text file. The text file could be something like this (tab delimited):
/user/testuser1 testuser1:testgroup 775 1024M
/user/testuser2 testuser2:testgroup 775 1024M
Then all you need to do is read the file (say we call it myFile.txt), parse each line, and execute the commands based on the values.
while IFS=$'\t' read -r -a myOptions
do
echo "directory: ${myOptions[0]}"
echo "ownership: ${myOptions[1]}"
echo "permissions: ${myOptions[2]}"
echo "quota: ${myOptions[3]}"
done < myFile.txt
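Putting the two ideas together, here is a sketch that actually runs the commands for each parsed line, reading straight into named variables instead of an array (run it as a user with HDFS superuser rights, and sanity-check the input file first):
# Execute the provisioning commands for each tab-delimited line in myFile.txt
while IFS=$'\t' read -r directory ownership permissions quota
do
  hdfs dfs -mkdir -p "$directory"    # -p also creates missing parent directories
  hdfs dfs -chown "$ownership" "$directory"
  hdfs dfs -chmod "$permissions" "$directory"
  hdfs dfsadmin -setSpaceQuota "$quota" "$directory"
done < myFile.txt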
08-19-2016
12:20 PM
1 Kudo
@Narasimha Gunturu Here is a link to the Hive authorization documentation, which should be helpful: https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization
You can also do this with Ranger. Here is a tutorial: http://hortonworks.com/hadoop-tutorial/manage-security-policy-hive-hbase-knox-ranger/
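Once SQL Standard authorization is enabled, grants look like regular SQL. A hypothetical example via beeline (the JDBC URL, table, and role names are made up):
# Grant SELECT on a table to a role from the command line
beeline -u "jdbc:hive2://hiveserver:10000/default" -e "GRANT SELECT ON TABLE mytable TO ROLE analysts;"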
08-18-2016
10:45 PM
@Aengus Rooney You can find logs for HDP AWS at /var/lib/cloudbreak-deployment/cbreak.log.
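For example, to follow the log while Cloudbreak is working:
tail -f /var/lib/cloudbreak-deployment/cbreak.log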
08-18-2016
07:00 PM
7 Kudos
Objective
The objective of this tutorial is to walk you through the process of upgrading your Cloudbreak Deployer on Amazon AWS from 1.3 to 1.4. Once this is complete, you can deploy an HDP 2.5 TP cluster.
Prerequisites
You should have deployed Cloudbreak 1.3 on Amazon AWS using the instructions found here: Cloudbreak Documentation 1.3 - AWS.
You should add TCP port 443 to the security group on Amazon AWS, as Cloudbreak 1.4+ appears to proxy requests through port 443 now (a CLI example follows these prerequisites).
Do not run cbd start.
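If you manage the security group with the AWS CLI, opening port 443 looks something like this (the security group ID is a placeholder):
# Allow inbound HTTPS to the Cloudbreak instance
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 443 --cidr 0.0.0.0/0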
Note: This process will update the Cloudbreak Deployer to the latest available version. At the time of initial writing, this was 1.4. As of September 1, 2016, the latest version is 1.6.
Steps
1. Connect to your Cloudbreak Amazon AWS instance
You should have access to your key file from Amazon. Log into your Cloudbreak deployer instance using:
ssh -i <amazon key file> cloudbreak@<amazon instance public ip>
Note: If you have permission issues connecting via ssh, make sure you set your key file permissions to 0600
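For example:
chmod 0600 <amazon key file>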
2. All commands should be run from the Cloudbreak Deployer root directory:
cd /var/lib/cloudbreak-deployer
3. Before you can start Cloudbreak, you need to initialize the environment by running cbd init. You should see something similar to:
$ cbd init
===> Deployer doctor: Checks your environment, and reports a diagnose.
uname: Linux ip-172-31-15-160.ec2.internal 3.10.0-327.10.1.el7.x86_64 #1 SMP Sat Jan 23 04:54:55 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
local version:1.3.0
latest release:1.3.0
docker images:
docker command exists: OK
docker client version: 1.9.1
ping 8.8.8.8 on host: OK
ping github.com on host: OK
ping 8.8.8.8 in container: OK
ping github.com in container: OK
Note: If you previously ran cbd start, then you should run cbd kill before upgrading Cloudbreak.
4. If the initialization completed successfully, you can now update Cloudbreak to version 1.4 using cbd update master && cbd regenerate && cbd pull-parallel. You should see something similar to:
$ cbd update master && cbd regenerate && cbd pull-parallel
Update /usr/bin/cbd from: https://x.example.com/0//tmp/circle-artifacts.VMI9cYT/cbd-linux.tgz
mv: try to overwrite '/usr/bin/cbd', overriding mode 0755 (rwxr-xr-x)? y
* removing old docker-compose binary
* Dependency required, installing docker-compose 1.7.1 ...
Generating Cloudbreak client certificate and private key in /var/lib/cloudbreak-deployment/certs.
generating docker-compose.yml
/tmp/bashenv.793850575: line 674: /var/lib/cloudbreak-deployment/.deps/tmp/uaa-delme.yml: No such file or directory
diff: /var/lib/cloudbreak-deployment/.deps/tmp/uaa-delme.yml: No such file or directory
renaming: uaa.yml to: uaa-20160817-135202.yml
generating uaa.yml
latest: Pulling from catatnight/postfix
... (per-layer download and extract progress trimmed) ...
Digest: sha256:028b5f6f49d87a10e0c03208156ffedcef6da2e7f59efa8886640ba15cbe0e69
Digest: sha256:87cf35f319f40f657a68e21e924dd5ba182d8253005c86a116992f2e17570765
Status: Image is up to date for gliderlabs/registrator:v5
v1.0.0: Pulling from library/traefik
1.0.0: Pulling from sequenceiq/socat
1.2.0: Pulling from sequenceiq/cbdb
1.4.0: Pulling from hortonworks/cloudbreak-web
Digest: sha256:8e2ec7a47b17ff50583e05224ca1243ed188aff8087bb546e406effb82b691fe
Status: Image is up to date for sequenceiq/socat:1.0.0
1.4.0: Pulling from hortonworks/cloudbreak-auth
v2.7.1: Pulling from sequenceiq/uaadb
1.4.0: Pulling from sequenceiq/periscope
Digest: sha256:270e87a90add32c69d8cb848c7455256f4c0a73e14a9ba2c9335b11853f688a6
Status: Image is up to date for sequenceiq/uaadb:v2.7.1
2.7.1: Pulling from sequenceiq/uaa
1.1: Pulling from sequenceiq/haveged
1.4.0: Pulling from sequenceiq/cloudbreak
1.2.0: Pulling from sequenceiq/pcdb
Digest: sha256:a64d40d0d51b001d2e0cb8490fcf04da59e0c8ede5121038a175d9bf2374cb6a
Status: Image is up to date for sequenceiq/haveged:1.1
Digest: sha256:361163496cde9183235355b6d043908c96af57a56db4b7d7b2cf40e255026716
Status: Image is up to date for sequenceiq/uaa:2.7.1
Digest: sha256:8085718c474c40ce4dcc5f64b9ccf23a3f91b3cb2f7fe2e8572fc549a25e6953
Status: Downloaded newer image for sequenceiq/cloudbreak:1.4.0
5. Once the upgrade process is complete, start Cloudbreak using cbd start. You should see something similar to:
$ cbd start
generating docker-compose.yml
generating uaa.yml
Creating cbreak_haveged_1...
Creating cbreak_uluwatu_1...
Creating cbreak_cbdb_1...
Creating cbreak_consul_1...
Creating cbreak_cloudbreak_1...
Creating cbreak_registrator_1...
Creating cbreak_pcdb_1...
Creating cbreak_periscope_1...
Creating cbreak_sultans_1...
Creating cbreak_uaadb_1...
Creating cbreak_logsink_1...
Creating cbreak_logspout_1...
Creating cbreak_identity_1...
Uluwatu (Cloudbreak UI) url:
http://54.164.138.139:3000
login email:
admin@example.com
password:
cloudbreak
6. Log in to the Cloudbreak UI.
Note: As I mentioned in the prerequisites, Cloudbreak appears to proxy requests through port 443 now. The URL to access the Cloudbreak UI will be https://<amazon cloudbreak instance ip>. As of Cloudbreak 1.6, the proper URL is displayed for the UI.
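A quick way to confirm the UI is answering on port 443 (-k skips certificate verification, which you will likely need if the deployer uses a self-signed certificate):
curl -k -I https://<amazon cloudbreak instance ip>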
7. Create a platform definition. This is done by expanding the manage platforms area of the Cloudbreak UI. We are using AWS, so create a platform by selecting AWS. The UI will look similar to this:
You can provide any Name and Description you like.
8. Create a credential. This is done by expanding the manage credentials area of the Cloudbreak UI. We are using AWS, so select AWS. For ease of configuration, change the AWS Credential Type to Key Based. The Select Platform option should be set to the platform you created in the previous step. The UI will look similar to this:
You can provide any Name and Description you like. The Access Key and Secret Access Key are from your Amazon account. See this documentation to set up an access key: AWS Credentials. The SSH Public Key is found in the /home/cloudbreak/.ssh/id_rsa.pub file that was created when you followed the steps for creating the Cloudbreak instance. You can see the key by using cat on the file like this:
$ cat /home/cloudbreak/.ssh/id_rsa.pub
Note: Remember to download your credentials from Amazon. If you forget this step, there is no way to determine your Secret Access Key. You will have to delete those credentials and create new ones.
9. Once the credential is created, you need to select it.
10. Now we will create our own blueprint by copying one of the existing ones. We are doing this to deploy HDP 2.5; the default blueprints currently deploy HDP 2.4.
Note: As of Cloudbreak 1.6, the default version is HDP 2.5 so this step is not necessary.
Expand the manage blueprints section of the UI. The UI will look similar to this:
Now select the hdp-small-default blueprint. We will copy this for our HDP 2.5 blueprint. The UI will look similar to this:
Click the copy & edit button to create a copy of the blueprint. The UI will look similar to this:
You can provide any Name and Description that you like. In the JSON Text field, scroll down to the bottom. Change "stack.version": "2.4" to "stack.version": "2.5". Click the green create blueprint button.
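If you also keep a local copy of the blueprint JSON, the same edit can be scripted; a sketch with a hypothetical filename:
# Bump the stack version in a saved blueprint file
sed -i 's/"stack.version": "2.4"/"stack.version": "2.5"/' hdp25-blueprint.json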
11. Now you can create your cluster. Click the green Create cluster button.
Provide a Cluster Name and select the appropriate AWS Region. Click the Setup Network and Security button.
You don't need to change anything here. Click the Choose Blueprint button.
Select the Blueprint we created in the previous steps. You will notice there is an Ambari Server check box on each of the servers listed. You need to determine where you want to deploy Ambari. Select the checkbox for that server. The UI will look similar to this:
Click the Review and Launch button. This will provide a final confirmation screen with a summary of the cluster.
If everything looks good, click the green create and start cluster button.
12. Cloudbreak will now start creating the cluster. The UI will look similar to this:
13. If you click on the test1 cluster name, you can see more information on the cluster. The UI will look similar to this:
14. Once the cluster build is complete, the UI should look similar to this:
You can see more information during the cluster build process by expanding the Event History section. The UI will look similar to this:
15. Once the cluster build is complete, you can log into Ambari using the Ambari Server Address link provided.
16. Once you are logged in to Ambari, select the Stacks and Versions view.
17. You can see from the components listed that there are new HDP 2.5 components like Log Search and Spark2. You should see something similar to:
18. And finally, you can see the HDP version by clicking the Versions tab. You should see something similar to:
Review
We successfully upgraded Cloudbreak 1.3 to Cloudbreak 1.4 on Amazon AWS. Using Cloudbreak 1.4, we were able to clone a blueprint, change the stack version to 2.5, and deploy an HDP 2.5 TP cluster.