Member since
09-26-2016
74
Posts
4
Kudos Received
6
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2241 | 08-06-2018 06:55 PM
 | 1451 | 12-21-2017 04:28 PM
 | 1317 | 11-03-2017 05:07 PM
 | 2369 | 03-20-2017 03:37 PM
 | 6168 | 03-06-2017 03:54 PM
12-21-2017
04:28 PM
Yes, you should be able to SSH without a password to all nodes FROM the Ambari server. That way, the agents can be installed automatically; you don't want to be installing Ambari agents manually. https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.1.0/bk_ambari-installation/content/set_up_password-less_ssh.html
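A minimal sketch of the key setup from the Ambari host, assuming root access and a hypothetical agent hostname (agent-node-1); the linked Hortonworks doc covers the full procedure:
# On the Ambari server host, generate a key pair if one does not already exist
ssh-keygen -t rsa
# Copy the public key to each agent host (agent-node-1 is a placeholder hostname)
ssh-copy-id root@agent-node-1
# Verify that login now works without a password prompt
ssh root@agent-node-1 hostname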
11-03-2017
05:07 PM
1 Kudo
You achieve this by limiting access via firewall rules; beyond that, Knox + Kerberos is the built-in method. Some resources: Secure authentication: core Hadoop uses Kerberos and Hadoop delegation tokens for security. WebHDFS also uses Kerberos (SPNEGO) and Hadoop delegation tokens for authentication.
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_security/content/configure_webhdfs_for_knox.html
https://www.cloudera.com/documentation/enterprise/5-9-x/topics/cdh_sg_secure_webhdfs_config.html
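As a quick illustration of the SPNEGO flow, here is a sketch of calling WebHDFS with a Kerberos ticket; the principal, the hostname, and the default HDP 2.x WebHDFS port are assumptions, so adjust them to your cluster:
# Obtain a Kerberos ticket first (principal is a placeholder)
kinit user@EXAMPLE.COM
# Call WebHDFS using SPNEGO negotiation; namenode.example.com:50070 is an assumed host:port
curl -i --negotiate -u : "http://namenode.example.com:50070/webhdfs/v1/tmp?op=LISTSTATUS"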
10-20-2017
03:49 PM
Try increasing the polling intervals (run schedule) on some of the processors. This can reduce CPU load and make the UI more responsive.
10-20-2017
02:45 PM
2 Kudos
This info might be more helpful to guide you down the road of DR. With HDP in production, you must combine the different technologies on offer and tailor them together as your own solution. I've read through many solutions, and the info below is the most critical in my opinion. Remember, preventing data loss is better than recovering from it!
Read these slides first:
https://www.slideshare.net/cloudera/hadoop-backup-and-disaster-recovery
https://www.slideshare.net/hortonworks/ops-workshop-asrunon20150112/72
1. VM snapshots
If you're not using VMs, then switch over.
- Ambari nightly VM snapshots
- NameNode VM snapshots
2. Lock down critical directories
fs.protected.directories (under HDFS config in Ambari) protects critical directories from deletion. There can be accidental deletes of critical data sets, and these catastrophic errors should be avoided by adding appropriate protections. For example, the /user directory is the parent of all user-specific sub-directories. Attempting to delete the entire /user directory is very likely to be unintentional. To protect against accidental data loss, mark the /user directory as protected. This prevents attempts to delete it unless the directory is already empty.
3. Backups
Backups can be automated using tools like Apache Falcon (being deprecated in HDP 3.0, switch to the workflow editor + DistCp) and Apache Oozie.
Using snapshots
HDFS snapshots can be combined with DistCp to create the basis for an online backup solution. Because a snapshot is a read-only, point-in-time copy of the data, it can be used to back up files while HDFS is still actively serving application clients. Backups can even be automated using tools like Apache Falcon and Apache Oozie.
Example: "accidentally" remove the important file:
sudo -u hdfs hdfs dfs -rm -r -skipTrash /tmp/important-dir/important-file.txt
Recover the file from the snapshot:
hdfs dfs -cp /tmp/important-dir/.snapshot/first-snapshot/important-file.txt /tmp/important-dir
hdfs dfs -cat /tmp/important-dir/important-file.txt
HDFS snapshots overview
A snapshot is a point-in-time, read-only image of the entire file system or a subtree of the file system. HDFS snapshots are useful for:
- Protection against user error: with snapshots, if a user accidentally deletes a file, the file can be restored from the latest snapshot that contains it.
- Backup: files can be backed up using the snapshot image while the file system continues to serve HDFS clients.
- Test and development: files in an HDFS snapshot can be used to test new programs without affecting the HDFS file system that is concurrently supporting HDFS clients.
- Disaster recovery: snapshots can be replicated to a remote recovery site for disaster recovery.
DistCp overview
Hadoop DistCp (distributed copy) can be used to copy data between Hadoop clusters or within a Hadoop cluster. DistCp can copy just the files in a directory, or it can copy an entire directory hierarchy. It can also copy multiple source directories to a single target directory. DistCp:
- Uses MapReduce to implement its I/O load distribution, error handling, and reporting.
- Has built-in support for multiple file system types: it can work with HDFS, Amazon S3, Cassandra, and others. DistCp also supports copying between different HDFS versions.
- Can generate a significant workload on the cluster if a large volume of data is being transferred.
- Has many command options. Use hadoop distcp -help to get online command help information.
A minimal command sketch tying snapshots and DistCp together follows below.
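Here is that sketch: creating the snapshot and shipping it to a second cluster with DistCp. The directory and snapshot names match the recovery example above; the DR cluster's NameNode address (nn-dr.example.com:8020) and target path are assumptions:
# Allow snapshots on the directory (run once, as the hdfs superuser)
sudo -u hdfs hdfs dfsadmin -allowSnapshot /tmp/important-dir
# Create the point-in-time snapshot referenced in the recovery example above
hdfs dfs -createSnapshot /tmp/important-dir first-snapshot
# Replicate the read-only snapshot to a DR cluster with DistCp (target cluster is assumed)
hadoop distcp /tmp/important-dir/.snapshot/first-snapshot \
    hdfs://nn-dr.example.com:8020/backups/important-dir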
08-29-2017
06:28 PM
If your password has any special characters such as "&", it will break the XML. The fix for this example is to replace the & with the XML entity &amp; (ampersand, then "amp", then a semicolon, with no spaces).
07-12-2017
02:56 PM
I'm trying to achieve a simple count of flowfiles via an attribute. The idea is to count to 100, then send an email. How do I achieve this?
Labels:
- Apache NiFi
04-21-2017
09:14 AM
The answer given is worthless; this issue comes down to not having the Ambari agents set up on the nodes. Run yum install ambari-agent, then configure the agent and start it with ambari-agent start. Any issues at this point, check:
- Agents are up and running
- /etc/hosts files are correct
- SSH is working
There is another potential problem. If this is your SECOND attempt after restarting the entire process, there is a BUG. If this is the second attempt after a successful SSH, then try the alternative option for a manual install. It will work if the agents are running. Edit the config with vi /etc/ambari-agent/conf/ambari-agent.ini:
[server]
hostname=<your.ambari.server.hostname>
url_port=8440
secured_url_port=8441
Start the agent on every host in your cluster:
ambari-agent start
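Put together, a rough sketch of the manual agent setup on each node, assuming the Ambari repo is already configured and using ambari.example.com as a stand-in for your Ambari server hostname:
# Install the agent from the configured Ambari repository
yum install -y ambari-agent
# Point the agent at the Ambari server (ambari.example.com is a placeholder hostname)
sed -i 's/^hostname=.*/hostname=ambari.example.com/' /etc/ambari-agent/conf/ambari-agent.ini
# Start the agent
ambari-agent start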
04-20-2017
08:46 PM
There is another potential problem. If this is your SECOND attempt after restarting the entire process, there is a BUG. If this is the second attempt after a successful SSH, then try the alternative option for a manual install. It will work if the agents are running. Edit the config with vi /etc/ambari-agent/conf/ambari-agent.ini:
[server]
hostname=<your.ambari.server.hostname>
url_port=8440
secured_url_port=8441
Start the agent on every host in your cluster:
ambari-agent start
04-12-2017
02:35 PM
Try using incognito mode in Chrome; this works for me and prompts me to proceed. Make sure you import the certificate via IE first.
03-23-2017
05:40 PM
@Matt Burgess Is there any way of not using the password in your examples if I have already set up password-less SSH? The Groovy script throws an error if I try to omit the password.