Member since: 09-29-2015 · Posts: 286 · Kudos Received: 601 · Solutions: 60
01-22-2016 06:24 AM · 1 Kudo
Instead of using a symlink, could you try doing a bind mount? It won't impact the folder permissions. For example: mount -o bind /p01/app/had /usr/hdp. If it works, you will need to add it to /etc/fstab.
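A minimal sketch of the idea, assuming the paths from the example above (adjust to your layout):

# Bind-mount the real install directory onto /usr/hdp; unlike a symlink,
# this leaves the directory's ownership and permissions untouched
mount -o bind /p01/app/had /usr/hdp

# To make the bind survive reboots, add this line to /etc/fstab:
/p01/app/had  /usr/hdp  none  bind  0 0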
01-21-2016 04:02 PM · 5 Kudos
dfs.datanode.max.xcievers / dfs.datanode.max.transfer.threads = 4096 (use 16k if running HBase)
dfs.datanode.balance.max.concurrent.moves = 500 (can go to 1000 if needed)
dfs.datanode.balance.bandwidthPerSec = 104857600 /* 100 MB/s. Each DataNode has a limited bandwidth for rebalancing; the default is 5 MB/s (5242880). In the worst case, each data transfer has a limited bandwidth of 1 MB/s. */
hdfs balancer -Dfs.defaultFS=hdfs://<NN_HOSTNAME>:8020 -Ddfs.balancer.movedWinWidth=5400000 -Ddfs.balancer.moverThreads=1000 -Ddfs.balancer.dispatcherThreads=200 -Ddfs.datanode.balance.max.concurrent.moves=5 -Ddfs.balance.bandwidthPerSec=100000000 -Ddfs.balancer.max-size-to-move=10737418240 -threshold 5
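If you only need to raise the bandwidth cap for a one-off balancing run, the value can also be pushed to all live DataNodes at runtime, without a config change or restart; for example, matching the 100 MB/s above:

# Set the balancer bandwidth (bytes/sec) on all live DataNodes until the next restart
hdfs dfsadmin -setBalancerBandwidth 104857600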
03-31-2017 12:21 PM
Hi, I am planning to create an Ambari Hadoop/Storm cluster, and as this is brand new to me, I have some doubts about the best way to set it up. Here is what I have for resources: Platform: AWS (8 EC2 instances: 1 master, 4 slaves, 3 ZooKeeper workers). Tooling: since I want to automate the setup, I will use Terraform, Ansible, and an Ambari Blueprint to provision the whole environment. I have done some research and drawn up the component layout below; I need some advice on whether this is a good path (a rough Blueprint sketch follows the component list). Thanks
MASTER (1 node): NAMENODE, SECONDARY_NAMENODE, NIMBUS, RESOURCE_MANAGER, DRPC_SERVER, STORM_UI_SERVER, METRICS_COLLECTOR, HISTORY_SERVER, METRICS_GRAFANA, APP_TIMELINE_SERVER, HIVE_SERVER, HIVE_METASTORE, WEBHCAT_SERVER, MYSQL_SERVER, HCAT, HIVE_CLIENT, MAPREDUCE2_CLIENT, HDFS_CLIENT, YARN_CLIENT, TEZ_CLIENT, PIG
SLAVE (4 nodes): DATANODE, NODEMANAGER, SUPERVISOR, METRICS_MONITOR, ZOOKEEPER_CLIENT, MAPREDUCE2_CLIENT, HDFS_CLIENT, YARN_CLIENT, TEZ_CLIENT, PIG
ZOO (3 nodes): ZOOKEEPER_SERVER, METRICS_MONITOR, ZOOKEEPER_CLIENT, HDFS_CLIENT, YARN_CLIENT, TEZ_CLIENT, PIG
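Not an authoritative layout, just a rough sketch of how host groups like these could be fed to Ambari as a Blueprint (cluster name, stack version, credentials, and the trimmed component lists here are placeholders):

# Register a minimal blueprint with the Ambari REST API
cat > storm-blueprint.json <<'EOF'
{
  "Blueprints": { "stack_name": "HDP", "stack_version": "2.6" },
  "host_groups": [
    { "name": "master", "cardinality": "1",
      "components": [ { "name": "NAMENODE" }, { "name": "RESOURCE_MANAGER" }, { "name": "NIMBUS" } ] },
    { "name": "slave", "cardinality": "4",
      "components": [ { "name": "DATANODE" }, { "name": "NODEMANAGER" }, { "name": "SUPERVISOR" } ] },
    { "name": "zoo", "cardinality": "3",
      "components": [ { "name": "ZOOKEEPER_SERVER" } ] }
  ]
}
EOF
curl -u admin:admin -H 'X-Requested-By: ambari' -X POST \
  -d @storm-blueprint.json http://<ambari-host>:8080/api/v1/blueprints/storm-cluster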
10-09-2018 05:31 AM
Have you figured out the solution yet? Would you mind sharing it with us? I am hitting the same problem on a POC project. Thanks.
01-20-2016 06:42 PM · 1 Kudo
Thank you all for the prompt replies, and extra thanks to @Ancil McBarnett; your answer pointed in the right direction. I enabled all traffic in my security group (for testing purposes), as opposed to only TCP traffic before. As seen in the logs above, the right port for ambari-server is 8440. That resolved the issue, and agents can now be registered through Ambari. Just for the sake of completeness, my setup is: OS: RHEL 7.2, Ambari: 2.1.2, JDK: 1.8, Python: 2.7.
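For anyone hitting the same registration failure, a quick way to verify an agent host can actually reach that port (the hostname is a placeholder):

# A completed TLS handshake here means the security group / firewall
# allows agent registration on the Ambari server's port 8440
openssl s_client -connect <ambari-server-host>:8440 </dev/null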
02-13-2017 09:49 AM
Having four environments (development, testing, pre-production/staging, and production) at a big company is good practice, because in staging we can make sure everything works properly. Of course, the dev, testing, and staging environments are smaller than the planned production. For instance, if I take 2 nodes each in dev, testing, and staging, then we can have almost 8 nodes in production; again, it always depends on replication, traffic, and other relevant factors. Thanks!
07-19-2016 06:02 AM
We can store application-related data and logs on SAN/NAS. However, SAN/NAS is not at all recommended for I/O-sensitive and CPU-bound jobs; that is to avoid bottlenecks while reading data from disk or from the network, or while processing data. So:
Logs/application data --> SAN/NAS
DataNode data --> DAS with JBOD configuration, NO RAID
NN/SN/JT nodes --> should be highly available [RAID 5/10 (depends on use case)]
Hadoop is a scale-out, shared-nothing architecture. http://www.bluedata.com/blog/2015/12/separating-hadoop-compute-and-storage/ https://community.emc.com/servlet/JiveServlet/previewBody/41473-102-1-132603/Virtualizing%20Hadoop%20in%20Large%20Scale%20Infrastructures.pdf Also, I understand the true cost of DAS is sometimes higher once you factor in Hadoop replication, but this is how Hadoop thrives (one of the key tenets of Hadoop is to bring the compute to the storage instead of the storage to the compute).
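To illustrate the JBOD point (the paths here are hypothetical), each local disk is listed as its own entry in dfs.datanode.data.dir, so HDFS spreads blocks across independent disks rather than a RAID volume:

dfs.datanode.data.dir = /grid/0/hadoop/hdfs/data,/grid/1/hadoop/hdfs/data,/grid/2/hadoop/hdfs/data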
01-29-2016 03:24 AM
OK, the problem was with Postgres; this partially helped: pg_resetxlog -f /var/lib/pgsql/data
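For context, pg_resetxlog has to be run as the postgres user against a stopped server; a minimal sketch (the service name varies by distro):

systemctl stop postgresql                              # stop the database first
sudo -u postgres pg_resetxlog -f /var/lib/pgsql/data   # reset the write-ahead log
systemctl start postgresql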
01-16-2016 06:12 PM · 4 Kudos
Question: I am about to initiate the cluster install wizard on a new Ambari install. I reviewed the information on service users at http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_ambari_reference_guide/content/_defining_service_users_and_groups_for_a_hd and I am wondering whether I should take the "Skip Group Modifications" option. The doc states: "Choosing this option is typically required if your environment manages groups using LDAP and not on the local Linux machines." In our environment, users and groups are managed via Active Directory (via Centrify). We are planning to enable security on the cluster after it is installed, and that will include a host of new users being created, after which many of the initial users and groups will be orphaned. What does that "Skip group modifications" option actually do? Should it be used in this case?

Answer: I believe the answer lies in the fact that Ambari runs a groupmod hadoop statement, and either there is no group called hadoop or this operation is not allowed in your environment. Since you will be integrating with LDAP or AD, you should use "Skip Group Modifications": if your Linux nodes reference groups from LDAP, the groupmod hadoop statement would fail. See http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_Installing_HDP_AMB/content/_customize_services.html:

"Service Account Users and Groups: The service account users and groups are available under the Misc tab. These are the operating system accounts the service components will run as. If these users do not exist on your hosts, Ambari will automatically create the users and groups locally on the hosts. If these users already exist, Ambari will use those accounts. Depending on how your environment is configured, you might not allow groupmod or usermod operations. If this is the case, you must be sure all users and groups are already created and be sure to select the "Skip group modifications" option on the Misc tab. This tells Ambari to not modify group membership for the service users."

Also, in http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_ambari_troubleshooting/content/_resolving_cluster_install_and_configuration_problems.html:

"3.7. Problem: Cluster Install Fails with Groupmod Error. The cluster fails to install with an error related to running groupmod. This can occur in environments where groups are managed in LDAP, and not on local Linux machines. You may see an error message similar to the following one: Fail: Execution of 'groupmod hadoop' returned 10. groupmod: group 'hadoop' does not exist in /etc/group. 3.7.1. Solution: When installing the cluster using the Cluster Installer Wizard, at the Customize Services step, select the Misc tab and choose the Skip group modifications during install option."
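A quick check to confirm you are in this situation, i.e. the group resolves through LDAP/AD but is missing from the local files that groupmod edits:

getent group hadoop          # resolves via nsswitch, so LDAP/AD groups show up here
grep '^hadoop:' /etc/group   # local files only; no output here is why groupmod fails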
01-16-2016 11:40 PM
I'd recommend testing connectivity first. You can do that with the Beeline tool. Once you have connected to the JDBC URL from Beeline, you can continue experimenting with Eclipse.
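For example (the host is a placeholder; 10000 is HiveServer2's default binary port):

# Connect to HiveServer2 over JDBC to confirm the URL and credentials work
beeline -u "jdbc:hive2://<hiveserver2-host>:10000/default" -n <username>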