Member since 07-11-2016 | 6 Posts | 3 Kudos Received | 0 Solutions
09-02-2016 04:51 PM | 1 Kudo
At a bare minimum, the cluster needs the following components: HDFS (data storage), MapReduce (processing), ZooKeeper (distributed coordination), YARN (resource management), Ambari (component deployment and monitoring), and Spark for your own processing. Ambari will not proceed with the deployment if these components are missing.
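As a rough illustration of the checklist above, here is a minimal Python sketch that compares a planned component list against the minimum set named in the post. The labels and the `missing_components` helper are hypothetical names chosen for this example; they are not Ambari service identifiers or part of any Ambari API.

```python
# Components the post lists as the bare minimum for a Spark cluster.
# These labels are illustrative only, not official Ambari service names.
REQUIRED_COMPONENTS = {"HDFS", "MAPREDUCE", "ZOOKEEPER", "YARN", "AMBARI", "SPARK"}


def missing_components(selected):
    """Return the required components absent from `selected`, sorted by name."""
    normalized = {name.strip().upper() for name in selected}
    return sorted(REQUIRED_COMPONENTS - normalized)


if __name__ == "__main__":
    # A hypothetical plan that forgets three of the required components.
    plan = ["hdfs", "yarn", "spark"]
    print(missing_components(plan))
```

An empty result means the planned list covers everything the post calls out.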
07-26-2016 04:06 PM
@Praphul Agarwal @Gurjinder Singh - Here are the steps:
1) Add the Hive metastore connection property to the Pig properties: PIG_OPTS=-Dhive.metastore.uris=thrift://localhost:9083 (9083 is the default metastore Thrift port, while 10000 is normally the HiveServer2 port; use the host and port your metastore actually listens on).
2) Restart the HiveServer, Hive Metastore, and Pig services.
3) Create a Pig script whose LOAD statement ends with 'using org.apache.hive.hcatalog.pig.HCatLoader();'
4) Run the Pig script: pig -useHCatalog 'pigscript'
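To make steps 3 and 4 concrete, here is a minimal Python sketch that generates the text of such a Pig script and the matching command line. It only builds strings and does not invoke Pig. The table name `default.sample_table`, the alias `hive_rows`, the file name `load_hive.pig`, and both helper functions are placeholders assumed for this example.

```python
import shlex

# Loader class named in step 3 of the post.
HCAT_LOADER = "org.apache.hive.hcatalog.pig.HCatLoader"


def build_pig_script(table, alias="hive_rows"):
    """Return Pig Latin that loads a Hive table through HCatLoader and dumps it."""
    return (
        f"{alias} = LOAD '{table}' USING {HCAT_LOADER}();\n"
        f"DUMP {alias};\n"
    )


def build_pig_command(script_path):
    """Return the argument list for step 4: pig -useHCatalog <script>."""
    return ["pig", "-useHCatalog", script_path]


if __name__ == "__main__":
    # Placeholder table and file names; substitute your own.
    print(build_pig_script("default.sample_table"), end="")
    print(shlex.join(build_pig_command("load_hive.pig")))
```

Saving the generated text to the script file and running the printed command on a node with the Pig client installed corresponds to step 4.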
07-19-2016 02:33 PM
1) Maintenance mode is turned ON at the service or node level. It is used for activities such as (but not limited to) OS maintenance, configuration changes, and decommissioning a node. Generally speaking, while maintenance mode is ON, alerts are suppressed and no bulk operations are performed on the node. However, the node is still listed in the NameNode's (NN) list of DataNodes (DNs).
2) Decommissioning a DN is highly recommended while maintenance mode is turned ON, to avoid data loss. When the DN is put into the decommissioning state, the NN starts copying its blocks to other DNs. The DN is marked as decommissioned only after the NN completes the copy process. This is done to maintain the replication factor policy.
3) A DN can be deleted after it has been decommissioned successfully. At that point, the DN is completely removed from the cluster and from the NN's list.
4) The 'Rebalancer' is a manual activity performed on the cluster to rebalance data between under-utilized and over-utilized DNs.
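The decommissioning behaviour in point 2 can be pictured with a toy model. The Python sketch below is not how HDFS is implemented; it only illustrates the idea that every block replica on the leaving DN is copied to another DN first, so each block keeps its replica count, and the node is dropped only when all copies succeed. The `decommission` function, block IDs, and node names are all hypothetical.

```python
from collections import Counter


def decommission(block_map, datanodes, node):
    """Toy model: re-replicate blocks held by `node`, then remove it.

    block_map maps a block id to the set of DataNodes holding a replica.
    Returns a new block map without `node`; raises RuntimeError if some
    block has no eligible DataNode to receive its replica.
    """
    live = set(datanodes) - {node}
    # Current replica count per live node, used to pick the least-loaded target.
    load = Counter(
        dn for holders in block_map.values() for dn in holders if dn != node
    )
    new_map = {}
    for block in sorted(block_map):
        holders = set(block_map[block])  # copy, so the input is not mutated
        if node in holders:
            # Eligible targets: live nodes that do not already hold this block.
            candidates = sorted(live - holders, key=lambda dn: (load[dn], dn))
            if not candidates:
                raise RuntimeError(f"no DataNode can take a replica of {block}")
            target = candidates[0]
            holders = (holders - {node}) | {target}
            load[target] += 1
        new_map[block] = holders
    return new_map


if __name__ == "__main__":
    blocks = {"b1": {"dn1", "dn2"}, "b2": {"dn1", "dn3"}, "b3": {"dn2", "dn3"}}
    after = decommission(blocks, ["dn1", "dn2", "dn3", "dn4"], "dn1")
    for block, holders in after.items():
        print(block, sorted(holders))
```

In this model the error case corresponds to a cluster with too few remaining DNs to hold the replicas, in which case the node stays in place rather than being removed.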