Member since: 09-15-2015
Posts: 457
Kudos Received: 507
Solutions: 90
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 14016 | 11-01-2016 08:16 AM |
| | 9729 | 11-01-2016 07:45 AM |
| | 6171 | 10-25-2016 09:50 AM |
| | 1485 | 10-21-2016 03:50 AM |
| | 2797 | 10-14-2016 03:12 PM |
01-27-2016
07:20 PM
1 Kudo
It looks like your Metrics Collector cannot start because the HBase Master is not coming up. Just to make sure: you are running a non-kerberized environment with a single NameNode, right? Are you using Metrics in distributed or embedded mode? Could you please validate and post this configuration => hbase.rootdir
Your HBase Master log files show a connection refused error when the HBase Master tries to connect to the NameNode:
2016-01-27 18:36:57,189 FATAL [hdp1n3:61300.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
java.net.ConnectException: Call From hdp1n3/XXXXXXXXX to hdp1n3.aye1vpcdev:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1431)
...
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
...
Is "hdp1n3.aye1vpcdev:8020" the hostname of your NameNode? Can you access HDFS from the Metrics Collector node?
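One quick way to check both from the Metrics Collector node (a minimal sketch; the AMS config file path is an assumption and may differ per install):
# Is the NameNode RPC port from the stack trace reachable from this node?
nc -zv hdp1n3.aye1vpcdev 8020
# Does HDFS answer from this node?
hdfs dfs -ls /
# Inspect the hbase.rootdir value in the AMS HBase config
# (path is an assumption; adjust to your install)
grep -A1 "hbase.rootdir" /etc/ams-hbase/conf/hbase-site.xml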
01-27-2016
11:36 AM
1 Kudo
This is a long shot, but I had some trouble with Parquet and Hive in the past, and one change that fixed my problem was switching to ORC. The new Spark version does support ORC files, and Hive is optimized for ORC. Could you save your data as ORC and run your Spark SQL again? df.write.format("orc")...
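For example, a minimal sketch via spark-shell (table and path names are hypothetical placeholders; assumes a Hive-enabled sqlContext and the Spark 1.4+ write API):
spark-shell <<'EOF'
// Read the existing table, rewrite it as ORC, then query the ORC copy
val df = sqlContext.table("my_table")
df.write.format("orc").save("/tmp/my_table_orc")
val orcDf = sqlContext.read.format("orc").load("/tmp/my_table_orc")
orcDf.registerTempTable("my_table_orc")
sqlContext.sql("SELECT COUNT(*) FROM my_table_orc").show()
EOF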
01-27-2016
09:50 AM
1 Kudo
What privileges did you grant? I think something is missing from your post.
01-26-2016
09:22 PM
@Artem Ervits @Vladimir Zlatkin I have created an article for this solution. Please see https://community.hortonworks.com/articles/11852/ambari-api-run-all-service-checks-bulk.html I have also added 9 more services to the payload, which should cover almost every service in the cluster now.
01-26-2016
09:10 PM
15 Kudos
In order to check the status and stability of your cluster, it makes sense to run the service checks that are included in Ambari. Usually each Ambari service provides its own service check, but there might be services that do not include a service check at all. To run a service check, select the service (e.g. HDFS) in Ambari and click "Run Service Check" in the "Actions" dropdown menu.
Service checks can be started via the Ambari API, and it is also possible to start all available service checks with a single API command. To bulk-run these checks it is necessary to use the same API/method that is used to trigger a rolling restart of DataNodes (request_schedules). The "request_schedules" API starts all defined commands in the specified order; it is even possible to specify a pause between the commands.
Available Service Checks:
| Service Name | service_name | Command |
|---|---|---|
| HDFS | HDFS | HDFS_SERVICE_CHECK |
| YARN | YARN | YARN_SERVICE_CHECK |
| MapReduce2 | MAPREDUCE2 | MAPREDUCE2_SERVICE_CHECK |
| HBase | HBASE | HBASE_SERVICE_CHECK |
| Hive | HIVE | HIVE_SERVICE_CHECK |
| WebHCat | WEBHCAT | WEBHCAT_SERVICE_CHECK |
| Pig | PIG | PIG_SERVICE_CHECK |
| Falcon | FALCON | FALCON_SERVICE_CHECK |
| Storm | STORM | STORM_SERVICE_CHECK |
| Oozie | OOZIE | OOZIE_SERVICE_CHECK |
| ZooKeeper | ZOOKEEPER | ZOOKEEPER_QUORUM_SERVICE_CHECK |
| Tez | TEZ | TEZ_SERVICE_CHECK |
| Sqoop | SQOOP | SQOOP_SERVICE_CHECK |
| Ambari Metrics | AMBARI_METRICS | AMBARI_METRICS_SERVICE_CHECK |
| Atlas | ATLAS | ATLAS_SERVICE_CHECK |
| Kafka | KAFKA | KAFKA_SERVICE_CHECK |
| Knox | KNOX | KNOX_SERVICE_CHECK |
| Spark | SPARK | SPARK_SERVICE_CHECK |
| SmartSense | SMARTSENSE | SMARTSENSE_SERVICE_CHECK |
| Ranger | RANGER | RANGER_SERVICE_CHECK |
Note: Make sure you replace <user>, <password>, <clustername> and <ambari-server> with the actual values.
Start a single service check via the Ambari API (e.g. HDFS Service Check):
curl -ivk -H "X-Requested-By: ambari" -u <user>:<password> -X POST -d @payload http://<ambari-server>:8080/api/v1/clusters/<clustername>/requests
Payload:
{
"RequestInfo":{
"context":"HDFS Service Check",
"command":"HDFS_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"HDFS"
}
]
}
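The POST returns a request id and href. To follow the check's progress you can poll that request (a hedged sketch; the id 42 and the fields selection are illustrative):
curl -ik -u <user>:<password> "http://<ambari-server>:8080/api/v1/clusters/<clustername>/requests/42?fields=Requests/request_status,Requests/progress_percent"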
Start bulk service checks via the Ambari API (e.g. HDFS, YARN, MapReduce2 Service Checks):
curl -ivk -H "X-Requested-By: ambari" -u <user>:<password> -X POST -d @payload http://<ambari-server>:8080/api/v1/clusters/<clustername>/request_schedules
Payload:
[
{
"RequestSchedule":{
"batch":[
{
"requests":[
{
"order_id":1,
"type":"POST",
"uri":"/api/v1/clusters/<clustername>/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"HDFS Service Check (batch 1 of 3)",
"command":"HDFS_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"HDFS"
}
]
}
},
{
"order_id":2,
"type":"POST",
"uri":"/api/v1/clusters/<clustername>/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"YARN Service Check (batch 2 of 3)",
"command":"YARN_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"YARN"
}
]
}
},
{
"order_id":3,
"type":"POST",
"uri":"/api/v1/clusters/<clustername>/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"MapReduce Service Check (batch 3 of 3)",
"command":"MAPREDUCE2_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"MAPREDUCE2"
}
]
}
}
]
},
{
"batch_settings":{
"batch_separation_in_seconds":1,
"task_failure_tolerance":1
}
}
]
}
}
]
This is returned by the API:
{
"resources" : [
{
"href" : "http://<ambari-server>:8080/api/v1/clusters/<clustername>/request_schedules/68",
"RequestSchedule" : {
"id" : 68
}
}
]
}
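You can poll the returned href to follow the schedule's progress (a hedged sketch; the id 68 comes from the response above):
curl -ik -u <user>:<password> "http://<ambari-server>:8080/api/v1/clusters/<clustername>/request_schedules/68"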
This is what it looks like in Ambari (screenshot not reproduced here).
Payload to run all Service Checks
Please see this gist:
https://gist.github.com/mr-jstraub/0b55de318eeae6695c3f#payload-to-run-all-service-checks
01-26-2016
05:56 PM
1 Kudo
Just to avoid confusion: there is no Secondary NameNode; there will only be Namenode-1 and Namenode-2, and one of these two NameNodes is always the active one while the other is a standby NN. During the blueprint rollout, Ambari will execute several steps (see here) to initialize the JournalNodes, format the ZK znode, and distribute all the metadata (fsimage, etc.). These steps are only executed once. If you restart an active NameNode, it will transition to standby first and make the other NN the active one; once that's done it restarts, so the metadata is not reinitialized.
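To verify which NameNode is currently active or standby, you can use hdfs haadmin (a minimal sketch; "nn1"/"nn2" are the NameNode service IDs from dfs.ha.namenodes.<nameservice> in hdfs-site.xml and may differ in your cluster):
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2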
01-26-2016
04:48 PM
2 Kudos
If you install HDFS HA via blueprint, you don't have to initialize anything manually afterwards; it is all done during the blueprint rollout. You can monitor the status of Ambari agents or hosts via http://<ambari-server>:8080/api/v1/hosts/<hostname>. This will return a lot of information about the host, e.g. disk info, running services (incl. process id), last heartbeat of the Ambari agent, health status, etc.:
{
"href" : "http://example.com:8080/api/v1/hosts/horton01.example.com",
"Hosts" : {
"cluster_name" : "bigdata",
"cpu_count" : 2,
"desired_configs" : null,
"disk_info" : [
{
"available" : "5922140",
"device" : "/dev/vda1",
"used" : "13670764",
"percent" : "70%",
"size" : "20641404",
"type" : "ext3",
"mountpoint" : "/"
...
],
"host_health_report" : "",
"host_name" : "horton01.cloud.hortonworks.com",
"host_state" : "HEALTHY",
"host_status" : "HEALTHY",
"ip" : "172.24.68.17",
....
....
"agentTimeStampAtReporting" : 1453826633797,
"serverTimeStampAtReporting" : 1453826633829,
"liveServices" : [
...
]
},
"umask" : 18,
....
},
"last_heartbeat_time" : 1453826643874,
"last_registration_time" : 1452849291890,
"os_arch" : "x86_64",
"os_family" : "redhat6",
"os_type" : "centos6",
"ph_cpu_count" : 2,
"public_host_name" : "horton01.example.com",
"rack_info" : "/14",
"recovery_report" : {
"summary" : "DISABLED",
"component_reports" : [ ]
},
"recovery_summary" : "DISABLED",
"total_mem" : 7543576
},
"alerts_summary" : {
"CRITICAL" : 0,
"MAINTENANCE" : 0,
"OK" : 18,
"UNKNOWN" : 0,
"WARNING" : 1
}
}
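A hedged sketch of the corresponding call (credentials and hostname are placeholders):
curl -ik -u <user>:<password> "http://<ambari-server>:8080/api/v1/hosts/<hostname>"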
01-26-2016
05:57 AM
1 Kudo
What is the size of the dataset you are trying to process? Could you please paste your YARN/MapReduce container configurations as well as your hardware specs?
01-23-2016
08:42 AM
7 Kudos
@Vladimir Zlatkin I just found a way to execute all service checks with one call 🙂
To bulk-start service checks, we have to use the same API/method that is used to trigger a rolling restart of DataNodes. The "request_schedules" API starts all defined commands in the specified order; we can even specify a pause between the commands.
Start bulk Service checks:
curl -ivk -H "X-Requested-By: ambari" -u <user>:<password> -X POST -d @payload.json http://myexample.com:8080/api/v1/clusters/bigdata/request_schedules
Payload.json
[
{
"RequestSchedule":{
"batch":[
{
"requests":[
{
"order_id":1,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"HDFS Service Check (batch 1 of 11)",
"command":"HDFS_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"HDFS"
}
]
}
},
{
"order_id":2,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"YARN Service Check (batch 2 of 11)",
"command":"YARN_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"YARN"
}
]
}
},
{
"order_id":3,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"MapReduce Service Check (batch 3 of 11)",
"command":"MAPREDUCE2_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"MAPREDUCE2"
}
]
}
},
{
"order_id":4,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"HBase Service Check (batch 4 of 11)",
"command":"HBASE_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"HBASE"
}
]
}
},
{
"order_id":5,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"Hive Service Check (batch 5 of 11)",
"command":"HIVE_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"HIVE"
}
]
}
},
{
"order_id":6,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"WebHCat Service Check (batch 6 of 11)",
"command":"WEBHCAT_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"WEBHCAT"
}
]
}
},
{
"order_id":7,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"PIG Service Check (batch 7 of 11)",
"command":"PIG_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"PIG"
}
]
}
},
{
"order_id":8,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"Falcon Service Check (batch 8 of 11)",
"command":"FALCON_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"FALCON"
}
]
}
},
{
"order_id":9,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"Storm Service Check (batch 9 of 11)",
"command":"STORM_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"STORM"
}
]
}
},
{
"order_id":10,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"Oozie Service Check (batch 10 of 11)",
"command":"OOZIE_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"OOZIE"
}
]
}
},
{
"order_id":11,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"Zookeeper Service Check (batch 11 of 11)",
"command":"ZOOKEEPER_QUORUM_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"ZOOKEEPER"
}
]
}
}
]
},
{
"batch_settings":{
"batch_separation_in_seconds":1,
"task_failure_tolerance":1
}
}
]
}
}
]
Result
This is returned by the API:
{
"resources" : [
{
"href" : "http://myexample.com:8080/api/v1/clusters/bigdata/request_schedules/68",
"RequestSchedule" : {
"id" : 68
}
}
]
}
Ambari operations (screenshot not reproduced here)
01-23-2016
07:52 AM
8 Kudos
@Neeraj Sabharwal Yes, but I am afraid not without a little bit of additional work. Maybe copying the database and adjusting some values like repo id, Ranger address, etc. is an alternative to look into (not recommended though!). Here is the API way 🙂
You can access all policies of a repository (e.g. hdfs/hadoop) by using:
http://<ranger_address>:6080/service/plugins/policies/download/<clustername>_hadoop
For example:
curl -ivk -H "Content-type:application/json" -u <user>:<password> http://<ranger_address>:6080/service/plugins/policies/download/bigdata_hadoop
This will return:
{
"serviceName":"bigdata_hadoop",
"serviceId":1,
"policyVersion":23,
"policyUpdateTime":1450245444000,
"policies":[
{
"id":2,
"guid":"1448089401967_197_71",
"isEnabled":true,
"createdBy":"Admin",
"updatedBy":"Admin",
"createTime":1448118201000,
"updateTime":1449582864000,
"version":5,
"service":"bigdata_hadoop",
"name":"Ranger_audits",
"description":"",
"resourceSignature":"6dbd7c49e533baa8082b48895acabf20",
"isAuditEnabled":false,
"resources":{
"path":{
"isRecursive":true,
"values":[
"/apps/solr/ranger_audits"
],
"isExcludes":false
}
},
"policyItems":[
{
"users":[
"solr"
],
"groups":[
],
"delegateAdmin":false,
"accesses":[
{
"isAllowed":true,
"type":"read"
},
{
"isAllowed":true,
"type":"write"
},
{
"isAllowed":true,
"type":"execute"
}
],
"conditions":[
]
}
]
},
{
...
...
}
...
],
...
...
...
}
After downloading all policies of a repo, you can use the REST calls I mentioned here => https://community.hortonworks.com/questions/10826/rest-api-url-to-configure-ranger-objects.html to recreate the policies in your other cluster. Note: Make sure the users from Cluster1 are available in Cluster2 as well, otherwise Ranger will throw an exception when you create a policy for a user that doesn't exist. That's it 🙂
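A hedged sketch of recreating one policy on the target cluster (the endpoint is the Ranger policy API; policy.json would hold a single policy object taken from the download above, with cluster-specific fields such as id, guid and service adjusted for Cluster2):
curl -ivk -H "Content-Type: application/json" -u <user>:<password> -X POST -d @policy.json http://<ranger_address_cluster2>:6080/service/plugins/policies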