Member since: 09-15-2015
Posts: 457
Kudos Received: 507
Solutions: 90
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 14016 | 11-01-2016 08:16 AM |
| | 9729 | 11-01-2016 07:45 AM |
| | 6171 | 10-25-2016 09:50 AM |
| | 1485 | 10-21-2016 03:50 AM |
| | 2797 | 10-14-2016 03:12 PM |
01-27-2016
07:20 PM
1 Kudo
It looks like your Metrics Collector cannot start because the HBase Master is not coming up. Just to make sure: you are running a non-kerberized environment with a single NameNode, right? Are you using Metrics in distributed or embedded mode? Could you please validate and post this configuration => hbase.rootdir
Your HBase Master log files show a connection refused error when the HBase Master tries to connect to the NameNode:
2016-01-27 18:36:57,189 FATAL [hdp1n3:61300.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
java.net.ConnectException: Call From hdp1n3/XXXXXXXXX to hdp1n3.aye1vpcdev:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1431)
...
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
...
Is "hdp1n3.aye1vpcdev:8020" the hostname of your NameNode? Can you access HDFS from the Metrics Collector node?
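One quick way to check both from the Metrics Collector node (a minimal sketch; the AMS config file path is an assumption and may differ per install):
# Is the NameNode RPC port from the stack trace reachable from this node?
nc -zv hdp1n3.aye1vpcdev 8020
# Does HDFS answer from this node?
hdfs dfs -ls /
# Inspect the hbase.rootdir value in the AMS HBase config
# (path is an assumption; adjust to your install)
grep -A1 "hbase.rootdir" /etc/ams-hbase/conf/hbase-site.xml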
01-27-2016
11:36 AM
1 Kudo
This is a long shot, but I had some trouble with Parquet and Hive in the past, and one change that fixed my problem was switching to ORC. The new Spark version does support ORC files, and Hive is optimized for ORC. Could you save your data as ORC and run your Spark SQL again? df.write.format("orc")...
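For example, a minimal sketch via spark-shell (table and path names are hypothetical placeholders; assumes a Hive-enabled sqlContext and the Spark 1.4+ write API):
spark-shell <<'EOF'
// Read the existing table, rewrite it as ORC, then query the ORC copy
val df = sqlContext.table("my_table")
df.write.format("orc").save("/tmp/my_table_orc")
val orcDf = sqlContext.read.format("orc").load("/tmp/my_table_orc")
orcDf.registerTempTable("my_table_orc")
sqlContext.sql("SELECT COUNT(*) FROM my_table_orc").show()
EOF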
01-27-2016
09:50 AM
1 Kudo
What privileges did you grant? I think something is missing from your post.
01-26-2016
09:22 PM
@Artem Ervits @Vladimir Zlatkin I have created an article for this solution. Please see https://community.hortonworks.com/articles/11852/ambari-api-run-all-service-checks-bulk.html I have also added 9 more services to the payload, which should cover almost every service in the cluster now.
01-26-2016
09:10 PM
15 Kudos
In order to check the status and stability of your cluster, it makes sense to run the service checks that are included in Ambari. Usually each Ambari service provides its own service check, but there might be services that do not include a service check at all. To run a service check, select the service (e.g. HDFS) in Ambari and click "Run Service Check" in the "Actions" dropdown menu.
Service checks can be started via the Ambari API, and it is also possible to start all available service checks with a single API command. To bulk-run these checks it is necessary to use the same API/method that is used to trigger a rolling restart of DataNodes (request_schedules). The "request_schedules" API starts all defined commands in the specified order; it is even possible to specify a pause between the commands.
Available Service Checks:
| Service Name | service_name | Command |
|---|---|---|
| HDFS | HDFS | HDFS_SERVICE_CHECK |
| YARN | YARN | YARN_SERVICE_CHECK |
| MapReduce2 | MAPREDUCE2 | MAPREDUCE2_SERVICE_CHECK |
| HBase | HBASE | HBASE_SERVICE_CHECK |
| Hive | HIVE | HIVE_SERVICE_CHECK |
| WebHCat | WEBHCAT | WEBHCAT_SERVICE_CHECK |
| Pig | PIG | PIG_SERVICE_CHECK |
| Falcon | FALCON | FALCON_SERVICE_CHECK |
| Storm | STORM | STORM_SERVICE_CHECK |
| Oozie | OOZIE | OOZIE_SERVICE_CHECK |
| ZooKeeper | ZOOKEEPER | ZOOKEEPER_QUORUM_SERVICE_CHECK |
| Tez | TEZ | TEZ_SERVICE_CHECK |
| Sqoop | SQOOP | SQOOP_SERVICE_CHECK |
| Ambari Metrics | AMBARI_METRICS | AMBARI_METRICS_SERVICE_CHECK |
| Atlas | ATLAS | ATLAS_SERVICE_CHECK |
| Kafka | KAFKA | KAFKA_SERVICE_CHECK |
| Knox | KNOX | KNOX_SERVICE_CHECK |
| Spark | SPARK | SPARK_SERVICE_CHECK |
| SmartSense | SMARTSENSE | SMARTSENSE_SERVICE_CHECK |
| Ranger | RANGER | RANGER_SERVICE_CHECK |
Note: Make sure you replace <user>, <password>, <clustername> and <ambari-server> with the actual values.
Start a single service check via the Ambari API (e.g. HDFS Service Check):
curl -ivk -H "X-Requested-By: ambari" -u <user>:<password> -X POST -d @payload http://<ambari-server>:8080/api/v1/clusters/<clustername>/requests
Payload:
{
"RequestInfo":{
"context":"HDFS Service Check",
"command":"HDFS_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"HDFS"
}
]
}
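The POST returns a request id and href. To follow the check's progress you can poll that request (a hedged sketch; the id 42 and the fields selection are illustrative):
curl -ik -u <user>:<password> "http://<ambari-server>:8080/api/v1/clusters/<clustername>/requests/42?fields=Requests/request_status,Requests/progress_percent"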
Start bulk service checks via the Ambari API (e.g. HDFS, YARN, MapReduce2 Service Checks):
curl -ivk -H "X-Requested-By: ambari" -u <user>:<password> -X POST -d @payload http://<ambari-server>:8080/api/v1/clusters/<clustername>/request_schedules
Payload:
[
{
"RequestSchedule":{
"batch":[
{
"requests":[
{
"order_id":1,
"type":"POST",
"uri":"/api/v1/clusters/<clustername>/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"HDFS Service Check (batch 1 of 3)",
"command":"HDFS_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"HDFS"
}
]
}
},
{
"order_id":2,
"type":"POST",
"uri":"/api/v1/clusters/<clustername>/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"YARN Service Check (batch 2 of 3)",
"command":"YARN_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"YARN"
}
]
}
},
{
"order_id":3,
"type":"POST",
"uri":"/api/v1/clusters/<clustername>/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"MapReduce Service Check (batch 3 of 3)",
"command":"MAPREDUCE2_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"MAPREDUCE2"
}
]
}
}
]
},
{
"batch_settings":{
"batch_separation_in_seconds":1,
"task_failure_tolerance":1
}
}
]
}
}
]
This is returned by the API:
{
"resources" : [
{
"href" : "http://<ambari-server>:8080/api/v1/clusters/<clustername>/request_schedules/68",
"RequestSchedule" : {
"id" : 68
}
}
]
}
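You can poll the returned href to follow the schedule's progress (a hedged sketch; the id 68 comes from the response above):
curl -ik -u <user>:<password> "http://<ambari-server>:8080/api/v1/clusters/<clustername>/request_schedules/68"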
This is what it looks like in Ambari (screenshot not reproduced here).
Payload to run all Service Checks
Please see this gist:
https://gist.github.com/mr-jstraub/0b55de318eeae6695c3f#payload-to-run-all-service-checks
01-26-2016
05:56 PM
1 Kudo
Just to avoid confusion: there is no Secondary NameNode; there will only be Namenode-1 and Namenode-2, and one of these two NameNodes is always the active one while the other is a standby NN. During the blueprint rollout, Ambari will execute several steps (see here) to initialize the JournalNodes, format the ZK znode, and distribute all the metadata (fsimage, etc.). These steps are only executed once. If you restart an active NameNode, it will transition to standby first and make the other NN the active one; once that's done it restarts, so the metadata is not reinitialized.
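To verify which NameNode is currently active or standby, you can use hdfs haadmin (a minimal sketch; "nn1"/"nn2" are the NameNode service IDs from dfs.ha.namenodes.<nameservice> in hdfs-site.xml and may differ in your cluster):
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2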
01-26-2016
04:48 PM
2 Kudos
If you install HDFS HA via blueprint, you don't have to initialize anything manually afterwards; it is all done during the blueprint rollout. You can monitor the status of Ambari agents or hosts via http://<ambari-server>:8080/api/v1/hosts/<hostname>. This will return a lot of information about the host, e.g. disk info, running services (incl. process id), last heartbeat of the Ambari agent, health status, etc.:
{
"href" : "http://example.com:8080/api/v1/hosts/horton01.example.com",
"Hosts" : {
"cluster_name" : "bigdata",
"cpu_count" : 2,
"desired_configs" : null,
"disk_info" : [
{
"available" : "5922140",
"device" : "/dev/vda1",
"used" : "13670764",
"percent" : "70%",
"size" : "20641404",
"type" : "ext3",
"mountpoint" : "/"
...
],
"host_health_report" : "",
"host_name" : "horton01.cloud.hortonworks.com",
"host_state" : "HEALTHY",
"host_status" : "HEALTHY",
"ip" : "172.24.68.17",
....
....
"agentTimeStampAtReporting" : 1453826633797,
"serverTimeStampAtReporting" : 1453826633829,
"liveServices" : [
...
]
},
"umask" : 18,
....
},
"last_heartbeat_time" : 1453826643874,
"last_registration_time" : 1452849291890,
"os_arch" : "x86_64",
"os_family" : "redhat6",
"os_type" : "centos6",
"ph_cpu_count" : 2,
"public_host_name" : "horton01.example.com",
"rack_info" : "/14",
"recovery_report" : {
"summary" : "DISABLED",
"component_reports" : [ ]
},
"recovery_summary" : "DISABLED",
"total_mem" : 7543576
},
"alerts_summary" : {
"CRITICAL" : 0,
"MAINTENANCE" : 0,
"OK" : 18,
"UNKNOWN" : 0,
"WARNING" : 1
}
}
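A hedged sketch of the corresponding call (credentials and hostname are placeholders):
curl -ik -u <user>:<password> "http://<ambari-server>:8080/api/v1/hosts/<hostname>"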
01-26-2016
05:57 AM
1 Kudo
What is the size of the dataset you are trying to process? Could you please paste your YARN/MapReduce container configurations as well as your hardware specs?
01-23-2016
08:42 AM
7 Kudos
@Vladimir Zlatkin I just found a way to execute all service checks with one call 🙂
To bulk-start service checks, we have to use the same API/method that is used to trigger a rolling restart of DataNodes. The "request_schedules" API starts all defined commands in the specified order; we can even specify a pause between the commands.
Start bulk Service checks:
curl -ivk -H "X-Requested-By: ambari" -u <user>:<password> -X POST -d @payload.json http://myexample.com:8080/api/v1/clusters/bigdata/request_schedules
Payload.json
[
{
"RequestSchedule":{
"batch":[
{
"requests":[
{
"order_id":1,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"HDFS Service Check (batch 1 of 11)",
"command":"HDFS_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"HDFS"
}
]
}
},
{
"order_id":2,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"YARN Service Check (batch 2 of 11)",
"command":"YARN_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"YARN"
}
]
}
},
{
"order_id":3,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"MapReduce Service Check (batch 3 of 11)",
"command":"MAPREDUCE2_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"MAPREDUCE2"
}
]
}
},
{
"order_id":4,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"HBase Service Check (batch 4 of 11)",
"command":"HBASE_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"HBASE"
}
]
}
},
{
"order_id":5,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"Hive Service Check (batch 5 of 11)",
"command":"HIVE_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"HIVE"
}
]
}
},
{
"order_id":6,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"WebHCat Service Check (batch 6 of 11)",
"command":"WEBHCAT_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"WEBHCAT"
}
]
}
},
{
"order_id":7,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"PIG Service Check (batch 7 of 11)",
"command":"PIG_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"PIG"
}
]
}
},
{
"order_id":8,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"Falcon Service Check (batch 8 of 11)",
"command":"FALCON_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"FALCON"
}
]
}
},
{
"order_id":9,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"Storm Service Check (batch 9 of 11)",
"command":"STORM_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"STORM"
}
]
}
},
{
"order_id":10,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"Oozie Service Check (batch 10 of 11)",
"command":"OOZIE_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"OOZIE"
}
]
}
},
{
"order_id":11,
"type":"POST",
"uri":"/api/v1/clusters/bigdata/requests",
"RequestBodyInfo":{
"RequestInfo":{
"context":"Zookeeper Service Check (batch 11 of 11)",
"command":"ZOOKEEPER_QUORUM_SERVICE_CHECK"
},
"Requests/resource_filters":[
{
"service_name":"ZOOKEEPER"
}
]
}
}
]
},
{
"batch_settings":{
"batch_separation_in_seconds":1,
"task_failure_tolerance":1
}
}
]
}
}
]
Result
This is returned by the API:
{
"resources" : [
{
"href" : "http://myexample.com:8080/api/v1/clusters/bigdata/request_schedules/68",
"RequestSchedule" : {
"id" : 68
}
}
]
}
Ambari operations (screenshot not reproduced here)
01-23-2016
07:52 AM
8 Kudos
@Neeraj Sabharwal Yes, but I am afraid not without a little bit of additional work. Maybe copying the database and adjusting some values like repo id, Ranger address, etc. is an alternative to look into (not recommended though!). Here is the API way 🙂
You can access all policies of a repository (e.g. hdfs/hadoop) by using:
http://<ranger_address>:6080/service/plugins/policies/download/<clustername>_hadoop
For example:
curl -ivk -H "Content-type:application/json" -u <user>:<password> http://<ranger_address>:6080/service/plugins/policies/download/bigdata_hadoop
This will return:
{
"serviceName":"bigdata_hadoop",
"serviceId":1,
"policyVersion":23,
"policyUpdateTime":1450245444000,
"policies":[
{
"id":2,
"guid":"1448089401967_197_71",
"isEnabled":true,
"createdBy":"Admin",
"updatedBy":"Admin",
"createTime":1448118201000,
"updateTime":1449582864000,
"version":5,
"service":"bigdata_hadoop",
"name":"Ranger_audits",
"description":"",
"resourceSignature":"6dbd7c49e533baa8082b48895acabf20",
"isAuditEnabled":false,
"resources":{
"path":{
"isRecursive":true,
"values":[
"/apps/solr/ranger_audits"
],
"isExcludes":false
}
},
"policyItems":[
{
"users":[
"solr"
],
"groups":[
],
"delegateAdmin":false,
"accesses":[
{
"isAllowed":true,
"type":"read"
},
{
"isAllowed":true,
"type":"write"
},
{
"isAllowed":true,
"type":"execute"
}
],
"conditions":[
]
}
]
},
{
...
...
}
...
],
...
...
...
}
After downloading all policies of a repo, you can use the REST calls I mentioned here => https://community.hortonworks.com/questions/10826/rest-api-url-to-configure-ranger-objects.html to recreate the policies in your other cluster. Note: Make sure the users from Cluster1 are available in Cluster2 as well, otherwise Ranger will throw an exception when you create a policy for a user that doesn't exist. That's it 🙂
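A hedged sketch of recreating one policy on the target cluster (the endpoint is the Ranger policy API; policy.json would hold a single policy object taken from the download above, with cluster-specific fields such as id, guid and service adjusted for Cluster2):
curl -ivk -H "Content-Type: application/json" -u <user>:<password> -X POST -d @policy.json http://<ranger_address_cluster2>:6080/service/plugins/policies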