Member since: 10-06-2015
273 Posts
202 Kudos Received
81 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4043 | 10-11-2017 09:33 PM |
| | 3564 | 10-11-2017 07:46 PM |
| | 2570 | 08-04-2017 01:37 PM |
| | 2210 | 08-03-2017 03:36 PM |
| | 2238 | 08-03-2017 12:52 PM |
11-01-2016
01:14 PM
The new HDP 2.5 Sandbox has been released. If you re-download it, you shouldn't face this issue anymore: http://hortonworks.com/downloads/
10-08-2016
04:18 PM
2 Kudos
Administrator Operations
The operations described in this section require superuser
privileges.
Allow Snapshots: Allowing snapshots of a directory to
be created. If the operation completes successfully, the directory becomes
snapshottable.
Command:
hdfs dfsadmin -allowSnapshot $path
Arguments:
path – The path of the snapshottable directory.
See also the corresponding Java API void
allowSnapshot(Path path) in HdfsAdmin.
Disallow Snapshots: Disallowing snapshots of a
directory to be created. All snapshots of the directory must be deleted
before disallowing snapshots.
Command:
hdfs dfsadmin -disallowSnapshot $path
Arguments:
path – The path of the snapshottable directory.
See
also the corresponding Java API void disallowSnapshot(Path
path) in HdfsAdmin.
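For example (the directory below is hypothetical), enabling and later disabling snapshots on a data directory might look like this:
hdfs dfsadmin -allowSnapshot /data/projects
hdfs dfsadmin -disallowSnapshot /data/projects
The second command fails if any snapshots of the directory still exist.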
User Operations
This section describes user operations. Note that the HDFS
superuser can perform all of these operations without satisfying the permission
requirements of the individual operations.
Create Snapshots: Create a snapshot of a
snapshottable directory. This operation requires owner privilege of the
snapshottable directory.
Command:
hdfs dfs -createSnapshot $path $snapshotName
Arguments:
path
The path of the snapshottable directory.
snapshotName
The snapshot name, which is an optional argument. When it
is omitted, a default name is generated using a timestamp with the
format "'s'yyyyMMdd-HHmmss.SSS", e.g.
"s20130412-151029.033".
See also the corresponding Java API Path
createSnapshot(Path path) and Path createSnapshot(Path path,
String snapshotName) in FileSystem. The snapshot path is
returned in these methods.
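For example (hypothetical path), taking a named snapshot and a default-named snapshot of the same directory:
hdfs dfs -createSnapshot /data/projects before-cleanup
hdfs dfs -createSnapshot /data/projects
The second call generates a timestamp-based name as described above, and the snapshot contents become readable under /data/projects/.snapshot/<snapshotName>.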
Delete Snapshots: Delete a snapshot from a
snapshottable directory. This operation requires owner privilege of the
snapshottable directory.
Command:
hdfs dfs -deleteSnapshot $path $snapshotName
Arguments:
path
The path of the snapshottable directory.
snapshotName
The snapshot name.
See also the corresponding Java API void
deleteSnapshot(Path path, String snapshotName) in FileSystem.
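For example, removing the hypothetical snapshot created above:
hdfs dfs -deleteSnapshot /data/projects before-cleanup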
Rename Snapshots: Rename a snapshot. This operation
requires owner privilege of the snapshottable directory.
Command:
hdfs dfs -renameSnapshot $path $oldName $newName
Arguments:
path
The path of the snapshottable directory.
oldName
The old snapshot name.
newName
The new snapshot name.
See also the corresponding Java API void
renameSnapshot(Path path, String oldName, String newName) in FileSystem.
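For example, with the hypothetical directory and snapshot used above:
hdfs dfs -renameSnapshot /data/projects before-cleanup before-cleanup-2016Q3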
Get Snapshottable Directory Listing: Get all the
snapshottable directories where the current user has permission to take
snapshots.
Command:
hdfs lsSnapshottableDir
Arguments: none.
See also the corresponding Java
API SnapshottableDirectoryStatus[]
getSnapshottableDirectoryListing() in DistributedFileSystem.
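For example, run as a regular user (no arguments are needed):
hdfs lsSnapshottableDir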
Get Snapshots Difference Report: Get the differences
between two snapshots. This operation requires read access privilege for
all files/directories in both snapshots.
Command:
hdfs snapshotDiff $path $fromSnapshot $toSnapshot
Arguments:
path
The path of the snapshottable directory.
fromSnapshot
The name of the starting snapshot.
toSnapshot
The name of the ending snapshot.
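For example (hypothetical directory and snapshot names):
hdfs snapshotDiff /data/projects before-cleanup after-cleanup
In the resulting report, entries are prefixed with + (file/directory created), - (deleted), M (modified), and R (renamed).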
**See Also**
HDFS Snapshots - 1) Overview
10-08-2016
04:18 PM
4 Kudos
HDFS Snapshots Overview
HDFS Snapshots are read-only point-in-time copies of the
file system. Snapshots can be taken on a subtree of the file system or the
entire file system. Some common use cases of snapshots are data backup,
protection against user errors and disaster recovery.
The implementation of HDFS Snapshots is efficient in the following ways:
1) Snapshot creation is instantaneous. The cost is O(1), excluding the inode lookup time.
2) Additional memory is used only when modifications are made relative to a snapshot. Memory usage is O(M), where M is the number of modified files/directories.
3) Blocks in DataNodes are not copied. The snapshot files record the block list and the file size.
4) Snapshots do not adversely affect regular HDFS operations; there is only a minor performance impact when accessing snapshotted data, depending on the number of modifications. Modifications are recorded in reverse chronological order so that the current data can be accessed directly, and the snapshot data is computed by subtracting the modifications from the current data (snapshot data = current data - modifications).
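As a small illustration (the paths below are hypothetical), once a directory /data/projects is snapshottable and a snapshot named s0 has been taken, the snapshot copy of a file can be read directly through the hidden .snapshot path, without any blocks having been duplicated on the DataNodes:
hdfs dfs -createSnapshot /data/projects s0
hdfs dfs -cat /data/projects/.snapshot/s0/file1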
**See Also**
HDFS Snapshots - 2) Operations
03-09-2017
06:30 PM
@Eyad Garelnabi: There is a document called "Ambari Upgrade Best Practices," http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_ambari-upgrade-best-practices/content/index.html, that was developed with folks in the field, based on the issues that many of our customers face. It's a relatively new document, and I hope you will find it useful.
10-08-2016
04:17 PM
1 Kudo
Do the Upgrade Preparation
● For each target cluster, find out as many details as
possible by reviewing the following:
● Hardware configuration, operating systems and network
topology
● Current deployment, cluster topology, configuration of
each component
● Security configuration
● User access and management
● Current data ingestion process (if applicable)
● Current running applications and clients connecting
into the cluster
● Find the applicable Upgrade Guide from
docs.hortonworks.com
● Select validation applications
● Prepare an Upgrade Log to keep track of any upgrade
issues and their workarounds
● [Optional but recommended] Prepare one or more Lab
(virtual) clusters and install the
current HDP stack and Ambari. Use
these clusters for mock upgrades and rollbacks to troubleshoot any upgrade
issues.
Upgrading Clusters
● To upgrade a single cluster use the Upgrade Procedure
given below.
● [Optional but recommended] Mock (lab) cluster upgrade:
Attempt an upgrade on a Lab cluster. Some steps of the Upgrade Procedure can be
skipped in order to concentrate on critical parts.
● Test (Dev) cluster upgrade: Before upgrading an important
production cluster, it is strongly recommended to attempt the upgrade first on a
test cluster (e.g., Dev) similar to the production cluster: running the current
versions of HDP and Ambari, with topology, components, and configuration similar
to the production cluster, but on a smaller number of nodes.
● Log every issue encountered during lab and test
upgrades, along with its workaround, so as to minimize any downtime during the main
cluster upgrade.
● Main (Production) cluster upgrade
● Book the upgrade date and time in advance
● Estimate cluster downtime based on the results of the test
upgrade. Note that, regardless of the preparation and any test upgrades, some new
issues may appear.
● Inform all interested parties
● Confirm that Support is on standby
● Do the upgrade
A Single Cluster Upgrade Procedure
Prepare the Cluster for the Upgrade
● Run identified validation applications before the upgrade,
and record results and execution times for each of them
● Get ready for the upgrade: Correct any errors and/or
alerts and warnings on the cluster
● Check the state of the HDFS filesystem and finalize it
if not already finalized
● Capture the HDFS status and save the HDFS namespace
● Back up NameNode metadata and all DBs supporting the
cluster (Ambari, Hive metastore, Oozie, Ranger, Hue); a sketch of these HDFS steps follows below
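As a minimal sketch only (run as the hdfs user; the metadata and backup directories shown are hypothetical and should be taken from dfs.namenode.name.dir and your own backup location), the HDFS-related preparation steps above could look like:
hdfs fsck / -files -blocks > /tmp/fsck-before-upgrade.txt        # check filesystem health
hdfs dfsadmin -report > /tmp/dfsadmin-report-before-upgrade.txt  # capture cluster status
hdfs dfsadmin -finalizeUpgrade                                   # finalize any previous, still-pending upgrade
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace                                     # save the namespace (requires safe mode)
hdfs dfsadmin -safemode leave
cp -r /hadoop/hdfs/namenode /backup/namenode-before-upgrade      # back up the NameNode metadata directory (hypothetical paths)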
Perform Upgrade
● Execute the cluster upgrade using the official HDP upgrade
document
● Review new properties; in particular, pay attention to
changed property values, changed property names, and new meanings of existing
properties (if any)
Post Upgrade Validation
● Run the Smoke test for each service and troubleshoot
any issues
● Run validation applications after the upgrade and record results and execution
times
● If any validation application is failing or execution
times are much longer than before the upgrade, review and adjust cluster
properties, repeating validation applications until they are stable and don’t
run slower than before the upgrade
● Record in the Upgrade Log any issues encountered and their
workarounds.
Final Steps
● Install new HDP components not used before the upgrade
(if any), run the smoke test for each of them, and troubleshoot any issues
● Finalize HDFS upgrade
● Configure HA of selected components (like NN, RM,
HiveServer2, HBase, Oozie)
● Perform Ambari takeover of HDP components not previously
managed by Ambari
● Enable Kerberos security: the KDC and existing
principals and keytabs can be reused; add principals for any new components
● LDAP integration (Ambari, KDC, Ranger)
**See Also**
HDP Upgrade Best Practices - 1) Plan and Assess
HDP Upgrade Best Practices - 3) Documentation and Learnings
10-08-2016
04:17 PM
1 Kudo
Plan and Assess
This is purely a planning step. The expected deliverable is an
Upgrade Plan.
Gather all details about the existing environment to plan the upgrade path and the
associated upgrade tasks.
1) Determine Upgrade Path
Based on the current and target versions of the HDP stack, and on
whether Ambari is used or not, select the supported upgrade guide from the
Hortonworks documentation site. Identify key requirements, such as whether NameNode HA
(or other HA) or security needs to be disabled during the upgrade.
Current version:
● HDP Stack version
● Ambari version (if Ambari is used)
● OS Version
Target version:
● HDP Stack version
● Ambari version (if Ambari is used)
Below are some useful links:
HDP Stacks Managed by Different Ambari Versions:
https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.0.0/bk_ambari-installation/content/determine_stack_compatibility.html
Upgrading to Ambari 2.4:
https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.0.1/bk_ambari-upgrade/content/upgrading_ambari.html
Upgrading HDP Using Ambari:
https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.0.1/bk_ambari-upgrade/content/upgrading_hdp_stack.html
Upgrading HDP Manually (without Ambari):
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_command-line-upgrade/content/ch_upgrade_2_4.html
2) Review Known Issues in Target Version Release
Review the following items:
● Behavioral changes that will affect applications
● Unsupported features
● Known Issues
● New features added to release
HDP 2.5 Release Notes:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_release-notes/content/ch_relnotes_v250.html
HDP 2.5 Known Issues:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_release-notes/content/known_issues.html
3) Select Validation Applications
Select two groups of validation applications.
First group: Industrial benchmarks like Teragen &
Terasort, TestDFSIO, Hive TPC-DS, and HBase performance tests. At a minimum,
use Teragen & Terasort with multiple mappers for Teragen and multiple
reducers for Terasort (see the example run below).
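As a minimal sketch (the data size, task counts, and examples-jar location are assumptions that will vary per cluster), a Teragen & Terasort validation run might look like:
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar teragen -Dmapreduce.job.maps=20 100000000 /tmp/teragen
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar terasort -Dmapreduce.job.reduces=20 /tmp/teragen /tmp/terasort
Record the execution time of each job so that post-upgrade runs can be compared against it.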
Second group (optional): User-defined validation
applications. Identify representative applications (together with the input
data) that are used most often. Be sure to include at least one for
every Hadoop component in use, such as MapReduce, Hive, Pig, HBase, Oozie, Storm,
Kafka, and others.
4) Finalize Project Management Items
● Scope: Identify clusters to be upgraded and components to upgrade and newly install (if any).
● HR: Staff the upgrade teams. Also, some validation applications can be run by the developers themselves.
● Time: Identify upgrade tasks, the timeline, and task owners.
● QA: Carefully identify validation tasks.
● Risk: Estimate downtime for each cluster upgrade.
● Resources: Prepare the cluster on which the upgrade will be tested (e.g., Dev). When upgrading production clusters, it is strongly recommended to attempt the upgrade first on a test cluster.
**See Also**
HDP Upgrade Best Practices - 2) Do the Upgrade
HDP Upgrade Best Practices - 3) Documentation and Learnings
04-26-2017
06:07 PM
Hey, we are using HDP 2.3.6. We are getting the below error after configuring Ranger to store audits in Solr.
2017-04-25 09:16:23,366 WARN [org.apache.ranger.audit.queue.AuditBatchQueue0]: provider.BaseAuditHandler (BaseAuditHandler.java:logFailedEvent(374)) - failed to log audit event: {"repoType":3,"repo":"hdpt01_hive","reqUser":"hadooptest","evtTime":"2017-04-25 09:16:21.124","access":"USE","resType":"@null","action":"SHOWDATABASES","result":1,"policy":6,"enforcer":"ranger-acl","sess":"06802e00-eda7-4bd2-a812-7e2ed2621e24","cliType":"HIVESERVER2","cliIP":"","reqData":"show schemas","agentHost":"hivehost","logType":"RangerAudit","id":"d8b3d307-0035-4613-a7ff-872fa1c46a9e","seq_num":0,"event_count":1,"event_dur_ms":0}
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://hostname:8886/solr/ranger_audit: Expected mime type application/octet-stream but got text/html. <html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Error 401 Authentication required</title>
</head>
<body><h2>HTTP ERROR 401</h2>
<p>Problem accessing /solr/ranger_audit/update. Reason:
<pre> Authentication required</pre></p><hr><i><small>Powered by Jetty://</small></i><hr/>
</body>
</html>
The cluster is kerberized. The below error is seen when accessing the ambari-infra-solr UI at http://hostname:8886/solr/
GSSException: Failure unspecified at GSS-API level (Mechanism level: Specified version of key is not available (44))
10-20-2016
09:00 PM
@Eyad Garelnabi Documentation may need to be updated, but with the new kernels, swappiness does not need to be set to 0. Read this article from @emaxwell: https://community.hortonworks.com/articles/33522/swappiness-setting-recommendation.html
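For reference, and purely as an illustrative sketch (the value to use is discussed in the linked article), swappiness can be checked and adjusted like this:
cat /proc/sys/vm/swappiness                    # show the current value
sysctl -w vm.swappiness=1                      # set it for the running kernel (as root)
echo "vm.swappiness=1" >> /etc/sysctl.conf     # persist the setting across reboots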
09-01-2016
02:30 PM
1 Kudo
No, one can only whitelist the hosts from which the proxy user may connect and the groups whose members it may impersonate. You may want to define a new group in this case.
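As an illustrative sketch (these are the standard Hadoop proxy-user properties in core-site.xml; the user, host, and group names below are hypothetical):
hadoop.proxyuser.myproxy.hosts=gateway1.example.com,gateway2.example.com
hadoop.proxyuser.myproxy.groups=etl-users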
09-02-2016
10:31 AM
Hi, Please try to download it with: curl -O -k https://public-repo-1.hortonworks.com/HDP/cloudbreak/cloudbreak-2016-07-06-12-51.img
Regarding the import process, you can find further instructions here (it was written for 1.3.0 but is still valid; just replace the image name): http://sequenceiq.com/cloudbreak-docs/latest/openstack/ Attila