Member since: 10-06-2015
273 Posts
202 Kudos Received
81 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4043 | 10-11-2017 09:33 PM |
| | 3564 | 10-11-2017 07:46 PM |
| | 2570 | 08-04-2017 01:37 PM |
| | 2210 | 08-03-2017 03:36 PM |
| | 2238 | 08-03-2017 12:52 PM |
11-01-2016
01:14 PM
The new HDP 2.5 Sandbox has been released. If you re-download it, you shouldn't face this issue anymore: http://hortonworks.com/downloads/
10-08-2016
04:18 PM
2 Kudos
Administrator Operations
The operations described in this section require superuser
privileges.
Allow Snapshots: Allowing snapshots of a directory to
be created. If the operation completes successfully, the directory becomes
snapshottable.
Command:
hdfs dfsadmin -allowSnapshot $path
Arguments:
path – The path of the snapshottable directory.
See also the corresponding Java API void
allowSnapshot(Path path) in HdfsAdmin.
Disallow Snapshots: Disallowing snapshots of a
directory to be created. All snapshots of the directory must be deleted
before disallowing snapshots.
Command:
hdfs dfsadmin -disallowSnapshot $path
Arguments:
path – The path of the snapshottable directory.
See
also the corresponding Java API void disallowSnapshot(Path
path) in HdfsAdmin.
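For example (the directory below is hypothetical), enabling and later disabling snapshots on a data directory might look like this:
hdfs dfsadmin -allowSnapshot /data/projects
hdfs dfsadmin -disallowSnapshot /data/projects
The second command fails if any snapshots of the directory still exist.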
User Operations
This section describes user operations. Note that the HDFS
superuser can perform all of these operations without satisfying the permission
requirements of the individual operations.
Create Snapshots: Create a snapshot of a
snapshottable directory. This operation requires owner privilege of the
snapshottable directory.
Command:
hdfs dfs -createSnapshot $path $snapshotName
Arguments:
path
The path of the snapshottable directory.
snapshotName
The snapshot name, which is an optional argument. When it
is omitted, a default name is generated using a timestamp with the
format "'s'yyyyMMdd-HHmmss.SSS", e.g.
"s20130412-151029.033".
See also the corresponding Java API Path
createSnapshot(Path path) and Path createSnapshot(Path path,
String snapshotName) in FileSystem. The snapshot path is
returned in these methods.
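For example (hypothetical path), taking a named snapshot and a default-named snapshot of the same directory:
hdfs dfs -createSnapshot /data/projects before-cleanup
hdfs dfs -createSnapshot /data/projects
The second call generates a timestamp-based name as described above, and the snapshot contents become readable under /data/projects/.snapshot/<snapshotName>.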
Delete Snapshots: Delete a snapshot from a
snapshottable directory. This operation requires owner privilege of the
snapshottable directory.
Command:
hdfs dfs -deleteSnapshot $path $snapshotName
Arguments:
path
The path of the snapshottable directory.
snapshotName
The snapshot name.
See also the corresponding Java API void
deleteSnapshot(Path path, String snapshotName) in FileSystem.
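For example, removing the hypothetical snapshot created above:
hdfs dfs -deleteSnapshot /data/projects before-cleanup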
Rename Snapshots: Rename a snapshot. This operation
requires owner privilege of the snapshottable directory.
Command:
hdfs dfs -renameSnapshot $path $oldName $newName
Arguments:
path
The path of the snapshottable directory.
oldName
The old snapshot name.
newName
The new snapshot name.
See also the corresponding Java API void
renameSnapshot(Path path, String oldName, String newName) in FileSystem.
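For example, with the hypothetical directory and snapshot used above:
hdfs dfs -renameSnapshot /data/projects before-cleanup before-cleanup-2016Q3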
Get Snapshottable Directory Listing: Get all the
snapshottable directories where the current user has permission to take
snapshots.
Command:
hdfs lsSnapshottableDir
Arguments: none.
See also the corresponding Java
API SnapshottableDirectoryStatus[]
getSnapshottableDirectoryListing() in DistributedFileSystem.
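For example, run as a regular user (no arguments are needed):
hdfs lsSnapshottableDir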
Get Snapshots Difference Report: Get the differences
between two snapshots. This operation requires read access privilege for
all files/directories in both snapshots.
Command:
hdfs snapshotDiff $path $fromSnapshot $toSnapshot
Arguments:
path
The path of the snapshottable directory.
fromSnapshot
The name of the starting snapshot.
toSnapshot
The name of the ending snapshot.
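For example (hypothetical directory and snapshot names):
hdfs snapshotDiff /data/projects before-cleanup after-cleanup
In the resulting report, entries are prefixed with + (file/directory created), - (deleted), M (modified), and R (renamed).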
**See Also**
HDFS Snapshots - 1) Overview
10-08-2016
04:18 PM
4 Kudos
HDFS Snapshots Overview
HDFS Snapshots are read-only point-in-time copies of the
file system. Snapshots can be taken on a subtree of the file system or the
entire file system. Some common use cases of snapshots are data backup,
protection against user errors and disaster recovery.
The implementation of HDFS Snapshots is efficient in the following ways:
1) Snapshot creation is instantaneous. The cost is O(1), excluding the inode lookup time.
2) Additional memory is used only when modifications are made relative to a snapshot. Memory usage is O(M), where M is the number of modified files/directories.
3) Blocks in DataNodes are not copied. The snapshot files record the block list and the file size.
4) Snapshots do not adversely affect regular HDFS operations; there is only a minor performance impact when accessing snapshotted data, depending on the number of modifications. Modifications are recorded in reverse chronological order so that the current data can be accessed directly, and the snapshot data is computed by subtracting the modifications from the current data (snapshot data = current data - modifications).
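As a small illustration (the paths below are hypothetical), once a directory /data/projects is snapshottable and a snapshot named s0 has been taken, the snapshot copy of a file can be read directly through the hidden .snapshot path, without any blocks having been duplicated on the DataNodes:
hdfs dfs -createSnapshot /data/projects s0
hdfs dfs -cat /data/projects/.snapshot/s0/file1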
**See Also**
HDFS Snapshots - 2) Operations
03-09-2017
06:30 PM
@Eyad Garelnabi: There is a document called "Ambari Upgrade Best Practices," http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_ambari-upgrade-best-practices/content/index.html, that was developed with folks in the field, based on the issues that many of our customers face. It's a relatively new document, and I hope you will find it useful.
10-08-2016
04:17 PM
1 Kudo
Do the Upgrade Preparation
● For each target cluster, find out as many details as
possible by reviewing the following:
● Hardware configuration, operating systems and network
topology
● Current deployment, cluster topology, configuration of
each component
● Security configuration
● User access and management
● Current data ingestion process (if applicable)
● Current running applications and clients connecting
into the cluster
● Find the applicable Upgrade Guide from
docs.hortonworks.com
● Select validation applications
● Prepare an Upgrade Log to keep track of any upgrade
issues and their workarounds
● [Optional but recommended] Prepare one or more Lab
(virtual) clusters and install the
current HDP stack and Ambari. Use
these clusters for mock upgrades and rollbacks to troubleshoot any upgrade
issues.
Upgrading Clusters
● To upgrade a single cluster use the Upgrade Procedure
given below.
● [Optional but recommended] Mock (lab) cluster upgrade:
Attempt an upgrade on a Lab cluster. Some steps of the Upgrade Procedure can be
skipped in order to concentrate on critical parts.
● Test (Dev) cluster upgrade: Before upgrading an important
production cluster, it is strongly recommended to attempt the upgrade first on a
test cluster (e.g., Dev) similar to the production cluster: running the current
versions of HDP and Ambari, with topology, components, and configuration similar
to the production cluster, but on a smaller number of nodes.
● Log every issue encountered during lab and test
upgrades, along with its workaround, so as to minimize any downtime during the main
cluster upgrade.
● Main (Production) cluster upgrade
● Book the upgrade date and time in advance
● Estimate cluster downtime based on the results of the test
upgrade. Note that, regardless of the preparation and any test upgrades, some new
issues may appear.
● Inform all interested parties
● Confirm that Support is on standby
● Do the upgrade
A Single Cluster Upgrade Procedure
Prepare the Cluster for the Upgrade
● Run identified validation applications before the upgrade,
and record results and execution times for each of them
● Get ready for the upgrade: Correct any errors and/or
alerts and warnings on the cluster
● Check the state of the HDFS filesystem and finalize it
if not already finalized
● Capture the HDFS status and save the HDFS namespace
● Back up NameNode metadata and all DBs supporting the
cluster (Ambari, Hive metastore, Oozie, Ranger, Hue); a sketch of these HDFS steps follows below
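As a minimal sketch only (run as the hdfs user; the metadata and backup directories shown are hypothetical and should be taken from dfs.namenode.name.dir and your own backup location), the HDFS-related preparation steps above could look like:
hdfs fsck / -files -blocks > /tmp/fsck-before-upgrade.txt        # check filesystem health
hdfs dfsadmin -report > /tmp/dfsadmin-report-before-upgrade.txt  # capture cluster status
hdfs dfsadmin -finalizeUpgrade                                   # finalize any previous, still-pending upgrade
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace                                     # save the namespace (requires safe mode)
hdfs dfsadmin -safemode leave
cp -r /hadoop/hdfs/namenode /backup/namenode-before-upgrade      # back up the NameNode metadata directory (hypothetical paths)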
Perform Upgrade
● Execute the cluster upgrade using the official HDP upgrade
document
● Review new properties; in particular, pay attention to
changed property values, changed property names, and new meanings of existing
properties (if any)
Post Upgrade Validation
● Run the Smoke test for each service and troubleshoot
any issues
● Run validation applications after the upgrade and record results and execution
times
● If any validation application is failing or execution
times are much longer than before the upgrade, review and adjust cluster
properties, repeating validation applications until they are stable and don’t
run slower than before the upgrade
● Record in the Upgrade Log any issues encountered and their
workarounds.
Final Steps
● Install new HDP components not used before the upgrade
(if any), run the smoke test for each of them, and troubleshoot any issues
● Finalize HDFS upgrade
● Configure HA of selected components (like NN, RM,
HiveServer2, HBase, Oozie)
● Perform Ambari takeover of HDP components not previously
managed by Ambari
● Enable Kerberos security: the KDC and existing
principals and keytabs can be reused; add principals for any new components
● LDAP integration (Ambari, KDC, Ranger)
**See Also**
HDP Upgrade Best Practices - 1) Plan and Assess
HDP Upgrade Best Practices - 3) Documentation and Learnings
10-08-2016
04:17 PM
1 Kudo
Plan and Assess
This is purely a planning step. The expected deliverable is an
Upgrade Plan.
Gather all details about the existing environment to plan the upgrade path and the
associated upgrade tasks.
1) Determine Upgrade Path
Based on the current and target versions of the HDP stack, and on
whether Ambari is used or not, select the supported upgrade guide from the
Hortonworks documentation site. Identify key requirements, such as whether NameNode HA
(or other HA) or security needs to be disabled during the upgrade.
Current version:
● HDP Stack version
● Ambari version (if Ambari is used)
● OS Version
Target version:
● HDP Stack version
● Ambari version (if Ambari is used)
Below are some useful links:
HDP Stacks Managed by Different Ambari Versions:
https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.0.0/bk_ambari-installation/content/determine_stack_compatibility.html
Upgrading to Ambari 2.4:
https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.0.1/bk_ambari-upgrade/content/upgrading_ambari.html
Upgrading HDP Using Ambari:
https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.0.1/bk_ambari-upgrade/content/upgrading_hdp_stack.html
Upgrading HDP Manually (without Ambari):
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_command-line-upgrade/content/ch_upgrade_2_4.html
2) Review Known Issues in Target Version Release
Review the following items:
● Behavioral changes that will affect applications
● Unsupported features
● Known Issues
● New features added to release
HDP 2.5 Release Notes:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_release-notes/content/ch_relnotes_v250.html
HDP 2.5 Known Issues:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_release-notes/content/known_issues.html
3) Select Validation Applications
Select two groups of validation applications.
First group: Industrial benchmarks like Teragen &
Terasort, TestDFSIO, Hive TPC-DS, and HBase performance tests. At a minimum,
use Teragen & Terasort with multiple mappers for Teragen and multiple
reducers for Terasort (see the example run below).
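As a minimal sketch (the data size, task counts, and examples-jar location are assumptions that will vary per cluster), a Teragen & Terasort validation run might look like:
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar teragen -Dmapreduce.job.maps=20 100000000 /tmp/teragen
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar terasort -Dmapreduce.job.reduces=20 /tmp/teragen /tmp/terasort
Record the execution time of each job so that post-upgrade runs can be compared against it.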
Second group (optional): User-defined validation
applications. Identify representative applications (together with the input
data) that are used most often. Be sure to include at least one for
every Hadoop component in use, such as MapReduce, Hive, Pig, HBase, Oozie, Storm,
Kafka, and others.
4) Finalize Project Management Items
● Scope: Identify clusters to be upgraded and components to upgrade and newly install (if any).
● HR: Staff the upgrade teams. Also, some validation applications can be run by the developers themselves.
● Time: Identify upgrade tasks, the timeline, and task owners.
● QA: Carefully identify validation tasks.
● Risk: Estimate downtime for each cluster upgrade.
● Resources: Prepare the cluster on which the upgrade will be tested (e.g., Dev). When upgrading production clusters, it is strongly recommended to attempt the upgrade first on a test cluster.
**See Also**
HDP Upgrade Best Practices - 2) Do the Upgrade
HDP Upgrade Best Practices - 3) Documentation and Learnings
04-26-2017
06:07 PM
Hey, we are using HDP 2.3.6. We are getting the below error after configuring Ranger to store audits in Solr.
2017-04-25 09:16:23,366 WARN [org.apache.ranger.audit.queue.AuditBatchQueue0]: provider.BaseAuditHandler (BaseAuditHandler.java:logFailedEvent(374)) - failed to log audit event: {"repoType":3,"repo":"hdpt01_hive","reqUser":"hadooptest","evtTime":"2017-04-25 09:16:21.124","access":"USE","resType":"@null","action":"SHOWDATABASES","result":1,"policy":6,"enforcer":"ranger-acl","sess":"06802e00-eda7-4bd2-a812-7e2ed2621e24","cliType":"HIVESERVER2","cliIP":"","reqData":"show schemas","agentHost":"hivehost","logType":"RangerAudit","id":"d8b3d307-0035-4613-a7ff-872fa1c46a9e","seq_num":0,"event_count":1,"event_dur_ms":0}
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://hostname:8886/solr/ranger_audit: Expected mime type application/octet-stream but got text/html. <html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Error 401 Authentication required</title>
</head>
<body><h2>HTTP ERROR 401</h2>
<p>Problem accessing /solr/ranger_audit/update. Reason:
<pre> Authentication required</pre></p><hr><i><small>Powered by Jetty://</small></i><hr/>
</body>
</html>
The cluster is kerberized. The below error is seen when accessing the ambari-infra-solr UI at http://hostname:8886/solr/
GSSException: Failure unspecified at GSS-API level (Mechanism level: Specified version of key is not available (44))
10-20-2016
09:00 PM
@Eyad Garelnabi Documentation may need to be updated, but with the new kernels, swappiness does not need to be set to 0. Read this article from @emaxwell: https://community.hortonworks.com/articles/33522/swappiness-setting-recommendation.html
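For reference, and purely as an illustrative sketch (the value to use is discussed in the linked article), swappiness can be checked and adjusted like this:
cat /proc/sys/vm/swappiness                    # show the current value
sysctl -w vm.swappiness=1                      # set it for the running kernel (as root)
echo "vm.swappiness=1" >> /etc/sysctl.conf     # persist the setting across reboots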
09-01-2016
02:30 PM
1 Kudo
No, one can only whitelist the hosts from which the proxy user may connect and the groups whose members it may impersonate. You may want to define a new group in this case.
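As an illustrative sketch (these are the standard Hadoop proxy-user properties in core-site.xml; the user, host, and group names below are hypothetical):
hadoop.proxyuser.myproxy.hosts=gateway1.example.com,gateway2.example.com
hadoop.proxyuser.myproxy.groups=etl-users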
09-02-2016
10:31 AM
Hi, Please try to download it with: curl -O -k https://public-repo-1.hortonworks.com/HDP/cloudbreak/cloudbreak-2016-07-06-12-51.img
Regarding the import process, you can find further instructions here (it was written for 1.3.0 but is still valid; just replace the image name): http://sequenceiq.com/cloudbreak-docs/latest/openstack/ Attila