Member since: 07-31-2013
Posts: 1924
Kudos Received: 460
Solutions: 311
My Accepted Solutions
Title | Views | Posted
---|---|---
| 956 | 07-09-2019 12:53 AM
| 4118 | 06-23-2019 08:37 PM
| 5459 | 06-18-2019 11:28 PM
| 5491 | 05-23-2019 08:46 PM
| 1927 | 05-20-2019 01:14 AM
03-25-2020
12:25 AM
Hi, posting to acknowledge this as an ongoing issue with the CDP Data Catalog service: it currently lacks support for the 7.1.0 lake versions. I am unable to find a workaround, as the 7.0.2 lake version selection has been removed (as you've noticed), which could have been a potential route. Our internal teams are aware and are working on getting the Data Catalog service working against 7.1.0 very soon (the internal reference for the issue is CDPDSS-365, if you'd like to discuss this over a support case). Sorry for the inconvenience!
01-09-2020
01:29 AM
Can you paste the contents of all the files in the following directory from the Ranger host, please? /var/run/cloudera-scm-agent/process/1546333400-ranger-RANGER_ADMIN-SetupRangerCommand/logs/*

The missing property (db_password) is written by a control script that should log some information to these files, so their contents will help us determine the cause. I'm assuming that on your CM > Ranger > Configuration page the field 'ranger.jpa.jdbc.password' is set to a valid value.

Also, do you perhaps have an @ (at) character in your password? If yes, could you try a different password without that character? You may be hitting a bug (OPSAPS-53645 is its internal ID, fixed in future releases) where that password character was not supported in the original CDP 7.0 release.
10-05-2019
12:43 AM
The original issue described here is not applicable to your version. In your case it could simply be a misconfiguration that is causing Oozie to not load the right Hive configuration required to talk to the Hive service. If you are unable to find an error in the Oozie server log, try enabling debug logging on it. Also try to locate files or jars in your workflow that may be supplying an invalid Hive client XML.
07-09-2019
12:53 AM
2 Kudos
Yes, that is correct, and the motivations/steps-to-use are reflected here too: https://www.cloudera.com/documentation/enterprise/6/latest/topics/cm_s3guard.html

Note: on your point of 'load data from S3 into HDFS', it is better stated as simply 'read data from S3', with HDFS used as transient storage where/when required. There does not need to be a 'download X GiB of data from S3 to HDFS first, only then begin jobs' step, as distributed jobs can read off of S3 via s3a:// URLs in the same way they read from HDFS via hdfs:// URLs.
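As a rough illustration, assuming S3 credentials are already configured on the cluster (the bucket, paths, and the examples-jar location are placeholders and may vary by install):

hadoop fs -ls s3a://my-bucket/input/
hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
  wordcount s3a://my-bucket/input/ hdfs:///user/foo/wordcount-out

The job reads its input straight from S3; only its output (or any intermediate data) lands on HDFS.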
07-04-2019
07:17 PM
Try deleting /etc/default/cloudera-*, /etc/cloudera-*, and /var/lib/cloudera-* entirely, and erase all cloudera-* packages via yum (on all involved hosts). After this, attempt the installer again. This will allow the default embedded configs to be written and used for DB initialization, rather than preserving whatever has been left over.
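A rough sketch of that cleanup, assuming an RHEL/CentOS host using yum (run on every involved host, and only if nothing on it needs to be preserved):

sudo yum remove 'cloudera-*'
sudo rm -rf /etc/default/cloudera-* /etc/cloudera-* /var/lib/cloudera-*
# then re-run the installer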
07-03-2019
07:46 PM
Regarding 403 vs. 405, you're right about the specific difference, but all 4xx errors pertain to denying the client request in some form, so it is not indicative of the feature not working. I am not sure what the 35216 port is (logs can help tell what's starting it) - there should be no need for a forwarding port/proxy, but perhaps it is your method of deployment doing that? Direct access to the HMaster port appears to be denying TRACE operations in your tests.
07-02-2019
10:04 PM
The file locations have changed across versions (Apache HBase underwent a modular restructure into client/server code), but what's important is that the constraint is still present in the moved sources and (seemingly, for the '46645' port) works. What UI/service is served over 35216 by the HMaster process if you check it directly via a browser/GET? I'm assuming 46645 serves the actual HMaster UI instead?
07-02-2019
06:50 AM
That version should carry the changes required for constraining TRACE (HBASE-10473). I just tried standing up a pseudo-distributed cluster over an Apache HBase 1.3.1 tarball, and it gives me the same result of a 403 when a TRACE request is attempted. Are you certain the end-point you're targeting is the HBase Master (does a GET of /jmx over it return Master metrics?), and can you share your full curl -v output?
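For reference, a quick way to reproduce the check (the hostname is a placeholder; 16010 is the default HBase 1.x Master info port, older setups may use 60010):

curl -v -X TRACE http://hmaster.example.com:16010/            # expect an HTTP 403 if TRACE is constrained
curl -s http://hmaster.example.com:16010/jmx | grep -i -m5 master   # confirm this port serves Master metrics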
07-02-2019
02:15 AM
What distribution and version of HBase are you using in your deployment here? On recent CDH5 and CDH6 HBase Master web ports, performing a TRACE method request responds with a 403 (Forbidden) response.
06-30-2019
07:00 PM
For single-node clusters, the default configuration is often insufficient to run any multi-container job. See https://community.cloudera.com/t5/Batch-Processing-and-Workflow/JOB-Stuck-in-Accepted-State/td-p/29494 for how to raise the limits.

The second issue, jobs run as "dr.who", is more serious and is a result of your Azure instance's network access rules being too permissive. These unidentified jobs are effectively run by hack-bots that discover your cluster is reachable over the internet without security. See https://community.cloudera.com/t5/Batch-Processing-and-Workflow/What-is-Dr-who-user-100s-of-yarn-jobs-are-getting-triggered/m-p/68026#M3657 for prior discussion on this. Typically, restricting external internet access to your cluster's ports will stop this.
06-23-2019
08:37 PM
1 Kudo
This looks like a case of edit logs getting reordered. As @bgooley noted, it is similar to HDFS-12369, where the OP_CLOSE appears after OP_DELETE, causing the file to be absent when replaying the edits.

The simplest fix, depending on whether this is the only file affected by the reordering in your edit logs, would be to run the NameNode manually in edits-recovery mode and "skip" this edit when it catches the error. The rest of the edits should apply normally and let you start up your NameNode. The recovery mode of the NameNode is detailed at https://blog.cloudera.com/blog/2012/05/namenode-recovery-tools-for-the-hadoop-distributed-file-system/

If you're using CM, you'll need to use the NameNode's most recently generated configuration directory under /var/run/cloudera-scm-agent/process/ on the NameNode host as the HADOOP_CONF_DIR, while logged in as the 'hdfs' user, before invoking the manual NameNode startup command. Once you've followed the prompts and the NameNode appears to start up, quit out/kill it and restart it from Cloudera Manager normally.

If you have a Support subscription, I'd recommend filing a case for this, as the process could get more involved depending on how widespread the issue is.
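A rough sketch of the invocation (the process directory name below is a placeholder; pick the most recent NAMENODE one on that host):

# as the 'hdfs' user on the NameNode host
export HADOOP_CONF_DIR=/var/run/cloudera-scm-agent/process/<latest NAMENODE process dir>
hdfs namenode -recover

The tool prompts you on each error it hits; choosing to skip the offending edit lets the remaining edits apply.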
06-23-2019
08:25 PM
One approach would be to perform API-driven rolling restarts, with config change-sets applied at each schedule. If the latency (a small drop in available nodes sustained for the duration of the batched restarts) isn't an issue, this could work. If locality isn't important, you could also achieve this by marking a set of nodes as entirely unavailable via the NodeManager state support, leaving only the rest at full capacity.

This sounds like a scenario best served by Cloud (workload-driven cluster runtimes), offered by Cloudera Altus (or the upcoming CDP) and/or Director. Some links that may be helpful:
- API in CM to perform a rolling restart of a specific service: https://archive.cloudera.com/cm6/6.2.0/generic/jar/cm_api/apidocs/resource_ServicesResource.html#resource_ServicesResource_ClustersResourceV32_ServicesResourceV32_rollingRestart_POST
- API in CM to apply config changes to a service role group: https://archive.cloudera.com/cm6/6.2.0/generic/jar/cm_api/apidocs/resource_RoleConfigGroupsResource.html#resource_RoleConfigGroupsResource_ClustersResourceV32_ServicesResourceV32_RoleConfigGroupsResource_updateConfig_PUT
- Cloudera Altus: https://www.cloudera.com/documentation/altus/topics/alt_intr_overview.html
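As a hedged sketch of the rolling-restart call (cluster/service names, credentials, and the request-body fields are placeholders - verify the exact argument names against the apidocs linked above for your CM version):

curl -u admin:admin -X POST -H 'Content-Type: application/json' \
  -d '{"slaveBatchSize": 1, "staleConfigsOnly": false}' \
  'http://cm-host.example.com:7180/api/v32/clusters/Cluster1/services/yarn/commands/rollingRestart'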
06-18-2019
11:28 PM
1 Kudo
It could be passed by either mode, hence the request for the CLI you used. The property to modify in the client configuration (via CM properties or via an early -D CLI argument) is called 'mapreduce.map.memory.mb', and the administrative limit is defined in the ResourceManager daemon configuration via 'yarn.scheduler.maximum-allocation-mb'.
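For example, a hedged Sqoop invocation passing the property on the CLI (the generic -D option must come right after the tool name; the connection details are placeholders):

sqoop import \
  -D mapreduce.map.memory.mb=2048 \
  --connect jdbc:mysql://db.example.com/mydb \
  --table mytable \
  --target-dir /user/foo/mytable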
06-18-2019
07:36 AM
Please share your full Sqoop CLI. The error you are receiving suggests that the configuration passed to this specific Sqoop job carried a parameter asking for map memory higher than the limit the administrator has configured for what a map task may request. As a result, the container request is rejected. Lowering the requested memory size of the map tasks will let the job pass this check.
06-07-2019
01:51 AM
1 Kudo
Yes, it is indicated in the same document: """ Use the command disable_peer ("<peerID>") to disable replication for a specific peer. This will stop replication to the peer, but the logs will be kept for future reference. Note: This log accumulation is a powerful side effect of the disable_peer command and can be used to your advantage. See Initiating Replication When Data Already Exists. """ - https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_bdr_hbase_replication.html#topic_20_11_5
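In the hbase shell that would look like the below (the peer ID '1' is a placeholder for your configured peer):

disable_peer '1'     # stop shipping edits to this peer; its WALs accumulate
# ... later, to resume replication from the retained logs:
enable_peer '1'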
06-06-2019
06:52 PM
@Megs - The steps listed at https://www.cloudera.com/documentation/enterprise/upgrade/topics/ug_jdk8.html should help you achieve this with Cloudera Manager
06-05-2019
06:27 PM
While the host you are running the command on (and where you checked your Java version) is running JDK 8, your cluster's NodeManagers are running with JDK 7, and therefore this will not work. Sqoop generates and compiles a program to represent DB records for the purpose of serialization. This is done with the JDK version on the command-invocation host. The compiled binaries are then shipped to the NodeManager hosts for execution, where they require the same or a higher JDK version. You must either:
- drop the JDK on the host where you are running the Sqoop CLI to JDK 7, or
- update/explicitly specify the JDK on all cluster hosts to use JDK 8 and restart the cluster.
06-05-2019
06:24 PM
@Reavidence, HTTPFS with Kerberos requires SPNEGO authentication to be used. Per https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_sg_httpfs_security.html, for curl (after kinit) this can be done by passing the below two parameters: """ The '--negotiate' option enables SPNEGO in curl. The '-u :' option is required but the username is ignored (the principal that has been specified for kinit is used). """
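A hedged example, assuming the default HttpFS port (14000) and placeholder principal/host/path:

kinit user@EXAMPLE.COM
curl --negotiate -u : "http://httpfs-host.example.com:14000/webhdfs/v1/user/foo?op=LISTSTATUS"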
06-03-2019
07:33 PM
Please follow the entire discussion above - the parameter is an advanced one and has no direct field. You'll need to apply it via the safety valve, using the property name directly. P.S. It is better etiquette to open a new topic than to bump ancient ones.
05-27-2019
06:35 AM
1 Kudo
Small note that's relevant to this (older) topic: when copying Cells from a fetched Scan/Get Result into a Put object built with the altered key, do not add the Cell objects as-is via the Put::addCell(…) API. You instead need to copy out only the family/qualifier/timestamp/value portions (for example via the CellUtil clone helpers), so the old row key is not carried over. A demo program for a single-key operation would look like this:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;

public class RowKeyCopyDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Table sourceTable = connection.getTable(TableName.valueOf("old_table"));
         Table destinationTable = connection.getTable(TableName.valueOf("new_table"))) {
      Result result = sourceTable.get(new Get("old-key".getBytes()));
      Put put = new Put("new-key".getBytes());
      for (Cell cell : result.rawCells()) {
        // copy only family/qualifier/timestamp/value; the Cell itself still carries the old row key
        put.addColumn(CellUtil.cloneFamily(cell), CellUtil.cloneQualifier(cell),
            cell.getTimestamp(), CellUtil.cloneValue(cell));
      }
      destinationTable.put(put);
    }
  }
}

The reason to avoid Put::addCell(…) is that the Cell objects from the Result still carry the older row key, and you'll receive a WrongRowIOException if you attempt to add them to a Put initiated with a changed key. (Likewise, avoid the raw getFamilyArray()/getQualifierArray()/getValueArray() accessors here, since they return the whole backing array rather than just that portion of the Cell.)
05-23-2019
08:46 PM
1 Kudo
For HBase MOBs, this can serve as a good starting point, as most of the changes are administrative and the writer API remains the same as for regular cells: https://www.cloudera.com/documentation/enterprise/latest/topics/admin_hbase_mob.html

For SequenceFiles, a good short snippet can be found here: https://github.com/sakserv/sequencefile-examples/blob/master/test/main/java/com/github/sakserv/sequencefile/SequenceFileTest.java#L65-L70 and for Parquet: https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/example/ExampleParquetWriter.java

More general reading on the file formats: https://blog.cloudera.com/blog/2011/01/hadoop-io-sequence-map-set-array-bloommap-files/ and https://parquet.apache.org/documentation/latest/
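As a hedged illustration of the administrative side for MOBs (the table/family names and threshold are placeholders; syntax as per the HBase shell):

# in the hbase shell: store cells in family 'd' larger than ~100 KB as MOBs
create 'images', {NAME => 'd', IS_MOB => true, MOB_THRESHOLD => 102400}

The Put/Get API used to write and read those cells then stays the same as for regular cells.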
05-20-2019
01:14 AM
You can apply the queries directly on that external table. Hive will use HDFS for any transient storage it requires as part of the query stages. Of course, if it is a set of queries overall, you can also store all the intermediate temporary tables on HDFS in the way you describe, but the point I am trying to make is that you do not need to copy the original data as-is - just allow Hive to read off of S3/write into S3 at the points that matter.
05-19-2019
08:48 PM
- Do you observe this intermittency from only specific client/gateway hosts?
- Does your cluster apply firewall rules between the cluster hosts?

One probable reason behind the intermittent 'Connection refused' from the KMS could be that it is frequently (auto)restarting. Check its process stdout messages and service logs to confirm whether a kill is causing it to be restarted by the CM Agent supervisor.
05-19-2019
06:17 PM
You can do this via two methods: container files, or HBase MOBs. Which is the right path depends on your eventual, dominant read pattern for this data.

If your analysis will require loading only a small range of images out of the total dataset, or individual images, then HBase is the better fit with its key-based access model, columnar storage and caches. If instead you will require processing these images in bulk, then large container files (such as SequenceFiles with BytesWritable or equivalent, or Parquet files with BINARY/BYTE_ARRAY types) are the better fit, as they can store multiple images in a single file and allow fast, sequential reads of all images in bulk.
05-19-2019
06:06 PM
Would you be able to attach the contents of /tmp/scm_prepare_node.vQZe0yDf/scm_prepare_node.log (or any/all '/tmp/**/scm_prepare_node.log' files) from the host the install failed on (node5 in this case)?
05-19-2019
06:04 PM
You do not need to pull files into HDFS as a step in your processing, as CDH provides built-in connectors to read input/write output directly from S3 storage (s3a:// URIs, backed by some configuration that provides credentials and targets). This page is a good starting reference for setting up S3 access over Cloud installations: https://www.cloudera.com/documentation/director/latest/topics/director_s3_object_storage.html - make sure to check out the page links from the opening paragraph too.
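As a hedged sketch of supplying those credentials ad hoc via Hadoop properties (the keys and bucket are placeholders; in practice prefer a credential provider or instance roles over plain-text keys on the command line):

hadoop fs \
  -D fs.s3a.access.key=AKIAXXXXXXXX \
  -D fs.s3a.secret.key=XXXXXXXX \
  -ls s3a://my-bucket/input/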
05-15-2019
06:52 PM
1 Kudo
The Disk Balancer sub-system is local to each DataNode and can be triggered on distinct hosts in parallel. The only time you should receive that exception is if the targeted DataNode's hdfs-site.xml does not carry the property that enables the disk balancer, or when the DataNode is mid-shutdown/restart.

How have you configured the disk balancer for your cluster? Did you follow the configuration approach presented at https://blog.cloudera.com/blog/2016/10/how-to-use-the-new-hdfs-intra-datanode-disk-balancer-in-apache-hadoop/? What is your CDH and CM version?
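For reference, a hedged sketch of the per-DataNode workflow, assuming dfs.disk.balancer.enabled=true is present in that DataNode's hdfs-site.xml (the hostname and plan path are placeholders; the -plan step prints the actual plan file location):

hdfs diskbalancer -plan dn1.example.com
hdfs diskbalancer -execute /system/diskbalancer/<timestamp>/dn1.example.com.plan.json
hdfs diskbalancer -query dn1.example.com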
05-15-2019
12:39 AM
1 Kudo
The Docker version (5.13.x) of the QuickStart image uses a package-based installation by default.
05-14-2019
07:45 PM
Yes, it is supported in the same manner as with DistCp (the same exclude config needs to be used if you have a realm-trust situation matching YARN-3021 [1]).

Your error, however, indicates that you do not have a valid Kerberos TGT on the host you're running the command on when it tries to communicate with the destination HDFS as part of job preparation [2]. This is well before any MR or token work comes into play. Are you able to run a 'hdfs dfs -ls hdfs://remote-fs/any/path' command successfully after your kinit, from the same shell you're using to submit the ExportSnapshot job?

[1] - https://issues.apache.org/jira/browse/YARN-3021
[2] - https://github.infra.cloudera.com/CDH/hbase/blob/cdh5.15.2-release/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#L959
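A hedged illustration of the pre-check and the export itself (the principal, snapshot name, and remote NameNode address are placeholders):

kinit user@EXAMPLE.COM
hdfs dfs -ls hdfs://remote-nn.example.com:8020/
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot my_snapshot \
  -copy-to hdfs://remote-nn.example.com:8020/hbase \
  -mappers 4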
05-14-2019
07:37 PM
Look for an exception in the logs preceding the "Failed open of region=" handling-failure message on your RegionServer. One situation may be that an HFile under the region is un-openable (for varied reasons) and will require being sidelined (moved out of the way) to bring the region back online.