Member since: 07-31-2013
Posts: 1924
Kudos Received: 460
Solutions: 311
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 964 | 07-09-2019 12:53 AM |
 | 4148 | 06-23-2019 08:37 PM |
 | 5505 | 06-18-2019 11:28 PM |
 | 5523 | 05-23-2019 08:46 PM |
 | 1937 | 05-20-2019 01:14 AM |
04-26-2022
04:36 AM
Have you solved it?
11-13-2021
06:51 AM
Hello @xgxshtc, We noticed you have posted this question in a new post [1], since this post is ~4 years old. While the current post is unresolved, we shall wait for your team's review on [1] before confirming the solution on this post as well. Regards, Smarak [1] https://community.cloudera.com/t5/Support-Questions/Hbase-regionserver-shutdown-after-few-hours/m-p/330070/highlight/false#M230589
04-07-2021
09:50 AM
@swapko as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.
03-03-2021
11:19 PM
1 Kudo
The YARN ResourceManager keeps writing the status of each running/finished application to the state store. The state store is usually kept in either ZooKeeper or the local FS, depending on your configuration. When the RM transitions from standby to active, it looks for the latest commits made by the other RM and loads them. If this information is lost at any point, the RM will fail to load the application information.
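For reference, a minimal sketch of the settings this behaviour hinges on, assuming a ZooKeeper-backed state store (the quorum address is a placeholder, and paths may differ on a CM-managed cluster):

```bash
# Hedged sketch: the yarn-site.xml keys that control the recovery behaviour
# described above (ZooKeeper quorum value is a placeholder):
#   yarn.resourcemanager.recovery.enabled = true
#   yarn.resourcemanager.store.class = org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore
#   yarn.resourcemanager.zk-address = zk1:2181,zk2:2181,zk3:2181
# Quick check on a host that carries the ResourceManager's configuration:
grep -A1 "yarn.resourcemanager.store.class" /etc/hadoop/conf/yarn-site.xml
```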
02-23-2021
04:41 AM
Yes, you can download the HDFS client configuration from Cloudera Manager, but that is not always possible, e.g. when you are working in a different department or run into bureaucratic issues... And if you make any change to the HDFS configuration, you must download the client configuration again. That is not a scalable solution in big environments. The best solution is to work on the same cluster (on a gateway host if possible), but if you have external Flume agents, I think there is no proper, scalable solution.
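If the re-download has to happen anyway, a hedged sketch of automating it against the Cloudera Manager API clientConfig endpoint (host, credentials, cluster/service names, API version and target directory are placeholders and will differ in your environment):

```bash
# Pull the current HDFS client configuration bundle from Cloudera Manager.
curl -s -u admin:admin \
  "http://cm-host:7180/api/v19/clusters/Cluster1/services/hdfs/clientConfig" \
  -o hdfs-clientconfig.zip
# Unpack it where the external Flume agent expects its Hadoop configs.
unzip -o hdfs-clientconfig.zip -d /etc/flume-ng/conf/hadoop-conf
```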
02-22-2021
06:46 AM
That output comes after the reduce function, not the map.
02-19-2021
03:02 AM
I have the same issue as @lmdrone. The Hadoop 'find' command only supports two expressions, and Cloudera has removed the org.apache.solr.hadoop.HdfsFindTool utility. How do we filter files based on modified time? Please bring back "org.apache.solr.hadoop.HdfsFindTool".
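In the meantime, a crude workaround sketch (not a replacement for HdfsFindTool; the path and cutoff date are placeholders), relying on the standard `hdfs dfs -ls` column layout:

```bash
# Column 6 is the modification date and column 8 the path in `hdfs dfs -ls -R` output;
# print everything under /data last modified before the cutoff.
hdfs dfs -ls -R /data | awk -v cutoff="2021-01-01" '$6 < cutoff {print $8}'
```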
07-13-2020
01:47 AM
A very late reply to this topic, just to document a similar error I had using a Kafka client from a different Kerberos realm:

[2020-07-13 09:47:08,678] ERROR [Consumer clientId=consumer-1, groupId=console-consumer-57017] Connection to node -1 failed authentication due to: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Fail to create credential. (63) - No service creds)]) occurred when evaluating SASL token received from the Kafka Broker. Kafka Client will go to AUTHENTICATION_FAILED state. (org.apache.kafka.clients.NetworkClient)

Debugging showed:

error code is 7
error Message is Server not found in Kerberos database
crealm is REALM1.DOMAIN.COM
cname is rzuidhof@REALM1.DOMAIN.COM
sname is krbtgt/REALM2.DOMAIN.COM@REALM1.DOMAIN.COM

The situation is an HDP cluster being accessed by a client on a host joined to a different (IPA) domain, with no trust. This works without trust; I think trust is only needed to use accounts from a different domain, but we used keytabs and interactive kinit for REALM1 principals on REALM2 hosts to access services in REALM1. All that was needed to get this to work was one additional line in /etc/krb5.conf on the REALM2 servers, under [domain_realm]:

realm1.domain.com = REALM1.DOMAIN.COM

We already had, under [libdefaults]:

dns_lookup_realm = true
dns_lookup_kdc = true

We also arranged DNS forwarding, but no reverse lookups.
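A hedged verification sketch from a REALM2 host (the broker hostname is a placeholder; the principal is the one from the debug output above). With the domain_realm mapping in place, the requested service name should resolve to a REALM1 service ticket rather than a cross-realm TGT:

```bash
kinit rzuidhof@REALM1.DOMAIN.COM
# Request the broker's service ticket and inspect the cache.
kvno kafka/broker01.realm1.domain.com@REALM1.DOMAIN.COM
klist
```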
04-24-2020
02:30 PM
@bgooley In CDH 6.3.x this appears to have changed, and the "https.py" file is slightly different now. It accepts the cipher_list as a configuration item. The way we secured port 9000 is by doing these steps:

1) Check whether RC4 (and other weak ciphers) are accepted on port 9000:
openssl s_client -cipher RC4 -connect <server>:9000 -msg

2) Edit the "/etc/cloudera-scm-agent/config.ini" file.

3) Under the "[Security]" section of config.ini, add these lines:
# Custom cipher list to close vulnerabilities for port 9000
cipher_list=HIGH:!DSS:!DH:!ADH:!DES:!3DES:!SHA1:!RC4:!aNULL:!eNULL:!EXPORT:!SSLv2:!SSLv3:!TLSv1

4) Restart the Cloudera Manager agent:
sudo service cloudera-scm-agent restart

5) Wait a minute or so, then rerun the OpenSSL command; RC4 (and other weak ciphers, if you test them) are now rejected:
openssl s_client -cipher RC4 -connect <server>:9000 -msg

It would be great if Cloudera could add this to their documentation on how to add this additional security to the CM agent.
03-25-2020
12:25 AM
Hi, Posting to acknowledge this as an ongoing, current issue with the CDP Data Catalog service wherein it lacks support for the 7.1.0 lake versions. I am unable to find a workaround to it as the 7.0.2 lake version selection has been removed (like you've noticed), which could've been a potential route. Our internal teams are aware and are working on getting the data catalog service working against 7.1.0 very soon (internal reference for the issue is CDPDSS-365 if you'd like to discuss this over a support case/etc.) Sorry for the inconvenience!
02-14-2020
04:51 PM
Here are the exact permissions for /tmp/logs.

1. Assume user1 is a valid local OS user. The following folder structure corresponds to properly functioning JobHistory:

drwxrwxrwt - hdfs supergroup 0 2014-09-15 17:01 /tmp
drwxrwxrwt - mapred hadoop 0 2014-09-18 12:02 /tmp/logs
drwxrwx--- - user1 hadoop 0 2014-09-18 12:03 /tmp/logs/user1
drwxrwx--- - user1 hadoop 0 2014-09-18 12:03 /tmp/logs/user1/logs

Here is an example entry with incorrect permissions:

drwxrwx--- - hive supergroup 0 2014-09-18 12:00 /tmp/logs/user1/logs/

2. Adjust the /tmp/logs/ folders recursively to reflect ownership and permissions similar to the above. Example commands to update the permissions in HDFS:

sudo -u hdfs hadoop fs -chown mapred:hadoop /tmp/logs
sudo -u hdfs hadoop fs -chown -R :hadoop /tmp/logs/*
02-06-2020
04:08 AM
Hi, You also need to check the configuration below (if any).

1. Dynamic Resource Pool Configuration > Resource Pools - Check whether jobs are exceeding any max values for the queue they are being submitted to.
2. Dynamic Resource Pool Configuration > User Limits - Check whether the maximum number of applications a user can submit simultaneously exceeds the default value (5) or the specified value.
01-23-2020
07:35 AM
It's a bug in Oozie. CoordActionCheckXCommand doesn't take care of the SUSPENDED state; it only handles SUCCEEDED, FAILED and KILLED.

protected Void execute() throws CommandException {
    try {
        InstrumentUtils.incrJobCounter(getName(), 1, getInstrumentation());
        Status slaStatus = null;
        CoordinatorAction.Status initialStatus = coordAction.getStatus();
        if (workflowJob.getStatus() == WorkflowJob.Status.SUCCEEDED) {
            coordAction.setStatus(CoordinatorAction.Status.SUCCEEDED);
            // set pending to false as the status is SUCCEEDED
            coordAction.setPending(0);
            slaStatus = Status.SUCCEEDED;
        }
        else if (workflowJob.getStatus() == WorkflowJob.Status.FAILED) {
            coordAction.setStatus(CoordinatorAction.Status.FAILED);
            slaStatus = Status.FAILED;
            // set pending to false as the status is FAILED
            coordAction.setPending(0);
        }
        else if (workflowJob.getStatus() == WorkflowJob.Status.KILLED) {
            coordAction.setStatus(CoordinatorAction.Status.KILLED);
            slaStatus = Status.KILLED;
            // set pending to false as the status is KILLED
            coordAction.setPending(0);
        }
        else {
            LOG.warn("Unexpected workflow " + workflowJob.getId() + " STATUS " + workflowJob.getStatus());
            coordAction.setLastModifiedTime(new Date());
            CoordActionQueryExecutor.getInstance().executeUpdate(
                    CoordActionQueryExecutor.CoordActionQuery.UPDATE_COORD_ACTION_FOR_MODIFIED_DATE,
                    coordAction);
            return null;
        }
01-09-2020
01:35 AM
Hi Harsh, I was able to add Ranger in CDP after going through the Cloudera documentation. I added Ranger using a Postgres DB; earlier I was trying it with a MySQL DB. So the issue is resolved for me. Thanks
01-01-2020
04:24 AM
I was also facing the same issue as you. Then I followed these steps and it worked for me:

set hive.support.concurrency=true;
set hive.enforce.bucketing=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.compactor.initiator.on=true;
set hive.compactor.worker.threads=2;

Then I added the hive.in.test property with value true in the hive-site.xml file in the /usr/lib/hive location. After that I restarted Hive from HUE, then ran the update command, and it worked for me.
12-23-2019
06:36 PM
Hi @Harsh J, I just deleted around 80% of my data with "DELETE FROM table_name WHERE register <= '2018-12-31'". My disks are pretty full (around 90%). After the deletion nothing happened (in terms of freeing space). I restarted Cloudera services (Kudu, Impala, HDFS, etc.) and nothing. I added these two lines to the Kudu configuration (in "Master Advanced Configuration Snippet (Safety Valve) for gflagfile" and "Tablet Server Advanced Configuration Snippet (Safety Valve) for gflagfile"):
```
unlock_experimental_flags=true
flush_threshold_secs=120
```
After restarting Kudu and waiting for the 120 seconds... nothing.
11-19-2019
06:23 PM
It happened to me when I was installing Cloudera 6.3.1. What solved it for me was:

1. Run:
sed -i 's/SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config

2. Configure /etc/hosts (just an example; set the hostnames of all machines):
hostnamectl set-hostname master1.hadoop-test.com
echo "10.99.0.175 master1.hadoop-test.com master1" >> /etc/hosts
sed -i 's/\r//' /etc/hosts
echo "HOSTNAME=master1.hadoop-test.com" >> /etc/sysconfig/network

3. Reboot, then:

4. wget https://archive.cloudera.com/cm6/6.3.1/cloudera-manager-installer.bin

5. chmod u+x cloudera-manager-installer.bin

6. ./cloudera-manager-installer.bin
10-28-2019
10:40 AM
Since Hadoop 2.8, it is possible to make a directory protected so that it cannot be deleted while non-empty, using the fs.protected.directories property. From the documentation: "A comma-separated list of directories which cannot be deleted even by the superuser unless they are empty. This setting can be used to guard important system directories against accidental deletion due to administrator error." It does not exactly answer the question, but it is a possibility.
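A hedged sketch of the effect (paths are placeholders); note that only the listed directories themselves are protected, not files or subdirectories beneath them:

```bash
# With fs.protected.directories=/warehouse set in hdfs-site.xml on the NameNode:
hdfs dfs -rm -r -skipTrash /warehouse            # expected to fail while the directory is non-empty
hdfs dfs -rm -r -skipTrash /warehouse/tmp_table  # contents can still be removed individually
```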
10-24-2019
10:43 PM
The application might be expecting the log folder to be there in order to generate logs in it. It seems your problem can be solved by creating the folder on the driver node: /some/path/to/edgeNode/ I hope you are also aware that you have specified the log4j file only for the driver program. For the executors to generate logs, you may need to specify the following option in spark-submit: "spark.executor.extraJavaOptions=-Dlog4j.configuration=driver_log4j.properties"
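Putting the pieces together, a hedged spark-submit sketch (the application class, jar and file locations are placeholders); --files ships the properties file into each executor's working directory so the -Dlog4j.configuration reference can resolve there:

```bash
spark-submit \
  --class com.example.MyApp \
  --files /some/path/to/edgeNode/driver_log4j.properties \
  --driver-java-options "-Dlog4j.configuration=file:/some/path/to/edgeNode/driver_log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=driver_log4j.properties" \
  myapp.jar
```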
10-16-2019
05:58 AM
Hello @amirmam, did you manage to solve it? I have the same problem with the current version, CDH 6.1.1. Thanks!
10-07-2019
10:35 AM
Hi Dwill, Did Sqoop import work for you with the SSL-enabled Oracle DB? I have the same kind of requirement to use Sqoop import with an SSL-enabled DB; I am trying to connect through an Oracle wallet but getting network adapter issues. Could you please share the steps if it is working fine for you? Thank you.
10-05-2019
12:43 AM
The original issue described here is not applicable to your version. In your case it could simply be a misconfiguration that's causing Oozie to not load the right Hive configuration required to talk to the Hive service. Try enabling debug logging on the Oozie server if you are unable to find an error in it. Also try to locate files or jars in your workflow that may be supplying an invalid Hive client XML.
09-25-2019
03:03 AM
Hi Harsha, Thanks for the explanation. In extension to the topic, I need a small clarification. We recently implemented Sentry on Impala; based on the KB article below [1], we can't execute "Invalidate all metadata and rebuild index" and "Perform incremental metadata update", since we don't have access to all the DBs, which is fair as well. Now my questions are: 1. I am not able to see the new DB in Hue Impala, but I can see it from beeline or impala-shell. How do I fix or solve this? 2. I can execute INVALIDATE METADATA on a table from impala-shell, but I have 50+ DBs and tens of tables in each DB. Is there any option to run INVALIDATE METADATA at the DB level instead of per individual table? [1] https://my.cloudera.com/knowledge/INVALIDATE-METADATA--Sentry-Enabled--ERROR?id=71141 Thanks Krishna
08-23-2019
02:55 AM
Removing hidden files worked for me. Thanks a lot.
08-20-2019
01:09 AM
Hi @lsouvleros, as you already pointed out: this is influenced by a number of factors and depends heavily on your use case and existing organizational context.

Compared to HDFS in a classic compute/storage-coupled Hadoop cluster, some of the discussion here also applies: https://www.cloudera.com/documentation/enterprise/latest/topics/cm_sdx_vpc.html. This is because Isilon is network-attached storage and, similar to using Cloudera virtual clusters, this has some implications on performance, especially for workloads with high performance requirements. I have also seen environments where using Isilon instead of HDFS had an impact on Impala performance.

In terms of reliability and stability, you can argue either way, depending on your architecture. However, a multi-datacenter deployment is likely to be easier to realize with Isilon, due to its enterprise-proven replication and failover capabilities.

In terms of using storage space efficiently, Isilon will have advantages. However, the higher cost compared to JBOD-based HDFS might make this point irrelevant.

For scalability, I guess it depends again on your organizational setup. You can easily scale up Isilon by buying more boxes from EMC, and there are certainly very large Isilon deployments out there. On the other hand, scaling HDFS is also not hard and can help you realize huge deployments.

In the end it will be a tradeoff of higher costs with Isilon but easier management vs. lower costs with higher effort for HDFS. This is my personal opinion, and both EMC and Cloudera might have stronger arguments for their respective storage (e.g. [EMC link]). You can also look out for the latest announcements on the blog.

Regards, Benjamin
08-08-2019
03:36 PM
Hey Harsh, Thanks for responding. As multiple clients are requesting data from HBase, at some point users sometimes don't get data, and EOF exceptions or connection interruptions occur. We are not able to keep track of the requested data or the size of the input and output data being sent to the end user. Regards Vinay K
08-04-2019
09:20 AM
ZooKeeper works on a quorum, and a quorum requires a majority of the servers to be up. With an ensemble of N servers the quorum size is floor(N/2) + 1, so if you have 3 servers and one is down, a majority of the servers is still working. You can read further: ZooKeeper Quorum
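A toy illustration of the majority rule for a few ensemble sizes:

```bash
# quorum = floor(N/2) + 1; tolerated failures = N - quorum
for n in 3 5 7; do
  q=$(( n / 2 + 1 ))
  echo "ensemble=$n quorum=$q tolerated_failures=$(( n - q ))"
done
```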
07-31-2019
04:26 AM
The done keyword is missing in your script. Close the loop body with done at the end.
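A minimal illustration (the loop body and path are placeholders): every for/while loop in a shell script needs a matching done.

```bash
for f in /tmp/*.log; do
  echo "processing $f"
done
```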