Member since: 09-02-2016
Posts: 523
Kudos Received: 89
Solutions: 42
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 2309 | 08-28-2018 02:00 AM |
|  | 2160 | 07-31-2018 06:55 AM |
|  | 5070 | 07-26-2018 03:02 AM |
|  | 2433 | 07-19-2018 02:30 AM |
|  | 5863 | 05-21-2018 03:42 AM |
05-08-2018
03:48 AM
@hendry Please run INVALIDATE METADATA and try again: INVALIDATE METADATA [[db_name.]table_name]
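A minimal sketch of running it from impala-shell (the daemon host and table name below are placeholders, not from the original thread):

```
# Invalidate metadata for a single table (hypothetical host and table)
impala-shell -i impala-host:21000 -q "INVALIDATE METADATA default.mytable"

# Or invalidate everything if the affected table is not known (heavier operation)
impala-shell -i impala-host:21000 -q "INVALIDATE METADATA"
```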
05-02-2018
04:44 AM
@balajivsn You can stop the cluster using CM -> top-left Cluster menu -> Stop. Yes, it will stop all the available services, e.g. Hue, Hive, Spark, Flume, YARN, HDFS, ZooKeeper, etc., and it won't disturb your hosts or the Cloudera Management Service. Note: you don't need to handle daemons like the NameNode separately with this approach.
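If you prefer to script the same thing, here is a sketch using the Cloudera Manager REST API; the API version (v19), port 7180, credentials, and the cluster name "Cluster 1" are all assumptions you would adjust for your environment:

```
# Stop every service in the cluster via the CM API (hypothetical host, credentials, cluster name)
curl -u admin:admin -X POST \
  "http://cm-host:7180/api/v19/clusters/Cluster%201/commands/stop"

# Check the cluster afterwards to confirm the services have stopped
curl -u admin:admin "http://cm-host:7180/api/v19/clusters/Cluster%201"
```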
04-30-2018
12:31 PM
@balajivsn The link you are referring to belongs to 5.4.x; please refer to the links below (5.14.x) for a little more detail. There are two types of backup:

1. HDFS metadata backup: https://www.cloudera.com/documentation/enterprise/5-14-x/topics/cm_mc_hdfs_metadata_backup.html — you need to follow all the steps, including "Stop the cluster. It is particularly important that the NameNode role process is not running so that you can make a consistent backup."

2. NameNode metadata backup: https://www.cloudera.com/documentation/enterprise/5-14-x/topics/cm_mc_nn_metadata_backup.html — this can be done using $ hdfs dfsadmin -fetchImage backup_dir

Now to answer your question: the first link says "Cloudera recommends backing up HDFS metadata before a major upgrade". So in a real production cluster we perform the HDFS metadata backup and the major upgrade during a downtime window, and the steps given there are the recommended way to get a consistent backup. But if your situation is just a matter of backing up the NameNode at a regular interval, then I believe you are correct: you can switch on safe mode, take the backup, and leave safe mode (see the sketch below), or you can try the option from the second link. Note: please make sure to test it in a lower environment before applying it in production.
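A rough sketch of that safe-mode variant, assuming you run it as the hdfs superuser and that /data/backups/nn is a hypothetical backup directory:

```
# Enter safe mode so the namespace cannot change during the backup
sudo -u hdfs hdfs dfsadmin -safemode enter

# Force a consistent checkpoint, then copy the latest fsimage to the backup directory
sudo -u hdfs hdfs dfsadmin -saveNamespace
sudo -u hdfs hdfs dfsadmin -fetchImage /data/backups/nn

# Leave safe mode so the cluster resumes normal operation
sudo -u hdfs hdfs dfsadmin -safemode leave
```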
04-30-2018
06:05 AM
@Alan-H There could be different ways, but I tried the steps below and they are working for me.

Step 1: using a SELECT clause with hard-coded values

create table default.mytest(col1 string, col2 int);
insert into default.mytest
select 'For testing single quote\'s', 1;
insert into default.mytest
select 'For testing double quote\"s', 2;
select * from default.mytest;

Step 2: using a SELECT clause, passing the value as a parameter

set hivevar:col1 = 'For testing single quote\'s';
set hivevar:col2 = 3;
insert into default.mytest
select ${hivevar:col1}, ${hivevar:col2};
select * from default.mytest;

Step 3: using a SELECT clause, passing the value as a parameter

set hivevar:col1 = 'For testing double quote\"s';
set hivevar:col2 = 4;
insert into default.mytest
select ${hivevar:col1}, ${hivevar:col2};
select * from default.mytest;

Step 4: clean up

drop table default.mytest;
04-30-2018
04:10 AM
@bhaveshsharma03 In fact there is no standard answer to this question, as it is purely based on your business model, cluster size, Sqoop export/import frequency, data volume, hardware capacity, etc. I can give a few points based on my experience; hope it may help you.

1. 75% of the Sqoop scripts (non-priority) will use the default number of mappers, for various reasons: we don't want to use all the available resources for Sqoop alone.
2. Also, we don't want to apply all the possible performance-tuning methods on those non-priority jobs, as it may disturb the RDBMS (source/target) too.
3. Get in touch with the RDBMS owner to find their non-busy hours, identify the priority Sqoop scripts (based on your business model), and apply the performance-tuning methods on the priority scripts based on data volume (not only rows; hundreds of columns also matter). Repeat this if you have more than one database. See the sketch below for the kind of knobs involved.
4. Regarding who is responsible: in most cases, if you have a small cluster used by very few teams, then developers and admins can work together, but if you have a very large cluster used by many teams, then it is out of the admin's scope... again, it depends.
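For illustration only, a sketch of the difference between a default and a tuned import; the JDBC URL, credentials, table, and column names are hypothetical:

```
# Non-priority job: rely on the default of 4 mappers
sqoop import \
  --connect jdbc:mysql://rdbms-host:3306/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/staging/orders

# Priority job: raise parallelism and pick an evenly distributed split column,
# after agreeing a time window with the RDBMS owner
sqoop import \
  --connect jdbc:mysql://rdbms-host:3306/sales \
  --username etl_user -P \
  --table orders \
  --split-by order_id \
  --num-mappers 8 \
  --fetch-size 1000 \
  --target-dir /data/staging/orders_priority
```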
04-24-2018
02:25 AM
@ps40 The link below is for the Enterprise edition; I believe it should be the same for the other editions too: https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_vd.html

1. The first point: according to the above link, Ubuntu Xenial 16.04 is only supported by CDH 5.12.2 or above. So if you have decided to upgrade Ubuntu, then you have to upgrade CDH/CM as well.
2. The second point: according to the link below, "If you are upgrading CDH or Cloudera Manager as well as the OS, upgrade the OS first": https://www.cloudera.com/documentation/enterprise/5-11-x/topics/cm_ag_upgrading_os.html

Hope it gives you some insight! A quick way to confirm what you are currently running is sketched below.
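Purely as an illustration of checking the current versions before planning the upgrade order (assuming Debian/Ubuntu hosts with the Hadoop client available on the path):

```
# OS release (Ubuntu)
lsb_release -a

# Cloudera Manager packages currently installed on this host
dpkg -l 'cloudera-manager*'

# Active CDH version as seen by the Hadoop client
hadoop version
```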
04-17-2018
07:56 AM
@dpugazhe Below are the usual exercises we follow to reduce log history, but...

a. It purely depends on your client's business; if they are not demanding a longer log history, then you can try this.
b. I've given a few samples below; you don't need to reduce the history for all the logs. Please do your own research to see which history files are taking more space (a quick way to check is sketched after this list) and act by reducing the maximum number and size of the history files.

CM -> HDFS -> Configuration -> search for the below:

1. navigator.client.max_num_audit_log -> the default value is 10; you can reduce it to 8 or 6 (it is recommended to keep more history in general)
2. navigator.audit_log_max_file_size -> the default value is 100 MB; you can reduce it to 80 MB or 50 MB
Note: you can try both, or just one of them.
3. DataNode Max Log Size -> the default value is 200 MB; reduce as needed
4. DataNode Maximum Log File Backups -> the default value is 10; reduce as needed
5. NameNode Max Log Size -> the default value is 200 MB; reduce as needed
6. NameNode Maximum Log File Backups -> the default value is 300; reduce as needed

NOTE: I am repeating again, please consider points a & b before you take action.
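One way to see which log files are actually consuming space, assuming the usual CDH log locations (the paths below are assumptions; adjust to wherever your roles write their logs):

```
# Largest files under the default HDFS and YARN log directories on this host
du -ah /var/log/hadoop-hdfs /var/log/hadoop-yarn 2>/dev/null | sort -rh | head -20
```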
04-17-2018
06:01 AM
1 Kudo
@ronnie10 The issue you are getting is not related to Kerberos. I think you don't have access to /user/root under the path below; please try to access your own home directory instead, it may help you: 'http://192.168.1.7:14000/webhdfs/v1/user/root/t?op=LISTSTATUS'
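For example, a sketch of listing your own home directory through HttpFS; "your-user" is a placeholder, and the user.name query parameter only applies if the cluster is using simple (pseudo) authentication rather than Kerberos:

```
# List your own home directory instead of /user/root (replace your-user)
curl -i "http://192.168.1.7:14000/webhdfs/v1/user/your-user?op=LISTSTATUS&user.name=your-user"
```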
04-16-2018
05:21 AM
@ludof Yes, in general developers will not have access to create a keytab; you have to contact your admin for that. Usually the admin has permission to create one for you, but some organizations have a dedicated security team handling LDAP, AD, Kerberos, etc. It depends on your organization, but start with your admin.
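Just for context, a sketch of what the admin (or security team) typically runs on an MIT KDC to produce the keytab; the principal, realm, and keytab path are hypothetical:

```
# Run by a Kerberos admin: create the principal and export its keys to a keytab
kadmin -p admin/admin@EXAMPLE.COM -q "addprinc -randkey ludof@EXAMPLE.COM"
kadmin -p admin/admin@EXAMPLE.COM -q "xst -k /tmp/ludof.keytab ludof@EXAMPLE.COM"

# The developer then verifies the keytab they receive
klist -kt /tmp/ludof.keytab
kinit -kt /tmp/ludof.keytab ludof@EXAMPLE.COM
```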
04-16-2018
05:00 AM
@ludof Please try to follow the input from the link below (read all the comments till the end by pressing "show more"); a similar issue has been discussed there, and it may help you: https://stackoverflow.com/questions/44376334/how-to-fix-delegation-token-can-be-issued-only-with-kerberos-or-web-authenticat
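One common first check for this class of error is to make sure the client holds a valid Kerberos ticket before it asks for a delegation token; a sketch with a placeholder principal and keytab:

```
# Confirm whether you currently hold a ticket
klist

# If not, authenticate first, then rerun the failing job
kinit -kt /path/to/your.keytab your-user@EXAMPLE.COM
```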