Member since: 05-29-2017
Posts: 408
Kudos Received: 123
Solutions: 9

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2786 | 09-01-2017 06:26 AM |
| | 1697 | 05-04-2017 07:09 AM |
| | 1459 | 09-12-2016 05:58 PM |
| | 2060 | 07-22-2016 05:22 AM |
| | 1626 | 07-21-2016 07:50 AM |
05-12-2017
10:17 AM
When I upgraded HDP from 2.5.3 to 2.6.0.3, I started getting the following warning in Ambari:

HTTP 503 response from http://localhost:21000/api/atlas/admin/status in 0.000s (HTTP Error 503: Service Unavailable)

I also see the following in the Atlas application.log file:

2017-05-12 00:06:43,877 INFO - [qtp2050835901-1065 - 0d5633f1-091f-47ac-949e-ed1f3583122e:] ~ Audit: UNKNOWN/127.0.0.1-127.0.0.1 performed request GET http://localhost:21000/api/atlas/admin/status (127.0.0.1) at time 2017-05-12T04:06Z (AUDIT:104)
2017-05-12 00:07:44,005 INFO - [qtp2050835901-1066 - 329b25a9-f9c9-4142-ae11-42f9faed4e7b:] ~ Audit: UNKNOWN/127.0.0.1 performed request GET http://localhost:21000/api/atlas/admin/status (127.0.0.1) at time 2017-05-12T04:07Z (AuditFilter:91)
2017-05-12 00:07:44,006 INFO - [qtp2050835901-1066 - 329b25a9-f9c9-4142-ae11-42f9faed4e7b:] ~ Audit: UNKNOWN/127.0.0.1-127.0.0.1 performed request GET http://localhost:21000/api/atlas/admin/status (127.0.0.1) at time 2017-05-12T04:07Z (AUDIT:104)
2017-05-12 00:08:43,872 INFO - [qtp2050835901-1066 - 2e95c961-12cc-4353-a360-32c1f40f6678:] ~ Audit: UNKNOWN/127.0.0.1 performed request GET http://localhost:21000/api/atlas/admin/status (127.0.0.1) at time 2017-05-12T04:08Z (AuditFilter:91)
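For reference, the same check Ambari performs can be reproduced by hand with curl against the status URL from the alert above (run on the Atlas host; this only prints the HTTP status code):

# Reproduce Ambari's Atlas health check manually; a healthy server returns 200,
# while the alert above indicates the endpoint is currently returning 503.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:21000/api/atlas/admin/status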
Labels:
- Hortonworks Data Platform (HDP)
05-04-2017
07:09 AM
There was a missing configuration for the Oozie service on the source cluster. Adding the following property resolved it:

oozie.service.HadoopAccessorService.hadoop.configurations=*={{hadoop_conf_dir}},m1.hdp22:8050=/hdptmp/testconfig,m2.hdp22:8050=/hdptmp/testconfig,m1.hdp22:8020=/hdptmp/testconfig,m2.hdp22:8020=/hdptmp/testconfig
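A quick way to verify the setting took effect is to check the deployed oozie-site.xml and the mapped config directory on the Oozie host. This is only a sketch and assumes the default HDP config location /etc/oozie/conf plus the /hdptmp/testconfig path from the property above:

# Confirm the property made it into the deployed oozie-site.xml.
grep -A1 "oozie.service.HadoopAccessorService.hadoop.configurations" /etc/oozie/conf/oozie-site.xml
# The mapped directory must exist on the Oozie host and contain the remote cluster's
# core-site.xml, hdfs-site.xml and yarn-site.xml.
ls -l /hdptmp/testconfig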
04-24-2017
01:24 PM
I am trying to schedule a Falcon feed and getting the following error. Note: it started failing after I moved the Falcon cluster entity to NameNode HA; it ran fine when the Falcon cluster did not use NameNode HA.

falcon entity -type feed -name replicationFeed2 -schedule
ERROR: Bad Request;default/E0803 : E0803: IO error, null
CausedBy: E0803: IO error, null

The Falcon log shows the following error:

2017-04-24 06:09:45,351 ERROR - [1068218748@qtp-461591680-39 - aaa42726-6436-4d13-ad2f-7576deb01baf:s0998dnz:POST//entities/schedule/feed/replicationFeed2] ~ Entity schedule failed for feed: replicationFeed2 (AbstractSchedulableEntityManager:104) org.apache.falcon.FalconException: E0803 : E0803: IO error, null
at org.apache.falcon.workflow.engine.OozieWorkflowEngine.dryRunInternal(OozieWorkflowEngine.java:252)
at org.apache.falcon.workflow.engine.OozieWorkflowEngine.schedule(OozieWorkflowEngine.java:181)
Labels:
- Apache Falcon
03-14-2017
01:34 PM
@Ravi Mutyala Yes, I saw that and forgot to mention it, but I wanted to know the impact if I disable it.
03-14-2017
01:20 PM
Hi Team, I have configured Kerberos and integrated it with our AD, and I also updated the krb5.conf file with some additional details. However, whenever I restart Ambari or any services, krb5.conf gets regenerated. Any idea how I can prevent it from being overwritten? Thanks in advance.
Labels:
- Apache Ambari
03-12-2017
11:00 AM
I have the following values in my krb5.conf:

[libdefaults]
renew_lifetime = 7d
forwardable = true
default_realm = HADOOPADMIN.COM
ticket_lifetime = 24h
dns_lookup_realm = false
dns_lookup_kdc = false
#default_tgs_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
#default_tkt_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5

[domain_realm]
m1.hdp22 = HADOOPADMIN.COM
adserver.ad.com = AD.COM

[logging]
default = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
kdc = FILE:/var/log/krb5kdc.log

[realms]
HADOOPADMIN.COM = {
admin_server = m1.hdp22
kdc = m1.hdp22
}
AD.COM = {
kdc = adserver.ad.com:88
master_kdc = adserver.ad.com:88
kpasswd = adserver.ad.com:464
kpasswd_server = adserver.ad.com:464
}
03-12-2017
10:51 AM
1 Kudo
Hi Team, I have integrated my Kerberos cluster with AD, but when I execute a hadoop command I get the following error:

security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before
17/03/12 06:32:35 WARN ipc.Client: Couldn't setup connection for sonu@AD.COM to m1.hdp22/192.168.56.41:8020
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Fail to create credential. (63) - No service creds)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:558)
at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:373)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:727)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:723)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
Labels:
- Apache Hadoop
- Kerberos
- Security
02-28-2017
07:52 PM
4 Kudos
Sometimes we need to get a list of all long-running YARN applications and, based on a threshold, kill them; sometimes we also need to do this only for a specific YARN queue. In such situations the following script will help (a queue-specific variant is sketched after the example run below).

#!/bin/bash
if [ "$#" -lt 1 ]; then
echo "Usage: $0 <max_life_in_mins>"
exit 1
fi
# Collect the IDs of all currently RUNNING applications (keep only rows that contain an application ID).
yarn application -list 2>/dev/null | grep "application_" | grep "RUNNING" | awk '{print $1}' > job_list.txt
for jobId in `cat job_list.txt`
do
finish_time=`yarn application -status $jobId 2>/dev/null | grep "Finish-Time" | awk '{print $NF}'`
# Skip applications that already finished (non-zero Finish-Time) instead of aborting the whole run.
if [ $finish_time -ne 0 ]; then
echo "App $jobId is not running"
continue
fi
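# Start-Time is reported in milliseconds since the epoch; divide by 1000 and let bc
# compute the elapsed time in minutes.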
time_diff=`date +%s`-`yarn application -status $jobId 2>/dev/null | grep "Start-Time" | awk '{print $NF}' | sed 's!$!/1000!'`
time_diff_in_mins=`echo "("$time_diff")/60" | bc`
echo "App $jobId is running for $time_diff_in_mins min(s)"
if [ $time_diff_in_mins -gt $1 ]; then
echo "Killing app $jobId"
yarn application -kill $jobId
else
echo "App $jobId should continue to run"
fi
done
[yarn@m1.hdp22 ~]$ ./kill_application_after_some_time.sh 30    (pass the threshold in minutes)
App application_1487677946023_5995 is running for 0 min(s)
App application_1487677946023_5995 should continue to run
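For the queue-specific case mentioned above, one possible variant is to filter the application list by queue before building job_list.txt. This is only a sketch: "etl" is a made-up queue name, and it assumes the queue name appears as its own whitespace-separated token in the yarn application -list output (as it does on HDP 2.x):

# Hypothetical variant: keep only RUNNING applications submitted to queue "etl".
QUEUE="etl"
yarn application -list 2>/dev/null | grep "application_" | grep "RUNNING" | grep -w "$QUEUE" | awk '{print $1}' > job_list.txt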
02-27-2017
06:43 PM
1 Kudo
Sometimes you may need the last accessed time of files in HDFS. You can get it in the following ways:

Option 1: Ranger audit. You can get it from the Ranger audit data, but I would not prefer this because the audit output is confusing due to the many temporary files and directories recorded in the audit DB.

Option 2: Use Java and build a small program. With the HDFS Java API you can get this (plus other useful information about HDFS files) in a few lines, which is the approach I used.

Step 1: Create the Java program:

package com.saurabh;
import java.io.*;
import java.util.*;
import java.net.*;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;
// For Date Conversion from long to human readable.
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Date;
import java.util.concurrent.TimeUnit;
public class Accesstime {
public static void main(String[] args) throws Exception {
System.out.println("usage: hadoop jar accessTime.jar <local file-path>");
System.out.println("********************************************************************");
System.out.println("Owner,LastAccessed(Days),LastAccessed(Date),FileName");
System.out.println("********************************************************************");
final String delimiter = ",";
List<String> inputLines = new ArrayList<String>();
if (args.length != 0) {
try {
FileSystem fs = FileSystem.get(new Configuration());
Scanner myScanner = new Scanner(new File(args[0]));
FileStatus status;
while (myScanner.hasNextLine())
{
String line = myScanner.nextLine();
status=fs.getFileStatus(new Path(line));
DateFormat df = new SimpleDateFormat("yyyy-MM-dd");
String owner = status.getOwner();
long lastAccessTimeLong = status.getAccessTime();
Date lastAccessTimeDate = new Date(lastAccessTimeLong);
Date date = new Date();
String currentDate = df.format(date);
// System.out.println(currentDate + " " + df.format(lastAccessTimeDate));
long diff = date.getTime() - lastAccessTimeDate.getTime();
inputLines.add(owner+delimiter+TimeUnit.DAYS.convert(diff, TimeUnit.MILLISECONDS)+delimiter+df.format(lastAccessTimeDate)+delimiter+line);
}
Comparator<String> comp = new Comparator<String>() {
public int compare(String line1, String line2) {
return (-1*(Long.valueOf(line1.split(delimiter)[1].trim())
.compareTo(
Long.valueOf(line2.split(delimiter)[1]
.trim()))));
}
};
Collections.sort(inputLines, comp);
Iterator itr = inputLines.iterator();
// System.out.println("--------Printing Array List-----------");
while (itr.hasNext()) {
System.out.println(itr.next());
}
} catch (Exception e) {
System.out.println("File not found");
e.printStackTrace();
}
}else{
System.out.println("Please provide the absolute file path.");
}
}
}

Step 2: Export it to a jar and copy the jar file to your cluster.

Step 3: Create a local file containing the absolute HDFS file paths:

[root@m1 ~]# cat input.txt
/user/raghu/wordcount_in/words.txt
/user/raghu/wordcount_out/_SUCCESS
/user/raghu/wordcount_out/part-r-00000

Step 4: Now run the jar; it will print all the required details (Owner,LastAccessed(Days),LastAccessed(Date),FileName):

[root@m1 ~]# hadoop jar accessTime.jar input.txt
usage: hadoop jar accessTime.jar <local file-path>
********************************************************************
Owner,LastAccessed(Days),LastAccessed(Date),FileName
********************************************************************
saurkuma,20,2017-02-06,/user/raghu/wordcount_out/_SUCCESS
raghu,16,2017-02-10,/user/raghu/wordcount_in/words.txt
raghu,16,2017-02-10,/user/raghu/wordcount_out/part-r-00000
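If you do not export the jar from an IDE, a command-line build along these lines should also work. This is just a sketch and assumes the source above is saved as com/saurabh/Accesstime.java on a node with the Hadoop client installed:

# Compile against the cluster's Hadoop classpath and package the class into accessTime.jar.
mkdir -p classes
javac -cp $(hadoop classpath) -d classes com/saurabh/Accesstime.java
jar cf accessTime.jar -C classes .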