Member since: 05-29-2017
Posts: 408
Kudos Received: 123
Solutions: 9
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2785 | 09-01-2017 06:26 AM
 | 1697 | 05-04-2017 07:09 AM
 | 1459 | 09-12-2016 05:58 PM
 | 2060 | 07-22-2016 05:22 AM
 | 1625 | 07-21-2016 07:50 AM
05-04-2017
07:09 AM
There was a missing configuration in the Oozie service on the source cluster:
oozie.service.HadoopAccessorService.hadoop.configurations=*={{hadoop_conf_dir}},m1.hdp22:8050=/hdptmp/testconfig,m2.hdp22:8050=/hdptmp/testconfig,m1.hdp22:8020=/hdptmp/testconfig,m2.hdp22:8020=/hdptmp/testconfig
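For reference, the general form of this property (as described in oozie-default.xml) is a comma-separated list of AUTHORITY=HADOOP_CONF_DIR pairs, where AUTHORITY is the HOST:PORT of the ResourceManager or NameNode and the wildcard * entry is used when there is no exact match. The host names and paths below are placeholders, not values from this cluster:
oozie.service.HadoopAccessorService.hadoop.configurations=*={{hadoop_conf_dir}},<rm-host>:<rm-port>=<conf-dir-for-that-cluster>,<nn-host>:<nn-port>=<conf-dir-for-that-cluster>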
03-14-2017
01:35 PM
If you disable it, you need to manage /etc/krb5.conf on your own, either with Puppet/Chef or some other configuration-management tool. Ambari will not overwrite whatever is present in /etc/krb5.conf. A minimal hand-managed example is sketched below.
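Purely as an illustration (not from the original answer), a minimal hand-managed krb5.conf could look like the following; the realm and KDC host names are placeholders you would replace with your own:
[libdefaults]
  default_realm = EXAMPLE.COM
  dns_lookup_kdc = false
  ticket_lifetime = 24h
[realms]
  EXAMPLE.COM = {
    kdc = kdc.example.com
    admin_server = kdc.example.com
  }
[domain_realm]
  .example.com = EXAMPLE.COM
  example.com = EXAMPLE.COM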
03-14-2017
04:21 PM
Hello @Saurabh, if you look at the error message closely, it says 'No service creds'. Since you are running a hadoop command, this most probably means that the NameNode service keytab is either missing or not good. In both cases, please check the NameNode log for any errors during service startup. To verify the service keytab, try running these on the NameNode (an additional check is sketched below):
su - hdfs
kinit -kt /etc/security/keytabs/nn.service.keytab nn/<nn-host-fqdn>@REALM
The last command should give you a correct TGT for the NN service principal, which would show that the NN service keytab is good. Lastly, you can try to regenerate the keytabs for all the services. Hope this helps!
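As an extra check (not part of the original answer), standard MIT Kerberos commands can confirm both the keytab contents and that the kinit actually produced a ticket:
# list the principals stored in the NameNode keytab
klist -kt /etc/security/keytabs/nn.service.keytab
# after the kinit above, confirm a ticket was granted
klist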
02-28-2017
07:52 PM
4 Kudos
Sometimes we need to get a list of all long-running YARN applications and, based on a threshold, kill them; sometimes we also need to do this only for a specific YARN queue. In such a situation the following script will help you do the job (a queue-specific variant is sketched after the sample run).
#!/bin/bash
if [ "$#" -lt 1 ]; then
echo "Usage: $0 <max_life_in_mins>"
exit 1
fi
# collect the IDs of all currently RUNNING applications
yarn application -list 2>/dev/null | grep "RUNNING" | awk '{print $1}' > job_list.txt
for jobId in `cat job_list.txt`
do
# Finish-Time is 0 for applications that are still running
finish_time=`yarn application -status $jobId 2>/dev/null | grep "Finish-Time" | awk '{print $NF}'`
if [ $finish_time -ne 0 ]; then
echo "App $jobId is not running"
continue
fi
# Start-Time is reported in epoch milliseconds, hence the appended /1000
time_diff=`date +%s`-`yarn application -status $jobId 2>/dev/null | grep "Start-Time" | awk '{print $NF}' | sed 's!$!/1000!'`
time_diff_in_mins=`echo "("$time_diff")/60" | bc`
echo "App $jobId is running for $time_diff_in_mins min(s)"
if [ $time_diff_in_mins -gt $1 ]; then
echo "Killing app $jobId"
yarn application -kill $jobId
else
echo "App $jobId should continue to run"
fi
done
Sample run (the argument is the kill threshold in minutes):
[yarn@m1.hdp22 ~]$ ./kill_application_after_some_time.sh 30
App application_1487677946023_5995 is running for 0 min(s)
App application_1487677946023_5995 should continue to run
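To limit the check to a specific queue, as mentioned at the top, one minimal variant is to change the listing line as sketched here. This is only a sketch, assuming the queue is named "default", that application names contain no spaces, and that Queue and State are the 5th and 6th columns of `yarn application -list` output on your version:
# keep only RUNNING applications submitted to the "default" queue
yarn application -list 2>/dev/null | awk '$5 == "default" && $6 == "RUNNING" {print $1}' > job_list.txt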
02-27-2017
06:43 PM
1 Kudo
Sometimes you may have a requirement to get the last accessed time of files in HDFS. You can get it in the following ways:
Option 1: Ranger audit. You can get it via the Ranger audit logs, but I would not prefer this, because the audit output is confusing due to the many tmp files and dirs present in the audit DB.
Option 2: Use Java and build a small program. With the help of the Java APIs you can get this, plus other useful information about HDFS files, in a few lines and in a very clean and efficient way. This is what I used to fulfill the requirement.
Step 1: Create the Java program:
package com.saurabh;
import java.io.*;
import java.util.*;
import java.net.*;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;
// For Date Conversion from long to human readable.
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Date;
import java.util.concurrent.TimeUnit;
public class Accesstime {
public static void main(String[] args) throws Exception {
System.out.println("usage: hadoop jar accessTime.jar <local file-path>");
System.out.println("********************************************************************");
System.out.println("Owner,LastAccessed(Days),LastAccessed(Date),FileName");
System.out.println("********************************************************************");
final String delimiter = ",";
List<String> inputLines = new ArrayList<String>();
if (args.length != 0) {
try {
FileSystem fs = FileSystem.get(new Configuration());
Scanner myScanner = new Scanner(new File(args[0]));
FileStatus status;
while (myScanner.hasNextLine())
{
String line = myScanner.nextLine();
status=fs.getFileStatus(new Path(line));
DateFormat df = new SimpleDateFormat("yyyy-MM-dd");
String owner = status.getOwner();
// getAccessTime() returns the last access time in epoch milliseconds
long lastAccessTimeLong = status.getAccessTime();
Date lastAccessTimeDate = new Date(lastAccessTimeLong);
Date date = new Date();
String currentDate = df.format(date);
// System.out.println(currentDate + " " + df.format(lastAccessTimeDate));
long diff = date.getTime() - lastAccessTimeDate.getTime();
inputLines.add(owner+delimiter+TimeUnit.DAYS.convert(diff, TimeUnit.MILLISECONDS)+delimiter+df.format(lastAccessTimeDate)+delimiter+line);
}
// sort the output lines in descending order of days since last access
Comparator<String> comp = new Comparator<String>() {
public int compare(String line1, String line2) {
return (-1*(Long.valueOf(line1.split(delimiter)[1].trim())
.compareTo(
Long.valueOf(line2.split(delimiter)[1]
.trim()))));
}
};
Collections.sort(inputLines, comp);
Iterator itr = inputLines.iterator();
// System.out.println("--------Printing Array List-----------");
while (itr.hasNext()) {
System.out.println(itr.next());
}
} catch (Exception e) {
System.out.println("File not found");
e.printStackTrace();
}
}else{
System.out.println("Please provide the absolute file path.");
}
}
}
Step 2: Export to a jar and then copy the jar file to your cluster (a command-line build sketch follows below).
Step 3: Create one local file with the absolute HDFS file paths:
[root@m1 ~]# cat input.txt
/user/raghu/wordcount_in/words.txt
/user/raghu/wordcount_out/_SUCCESS
/user/raghu/wordcount_out/part-r-00000
Step 4: Now run your jar file and it will give you all the required details (Owner,LastAccessed(Days),LastAccessed(Date),FileName):
[root@m1 ~]# hadoop jar accessTime.jar input.txt
usage: hadoop jar accessTime.jar <local file-path>
********************************************************************
Owner,LastAccessed(Days),LastAccessed(Date),FileName
********************************************************************
saurkuma,20,2017-02-06,/user/raghu/wordcount_out/_SUCCESS
raghu,16,2017-02-10,/user/raghu/wordcount_in/words.txt
raghu,16,2017-02-10,/user/raghu/wordcount_out/part-r-00000
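If you prefer to build from the command line instead of exporting the jar from an IDE, a minimal sketch (assuming the source is saved as com/saurabh/Accesstime.java on a node that has the Hadoop client installed) would be:
# compile against the Hadoop client libraries available on the node
mkdir -p classes
javac -cp "$(hadoop classpath)" -d classes com/saurabh/Accesstime.java
# package the class, recording the main class in the manifest
jar cfe accessTime.jar com.saurabh.Accesstime -C classes .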
02-27-2017
03:46 PM
Here is the solution: https://community.hortonworks.com/articles/82106/ambari-manually-disable-kerberos-authentication-fo.html
02-20-2017
10:44 AM
1 Kudo
@Saurabh You can try the following code:
import java.io.*;
import java.util.*;
import java.net.*;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;
public class FileStatusChecker {
public static void main (String [] args) throws Exception {
try{
FileSystem fs = FileSystem.get(new Configuration());
FileStatus[] status = fs.listStatus(new Path("hdfs://sandbox.hortonworks.com:8020/testing/ambari-server.log")); // you need to pass in your hdfs path
for (int i=0;i<status.length;i++){
String path = status[i].getPath().toString();
String owner = status[i].getOwner();
System.out.println("\n\t PATH: " + path + "\t OWNER: " +owner);
}
} catch(Exception e){
System.out.println("File not found");
e.printStackTrace();
}
}
}
Here in the above code you can pass either a specific file:
new Path("hdfs://sandbox.hortonworks.com:8020/testing/ambari-server.log")
or a directory as well:
new Path("hdfs://sandbox.hortonworks.com:8020/testing")
A quick compile-and-run sketch follows below.
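To try it quickly on a cluster node, one possible approach (a sketch, assuming the code is saved as FileStatusChecker.java and a Hadoop client with the cluster configuration is installed on that node) is:
# compile against the Hadoop jars shipped with the client
javac -cp "$(hadoop classpath)" FileStatusChecker.java
# run it with the cluster configuration and Hadoop jars on the classpath
java -cp "$(hadoop classpath):." FileStatusChecker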
02-07-2017
06:42 AM
@Saurabh You can check /var/log/messages to see if the installation has started. Also, if you want to check how much data has been downloaded: yum keeps the package in its cache while downloading, so you can run 'du -sh' under the watch command to check the status.
Example. Before downloading the package:
[root@prodnode1 ~]# ls -lrt /var/cache/yum//x86_64/6/Updates-ambari-2.4.0.1/packages/
total 0
Download started:
[root@prodnode1 ~]# /usr/bin/yum -d 0 -e 0 -y install ambari-metrics-hadoop-sink
Status of the cache directory:
[root@prodnode1 ~]# ls -lrt /var/cache/yum//x86_64/6/Updates-ambari-2.4.0.1/packages/
total 4552
-rw-r--r--. 1 root root 4660232 Aug 30 20:49 ambari-metrics-hadoop-sink-2.4.0.1-1.x86_64.rpm
After the installation is complete, the package gets removed from the cache location. You can run something like the following to keep watch over the download:
[root@prodnode1 ~]# watch du -sh /var/cache/yum//x86_64/6/Updates-ambari-2.4.0.1/packages/ambari-metrics-hadoop-sink-2.4.0.1-1.x86_64.rpm
Hope this is what you were looking for. Please do let us know if you have any further questions! 🙂
02-04-2017
04:00 AM
Thanks a lot @swagle. I have commented out this backup call in the function and now it is not creating the backup copy. Thanks once again.
# update properties in a section-less properties file
# Cannot use ConfigParser due to bugs in version 2.6
def update_properties(propertyMap):
  conf_file = search_file(AMBARI_PROPERTIES_FILE, get_conf_dir())
  #backup_file_in_temp(conf_file)
  if propertyMap is not None and conf_file is not None:
    properties = Properties()
    try:
      with open(conf_file, 'r') as file:
        properties.load(file)
    except (Exception), e:
      print_error_msg('Could not read "%s": %s' % (conf_file, e))
      return -1
    for key in propertyMap.keys():
      properties.removeOldProp(key)
      properties.process_pair(key, str(propertyMap[key]))
    for key in properties.keys():
      if not propertyMap.has_key(key):
        properties.removeOldProp(key)
    with open(conf_file, 'w') as file:
      properties.store_ordered(file)
02-02-2017
10:07 AM
@Jay SenSharma: Can you please help me get the last accessed time for a directory as well? The above one is not working for directories.