locating HBase log file created by ImportTsv

Contributor

When loading a CSV file into an HBase table, some bad lines are dropped. How can I identify which lines are dropped?


A similar case is here. https://community.hortonworks.com/questions/73985/no-data-shown-in-hbase-after-importtsv.html


Running ImportTsv generates big log files, as mentioned here: https://community.hortonworks.com/articles/4942/import-csv-data-into-hbase-using-importtsv.html

Maybe the log files can help me, but I do not know where they are. I expect them to be in HDFS rather than on the local Linux filesystem, so I looked in the HDFS path /app-logs/hdfs/logs/application_1557438882545_0077. The folder name "application_1557438882545_0077" is the MapReduce job's application ID. The folder contains a single file, name1.abc.local_45454_1562965335601, which is not in a human-readable format.

3 REPLIES

Re: locating HBase log file created by ImportTsv

If you inspect the Mapper log files, you should be able to find mention of an unparseable row when one is processed. You may have to increase the log level from INFO to DEBUG.

Each Mapper is assigned an InputSplit which will be a contiguous group of lines from the input files that you specified (e.g. fileA lines 50 through 200). You can also use this information to work backwards.
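
As a complement to reading the Mapper logs, the input file itself can be checked for lines that are likely to be rejected. The following is a minimal Python sketch, assuming that a line is dropped when its field count differs from the number of columns given to ImportTsv and that fields are split on the separator with no quote handling. The function name `find_suspect_lines` and its parameters are hypothetical, not part of HBase.

```python
import io


def find_suspect_lines(lines, expected_fields, separator=","):
    """Return (line_number, text) pairs whose field count differs from expected_fields.

    Assumption: a plain split on the separator, with no CSV quoting rules.
    """
    suspects = []
    for line_number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\r\n")
        if len(text.split(separator)) != expected_fields:
            suspects.append((line_number, text))
    return suspects


if __name__ == "__main__":
    # Tiny in-memory sample; for a real file, pass an open file object instead.
    sample = io.StringIO("row1,a,b\nrow2,c\nrow3,d,e\n")
    for number, text in find_suspect_lines(sample, expected_fields=3):
        print(f"line {number}: {text!r}")
```

The reported line numbers can then be compared against whatever the Mapper logs say about the rows they could not parse.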

Re: locating HBase log file created by ImportTsv

Contributor

@Josh Elser

Thanks for the reply. Where can I find those mapper log files? I tried the job tracker UI (port 8088) via the Ambari link "Yarn - ResourceManagerUI", but could not find any log records. I did locate the MapReduce job/application ID. On the "logs" tab for this job, I picked the only attempt, "appattempt_1557438882545_0078_00001", and got the message "No container data available!"

Re: locating HBase log file created by ImportTsv

Contributor

By the way, on the Job Tracker UI page for this job, I saw a link to "Log", but that page does not look like a mapper log. It contains sections for different log types, such as directory info, launch_container.sh, … , stderr, stdout, and syslog.

In addition, on the Ambari - YARN - Configs - Advanced page, "Enable Log Aggregation" is enabled.
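
With log aggregation enabled, the container logs of a finished job are normally collected into HDFS in a combined, non-text file. That would be consistent with the unreadable file under /app-logs. The `yarn logs -applicationId` command prints those logs as plain text. Below is a minimal Python sketch that wraps the command and filters its output. The helper names are hypothetical, the `yarn` CLI is assumed to be on the PATH and run as a user allowed to read the job's logs, and the "Bad line" search string is only an assumption about what the ImportTsv mapper writes for a skipped row.

```python
import subprocess


def yarn_logs_command(application_id):
    """Build the `yarn logs` command line for one application."""
    return ["yarn", "logs", "-applicationId", application_id]


def fetch_aggregated_logs(application_id):
    """Run `yarn logs` and return its text output (needs the yarn CLI on PATH)."""
    result = subprocess.run(
        yarn_logs_command(application_id),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def lines_mentioning(log_text, needle):
    """Return the log lines that contain needle, ignoring case."""
    lowered = needle.lower()
    return [line for line in log_text.splitlines() if lowered in line.lower()]
```

On a cluster node this might be used as `lines_mentioning(fetch_aggregated_logs("application_1557438882545_0077"), "Bad line")`. If nothing matches, searching the stderr and syslog sections of the output by hand is the safer route.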
