Member since: 10-11-2013
Posts: 25
Kudos Received: 5
Solutions: 3
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 8708 | 05-11-2017 03:07 PM |
 | 2429 | 08-13-2014 03:46 AM |
 | 1767 | 12-10-2013 07:51 AM |
05-11-2017
03:07 PM
3 Kudos
Have you tried this: hive.server2.parallel.ops.in.session=true https://github.com/cloudera/hue/issues/515 Regards, Andrey
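For anyone hitting the same error, the setting above is a HiveServer2 configuration property; a minimal hive-site.xml fragment is sketched below (whether it belongs in hive-site.xml directly or in a configuration safety valve depends on your deployment, so treat the placement as an assumption):

```xml
<!-- Allow multiple parallel operations in a single HiveServer2 session;
     some Hue versions multiplex several queries over one session and
     fail without this. -->
<property>
  <name>hive.server2.parallel.ops.in.session</name>
  <value>true</value>
</property>
```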
08-13-2014
03:46 AM
Hi Pavel, I had the same problem. You can try this:
1) Open /opt/cloudera/parcels/CDH-5.1.0-1.cdh5.1.0.p0.53/lib/hue/apps/search/src/search/views.py
2) Add this before LOG = logging.getLogger(__name__):

    import sys
    reload(sys)  # Reload does the trick!
    sys.setdefaultencoding('UTF8')

3) Restart the Hue service.
Hope this helps.
Update: Here is a better solution: https://issues.cloudera.org/browse/HUE-2279
P.S. Judging by your avatar, I would guess you are from Russia. Is that right?
Regards, Andrey
12-10-2013
07:51 AM
Resolved. I was using WholeFileInputFormat and used value.getBytes() instead of value.copyBytes().
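For readers who hit the same corruption: BytesWritable.getBytes() returns the internal backing array, which is usually longer than the logical value because the buffer is reused and over-allocated, so stale trailing bytes leak into binary formats like .docx; copyBytes() returns an exact-length copy. The stand-in class below is not the real org.apache.hadoop.io.BytesWritable, just a minimal sketch of its buffer behavior:

```java
import java.util.Arrays;

// Stand-in mimicking BytesWritable's reusable, over-allocated buffer.
class FakeBytesWritable {
    private byte[] buf = new byte[0];
    private int length = 0;

    // Like the real class, set() grows the buffer with headroom and
    // reuses it on later calls, leaving old bytes past the new length.
    void set(byte[] data) {
        if (buf.length < data.length) {
            buf = new byte[data.length * 2];  // over-allocates
        }
        System.arraycopy(data, 0, buf, 0, data.length);
        length = data.length;
    }

    byte[] getBytes()  { return buf; }                         // padded array
    byte[] copyBytes() { return Arrays.copyOf(buf, length); }  // exact copy
    int getLength()    { return length; }
}

public class Main {
    public static void main(String[] args) {
        FakeBytesWritable w = new FakeBytesWritable();
        w.set("big payload".getBytes());  // allocates a 22-byte buffer
        w.set("tiny".getBytes());         // reuses it; 18 stale bytes remain
        System.out.println(w.getBytes().length);   // 22, not 4
        System.out.println(w.copyBytes().length);  // 4
    }
}
```

Feeding getBytes() to a parser hands it the stale tail as well, which is exactly the kind of thing that breaks a strict ZIP-based format like .docx while tolerant text parsers shrug it off.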
12-10-2013
03:23 AM
Hello, I have written a Java map-only program for sending files from a folder on HDFS to Solr. It works fine for all files except *.docx (program v1). I also modified this program to run as a simple Java program without Hadoop; it takes files from an ext3 filesystem and sends them to Solr (program v2). Using program v2 I was able to send my *.docx file to Solr, but if I put it in HDFS and start the MapReduce program (v1) I can't index it. I get this error:

org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@42e20459
 at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:225)
 at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1909)
 at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:739)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:380)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:169)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at org.apache.solr.servlet.ProxyUserFilter.doFilter(ProxyUserFilter.java:241)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at org.apache.solr.servlet.SolrHadoopAuthenticationFilter$2.doFilter(SolrHadoopAuthenticationFilter.java:140)
 at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:384)
 at org.apache.solr.servlet.SolrHadoopAuthenticationFilter.doFilter(SolrHadoopAuthenticationFilter.java:145)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at org.apache.solr.servlet.HostnameFilter.doFilter(HostnameFilter.java:86)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
 at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
 at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
 at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
 at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@42e20459
 at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
 at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
 at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
 at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
 ... 30 more
Caused by: org.apache.poi.openxml4j.exceptions.InvalidOperationException: Can't open the specified file: '/var/lib/solr/apache-tika-242864397475446795.tmp'
 at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:103)
 at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:207)
 at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:70)
 at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:82)
 at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
 ... 33 more

Also, there is no file apache-tika-242864397475446795.tmp in the folder /var/lib/solr/. Any suggestions?
Labels:
- Apache Hadoop
- Apache Solr
- HDFS
- MapReduce
- Security
10-28-2013
10:44 AM
No, I stopped all jobs before changing the configuration in the MapReduce service, and restarted the whole cluster. I also checked the folder mapred/local/taskTracker/hdfs/jobcache and I am sure it was empty. Thank you for the link, but I found nothing about the jobcache folder. Also, after a job fails or completes, its job folder is deleted from the jobcache folder along with all the attempt_xxxx folders.
10-28-2013
08:36 AM
Hey Chris, I'm sorry, I was wrong. It did not help. I set the MapReduce Service Configuration Safety Valve for mapred-site.xml to:

<property>
  <name>mapreduce.tasktracker.local.cache.numberdirectories</name>
  <value>2000</value>
</property>

And now I see that I have no space left on disk. In the folder mapred/local/taskTracker/hdfs/jobcache/job_201310281050_0001 there are more than 6000 files. (On the other host there are more than 5000.) P.S. I checked job.xml, and the value mapreduce.tasktracker.local.cache.numberdirectories is there and set to 2000. Thanks Andrey
10-26-2013
12:29 AM
Hey Chris, I went to Services > MapReduce > Configuration (View and Edit) and pasted mapreduce.tasktracker.local.cache.numberdirectories into the search box. I found nothing. I also tried just "cache" and just "local", and again nothing. Thanks Markovich
10-26-2013
12:24 AM
Hey Chris, Yes, new workflows work until JT failover. I am running my jobs from the command line now. I have another strange thing. Here is the output of my MapReduce job:

13/10/25 18:00:46 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/10/25 18:00:56 INFO input.FileInputFormat: Total input paths to process : 12451
13/10/25 18:01:06 INFO mapred.JobClient: Running job: job_201310251009_0004
13/10/25 18:01:07 INFO mapred.JobClient: map 0% reduce 0%
13/10/25 18:03:48 INFO mapred.JobClient: map 1% reduce 0%
13/10/25 18:08:25 INFO mapred.JobClient: map 2% reduce 0%
13/10/25 18:12:58 INFO mapred.JobClient: map 3% reduce 0%
13/10/25 18:17:21 INFO mapred.JobClient: map 4% reduce 0%
13/10/25 18:21:53 INFO mapred.JobClient: map 5% reduce 0%
13/10/25 18:26:26 INFO mapred.JobClient: map 6% reduce 0%
13/10/25 18:30:59 INFO mapred.JobClient: map 7% reduce 0%
13/10/25 18:35:26 INFO mapred.JobClient: map 8% reduce 0%
13/10/25 18:39:51 INFO mapred.JobClient: map 9% reduce 0%
13/10/25 18:44:29 INFO mapred.JobClient: map 10% reduce 0%
13/10/25 18:48:53 INFO mapred.JobClient: map 11% reduce 0%
13/10/25 18:53:30 INFO mapred.JobClient: map 12% reduce 0%
13/10/25 18:57:52 INFO mapred.JobClient: map 13% reduce 0%
13/10/25 19:02:11 INFO mapred.JobClient: map 14% reduce 0%
13/10/25 19:06:32 INFO mapred.JobClient: map 15% reduce 0%
13/10/25 19:10:59 INFO mapred.JobClient: map 16% reduce 0%
13/10/25 19:15:19 INFO mapred.JobClient: map 17% reduce 0%
13/10/25 19:19:38 INFO mapred.JobClient: map 18% reduce 0%
13/10/25 19:23:55 INFO mapred.JobClient: map 19% reduce 0%
13/10/25 19:28:27 INFO mapred.JobClient: map 20% reduce 0%
13/10/25 19:32:46 INFO mapred.JobClient: map 21% reduce 0%
13/10/25 19:37:16 INFO mapred.JobClient: map 22% reduce 0%
13/10/25 19:41:44 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10. Trying to fail over immediately.
13/10/25 19:41:44 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 1 fail over attempts. Trying to fail over after sleeping for 1200ms.
13/10/25 19:41:45 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 2 fail over attempts. Trying to fail over after sleeping for 1403ms.
13/10/25 19:41:47 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 3 fail over attempts. Trying to fail over after sleeping for 2140ms.
13/10/25 19:41:49 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 4 fail over attempts. Trying to fail over after sleeping for 1934ms.
13/10/25 19:41:55 INFO mapred.JobClient: map 0% reduce 0%
13/10/25 19:44:50 INFO mapred.JobClient: map 1% reduce 0%
13/10/25 19:49:18 INFO mapred.JobClient: map 2% reduce 0%
13/10/25 19:53:51 INFO mapred.JobClient: map 3% reduce 0%
13/10/25 19:58:23 INFO mapred.JobClient: map 4% reduce 0%
13/10/25 20:02:53 INFO mapred.JobClient: map 5% reduce 0%
13/10/25 20:07:23 INFO mapred.JobClient: map 6% reduce 0%
13/10/25 20:11:54 INFO mapred.JobClient: map 7% reduce 0%
13/10/25 20:16:25 INFO mapred.JobClient: map 8% reduce 0%
13/10/25 20:20:54 INFO mapred.JobClient: map 9% reduce 0%
13/10/25 20:25:16 INFO mapred.JobClient: map 10% reduce 0%
13/10/25 20:29:54 INFO mapred.JobClient: map 11% reduce 0%
....
13/10/26 03:28:34 INFO mapred.JobClient: map 54% reduce 0%
13/10/26 03:32:51 INFO mapred.JobClient: map 55% reduce 0%
13/10/26 03:34:58 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10. Trying to fail over immediately.
13/10/26 03:34:58 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 1 fail over attempts. Trying to fail over after sleeping for 862ms.
13/10/26 03:34:59 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 2 fail over attempts. Trying to fail over after sleeping for 1076ms.
13/10/26 03:35:00 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 3 fail over attempts. Trying to fail over after sleeping for 1243ms.
13/10/26 03:35:02 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 4 fail over attempts. Trying to fail over after sleeping for 983ms.
13/10/26 03:35:06 INFO mapred.JobClient: map 0% reduce 0%
13/10/26 03:38:08 INFO mapred.JobClient: map 1% reduce 0%
13/10/26 03:42:36 INFO mapred.JobClient: map 2% reduce 0%
13/10/26 03:47:05 INFO mapred.JobClient: map 3% reduce 0%
13/10/26 03:51:34 INFO mapred.JobClient: map 4% reduce 0%
13/10/26 03:56:01 INFO mapred.JobClient: map 5% reduce 0%
13/10/26 04:00:36 INFO mapred.JobClient: map 6% reduce 0%
13/10/26 04:05:01 INFO mapred.JobClient: map 7% reduce 0%
13/10/26 04:09:31 INFO mapred.JobClient: map 8% reduce 0%
13/10/26 04:13:55 INFO mapred.JobClient: map 9% reduce 0%
13/10/26 04:18:27 INFO mapred.JobClient: map 10% reduce 0%
13/10/26 04:23:09 INFO mapred.JobClient: map 11% reduce 0%
13/10/26 04:27:42 INFO mapred.JobClient: map 12% reduce 0%
13/10/26 04:31:59 INFO mapred.JobClient: map 13% reduce 0%
13/10/26 04:36:21 INFO mapred.JobClient: map 14% reduce 0%
13/10/26 04:40:49 INFO mapred.JobClient: map 15% reduce 0%
13/10/26 04:45:20 INFO mapred.JobClient: map 16% reduce 0%
13/10/26 04:49:38 INFO mapred.JobClient: map 17% reduce 0%
13/10/26 04:53:59 INFO mapred.JobClient: map 18% reduce 0%
13/10/26 04:58:21 INFO mapred.JobClient: map 19% reduce 0%
13/10/26 05:02:42 INFO mapred.JobClient: map 20% reduce 0%
13/10/26 05:06:57 INFO mapred.JobClient: map 21% reduce 0%
13/10/26 05:11:20 INFO mapred.JobClient: map 22% reduce 0%
13/10/26 05:15:36 INFO mapred.JobClient: map 23% reduce 0%
13/10/26 05:19:55 INFO mapred.JobClient: map 24% reduce 0%
13/10/26 05:24:26 INFO mapred.JobClient: map 25% reduce 0%
13/10/26 05:28:48 INFO mapred.JobClient: map 26% reduce 0%
13/10/26 05:33:16 INFO mapred.JobClient: map 27% reduce 0%
13/10/26 05:37:43 INFO mapred.JobClient: map 28% reduce 0%
13/10/26 05:42:09 INFO mapred.JobClient: map 29% reduce 0%
13/10/26 05:46:36 INFO mapred.JobClient: map 30% reduce 0%
13/10/26 05:48:53 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10. Trying to fail over immediately.
13/10/26 05:48:53 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 1 fail over attempts. Trying to fail over after sleeping for 1184ms.
13/10/26 05:48:55 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 2 fail over attempts. Trying to fail over after sleeping for 1239ms.
13/10/26 05:48:56 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 3 fail over attempts. Trying to fail over after sleeping for 2134ms.
13/10/26 05:48:58 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 4 fail over attempts. Trying to fail over after sleeping for 1728ms.
13/10/26 05:49:01 INFO mapred.JobClient: map 0% reduce 0%
13/10/26 05:52:05 INFO mapred.JobClient: map 1% reduce 0%
13/10/26 05:56:44 INFO mapred.JobClient: map 2% reduce 0%
13/10/26 06:01:27 INFO mapred.JobClient: map 3% reduce 0%
13/10/26 06:06:02 INFO mapred.JobClient: map 4% reduce 0%
13/10/26 06:10:34 INFO mapred.JobClient: map 5% reduce 0%
13/10/26 06:15:13 INFO mapred.JobClient: map 6% reduce 0%
13/10/26 06:19:49 INFO mapred.JobClient: map 7% reduce 0%
13/10/26 06:21:46 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10. Trying to fail over immediately.
13/10/26 06:21:46 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 1 fail over attempts. Trying to fail over after sleeping for 571ms.
13/10/26 06:21:47 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 2 fail over attempts. Trying to fail over after sleeping for 1910ms.
13/10/26 06:21:49 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 3 fail over attempts. Trying to fail over after sleeping for 1982ms.
13/10/26 06:21:51 WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10 after 4 fail over attempts. Trying to fail over after sleeping for 1868ms.
13/10/26 06:21:56 INFO mapred.JobClient: map 0% reduce 0%
13/10/26 06:24:52 INFO mapred.JobClient: map 1% reduce 0%
13/10/26 06:29:42 INFO mapred.JobClient: map 2% reduce 0%
13/10/26 06:34:32 INFO mapred.JobClient: map 3% reduce 0%
13/10/26 06:39:12 INFO mapred.JobClient: map 4% reduce 0%
13/10/26 06:43:48 INFO mapred.JobClient: map 5% reduce 0%

The job was running when one JT (the active one) failed (or was forced off by the standby JT) and the standby JT became active. That by itself is fine, they just switched roles, but the job restarted: it began again from 0% of the mappers. And this happens every time the JTs switch roles. Maybe I did not configure the failover controller properly? This message is very strange, and I can't find anything about it: (WARN retry.RetryInvocationHandler: Exception while invoking getTaskCompletionEvents of class $Proxy10. Trying to fail over immediately.) In the JT logs there are only warnings like: WARN mapreduce.Counters Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead Thanks Andrey
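In case it helps later readers: under MR1, a running job does not survive a JobTracker restart or failover unless job recovery is enabled. As far as I understand, the relevant switch is the one sketched below; treat the exact property name and its interaction with CDH's JT HA as an assumption to verify against your documentation rather than a confirmed fix:

```xml
<!-- Assumption: enables MR1 job recovery, so running jobs are resumed
     rather than restarted from 0% after a JobTracker restart/failover. -->
<property>
  <name>mapred.jobtracker.restart.recover</name>
  <value>true</value>
</property>
```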
10-25-2013
04:18 AM
Hi, Chris! I am running the workflow from inside Hue. Thanks for the reply. Maybe you can also help me with this question (http://community.cloudera.com/t5/Batch-Processing-and-Workflow/How-to-limit-jobcache-foledr-size/m-p/2361#U2361)? Thanks, Andrey
10-24-2013
01:03 PM
Hello, I have a Hadoop cluster with CDH 4.4.0, HDFS HA, and JobTracker HA. I tried to launch an Oozie workflow but got this: JA006: Call From xxxx to xxxx failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused I found this: https://issues.cloudera.org/browse/HUE-1631 But I don't know how to fix my problem. Where do I configure Oozie to work with JobTracker HA?
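One direction to try (an assumption to verify, not a confirmed fix): point the workflow's job-tracker at the logical JobTracker name defined by the JT HA configuration instead of a host:port, so the client resolves whichever JT is currently active. In the sketch below, "logicaljt" and "nameservice1" are placeholders for whatever logical names your cluster actually defines:

```xml
<workflow-app name="example-wf" xmlns="uri:oozie:workflow:0.4">
  <start to="mr-node"/>
  <action name="mr-node">
    <map-reduce>
      <!-- logical JT HA name, not an explicit host:port -->
      <job-tracker>logicaljt</job-tracker>
      <!-- logical HDFS HA nameservice, likewise a placeholder -->
      <name-node>hdfs://nameservice1</name-node>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail"><message>MapReduce action failed</message></kill>
  <end name="end"/>
</workflow-app>
```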
10-17-2013
01:40 AM
Hello Hadoop experts, I have a new problem and a new question for you. I have a lot of (approximately 70,000) small files, 40 GB in total. I've developed a map-only Java program to analyze these files; I have no reducers and no output, only counters. I have a single-node setup of CDH 4.4.0 on a 150 GB hard disk, and the whole Hadoop environment uses it (logs, libs, HDFS data files, and so on). So after I put all the files in HDFS, I was left with a little less than 100 GB of free space. I successfully started my job, and 10 hours later Hadoop fell down because it did not have enough space to write its logs. I looked at my disk and found that the folder mapred/local/taskTracker/hdfs/jobcache/job_xxxx_xxx occupied all the available space. Hadoop had processed approximately 10,000 files, so that folder contained approximately 10,000 subfolders, each holding only one file, job.xml (8 MB each). So 8 MB * 10,000 ~ 78 GB. And here is my question: how can I process 70,000 files? (I would need approximately 550 GB of free space to process 40 GB of small files!) Is it possible to configure Hadoop to clean up after every map? Regards, Markovich
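The property discussed in the replies above can be set through the MapReduce safety valve for mapred-site.xml; whether it actually caps the jobcache directory is exactly the open question in this thread, so treat this fragment as a sketch to experiment with (the value 500 is a hypothetical example, not a recommendation):

```xml
<!-- Limits the number of directories the TaskTracker keeps in its
     local cache; value here is illustrative only. -->
<property>
  <name>mapreduce.tasktracker.local.cache.numberdirectories</name>
  <value>500</value>
</property>
```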
10-15-2013
11:24 AM
Thanks, this solved my problem.
10-11-2013
10:37 AM
Hello. I have a lot of different files (*.doc, *.pdf, and so on) and I wanted to process them with MapReduce. I put them in HDFS and then started a Java MapReduce program using Hue. If the files are well formatted and don't have brackets "(){}[]" in their names, all goes fine. But if there is a file OPN_last_[age.PDF I get these errors: Failing Oozie Launcher, Main class [distr.fors.ru.Index], main() threw exception, Illegal file pattern: Unclosed character class near index 17
OPN_last_[age.PDF
^
java.io.IOException: Illegal file pattern: Unclosed character class near index 17
OPN_last_[age.PDF
^
at org.apache.hadoop.fs.GlobFilter.init(GlobFilter.java:70)
at org.apache.hadoop.fs.GlobFilter.<init>(GlobFilter.java:49)
at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1670)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1627)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:211)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:248)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596)
at distr.fors.ru.Index.run(Index.java:78)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at distr.fors.ru.Index.main(Index.java:39)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:495)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.util.regex.PatternSyntaxException: Unclosed character class near index 17
OPN_last_[age.PDF
^
at org.apache.hadoop.fs.GlobPattern.error(GlobPattern.java:167)
at org.apache.hadoop.fs.GlobPattern.set(GlobPattern.java:151)
at org.apache.hadoop.fs.GlobPattern.<init>(GlobPattern.java:42)
at org.apache.hadoop.fs.GlobFilter.init(GlobFilter.java:66)
... 32 more If there is a file like this: {2011-01-27} (3769330).pdf I get such error: Input Pattern hdfs://fd-bigdata.distr.fors.ru:8020/{2011-01-27} (3769330).pdf matches 0 files at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:248) at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063) at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080) at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) at distr.fors.ru.Index.run(Index.java:76) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at distr.fors.ru.Index.main(Index.java:37) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:495) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332) at org.apache.hadoop.mapred.Child$4.run(Child.java:268) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.mapred.Child.main(Child.java:262) I realy need to process such files. What can I make to solve such problems? P.S. I am using the latest CDH 4.4.0.
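The root cause in both failures is that FileInputFormat treats input paths as glob patterns, so glob metacharacters in file names ([, ], {, }, and so on) are interpreted rather than matched literally. One workaround is to backslash-escape those characters before handing the path to the job; the helper below is a sketch (the exact escape set is an assumption based on Hadoop's glob syntax, and GlobEscape is a hypothetical name, not part of any API):

```java
// Escapes glob metacharacters in a file name so Hadoop's path
// globbing treats them as literal characters.
public class GlobEscape {
    static String escapeGlob(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            // Assumed metacharacter set for Hadoop's GlobPattern syntax.
            if ("[]{}()*?\\".indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeGlob("OPN_last_[age.PDF"));
        System.out.println(escapeGlob("{2011-01-27} (3769330).pdf"));
    }
}
```

The escaped string would then go into FileInputFormat.addInputPath (or the input path would be specified by its parent directory, sidestepping per-file globbing entirely).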