Member since: 09-12-2017
Posts: 21
Kudos Received: 3
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4101 | 11-27-2017 07:49 AM
09-18-2018
12:48 PM
Hello,

In NiFi I would like to unzip a file containing log.gz files. I use an UnpackContent processor with the value "zip" for the Packaging Format property. Unfortunately it does not work, and I encounter the following error:

2018-09-17 15:16:18,481 ERROR [Timer-Driven Process Thread-5] o.a.n.processors.standard.UnpackContent UnpackContent[id=09876d75-09bc-1704-0000-0000220050c4] Unable to unpack StandardFlowFileRecord[uuid=9350821c-f4ac-4ad7-8cca-d32087a621c5,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1537197378339-415, container=default, section=415], offset=871972, length=55962],offset=0,name=backup-proxy-logs-2018-07-29-0001.zip,size=55962] due to org.apache.nifi.processor.exception.ProcessException: IOException thrown from UnpackContent[id=09876d75-09bc-1704-0000-0000220050c4]: org.apache.commons.compress.archivers.zip.UnsupportedZipFeatureException: unsupported feature data descriptor used in entry cloud_17338_20180729000000.log.gz; routing to failure: {}
org.apache.nifi.processor.exception.ProcessException: IOException thrown from UnpackContent[id=09876d75-09bc-1704-0000-0000220050c4]: org.apache.commons.compress.archivers.zip.UnsupportedZipFeatureException: unsupported feature data descriptor used in entry cloud_17338_20180729000000.log.gz
at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2590)
at org.apache.nifi.processors.standard.UnpackContent$ZipUnpacker$1.process(UnpackContent.java:383)
at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2175)
at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2145)
at org.apache.nifi.processors.standard.UnpackContent$ZipUnpacker.unpack(UnpackContent.java:356)
at org.apache.nifi.processors.standard.UnpackContent.onTrigger(UnpackContent.java:255)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1122)
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:128)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.commons.compress.archivers.zip.UnsupportedZipFeatureException: unsupported feature data descriptor used in entry cloud_17338_20180729000000.log.gz
at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:417)
at java.io.InputStream.read(InputStream.java:101)
at org.apache.nifi.stream.io.StreamUtils.copy(StreamUtils.java:35)
at org.apache.nifi.processors.standard.UnpackContent$ZipUnpacker$1$1.process(UnpackContent.java:386)
at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2571)

If someone could help me I would appreciate it a lot. Thanks in advance.

LR
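A workaround I am considering (an untested sketch; the UnsupportedZipFeatureException suggests the entries rely on a data descriptor, which the streaming zip reader used by UnpackContent cannot handle): repack the archive locally before it enters the flow, so the entry sizes are written in the local file headers instead of data descriptors. The paths below are placeholders, not from my setup:

# Untested sketch: repack the zip so entries no longer rely on data descriptors.
# Paths are placeholders for wherever the archive sits before NiFi picks it up.
mkdir -p /tmp/repack && cd /tmp/repack
unzip -q /path/to/backup-proxy-logs-2018-07-29-0001.zip     # Info-ZIP reads via the central directory, so data descriptors are not a problem here
zip -q -r /path/to/repacked/backup-proxy-logs-2018-07-29-0001.zip .    # file sizes are known up front, so no data descriptors are written

The repacked file could then be dropped back into the directory the flow watches, or the repack step could perhaps be wired in with an ExecuteStreamCommand processor ahead of UnpackContent.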
Labels:
- Apache NiFi
11-20-2017
10:09 AM
Hello,

I think the problem comes from a threshold of 1000 in FileOutputCommitterContainer.java. On one side I have this error:

17/10/21 02:12:45 INFO mapreduce.Job: Job job_1505194606915_0236 failed with state FAILED due to: Job commit failed: org.apache.hive.hcatalog.common.HCatException : 2012 : Moving of data failed during commit : Could not find a unique destination path for move: file = hdfs://vpbshadop202.mickey.int:8020/data/hive/crim.db/atlas_stats_clob/_SCRATCH0.04665097541205321/part-m-00000 , src = hdfs://vpbshadop202.mickey.int:8020/data/hive/crim.db/atlas_stats_clob/_SCRATCH0.04665097541205321, dest = hdfs://vpbshadop202.mickey.int:8020/data/hive/crim.db/atlas_stats_clob
at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.getFinalPath(FileOutputCommitterContainer.java:662)
at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.moveTaskOutputs(FileOutputCommitterContainer.java:515)

In FileOutputCommitterContainer.java, I can see that "Could not find a unique destination path for move" is raised when counter = maxAppendAttempts = APPEND_COUNTER_WARN_THRESHOLD = 1000. On the other side I have:

hdfs dfs -ls /data/hive/crim.db/atlas_stats_clob/part-m* | wc -l
999

Is there a way to increase this threshold?
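In the meantime, a possible workaround (just a sketch, not something I have validated on this table) would be to merge the existing part-m files into fewer, larger files so the commit never needs 1000 append attempts, for example by rewriting the table onto itself with the small-file merge settings enabled. The JDBC URL and merge size below are illustrative, not taken from this cluster:

# Untested sketch: merge the ~999 existing part-m files into fewer, larger files
# so the HCatalog commit stays well below the 1000-append limit.
# The beeline URL and merge size are illustrative.
beeline -u "jdbc:hive2://hadop202.mickey.int:10000/crim" -e "
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;
SET hive.merge.smallfiles.avgsize=134217728;
INSERT OVERWRITE TABLE atlas_stats_clob SELECT * FROM atlas_stats_clob;"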
11-16-2017
01:31 PM
I did an export/import into another table:

export table atlas_stats_clob to '/data/hive/export/';
import table atlas_imported from '/data/hive/export/data/';

and tried again with the same sqoop import options on the new table:

sqoop import -D oraoop.locations=hadop202.mickey.int -D mapred.map.max.attempts=1 -D oraoop.import.consistent.read=false -D oraoop.timestamp.string=false --connect jdbc:oracle:thin:@//CRIDB101:1521/appli --username sqoop -password '******' --table=ATLAS_STATS_20171114 --columns=APPLICATION,USERNAME,OFFICE,STAT_TYPE,STAT_KEY,STAT_INFO,TIME_STAMP,REQUESTER,DETAIL_INFO_1,DETAIL_INFO_2,DETAIL_INFO_3,DETAIL_INFO_4,OWNER,STATS_ID,DB_NAME,PARAMS --where "sqoop = 'Z'" --hcatalog-database=crim --hcatalog-table=atlas_imported --num-mappers=2 --split-by=TIME_STAMP

but I have the same issue. Is there a limit on the number of files for a table?
11-14-2017
03:31 PM
Hello,

We have a Hadoop cluster with 3 nodes on which a sqoop import job worked very well until a few days ago. The external table now contains 999 files (is that a maximum?). This is the import command:

sqoop import -D oraoop.locations=hadop202.mickey.int -D mapred.map.max.attempts=1 -D oraoop.import.consistent.read=false -D oraoop.timestamp.string=false --connect jdbc:oracle:thin:@//CRIDB101:1521/appli --username sqoop -password '******' --table=ATLAS_STATS_20171114 --columns=APPLICATION,USERNAME,OFFICE,STAT_TYPE,STAT_KEY,STAT_INFO,TIME_STAMP,REQUESTER,DETAIL_INFO_1,DETAIL_INFO_2,DETAIL_INFO_3,DETAIL_INFO_4,OWNER,STATS_ID,DB_NAME,PARAMS --where "sqoop = 'Z'" --hcatalog-database=crim --hcatalog-table=atlas_stats_clob --num-mappers=2 --split-by=TIME_STAMP

and this is the error we get:

17/11/14 16:17:31 INFO mapreduce.Job: Job job_1510660800260_0022 failed with state FAILED due to: Job commit failed: org.apache.hive.hcatalog.common.HCatException : 2012 : Moving of data failed during commit : Could not find a unique destination path for move: file = hdfs://hadop202.mickey.int:8020/data/hive/crim.db/atlas_stats_clob/_SCRATCH0.46847233143209766/part-m-00000 , src = hdfs://hadop202.mickey.int:8020/data/hive/crim.db/atlas_stats_clob/_SCRATCH0.46847233143209766, dest = hdfs://hadop202.mickey.int:8020/data/hive/crim.db/atlas_stats_clob

Thanks for your help.
Labels:
- Apache Hive
- Apache Sqoop
10-10-2017
02:48 PM
2 Kudos
Sorry, we forgot to modify the environment variable, and now it works! Thanks a lot for your help 🙂
10-10-2017
12:44 PM
OK, so the JDK is installed and jar is on the machine, but we still get the same error... 😞
10-10-2017
11:56 AM
OK, thanks, I'm going to try "Manually Installing Oracle JDK 1.7 or 1.8".
10-10-2017
08:54 AM
I already tried it: /usr/lib/jvm/jre-1.8.0-openjdk/bin/jar: No such file or directory. So we tried using unzip instead, but the result is the same.
10-10-2017
08:38 AM
Well, JAVA_HOME is already set in /etc/profile. The actual value is /usr/lib/jvm/jre-1.8.0-openjdk-1.8.0.141-1.b16.el7_3.x86_64; the value given above, /usr/lib/jvm/jre-1.8.0-openjdk, is a link to /usr/lib/jvm/jre-1.8.0-openjdk-1.8.0.141-1.b16.el7_3.x86_64. The 3 nodes are exactly the same: they have the same version of all components.
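For what it's worth, the jar tool ships with the JDK rather than with the JRE, so a JAVA_HOME that points at a jre-1.8.0-openjdk path would explain the "No such file or directory" above. A quick check I would run on each node (paths are just examples, not confirmed on this cluster):

# Check whether a full JDK (which provides bin/jar) is installed, or only a JRE.
echo $JAVA_HOME                      # currently points at a JRE path
ls $JAVA_HOME/bin/                   # a JRE has no 'jar' binary here
ls -d /usr/lib/jvm/java-1.8.0-openjdk*/bin/jar 2>/dev/null   # present only if a full JDK (e.g. the openjdk-devel package) is installed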