Created 02-26-2018 10:19 PM
I am writing some code to inspect a zip file that contains multiple subfolders. I want to look through the files in the zip and then pass the original content along unchanged.
I am trying to do this without writing the file to disk.
Any ideas for how I could do this entirely in RAM, or in the NiFi content repositories?
I also used a guide to write the code in Groovy, and I was wondering how I could pass the original content along in Groovy.
John
Created 02-26-2018 10:32 PM
What kinds of operations are you trying to perform on the files in the ZIP?
Created 02-26-2018 10:46 PM
@Matt Burgess I want to look at an XML doc that sits in a directory inside the zip and grab the value of a tag. The thing is, I want the original flowfile to pass along together with the tag value.
Thanks Matt!
Created 02-27-2018 12:34 AM
You know the path of the XML doc? I'm still looking at memory vs temp disk storage, if you can fill in this blank I hope to have an answer for you (in Groovy probably lol) tomorrow 🙂
Created 03-05-2018 08:35 PM
Hey @Matt Burgess just wondering if you had a chance to point me in the right direction. 🙂 I appreciate the help!
Created 02-27-2018 08:33 PM
Yeah! The path is ./docProps/app.xml. It's taking apart a docx/xlsx/pptx document and looking at the tag that tells the version of Office it was created with. Any Office document is really just a zip file. Hence what I'm trying to do 🙂 Thanks for your help!
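As a rough illustration of the tag lookup described here, the following Python sketch pulls the AppVersion text out of an app.xml string. The function name `find_app_version` and the sample XML are hypothetical and exist only for this example; real app.xml files contain many more elements, and this is not code from the thread.

```python
import xml.etree.ElementTree as ET

def find_app_version(app_xml):
    """Return the text of the first AppVersion element, or None if absent."""
    root = ET.fromstring(app_xml)
    for elem in root.iter():
        # Compare local names so the lookup works with or without a namespace.
        local_name = elem.tag.rsplit('}', 1)[-1]
        if local_name == 'AppVersion':
            return elem.text
    return None

# Hypothetical, trimmed-down app.xml content used only for illustration.
SAMPLE_APP_XML = (
    '<Properties xmlns="http://schemas.openxmlformats.org/'
    'officeDocument/2006/extended-properties">'
    '<Application>Microsoft Office Word</Application>'
    '<AppVersion>16.0000</AppVersion>'
    '</Properties>'
)

print(find_app_version(SAMPLE_APP_XML))
```

Matching on the local tag name avoids hard-coding the namespace, at the cost of also matching an AppVersion element from any other namespace.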
Created 03-05-2018 07:29 PM
Doing a trick like this to determine the MS version of a file would also be useful to me in my NiFi flow, so I'm keeping an eye on this.
Created 03-09-2018 06:12 PM
import os
import zipfile
from org.apache.nifi.processor.io import InputStreamCallback

class ReadVersion(InputStreamCallback):
  def __init__(self):
    self.ff = None
    self.version = ''
    self.error = ''
  def process(self, inputStream):
    # Note: the zip is opened from its original location on disk (built from the
    # filename and absolute.path attributes), not from the flowfile's inputStream.
    try:
      zipname = self.ff.getAttribute('filename')
      zippath = self.ff.getAttribute('absolute.path')
      zfile = zipfile.ZipFile(os.path.join(zippath, zipname))
      for name in zfile.namelist():
        if (name == 'docProps/app.xml'):
          inFile = zfile.open(name)
          inContents = inFile.read()
          # AppVersion values look like 12.0000, 14.0000, ...; match the leading '1'
          loc = inContents.find('<AppVersion>1')
          if (loc != -1):
            # The second digit of the major version identifies the Office release
            keyChar = inContents[loc+13:loc+14]
            if (keyChar == '2'):
              self.version = '2007'
            elif (keyChar == '4'):
              self.version = '2010'
            elif (keyChar == '5'):
              self.version = '2013'
            elif (keyChar == '6'):
              self.version = '2016'
            else:
              log.warn('Unexpected AppVersion value: ' + inContents[loc+12:loc+14])
    except:
      log.warn('exception thrown (is this really a zip file?)')
      self.error = 'error'
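ff = session.get()
if (ff is not None):
  callback = ReadVersion()
  callback.ff = ff
  # session.read does not modify the flowfile, so the original content is passed along
  session.read(ff, callback)
  if (callback.error == 'error'):
    session.transfer(ff, REL_FAILURE)
  else:
    if (callback.version != ''):
      ff = session.putAttribute(ff,'MSVersion',callback.version)
    # Route to success even when no version was found, so the flowfile is always transferred
    session.transfer(ff, REL_SUCCESS)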
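
The script above opens the zip from its path on disk. For the original goal of staying in memory, one possible approach is to wrap the zip bytes in `io.BytesIO` and hand that to `zipfile.ZipFile`. The sketch below is plain Python and makes no NiFi calls; the helper names `read_zip_member` and `build_sample_archive` are hypothetical, and how the bytes are obtained from the flowfile's input stream is left out here.

```python
import io
import zipfile

def read_zip_member(zip_bytes, member_name):
    """Return the bytes of one archive member, or None if it is missing."""
    # BytesIO gives ZipFile a seekable, in-memory file object, so nothing
    # is written to disk.
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        if member_name not in archive.namelist():
            return None
        return archive.read(member_name)

def build_sample_archive():
    """Build a tiny stand-in for an Office document, entirely in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w') as archive:
        archive.writestr(
            'docProps/app.xml',
            '<Properties><AppVersion>16.0000</AppVersion></Properties>')
    return buffer.getvalue()

sample = build_sample_archive()
print(read_zip_member(sample, 'docProps/app.xml'))
```

This holds the whole archive in memory at once, which is a trade-off to weigh for very large flowfiles.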