How do I use an HDFS file with Wand for image conversion?

I am trying to convert PDF files to images and then run OCR on them with pytesseract. This works for files on the local Linux filesystem, but it fails when I pass an HDFS path:



>>> wi(filename = 'hdfs://boboda02.boobo.com:8020/bda/claimsops/raw/Personal_Umbrella_test/09_29_2015_090902.pdf',resolution = 300)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/sam/my_env_1/lib/python2.7/site-packages/Wand-0.4.2-py2.7.egg/wand/image.py", line 2534, in __init__
  File "/home/sam/my_env_1/lib/python2.7/site-packages/Wand-0.4.2-py2.7.egg/wand/image.py", line 2601, in read
  File "/home/sam/my_env_1/lib/python2.7/site-packages/Wand-0.4.2-py2.7.egg/wand/resource.py", line 222, in raise_exception
wand.exceptions.MissingDelegateError: no decode delegate for this image format `//boboDA02.boobo.COM' @ error/constitute.c/ReadImage/501
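
A likely reading of the error is that ImageMagick does not know how to open `hdfs://` URLs and treats the part before the colon as an image format prefix, so the `filename=` argument only works for paths it can open itself. One possible workaround, offered as a minimal sketch rather than a confirmed fix, is to fetch the file's bytes first and hand them to Wand through its `blob=` argument. The sketch assumes the `hdfs` command-line client is installed and on the PATH; the helper names `hdfs_cat_command`, `read_hdfs_bytes`, and `pdf_pages_from_hdfs`, as well as the `runner` parameter, are made up for illustration.

```python
import subprocess


def hdfs_cat_command(hdfs_path):
    """Build the `hdfs dfs -cat` command that streams a file to stdout."""
    return ["hdfs", "dfs", "-cat", hdfs_path]


def read_hdfs_bytes(hdfs_path, runner=subprocess.check_output):
    """Return the raw bytes of an HDFS file.

    `runner` is a hypothetical hook so the function can be exercised
    without a Hadoop client; by default it shells out to `hdfs`.
    """
    return runner(hdfs_cat_command(hdfs_path))


def pdf_pages_from_hdfs(hdfs_path, resolution=300):
    """Load a PDF stored in HDFS into a Wand image (one frame per page)."""
    # Imported here so the helpers above still work where Wand is absent.
    from wand.image import Image

    pdf_bytes = read_hdfs_bytes(hdfs_path)
    # blob= passes the bytes directly; format= tells ImageMagick they are PDF,
    # since there is no file extension to infer the type from.
    return Image(blob=pdf_bytes, format="pdf", resolution=resolution)
```

With this in place, `pdf_pages_from_hdfs('hdfs://.../09_29_2015_090902.pdf')` would stand in for the `wi(filename=...)` call above. Reading the whole file into memory is acceptable for typical scanned PDFs, but very large files may be better copied to a local temporary path with `hdfs dfs -get` first.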


