I want to write a Python UDF for Pig that reads lines from a file like this:

    #'prefix.csv'
    spol.
    LLC
    Oy
    OOD

It should match each line against the name and, if it finds a match, replace it with white space. Here is my Python code:

    def list_files2(name, f):
        fin = open(f, 'r')
        for line in fin:
            final = name
            extra = 'nothing'
            if (name != name.replace(line.strip(), ' ')):
                extra = line.strip()
                final = name.replace(line.strip(), ' ').strip()
                return final, extra,'insdie if'
        return final, extra, 'inside for'

Running this code in Python,

    >print list_files2('LLC nakisa', 'prefix.csv' )
    >print list_files2('AG company', 'prefix.csv' )

returns

    ('nakisa', 'LLC', 'insdie if')
    ('AG company', 'nothing', 'inside for')

which is exactly what I need. But when I register this code as a UDF in Apache Pig and run it on this sample list:

    nakisa company LLC
    three Oy
    AG Lans
    Test OOD

Pig returns a wrong answer on the third line:

    ((nakisa company,LLC,insdie if))
    ((three,Oy,insdie if))
    ((A G L a n s,,insdie if))
    ((Test,OOD,insdie if))

The question is: why does the UDF enter the if branch for the third entry, which has no match in the prefix.csv file?
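One possible explanation, offered as a guess rather than a confirmed diagnosis: the empty second field and the spaced-out letters in the third result are what Python produces when the search string is empty, because `str.replace('', ' ')` inserts a space between every character. That would happen if the copy of prefix.csv that Pig reads contains a blank line. The sketch below reproduces the effect and skips blank lines. The name `strip_prefix`, the list-of-lines argument, and the sample contents are hypothetical stand-ins, not the original UDF.

```python
def strip_prefix(name, lines):
    """Return (cleaned_name, matched_token, branch), mirroring the UDF's tuple."""
    for line in lines:
        token = line.strip()
        if not token:
            # A blank line gives an empty token, and name.replace('', ' ')
            # matches between every character, so skip it.
            continue
        if token in name:
            return name.replace(token, ' ').strip(), token, 'inside if'
    return name, 'nothing', 'inside for'


# What an empty search string does to the third sample entry.
print(repr('AG Lans'.replace('', ' ').strip()))

# Hypothetical file contents with a blank line after LLC (an assumption).
sample = ['spol.\n', 'LLC\n', '\n', 'Oy\n', 'OOD\n']
print(strip_prefix('AG Lans', sample))
print(strip_prefix('three Oy', sample))
```

If this is the cause, the entries before the blank line would still match normally, which is consistent with the other three results. Checking the file that is actually shipped to the Pig job for empty lines would confirm or rule this out.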
Hi everyone,

I have installed Cloudera Live on Ubuntu 14.04 with Docker, using the command below:

    $ sudo docker run --hostname=quickstart.cloudera --privileged=true -t -i -p 8888:8888 -p 80:80 --name cloudera cloudera/quickstart /usr/bin/docker-quickstart

The problem is that each time I switch on the computer, Hue is not accessible on localhost:8888. So I have to remove the cloudera container from Docker and enter the above command again, and then all the scripts and data are gone. I tried

    $docker stop clouder

but it does not fix the problem; Hue is still not accessible. How can I safely shut down Hue or Docker so that I do not lose the data after each session?

Thanks
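A minimal sketch of one common approach, assuming the container still carries the name `cloudera` from the `--name` flag above: a stopped container keeps its filesystem until it is removed, so `docker stop` followed later by `docker start` reuses the same container instead of creating a fresh one with `docker run`. The helper names below are hypothetical, and the script only prints the commands rather than running them. Whether Hue comes back up cleanly after a restart is not confirmed here.

```python
import shlex

# Container name taken from the --name flag in the original `docker run`.
CONTAINER = "cloudera"


def stop_command(container=CONTAINER):
    """Command that stops the container without deleting it."""
    return ["docker", "stop", container]


def restart_command(container=CONTAINER):
    """Command that restarts the existing container.

    -a attaches to its output and -i keeps stdin open.
    """
    return ["docker", "start", "-a", "-i", container]


# Print the commands as shell lines; they could instead be passed to
# subprocess.run(cmd, check=True) on a machine where docker is installed.
for cmd in (stop_command(), restart_command()):
    print("sudo " + " ".join(shlex.quote(part) for part in cmd))
```

The key design point is to avoid removing the container and re-running `docker run`, since that starts from the image again and discards anything written inside the old container.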