Pig examples are not working in Zeppelin

Contributor

I have integrated Zeppelin with CDH 5.8. My %pig paragraph works, but %pig.query fails even though it is the same as in the Zeppelin tutorial.

%pig

bankText = load 'hdfs://nameservice1/user/rbodolla/emp.txt' using PigStorage(';');

bank = foreach bankText generate $0 as age, $1 as job, $2 as marital, $3 as education, $5 as balance;

bank = filter bank by age != '"age"';

bank = foreach bank generate (int)age, REPLACE(job,'"','') as job, REPLACE(marital, '"', '') as marital, (int)(REPLACE(balance, '"', '')) as balance;

Took 0 sec. Last updated by anonymous at March 23 2017, 10:11:08 AM.

%pig.query

bank_data = filter bank by age < ${maxAge=40};

b = group bank_data by age;

foreach b generate group, COUNT($1);

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias paragraph_20170323_042907_1682133128
    at org.apache.pig.PigServer.openIterator(PigServer.java:1019)
    at org.apache.zeppelin.pig.PigQueryInterpreter.interpret(PigQueryInterpreter.java:104)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:489)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias paragraph_20170323_042907_1682133128
    at org.apache.pig.PigServer.storeEx(PigServer.java:1122)
    at org.apache.pig.PigServer.store(PigServer.java:1081)
    at org.apache.pig.PigServer.openIterator(PigServer.java:994)
    ... 12 more
Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 6000: Output Location Validation Failed for: 'hdfs://nameservice1/tmp/temp1361302589/tmp1259787257 More info to follow: sl73caedbd016.visa.com;sl73caedbd017.visa.com
    at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:95)
    at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
    at org.apache.pig.newplan.logical.relational.LogicalPlan.validate(LogicalPlan.java:212)
    at org.apache.pig.PigServer$Graph.compile(PigServer.java:1851)
    at org.apache.pig.PigServer$Graph.access$300(PigServer.java:1527)
    at org.apache.pig.PigServer.storeEx(PigServer.java:1117)
    ... 14 more
Caused by: java.net.UnknownHostException: sl73caedbd016.visa.com;sl73caedbd017.visa.com
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:668)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
    at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
    at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:186)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:127)
    at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:216)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:284)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:165)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:348)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:786)
    at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86)
    at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2046)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:121)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:142)
    at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:69)
    ... 29 more

I have installed Zeppelin and integrated it with CDH 5.8.3. The first paragraph runs fine, but the second fails.

%pig

bankText = load 'hdfs://nameservice1/user/rbodolla/emp.txt' using PigStorage(';');

bank = foreach bankText generate $0 as age, $1 as job, $2 as marital, $3 as education, $5 as balance;

bank = filter bank by age != '"age"';

bank = foreach bank generate (int)age, REPLACE(job,'"','') as job, REPLACE(marital, '"', '') as marital, (int)(REPLACE(balance, '"', '')) as balance;

%pig.query

bank_data = filter bank by marital=='${marital=SINGLE,SINGLE|MARRIED}';

b = group bank_data by age;

foreach b generate group, COUNT($1);

Error for second paragraph:

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias paragraph_20170323_043352_558291068
    at org.apache.pig.PigServer.openIterator(PigServer.java:1019)
    at org.apache.zeppelin.pig.PigQueryInterpreter.interpret(PigQueryInterpreter.java:104)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:489)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias paragraph_20170323_043352_558291068
    at org.apache.pig.PigServer.storeEx(PigServer.java:1122)
    at org.apache.pig.PigServer.store(PigServer.java:1081)
    at org.apache.pig.PigServer.openIterator(PigServer.java:994)
    ... 12 more
Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 6000: Output Location Validation Failed for: 'hdfs://nameservice1/tmp/temp1563995376/tmp2051692764 More info to follow: sl73caedbd016.visa.com;sl73caedbd017.visa.com
    at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:95)
    at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
    at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
    at org.apache.pig.newplan.logical.relational.LogicalPlan.validate(LogicalPlan.java:212)
    at org.apache.pig.PigServer$Graph.compile(PigServer.java:1851)
    at org.apache.pig.PigServer$Graph.access$300(PigServer.java:1527)
    at org.apache.pig.PigServer.storeEx(PigServer.java:1117)
    ... 14 more
Caused by: java.net.UnknownHostException: sl73caedbd016.visa.com;sl73caedbd017.visa.com
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:668)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
    at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
    at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032)
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:186)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:127)
    at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:216)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:284)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:165)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371)
    at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:348)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:786)
    at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86)
    at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2046)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:121)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
    at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:142)
    at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:69)
    ... 29 more
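
The innermost cause in the trace above is an UnknownHostException for the string sl73caedbd016.visa.com;sl73caedbd017.visa.com, raised from KMSClientProvider. That suggests the client is treating a semicolon-separated list of KMS hosts as one hostname. The Python sketch below is only an illustration of that difference, not Hadoop's actual parsing code. It assumes the provider is written in the kms://<scheme>@<host1>;<host2>:<port>/<path> form; the example URI, its hosts, and its port are hypothetical, and parse_kms_uri is a name invented for this sketch.

```python
def parse_kms_uri(provider_uri):
    """Split kms://<scheme>@<host1>;<host2>:<port>/<path> into its parts.

    Illustrative only: this mimics what an HA-aware client has to do,
    namely split the authority on ';' before resolving any hostname.
    """
    prefix = "kms://"
    if not provider_uri.startswith(prefix):
        raise ValueError(f"not a kms:// URI: {provider_uri!r}")

    scheme, at, rest = provider_uri[len(prefix):].partition("@")
    if not at:
        raise ValueError("expected kms://<scheme>@<hosts>[:<port>]/<path>")

    authority, slash, path = rest.partition("/")

    # The port, if present, follows the last ':' in the authority.
    host_part, colon, port_text = authority.rpartition(":")
    if colon:
        port = int(port_text)
    else:
        host_part, port = authority, None

    # An HA-aware client resolves each entry separately; a client that
    # skips this split would try to resolve host_part as a single name.
    hosts = [h for h in host_part.split(";") if h]
    return scheme, hosts, port, slash + path


if __name__ == "__main__":
    # Hypothetical provider URI with two KMS instances.
    uri = "kms://https@kms1.example.com;kms2.example.com:16000/kms"
    scheme, hosts, port, path = parse_kms_uri(uri)
    print("hosts to resolve individually:", hosts)
    print("what a non-HA-aware client would look up:", ";".join(hosts))
```

If the hostname in an UnknownHostException contains a semicolon, as it does here, it is worth checking whether the Hadoop client jars on the interpreter's classpath are recent enough to understand a multi-host KMS provider.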

2 REPLIES

Contributor

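The issue was that Zeppelin doesn't work with KMS HA: the UnknownHostException above shows both KMS hosts, joined by a semicolon, being looked up as a single hostname. I had to remove HA to make this error go away, but now I get the error below.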

ERROR [2017-03-23 06:24:47,974] ({pool-2-thread-2} Job.java[run]:188) - Job failed
java.lang.NoSuchMethodError: org.codehaus.jackson.map.ObjectMapper.writerWithDefaultPrettyPrinter()Lorg/codehaus/jackson/map/ObjectWriter;
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.writeJson(KMSClientProvider.java:211)
    at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:448)

Contributor

This was resolved by adding the newer jackson-core-2.5.3.jar, jackson-core-asl-1.9.13.jar, and jackson-mapper-asl-1.9.13.jar to the Pig interpreter path. By default, Zeppelin doesn't copy these newer jars there.
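
A NoSuchMethodError like the one above usually means an older copy of the class was loaded than the caller was compiled against. As a rough way to check which jars in a directory provide org.codehaus.jackson.map.ObjectMapper, something like the Python sketch below could help. It only inspects jar contents and does not know the JVM's real classpath order; the function name jars_providing and the directory path in the usage example are assumptions for illustration.

```python
import zipfile
from pathlib import Path


def jars_providing(class_name, jar_dir):
    """Return the names of jars in jar_dir that contain class_name.

    class_name is a dotted Java class name, for example
    "org.codehaus.jackson.map.ObjectMapper".
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that end in .jar but are not valid archives.
            continue
    return hits


if __name__ == "__main__":
    # Hypothetical location of the Pig interpreter's jars; adjust to
    # match the actual Zeppelin installation.
    for name in jars_providing("org.codehaus.jackson.map.ObjectMapper",
                               "/opt/zeppelin/interpreter/pig"):
        print(name)
```

If more than one jar is listed, or the only one listed is an older jackson-mapper-asl, that would be consistent with the fix described above.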