First Pig Script Failing - "Path segment is null"

I am running Ambari with an HDP 2.3 cluster.

The tutorial I am following is: http://hortonworks.com/hadoop-tutorial/how-to-use-basic-pig-commands/

I am not using the tutorial data; I am using my own.

JSON_DATA = LOAD 'hdfs://ha-host:8020/user/flume/tether/2016/05/10/09/30/tether_events.1462890658317' USING PigStorage(',');
DESCRIBE JSON_DATA;

The file above exists:

hadoop fs -ls hdfs://ha-host:8020/user/flume/tether/2016/05/10/09/30/tether_events.1462890658317

-rw-r--r--   3 flume hdfs      25663 2016-05-10 09:31 hdfs://ha-host:8020/user/flume/tether/2016/05/10/09/30/tether_events.1462890658317

In grunt, I can run a similar command and it returns the same result.

This is the stack trace for running the pig script on Ambari in Pig View:

java.lang.IllegalArgumentException: Path segment is null
	at com.sun.jersey.api.uri.UriBuilderImpl.appendPath(UriBuilderImpl.java:547)
	at com.sun.jersey.api.uri.UriBuilderImpl.appendPath(UriBuilderImpl.java:542)
	at com.sun.jersey.api.uri.UriBuilderImpl.path(UriBuilderImpl.java:267)
	at com.sun.jersey.api.client.WebResource.path(WebResource.java:390)
	at org.apache.ambari.view.pig.templeton.client.TempletonApi.checkJob(TempletonApi.java:127)
	at org.apache.ambari.view.pig.resources.jobs.JobResourceManager.retrieveJobStatus(JobResourceManager.java:240)
	at org.apache.ambari.view.pig.resources.jobs.JobService.getJob(JobService.java:101)
	at sun.reflect.GeneratedMethodAccessor11249.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
	at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
	at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
	at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
	at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:540)
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:715)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:770)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1496)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:330)
	at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:118)
	at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:84)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:113)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:103)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:113)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:54)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:45)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.apache.ambari.server.security.authorization.AmbariAuthorizationFilter.doFilter(AmbariAuthorizationFilter.java:182)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.authentication.www.BasicAuthenticationFilter.doFilter(BasicAuthenticationFilter.java:150)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:87)
	at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:342)
	at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:192)
	at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:160)
	at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:237)
	at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:167)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
	at org.apache.ambari.server.api.MethodOverrideFilter.doFilter(MethodOverrideFilter.java:72)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
	at org.apache.ambari.server.api.AmbariPersistFilter.doFilter(AmbariPersistFilter.java:47)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
	at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)
	at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1467)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:429)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
	at org.apache.ambari.server.controller.AmbariHandlerList.processHandlers(AmbariHandlerList.java:209)
	at org.apache.ambari.server.controller.AmbariHandlerList.processHandlers(AmbariHandlerList.java:198)
	at org.apache.ambari.server.controller.AmbariHandlerList.handle(AmbariHandlerList.java:132)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
	at org.eclipse.jetty.server.Server.handle(Server.java:370)
	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
	at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
	at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
	at java.lang.Thread.run(Thread.java:745)

I have no clue why this is not working. My best guess is that Pig cannot reach HDFS unless some config parameter is set. Any help is appreciated. Thanks.

Re: First Pig Script Failing - "Path segment is null"

Can you include a small sample of a few rows of that file? I see that the alias name suggests the file is of JSON format, but you are using PigStorage as if it is a CSV file; what is the file format itself? Can you open a very simple CSV file such as the following?

a,1
b,2
c,3
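
A minimal sketch of that test, assuming the three rows above were saved as a file named simple.csv in your HDFS home directory (the file name and the alias SIMPLE are hypothetical placeholders, not something from your setup):

```pig
-- Hypothetical file name; a relative path resolves against the
-- HDFS home directory of the user running the script.
SIMPLE = LOAD 'simple.csv' USING PigStorage(',');
-- Print the loaded tuples to confirm Pig can read from HDFS.
DUMP SIMPLE;
```

If this small file loads from the same place where the original script fails, the problem is more likely tied to the data or the LOAD statement than to HDFS connectivity.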

I'm assuming the box you are running Pig from is a machine that Ambari put the bits and configs on (i.e. it is listed on its Hosts page) rather than one where you installed the bits yourself. If the latter, you can use Ambari's "generate configs" feature to create the necessary XML files so it can find HDFS.

Re: First Pig Script Failing - "Path segment is null"

@daniel vernon

Make sure the user the script is executed as belongs to a group allowed by 'hadoop.proxyuser.HTTP.groups'.

In core-site.xml, set the properties below and try again:

hadoop.proxyuser.HTTP.groups=*

hadoop.proxyuser.HTTP.hosts=*
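
For reference, a sketch of how these two settings would typically look as entries inside the configuration element of core-site.xml. The wildcard values are the ones given above; they are permissive and are assumed here only for testing. On an Ambari-managed cluster the change is normally made through the HDFS configuration screen rather than by editing the file by hand.

```xml
<!-- Allow the HTTP proxy user to impersonate users from any group. -->
<property>
  <name>hadoop.proxyuser.HTTP.groups</name>
  <value>*</value>
</property>
<!-- Allow the HTTP proxy user to connect from any host. -->
<property>
  <name>hadoop.proxyuser.HTTP.hosts</name>
  <value>*</value>
</property>
```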

Let me know if that fixes your problem.

Re: First Pig Script Failing - "Path segment is null"

Hi,

I had a similar experience previously. For testing, try specifying a sample schema in your LOAD statement. I ran into this kind of problem when the input files had more than 25 fields.
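
Something like the following is what I mean. It is only a sketch: the field names and types are hypothetical placeholders, so replace them with the actual columns of your file.

```pig
-- Hypothetical field names and types; adjust to match the real data.
JSON_DATA = LOAD 'hdfs://ha-host:8020/user/flume/tether/2016/05/10/09/30/tether_events.1462890658317'
    USING PigStorage(',')
    AS (field1:chararray, field2:chararray, field3:chararray);
-- With a schema declared, DESCRIBE reports the named fields.
DESCRIBE JSON_DATA;
```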

Re: First Pig Script Failing - "Path segment is null"

@daniel vernon

Please share your Pig View settings.