Member since 11-23-2015

28 Posts - 16 Kudos Received - 1 Solution

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 2812 | 01-27-2016 08:32 AM |
			
    
	
		
		
10-17-2016 08:09 AM - 1 Kudo
@Padmanabhan Vijendran Actually I did not, since the need passed. However, my question was more about access to Hive; in the case of HDFS it should be simpler.
In your Java code you need to have something like this:

if (System.getenv("HADOOP_TOKEN_FILE_LOCATION") != null) {
    jobConf.set("mapreduce.job.credentials.binary", System.getenv("HADOOP_TOKEN_FILE_LOCATION"));
}

to tell your Java app where to find the delegation token needed for HDFS access. Hope this helps, Pavel
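As a rough sketch of the full flow from a launched JVM (the class name and the /tmp path below are just for illustration, not from my actual code): the configuration is pointed at the token file exported into the container environment and then handed to FileSystem, so no kinit is needed inside the JVM.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWithDelegationToken {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String tokenFile = System.getenv("HADOOP_TOKEN_FILE_LOCATION");
        if (tokenFile != null) {
            // Reuse the delegation tokens handed to the container instead of doing kinit.
            conf.set("mapreduce.job.credentials.binary", tokenFile);
        }
        try (FileSystem fs = FileSystem.get(conf)) {
            // Hypothetical path, just to show an HDFS call succeeding with the token.
            System.out.println(fs.exists(new Path("/tmp")));
        }
    }
}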
						
					
    
	
		
		
05-09-2016 07:06 AM
Hi Larry, yes, the Apache HttpClient works like a charm. Thanks, Pavel
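For reference, a rough sketch of what that can look like (the gateway URL, topology name and the Basic-auth credentials below are placeholders, not an exact copy of our setup): HttpClient can replay the PUT after the 401 challenge because the file-backed entity is repeatable, which is exactly what HttpURLConnection could not do in streaming mode.

import java.io.File;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.methods.HttpPut;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.FileEntity;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class KnoxWebHdfsPut {
    public static void main(String[] args) throws Exception {
        BasicCredentialsProvider creds = new BasicCredentialsProvider();
        creds.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials("user", "password"));

        try (CloseableHttpClient client = HttpClients.custom()
                .setDefaultCredentialsProvider(creds)
                .build()) {
            // Placeholder gateway URL; adjust host, topology and HDFS path for the cluster.
            String url = "https://knox-host:8443/gateway/default/webhdfs/v1/tmp/test.txt"
                    + "?op=CREATE&overwrite=true";
            HttpPut put = new HttpPut(url);
            put.setEntity(new FileEntity(new File("test.txt"), ContentType.APPLICATION_OCTET_STREAM));
            System.out.println(client.execute(put).getStatusLine());
        }
    }
}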
						
					
    
	
		
		
05-06-2016 06:25 PM - 1 Kudo
Hi, we are trying to use WebHDFS over Knox to access HDFS on our secured cluster from Java. We are able to list files/folders there, but we are still struggling with file creation. The problem is probably in Oracle's Java library, where streaming does not seem to be supported when authentication is required. In sun.net.www.protocol.http.HttpURLConnection.getInputStream0() there is something like:

if (j == 401) {
    if (streaming()) {
        disconnectInternal();
        throw new HttpRetryException("cannot retry due to server authentication, in streaming mode", 401);
    }
}

Streaming is not needed for some operations, like list/delete (and therefore they work), but it is required for file creation. Any suggestions how to handle this? Thanks a lot, Pavel
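To make the failure mode concrete, here is a small standalone sketch (the gateway URL is a placeholder) that runs into the quoted check: once a streaming mode is set on HttpURLConnection, the request body cannot be replayed after a 401 challenge, so reading the response fails with HttpRetryException.

import java.net.HttpURLConnection;
import java.net.URL;

public class StreamingAuthLimitation {
    public static void main(String[] args) throws Exception {
        // Placeholder Knox/WebHDFS URL; any endpoint answering 401 behaves the same way.
        URL url = new URL("https://knox-host:8443/gateway/default/webhdfs/v1/tmp/test.txt?op=CREATE");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        conn.setChunkedStreamingMode(4096);          // streaming mode: the body is not buffered
        conn.getOutputStream().write("data".getBytes());
        conn.getOutputStream().close();
        // With a 401 challenge this throws java.net.HttpRetryException
        // ("cannot retry due to server authentication, in streaming mode").
        conn.getInputStream().close();
    }
}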
						
					
		
			
				
						
Labels: Apache Hadoop, Apache Knox
			
    
	
		
		
04-26-2016 07:38 AM
Thanks, we are working on something similar. I have one question/comment about the 'compact' stage. The execution flow as presented here means that the 'reporting_table' table disappears for a significant amount of time before it is filled again. This could break queries running against this table. Is there a way to make this switch (almost) seamless? It may also be necessary to keep the older data so as not to break already running queries. Thanks, Pavel
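One pattern that might make the switch nearly seamless (a hedged sketch; the table names, the staging step and the JDBC URL below are assumptions, not something from the article) is to build the new data in a staging table and then exchange the two tables with quick renames, keeping the old table around until long-running queries finish:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HiveTableSwap {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");            // Hive JDBC driver on the classpath
        try (Connection conn = DriverManager.getConnection("jdbc:hive2://hive-host:10000/default", "user", "");
             Statement stmt = conn.createStatement()) {
            // reporting_table_staging is assumed to be fully loaded by the compact step.
            stmt.execute("ALTER TABLE reporting_table RENAME TO reporting_table_old");
            stmt.execute("ALTER TABLE reporting_table_staging RENAME TO reporting_table");
            // reporting_table_old keeps the previous data for queries that are still running.
        }
    }
}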
						
					
    
	
		
		
03-16-2016 02:13 PM - 1 Kudo
Hi, I am wondering if there is a general way to determine which YARN applications were started by a given application, and vice versa, whether a given YARN application was started by another one. My use case is Oozie and Sqoop, where Oozie runs launchers that in turn start MR jobs to do the actual ingest. It is possible to browse through the logs to get the ID of the spawned application, but I keep thinking there should be a better way to do it. This kind of relation must be stored somewhere, since when the Oozie workflow is killed, all child processes are killed as well almost immediately. Thanks for any hints, Regards, Pavel
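For illustration, a hedged sketch of the kind of lookup I have in mind (whether and how Oozie stamps its child jobs with YARN application tags on a given version is an assumption that would need to be verified): list applications with YarnClient and group them by tag.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class ListAppsByTag {
    public static void main(String[] args) throws Exception {
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new Configuration());
        yarnClient.start();
        try {
            for (ApplicationReport app : yarnClient.getApplications()) {
                // A tag such as "oozie-<launcher-id>" (hypothetical format) would tie
                // a spawned MR job back to the launcher that started it.
                System.out.println(app.getApplicationId() + " tags=" + app.getApplicationTags());
            }
        } finally {
            yarnClient.stop();
        }
    }
}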
						
					
		
			
				
						
Labels: Apache YARN
			
    
	
		
		
03-14-2016 04:46 PM - 1 Kudo
Hi @Sowmya Ramesh, thanks for your reply. You definitely had more luck with Google, since I could not find anything useful related to this exception and Falcon in particular.
I do not have the XML for the failed request yet (logging added and waiting for the issue to happen again), but in general it looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed name="test-2-ALL-RELATIONSHIP" xmlns="uri:falcon:feed:0.1">
    <frequency>days(1)</frequency>
    <clusters>
        <cluster name="dev-cluster" type="source">
            <validity start="2016-03-14T00:00Z" end="2016-03-29T00:00Z"/>
            <retention limit="months(12)" action="delete"/>
        </cluster>
    </clusters>
    <table uri="catalog:test_2:ALL_RELATIONSHIP#mg_version=${YEAR}-${MONTH}-${DAY}-${HOUR}-${MINUTE}"/>
    <ACL owner="user@domain.COM"/>
    <schema location="/none" provider="none"/>
    <properties>
        <property name="queueName" value="mglauncher"/>
    </properties>
</feed>

However, I doubt that the error is caused by incorrect XML, since it is generated automatically and the same operation is usually successful and fails only sometimes.
There is code in the org.apache.falcon.resource.AbstractEntityManager.deserializeEntity() method that does some logging when parsing fails:

if (LOG.isDebugEnabled() && xmlStream.markSupported()) {
    try {
        xmlStream.reset();
        String xmlData = getAsString(xmlStream);
        LOG.debug("XML DUMP for ({}): {}", entityType, xmlData, e);
    } catch (IOException ignore) {
        // ignore
    }
}

but I could not find anything like "XML DUMP for" in our Falcon log. Is this fragment in the Falcon log4j.xml conf file

<logger name="org.apache.falcon" additivity="false">
    <level value="debug"/>
    <appender-ref ref="FILE"/>
</logger>

enough to get these messages into the log? I am not familiar with the implementation, so I am not sure whether the stream supports marking or not. Regards and thanks for any input, Pavel
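For what it is worth, the guarded dump above only fires when the stream supports mark/reset. As a quick standalone illustration of that condition (this is not Falcon's code, and feed.xml is just a hypothetical local file), a plain FileInputStream does not support marking, while wrapping it in a BufferedInputStream does:

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;

public class MarkSupportCheck {
    public static void main(String[] args) throws Exception {
        try (InputStream raw = new FileInputStream("feed.xml")) {    // hypothetical local file
            System.out.println(raw.markSupported());                 // false: FileInputStream cannot rewind
            InputStream buffered = new BufferedInputStream(raw);
            System.out.println(buffered.markSupported());            // true: buffering adds mark/reset
            buffered.mark(Integer.MAX_VALUE);
            buffered.read();
            buffered.reset();                                        // back to the marked position
        }
    }
}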
						
					
			
    
	
		
		
03-03-2016 04:54 PM - 2 Kudos
					
Hi, we are using the org.apache.falcon.client.FalconClient API to update a Falcon process from Java:

falconClient.update(EntityType.PROCESS.name(), <some-id>, <file-name>, true, doAs);

where the local <file-name> is created like this:

...
Marshaller marshaller = entityType.getMarshaller();
final File createTempFile = File.createTempFile(entityType.name().toLowerCase() + "_" + id, ".xml");
LOGGER.debug("Generated entity: {}", entity.toString());
marshaller.marshal(entity, createTempFile);
return createTempFile.getPath();

and sometimes the update fails with this error:

javax.xml.bind.UnmarshalException: [org.xml.sax.SAXParseException; Premature end of file.]
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.createUnmarshalException(AbstractUnmarshallerImpl.java:335)
    at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.createUnmarshalException(UnmarshallerImpl.java:523)
    at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:220)
    at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:189)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:157)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:204)
    at org.apache.falcon.entity.parser.EntityParser.parse(EntityParser.java:94)
    ... 61 more
Caused by: org.xml.sax.SAXParseException; Premature end of file.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLVersionDetector.determineDocVersion(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:216)
    ... 65 more

Most of the updates pass; this error happens only sometimes, so I believe the file is created correctly on the client side and the error is possibly caused by a performance issue or a race condition. Have you seen this behavior? Thanks, Pavel
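One way to rule out the client side (a hedged sketch; writeEntityFile and the empty-payload check are my own illustration, not part of the Falcon API) would be to marshal to a String first and only write the temp file once the payload is known to be complete:

import java.io.File;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.bind.Marshaller;

public class EntityTempFile {
    static String writeEntityFile(Marshaller marshaller, Object entity, String baseName) throws Exception {
        StringWriter xml = new StringWriter();
        marshaller.marshal(entity, xml);
        String payload = xml.toString();
        if (payload.trim().isEmpty()) {
            throw new IllegalStateException("Marshalled entity is empty");   // fail fast on the client
        }
        File tempFile = File.createTempFile(baseName, ".xml");
        Files.write(tempFile.toPath(), payload.getBytes(StandardCharsets.UTF_8));
        return tempFile.getPath();
    }
}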
						
					
		
			
				
						
Labels: Apache Falcon
			
    
	
		
		
02-22-2016 07:12 AM
@Guillermo Ortiz I would say it is an Oozie/Kerberos problem. If I wanted to call HBase from Oozie (there is probably no dedicated action for it), I would end up with the same problem.
						
					
    
	
		
		
02-22-2016 07:07 AM - 1 Kudo
The problem is that our system does not have access to the user's password or keytab. It uses Kerberos authentication and then a Hadoop proxy user to access the various Hadoop services. So it is not possible for us to do kinit again on a data node or to use a password (in a file or directly).
						
					
    
	
		
		
02-16-2016 08:10 PM - 1 Kudo
@Guillermo Ortiz Not really; I have split the original java action into two Oozie actions: the first is a hive action where I get what I need from Hive (using a temporary external table), and the second is a java action where the data are further processed. Currently I use the hive action, but it should be trivial to replace it with a hive2 action in the future when needed. And yes, to my knowledge it is necessary to have a valid Kerberos ticket (kinit does not have to happen in Java, though) or to use a delegation token to connect to Kerberized Hive from Java.
						
					