Member since 04-25-2016

579 Posts | 609 Kudos Received | 111 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 2922 | 02-12-2020 03:17 PM |
|  | 2136 | 08-10-2017 09:42 AM |
|  | 12470 | 07-28-2017 03:57 AM |
|  | 3407 | 07-19-2017 02:43 AM |
|  | 2520 | 07-13-2017 11:42 AM |

12-24-2016 06:43 PM

SYMPTOM: The oozie-hive job runs very slowly; sometimes jobs get stuck in the final stage and never complete.

ROOT CAUSE: When Oozie prepares the hive-site.xml for a hive action, it builds the action configuration by reading core-site.xml, hdfs-site.xml, mapred-site.xml, etc., and the mapred parameter mapreduce.job.reduces ends up set to 1 by default. With mapreduce.job.reduces=1 the job runs with a single reducer and therefore takes a very long time to complete.

WORKAROUND: Set mapreduce.job.reduces to -1 so the framework chooses the number of reducers.

RESOLUTION: The Oozie fix https://issues.apache.org/jira/browse/OOZIE-2205 enhances the actionConf that is passed to the hive action.
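As a quick sanity check outside of Oozie (the script name below is a placeholder), the same override can be passed on the hive CLI; inside the Oozie workflow it would instead go into the hive action's configuration:

hive --hiveconf mapreduce.job.reduces=-1 -f myquery.hql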
						
					
    
	
		
		
12-24-2016 05:08 PM

SYMPTOM: The hive CLI appeared hung for so long that an impatient user hit CTRL+C to exit it and complained about hive CLI slowness.

ROOT CAUSE: The user was running the hive CLI on a Kerberos-enabled cluster. We asked them to enable debug logging on the console using hive --hiveconf hive.root.logger=debug,console and saw the following GSS exception caused by ticket expiration:

WARN hive.metastore: Failed to connect to the MetaStore Server...
org.apache.thrift.transport.TTransportException: GSS initiate failed
	at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
	at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
	at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
	at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
	at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
	at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:426)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1531)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
	at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3000)
	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3019)
	at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1237)
	at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
	at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:484)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:680)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:624)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
During hive CLI startup the client tries to connect to the metastore, but without a valid TGT it fails with the GSS exception above.

WORKAROUND: N/A

RESOLUTION: To make the failure fast instead of a long hang, tune the following properties in the hive configuration:
hive.metastore.connect.retries - number of times the client will try to connect to the metastore; 24 by default.
hive.metastore.client.connect.retry.delay - delay between failed attempts; 5s by default.
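A minimal sketch of how those properties can be applied directly on the CLI (the retry values here are illustrative, not recommendations):

# make sure a valid TGT exists first, then start the CLI with lower retry settings
kinit
hive --hiveconf hive.metastore.connect.retries=3 --hiveconf hive.metastore.client.connect.retry.delay=1s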
						
					
    
	
		
		
12-24-2016 05:33 PM

You may try:

VBoxManage internalcommands repairhd --format OVA --filename <image>

Use the "--dry-run" option to check what the tool would do. Once the image is repaired, try to import it again.
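A minimal sketch of the full sequence, with sandbox.ova standing in for the actual image file:

VBoxManage internalcommands repairhd --format OVA --filename sandbox.ova --dry-run
VBoxManage internalcommands repairhd --format OVA --filename sandbox.ova
VBoxManage import sandbox.ova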
						
					
    
	
		
		
01-03-2017 09:41 PM

Alicia, please see my answer above from Oct 24. If you are running Spark on YARN, you have to go through the YARN ResourceManager UI to reach the Spark UI for a running job; the link to the YARN UI is available from the Ambari YARN service. For a completed job, you need to go through the Spark History Server; the link to the Spark History Server is available from the Ambari Spark service.
						
					
    
	
		
		
12-24-2016 09:11 AM
1 Kudo
While investigating a performance issue with topology assignments, I worked out the high-level steps Storm goes through to assign a topology.

1. For backward compatibility with old topologies, ClientJarTransformerRunner starts and invokes StormShadeTransformer, which writes the transformed jar to /tmp/<some_random_string>.jar.

2. StormSubmitter uploads the topology jar to the Nimbus inbox using NimbusClient:

o.a.s.StormSubmitter - Uploading topology jar /tmp/27ed633ac9aa11e6a850fa163e19dd06.jar to assigned location: /hadoop/storm/nimbus/inbox/stormjar-b1eca4ae-d021-4e93-aaf1-986c9a5772ad.jar
Start uploading file '/tmp/27ed633ac9aa11e6a850fa163e19dd06.jar' to '/hadoop/storm/nimbus/inbox/stormjar-b1eca4ae-d021-4e93-aaf1-986c9a5772ad.jar'
o.a.s.StormSubmitter - Successfully uploaded topology jar to assigned location: /hadoop/storm/nimbus/inbox/stormjar-b1eca4ae-d021-4e93-aaf1-986c9a5772ad.jar
3. The Nimbus client submits the topology to Nimbus using a thrift call:

o.a.s.StormSubmitter - Submitting topology wordcount in distributed mode with conf {"storm.zookeeper.topology.auth.scheme":"digest","storm.zookeeper.topology.auth.payload":"-5184467572710101881:-6542959882697852797","topology.workers":3,"topology.debug":true}
o.a.s.StormSubmitter - Finished submitting topology: wordcount
4. Nimbus receives the topology submission:

o.a.s.d.nimbus [INFO] Received topology submission for wordcount with conf {"topology.max.task.parallelism" nil, "topology.submitter.principal" "", "topology.acker.executors" nil, "topology.eventlogger.executors" 0, "topology.workers" 3, "topology.debug" true, "storm.zookeeper.superACL" nil, "topology.users" (), "topology.submitter.user" "storm", "topology.kryo.register" nil, "topology.kryo.decorators" (), "storm.id" "wordcount-1-1482564367", "topology.name" "wordcount"}
  5.nimbus create assignments in zookeeper and set a watch  2016-12-24 07:26:08.696 o.a.s.d.nimbus [INFO] Setting new assignment for topology id wordcount-1-1482564367: #org.apache.storm.daemon.common.Assignment{:master-code-dir "/hadoop/storm", :node->host {"3cb18e51-aa66-424c-8165-e9101ab134bb" "rkk3.hdp.local"}, :executor->node+port {[8 8] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [12 12] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [2 2] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [7 7] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [22 22] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [3 3] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [24 24] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [1 1] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [18 18] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [6 6] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [28 28] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [20 20] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [9 9] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [23 23] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [11 11] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [16 16] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [13 13] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [19 19] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [21 21] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [5 5] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [27 27] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [29 29] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [26 26] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [10 10] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [14 14] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [4 4] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [15 15] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [25 25] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [17 17] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700]}, :executor->start-time-secs {[8 8] 1482564368, [12 12] 1482564368, [2 2] 1482564368, [7 7] 1482564368, [22 22] 1482564368, [3 3] 1482564368, [24 24] 1482564368, [1 1] 1482564368, [18 18] 1482564368, [6 6] 1482564368, [28 28] 1482564368, [20 20] 1482564368, [9 9] 1482564368, [23 23] 1482564368, [11 11] 1482564368, [16 16] 1482564368, [13 13] 1482564368, [19 19] 1482564368, [21 21] 1482564368, [5 5] 1482564368, [27 27] 1482564368, [29 29] 1482564368, [26 26] 1482564368, [10 10] 1482564368, [14 14] 1482564368, [4 4] 1482564368, [15 15] 1482564368, [25 25] 1482564368, [17 17] 1482564368}, :worker->resources {["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700] [0.0 0.0 0.0], ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701] [0.0 0.0 0.0]}}  6. supervisors got watchevent and read from the assignments  2016-12-24 07:26:09.577 o.a.s.d.supervisor [DEBUG] All assignment: {6701 {:storm-id "wordcount-1-1482564367", :executors ([8 8] [12 12] [2 2] [22 22] [24 24] [18 18] [6 6] [28 28] [20 20] [16 16] [26 26] [10 10] [14 14] [4 4]), :resources [0.0 0.0 0.0]}, 6700 {:storm-id "wordcount-1-1482564367", :executors ([7 7] [3 3] [1 1] [9 9] [23 23] [11 11] [13 13] [19 19] [21 21] [5 5] [27 27] [29 29] [15 15] [25 25] [17 17]), :resources [0.0 0.0 0.0]}}  7. supervisors start downloading the topology jar  after download it start launching workers
  2016-12-24 07:26:12.728 o.a.s.d.supervisor [INFO] Launching worker with assignment {:storm-id "wordcount-1-1482564367", :executors [[7 7] [3 3] [1 1] [9 9] [23 23] [11 11] [13 13] [19 19] [21 21] [5 5] [27 27] [29 29] [15 15] [25 25] [17 17]], :resources #object[org.apache.storm.generated.WorkerResources 0x28e35c1e "WorkerResources(mem_on_heap:0.0, mem_off_heap:0.0, cpu:0.0)"]} for this supervisor 3cb18e51-aa66-424c-8165-e9101ab134bb on port 6700 with id ac690504-6b52-4c88-a5bd-50fa78992368 
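For context, all of the steps above are triggered by an ordinary topology submission; a sketch of such a call (the jar name and main class are placeholders, not taken from the logs above) looks like:

storm jar wordcount-topology.jar org.example.WordCountTopology wordcount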
						
					
    
	
		
		
12-24-2016 07:06 AM

SYMPTOM: The hiveserver2 logs are filled with the following exceptions:

2016-12-22 16:36:49,643 WARN  ipc.Client (Client.java:run(685)) - Exception encountered while connecting to the server :
 javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
     at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
     at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:563)
     at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:378)
     at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:732)
     at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:728)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:415)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:727)
     at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:378)
     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1492)
     at org.apache.hadoop.ipc.Client.call(Client.java:1402)
     at org.apache.hadoop.ipc.Client.call(Client.java:1363)
     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
     at com.sun.proxy.$Proxy23.getFileInfo(Unknown Source)
     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:773)
     at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
     at com.sun.proxy.$Proxy24.getFileInfo(Unknown Source)
     at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2162)
     at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1363)
     at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1359)
     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
     at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1359)
     at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
     at org.apache.ranger.audit.destination.HDFSAuditDestination.getLogFileStream(HDFSAuditDestination.java:226)
     at org.apache.ranger.audit.destination.HDFSAuditDestination.logJSON(HDFSAuditDestination.java:123)
     at org.apache.ranger.audit.queue.AuditFileSpool.sendEvent(AuditFileSpool.java:890)
     at org.apache.ranger.audit.queue.AuditFileSpool.runDoAs(AuditFileSpool.java:838)
     at org.apache.ranger.audit.queue.AuditFileSpool$2.run(AuditFileSpool.java:759)
     at org.apache.ranger.audit.queue.AuditFileSpool$2.run(AuditFileSpool.java:757)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:356)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1689)
     at org.apache.ranger.audit.queue.AuditFileSpool.run(AuditFileSpool.java:765)
     at java.lang.Thread.run(Thread.java:745)
 Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
     at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
     at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
     at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
     at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
     at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
     at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
ROOT CAUSE: hiveserver2 is configured with the Ranger plugin, which writes audit events to both the database and HDFS. The hiveserver2 thread hiveServer2.async.multi_dest.batch_hiveServer2.async.multi_dest.batch.hdfs_destWriter tries to write audit events to HDFS but fails because its TGT has expired.

WORKAROUND: Disable writing audit events to HDFS.

RESOLUTION: This has been fixed in https://issues.apache.org/jira/browse/RANGER-1136, so apply the patch to avoid it.
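While troubleshooting, a quick way to check the Kerberos side is to confirm that the HiveServer2 service keytab can still obtain a ticket on the affected node (the keytab path and principal below are common HDP defaults, not taken from this case):

kinit -kt /etc/security/keytabs/hive.service.keytab hive/$(hostname -f)
klist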
						
					
    
	
		
		
12-23-2016 06:56 PM

Kafka Producer (Python)

yum install -y python-pip
pip install kafka-python

# kafka producer sample code
vim kafka_producer.py

from kafka import KafkaProducer
from kafka.errors import KafkaError

producer = KafkaProducer(bootstrap_servers=['rkk1.hdp.local:6667'])
topic = "kafkatopic"
producer.send(topic, b'test message')
producer.flush()  # make sure the buffered message is delivered before the script exits

# run it
python kafka_producer.py

# test it
[root@rkk1 ~]# /usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh --zookeeper `hostname`:2181 --topic kafkatopic
{metadata.broker.list=rkk1.hdp.local:6667,rkk2.hdp.local:6667,rkk3.hdp.local:6667, request.timeout.ms=30000, client.id=console-consumer-41051, security.protocol=PLAINTEXT}
test message

Kafka Producer (Scala)

mkdir kafkaproducerscala
cd kafkaproducerscala/
mkdir -p src/main/scala
cd src/main/scala
vim KafkaProducerScala.scala

import java.util.Properties
import org.apache.kafka.clients.producer._

object KafkaProducerScala extends App {

  val props = new Properties()
  props.put("bootstrap.servers", "rkk1:6667")
  props.put("acks", "1")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

  val producer = new KafkaProducer[String, String](props)
  val topic = "kafkatopic"

  for (i <- 1 to 50) {
    val record = new ProducerRecord[String, String](topic, "key" + i, "value" + i)
    producer.send(record)
  }

  producer.close()
}
 
cd -
vim build.sbt

val kafkaVersion = "0.9.0.0"
scalaVersion := "2.11.7"

libraryDependencies += "org.apache.kafka" % "kafka-clients" % kafkaVersion
resolvers += Resolver.mavenLocal

sbt package
sbt run
						
					
    
	
		
		
12-23-2016 06:37 PM

SYMPTOM: The hive metastore crashes with an OutOfMemoryError during ACID compactions.

ERROR [Thread-13]: compactor.Cleaner (Cleaner.java:run(140)) - Caught an exception in the main loop of compactor cleaner, java.lang.OutOfMemoryError: Java heap space

ROOT CAUSE: We enabled a heap dump on OutOfMemoryError; analysis of the heap dump showed a very large number of FileSystem$Cache$Key and FileSystem objects, which was causing a memory leak.

WORKAROUND:
set fs.hdfs.impl.disable.cache=true
set fs.file.impl.disable.cache=true

RESOLUTION: This has been fixed in https://issues.apache.org/jira/browse/HIVE-13151, so apply the patch to avoid it.
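For reference, a heap dump like the one mentioned under ROOT CAUSE can be captured with the standard JVM flags below; one common route (an assumption here, adjust to your own setup) is appending them to the metastore's Java options via the hive-env template and restarting the metastore. The dump path is a placeholder:

export HADOOP_OPTS="$HADOOP_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/hms_heapdump.hprof"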
						
					
    
	
		
		
09-11-2018 08:02 PM

@Rajkumar Singh Can we override these values in the PublishKafka processor instead of making the change at the broker/producer level? If not, then why does the "PublishKafka_0_10 1.5.0.3.1.0.0-564" processor in NiFi have a "Max Request Size" field that allows us to modify it? The default value of this field is 1 MB. Thanks!
						
					
    
	
		
		
12-23-2016 06:18 AM
3 Kudos

We often need to read a parquet file, its metadata, or its footer; parquet-tools ships with the parquet-hadoop library and can help us do that. These are simple steps to build parquet-tools and demonstrate its use.

Prerequisites: maven 3, git, jdk-7/8

// build parquet-tools
git clone https://github.com/Parquet/parquet-mr.git
cd parquet-mr/parquet-tools/
mvn clean package -Plocal

// print the schema of the parquet file
java -jar parquet-tools-1.6.0.jar schema sample.parquet

// read the parquet file
java -jar parquet-tools-1.6.0.jar cat sample.parquet

// read the first few lines of the parquet file
java -jar parquet-tools-1.6.0.jar head -n5 sample.parquet

// print the meta information of the parquet file
java -jar parquet-tools-1.6.0.jar meta sample.parquet
						
					