Member since
04-25-2016
579
Posts
609
Kudos Received
111
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1125 | 02-12-2020 03:17 PM |
| | 887 | 08-10-2017 09:42 AM |
| | 7201 | 07-28-2017 03:57 AM |
| | 1328 | 07-19-2017 02:43 AM |
| | 1027 | 07-13-2017 11:42 AM |
12-25-2016
10:48 AM
@rama Unfortunately, there is no way to restore a partition in Hive.
... View more
12-25-2016
08:16 AM
Problem Description: We often need to read and write underlying files with a user-defined reader and writer. If the custom reader and writer are written in Java or another JVM language, we can simply add them to the Hive aux jars or add them at the session level with the ADD JAR option. But if they are written in a native language and shipped as a *.so file, we get java.lang.UnsatisfiedLinkError. We can work around this by adding the native library path to hive-env:
1. Open Ambari --> Hive --> Advanced --> Advanced hive-env --> hive-env template
2. Modify the following block:
{% if sqla_db_used or lib_dir_available %}
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:{{jdbc_libs_dir}}"
export JAVA_LIBRARY_PATH="$JAVA_LIBRARY_PATH:{{jdbc_libs_dir}}"
{% endif %}
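For example, a minimal sketch of the change, assuming the *.so file lives in a hypothetical directory /usr/local/custom-serde/native (adjust to wherever your native library is installed), is to extend the same exports with that directory:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/custom-serde/native"
export JAVA_LIBRARY_PATH="$JAVA_LIBRARY_PATH:/usr/local/custom-serde/native"
Save the template and restart the Hive services from Ambari so that the new environment variables take effect.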
... View more
- Find more articles tagged with:
- Data Ingestion & Streaming
- Hive
- Issue Resolution
Labels:
12-24-2016
07:11 PM
@Alicia Alicia My guess is that you are running the sandbox in NAT mode. You can still access the History Server web UI at http://localhost:18080 because a port-forwarding rule is already configured for it; can you try that and confirm? To access the web UI on port 4040 you need to configure a port-forwarding rule in VirtualBox.
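As a hedged example (the VM name "Hortonworks Sandbox" is an assumption; check the output of VBoxManage list vms for the real name), a NAT port-forwarding rule for 4040 can be added from the host:
VBoxManage modifyvm "Hortonworks Sandbox" --natpf1 "spark-ui,tcp,,4040,,4040"
If the VM is already running, the same rule can be added on the fly with VBoxManage controlvm "Hortonworks Sandbox" natpf1 "spark-ui,tcp,,4040,,4040", after which http://localhost:4040 should reach the Spark UI.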
... View more
12-24-2016
06:43 PM
SYMPTOM:
The Oozie Hive job runs very slowly; sometimes jobs get stuck in the final stage and never complete.
ROOT CAUSE: When Oozie prepares the hive-site.xml for the Hive action, the mapred parameter mapreduce.job.reduces is set to 1 by default; this is because Oozie prepares the action configuration after reading core-site.xml, hdfs-site.xml, mapred-site.xml, etc. With mapreduce.job.reduces=1 the job runs with a single reducer and therefore takes a long time to complete.
WORKAROUND: Set mapreduce.job.reduces to -1.
RESOLUTION: There is an Oozie fix, https://issues.apache.org/jira/browse/OOZIE-2205, which enhances the actionConf passed to the Hive action.
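A hedged sketch of applying the workaround inside the workflow definition (the action name, script name, and schema version are assumptions; only the mapreduce.job.reduces property is the actual fix):
<action name="hive-node">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <!-- override the default of 1 so Hive/MapReduce pick the reducer count -->
                <name>mapreduce.job.reduces</name>
                <value>-1</value>
            </property>
        </configuration>
        <script>script.q</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
</action>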
... View more
- Find more articles tagged with:
- Governance & Lifecycle
- Hive
- Issue Resolution
- Oozie
Labels:
12-24-2016
06:28 PM
@Alicia Alicia It looks like you are running your sandbox in NAT mode; that is why you are not able to reach the host 10.0.2.15. Could you please try pinging 10.0.2.15 from the command line to see whether you can connect?
... View more
12-24-2016
06:00 PM
1 Kudo
@Alicia Alicia Hope this helps you. If so, please accept the best answer so that others can refer to it.
... View more
12-24-2016
05:25 PM
1 Kudo
@Alicia Alicia One more thing to add here: if you want to see the application logs for a completed application, you can access the following URL: http://<hostname_spark_history_server>:18080/
... View more
12-24-2016
05:22 PM
1 Kudo
@Alicia Alicia Well, I could not see any problem with the code; it executes successfully. Did you see any event in the driver logs like: SparkUI: Bound SparkUI to 0.0.0.0, and started at http://172.26.81.127:4041
... View more
12-24-2016
05:08 PM
SYMPTOM: The Hive CLI was hung for a long time; an impatient user pressed CTRL+C to exit it and complained about the Hive CLI being slow.
ROOT CAUSE: The user was running the Hive CLI with Kerberos security enabled. We asked them to enable debug logging on the console using hive --hiveconf hive.root.logger=debug,console and saw the following GSS exception caused by ticket expiration:
WARN hive.metastore: Failed to connect to the MetaStore Server...
org.apache.thrift.transport.TTransportException: GSS initiate failed
at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:426)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1531)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3000)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3019)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1237)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:484)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:680)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:624)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
During Hive CLI startup the client tries to connect to the metastore but does not have a valid TGT, hence it fails with the GSS exception.
WORKAROUND: NA
RESOLUTION: To fail fast, we can tune the following properties in the Hive configuration: hive.metastore.connect.retries - the number of times the client will try to connect to the metastore (24 by default); hive.metastore.client.connect.retry.delay - the delay between retries (5s by default).
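A minimal hive-site.xml sketch of the fail-fast tuning (the values below are illustrative assumptions; only the property names come from the resolution above):
<property>
    <name>hive.metastore.connect.retries</name>
    <value>3</value>   <!-- default 24 -->
</property>
<property>
    <name>hive.metastore.client.connect.retry.delay</name>
    <value>1s</value>  <!-- default 5s -->
</property>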
... View more
- Find more articles tagged with:
- Hive
- Issue Resolution
- issue-resolution
- Kerberos
- Security
Labels:
12-24-2016
04:52 PM
2 Kudos
@Julius Gamboa I faced a similar exception with VirtualBox, but after upgrading it I was able to import successfully.
... View more
12-24-2016
04:11 PM
1 Kudo
@Alicia Alicia Are you able to access the driver UI after the suggested change?
... View more
12-24-2016
03:50 PM
@Alicia Alicia I don't think this application runs for long. Could you please introduce a Thread.sleep call in the main method like this (a jar rebuild is required in this case)?
def main(args: Array[String]) {
  Thread.sleep(30000)
  // Load the ratings file and parse it
  val Ratingfiles = sc.textFile("hdfs://sandbox.hortonworks.com:8020/tmp/ProjetFilm/ratings.dat")
  ...
}
This will keep your application alive for at least 30 seconds so you can quickly grab the driver URL from the driver logs.
... View more
12-24-2016
03:01 PM
@Alicia Alicia Did you see any port-bind exception in the driver logs? The Spark UI tries to bind to port 4040 and, if that fails, tries 4041, 4042, and so on up to 4044. How long does your application run? There is a chance it exits quickly after submission. Could you please post the driver logs?
... View more
12-24-2016
02:20 PM
2 Kudos
@Alicia Alicia sbt package only generates the application jar; you then need to submit that jar to the Spark cluster, telling it which master to use: local, yarn-client, yarn-cluster, or standalone. Have you submitted the application with spark-submit, specifying the --master parameter?
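A hedged example of such a submission (the main class and jar path are placeholders for whatever sbt package produced in your project):
spark-submit --master yarn-client --class <your.main.Class> target/scala-2.10/<your-app>_2.10-1.0.jar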
... View more
12-24-2016
09:11 AM
1 Kudo
While investigating a performance issue with topology assignments, I worked out these high-level steps that Storm uses to assign topologies.
1. For backward compatibility with old topologies, ClientJarTransformerRunner starts and invokes StormShadeTransformer, which writes the transformed jar to /tmp/<some_random_string>.jar.
2. StormSubmitter starts uploading the topology jar to the Nimbus inbox using NimbusClient:
o.a.s.StormSubmitter - Uploading topology jar /tmp/27ed633ac9aa11e6a850fa163e19dd06.jar to assigned location: /hadoop/storm/nimbus/inbox/stormjar-b1eca4ae-d021-4e93-aaf1-986c9a5772ad.jar
Start uploading file '/tmp/27ed633ac9aa11e6a850fa163e19dd06.jar' to '/hadoop/storm/nimbus/inbox/stormjar-b1eca4ae-d021-4e93-aaf1-986c9a5772ad.jar'
o.a.s.StormSubmitter - Successfully uploaded topology jar to assigned location: /hadoop/storm/nimbus/inbox/stormjar-b1eca4ae-d021-4e93-aaf1-986c9a5772ad.jar
3. The Nimbus client submits the topology to Nimbus using a Thrift call:
o.a.s.StormSubmitter - Submitting topology wordcount in distributed mode with conf {"storm.zookeeper.topology.auth.scheme":"digest","storm.zookeeper.topology.auth.payload":"-5184467572710101881:-6542959882697852797","topology.workers":3,"topology.debug":true}
o.a.s.StormSubmitter - Finished submitting topology: wordcount
4. nimbus received topology submission from zookeeper o.a.s.d.nimbus [INFO] Received topology submission for wordcount with conf {"topology.max.task.parallelism" nil, "topology.submitter.principal" "", "topology.acker.executors" nil, "topology.eventlogger.executors" 0, "topology.workers" 3, "topology.debug" true, "storm.zookeeper.superACL" nil, "topology.users" (), "topology.submitter.user" "storm", "topology.kryo.register" nil, "topology.kryo.decorators" (), "storm.id" "wordcount-1-1482564367", "topology.name" "wordcount"}
5.nimbus create assignments in zookeeper and set a watch 2016-12-24 07:26:08.696 o.a.s.d.nimbus [INFO] Setting new assignment for topology id wordcount-1-1482564367: #org.apache.storm.daemon.common.Assignment{:master-code-dir "/hadoop/storm", :node->host {"3cb18e51-aa66-424c-8165-e9101ab134bb" "rkk3.hdp.local"}, :executor->node+port {[8 8] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [12 12] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [2 2] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [7 7] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [22 22] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [3 3] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [24 24] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [1 1] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [18 18] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [6 6] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [28 28] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [20 20] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [9 9] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [23 23] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [11 11] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [16 16] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [13 13] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [19 19] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [21 21] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [5 5] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [27 27] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [29 29] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [26 26] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [10 10] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [14 14] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [4 4] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701], [15 15] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [25 25] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700], [17 17] ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700]}, :executor->start-time-secs {[8 8] 1482564368, [12 12] 1482564368, [2 2] 1482564368, [7 7] 1482564368, [22 22] 1482564368, [3 3] 1482564368, [24 24] 1482564368, [1 1] 1482564368, [18 18] 1482564368, [6 6] 1482564368, [28 28] 1482564368, [20 20] 1482564368, [9 9] 1482564368, [23 23] 1482564368, [11 11] 1482564368, [16 16] 1482564368, [13 13] 1482564368, [19 19] 1482564368, [21 21] 1482564368, [5 5] 1482564368, [27 27] 1482564368, [29 29] 1482564368, [26 26] 1482564368, [10 10] 1482564368, [14 14] 1482564368, [4 4] 1482564368, [15 15] 1482564368, [25 25] 1482564368, [17 17] 1482564368}, :worker->resources {["3cb18e51-aa66-424c-8165-e9101ab134bb" 6700] [0.0 0.0 0.0], ["3cb18e51-aa66-424c-8165-e9101ab134bb" 6701] [0.0 0.0 0.0]}} 6. supervisors got watchevent and read from the assignments 2016-12-24 07:26:09.577 o.a.s.d.supervisor [DEBUG] All assignment: {6701 {:storm-id "wordcount-1-1482564367", :executors ([8 8] [12 12] [2 2] [22 22] [24 24] [18 18] [6 6] [28 28] [20 20] [16 16] [26 26] [10 10] [14 14] [4 4]), :resources [0.0 0.0 0.0]}, 6700 {:storm-id "wordcount-1-1482564367", :executors ([7 7] [3 3] [1 1] [9 9] [23 23] [11 11] [13 13] [19 19] [21 21] [5 5] [27 27] [29 29] [15 15] [25 25] [17 17]), :resources [0.0 0.0 0.0]}} 7. supervisors start downloading the topology jar after download it start launching workers
2016-12-24 07:26:12.728 o.a.s.d.supervisor [INFO] Launching worker with assignment {:storm-id "wordcount-1-1482564367", :executors [[7 7] [3 3] [1 1] [9 9] [23 23] [11 11] [13 13] [19 19] [21 21] [5 5] [27 27] [29 29] [15 15] [25 25] [17 17]], :resources #object[org.apache.storm.generated.WorkerResources 0x28e35c1e "WorkerResources(mem_on_heap:0.0, mem_off_heap:0.0, cpu:0.0)"]} for this supervisor 3cb18e51-aa66-424c-8165-e9101ab134bb on port 6700 with id ac690504-6b52-4c88-a5bd-50fa78992368
... View more
- Find more articles tagged with:
- Data Ingestion & Streaming
- FAQ
- nimbus
- Storm
- supervisors
Labels:
12-24-2016
08:18 AM
5 Kudos
This looks like a timestamp, not a date. You can store it as a string in the Hive table and retrieve it using the to_date() function, or run a date transformation before inserting into the Hive table. It looks like you have an RFC 822-style timestamp, which you can convert into a format Hive understands, for example with a small Java program:
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class RFC822TimeStampConverter {
  public static void main(String[] args) {
    String rfcDate = "Tue, Dec 20 10:04:31 2016";
    // use dd (day of month) and yyyy (year); DD is day-of-year and YYYY is week-year
    String pattern = "EEE, MMM dd HH:mm:ss yyyy";
    SimpleDateFormat format = new SimpleDateFormat(pattern);
    try {
      Date javaDate = format.parse(rfcDate);
      System.out.println(javaDate.toString());
    } catch (ParseException e) {
      e.printStackTrace();
    }
  }
}
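Alternatively, a hedged sketch of doing the conversion directly in Hive (the table and column names are hypothetical); unix_timestamp() accepts the same SimpleDateFormat-style pattern:
SELECT to_date(from_unixtime(unix_timestamp(event_ts, 'EEE, MMM dd HH:mm:ss yyyy')))
FROM raw_events;   -- event_ts is the string column holding the RFC 822 timestamp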
... View more
12-24-2016
07:06 AM
SYMPTOM: The HiveServer2 logs are filled with the following exceptions:
2016-12-22 16:36:49,643 WARN ipc.Client (Client.java:run(685)) - Exception encountered while connecting to the server :
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:563)
at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:378)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:732)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:728)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:727)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:378)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1492)
at org.apache.hadoop.ipc.Client.call(Client.java:1402)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy23.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:773)
at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy24.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2162)
at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1363)
at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1359)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1359)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.ranger.audit.destination.HDFSAuditDestination.getLogFileStream(HDFSAuditDestination.java:226)
at org.apache.ranger.audit.destination.HDFSAuditDestination.logJSON(HDFSAuditDestination.java:123)
at org.apache.ranger.audit.queue.AuditFileSpool.sendEvent(AuditFileSpool.java:890)
at org.apache.ranger.audit.queue.AuditFileSpool.runDoAs(AuditFileSpool.java:838)
at org.apache.ranger.audit.queue.AuditFileSpool$2.run(AuditFileSpool.java:759)
at org.apache.ranger.audit.queue.AuditFileSpool$2.run(AuditFileSpool.java:757)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:356)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1689)
at org.apache.ranger.audit.queue.AuditFileSpool.run(AuditFileSpool.java:765)
at java.lang.Thread.run(Thread.java:745)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
ROOT CAUSE: HiveServer2 is configured with the Ranger plugin, which writes audit events to both the database and HDFS. The HiveServer2 thread hiveServer2.async.multi_dest.batch_hiveServer2.async.multi_dest.batch.hdfs_destWriter is trying to write audit events to HDFS but fails because the TGT has expired.
WORKAROUND: Disable writing audit events to HDFS.
RESOLUTION: This has been fixed in https://issues.apache.org/jira/browse/RANGER-1136, so apply the patch to avoid this.
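A hedged sketch of the workaround in the Hive plugin's audit configuration (assuming a Ranger 0.5+ style ranger-hive-audit setup; the property name may differ on older plugin versions):
<property>
    <name>xasecure.audit.destination.hdfs</name>
    <value>false</value>  <!-- assumption: disables only the HDFS audit destination; DB auditing keeps working -->
</property>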
... View more
- Find more articles tagged with:
- Data Processing
- hiveserver2
- Issue Resolution
- Ranger
Labels:
12-23-2016
06:56 PM
Kafka Producer (Python)
yum install -y python-pip
pip install kafka-python
//kafka producer sample code
vim kafka_producer.py
from kafka import KafkaProducer
from kafka.errors import KafkaError
producer = KafkaProducer(bootstrap_servers=['rkk1.hdp.local:6667'])
topic = "kafkatopic"
producer.send(topic, b'test message')
producer.flush()  # flush so the message is actually delivered before the script exits
//run it
python kafka_producer.py
//test it
[root@rkk1 ~]# /usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh --zookeeper `hostname`:2181 --topic kafkatopic
{metadata.broker.list=rkk1.hdp.local:6667,rkk2.hdp.local:6667,rkk3.hdp.local:6667, request.timeout.ms=30000, client.id=console-consumer-41051, security.protocol=PLAINTEXT}
test message
Kafka Producer (Scala)
mkdir kafkaproducerscala
cd kafkaproducerscala/
mkdir -p src/main/scala
cd src/main/scala
vim KafkaProducerScala.scala
object KafkaProducerScala extends App {
import java.util.Properties
import org.apache.kafka.clients.producer._
val props = new Properties()
props.put("bootstrap.servers", "rkk1:6667")
props.put("acks","1")
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
val producer = new KafkaProducer[String, String](props)
val topic="kafkatopic"
for(i<- 1 to 50) {
val record = new ProducerRecord(topic, "key"+i, "value"+i)
producer.send(record)
}
producer.close()
}
cd -
vim build.sbt
val kafkaVersion = "0.9.0.0"
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.kafka" % "kafka-clients" % kafkaVersion
resolvers += Resolver.mavenLocal
sbt package
sbt run
... View more
- Find more articles tagged with:
- How-ToTutorial
- Kafka
- Sandbox & Learning
Labels:
12-23-2016
06:37 PM
SYMPTOM: The Hive Metastore crashes with an OutOfMemoryError during ACID compactions:
ERROR [Thread-13]: compactor.Cleaner (Cleaner.java:run(140)) - Caught an exception in the main loop of compactor cleaner, java.lang.OutOfMemoryError: Java heap space
ROOT CAUSE: We enabled a heap dump on OutOfMemoryError; analysis of the dump showed a huge number of FileSystem$Cache$Key and FileSystem objects, indicating a memory leak in the FileSystem cache.
WORKAROUND: set fs.hdfs.impl.disable.cache=true and set fs.file.impl.disable.cache=true
RESOLUTION: This has been fixed in https://issues.apache.org/jira/browse/HIVE-13151, so apply the patch to avoid this.
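The same workaround expressed as a hive-site.xml sketch (the property names are exactly those listed above):
<property>
    <name>fs.hdfs.impl.disable.cache</name>
    <value>true</value>   <!-- stop caching FileSystem objects for hdfs:// -->
</property>
<property>
    <name>fs.file.impl.disable.cache</name>
    <value>true</value>   <!-- stop caching FileSystem objects for file:// -->
</property>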
... View more
- Find more articles tagged with:
- Data Processing
- Hive
- hive-metastore
- Issue Resolution
Labels:
12-23-2016
11:50 AM
1 Kudo
SYMPTOM: The user frequently sees the following exceptions: org.apache.kafka.common.errors.RecordTooLargeException: The request included a message larger than the max message size the server will accept.
org.apache.kafka.common.errors.RecordTooLargeException: The message is 5745799 bytes when serialized which is larger than the maximum request size you have configured with the max.request.size configuration.
ROOT CAUSE: This happens because of the maximum-message-size settings at both the broker and producer level. Broker side: message.max.bytes is the largest message size the broker will accept from a producer. Producer side: max.request.size limits the size of the request a producer will send.
WORKAROUND: NA
RESOLUTION: message.max.bytes defaults to roughly 1 MB (1000012 bytes) in Kafka 0.10.0. If you need to publish larger messages, you will need to raise it on the brokers and then restart them. To avoid the exception on the producer side, increase max.request.size. PS: If you get RecordTooLargeException on the consumer side, increase max.partition.fetch.bytes, which lets you consume large messages.
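A hedged sketch of the tuning (the 10 MB value is an illustrative assumption, and the broker address is a placeholder; the kafka-python parameters max_request_size and max_partition_fetch_bytes mirror the Java client's max.request.size and max.partition.fetch.bytes):
# broker side: server.properties
message.max.bytes=10485760
# client side, using kafka-python
from kafka import KafkaProducer, KafkaConsumer
producer = KafkaProducer(bootstrap_servers=['broker1:6667'],
                         max_request_size=10485760)           # raise the per-request limit
consumer = KafkaConsumer('kafkatopic',
                         bootstrap_servers=['broker1:6667'],
                         max_partition_fetch_bytes=10485760)  # allow fetching large messages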
... View more
- Find more articles tagged with:
- Data Ingestion & Streaming
- Issue Resolution
- Kafka
Labels:
12-23-2016
07:38 AM
@rama Could you please try INSERT OVERWRITE DIRECTORY 'wasb:///<some location>' SELECT * FROM <original table>, then create an external table pointing to this new location wasb:///<some location>?
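A minimal sketch of those two steps (table names, columns, and the wasb path are hypothetical placeholders):
INSERT OVERWRITE DIRECTORY 'wasb:///tmp/original_table_export'
SELECT * FROM original_table;
-- declare the real schema of original_table here; (id INT, name STRING) is just a placeholder
CREATE EXTERNAL TABLE original_table_ext (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'
LOCATION 'wasb:///tmp/original_table_export';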
... View more
12-23-2016
07:34 AM
Could you please attach the complete HiveServer2 logs? It looks like HS2 is not able to create its znode; this could be a Kerberos issue if you are running in a secure environment, or there might be a ZooKeeper database inconsistency. To rule out ZK database inconsistencies, you can try: stop all services that use ZooKeeper, change the ZooKeeper directory location under Ambari => ZooKeeper => Configs (by default it is /hadoop/zookeeper), then start all services again.
... View more
12-23-2016
07:08 AM
2 Kudos
@ARUN It looks like HiveServer2 is not able to create its znode in ZooKeeper, so the issue could be on the ZooKeeper side. Could you please check whether the znode got created, using /usr/hdp/current/zookeeper-client/bin/zkCli.sh -server localhost:2181 ls /hiveserver2, and see if you have an issue with the ZooKeeper server?
... View more
12-23-2016
06:18 AM
3 Kudos
We often need to read a Parquet file, its metadata, or its footer. parquet-tools ships with the parquet-hadoop library and helps us read Parquet. These are simple steps to build parquet-tools and demonstrate its use.
Prerequisites: Maven 3, git, JDK 7/8
// Build parquet-tools
git clone https://github.com/Parquet/parquet-mr.git
cd parquet-mr/parquet-tools/
mvn clean package -Plocal
// Print the schema of a parquet file
java -jar parquet-tools-1.6.0.jar schema sample.parquet
// Read a parquet file
java -jar parquet-tools-1.6.0.jar cat sample.parquet
// Read the first few lines of a parquet file
java -jar parquet-tools-1.6.0.jar head -n5 sample.parquet
// Print the metadata of a parquet file
java -jar parquet-tools-1.6.0.jar meta sample.parquet
... View more
- Find more articles tagged with:
- Data Processing
- How-ToTutorial
- parquet
- tool
12-23-2016
05:49 AM
SYMPTOM: The user complains about seeing special characters (^@^@^@^@^@^@^) in the GC logs, similar to this: concurrent-mark-sweep perm gen total 29888K, used 29792K [0x00000007e0000000, 0x00000007e1d30000, 0x0000000800000000)
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^ @^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@2016-12-10T00:09:31.648-0500: 912.497: [GC2016-12-10T00:09:31.648-0500: 912.497: [ParNew: 574321K->15632K(629120K), 0.0088960 secs] 602402K->43713K(2027264K), 0.0090030 secs]
ROOT CAUSE: A query running on HiveServer2 in MR mode spun up a local map-join task; the forked child process inherited all of the parent's Java arguments and was writing to the same HiveServer2 GC log file. This introduced the special characters into the GC log, and some GC events were skipped while writing to the file.
WORKAROUND: NA
RESOLUTION: Run the query in Tez mode, which forces the map join to run inside a task container, or set hive.exec.submit.local.task.via.child=false so that no child process is forked to run the local map-join task. The latter can be risky: if the map join goes OOM, it can stall the whole service.
... View more
- Find more articles tagged with:
- Data Processing
- hiveserver2
- Issue Resolution
- issue-resolution
Labels:
12-23-2016
05:00 AM
@rama Were you able to try the suggestions?
... View more
12-22-2016
05:39 PM
2 Kudos
Try adding these jars to the Hive aux jars, or at the session level using the ADD JAR option, and see if it helps: hive-hbase-handler-*.jar and hbase-client-*.jar
ADD JAR /usr/hdp/current/hive-client/lib/hive-hbase-handler-*.jar;
ADD JAR /usr/hdp/2.3.2.0-2950/hbase/lib/hbase-client-*.jar;
... View more
12-22-2016
05:33 PM
4 Kudos
Remove the property 'org.apache.atlas.hive.hook.HiveHook' from 'hive.exec.post.hooks' in hive-site.xml, and restart all affected components.
... View more