Member since: 03-23-2015
Posts: 1288
Kudos Received: 114
Solutions: 98
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3349 | 06-11-2020 02:45 PM |
| | 5062 | 05-01-2020 12:23 AM |
| | 2859 | 04-21-2020 03:38 PM |
| | 3562 | 04-14-2020 12:26 AM |
| | 2360 | 02-27-2020 05:51 PM |
10-09-2019
05:32 PM
Yes, I didn't notice that it has a space, so I used trim() for it.
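For reference, a minimal sketch of the kind of fix described here, assuming a Hive table; the connection string, table, and column names are all hypothetical:

```
# trim() strips the stray leading/trailing spaces so the comparison matches
# (table and column names below are made up for illustration)
beeline -u jdbc:hive2://localhost:10000 \
  -e "SELECT * FROM my_db.my_table WHERE trim(my_col) = 'expected_value';"
```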
10-06-2019
03:32 PM
1 Kudo
@Mekaam, Glad that it helped. Cheers Eric
10-06-2019
03:13 PM
@gimp077, did you mean that "REFRESH" takes time and you eventually see the updated data, just with some delay? How big is the table, in terms of the number of partitions and the number of files in HDFS? Eric
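For anyone following along, one way to check those numbers in Impala; the database and table names are placeholders:

```
# Partition count and per-partition row/file/size stats
impala-shell -q "SHOW PARTITIONS my_db.my_table;"
impala-shell -q "SHOW TABLE STATS my_db.my_table;"
# REFRESH reloads file metadata; with many partitions and files it can take a while
impala-shell -q "REFRESH my_db.my_table;"
```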
09-25-2019
04:02 PM
@aohl, Thanks for sharing the details of how you resolved the issue. I am glad that it has been resolved, and I am sure others will benefit from your findings here. Cheers Eric
09-25-2019
03:49 AM
@yukti, I see that there is a new line after the UUID in the log; I am not sure if that is a formatting issue in the post or if it is actually there in the log. Can you please double-check, and also check the content of the UUID file? Cheers Eric
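One quick way to check for a stray trailing newline, assuming shell access to the host; the file path is a placeholder:

```
# -A makes non-printing characters visible; a newline shows up as '$'
cat -A /path/to/uuid_file
# A byte count larger than expected also hints at trailing whitespace
wc -c /path/to/uuid_file
```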
09-25-2019
03:44 AM
Hi Vijay, Sorry, I haven't been able to nail down the cause yet, but can you collect the EXPLAIN EXTENDED output for the query and attach it to the post?

EXPLAIN EXTENDED select * from tdb1.t1 where bs1_dt=2017-06-23;
EXPLAIN EXTENDED select * from tdb1.t1;

I would like to check how HS2 builds the query plan and see if there is any clue. Cheers Eric
09-23-2019
03:26 AM
@EricL I am facing the following problems when connecting to ports 10000 and 10001; the HiveServer2 logs are as follows.

!connect jdbc:hive2://localhost:10000 hive hive
Connecting to jdbc:hive2://localhost:10000
19/09/23 10:22:05 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10000
Could not open connection to the HS2 server. Please check the server URI and if the URI is correct, then ask the administrator to check the server status.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000: java.net.ConnectException: Connection refused (Connection refused) (state=08S01,code=0)

HiveServer2 log: it does not even hit HiveServer2, hence there are no logs.

beeline> !connect jdbc:hive2://localhost:10001 hive hive
Connecting to jdbc:hive2://localhost:10001
19/09/23 10:22:21 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10001
Unknown HS2 problem when communicating with Thrift server.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10001: Invalid status 72 (state=08S01,code=0)

HiveServer2 log:
2019-09-23T10:24:08,767 WARN [HiveServer2-HttpHandler-Pool: Thread-59] http.HttpParser: Illegal character 0x1 in state=START for buffer HeapByteBuffer@63bc8470[p=1,l=25,c=8192,r=24]={\x01<<<\x00\x00\x00\x05PLAIN\x05\x00\x00\x00\n\x00hive\x00hive>>>\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00...\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00}
2019-09-23T10:24:08,767 WARN [HiveServer2-HttpHandler-Pool: Thread-59] http.HttpParser: bad HTTP parsed: 400 Illegal character 0x1 for HttpChannelOverHttp@751d099b{r=0,c=false,a=IDLE,uri=null}
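Not a confirmed diagnosis, but "Invalid status 72" on the client together with "Illegal character 0x1" in the server's HTTP parser usually means a binary-mode Thrift client is hitting an HTTP-mode HS2 port. A sketch of an HTTP-mode connect string, assuming the default httpPath of cliservice:

```
beeline> !connect jdbc:hive2://localhost:10001/default;transportMode=http;httpPath=cliservice hive hive
```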
09-20-2019
10:11 AM
@hores wrote: "So it looks like column specific is only on a table without partitions (non-incremental)". That's incorrect: non-incremental COMPUTE STATS works on partitioned tables and is generally the preferred method for collecting stats on them. We've generally tried to steer people away from incremental stats because of the size issues on large tables. Column-specific incremental stats would also be error-prone to use correctly and complex to implement: what happens if you compute incremental stats with different subsets of the columns? You can end up with different column subsets on different partitions and then have to somehow reconcile it all each time.
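A sketch of the two flavors in Impala; the table name and partition spec are made up:

```
# Non-incremental COMPUTE STATS handles a partitioned table in one statement
impala-shell -q "COMPUTE STATS my_db.my_partitioned_table;"
# Incremental stats run per partition, but the stats metadata grows with
# partitions times columns, which is the size issue mentioned above
impala-shell -q "COMPUTE INCREMENTAL STATS my_db.my_partitioned_table PARTITION (ds='2019-09-20');"
```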
09-20-2019
10:04 AM
@Zane- I'm late, but I can provide some additional insight. I think the suggestion in the error message is a good one (I'm biased because I wrote it, but some thought went into it): "Memory is likely oversubscribed. Reducing query concurrency or configuring admission control may help avoid this error." The general solution is to set up admission control with memory limits so that memory doesn't get oversubscribed and one query can't gobble up more memory than you like. I did a talk at Strata that gave pointers on a lot of these things: https://conferences.oreilly.com/strata/strata-ca-2019/public/schedule/detail/73000 In this case you can actually see that query 2f4b5cff11212907:886aa1400000000 is using Total=78.60 GB of memory, so that's likely your problem. Impala's resource management is totally permissive out of the box and will happily let queries use up all the resources in the system like this. I didn't see which version you're running, but there were a lot of improvements in this area (config options, OOM avoidance, diagnostics) in CDH 6.1+. There are various other angles you can take to improve this: if the queries using lots of memory are suboptimal, tuning them (maybe just computing stats) makes a big difference. You can also ...
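A sketch of the per-query angle; the 10g value is illustrative, and real limits are better enforced through admission control pool settings:

```
# Cap a query's memory so one statement cannot gobble up the whole host
impala-shell -q "SET MEM_LIMIT=10g; SELECT count(*) FROM my_db.my_table;"
# Missing stats often produce bad plans and high memory use
impala-shell -q "COMPUTE STATS my_db.my_table;"
```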
09-19-2019
05:50 PM
Not the best approach to getting rid of these messages, but it gave me what I wanted: I set the logging level to ERROR instead, so everything else is not printed.

tom@mds.xyz@cm-r01en01:~] 🙂 $ cat /etc/spark/conf/log4j.properties
log4j.rootLogger=${root.logger}
root.logger=ERROR,console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
shell.log.level=ERROR
log4j.logger.org.spark-project.jetty=WARN
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=ERROR
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=ERROR
log4j.logger.org.apache.parquet=ERROR
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
log4j.logger.org.apache.hadoop.hive.ql.exec.FunctionRegistry=ERROR
log4j.logger.org.apache.spark.repl.Main=${shell.log.level}
log4j.logger.org.apache.spark.api.python.PythonGatewayServer=${shell.log.level}
tom@mds.xyz@cm-r01en01:~] 🙂 $ digg /etc/spark/conf/log4j.properties /etc/spark/conf/log4j.properties-original
-sh: digg: command not found
tom@mds.xyz@cm-r01en01:~] 😞 $ diff /etc/spark/conf/log4j.properties /etc/spark/conf/log4j.properties-original
2c2
< root.logger=ERROR,console
---
> root.logger=DEBUG,console
10,11c10,11
< log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=ERROR
< log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=ERROR
---
> log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
> log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
tom@mds.xyz@cm-r01en01:~] 😞 $

Now I get my spark-shell without the INFO, DEBUG, or WARNING messages all over it. Still interested in a final solution if possible; I only see it fixed in Spark 3.0. Cheers, TK