Member since: 03-23-2015
Posts: 1288
Kudos Received: 114
Solutions: 98
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3349 | 06-11-2020 02:45 PM |
| | 5062 | 05-01-2020 12:23 AM |
| | 2859 | 04-21-2020 03:38 PM |
| | 3562 | 04-14-2020 12:26 AM |
| | 2360 | 02-27-2020 05:51 PM |
10-09-2019
05:32 PM
Yes, I didn't notice that it has a space, so I used trim() for it.
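For reference, a minimal sketch of the kind of fix described here, assuming a Hive table; the connection string, table, and column names are all hypothetical:

```
# trim() strips the stray leading/trailing spaces so the comparison matches
# (table and column names below are made up for illustration)
beeline -u jdbc:hive2://localhost:10000 \
  -e "SELECT * FROM my_db.my_table WHERE trim(my_col) = 'expected_value';"
```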
10-06-2019
03:32 PM
1 Kudo
@Mekaam, Glad that it helped. Cheers Eric
10-06-2019
03:13 PM
@gimp077, did you mean that "REFRESH" takes time and you eventually see the updated data, just with some delay? How big is the table, in terms of the number of partitions and the number of files in HDFS? Eric
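For anyone following along, one way to check those numbers in Impala; the database and table names are placeholders:

```
# Partition count and per-partition row/file/size stats
impala-shell -q "SHOW PARTITIONS my_db.my_table;"
impala-shell -q "SHOW TABLE STATS my_db.my_table;"
# REFRESH reloads file metadata; with many partitions and files it can take a while
impala-shell -q "REFRESH my_db.my_table;"
```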
09-25-2019
04:02 PM
@aohl, Thanks for sharing the details of how you resolved the issue. I am glad that it has been resolved, and I am sure others will benefit from your findings here. Cheers Eric
09-25-2019
03:49 AM
@yukti, I see that there is a new line after the UUID in the log; I am not sure if that is a formatting issue in the post or if it is actually there in the log. Can you please double-check, and also check the content of the UUID file? Cheers Eric
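One quick way to check for a stray trailing newline, assuming shell access to the host; the file path is a placeholder:

```
# -A makes non-printing characters visible; a newline shows up as '$'
cat -A /path/to/uuid_file
# A byte count larger than expected also hints at trailing whitespace
wc -c /path/to/uuid_file
```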
09-25-2019
03:44 AM
Hi Vijay, Sorry, I haven't been able to nail down the cause yet, but can you collect the EXPLAIN EXTENDED output for the query and attach it to the post?

EXPLAIN EXTENDED select * from tdb1.t1 where bs1_dt=2017-06-23;
EXPLAIN EXTENDED select * from tdb1.t1;

I would like to check how HS2 builds the query plan and see if there is any clue. Cheers Eric
09-23-2019
03:26 AM
@EricL I am facing the following problems when connecting to ports 10000 and 10001; the HiveServer2 logs are as follows.

!connect jdbc:hive2://localhost:10000 hive hive
Connecting to jdbc:hive2://localhost:10000
19/09/23 10:22:05 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10000
Could not open connection to the HS2 server. Please check the server URI and if the URI is correct, then ask the administrator to check the server status.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000: java.net.ConnectException: Connection refused (Connection refused) (state=08S01,code=0)

HiveServer2 log: it does not even hit HiveServer2, hence there are no logs.

beeline> !connect jdbc:hive2://localhost:10001 hive hive
Connecting to jdbc:hive2://localhost:10001
19/09/23 10:22:21 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10001
Unknown HS2 problem when communicating with Thrift server.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10001: Invalid status 72 (state=08S01,code=0)

HiveServer2 log:
2019-09-23T10:24:08,767 WARN [HiveServer2-HttpHandler-Pool: Thread-59] http.HttpParser: Illegal character 0x1 in state=START for buffer HeapByteBuffer@63bc8470[p=1,l=25,c=8192,r=24]={\x01<<<\x00\x00\x00\x05PLAIN\x05\x00\x00\x00\n\x00hive\x00hive>>>\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00...\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00}
2019-09-23T10:24:08,767 WARN [HiveServer2-HttpHandler-Pool: Thread-59] http.HttpParser: bad HTTP parsed: 400 Illegal character 0x1 for HttpChannelOverHttp@751d099b{r=0,c=false,a=IDLE,uri=null}
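Not a confirmed diagnosis, but "Invalid status 72" on the client together with "Illegal character 0x1" in the server's HTTP parser usually means a binary-mode Thrift client is hitting an HTTP-mode HS2 port. A sketch of an HTTP-mode connect string, assuming the default httpPath of cliservice:

```
beeline> !connect jdbc:hive2://localhost:10001/default;transportMode=http;httpPath=cliservice hive hive
```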
09-20-2019
10:11 AM
@hores wrote: "So it looks like column specific is only on a table without partitions (non-incremental)". That's incorrect: non-incremental COMPUTE STATS works on partitioned tables and is generally the preferred method for collecting stats on them. We've generally tried to steer people away from incremental stats because of the size issues on large tables. Column-specific incremental stats would also be error-prone to use correctly and complex to implement: what happens if you compute incremental stats with different subsets of the columns? You can end up with different column subsets on different partitions and then have to somehow reconcile it all each time.
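A sketch of the two flavors in Impala; the table name and partition spec are made up:

```
# Non-incremental COMPUTE STATS handles a partitioned table in one statement
impala-shell -q "COMPUTE STATS my_db.my_partitioned_table;"
# Incremental stats run per partition, but the stats metadata grows with
# partitions times columns, which is the size issue mentioned above
impala-shell -q "COMPUTE INCREMENTAL STATS my_db.my_partitioned_table PARTITION (ds='2019-09-20');"
```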
09-20-2019
10:04 AM
@Zane- I'm late, but I can provide some additional insight. I think the suggestion in the error message is a good one (I'm biased because I wrote it, but some thought went into it): "Memory is likely oversubscribed. Reducing query concurrency or configuring admission control may help avoid this error." The general solution is to set up admission control with memory limits so that memory doesn't get oversubscribed and one query can't gobble up more memory than you like. I did a talk at Strata that gave pointers on a lot of these things: https://conferences.oreilly.com/strata/strata-ca-2019/public/schedule/detail/73000 In this case you can actually see that query 2f4b5cff11212907:886aa1400000000 is using Total=78.60 GB of memory, so that's likely your problem. Impala's resource management is totally permissive out of the box and will happily let queries use up all the resources in the system like this. I didn't see which version you're running, but there were a lot of improvements in this area (config options, OOM avoidance, diagnostics) in CDH 6.1+. There are various other angles you can take to improve this: if the queries using lots of memory are suboptimal, tuning them (maybe just computing stats) makes a big difference. You can also ...
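A sketch of the per-query angle; the 10g value is illustrative, and real limits are better enforced through admission control pool settings:

```
# Cap a query's memory so one statement cannot gobble up the whole host
impala-shell -q "SET MEM_LIMIT=10g; SELECT count(*) FROM my_db.my_table;"
# Missing stats often produce bad plans and high memory use
impala-shell -q "COMPUTE STATS my_db.my_table;"
```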
09-19-2019
05:50 PM
Not the best approach to getting rid of these messages, but it gave me what I wanted: I set the logging level to ERROR instead, so everything else is not printed.

tom@mds.xyz@cm-r01en01:~] 🙂 $ cat /etc/spark/conf/log4j.properties
log4j.rootLogger=${root.logger}
root.logger=ERROR,console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
shell.log.level=ERROR
log4j.logger.org.spark-project.jetty=WARN
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=ERROR
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=ERROR
log4j.logger.org.apache.parquet=ERROR
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
log4j.logger.org.apache.hadoop.hive.ql.exec.FunctionRegistry=ERROR
log4j.logger.org.apache.spark.repl.Main=${shell.log.level}
log4j.logger.org.apache.spark.api.python.PythonGatewayServer=${shell.log.level}
tom@mds.xyz@cm-r01en01:~] 🙂 $ digg /etc/spark/conf/log4j.properties /etc/spark/conf/log4j.properties-original
-sh: digg: command not found
tom@mds.xyz@cm-r01en01:~] 😞 $ diff /etc/spark/conf/log4j.properties /etc/spark/conf/log4j.properties-original
2c2
< root.logger=ERROR,console
---
> root.logger=DEBUG,console
10,11c10,11
< log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=ERROR
< log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=ERROR
---
> log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
> log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
tom@mds.xyz@cm-r01en01:~] 😞 $

Now I get my spark-shell without the INFO, DEBUG, or WARNING messages all over it. Still interested in a final solution if possible; I only see it fixed in Spark 3.0. Cheers, TK