Member since: 08-16-2016
Posts: 642
Kudos Received: 131
Solutions: 68

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 3976 | 10-13-2017 09:42 PM |
|  | 7469 | 09-14-2017 11:15 AM |
|  | 3796 | 09-13-2017 10:35 PM |
|  | 6028 | 09-13-2017 10:25 PM |
|  | 6598 | 09-13-2017 10:05 PM |
09-20-2017
03:12 AM
Thank you very much. You're right. And this turned out to mean I had to rebuild the whole cluster from scratch.
09-19-2017
08:12 AM
Hi Tomas79,

First, thanks for your contributions to this thread as well as your suggestions! We do steer experienced users toward configuration files and the bootstrap-remote CLI command rather than the UI, because the UI would get complex if we tried to add a checkbox, field, or form for every Director feature. The Director server does have a full set of API endpoints that you can use to make updates to clusters that aren't easy or possible to do through the UI, so I recommend taking a look there. If you go to the /api-console URL for Director, there's an interactive facility for learning about the API and trying it out live. For example, there is an API endpoint for importing a configuration file directly into the server. It's documented here: https://www.cloudera.com/documentation/director/latest/topics/director_cluster_config.html#concept_lqt_2y1_x1b

We don't have a corresponding configuration file export API endpoint yet, but you are not the first to suggest it, so be assured that it's on our wish list. In the meantime, the API can help if you're willing to work with it.

Director's single log is tough to navigate. Recent Director versions have added more context to log lines, which makes it feasible to filter out the relevant ones. We've got some techniques documented here: https://www.cloudera.com/documentation/director/latest/topics/director_troubleshoot.html But I see room for more documentation. Specifically, at least as of 2.4:

- Each line includes a thread ID in square brackets. The ones starting with "p-" are for pipelines, Director's internal workflows, so you can follow one of those among all the other pipelines and asynchronous tasks within Director.
- The fields following the thread ID are the unique (API) request ID, request method, and request URI that ultimately caused the activity being logged.
You can work with the logback.xml file for the server to change the formatting, and perhaps even route logging to multiple files for easier comprehension (another ask that we've heard). Again, thanks for your feedback!
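The thread-ID filtering described above can be sketched in plain Java. This is only an illustration: the class, method names, and the sample log lines below are mine, and the log format (a bracketed thread ID such as `[p-...]`) is an assumption based on the 2.4 description above.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class PipelineLogFilter {
    // Matches a bracketed thread ID such as "[p-alpha]"; pipeline threads start with "p-".
    private static final Pattern THREAD_ID = Pattern.compile("\\[(p-[\\w-]+)\\]");

    /** Keeps only the lines logged by the given pipeline thread. */
    static List<String> linesForPipeline(List<String> logLines, String pipelineId) {
        return logLines.stream()
                .filter(line -> {
                    Matcher m = THREAD_ID.matcher(line);
                    return m.find() && m.group(1).equals(pipelineId);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Invented sample lines, loosely following the format described above.
        List<String> sample = List.of(
                "2017-09-19 08:12:01 INFO  [p-alpha] req-1 GET ... pipeline step started",
                "2017-09-19 08:12:02 INFO  [qtp-12] serving static asset",
                "2017-09-19 08:12:03 INFO  [p-beta] req-2 POST ... bootstrap started",
                "2017-09-19 08:12:04 INFO  [p-alpha] req-1 pipeline step finished");
        linesForPipeline(sample, "p-alpha").forEach(System.out::println);
    }
}
```

The same filtering can of course be done with `grep '\[p-<id>\]'` on the log file directly.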
09-14-2017
11:49 PM
Thank you for your answers. I will install the license and report back on how it went. Edit: As everything was already configured, the only remaining step concerned the ReportManager role brought in by the Enterprise version: I was asked for the DB credentials for this particular role and where to install it. Then, after a few restarts, the Enterprise version was up and running.
09-14-2017
11:17 PM
Sorry for coming back so late. Yes, the CM API did the job. Thanks anyway.
09-14-2017
12:21 PM
Perfect answer, thanks! After increasing "Java Heap Size of HiveServer2 in Bytes" to 1 GiB, the query runs fine at 10,000 and 20,000 row limits, with no Hive process crash so far. It was previously at 50 MiB, and when the query ran it used 250 MiB of Java heap. However, we still have some more heap settings, listed below, that I am not sure whether to bump up or not. I will look for a tuning guide for CDH. In my opinion, the default values should not be set so low by Cloudera; none of the mainstream databases like Oracle or SQL Server would ever crash the server process due to a wayward SQL query.

| Setting | Property | Role Group | Current Value |
|---|---|---|---|
| Spark Executor Maximum Java Heap Size | spark.executor.memory | HiveServer2 Default Group | 256 MiB |
| Spark Driver Maximum Java Heap Size | spark.driver.memory | HiveServer2 Default Group | 256 MiB |
| Client Java Heap Size in Bytes | | Gateway Default Group | 2 GiB |
| Java Heap Size of Hive Metastore Server in Bytes | | Hive Metastore Server Default Group | 50 MiB |
| Java Heap Size of HiveServer2 in Bytes | | HiveServer2 Default Group | 1 GiB |
| Java Heap Size of WebHCat Server in Bytes | | WebHCat Server Default Group | 50 MiB |
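Since several of these settings are named "in Bytes" but displayed in MiB/GiB, here is a quick sketch of the conversions behind the values above (the class and helper names are mine):

```java
public class HeapSizes {
    // 1 MiB = 1024 * 1024 bytes
    static long mib(long n) { return n * 1024L * 1024L; }

    // 1 GiB = 1024 MiB
    static long gib(long n) { return n * 1024L * 1024L * 1024L; }

    public static void main(String[] args) {
        System.out.println("50 MiB  = " + mib(50) + " bytes");  // the old HiveServer2 default
        System.out.println("250 MiB = " + mib(250) + " bytes"); // observed heap use during the query
        System.out.println("1 GiB   = " + gib(1) + " bytes");   // the new HiveServer2 setting
    }
}
```

This makes it easy to see why 50 MiB was far too small for a query that needed roughly 250 MiB of heap.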
09-01-2017
03:30 AM
I am having the same problem. @mbigelow, can you kindly provide some guidance on how to initialize a HiveContext properly in an IDE like IntelliJ or Eclipse?
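For reference, a minimal Spark 1.x setup for running inside an IDE might look like the sketch below. This is an assumption-laden example, not a confirmed answer from the thread: it presumes the spark-core and spark-hive dependencies (matching your CDH version) are on the project classpath, and that a hive-site.xml is on the classpath so the context points at your real metastore (without it, Spark falls back to a local Derby metastore).

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.hive.HiveContext;

public class HiveContextExample {
    public static void main(String[] args) {
        // local[*] lets the job run inside the IDE without a cluster
        SparkConf conf = new SparkConf()
                .setAppName("hive-context-example")
                .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // HiveContext wraps the underlying SparkContext and reads
        // hive-site.xml from the classpath if present
        HiveContext hiveContext = new HiveContext(sc.sc());
        hiveContext.sql("SHOW TABLES").show();

        sc.stop();
    }
}
```

In Spark 2.x the equivalent would be `SparkSession.builder().enableHiveSupport()`, but the snippet above matches the Spark 1.x API in use elsewhere in this thread.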
08-17-2017
10:44 AM
I added the following:

```java
UserGroupInformation.setConfiguration(conf);
UserGroupInformation.loginUserFromKeytab("myId@OurCompany.ORG", "/myPathtoMyKeyTab/my.keytab");
```

I was able to connect and get a list of the files in the HDFS directory; however, the write operation failed with the following exception:

```
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:197)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2270)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1701)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1620)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:772)
17/08/17 13:31:49 WARN hdfs.DFSClient: Abandoning BP-2081783877-10.91.61.102-1496699348717:blk_1074056717_315940
17/08/17 13:31:49 WARN hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[10.91.61.106:50010,DS-caf46aea-ebbb-4d8b-8ded-2e476bb0acee,DISK]
```

Any ideas? Pointers and help are appreciated.
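A "connection reset by peer" on the write path, after listing succeeded, often means the client can reach the NameNode but not the DataNode's data-transfer port (50010 in the log above). A stdlib-only way to check reachability from the client machine; the class and helper names are mine:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // DataNode address taken from the "Excluding datanode" log line above
        System.out.println(canConnect("10.91.61.106", 50010, 3000));
    }
}
```

If the port turns out to be reachable only by hostname (for example, the client cannot route to the DataNodes' internal IPs), a commonly used client-side workaround is setting `dfs.client.use.datanode.hostname=true` in the client's Hadoop configuration, though whether it applies here depends on the network layout.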
08-17-2017
10:41 AM
I figured out the issue. The difference comes from /tmp/logs. It is odd that `hdfs dfs -du -h -s /` does not account for /tmp/logs.
08-17-2017
03:33 AM
I've reinstalled Hive and it's solved! Thanks