Member since: 08-16-2016
Posts: 642
Kudos Received: 131
Solutions: 68

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3978 | 10-13-2017 09:42 PM |
| | 7477 | 09-14-2017 11:15 AM |
| | 3804 | 09-13-2017 10:35 PM |
| | 6047 | 09-13-2017 10:25 PM |
| | 6607 | 09-13-2017 10:05 PM |
02-21-2017
10:16 PM
OK, I had to get my MySQL DBA hat back out. InnoDB's max key length is 767 bytes and MyISAM's is 1000. Latin1 uses one byte per character, while utf8 requires additional bytes per character. So the key it is trying to add to this table is, at a minimum, 1151 bytes. I just checked my CDH 5.8.2 metastore and see the same index and the same column sizes, so I have no idea why it is an issue for you. Can you try upgrading to a lower CDH version first?
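To put rough numbers on it, here is a quick sketch of the byte math; the column widths below are assumed for illustration only, not taken from the actual metastore DDL:

```python
# InnoDB caps index keys at 767 bytes, MyISAM at 1000.
INNODB_MAX_KEY_BYTES = 767
MYISAM_MAX_KEY_BYTES = 1000

# latin1 stores one byte per character; MySQL's "utf8" reserves up to three.
LATIN1_BYTES_PER_CHAR = 1
UTF8_BYTES_PER_CHAR = 3

# Assumed widths of the VARCHAR columns in the composite index (1151 chars total).
index_column_chars = [128, 128, 128, 767]

for charset, bytes_per_char in [("latin1", LATIN1_BYTES_PER_CHAR), ("utf8", UTF8_BYTES_PER_CHAR)]:
    key_bytes = sum(index_column_chars) * bytes_per_char
    over = " (over the InnoDB limit)" if key_bytes > INNODB_MAX_KEY_BYTES else ""
    print(charset, key_bytes, "bytes" + over)
```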
02-21-2017
02:56 PM
Do you have bigtop-detect-javahome at either location? The error in the one image says it isn't there, which may indicate an issue with your parcels or packages install (the paths below are for parcels).
/opt/cloudera/parcels/CDH/lib/bigtop-utils/bigtop-detect-javahome
/opt/cloudera/parcels/CDH/bin/bigtop-detect-javahome
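If it helps, here is a minimal sketch that checks both parcel locations (a plain ls would do the same; the paths are just the ones above):

```python
import os

# Expected parcel locations for bigtop-detect-javahome (copied from the paths above).
candidates = [
    "/opt/cloudera/parcels/CDH/lib/bigtop-utils/bigtop-detect-javahome",
    "/opt/cloudera/parcels/CDH/bin/bigtop-detect-javahome",
]

for path in candidates:
    print(path, "->", "found" if os.path.isfile(path) else "MISSING")
```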
02-21-2017
02:02 PM
Sorry, wrong setting: yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds
02-21-2017
01:50 PM
Yes. That is pretty frequent though, so I don't know how it will go. I'd be interested to know.
02-21-2017
01:21 PM
This got lost in my earlier reply... yarn.log-aggregation.retain-check-interval-seconds determines when it checks whether logs need to be aggregated. By default it is 0, which means it doesn't check and a job must finish first. Setting it will allow it to collect the logs for jobs that, in theory, won't end.
02-21-2017
12:23 PM
What is your CDH version?
02-21-2017
12:18 PM
I think I tried that too, but it doesn't work. You need to set it in the spark-opts (where you should have your executor and driver memory set), like '--files hdfs:///user/hue/oozie/workspaces/hue-oozie-1463575878.15/hive-site.xml'
02-21-2017
11:54 AM
Where is your hive-site.xml located? This exception indicates that it isn't available to the job, so it is launching the default embedded Derby HMS database.
02-21-2017
11:46 AM
2 Kudos
In my opinion, the issue at hand is that the hive-site.xml is not passed properly, and therefore it defaults to using the embedded Derby database. This "fix" is just allowing the Spark job to use an embedded Derby HMS instead of your actual HMS. Have you checked that it is properly creating tables or other metadata in your actual HMS?
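One rough way to check is sketched below; 'my_spark_table' is a placeholder for whatever table the job should have created, and it assumes the hive CLI on that host points at your real HMS:

```python
import subprocess

# Placeholder name of a table the Spark job should have created.
table = "my_spark_table"

# Ask Hive, which talks to the actual HMS, whether the table is registered there.
out = subprocess.check_output(["hive", "-e", "SHOW TABLES LIKE '%s';" % table]).decode()

if table in out:
    print("Table exists in the real HMS")
else:
    print("Table not found in the real HMS -- the job likely wrote to an embedded Derby metastore")
```

Another telltale sign of the embedded Derby path is usually a metastore_db directory and a derby.log file showing up in the job's working directory.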
02-21-2017
11:32 AM
You could do this in many ways. You could just load it into Solr/ES and go to town. Hive would not be a great fit, but I could see some tables being built around specific data like job counters or metrics. MR jobs could be built to pull out specific data (possibly to load into a Hive table), or Spark jobs (and the Spark shell can be used to explore the raw data). And simple tools like grep, awk, etc. can be used, since the individual logs, when aggregated, are available to the user. If you have CM, the YARN application screen for a cluster is, I'm pretty sure, built using an embedded Solr and gives you an idea of what could be done; again, this is more around metrics and job counters.
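As a rough sketch of the grep-style approach (the application id and the patterns are just placeholders; it assumes the app has finished and log aggregation is enabled):

```python
import re
import subprocess

# Placeholder application id -- substitute a real one from the RM or CM UI.
app_id = "application_1487700000000_0001"

# "yarn logs -applicationId <id>" prints the aggregated logs for a finished app.
logs = subprocess.check_output(["yarn", "logs", "-applicationId", app_id]).decode("utf-8", "replace")

# Pull out lines that look like job counters or metrics (patterns are just examples).
for line in logs.splitlines():
    if re.search(r"Counters|bytes read|bytes written", line, re.IGNORECASE):
        print(line)
```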