Member since: 08-16-2016
Posts: 642
Kudos Received: 130
Solutions: 68
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2736 | 10-13-2017 09:42 PM
 | 4422 | 09-14-2017 11:15 AM
 | 2424 | 09-13-2017 10:35 PM
 | 3741 | 09-13-2017 10:25 PM
 | 4110 | 09-13-2017 10:05 PM
02-03-2017
11:49 AM
Either Oozie or cron should work. As for how to tell it the environment, my first inclination is to look for something at the OS level that you can check, like the hostname. If the hostname contains something that identifies the environment, read it, parse it, and branch on the result. A rough sketch is below.
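A minimal sketch, assuming the hostnames embed an environment token such as dev, qa, or prod (the tokens below are placeholders for whatever your naming convention uses):

```bash
#!/bin/bash
# Detect the environment from the short hostname (e.g. etl-prod-01).
case "$(hostname -s)" in
  *dev*)  ENV="dev"  ;;
  *qa*)   ENV="qa"   ;;
  *prod*) ENV="prod" ;;
  *)      echo "Cannot determine environment from $(hostname -s)" >&2; exit 1 ;;
esac
echo "Detected environment: $ENV"
```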
01-31-2017
05:42 PM
I have been able to use multiple aliases with a single host in CDH, but not multiple hostnames. What did your hosts file look like? It is probably related to how Hadoop does hostname lookups: it uses specific Java methods that aren't identical to the hostname command, at least not across all platforms.
01-30-2017
12:07 AM
I'm not terribly familiar with Oozie, but I believe the launcher is separate from the actual job. Also, from the log "-Xmx4096m -Xmx4608m", it is launching with a 4 GB container size and the heap is set to 3 GB. Is that set in the Oozie job settings?
01-29-2017
10:58 PM
Ok, I added you as a friend and updated my privacy settings to display my email to friends. Let me know if you can't get to it. Do you want me to review it and make sure it is kosher?
01-29-2017
09:31 PM
It will work. It will diminish network throughput and could impact cluster performance if the typical workload is network I/O bound. In my experience, with predominantly 10 GbE networks, I have not been bound by the network while running at the default MTU of 1500.
01-29-2017
09:22 PM
The properties file is just a list of key/value pairs that then get applied in the workflow.xml. To achieve what you want, you will need to write a script (Bash, Python, Perl, etc.) that detects the environment and then updates the nameNode and jobTracker values (those are the two big ones I recall that are cluster specific). See the sketch below.
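A rough sketch of that script, assuming hostname-based detection; the hostnames, ports, and the job.properties path are placeholders for your own values:

```bash
#!/bin/bash
# Rewrite the cluster-specific entries in the Oozie job.properties
# based on which environment this node belongs to.
case "$(hostname -s)" in
  *prod*) NN="hdfs://prod-nn.example.com:8020"; JT="prod-rm.example.com:8032" ;;
  *)      NN="hdfs://dev-nn.example.com:8020";  JT="dev-rm.example.com:8032"  ;;
esac

sed -i "s|^nameNode=.*|nameNode=${NN}|"     job.properties
sed -i "s|^jobTracker=.*|jobTracker=${JT}|" job.properties
```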
01-29-2017
09:16 PM
Track down container container_e29_1484466365663_87038_02_000001. It is most likely a reducer. I say that since you said both the Map and AM container sizes were set to 2 GB, so the Reduce container size must be 3 GB. Well, in theory the user launching it could have overridden any of them. What is the value of mapreduce.reduce.memory.mb? Let's try another route as well: in the RM UI, for the job in question, does it have any failed maps or reducers? If yes, drill down to the failed one and view the logs. If not, then the AM container OOM'd. From my recollection, though, that is the line the AM logs about one of the containers it is responsible for. Anyway, the short of it is that either the Reduce container size is 3 GB or the user set their own value to 3 GB, as the values in the cluster configs are only the defaults.
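If log aggregation is enabled, a sketch of how you could pull that container's logs from the command line (the application ID is derived from the container name; on some Hadoop versions yarn logs also needs -nodeAddress when -containerId is given):

```bash
# Fetch the aggregated logs for the suspect container.
yarn logs -applicationId application_1484466365663_87038 \
          -containerId container_e29_1484466365663_87038_02_000001

# Check the cluster default for the reduce container size; the job itself
# may still override it. Path assumes the usual CDH client config location.
grep -A1 mapreduce.reduce.memory.mb /etc/hadoop/conf/mapred-site.xml
```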
01-29-2017
09:05 PM
Does this MR job access HBase at all? This error indicates that the region trade_all was not accessible. Any errors on the HBase RegionServers? Check the HBase Master UI to see which RegionServers are serving this region and its splits.
01-29-2017
08:58 PM
What is default_realm set to in your krb5.conf? What is in the new keytab? Check with klist -kt dreeves_local.keytab, and try kinit as follows: kinit -C username, or kinit -C username@domain.

From the Cloudera doc you linked: "Make sure all hosts in the cluster have a Linux user account with the same name as the first component of that user's principal name. For example, the Linux account joe should exist on every box if the user's principal name is joe@YOUR-REALM.COM. You can use LDAP for this step if it is available in your organization."

So if your principal and AD account is dreeves@REALM.COM, you need to ensure that the account dreeves exists on all nodes, either as a local Linux account or through LDAP integration. This account also needs to have the same UID/GID across all nodes. You have this in place now for dreeves, so we just need to work out authenticating with kinit and you should be good. The output of the above commands should help. The expectation is that one of the kinit commands works; I suspect that an invalid principal, or no principals at all, were added to the keytab file. You could also take the original keytab that works and try it under the dreeves account you made.
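The checks laid out as commands, assuming MIT Kerberos client tools; the principal below is an example, substitute your own:

```bash
# List the principals stored in the keytab.
klist -kt dreeves_local.keytab

# Authenticate with the keytab; -C asks the KDC to canonicalize the principal.
kinit -C -kt dreeves_local.keytab dreeves@REALM.COM

# Or authenticate interactively with a password.
kinit -C dreeves@REALM.COM

# Confirm a ticket was obtained.
klist
```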
01-29-2017
04:41 PM
It is a common misconception. Unless you configure Hadoop to use LDAP to look up users, it will use the default shell-based lookup. So you still need local users or LDAP integration for RHEL. It looks like you have that partially in place: UIDs and GIDs are in place in AD. I don't know how you did it, but you need to see how you can present the account without the domain name.
01-29-2017
04:30 PM
Go to the node where the mapper or reducer failed and run 'id dreeves'. This needs to return a user; if it does not, the worker is not able to operate as that user. I don't know exactly why the other commands worked. Did the correct ownership get applied to the file, or does it just show a UID and GID? I have seen that happen when a user is present on the client, edge node, and NameNode but the accounts differ.
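A quick way to run that check across every worker at once, assuming passwordless SSH; the hostnames are placeholders:

```bash
# Confirm the account resolves to the same UID/GID on every node.
for host in worker01 worker02 worker03; do
  echo "== $host =="
  ssh "$host" 'id dreeves'
done
```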
01-28-2017
12:35 PM
It is hitting a timeout while monitoring the status. This is probably similar to the 10-minute timeout for MR, where a task fails if it doesn't provide a progress update for 10 minutes; I haven't confirmed this for Spark. The bigger question is why it is not getting a status update. What does the RM UI show for where the AM is running compared to where the tasks are running? I see that you are using the Hive CLI. Can you try from Beeline or another JDBC connection?
01-28-2017
12:21 PM
I don't know of any way. Hadoop in general doesn't care how long a job takes; it is more concerned with auto-recovery of the platform so that jobs can finish no matter what. You can limit the number of queries or jobs by user or group, and you can limit the resources available to users or groups, but I just don't think there is a way to automatically kill jobs or queries running longer than X. I know other products, like Pepperdata, can track and alert you, but that still requires manual intervention. Can we step back so you can explain what your issue is with long-running jobs? Maybe the root cause can be addressed so jobs don't run for so long or hold back others.
01-28-2017
12:16 PM
Can you post the container logs for one of the containers that was killed? In the RM UI, drill down through the job until you get the list of mappers/reducers that succeeded or failed. Click through to a failed task and then open the logs. You should find an exception in it giving the reason. The code mentioned usually does indicate a heap issue, but I have seen it reported for other reasons a container was killed, such as when preemption strikes.
01-27-2017
01:23 AM
Unfortunately, there isn't an option to use local usernames on their own. The closest would be PAM. http://www.cloudera.com/documentation/enterprise/5-8-x/topics/cdh_ig_hue_config.html#topic_15_6_1
01-26-2017
11:30 PM
1 Kudo
Yes: create a local account and group on all nodes, create an HDFS user directory, and assign permissions. HDFS supports POSIX-like permissions, and you can enable HDFS ACLs as well for more control. Hive and Impala will recognize and use these. The caveat is that without Kerberos or LDAP authentication, enforcement is weak; it is really easy to spoof another account to get around the ACLs. Note: HBase does not support any form of authorization without Kerberos. A sketch of the steps is below.
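A minimal sketch of those steps; the user, group, and paths (alice, analysts, /data/shared) are example names:

```bash
# Create the account and group on every node (or via your config management tool).
groupadd analysts
useradd -g analysts alice

# Create the HDFS home directory and hand it to the user
# (run as the hdfs user or another HDFS superuser).
hdfs dfs -mkdir -p /user/alice
hdfs dfs -chown alice:analysts /user/alice
hdfs dfs -chmod 750 /user/alice

# Optionally add a finer-grained HDFS ACL
# (requires dfs.namenode.acls.enabled=true).
hdfs dfs -setfacl -m group:analysts:r-x /data/shared
```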
01-26-2017
11:01 PM
If you need concurrent users, an external DB is the way to go. It also eases management, since otherwise the different services each use their own embedded DBs. On the DB HA front, CDH doesn't presume which DB you choose and therefore doesn't cover DB HA itself; consult the HA docs for the DB you choose.
01-26-2017
10:37 PM
Blast it all, my other response didn't make it. The short of it is that Hadoop, by default and even with Kerberos, uses a shell-based group mapper. This means it still does a group and user lookup against the OS regardless of the auth mechanism. There is an LDAP group mapping for Hadoop, but I, and Cloudera, do not recommend it. Either integrate LDAP at the OS level or manage the accounts manually.
01-26-2017
09:29 PM
1 Kudo
1. Yes, it could. I personally don't like that threshold; it is not a great indicator of a small-file problem.
2. The number reported by the DN is for all the replicas. It could mean a lot of small files or just a lot of data. At the defaults it could mean the DN heap needs a boost, although I always end up bumping it sooner.
3. Yes.
4. Yes. Each file takes up one or more blocks, and the NN has to track each block and its replicas in its memory, so a lot of small files can chew through the NN heap quickly. The DN heap is less concerned with the metadata associated with a block than with the blocks being read, written, or replicated.
5. I'd worry less about the block count and more about the heap.
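If you want a rough read on whether small files are the issue, the fsck summary is an easy place to start; this is just a sketch, and fsck can be expensive, so point it at a subtree rather than / on a large namespace:

```bash
# The summary at the end reports total files, total blocks, and average
# block size; a very small average block size suggests a small-file problem.
hdfs fsck /some/path | tail -n 20
```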
01-26-2017
09:16 PM
The command will use the instance profile of the node it is launched from. So if you want to grant access selectively rather than to everyone on that node, you need to specify the key in the S3 URI.
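As an alternative to embedding the keys in the URI, you can pass the S3A credential properties on the command line; a sketch with placeholder values (the property names assume the s3a connector, which depends on your Hadoop version):

```bash
# Generic -D options must come before the subcommand arguments.
hadoop fs \
  -D fs.s3a.access.key=YOUR_ACCESS_KEY \
  -D fs.s3a.secret.key=YOUR_SECRET_KEY \
  -ls s3a://your-bucket/path/
```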
01-26-2017
09:04 PM
The users and groups need to be available on the local OS of all nodes. This can be done through LDAP integration or by managing them manually.
01-26-2017
04:42 PM
Can you share anything else, like the krb5.conf or kdc.conf? What do you have set for the trusted realm? The hostnames seem to be in the realm hadoop.com.sg. I want to say that the cross-realm krbtgt principal should be krbtgt/HADOOP.COM.SG@HADOOP.COM. Also try adding -Djava.security.debug=gssloginconfig,configfile,configparser,logincontext to the command and posting the output. It will be a lot, but it should help nail down where it is going wrong.
01-26-2017
04:25 PM
Later versions of Hue, I think 3.7+, are case insensitive for login purposes. Linux is case sensitive though; john and John are different identities. What is your auth backend for Hue? I have not found any value in preserving case; it was better all around to force lowercase names using the rule mapping:
RULE:[1:$1]/L
RULE:[2:$1]/L
01-26-2017
04:12 PM
You need to create them; none exist by default. You could replicate to a different folder on the same cluster, but mostly it is used to replicate data and metadata to another cluster.
01-26-2017
04:11 PM
I have seen this when reinstalling the CM Agent without dropping and removing the agent from the cluster and CM. The reason this happens is that each agent generates a GUID to represent the host and agent, and it is unique to each install. CM views the agents still registered to it as the old ones, not the new ones, even though it is the same host, and that prevents the new ones from registering with CM. In the CM hosts page, stop any existing roles, delete them, then delete the agent. The new Host wizard should then show the agents reporting in correctly as unmanaged.
01-19-2017
04:55 PM
Yes, check there. I don't know the Hive source code, but I do know that HDFS still does a username/group lookup against the OS.
01-19-2017
09:12 AM
1 Kudo
Ah, that will do it, as all new tables inherit the DB path unless it is specified in the CREATE TABLE statement. There is no way to alter it through Hive/Impala; you will need to log into the metastore DB and change it there. You can find it in <metastore_db_name>.DBS, and I believe the column is just called LOCATION. Find the id for the default DB and run something like 'update DBS set LOCATION = 'hdfs://NN_URI:8020/user/hive/warehouse' where id = <default_db_id>;'. See the sketch below.
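A sketch of the full sequence, assuming a MySQL-backed metastore named metastore; verify the column names first, since in the metastore schemas I have seen the location column is DB_LOCATION_URI rather than LOCATION:

```bash
# Inspect the DBS table and find the row for the default database.
mysql -u hive -p metastore -e "DESCRIBE DBS;"
mysql -u hive -p metastore -e "SELECT DB_ID, NAME, DB_LOCATION_URI FROM DBS;"

# Back up the metastore DB first, then point the default database at the new URI.
mysql -u hive -p metastore -e \
  "UPDATE DBS SET DB_LOCATION_URI='hdfs://NN_URI:8020/user/hive/warehouse' WHERE NAME='default';"
```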
01-19-2017
09:08 AM
Does the user 'administrator' exist on the HS2 node, and preferably on the rest of the nodes? Does the user have an HDFS user directory, /user/administrator, with full access to it? These are what users need in order to access the cluster and run jobs, regardless of the means of authentication.
01-19-2017
08:47 AM
This is going to be rough. You could manually copy the data from the CM server over to each node, or you could deploy a new cluster to those same nodes. I have a feeling that either way the old configs will no longer be present. Before doing anything, I would take a backup of the cluster configuration using the CM API; then you can try to restore the configs from it if you end up with a new cluster with default configs. https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cm_intro_api.html
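A sketch of the export call; the hostname, credentials, and API version are placeholders, and you can check which API version your CM supports at http://cm-host:7180/api/version:

```bash
# Export the full CM deployment (clusters, services, roles, and configs) to JSON.
curl -u admin:admin \
  "http://cm-host.example.com:7180/api/v12/cm/deployment" \
  -o cm-deployment-backup.json
```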
01-19-2017
08:42 AM
This may be a silly question, but does the test table exist prior to running the CTAS statement?