Member since: 02-15-2016
Posts: 33
Kudos Received: 6
Solutions: 3
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4600 | 01-18-2018 01:39 PM |
| | 3870 | 07-06-2017 09:57 AM |
| | 5302 | 05-24-2017 01:31 PM |
01-25-2018
11:10 AM
I am glad it's showing the increased values now. The following link might help, if you haven't already seen it: https://www.cloudera.com/documentation/enterprise/5-12-x/topics/admin_nn_memory_config.html
01-23-2018
07:58 AM
Services won't expire; the license does. Most of these services are open source, but the Cloudera Management Service is not. This holds true if you are using their EDH edition of the Hadoop distro; the Cloudera Express edition is free, I think. As far as I know, any node running services other than Gateway roles, Flume roles, and CM needs to be licensed, and if those roles are colocated on a machine with other services, that machine also needs to be licensed. I would suggest reaching out to your account rep for further details.
01-18-2018
01:39 PM
As far as I understand, block capacity means the total number of blocks HDFS can hold, irrespective of their size. For example, a 128 MB file consumes 1 HDFS block (assuming the HDFS block size is set to 128 MB) from a DataNode perspective, but on the NameNode it needs 2 namespace objects (one for the file inode and one for the block). Since all of that is stored in memory, the block capacity should increase after increasing the NameNode heap size. Yes, you will have to restart HDFS and dependent services to see the increased capacity. However, it might take some time for it to reflect...
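For a rough sense of scale, here is a back-of-envelope sketch. The ~1 GB of NameNode heap per million namespace objects figure is the commonly cited rule of thumb from Cloudera's NameNode memory sizing docs; the file count, heap values, and hadoop-env.sh approach below are purely illustrative assumptions, not your cluster's numbers:

```bash
# Back-of-envelope sizing (illustrative numbers only):
#   10 million files, one 128 MB block each
#   => ~20 million namespace objects (1 file inode + 1 block per file)
#   => ~20 GB of NameNode heap at the commonly cited ~1 GB per million objects.
#
# On a hand-configured cluster the heap is raised in hadoop-env.sh:
export HADOOP_NAMENODE_OPTS="-Xms20g -Xmx20g ${HADOOP_NAMENODE_OPTS}"

# On a Cloudera Manager cluster, change the NameNode Java heap size in the
# HDFS service configuration instead, then restart HDFS and dependent services.
```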
01-18-2018
12:57 PM
Cloudera support will definitely expire. As for the cluster, I think most of the cluster services will keep working, except the licensed pieces, and you would be legally non-compliant with the licensing terms.
01-18-2018
12:47 PM
If you run the Host Inspector, it will show you a detailed report of everything it found on that host as well as all the other hosts, and from that report you can figure out what's wrong. The other option could be to restart the agent on that particular node.
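If the agent itself looks unhealthy, restarting it on the affected host is usually enough. A minimal sketch, assuming a standard Cloudera Manager agent install (service name and log path may differ on your OS):

```bash
# On the problem host: check the Cloudera Manager agent, then restart it.
sudo service cloudera-scm-agent status
sudo service cloudera-scm-agent restart

# If the host still reports problems, the agent log is the next place to look.
sudo tail -n 100 /var/log/cloudera-scm-agent/cloudera-scm-agent.log
```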
11-20-2017
09:34 AM
1 Kudo
Hi Everyone, I have a requirement to do full table loads for ~60 tables from an Oracle database, and I have a shell script that runs sqoop on each of those tables. It takes a long time to load all of them because some are huge, so I started tuning the sqoop job for each table. While doing that, I stumbled upon the "--fetch-size" option and have some questions about it:
- Does it change "oracle.row.fetch.size" for the JDBC connection?
- Is there a maximum limit for this parameter?
- Does it impact the source DB or the Hadoop-side resources?
- Are there any guidelines for finding an optimum value for this parameter?
Thanks & Regards, Mohit Garg
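P.S. To make the question concrete, this is where --fetch-size goes in the import command. Everything below (connection string, credentials, table name, mapper count, and the 10000 value itself) is a placeholder, not a recommendation:

```bash
# Illustrative full-table import; all values are placeholders.
# --fetch-size is documented as the number of entries to read from the database at once.
sqoop import \
  --connect "jdbc:oracle:thin:@//oracle-host:1521/ORCL" \
  --username "${ORA_USER}" \
  --password-file "/user/${USER}/.ora_password" \
  --table MY_SCHEMA.MY_TABLE \
  --fetch-size 10000 \
  --num-mappers 4 \
  --target-dir "/data/staging/MY_TABLE" \
  --delete-target-dir
```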
Labels:
- Apache Sqoop
07-20-2017
10:21 AM
Thanks Tristan! I had found that mistake and corrected it. Thanks for your response. Regards, MG
07-06-2017
09:57 AM
1 Kudo
Hi Everyone, Not sure if anyone else has faced this issue, but after much research I was able to connect to Kerberized Hive successfully. I appended "-Djavax.security.auth.useSubjectCredsOnly=false" to the .jinit() call:

.jinit(classpath=cp, parameters="-Djavax.security.auth.useSubjectCredsOnly=false")

Basically, it removes the requirement that the GSS mechanism obtain its credentials from an existing Subject and allows it to use the specified authentication mechanism, which in this case is Kerberos.
06-30-2017
02:55 PM
Here's an update: I was able to fix the initial issue by adding all the jars in the /opt/cloudera/parcels/CDH/... directory. However, now it's failing with a "Kerberos TGT not found" error, although I am doing kinit before connecting. Is there something I am missing?

"javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]"
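For reference, the check I run before starting the R session is roughly the following; the principal and keytab path are placeholders:

```bash
# Confirm a valid ticket exists in the cache the R/JVM process will read.
kinit -kt /home/cdsw/my_user.keytab my_user@EXAMPLE.COM   # placeholder principal/keytab
klist                                                     # should show a valid krbtgt ticket
echo "${KRB5CCNAME:-/tmp/krb5cc_$(id -u)}"                # the credential cache the JVM needs to see
```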
06-30-2017
01:41 PM
I am trying to access Hive tables from R using JDBC, but it's failing while establishing the connection on org.apache.hadoop.security.UserGroupInformation. I thought "hadoop-core.jar" would provide this class, but it's still failing. Does anyone have any idea?

library("DBI")
library("rJava")
library("RJDBC")

# Jars to put on the JVM classpath at initialization
cp = c("/home/cdsw/impala_jars/hadoop-common.jar",
       "/home/cdsw/hive_jars/libthrift-0.9.0.jar",
       "/home/cdsw/hive_jars/hive_service.jar",
       "/home/cdsw/hive_jars/hadoop-core.jar",
       "/home/cdsw/hive_jars/TCLIServiceClient.jar",
       "hive_jars/hive-jdbc-1.1.0-cdh5.10.1-standalone.jar")
.jinit(classpath=cp)

# Attempt to add the jars under these two directories to the class path as well
for(l in list.files('/home/cdsw/hive_jars')){
  .jaddClassPath(paste("/home/cdsw/hive_jars",l,sep=""))}
for(l in list.files('/home/cdsw/impala_jars')){
  .jaddClassPath(paste("/home/cdsw/impala_jars",l,sep=""))}
.jclassPath()

# Hive JDBC driver and connection to the Kerberized HiveServer2
drv <- JDBC("org.apache.hive.jdbc.HiveDriver",
            "hive_jars/hive-jdbc-1.1.0-cdh5.10.1-standalone.jar",
            identifier.quote="`")
con <- dbConnect(drv, "jdbc:hive2://localhost:10000/default;principal=hive/{HOST}@{REALM}")
Labels: