Member since: 05-16-2016

785 Posts · 114 Kudos Received · 39 Solutions

My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
| | 2328 | 06-12-2019 09:27 AM |
| | 3578 | 05-27-2019 08:29 AM |
| | 5724 | 05-27-2018 08:49 AM |
| | 5243 | 05-05-2018 10:47 PM |
| | 3113 | 05-05-2018 07:32 AM |
02-20-2018 09:53 PM

Does your LOCATION (HDFS path) reside on S3? hdfs://ip-10-0-1-138.eu-central-1.compute.internal/files/test/avro';
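If the data actually lives on S3, the location needs an s3a:// URI rather than an hdfs:// one. A quick way to check where the files really are, assuming the hadoop-aws connector is configured; the bucket name below is a placeholder:

```sh
# Does the path exist on the cluster's HDFS?
hadoop fs -ls hdfs://ip-10-0-1-138.eu-central-1.compute.internal/files/test/avro

# Or does it live in S3? (my-bucket is a placeholder)
hadoop fs -ls s3a://my-bucket/files/test/avro
```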
02-20-2018 09:45 PM

1. Just run a plain sqoop list-tables.
2. Check whether the port (5432) is listening.
3. Check whether your JDBC jar is in place, e.g. postgresql-9.2-1002.jdbc4.jar.
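A sketch of those three checks from the shell; the host, database name, user, and driver directory are placeholders to adapt to your setup:

```sh
# 1. Can Sqoop reach the database at all?
sqoop list-tables \
  --connect jdbc:postgresql://db-host:5432/mydb \
  --username myuser -P

# 2. Is PostgreSQL listening on port 5432?
netstat -tln | grep 5432

# 3. Is the PostgreSQL JDBC driver on Sqoop's classpath?
ls /var/lib/sqoop/ | grep -i postgresql
```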
02-19-2018 11:08 PM

1 Kudo

For the DataNode block count threshold alert, try running the balancer and see if that fixes your problem.
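A minimal sketch of running the balancer, assuming you run it as the hdfs superuser; the 10% threshold is an example value:

```sh
# Rebalance blocks across DataNodes until each node's utilization
# is within 10% of the cluster average.
sudo -u hdfs hdfs balancer -threshold 10
```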
02-19-2018 11:07 PM

1 Kudo

Having too many small files in a Hadoop cluster goes against its mantra: a few large files work best in a Hadoop cluster. The link below explains why too many small files are bad for a Hadoop cluster:

https://blog.cloudera.com/blog/2009/02/the-small-files-problem/

Just curious what type of small files those are. If they are in Parquet format, there is code on GitHub that can merge those files and keep them in the cluster, sized according to your data block size.
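To gauge how bad the problem is, you can compare file counts against total size per directory; the warehouse path below is a placeholder:

```sh
# Output columns: DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME.
# A high FILE_COUNT with a small CONTENT_SIZE flags a small-files hotspot.
hdfs dfs -count -h /user/hive/warehouse/*
```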
02-19-2018 10:53 PM

You might want to use a keytab to avoid expiration of the Kerberos ticket. Check whether you have a valid Kerberos ticket:

https://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_sg_kadmin_kerberos_keytab.html
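A quick sketch of checking the ticket and renewing it from a keytab; the keytab path and principal are placeholders for your environment:

```sh
# Show the current ticket (if any) and its expiry.
klist

# Obtain a fresh ticket non-interactively from a keytab.
kinit -kt /etc/security/keytabs/myuser.keytab myuser@EXAMPLE.COM
```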
02-19-2018 10:46 PM

A transport exception is generic; the client is not able to communicate with HiveServer2.

- Check whether you have enough resources (vcores/memory) available, for example in the YARN web UI.
- Check whether you have enough memory on the Linux host where you have deployed HiveServer2.
- Check whether you can SSH into the host where HiveServer2 is running.
- Finally, see whether HiveServer2 is up and running in green status.
- How many roles are on the host where you have deployed HiveServer2?
- Can you provide me the full stack trace from the HiveServer2 log? Check whether any OutOfMemoryError is being thrown again, since you had a heap issue before.
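As a first step, a basic connectivity check from the client side can separate reachability problems from query problems; the hostname and the default port 10000 are placeholders for your HiveServer2 instance:

```sh
# If this fails with a transport error, the problem is network
# reachability or the HiveServer2 process itself, not the query.
beeline -u "jdbc:hive2://hs2-host.example.com:10000/default" -e "SHOW DATABASES;"
```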
02-13-2018 09:28 PM

Just a quick note: you can run Pig in local mode as well as in MapReduce mode. By default, LOAD looks for your data on HDFS, in a tab-delimited file, using the default load function PigStorage. If you start Pig with -x local, it will look for the data on the local filesystem instead. Nice that you found the fix. @SGeorge
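The two launch modes side by side; the script name is a placeholder:

```sh
# Local mode: reads and writes the local filesystem, runs in a single JVM.
pig -x local myscript.pig

# MapReduce mode (the default): reads and writes HDFS.
pig -x mapreduce myscript.pig
```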
02-13-2018 06:53 PM

@Cloudera learning - Did you have a chance to raise the DataNode bandwidth and DataNode heap size, and to increase the replication work multiplier, before kicking off the decommission? That will certainly improve performance. Also, if your decommission is running forever, I would suggest you recommission the node and then perform the decommission again.
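One of those knobs can be changed at runtime; the 100 MB/s value below is just an example, and the multiplier is a NameNode configuration property rather than a command:

```sh
# Raise the bandwidth each DataNode may use for block moves
# (value is in bytes per second; 104857600 = 100 MB/s).
sudo -u hdfs hdfs dfsadmin -setBalancerBandwidth 104857600

# The replication work multiplier is the hdfs-site.xml property
# dfs.namenode.replication.work.multiplier.per.iteration; raise it in
# hdfs-site.xml (or the Cloudera Manager safety valve) and restart
# the NameNode for it to take effect.
```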
02-13-2018 06:36 PM

Could you share the Cloudera SCM server logs and agent logs, specifically the full stack trace that has the exceptions or errors? If the directory is empty, that means something is wrong in the Cloudera SCM agent. We can narrow it down if you provide the above logs.
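The usual default log locations on a Cloudera Manager deployment, assuming a standard installation (paths may differ if yours was customized):

```sh
# Cloudera Manager server log (on the CM server host).
tail -n 200 /var/log/cloudera-scm-server/cloudera-scm-server.log

# Cloudera Manager agent log (on the affected host).
tail -n 200 /var/log/cloudera-scm-agent/cloudera-scm-agent.log
```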
02-07-2018 08:02 PM

1 Kudo

Try this and please let me know if it fixes the issue:

hadoop jar BigData/mbds_anagrammes.jar org.mbds.hadoop.anagrammes.Anagrammes /mot.txt /rs