Member since: 08-07-2017
Posts: 144
Kudos Received: 3
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2292 | 03-05-2019 12:48 AM
 | 9443 | 11-06-2017 07:28 PM
09-12-2018
05:04 AM
@Tomas79, thanks for the inputs. The job is not failing immediately; it fails after about 81% of the mappers complete. Can you suggest how to increase map or reduce memory just for Hive, since our other applications are running fine? Thanks, Priya
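For reference, a minimal sketch of session-level overrides that raise MapReduce task memory only for the Hive session running the heavy query (the values are assumptions and need to be tuned to the cluster's YARN container limits):
SET mapreduce.map.memory.mb=4096;        -- container size for map tasks in this session only
SET mapreduce.map.java.opts=-Xmx3276m;   -- task heap, roughly 80% of the container size
SET mapreduce.reduce.memory.mb=4096;
SET mapreduce.reduce.java.opts=-Xmx3276m;
select count(*) from table_name;
Because these are SET statements inside the Hive session, they do not change the cluster-wide defaults that other applications use.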
09-11-2018
03:34 AM
Hello All,
I am also facing an issue sending messages from the producer. Our cluster does not have Kerberos enabled. I am getting the error below.
18/09/11 02:28:20 ERROR internals.ErrorLoggingCallback: Error when sending message to topic Topic_Landing_dev with key: null, value: 2 bytes with error: org.apache.kafka.common.errors.TimeoutException: Expiring 2 record(s) for Topic_0 due to 1599 ms has passed since batch creation plus linger time
18/09/11 02:28:20 ERROR internals.ErrorLoggingCallback: Error when sending message to topic Topic_Landing_dev with key: null, value: 5 bytes with error: org.apache.kafka.common.errors.TimeoutException: Expiring 2 record(s) for Topic_0 due to 1599 ms has passed since batch creation plus linger time
^C18/09/11 02:30:58 INFO producer.KafkaProducer: Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
In the Kafka logs at /var/log/kafka, I see messages like the one below.
ERROR kafka.server.ReplicaFetcherThread: [ReplicaFetcherThread-0-243], Error for partition [__consumer_offsets,19] to broker 243:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
The Kafka broker with id 243 is not available. Please help.
Thanks,
Priya
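For reference, a hedged troubleshooting sketch (the topic name is taken from the post; the ZooKeeper host is a placeholder): the producer TimeoutException together with NotLeaderForPartitionException in the broker log usually points to partitions whose leader was the broker that went down, so the first step is to check partition leadership and whether broker 243 is still registered:
kafka-topics --describe --zookeeper zk-host.example.com:2181 --topic Topic_Landing_dev    # leader and ISR per partition
zookeeper-client -server zk-host.example.com:2181 ls /brokers/ids                         # 243 should appear here when the broker is up
If broker 243 stays down, restarting it from Cloudera Manager (or re-electing leaders for its partitions) is the usual next step.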
Labels:
- Apache Kafka
- Cloudera Manager
09-11-2018
03:15 AM
Hello All,
I am trying to run a "select count(*) from table_name" query in Hive, but it gives the error below.
Diagnostic Messages for this Task:
Error: Java heap space
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
I tried using export HADOOP_CLIENT_OPTS=-Xmx4g and then launching Hive; the query executed once, but it has not worked since. I also used the analyze table command and the inner-query option, but no luck. Please suggest.
Thanks,
Priya
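A hedged note on the workaround above: HADOOP_CLIENT_OPTS only sizes the local client JVM that launches the query, while "Diagnostic Messages for this Task: Error: Java heap space" appears to come from the MapReduce task itself, so the task JVM is what would need more heap. A sketch of the difference (the values are illustrative assumptions):
export HADOOP_CLIENT_OPTS="-Xmx4g"   # affects only the Hive/Hadoop client process
hive -e "SET mapreduce.map.memory.mb=4096; SET mapreduce.map.java.opts=-Xmx3276m; select count(*) from table_name;"   # affects the map task JVMs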
Labels:
- Cloudera Manager
07-30-2018
01:42 AM
Hello All,
We have a CDH 5.4 cluster; Kerberos authentication is not enabled on the cluster and we also do not have the Sentry service running.
We use the Beeline client with "!connect jdbc:hive2://hostname:10000" as the connection string and enter no credentials (username and password), but we are still able to see and access all databases and their tables.
I set the hive.server2.enable.doAs parameter to false and am also using a user that does not have sudo access to root, but I can still access all data.
I want only certain users with a valid password to be able to connect through Beeline; all other users should not be able to connect at all.
I believe authentication is the way to achieve this, or is there another way to do it?
Can you please help me on this?
Thanks,
Priya
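For reference, one common way to require a username and password on HiveServer2 without Kerberos or Sentry is LDAP authentication; a minimal sketch of the hive-site.xml (or Cloudera Manager safety valve) settings, assuming an LDAP/AD server is available (the URL and baseDN below are placeholders):
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>ldap://ldap.example.com</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.baseDN</name>
  <value>ou=People,dc=example,dc=com</value>
</property>
Note that authentication only controls who can connect; limiting which databases an authenticated user can see would still need an authorization layer such as Sentry or HDFS permissions.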
Labels:
- Apache Hive
- Cloudera Manager
01-04-2018
01:19 AM
Hi,
I am getting the error below while running the Flume agent.
ERROR PollableSourceRunner:156 - Unhandled exception, logging and sleeping for 5000ms
org.apache.flume.ChannelException: Unable to put batch on required channel: FileChannel ch2 { dataDirs: [/var/lib/flume-ng/plugins.d/custom/datac2] }
    at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:200)
    at org.keedio.flume.source.SQLSource$ChannelWriter.flush(SQLSource.java:195)
    at org.keedio.flume.source.SQLSource$ChannelWriter.write(SQLSource.java:190)
    at org.keedio.flume.source.SQLSource$ChannelWriter.write(SQLSource.java:190)
    at java.io.Writer.write(Writer.java:192)
    at java.io.PrintWriter.write(PrintWriter.java:456)
    at java.io.PrintWriter.write(PrintWriter.java:473)
    at com.opencsv.CSVWriter.writeNext(CSVWriter.java:263)
    at com.opencsv.CSVWriter.writeAll(CSVWriter.java:151)
    at org.keedio.flume.source.HibernateHelper.executeQuery(HibernateHelper.java:155)
    at org.keedio.flume.source.SQLSource.process(SQLSource.java:127)
    at org.apache.flume.source.PollableSourceRunner$PollingRunner.run(PollableSourceRunner.java:139)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flume.ChannelFullException: The channel has reached it's capacity. This might be the result of a sink on the channel having too low of batch size, a downstream system running slower than normal, or that the channel capacity is just too low. [channel=ch2]
    at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doPut(FileChannel.java:460)
    at org.apache.flume.channel.BasicTransactionSemantics.put(BasicTransactionSemantics.java:93)
    at org.apache.flume.channel.BasicChannelSemantics.put(BasicChannelSemantics.java:80)
    at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:189)
The Flume configuration I am using is below.
agent1.sources=sql-source ods-sql-source
agent1.sinks=sink1 sink2
agent1.channels=ch1 ch2

agent1.sources.sql-source.type=org.keedio.flume.source.SQLSource
agent1.sources.ods-sql-source.type=org.keedio.flume.source.SQLSource
agent1.sources.sql-source.channels=ch1
agent1.sources.ods-sql-source.channels=ch2

#agent1.channels.ch1.type=memory
#agent1.channels.ch2.type=memory
agent1.channels.ch1.capacity=100000
agent1.channels.ch2.capacity=100000

#use FILE channel
agent1.channels.ch1.type = file
agent1.channels.ch1.transactionCapacity = 1000
agent1.channels.ch1.checkpointDir = /var/lib/flume-ng/plugins.d/custom/checkpointc1  #NOTE: point to your checkpoint directory
agent1.channels.ch1.dataDirs = /var/lib/flume-ng/plugins.d/custom/datac1  #NOTE: point to your data directory

#use FILE channel
agent1.channels.ch2.type = file
agent1.channels.ch2.transactionCapacity = 1000
agent1.channels.ch2.checkpointDir = /var/lib/flume-ng/plugins.d/custom/checkpointc2  #NOTE: point to your checkpoint directory
agent1.channels.ch2.dataDirs = /var/lib/flume-ng/plugins.d/custom/datac2  #NOTE: point to your data directory

agent1.sinks.sink1.type=org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent1.sinks.sink1.channel=ch1
agent1.sinks.sink1.morphlineFile=/var/lib/flume-ng/plugins.d/custom/conf/morphlines.conf
agent1.sinks.sink1.morphlineId=morphline1
agent1.sinks.sink1.batchSize=100
agent1.sinks.sink1.batchDurationMillis=1000

agent1.sinks.sink2.type=org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent1.sinks.sink2.channel=ch2
agent1.sinks.sink2.morphlineFile=/var/lib/flume-ng/plugins.d/custom/conf/morphlines.conf
agent1.sinks.sink2.morphlineId=morphline1
agent1.sinks.sink2.batchSize=100
agent1.sinks.sink2.batchDurationMillis=1000
Can you please help me on this?
Thanks,
Priya
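For reference, a hedged tuning sketch for the ChannelFullException above: the channel fills when the MorphlineSolrSink (batchSize=100) drains events more slowly than the SQL source writes them, so one direction to try is larger sink batches plus a larger channel and transaction capacity. The values below are illustrative assumptions, not a verified fix:
agent1.channels.ch2.capacity = 1000000            # more room for a temporary backlog
agent1.channels.ch2.transactionCapacity = 10000   # must be >= the largest source/sink batch size
agent1.sinks.sink2.batchSize = 1000               # drain the channel in bigger chunks
The same applies to ch1/sink1 if it shows the same error; checking how fast Solr is actually indexing is the other half of the problem.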
Labels:
- Apache Flume
- Cloudera Manager
12-21-2017
09:59 PM
Hi all,
I am experiencing swap memory usage notifications in the Cloudera Manager console for service components such as HDFS, Hive, YARN, Hue, HBase, Oozie, MapReduce, and ZooKeeper. The swappiness value across the hosts in the cluster is set to 10. Please provide your inputs on how to get rid of these notifications.
Thanks,
Priya
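For reference, a minimal diagnostic sketch (generic Linux commands, not a Cloudera-specific fix) to confirm the kernel setting and see which processes currently hold swap; lowering vm.swappiness to 1 is only a suggestion to consider, and the sysctl change does not persist across reboots unless it is also added to /etc/sysctl.conf:
cat /proc/sys/vm/swappiness                  # should show 10 per the current setting
sudo sysctl -w vm.swappiness=1               # assumed tweak; persist it via /etc/sysctl.conf
grep VmSwap /proc/[0-9]*/status 2>/dev/null | sort -t: -k3 -rn | head    # processes holding the most swap
Pages that are already swapped out are not reclaimed just by changing swappiness, so the Cloudera Manager alerts typically clear only after the affected roles are restarted or swap is freed.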
Labels:
- Cloudera Manager
11-23-2017
03:34 AM
@Harry, I am using the command-line option "hadoop archive -archiveName archive-file-name input-path output-path" for generating the har files. Thanks, Priya
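For reference, the documented form of the command takes a -p parent path before the source directories; a hedged example where dir1 and dir2 are placeholder source names and the paths are illustrative:
# -p sets the parent directory that the source paths are relative to
hadoop archive -archiveName Archive-16-11-2017-02-20.har -p /user/user1/HDFSArchival dir1 dir2 /user/user1/HDFSArchival/Output1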
11-20-2017
10:20 PM
Hi, Can anyone help me on this please? Thanks, Priya
11-16-2017
08:43 PM
Hi,
We are using a 14-node CDH 5.9.2 Cloudera cluster. I have a har file archived from multiple directories.
However, when I tried to check the contents of the har file, I got the error below.
hdfs dfs -ls har:///user/user1/HDFSArchival/Output1/Archive-16-11-2017-02-20.har
-ls: Can not create a Path from an empty string
Usage: hadoop fs [generic options]
  -ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [<path> ...]
I am able to check the size of har file though.
hdfs dfs -du -s -h /user/user1/HDFSArchival/Output1/Archive-16-11-2017-02-20.har
3.2 G  9.6 G  /user/user1/HDFSArchival/Output1/Archive-16-11-2017-02-20.har
While researching the error, I learned about a bug in CDH 4.4.0.1 (and possibly earlier) in the globbing functionality inside .har files, implemented in the methods FileSystem::globStatus() and FileSystem::globStatusInternal().
Is the bug still present in CDH 5.9 also?
Can you please help me to solve this?
Thanks,
Priya
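For reference, a couple of hedged things to try when the bare har: URI fails to list, based on the URI forms in the Hadoop Archives documentation (the NameNode host:port and the directory name inside the archive are placeholders):
# fully qualified form: har://<underlying scheme>-<namenode host>:<port>/<archive path>
hdfs dfs -ls har://hdfs-namenode.example.com:8020/user/user1/HDFSArchival/Output1/Archive-16-11-2017-02-20.har
# listing a known directory inside the archive instead of the archive root
hdfs dfs -ls har:///user/user1/HDFSArchival/Output1/Archive-16-11-2017-02-20.har/some-archived-dir
If both forms fail the same way, that would point back at the globbing bug mentioned above rather than at the URI syntax.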
Labels:
- Cloudera Manager
- HDFS