Member since 05-16-2016

785 Posts · 114 Kudos Received · 39 Solutions

        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 2324 | 06-12-2019 09:27 AM |
|  | 3568 | 05-27-2019 08:29 AM |
|  | 5719 | 05-27-2018 08:49 AM |
|  | 5235 | 05-05-2018 10:47 PM |
|  | 3112 | 05-05-2018 07:32 AM |

05-06-2019 04:53 AM · 1 Kudo
Since the question was asked, the situation has changed. As soon as Hortonworks and Cloudera merged, NiFi became supported by Cloudera. Shortly after, the integrations with CDH were also completed, so NiFi is now a fully supported and integrated component. Hence the question already contains the answer: NiFi is Cloudera's answer for these use cases.
						
					

05-03-2019 11:29 AM

No CM installed. CDH: 2.6.0-cdh5.15.1.

Failure:

2019-05-03 13:18:50,472 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:etl (auth:SIMPLE) cause:java.net.ConnectException: Call From clientMachine#14/clientMachine#14IP to NameNode:9000 failed on connection exception: java.net.ConnectException: Connection timed out; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2019-05-03 13:18:50,473 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.net.ConnectException: Call From clientMachine#14/clientMachine#14IP to NameNode:9000 failed on connection exception: java.net.ConnectException: Connection timed out; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
	at org.apache.hadoop.ipc.Client.call(Client.java:1508)
	at org.apache.hadoop.ipc.Client.call(Client.java:1441)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
	at com.sun.proxy.$Proxy14.getBlockLocations(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:268)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
	at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1324)
	at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1311)
	at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1299)
	at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:315)
	at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:280)
	at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:267)
	at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1630)
	at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:339)
	at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:335)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:335)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:784)
	at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:356)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:760)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.net.ConnectException: Connection timed out
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:648)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744)
	at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557)
	at org.apache.hadoop.ipc.Client.call(Client.java:1480)
	... 31 more

Datanode log on the failed client machine #14 around the time of the failure:

886389:2019-05-03 13:17:36,651 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: clientMachinej#14:50010:DataXceiver error processing WRITE_BLOCK operation  src: /clientIP:36531 dst: /datanode#10IP:50010; org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-1669201578-10.202.33.149-1549049450582:blk_1155667125_83195343 already exists in state FINALIZED and thus cannot be created.
892139:2019-05-03 13:18:22,989 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(clientIP, datanodeUuid=3e35a5fc-4e9b-4548-8631-ab85b5885007, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-1f8081e6-d4ed-437d-8273-1f2a5edd4057;nsid=338758850;c=0):Exception writing BP-1669201578-10.202.33.149-1549049450582:blk_1155669671_83197947 to mirror datanode#10IP:50010
892140:java.io.IOException: Connection reset by peer
894608:java.io.EOFException: Premature EOF: no length prefix available
894613:2019-05-03 13:18:49,212 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-1669201578-10.202.33.149-1549049450582:blk_1155671740_83200091
894614:java.io.IOException: Premature EOF from inputStream
894626:2019-05-03 13:18:49,212 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: IOException in BlockReceiver.run():
894627:java.nio.channels.ClosedByInterruptException
894642:java.nio.channels.ClosedByInterruptException
894657:2019-05-03 13:18:49,212 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1669201578-10.202.33.149-1549049450582:blk_1155671740_83200091 received exception java.io.IOException: Premature EOF from inputStream
894659:java.io.IOException: Premature EOF from inputStream
894674:java.io.EOFException: Premature EOF: no length prefix available
894679:2019-05-03 13:18:49,227 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: IOException in BlockReceiver.run():
894680:java.io.IOException: Connection reset by peer
894698:java.io.IOException: Connection reset by peer
894716:2019-05-03 13:18:49,228 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-1669201578-10.202.33.149-1549049450582:blk_1155671508_83199858
894717:java.nio.channels.ClosedByInterruptException
894739:2019-05-03 13:18:49,228 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1669201578-10.202.33.149-1549049450582:blk_1155671508_83199858 received exception java.nio.channels.ClosedByInterruptException
894741:java.nio.channels.ClosedByInterruptException
894765:2019-05-03 13:18:49,369 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-1669201578-10.202.33.149-1549049450582:blk_1155671603_83199953
894766:java.io.IOException: Premature EOF from inputStream
894780:2019-05-03 13:18:49,369 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1669201578-10.202.33.149-1549049450582:blk_1155671603_83199953 received exception java.io.IOException: Premature EOF from inputStream
894782:java.io.IOException: Premature EOF from inputStream
894796:2019-05-03 13:18:49,532 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-1669201578-10.202.33.149-1549049450582:blk_1155671632_83199982
894797:java.io.IOException: Premature EOF from inputStream
894810:java.io.EOFException: Premature EOF: no length prefix available
894817:2019-05-03 13:18:49,532 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1669201578-10.202.33.149-1549049450582:blk_1155671632_83199982 received exception java.io.IOException: Premature EOF from inputStream
894819:java.io.IOException: Premature EOF from inputStream

There was some GC pause on the NameNode (machine #9), but it was fixed by increasing the heap memory:

129,238,586 files and directories, 31,709,865 blocks = 160,948,451 total filesystem object(s).
Heap Memory used 33.34 GB of 101.15 GB Heap Memory. Max Heap Memory is 101.15 GB.

On the machine #9 DataNode around the same time:

2019-05-03 13:18:42,146 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1669201578-10.202.33.149-1549049450582:blk_1155671398_83199748, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2019-05-03 13:18:42,148 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1669201578-10.202.33.149-1549049450582:blk_1155671637_83199987, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2019-05-03 13:18:42,149 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1669201578-10.202.33.149-1549049450582:blk_1155671717_83200068 src: /machine#12:33498 dest: /machine#9:50010
2019-05-03 13:18:42,492 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 1096ms (threshold=300ms)
2019-05-03 13:18:42,521 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 1163ms (threshold=300ms)
2019-05-03 13:18:54,592 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 508ms (threshold=300ms)
2019-05-03 13:18:55,892 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 14349ms (threshold=300ms)
2019-05-03 13:18:55,734 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 338ms (threshold=300ms)
2019-05-03 13:18:55,404 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 585ms (threshold=300ms)
2019-05-03 13:18:55,352 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 13225ms (threshold=300ms)
2019-05-03 13:18:55,094 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 501ms (threshold=300ms)
2019-05-03 13:18:55,223 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 2067ms (threshold=300ms)
2019-05-03 13:18:56,730 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 4850ms
No GCs detected
2019-05-03 13:18:55,005 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 1025ms (threshold=300ms)
2019-05-03 13:18:54,957 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 850ms (threshold=300ms)
2019-05-03 13:18:54,796 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 12170ms (threshold=300ms)
2019-05-03 13:18:54,634 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 504ms (threshold=300ms)
2019-05-03 13:18:56,107 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write packet to mirror took 13561ms (threshold=300ms)
2019-05-03 13:18:57,012 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow PacketResponder send ack to upstream took 1512ms (threshold=300ms), PacketResponder: BP-1669201578-10.202.33.149-1549049450582:blk_1155670553_83198851, type=HAS_DOWNSTREAM_IN_PIPELINE, replyAck=seqno: 1704 reply: 0 reply: 0 downstreamAckTimeNanos: 9451006117
2019-05-03 13:18:57,036 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow PacketResponder send ack to upstream took 392ms (threshold=300ms), PacketResponder: BP-1669201578-10.202.33.149-1549049450582:blk_1155671880_83200232, type=LAST_IN_PIPELINE, downstreams=0:[], replyAck=seqno: 67 reply: 0 downstreamAckTimeNanos: 0
2019-05-03 13:18:57,068 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow PacketResponder send ack to upstream took 600ms (threshold=300ms), PacketResponder: BP-1669201578-10.202.33.149-1549049450582:blk_1155671512_83199862, type=HAS_DOWNSTREAM_IN_PIPELINE, replyAck=seqno: 19 reply: 0 reply: 0 reply: 0 downstreamAckTimeNanos: 11402140530
2019-05-03 13:18:57,066 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Slow PacketResponder send ack to upstream took 2563ms (threshold=300ms), PacketResponder: BP-1669201578-10.202.33.149-1549049450582:blk_1155671711_83200062, type=LAST_IN_PIPELINE, downstreams=0:[], replyAck=seqno: 153 reply: 0 downstreamAckTimeNanos: 0
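
A minimal connectivity check one might run from the failing client machine #14, assuming standard Linux tools on the host (the placeholder stands for the redacted NameNode hostname above):

# Assumption: run on the failing client; replace <namenode-host> with the real NameNode hostname.
getent hosts <namenode-host>                  # does the client resolve the NameNode at all?
nc -vz -w 5 <namenode-host> 9000              # does the NameNode RPC port accept TCP connections?
hdfs getconf -confKey fs.defaultFS            # is the client config actually pointing at this NameNode?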
						
					

04-13-2019 05:46 PM

I figured out a solution and shared it on my blog: https://tips-to-code.blogspot.com/2018/07/automated-bash-script-to-export-all.html
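
The gist of that approach, as a rough bash sketch (database name, output directory, and JDBC URL are placeholders, not taken from the blog post):

#!/bin/bash
# Hypothetical sketch: dump every table of one Hive database to local CSV via beeline.
DB="default"
OUT_DIR="./hive_export"
JDBC_URL="jdbc:hive2://localhost:10000/${DB}"

mkdir -p "${OUT_DIR}"
# List the tables, then export each one as CSV.
for table in $(beeline -u "${JDBC_URL}" --silent=true --showHeader=false --outputformat=csv2 -e "SHOW TABLES;"); do
    beeline -u "${JDBC_URL}" --silent=true --outputformat=csv2 \
        -e "SELECT * FROM ${DB}.${table};" > "${OUT_DIR}/${table}.csv"
done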
						
					

04-05-2019 05:55 AM

Please attempt to start the CM server again, @sbommaraju. Then check /var/log/cloudera-scm-server/cloudera-scm-server.log, as it should indicate the likely reason for the startup failure. Also, we suggest starting your own thread in the forums if you need further help with this startup error.
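
Concretely, assuming shell access to the Cloudera Manager host, that amounts to something like:

# Restart the CM server service, then scan the log for recent errors.
sudo systemctl start cloudera-scm-server      # on older distros: sudo service cloudera-scm-server start
sudo tail -n 200 /var/log/cloudera-scm-server/cloudera-scm-server.log | grep -iE "error|exception"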
						
					

02-24-2019 05:59 AM

In case you are doing it from Windows, you can use the Python script hivehoney to extract table data to a local CSV file. It will:

- Log in to the bastion host.
- pbrun.
- kinit.
- Run beeline (with your query).
- Save the echo from beeline to a file on Windows.

Execute it like this:

set PROXY_HOST=your_bastion_host
set SERVICE_USER=you_func_user
set LINUX_USER=your_SOID
set LINUX_PWD=your_pwd
python hh.py --query_file=query.sql
						
					

02-23-2019 02:35 AM

Hi, I am also facing the same issue: I am not able to load a Hive table into Spark. I tried to copy the XML files into the Spark conf folder, but I get "permission denied", and I also tried to change the permissions on that folder; that did not work either. I am using the Cloudera VM 5.12.
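
For reference, the usual way around that permission error on the quickstart VM is to copy the Hive client config with elevated rights; a minimal sketch, assuming the standard CDH config paths (verify them on your VM):

# Copy the Hive client config where Spark will pick it up, then make it world-readable.
sudo cp /etc/hive/conf/hive-site.xml /etc/spark/conf/
sudo chmod 644 /etc/spark/conf/hive-site.xml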
						
					

02-22-2019 02:07 PM

The requirement is to create a Parquet file and send it downstream. The issue is that the Parquet file always gets created as an empty file, even though there is data in the table ("select * from prov_cust" returns rows).
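
As a point of comparison, one way to materialize that table as Parquet and check the output is a CTAS run through beeline; a sketch under the assumption of an unsecured HiveServer2 on localhost (connection string, output table name, and warehouse path are illustrative):

# Hypothetical check: write prov_cust to a Parquet-backed table, then confirm the files are non-empty.
beeline -u "jdbc:hive2://localhost:10000/default" \
    -e "CREATE TABLE prov_cust_parquet STORED AS PARQUET AS SELECT * FROM prov_cust;"
# Default warehouse location is an assumption; adjust to your cluster.
hdfs dfs -du -h /user/hive/warehouse/prov_cust_parquet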
						
					

01-09-2019 09:05 PM

Hi All,

We want to pull data from MS SQL Server into Hadoop using Sqoop with Windows authentication.

I am using the below Sqoop command to pull the information:

sqoop eval --connect "jdbc:jtds:sqlserver://IP:portnumber;databaseName=test;domain=Domain_Name;useNTLMv2=true" --username XXX --password XXXXXX --query 'SELECT * FROM INFORMATION_SCHEMA.COLUMNS where $CONDITIONS' --driver net.sourceforge.jtds.jdbc.Driver

When executing the above Sqoop command, it throws an error:

WARN tool.EvalSqlTool: SQL exception executing statement: java.sql.SQLException: I/O Error: DB server closed connection.
	at net.sourceforge.jtds.jdbc.TdsCore.nextToken(TdsCore.java:2388)
	at net.sourceforge.jtds.jdbc.TdsCore.login(TdsCore.java:609)
	at net.sourceforge.jtds.jdbc.JtdsConnection.<init>(JtdsConnection.java:369)
	at net.sourceforge.jtds.jdbc.Driver.connect(Driver.java:183)
	at java.sql.DriverManager.getConnection(DriverManager.java:571)
	at java.sql.DriverManager.getConnection(DriverManager.java:215)
	at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:904)
	at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
	at org.apache.sqoop.tool.EvalSqlTool.run(EvalSqlTool.java:64)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.io.IOException: DB server closed connection.
	at net.sourceforge.jtds.jdbc.SharedSocket.readPacket(SharedSocket.java:883)
	at net.sourceforge.jtds.jdbc.SharedSocket.getNetPacket(SharedSocket.java:762)
	at net.sourceforge.jtds.jdbc.ResponseStream.getPacket(ResponseStream.java:477)
	at net.sourceforge.jtds.jdbc.ResponseStream.read(ResponseStream.java:114)
	at net.sourceforge.jtds.jdbc.TdsCore.nextToken(TdsCore.java:2282)

Please can you help me with this?

Thanks in advance.
						
					

12-16-2018 08:52 AM

							    Hi @csguna, CDH version is 5.13.2 
						
					

12-06-2018 01:50 PM

To add: sizing a DataNode heap is similar to sizing a NameNode heap; the recommendation is 1 GB of heap per 1 million blocks. Since a block can be as small as 1 byte or as large as 128 MB, the heap requirement is the same regardless of block size. For example, a DataNode holding 5 million block replicas would need roughly 5 GB of heap.
						
					