Member since
05-16-2016
785
Posts
114
Kudos Received
39
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1810 | 06-12-2019 09:27 AM |
| | 3005 | 05-27-2019 08:29 AM |
| | 5027 | 05-27-2018 08:49 AM |
| | 4404 | 05-05-2018 10:47 PM |
| | 2747 | 05-05-2018 07:32 AM |
01-25-2017
12:04 AM
Hi, can anyone help me out with the configuration below and let me know how to generate the SSH private key file? We are configuring an HA NameNode cluster for testing purposes.
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/exampleuser/.ssh/id_rsa</value>
</property>
This is not my area - I have no idea how to generate this file: /home/exampleuser/.ssh/id_rsa
Thanks
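For illustration, the fencing key is just a passwordless SSH key pair owned by the user the sshfence method will log in as; a minimal sketch, assuming that user is exampleuser and that other-namenode-host is a placeholder for the peer NameNode:
# Create the .ssh directory and an RSA key pair with an empty passphrase for the fencing user
sudo -u exampleuser mkdir -p /home/exampleuser/.ssh
sudo -u exampleuser ssh-keygen -t rsa -N "" -f /home/exampleuser/.ssh/id_rsa
# Copy the public half to the peer NameNode so SSH fencing can log in without a prompt
ssh-copy-id -i /home/exampleuser/.ssh/id_rsa.pub exampleuser@other-namenode-host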
Labels:
- Kerberos
- Manual Installation
01-15-2017
07:32 PM
A few suggestions that could help you out (see the sketch after this list):
- How many bytes per reducer have you configured, and how many reducers are actually being invoked?
- If you are doing a join, make sure the largest table is stated last in the query so it can be streamed.
- Consider enabling parallel execution mode in Hive.
- Enable local mode if you can.
- Have you enabled JVM reuse?
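A minimal sketch of how those knobs can be passed on the command line; the property names and values are assumptions to verify against your Hive/MapReduce version, and query.hql is a placeholder for your join query:
# hive.exec.parallel enables parallel execution of independent stages,
# hive.exec.mode.local.auto lets small jobs run in local mode,
# hive.exec.reducers.bytes.per.reducer controls how many bytes each reducer handles,
# mapred.job.reuse.jvm.num.tasks enables JVM reuse (MR1-style property name)
hive \
  --hiveconf hive.exec.parallel=true \
  --hiveconf hive.exec.mode.local.auto=true \
  --hiveconf hive.exec.reducers.bytes.per.reducer=268435456 \
  --hiveconf mapred.job.reuse.jvm.num.tasks=8 \
  -f query.hql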
01-01-2017
06:47 PM
Can this be done in production, mate?
12-31-2016
03:50 AM
The MapReduce job succeeds and I am able to check the results in HDFS. The problem is that when I look in the history server the jobs are not there. Checking the logs, I found this error:
16/12/31 06:34:27 ERROR hs.HistoryFileManager: Error while trying to move a job to done
org.apache.hadoop.security.AccessControlException: Permission denied: user=mapred, access=READ, inode="/user/history/done_intermediate/matt/job_1483174306930_0005.summary":matt:hadoop:-rwxrwx---
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:182)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5461)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5443)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:5405)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1680)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1632)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1612)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1586)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:482)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1139)
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1127)
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1117)
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:264)
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:231)
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:224)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1290)
at org.apache.hadoop.fs.Hdfs.open(Hdfs.java:309)
at org.apache.hadoop.fs.Hdfs.open(Hdfs.java:54)
at org.apache.hadoop.fs.AbstractFileSystem.open(AbstractFileSystem.java:619)
at org.apache.hadoop.fs.FileContext$6.next(FileContext.java:785)
at org.apache.hadoop.fs.FileContext$6.next(FileContext.java:781)
at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
at org.apache.hadoop.fs.FileContext.open(FileContext.java:781)
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.getJobSummary(HistoryFileManager.java:953)
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.access$400(HistoryFileManager.java:82)
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$HistoryFileInfo.moveToDone(HistoryFileManager.java:370)
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$HistoryFileInfo.access$1400(HistoryFileManager.java:295)
at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$1.run(HistoryFileManager.java:843)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
It looks like a permission issue, but I am not sure where I should change the permissions or what the chmod value should be. Below is my current configuration:
$ sudo -u hdfs hadoop fs -mkdir /user
$ sudo -u hdfs hadoop fs -mkdir /user/matt
$ sudo -u hdfs hadoop fs -chown matt /user/matt
$ sudo -u hdfs hadoop fs -mkdir /user/history
$ sudo -u hdfs hadoop fs -chmod 1777 /user/history
$ sudo -u hdfs hadoop fs -chown mapred:hadoop \
/user/history
Can someone please help me with this issue?
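For illustration only: in the trace the summary file is owned by matt:hadoop with mode rwxrwx---, so group members can read it; one common cause is that the mapred user running the JobHistory server is not in the hadoop group. A hedged sketch of checks and possible fixes (the user, group, and path names are taken from the trace above; adjust to your layout):
# On the JobHistory server host, check which groups mapred belongs to
id mapred
# If "hadoop" is missing, add it and restart the JobHistory server
sudo usermod -a -G hadoop mapred
# Alternatively, re-own the intermediate done directory so its group is one mapred belongs to
sudo -u hdfs hadoop fs -chown -R mapred:hadoop /user/history/done_intermediate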
Labels:
- Apache YARN
- Manual Installation
- MapReduce
11-29-2016
05:42 AM
Since you are going to use a third-party webpage, I am assuming that you won't be able to integrate or deploy the Flume SDK. If the webpage is OK with sending data via HTTP rather than using Flume's RPC, then I think the HTTP source would be a good fit. From a client's point of view, the HTTP source acts like a web server that accepts Flume events. You can either write your own handler or use HTTPSourceXMLHandler in your configuration; the default handler accepts JSON format. The format which HTTPSourceXMLHandler accepts is stated below:
<events>
<event 1 2 3 ..>
<headers>
<header 1 2 3 ..>
</header>
<body> </body>
</event..>
</events>
The handler will parse the XML into Flume events and pass them on to the HTTP source, which will then pass them on to a channel and from there to a sink or another agent, depending on the flow.
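A minimal agent definition sketch for wiring such a source in; the agent, channel, and sink names, the port, the file path, and the handler's package are all placeholders, and the handler class itself is the custom one mentioned above rather than something shipped with Flume:
# Write a simple agent config to a local file (all names here are hypothetical)
cat > http-agent.properties <<'EOF'
agent1.sources = httpSrc
agent1.channels = memCh
agent1.sinks = hdfsSink

# HTTP source listening for POSTed events; swap the handler for your own class
agent1.sources.httpSrc.type = http
agent1.sources.httpSrc.bind = 0.0.0.0
agent1.sources.httpSrc.port = 8080
agent1.sources.httpSrc.handler = com.example.flume.HTTPSourceXMLHandler
agent1.sources.httpSrc.channels = memCh

agent1.channels.memCh.type = memory

agent1.sinks.hdfsSink.type = hdfs
agent1.sinks.hdfsSink.hdfs.path = /flume/events
agent1.sinks.hdfsSink.channel = memCh
EOF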
11-21-2016
11:34 PM
Check the status of all the Impala and Hive daemons using the commands below. If any one of them is not running, please start it and fire INVALIDATE METADATA to refresh. (Note: if a daemon is not started, replace status with start.)
sudo service impala-state-store status
sudo service impala-catalog status
sudo service hive-metastore status
sudo service impala-server status
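Once the daemons are back up, the refresh can be fired from impala-shell; a small sketch, assuming the shell can reach an impalad on localhost:
# Connect to a local impalad and refresh the catalog metadata
impala-shell -i localhost -q "INVALIDATE METADATA;"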
11-16-2016
08:20 AM
I will certainly try to write the query in a single line as per your suggestion, but I am wondering why we need the $CONDITIONS placeholder when we are forcing Sqoop to perform only one job by using --num-mappers 1?
11-15-2016
02:44 AM
I am facing an SQL error and a Sqoop error in two scenarios. I am doing this for testing.
DB: MySQL
Sqoop version: 1.4.4-cdh5.0.0
My table citi:
+------+------------+-----------+
| id | country_id | city |
+------+------------+-----------+
| 10 | 101 | omaha |
| 11 | 102 | coloumbus |
| 12 | 103 | buff |
+------+------------+-----------+
Table country:
+------------+---------+
| country_id | country |
+------------+---------+
| 101 | us |
| 102 | in |
| 103 | nz |
+------------+---------+
Below is my sqoop import:
sqoop import \
> --connect jdbc:mysql://localhost/ms4 \
> --username xxx \
> --password yyy \
> --query 'SELECT citi.id, \
> country.name, \
> citi.city \
> FROM citi \
> JOIN country USING(country_id) \
> --num-mappers 1 \
> --target-dir cities
Below is the error I am seeing. I don't find anything wrong with my --query, to my knowledge.
16/11/15 05:27:02 INFO manager.SqlManager: Executing SQL statement: SELECT citi.id, \
country.name, \
citi.city \
FROM citi \
JOIN country USING(country_id) \
WHERE (1 = 0)
16/11/15 05:27:02 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '\
country.name, \
citi.city \
FROM citi \
JOIN country USING(country_id) \
WHERE' at line 1
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '\
country.name, \
citi.city \
FROM citi \
JOIN country USING(country_id) \
WHERE' at line 1
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.Util.getInstance(Util.java:386)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1052)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1990)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2283)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:699)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:708)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:243)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForQuery(SqlManager.java:233)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:356)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1298)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1110)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:396)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:506)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
16/11/15 05:27:02 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: No columns to generate for ClassWriter
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1116)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:396)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:506)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
As a workaround, and to see whether WHERE $CONDITIONS is the problem or not, I skipped WHERE $CONDITIONS by forcing Sqoop to use one mapper:
sqoop import \
> --connect jdbc:mysql://localhost/movielens \
> --username training \
> --password training \
> --query 'SELECT citi.id, \
> country.name, \
> citi.city \
> FROM citi \
> JOIN country USING(country_id)' \
> --num-mappers 1 \
> --target-dir cities
I am pretty sure we can force Sqoop to avoid parallelism, but it is still complaining and throwing an error:
ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Query [SELECT citi.id, \
country.name, \
citi.city \
FROM citi \
JOIN country USING(country_id)] must contain '$CONDITIONS' in WHERE clause.
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:352)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1298)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1110)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:396)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:506)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
I would greatly appreciate any kind of information or solution. Thanks
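For comparison only, a sketch of how the free-form query import is usually written: the query on a single line inside single quotes with no trailing backslashes (otherwise the backslashes reach MySQL literally, which matches the syntax error above), the literal $CONDITIONS token kept in the WHERE clause even with one mapper, and (per the country table shown above) the column country.country rather than country.name. Connection details are placeholders:
sqoop import \
  --connect jdbc:mysql://localhost/ms4 \
  --username xxx \
  --password yyy \
  --query 'SELECT citi.id, country.country, citi.city FROM citi JOIN country USING(country_id) WHERE $CONDITIONS' \
  --num-mappers 1 \
  --target-dir cities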
Labels:
- Apache Sqoop