Member since: 07-09-2016
Posts: 83
Kudos Received: 17
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 208 | 12-08-2016 06:46 AM
 | 530 | 12-08-2016 06:46 AM
03-06-2019
10:22 PM
Is there a way to apply the RowPrefixFilter search against the row key in NiFi? I can't find any HBase processor other than the one above that can be added as a downstream processor. Please advise.
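For reference, this is the kind of prefix scan I am trying to reproduce in NiFi, sketched with the plain HBase Java client (the table name and prefix are placeholders, not taken from our actual flow):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class RowPrefixScanSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("events"))) {   // placeholder table
            Scan scan = new Scan();
            // Restrict the scan to row keys that start with the given prefix.
            scan.setRowPrefixFilter(Bytes.toBytes("user123|"));            // placeholder prefix
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result result : scanner) {
                    System.out.println(Bytes.toString(result.getRow()));
                }
            }
        }
    }
}

What I am looking for is a way to express the same prefix restriction through the NiFi HBase processors.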
02-20-2018
03:44 PM
The issue is that the writer schema uses a different size than the reader schema, and that causes the compatibility error. Is there any workaround for this?
02-13-2018
03:09 PM
How do we evolve the schema when the size of a particular attribute changes?
V1 schema:
{ "name": "sid", "type": [ "null", { "type": "fixed", "name": "SID", "namespace": "com.int.datatype", "doc": "", "size": 64 } ], "doc": "", "default": null, "businessLogic": "" }
V2 schema:
{ "name": "sid", "type": [ "null", { "type": "fixed", "name": "SID", "namespace": "com.int.datatype", "doc": "", "size": 16 } ], "doc": "", "default": null, "businessLogic": "" }
By keeping the size as 64 (for the schema associated with the Hive table), we can query the Avro files corresponding to the v1 schema version but not v2:
hive> select sid from avro where dt='2017-12-02' limit 2;
OK
rojbg4ccmpwz
ilknjyclhplm
Time taken: 0.103 seconds, Fetched: 2 row(s)
hive> select sid from avro where dt='2018-02-11' limit 2;
OK
Failed with exception java.io.IOException:org.apache.avro.AvroTypeException: Found com.int.datatype.SID, expecting com.int.datatype.SID
Time taken: 0.106 seconds
By keeping the size as 16 (for the schema associated with the Hive table), we can query the Avro files corresponding to the v2 schema version but not v1:
hive> select sid from avro where dt='2018-02-11' limit 2;
OK
238d4a8bb307
xpj6nicoaxfl
Time taken: 0.205 seconds, Fetched: 2 row(s)
hive> select sid from avro where dt='2017-12-02' limit 2;
OK
Failed with exception java.io.IOException:org.apache.avro.AvroTypeException: Found com.int.datatype.SID, expecting com.int.datatype.SID
Time taken: 0.11 seconds
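For what it's worth, a minimal sketch that reproduces the incompatibility with Avro's SchemaCompatibility API (only the fixed portion of the schema is shown; the full record schemas are omitted for brevity):

import org.apache.avro.Schema;
import org.apache.avro.SchemaCompatibility;

public class SidCompatibilityCheck {
    public static void main(String[] args) {
        // The two versions of the fixed type, differing only in size.
        Schema v1 = new Schema.Parser().parse(
            "{\"type\":\"fixed\",\"name\":\"SID\",\"namespace\":\"com.int.datatype\",\"size\":64}");
        Schema v2 = new Schema.Parser().parse(
            "{\"type\":\"fixed\",\"name\":\"SID\",\"namespace\":\"com.int.datatype\",\"size\":16}");

        // Reader = v1 (size 64), writer = v2 (size 16).
        SchemaCompatibility.SchemaPairCompatibility result =
            SchemaCompatibility.checkReaderWriterCompatibility(v1, v2);
        System.out.println(result.getType()); // reports INCOMPATIBLE
    }
}

As far as I can tell, fixed types are only resolvable when the sizes match exactly, which is why switching the table schema between size 64 and size 16 just flips which set of partitions is readable.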
11-25-2017
08:51 AM
1) To send data to and from the same cluster of nodes, do we need to configure the URL with two different ports within the same cluster? For example, if the UI is accessible on port 9091, do we configure another port such as 9092?
2) What is the recommended transport protocol (RAW vs. HTTP)? Does each function differently?
3) I understand that the Input/Output Ports have to be created on the root canvas. Can a single Input/Output Port be used for multiple remote process groups (data flows), or should we configure separate Input/Output Ports corresponding to each data flow?
09-21-2017
08:39 AM
The above processor will stop execution once finishedQuery is set to true. I have the following questions:
1) This processor manages its state locally, so if we need to view the state value or clear it, what mechanism can be used? I understand we can use the controller service for the distributed cache when the state is stored on the cluster, is that right? (A rough REST sketch of what I have in mind follows below.)
2) Why does the processor have to stop execution once it has scrolled through all the pages of a given search result? What mechanism exists to start a new search? I understand that the state (the finishedQuery value) has to be cleared, but there is no documentation of how this can be achieved. Does a JIRA need to be opened to have this documented, so that this processor can be utilized and adopted?
3) Is there a sample flow that uses this processor and shows how a new search is submitted after each scroll completes?
scrollelasticsearchhttp.jpg
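For question 1, this is roughly the REST route I have in mind for viewing and clearing the processor's state (the host, port, and processor id are placeholders, and I have not verified these paths against our version, so please treat the sketch as an assumption):

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class ClearProcessorStateSketch {
    public static void main(String[] args) throws IOException {
        // Placeholder host, port, and processor id.
        String base = "http://nifi-host:8080/nifi-api/processors/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx";

        // View the component state of the processor.
        HttpURLConnection view = (HttpURLConnection) new URL(base + "/state").openConnection();
        view.setRequestMethod("GET");
        System.out.println("view state -> HTTP " + view.getResponseCode());

        // Ask NiFi to clear the component state (the processor presumably has to be stopped first).
        HttpURLConnection clear = (HttpURLConnection) new URL(base + "/state/clear-requests").openConnection();
        clear.setRequestMethod("POST");
        System.out.println("clear state -> HTTP " + clear.getResponseCode());
    }
}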
- Tags:
- Data Processing
- NiFi
08-26-2017
05:09 PM
What we observe is that converting to UTC on the fly doesn't help, since at write time it always uses EST (the system timezone) when writing into Parquet.
08-22-2017
07:55 AM
There is a flat text file whose attributes are separated by commas. One of the attributes (say "created") holds a timestamp value in string format, in the UTC timezone. The requirement is to convert from CSV to Parquet. This is achieved by having one table based on CSV and another based on the Parquet format, and the process runs on a server in the EST timezone. However, when the "created" timestamp is read by a Spark process (running on a server in the UTC timezone), it is incremented by +4 hours. Can someone shed some light on how this can be prevented in Hive while writing into Parquet?
08-02-2017
05:26 AM
A Hive table (using the AvroSerDe) is associated with a static schema file (.avsc). When there are data files with varying schemas, Hive query parsing fails.
Option 1: Whenever there is a change in schema, the current and the new schema can be compared and the schema file can be manually edited; for example, we can add a default value for a newly added column. However, this is a manual process. Is it feasible to perform this comparison automatically (with custom coding) and evolve the schema file?
Option 2: Since each Avro file contains both the schema and the data, can we use a custom SerDe that extends the Hive classes to read the data from HDFS, parse it, and relay the results back to Hive? Is this approach feasible?
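To illustrate the premise of Option 2, namely that the writer's schema travels inside each Avro data file, here is a minimal sketch using the Avro Java API (the file name and the "sid" field are placeholders for illustration):

import java.io.File;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

public class EmbeddedSchemaSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder file name; for files sitting on HDFS the same reader can be fed
        // through org.apache.avro.mapred.FsInput instead of a local File.
        File avroFile = new File("part-00000.avro");

        try (DataFileReader<GenericRecord> reader =
                 new DataFileReader<>(avroFile, new GenericDatumReader<GenericRecord>())) {
            // The writer's schema is read from the file header, so no external .avsc is needed.
            Schema writerSchema = reader.getSchema();
            System.out.println(writerSchema.toString(true));

            while (reader.hasNext()) {
                GenericRecord record = reader.next();
                System.out.println(record.get("sid"));
            }
        }
    }
}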
05-04-2017
07:47 AM
package com.test.utility;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class EHPhoenix {

    // Phoenix JDBC driver class and the pieces of the secured connection URL.
    static final String JDBC_DRIVER = "org.apache.phoenix.jdbc.PhoenixDriver";
    static final String ZOOKEEPER = "trnsyxsvr:2181";
    static final String SECURITYTYPE = "/hbase-secure:phtest@HDP_DEV.COM";
    static final String KEYTAB = "/home/phtest/phtest.headless.keytab";
    static final String DB_URL = "jdbc:phoenix:" + ZOOKEEPER + ":" + SECURITYTYPE + ":" + KEYTAB;

    public static void main(String[] args) {
        Connection conn = null;
        Statement st = null;
        System.out.println(DB_URL + ";");
        try {
            // Register the Phoenix driver and open the connection.
            Class.forName(JDBC_DRIVER);
            System.out.println("Connecting to database..");
            conn = DriverManager.getConnection(DB_URL);

            // Run a simple query and print the first column of every row.
            System.out.println("Creating statement...");
            st = conn.createStatement();
            String sql = "SELECT * FROM collector_status";
            ResultSet rs = st.executeQuery(sql);
            while (rs.next()) {
                String did = rs.getString(1);
                System.out.println("Did found: " + did);
            }
            rs.close();
            st.close();
            conn.close();
        } catch (SQLException se) {
            se.printStackTrace();
        } catch (Exception e) {
            // Handle errors for Class.forName.
            e.printStackTrace();
        } finally {
            // Close resources in case the happy path above did not.
            try {
                if (st != null) st.close();
            } catch (SQLException se2) {
                // Nothing we can do here.
            }
            try {
                if (conn != null) conn.close();
            } catch (SQLException se) {
                se.printStackTrace();
            }
        }
        System.out.println("Goodbye!");
    }
}
This gives the following error when running it from the Eclipse IDE. Can you please advise?
jdbc:phoenix:trnsyxsvr:2181:/hbase-secure:phtest@HDP_DEV.COM:/home/phtest/phtest.headless.keytab;
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Connecting to database..
java.sql.SQLException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Thu May 04 03:19:36 EDT 2017, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=73361: row 'SYSTEM:CATALOG,,' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=trnsyxsvr,16020,1493844289998, seqNum=0
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2465)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2382)
at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2382)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:149)
at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:270)
at com.cox.util.EHPhoenix.main(EHPhoenix.java:27)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Thu May 04 03:19:36 EDT 2017, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=73361: row 'SYSTEM:CATALOG,,' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=trnsyxsvr,16020,1493844289998, seqNum=0
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:271)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:210)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:210)
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:327)
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:302)
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:167)
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:162)
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:797)
at org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:602)
at org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:366)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:403)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2410)
... 9 more
Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=73361: row 'SYSTEM:CATALOG,,' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=trnsyxsvr,16020,1493844289998, seqNum=0
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:169)
at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to trnsyxsvr/210.220.3.139:16020 failed on local exception: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to trnsyxsvr/210.220.3.139:16020 is closing. Call id=11, waitTime=189
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.wrapException(AbstractRpcClient.java:281)
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1238)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:223)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:328)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32831)
at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:379)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:201)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:63)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:210)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:364)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:338)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:136)
... 4 more
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to trnsyxsvr/210.220.3.139:16020 is closing. Call id=11, waitTime=189
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.cleanupCalls(RpcClientImpl.java:1057)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.close(RpcClientImpl.java:856)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.run(RpcClientImpl.java:575)
Goodbye!
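A variation I am considering, sketched below as an assumption rather than a known fix, is to perform the Kerberos login explicitly from the keytab before requesting the connection, with the cluster's core-site.xml and hbase-site.xml on the classpath:

import java.sql.Connection;
import java.sql.DriverManager;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class EHPhoenixExplicitLogin {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hbase-site.xml from the classpath.
        Configuration conf = new Configuration();
        UserGroupInformation.setConfiguration(conf);

        // Log in from the keytab before the Phoenix driver needs the credentials.
        UserGroupInformation.loginUserFromKeytab(
                "phtest@HDP_DEV.COM", "/home/phtest/phtest.headless.keytab");

        // With an explicit login, only the quorum and the znode go into the URL.
        try (Connection conn =
                 DriverManager.getConnection("jdbc:phoenix:trnsyxsvr:2181:/hbase-secure")) {
            System.out.println("Connected: " + !conn.isClosed());
        }
    }
}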
04-27-2017
11:22 AM
1 Kudo
Hi, I am unable to visualize how multiple NiFi nodes in a cluster process a flowfile. In a NiFi cluster, the same dataflow runs on all the nodes; as a result, every component in the flow runs on every node (ref: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html). Processors can then be scheduled to run on the primary node only, via an option on the Scheduling tab of the processor that is only available in a cluster (ref: https://community.hortonworks.com/articles/16120/how-do-i-distribute-data-across-a-nifi-cluster.html).
With that context, let's say the processor at the very top of the flow uses the "CRON driven" scheduling strategy and the rest are simply "Timer driven". Assume the flow doesn't involve any Kafka processors and there are 2 nodes in the cluster. If the top processor's schedule is set to 12:00 AM, will it start on both NiFi nodes? If so, should we set Execution to "Primary node" for the top processor, to prevent it from being executed on both NiFi nodes in parallel?
In the second scenario, if my top processor is ConsumeKafka_0_10 with a "Timer driven" scheduling strategy, the topic is configured with 6 partitions, and I set the concurrency to 3, would 3 processor instances run on one node and another 3 on the other node?
03-06-2017
06:57 AM
Assume updates, inserts, and deletes are replicated into the Hadoop file system as flat text files via Oracle GoldenGate, and there is a partition folder for each day. We have an external Hive table based on the ingested files that can be used to run queries. As far as getting up-to-date data is concerned, this method is fine. However, we have two issues with it:
1) The files keep growing, which increases storage utilization and can eventually cause capacity problems.
2) With more files to scan, Hive query performance will degrade over time, and the MapReduce jobs will consume additional resources.
This calls for compaction, which unfortunately is not addressed by GoldenGate. I see that Hive's internal compaction (minor/major) is supported only on the ORC format, and that external tables cannot be made ACID tables since changes to external tables are beyond the control of the compactor. It appears that a custom compaction process is our best option; a sketch of what I mean follows below. I do not want to use the GoldenGate-Kafka integration (with the log compaction feature), as it requires retrieving the data from the stream each time to build the latest image of the table and make it accessible via Hive. Can someone please advise on custom compaction, and whether there are better alternatives given the GoldenGate replication described above?
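By custom compaction I mean something along the lines of the sketch below, run periodically through the Hive JDBC driver. It is only a rough sketch: the table names, the key, and the GoldenGate columns (op_type, op_ts) are placeholders for whatever our feed actually contains.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class DailyCompactionSketch {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Placeholder HiveServer2 URL and credentials.
        String url = "jdbc:hive2://hiveserver2-host:10000/default";

        // Keep only the latest image of each key for the day and drop the deletes.
        String compact =
            "INSERT OVERWRITE TABLE orders_compacted PARTITION (dt='2017-03-05') " +
            "SELECT order_id, order_status, last_update FROM ( " +
            "  SELECT t.*, row_number() OVER (PARTITION BY order_id ORDER BY op_ts DESC) AS rn " +
            "  FROM orders_cdc t WHERE dt='2017-03-05' " +
            ") latest WHERE rn = 1 AND op_type <> 'D'";

        try (Connection conn = DriverManager.getConnection(url, "etl_user", "");
             Statement st = conn.createStatement()) {
            st.execute(compact);
        }
    }
}

The idea is simply to materialize the latest image per key for a partition, leaving the raw GoldenGate files eligible for cleanup afterwards.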
- Tags:
- compaction
- Data Processing
- HDFS
- Hive
12-29-2016
08:39 PM
1 Kudo
I am probably throwing this out a little early, without looking at the documentation in detail. I am wondering if anyone has used the REST APIs to identify the flowfiles active within a particular process group, or perhaps you could advise on the appropriate REST APIs that can be used for this. One additional question: our NiFi environment is Kerberized, so do the REST APIs require authentication, and do we need to maintain the token somewhere in a cache to make additional API calls? Objective: make sure that no flowfile is stuck in a particular process group.
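For context, the kind of call I have in mind is below (the host and process group id are placeholders; in our Kerberized setup some token or SPNEGO negotiation would presumably have to be layered on top, which is part of my question):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class ProcessGroupQueueCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder host and process group id.
        URL url = new URL("http://nifi-host:8080/nifi-api/flow/process-groups/"
                + "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/status");

        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");

        // The JSON response carries an aggregate snapshot with queue-related fields
        // (e.g. flowFilesQueued), which is what "stuck flowfiles" boils down to.
        try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}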
12-19-2016
11:43 PM
My use case requires writing in a delimited format, and INSERT OVERWRITE LOCAL fits perfectly for this. I wish there were a way to custom-delimit the content retrieved through the SelectHiveQL processor, so I couldn't opt for it. I agree that writing to HDFS instead is a good option. I will attempt to modify the process, but I still wonder why it doesn't work when writing to the local filesystem through NiFi.
12-19-2016
11:37 PM
What I understand is that Hive temporarily writes into /tmp on HDFS and then copies the result over to the local directory. So, in Ranger, recursive access to the /tmp HDFS folder has been granted. But the issue still persists with NiFi.
12-19-2016
11:33 PM
RouteOnContent with the regex (^\{\}) helped identify the empty JSON object.
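In plain Java the check amounts to the following (just to show what the expression matches; RouteOnContent applies it against the flowfile content):

import java.util.regex.Pattern;

public class EmptyJsonCheck {
    public static void main(String[] args) {
        // Matches content that starts with an empty JSON object.
        Pattern emptyObject = Pattern.compile("^\\{\\}");
        System.out.println(emptyObject.matcher("{}").find());                        // true, route away
        System.out.println(emptyObject.matcher("[{\"dt\":\"2016-12-14\"}]").find()); // false, continue to SplitJson
    }
}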
12-19-2016
10:29 PM
Yes, the incoming object is always an array when there is at least one row, for example: [{"dt": "2016-12-14"},{"dt": "2016-12-15"},{"dt": "2016-12-16"}]. When empty, it comes as a JSON object: {}
12-19-2016
09:18 AM
When the incoming flowfile contains an empty JSON object {}, it fails with the error below. Is there a way to skip flowfiles with such content? SplitJson[id=01581338-695c-1081-d7c3-a2b84d03b57d] The evaluated value {} of $ was not a JSON Array compatible type and cannot be split.
12-18-2016
11:41 PM
Sample file, NiFi flow, and the flowfile content after FetchFile are shown in the attached screenshots.
12-18-2016
09:36 AM
1 Kudo
When you retrieve the content of a file through the above processor, we can see a newline character appended at the end of the flowfile content. Is this expected behavior? Is there any reason for it, and how can it be avoided?
12-18-2016
07:25 AM
Ranger is specific to HDFS here. I am referring to the issue with Hive writing to the local file system; see the Hive statement in my summary above. And the /tmp directory is accessible by any user.
12-16-2016
10:12 PM
I cannot get the following query to run using the PutHiveQL processor; it fails with a permission denied exception. I see that the processor emulates the same behavior as Beeline. I have been able to run the query from the Hive CLI and it writes to a file as expected. So we know the Hive shell is an option, but can you let me know if there is any specific setting that causes this behavior in Beeline (HiveServer2) and prevents writing to the local filesystem?
insert overwrite local directory '/tmp' select current_date from dual
Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [xxxxx] does not have [WRITE] privilege on [/tmp] (state=42000,code=40000)
12-16-2016
03:28 AM
A simple reflect function like the one below, which works in the Hive shell, cannot be executed through the PutHiveQL processor. Do you know if this is expected behavior?
SELECT reflect("java.util.UUID", "nameUUIDFromBytes", "test") from dual
12-15-2016
06:50 PM
I have a process group, and from the Input Port I just cannot create a connection to the ExecuteProcess processor. Can you please advise on any specific reason for that?
12-15-2016
06:47 PM
Thanks, that appears to be a relatively simple change. I will give it a try.
12-15-2016
06:46 PM
The only option available with Avro is to split it into JSON, but from there, any thoughts on how that can be emitted in a delimited format?
12-15-2016
12:47 AM
After querying from Hive, I want to write into a file with a custom delimiter, say |. Is there a way to achieve that? I cannot use ReplaceText to substitute the custom delimiter, since the column values may contain commas. One other option I see is to have the query produce a single string with the custom delimiter, but that encloses the string-type column values in double quotes, which voids the required file format.