Member since: 07-07-2016
Posts: 53
Kudos Received: 6
Solutions: 1
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 765 | 06-27-2016 10:00 PM |
01-11-2018
08:37 PM
Agreed, thanks for the suggestion. For now I have a workaround: changing the run schedule from 0 seconds to 1 second, and I no longer see the leaseholder exception. There is a little more latency writing to HDFS than with 0 seconds, but the error is gone. I will work on your suggestion for production. Thanks for the help! Srikaran
01-11-2018
06:32 PM
@Bryan Bende I like the MergeContent option you suggested, but please clarify this. In production, surveys will arrive in real time, and as soon as a customer writes a survey we want to see it in HDFS. So my use case is: during a 24-hour period (one day) I want to see only one file in HDFS, and as soon as surveys are posted I should see them in that file. If I use the MergeContent processor, will that still be considered real time? I am guessing it will wait until the data reaches a certain threshold, at which point the merge will happen and the result will be written to HDFS? During a day there will be times with no surveys at all, a bunch of surveys arriving at the same time, or one survey per second. Thanks, Srikaran.
01-11-2018
05:46 PM
@Bryan Bende Hi Bryan. We are testing this in DEV, which has only 1 NiFi node; however, the PutHDFS target cluster has 4 datanodes. In prod we will have 2 NiFi nodes and 5 datanodes. Thanks
01-11-2018
05:29 PM
Hi, we have a NiFi flow where we source social media surveys from an API and write them to HDFS via the PutHDFS processor with the conflict resolution strategy set to "append". This flow works if surveys arrive one by one with a delay of a second or two. We want to test some 20,000 surveys all arriving at once, and the PutHDFS processor is failing for this scenario. The error is given below:

WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.append: failed to create file XXXXXXXXXXXX for DFSClient_NONMAPREDUCE_XXXXXXXXX because current leaseholder is trying to recreate file.
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:user@XXXXXXXXX (auth:KERBEROS) cause:org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file XXXXXXXXXXX for DFSClient_NONMAPREDUCE_XXXXXXXX for client XXXXXXXX because current leaseholder is trying to recreate file.
INFO org.apache.hadoop.ipc.Server: IPC Server handler 14 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.append from XXXXXXXX Call#XXXXX Retry#0: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file XXXXXXXXX for DFSClient_NONMAPREDUCE_XXXXXXXX because current leaseholder is trying to recreate file.

With these exceptions all the records get blocked in the NiFi queue to PutHDFS and eventually nothing is written to HDFS. Is there a way to configure the NiFi PutHDFS processor to accommodate this use case? Right now it is configured with the "Timer Driven" scheduling strategy, 1 concurrent task, a run schedule of 0 seconds, and a yield duration of 1 second. Please suggest. Thanks, Srikaran
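For context on why simultaneous appends collide, here is a minimal sketch in plain Java (the path, payload, and class name are assumptions, not part of the flow above): HDFS hands out a write lease on a file to one client at a time, so a second append opened while the first is still active fails with AlreadyBeingCreatedException. Serializing the appends, or batching records upstream before PutHDFS, avoids the collision.

```java
// Minimal sketch: HDFS allows only one appender per file at a time (single-writer
// lease), so appends must be serialized. Path and payload below are hypothetical.
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SerializedAppend {
    private static final Object LEASE_GUARD = new Object();

    static void appendLine(FileSystem fs, Path file, String line) throws Exception {
        // A second fs.append() on the same file while this stream is open would fail
        // with AlreadyBeingCreatedException, so appends are funneled through one lock.
        synchronized (LEASE_GUARD) {
            try (FSDataOutputStream out = fs.append(file)) {
                out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
                out.hsync(); // persist to the datanodes before the lease is released
            }
        }
    }

    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        appendLine(fs, new Path("/data/surveys/2018-01-11.json"), "{\"survey\": 1}"); // hypothetical
    }
}
```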
Labels:
- Apache NiFi
01-04-2018
06:37 PM
@Karl Fredrickson Hi Karl, same issue after a stop and restart. I tried 1 hour and 4 hours for the Kerberos relogin period, as I am using the same relogin period for FetchHDFS/ListHDFS. This is happening only for GetHDFS. I am assuming the GetHDFS processor is trying to delete, move, or write, which might need some other permissions. The HDFS files are owned by hive:hive with 771 permissions. With the same 771 permissions and hive:hive ownership, FetchHDFS and ListHDFS are working. Thanks
01-04-2018
05:38 PM
Hi, I am using the FetchHDFS NiFi processor, which runs fine to fetch an exact HDFS file. I want to get all HDFS files under a directory, hence I am using GetHDFS with the "Keep Source File" option set to "True". But I am getting a Kerberos error:

ERROR [Timer-Driven Process Thread-1] o.apache.nifi.processors.hadoop.GetHDFS GetHDFS[id=XXXXXXXXXX] Error retrieving file hdfs://XXXXXXXXXXXXXXXXXXXX.0. from HDFS due to java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt): {}
java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:311)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:287)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:287)

I am wondering why the same Kerberos credentials work for FetchHDFS/ListHDFS but not for GetHDFS. Does GetHDFS need additional setup? Please suggest. Thanks, Srikaran
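To rule out the keytab and principal themselves, here is a minimal standalone sketch (the principal, keytab path, and directory are assumptions) that performs the same Kerberos login with the Hadoop client outside NiFi and lists the target directory. If this works with the same credentials, the problem is more likely in how GetHDFS is configured than in the ticket itself.

```java
// Minimal sketch: verify that a principal/keytab pair can authenticate and list the
// directory GetHDFS points at. All names below are hypothetical placeholders.
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosHdfsCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);

        UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
                "user@EXAMPLE.COM", "/etc/security/keytabs/user.keytab");

        ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
            FileSystem fs = FileSystem.get(conf);
            for (FileStatus status : fs.listStatus(new Path("/data/in"))) {
                System.out.println(status.getPath() + " " + status.getPermission());
            }
            return null;
        });
    }
}
```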
Labels:
- Apache NiFi
12-06-2017
06:26 PM
1 Kudo
@Timothy Spann Thanks a lot, these are very helpful. Let me test the flow and I will update accordingly. Thanks
12-06-2017
06:24 PM
1 Kudo
@anarasimham It looks like GetHDFS will replace the HDFS file. I am planning to use FetchHDFS and then the InvokeHTTP processor. For now I am converting the Avro file to JSON on the Hadoop end, then fetching the JSON and posting it. I will test Avro and other formats directly and will update. Thanks!
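For the Avro-to-JSON step, here is a minimal sketch using the Avro Java API (the input file name is an assumption): it reads an Avro data file and emits one JSON object per record, roughly what the conversion on the Hadoop side produces.

```java
// Minimal sketch: convert an Avro data file to JSON records with the Avro Java API.
// The input file name is hypothetical.
import java.io.File;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.io.JsonEncoder;

public class AvroToJson {
    public static void main(String[] args) throws Exception {
        File avroFile = new File("surveys.avro"); // hypothetical input
        GenericDatumReader<GenericRecord> datumReader = new GenericDatumReader<>();
        try (DataFileReader<GenericRecord> reader = new DataFileReader<>(avroFile, datumReader)) {
            Schema schema = reader.getSchema();
            GenericDatumWriter<GenericRecord> writer = new GenericDatumWriter<>(schema);
            JsonEncoder encoder = EncoderFactory.get().jsonEncoder(schema, System.out);
            for (GenericRecord record : reader) {
                writer.write(record, encoder); // one JSON object per Avro record
            }
            encoder.flush();
        }
    }
}
```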
12-04-2017
07:17 PM
1 Kudo
Hello. I have an HDFS file whose data needs to be posted to an outside URL (HTTPS). I have the username and password for the URL, and I can post a sample JSON via Postman from my browser using that username and password. Now I have to use NiFi for this flow. Please let me know exactly which NiFi processors I should use to get the data from HDFS and post it to the URL. Also, kindly let me know what format the HDFS data should be in for this kind of use case. Thanks, Srikaran
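To make the flow concrete, here is a minimal plain-Java sketch of the same idea (the URL, credentials, and HDFS path are assumptions): read a JSON file from HDFS and POST it with basic auth, roughly what a FetchHDFS step followed by InvokeHTTP does inside NiFi, as discussed in this thread.

```java
// Minimal sketch: read a JSON file from HDFS and POST it over HTTPS with basic auth.
// URL, credentials, and the HDFS path are hypothetical placeholders.
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PostHdfsJson {
    public static void main(String[] args) throws Exception {
        // Read the whole HDFS file into memory (fine for small JSON payloads).
        FileSystem fs = FileSystem.get(new Configuration());
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        try (FSDataInputStream in = fs.open(new Path("/data/survey.json"))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                body.write(buf, 0, n);
            }
        }

        // POST it with HTTP basic authentication.
        HttpURLConnection conn =
                (HttpURLConnection) new URL("https://example.com/api/surveys").openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        String auth = Base64.getEncoder()
                .encodeToString("user:password".getBytes(StandardCharsets.UTF_8));
        conn.setRequestProperty("Authorization", "Basic " + auth);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.toByteArray());
        }
        System.out.println("HTTP status: " + conn.getResponseCode());
    }
}
```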
Labels:
- Apache NiFi
09-14-2016
03:06 PM
@Predrag Minovic Great options. It looks like, out of all the options above, the 2nd ZK quorum would have to be installed manually outside Ambari and Kafka configured accordingly? If that's the case, when I do upgrades on this cluster in the future I will have to handle the upgrade of the 2nd manual ZK quorum as a separate effort, right? And I like the two-cluster solution, but what if some business logic on cluster 1 depends on Kafka on cluster 2? In that case I guess the two-cluster solution will not work, right? Please confirm! Thanks, Sri.
09-13-2016
07:10 PM
Hi,
I am planning to build an HDP 2.4.2 Kerberized cluster via Ambari Blueprints, and I am going to change the blueprint to have 6 ZooKeepers. The reason I want 6 ZKs is that I want two ZK quorums of 3 ZKs each: one quorum for HDFS NameNode HA, HBase, and the other services, and a second quorum dedicated to Kafka alone. I am assuming that when I build the cluster with 6 ZKs it will initially create only one ZK quorum containing all 6. Can I change it to 2 ZK quorums after cluster installation from zkCli, or is there an option in the Ambari blueprint itself to create 2 ZK quorums with 3 ZK servers in each? Please advise! Thanks
Labels:
- Apache Kafka
06-27-2016
10:00 PM
@milind pandit
This is what I am giving for worker.childopts; please correct me if you see anything weird.
06-27-2016
08:02 PM
Hi, we are hitting the Storm runtime errors below:
Labels:
- Apache Storm
06-07-2016
04:34 PM
@Benjamin Leonhardi Thanks, this makes sense. So it's always better to set the value to "True", right?
06-07-2016
02:02 PM
Let me put the question this way: if I have hive.execution.engine=tez, why do I need to set the property hive.server2.tez.initialize.default.sessions to "true"? What's the use case for this property? I ran multiple tests, but my hive.execution.engine property drives how the query runs, not this default-sessions property.
06-07-2016
01:43 PM
@Ted Yu Thanks, it worked.
06-07-2016
02:06 AM
Accepting this answer as well!
06-07-2016
12:21 AM
@Ted Yu The error message shows 60 seconds as the default. Do we need to bump it to 90 seconds; is that what you mean? I don't see this property currently set, so I guess it's taking the default. Please confirm.
06-07-2016
12:20 AM
@Ted Yu Region servers are good.
06-07-2016
12:12 AM
Error: org.apache.hadoop.hbase.client.ScannerTimeoutException: 70218ms passed since the last invocation, timeout is currently set to 60000
at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:434)
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364)
at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:205)
at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:147)
at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$1.nextKeyValue(TableInputFormatBase.java:216)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
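For reference on the timeout discussion above, here is a minimal sketch (the table name and values are assumptions) that raises hbase.client.scanner.timeout.period on the client configuration and lowers scanner caching so each next() call returns to the scanner sooner; for the MapReduce job in the stack trace, the same property would go into the job's configuration.

```java
// Minimal sketch: raise the HBase client scanner timeout above the 60 s default and
// shrink scanner caching so each RPC renews the lease more often. The table name is
// hypothetical.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class ScanWithLongerTimeout {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.setInt("hbase.client.scanner.timeout.period", 90000); // 90 s instead of 60 s

        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("my_table"))) {
            Scan scan = new Scan();
            scan.setCaching(100); // fewer rows per next() call -> less time between RPCs
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result row : scanner) {
                    System.out.println(row); // process the row here
                }
            }
        }
    }
}
```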
Tags:
- Data Processing
- HBase
Labels:
- Apache HBase
06-07-2016
12:08 AM
Sure. Thanks @Ted Yu
06-07-2016
12:08 AM
@Enis This worked. We just omitted TTL and created the table. After that we did a describe on the table and saw TTL as FOREVER. Thanks, Enis
06-06-2016
11:51 PM
@Ted Yu
Same error, Ted. Please see below:
ERROR: For input string: "MAX_VALUE"
06-06-2016
11:36 PM
@Ted Yu Please see above
06-06-2016
11:35 PM
1 Kudo
We are trying to create the table in the HBase shell as below and pasted the error:
ERROR: For input string: "FOREVER"
After this error we created the table by removing TTL and adding hbase.store.delete.expired.storefile instead, and it succeeded:
CONFIGURATION => {'hbase.store.delete.expired.storefile' => 'false'}}
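For reference, a minimal sketch of the same table creation through the HBase Java admin API (the table and column family names are assumptions): HConstants.FOREVER is just Integer.MAX_VALUE seconds, so an explicit numeric TTL of 2147483647 should behave like the 'FOREVER' literal, and omitting TTL also defaults to forever, which matches what worked in the reply above.

```java
// Minimal sketch: create the table via the HBase Java admin API with an explicit
// "never expire" TTL. HConstants.FOREVER is Integer.MAX_VALUE seconds; omitting TTL
// gives the same default. Table and column family names are hypothetical.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreateTableForeverTtl {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            HTableDescriptor table = new HTableDescriptor(TableName.valueOf("my_table"));
            HColumnDescriptor cf = new HColumnDescriptor("cf");
            cf.setTimeToLive(HConstants.FOREVER); // Integer.MAX_VALUE seconds, i.e. never expire
            table.addFamily(cf);
            admin.createTable(table);
        }
    }
}
```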
06-06-2016
11:19 PM
If both are not the same, how are they different? Can you please let me know?
Labels:
- Apache Tez
06-06-2016
11:17 PM
Because the customer says TTL => 'FOREVER' worked for them in HDP 2.2 but it is not working in HDP 2.4 when creating HBase tables, so they have to give CONFIGURATION => {'hbase.store.delete.expired.storefile' => 'false'} instead of TTL.