QueryDatabaseTable processor fetches duplicate records although it runs in primary node mode in a 3-node cluster

@Matt Clarke @Matt Burgess

Hi all, I am using the QueryDatabaseTable processor in incremental mode, with execution set to primary node, to fetch records from a source database. However, the processor sometimes fetches duplicate records. Is this because the state of the last fetched records is not propagated through ZooKeeper between instances of the same processor in the cluster, so that another QueryDatabaseTable instance fetches the same records again? (I have read that processors running on the primary node share state.) Could this be caused by a lag in reading state from ZooKeeper? Please suggest! Or should I avoid QueryDatabaseTable in cluster mode and use GenerateTableFetch instead, as suggested in this post: https://community.hortonworks.com/questions/203372/querydatabase-to-run-in-distributed-manner.html
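To make the suspected failure mode concrete, here is a minimal toy model of incremental fetching keyed on a maximum-value column. It is not NiFi code and does not show how NiFi or ZooKeeper actually handle state; the names `ROWS`, `fetch_incremental`, and `max_id` are hypothetical. It only illustrates that a run which reads a stale copy of the stored maximum would return the same rows a second time.

```python
# Toy model of incremental fetching keyed on a maximum-value column.
# NOT NiFi code: ROWS, fetch_incremental and "max_id" are hypothetical names.

ROWS = [{"id": i, "name": f"row-{i}"} for i in range(1, 6)]  # ids 1..5


def fetch_incremental(rows, state):
    """Return rows whose id exceeds the stored maximum, plus the updated state."""
    last_max = state.get("max_id", 0)
    new_rows = [r for r in rows if r["id"] > last_max]
    new_state = dict(state)  # copy, so the caller's state is left untouched
    if new_rows:
        new_state["max_id"] = max(r["id"] for r in new_rows)
    return new_rows, new_state


# Case 1: the second run sees the updated state, so nothing is fetched twice.
state = {}
first, state = fetch_incremental(ROWS, state)
second, state = fetch_incremental(ROWS, state)
fresh_ids = [r["id"] for r in first + second]

# Case 2: the second run reuses a stale (empty) copy of the state,
# so it fetches the same rows again.
stale_state = {}
first, _updated = fetch_incremental(ROWS, stale_state)
second, _ = fetch_incremental(ROWS, stale_state)  # stale copy reused
stale_ids = [r["id"] for r in first + second]

print(fresh_ids)
print(stale_ids)
```

Whether such stale reads actually occur in a given cluster is exactly the open question here; the sketch only shows why they would produce duplicates if they did.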

Re: QueryDatabaseTable processor fetches duplicate records although it runs in primary node mode in a 3-node cluster

@sri chaturvedi

There is a similar thread on HCC about duplicate records; it describes most of the cases that lead to duplicates.

Please consider the solutions mentioned there. I hope they will prevent the data duplication :-).
