
Hive Replication: regular expressions on table names?



I am trying to get Hive replication working and am not yet sure I fully understand the options.


I can see that if you specify an individual database then you can leave the table specification blank - or specify a regular expression for the tables. I found that "*" was not an acceptable regular expression, so I wonder what rules they are using for that. 


However, I really need wildcards to specify databases as well as tables. Is this possible?


For instance imagine that I have 100 databases called




and another 100 each called







My choices right now are to replicate all of them all at once, or replicate them one database at a time. This is a nightmare due to the large number of databases. Ideally I want a replication job which does one specific area which I can schedule according to some business decision. 


Am I right in thinking that I cannot have multiple Hive replications going on at the same time even if they are totally different databases?




Re: Hive Replication: regular expressions on table names?




"*" is not a valid regex.  ".*" may be what you were going for...
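To illustrate the difference, here is a minimal sketch in Python. It assumes the replication tool uses a conventional regex engine, in which "*" is a quantifier that needs something before it to repeat; the thread does not confirm which engine is actually used. The table names and the helper functions `is_valid_regex` and `matching_tables` are hypothetical, invented only for this example.

```python
import re

# Hypothetical table names, invented for this illustration only.
tables = ["sales_2019", "sales_2020", "inventory"]


def is_valid_regex(pattern):
    """Return True if `pattern` compiles as a regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # A bare "*" lands here: the quantifier has nothing to repeat.
        return False


def matching_tables(pattern, names):
    """Return the names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


print(is_valid_regex("*"))                  # shell-style wildcard, not a regex
print(is_valid_regex(".*"))                 # "any character, zero or more times"
print(matching_tables(".*", tables))        # every table
print(matching_tables("sales_.*", tables))  # only tables starting with "sales_"
```

In other words, "*" is a shell-style glob, whereas the regex equivalent is ".*", and a prefix such as "sales_.*" narrows the match to one family of names.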


I am not quite clear on your business requirement, but I think you are saying that you may want to create 10 replication schedules, each replicating a chunk of 10 of your area databases... akin to this: