I am trying to get Hive replication working and am not yet sure I fully understand the options.
I can see that if you specify an individual database, you can either leave the table specification blank or specify a regular expression for the tables. I found that "*" was not accepted as a regular expression, so I wonder which regular expression rules are being used there.
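One possible explanation for the "*" problem: in most regular expression dialects, "*" is a quantifier that must follow something, so on its own it is invalid, and the match-everything pattern is ".*" instead. The sketch below uses Python's re module purely as an illustration. It is an assumption on my part that the replication tool follows similar rules; I have not confirmed which regex engine it uses.

```python
import re


def is_valid_regex(pattern: str) -> bool:
    """Return True if Python's re module can compile the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# A bare "*" is a quantifier with nothing before it to repeat.
print(is_valid_regex("*"))   # False
# ".*" means "any character, zero or more times".
print(is_valid_regex(".*"))  # True
```

If the tool behaves the same way, ".*" rather than "*" would be the pattern to try for "all tables".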
However, I really need wildcards to specify databases as well as tables. Is this possible?
For instance, imagine that I have 100 databases with names of the form
area1_something_db
and another 100 for each of the following forms
area2_something_db
area3_something_db
area4_something_db
area5_something_db
My choices right now are to replicate all of them at once or to replicate them one database at a time. The second option is a nightmare because of the large number of databases. Ideally I want a replication job that covers one specific area, which I can then schedule according to some business decision.
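To make the per-area selection concrete, here is a minimal sketch of the matching I have in mind. It is plain Python and not a feature of the replication tool; the function name and the sample database names are hypothetical and only stand in for the real ones.

```python
import re


def databases_for_area(all_databases, area):
    """Return the database names of the form <area>_<something>_db."""
    # re.escape guards against special characters in the area name.
    pattern = re.compile(rf"{re.escape(area)}_.+_db")
    return [name for name in all_databases if pattern.fullmatch(name)]


# Hypothetical database names, for illustration only.
names = ["area1_sales_db", "area1_hr_db", "area2_sales_db", "default"]
print(databases_for_area(names, "area1"))
```

With the sample names above, only the two area1 databases are selected, which is the set I would like a single scheduled job to replicate.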
Am I right in thinking that I cannot have multiple Hive replications running at the same time, even if they cover totally different databases?