
Hadoop services high availability


Contributor

Hello all! I have to make some Hadoop services highly available, so I was wondering about the best way to achieve that. Here is the list of services: HBase Lily Indexer, Sqoop, HiveServer2. Thanks a lot.

tazimehdi.com

Re: Hadoop services high availability

@Mehdi TAZI

There is a High Availability section at http://docs.hortonworks.com (choose your version).

For the latest HDP version (2.3.4), see this:

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-...

Re: Hadoop services high availability

Contributor

Hello! First of all, thanks a lot for your answer. I already checked the High Availability section of the official documentation, but there is no information about how to make the Lily indexer or Sqoop highly available.


Re: Hadoop services high availability

Sqoop is client-only, so you can install Sqoop on multiple nodes behind an IP load balancer.

I don't know about the Lily indexer (part of the HDP Search connector). Documentation is here: https://doc.lucidworks.com/lucidworks-hdpsearch/2.3/Guide-Jobs.html#_hbase-indexer, but I'm not sure whether it has HA out of the box with SolrCloud.

(Accepted solution)
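The idea of several interchangeable Sqoop nodes can also be illustrated from the caller's side: try the nodes in order and use the first one that is up. This is only a minimal sketch, not a replacement for the load balancer. The hostnames are hypothetical, and treating a successful TCP connection to the SSH port (22) as "node is usable" is an assumption. The `probe` parameter exists so the selection logic can be exercised without a network.

```python
import socket
from typing import Callable, Iterable, Optional


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def pick_sqoop_node(
    hosts: Iterable[str],
    port: int = 22,
    probe: Callable[[str, int], bool] = tcp_probe,
) -> Optional[str]:
    """Return the first host that answers the probe, or None if none do."""
    for host in hosts:
        if probe(host, port):
            return host
    return None


# Hypothetical hostnames; replace with the nodes where Sqoop is installed.
SQOOP_NODES = ["edge1.example.com", "edge2.example.com"]
```

The selected host would then be the target of whatever mechanism launches the Sqoop command (for example SSH); that part is left out here.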

Re: Hadoop services high availability

Contributor

Are you sure that Sqoop is only a client? I agree that the processing is already HA because it uses YARN, but what about the Sqoop metastore and the job tool?


Re: Hadoop services high availability

See this screenshot of Sqoop in Ambari. It is client-only.

There is no Sqoop metastore service. By default Sqoop uses a Derby database, but if you want, you can use an external MySQL or PostgreSQL database for Sqoop and then configure that database in HA mode.

1220-screen-shot-2016-01-07-at-142741.png
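If saved jobs are kept in a shared database rather than the local default, each Sqoop client has to be pointed at it. Below is a rough sketch of building such a command line, assuming Sqoop 1's `sqoop job` tool and its `--meta-connect` option. The helper name `sqoop_job_argv` is hypothetical, and the JDBC URL in the usage note is a placeholder, not a value from this thread.

```python
from typing import List, Optional


def sqoop_job_argv(
    meta_connect: str, action: str, job_name: Optional[str] = None
) -> List[str]:
    """Build an argv list for `sqoop job` against a shared metastore.

    action is one of "list", "show", "exec", "delete";
    every action except "list" needs a job name.
    """
    if action not in {"list", "show", "exec", "delete"}:
        raise ValueError(f"unsupported action: {action}")
    argv = ["sqoop", "job", "--meta-connect", meta_connect, f"--{action}"]
    if action != "list":
        if not job_name:
            raise ValueError(f"--{action} requires a job name")
        argv.append(job_name)
    return argv
```

For example, `sqoop_job_argv("jdbc:hsqldb:hsql://metastore.example.com:16000/sqoop", "list")` yields a list that can be passed to `subprocess.run(argv, check=True)` on a node where Sqoop is installed. The host in that URL is made up.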

Re: Hadoop services high availability

Contributor

I didn't mention that I'm using Sqoop2 with the Sqoop server.


Re: Hadoop services high availability

Hortonworks does not currently support Sqoop2. The Sqoop version supported in the latest HDP (2.3.4) is:

  • Apache Sqoop 1.4.6

Re: Hadoop services high availability

Mentor
@Mehdi TAZI

Lily is not an HDP product; it was contributed by NGDATA. Perhaps you can reach out to their mailing list for HA advice? You may have to come up with your own HA setup for it, in which case you should certainly share your docs here for the benefit of the community! As for HBase, make sure you run multiple master servers.

Re: Hadoop services high availability

Contributor

Thanks :), I'll look into Lily.
