Hadoop services high availability

Contributor

Hello all! I need to make some Hadoop services highly available, and I was wondering about the best way to achieve that. Here is the list of services: HBase Lily Indexer, Sqoop, HiveServer2. Thanks a lot.

tazimehdi.com
Accepted Solution

Re: Hadoop services high availability

Sqoop is a client only, so you can have Sqoop installed on multiple nodes behind an IP load balancer.

I don't know about the Lily indexer (part of the HDP Search Connector). Documentation is here: https://doc.lucidworks.com/lucidworks-hdpsearch/2.3/Guide-Jobs.html#_hbase-indexer, but I'm not sure whether it has HA out of the box with SolrCloud.
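
Since every node with the Sqoop client installed is interchangeable, a launcher script can also just try the nodes in turn. A minimal sketch of that idea, not anything shipped with Sqoop or HDP: the host names and the caller-supplied `run` function (for example, one that runs `sqoop` on the host over SSH and returns its exit code) are assumptions made for illustration.

```python
from typing import Callable, Sequence


def run_with_failover(hosts: Sequence[str], run: Callable[[str], int]) -> str:
    """Try each host in order; return the first one where `run` exits with 0."""
    failures = []
    for host in hosts:
        try:
            code = run(host)
        except OSError as exc:  # e.g. the host is unreachable
            failures.append(f"{host}: {exc}")
            continue
        if code == 0:
            return host
        failures.append(f"{host}: exit code {code}")
    raise RuntimeError("no Sqoop client node succeeded: " + "; ".join(failures))
```

Passing `run` in as a parameter keeps the failover logic independent of how the command is actually dispatched to a node.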


Re: Hadoop services high availability

@Mehdi TAZI

There is a High Availability section at http://docs.hortonworks.com (choose your version).

For the latest HDP version (2.3.4), see this:

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-...


Re: Hadoop services high availability

Contributor

Hello! First of all, thanks a lot for your answer. I had already checked the high availability section of the official documentation, but it has no information about how to make the Lily indexer or Sqoop highly available.

tazimehdi.com

Re: Hadoop services high availability

Contributor

Are you sure that Sqoop is only a client? I agree that the processing is already HA, because it uses YARN. But what about the Sqoop metastore and the job tool?

tazimehdi.com

Re: Hadoop services high availability

See this screenshot of Sqoop in Ambari. It is a client only.

There is no Sqoop metastore service. By default Sqoop uses a Derby database, but if you want, you can use an external MySQL or PostgreSQL database for Sqoop and then configure that database in HA mode.

1220-screen-shot-2016-01-07-at-142741.png
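
To make saved jobs independent of any one client node, every node has to point at the same metastore. A rough sketch of building such a `sqoop job` command line with the `--meta-connect` option: the helper name, host name, and job name are made up for illustration, and the URL uses the HSQLDB form of Sqoop's own metastore tool. A MySQL or PostgreSQL JDBC URL would go in the same place, assuming your Sqoop version accepts it.

```python
from typing import List


def sqoop_job_args(meta_connect: str, action: str, job_name: str = "") -> List[str]:
    """Build a `sqoop job` argument list that targets a shared metastore."""
    if action not in {"list", "show", "exec", "delete"}:
        raise ValueError(f"unsupported action: {action}")
    args = ["sqoop", "job", "--meta-connect", meta_connect, f"--{action}"]
    if action != "list":
        # --show, --exec and --delete all take the saved job's name.
        if not job_name:
            raise ValueError(f"--{action} needs a job name")
        args.append(job_name)
    return args


# Hypothetical metastore URL; replace with the JDBC URL of your own database.
META_URL = "jdbc:hsqldb:hsql://metastore.example.com:16000/sqoop"
print(" ".join(sqoop_job_args(META_URL, "exec", "nightly_import")))
```

The resulting list can be passed to `subprocess.run` on any client node, so a saved job does not depend on the node where it was created.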


Re: Hadoop services high availability

Contributor

I didn't mention that I'm using Sqoop2 with the Sqoop server.

tazimehdi.com

Re: Hadoop services high availability

Hortonworks does not currently support Sqoop2. The Sqoop version supported in the latest HDP (2.3.4) is:

  • Apache Sqoop 1.4.6

Re: Hadoop services high availability

Mentor
@Mehdi TAZI

Lily is not an HDP product; it was contributed by NGDATA. Perhaps you can reach out to their mailing list for HA advice? You may have to come up with your own HA setup for it, in which case you should certainly share your docs here for the benefit of the community! As for HBase, make sure you run multiple master servers.
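
For the HBase part, a small sketch of one way to declare extra masters. It assumes a manually managed HBase install where `start-hbase.sh` reads `conf/backup-masters` (one host per line); on an Ambari-managed cluster you would normally add a standby HBase Master through Ambari instead. The function name and host names are hypothetical.

```python
from typing import Iterable


def render_backup_masters(hosts: Iterable[str], active_master: str) -> str:
    """Return the text of conf/backup-masters: one backup master host per line."""
    seen = []
    for host in hosts:
        host = host.strip()
        # Skip blanks, duplicates, and the host that already runs the primary master.
        if host and host != active_master and host not in seen:
            seen.append(host)
    if not seen:
        raise ValueError("at least one backup master host is required for HA")
    return "\n".join(seen) + "\n"
```

For example, `render_backup_masters(["m2.example.com", "m3.example.com"], "m1.example.com")` yields two lines, one per standby master.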


Re: Hadoop services high availability

Contributor

Thanks :) I'll look into Lily.

tazimehdi.com