Support Questions

Hadoop lib synchronisation

New Contributor
I'm using Cloudera CDH 5 and working through the Twitter analysis example. After building flume-sources-1.0-SNAPSHOT.jar and hive-serdes-1.0-SNAPSHOT.jar, I manually placed these jars in the following directories on all nodes (4) of my cluster:

/usr/share/cmf/lib/plugins/flume-sources-1.0-SNAPSHOT.jar
/opt/cloudera/parcels/CDH/lib/hadoop/lib/hive-serdes-1.0-SNAPSHOT.jar
/opt/cloudera/parcels/CDH/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar

After these steps, Flume was able to stream JSON from Twitter and Hive was able to analyze it.

Question: does the Cloudera Manager UI have a tool for distributing jars between nodes? Or should I use some other tool to distribute jars to the nodes in my cluster? With 4 nodes I can place the jars by hand, but what about 10 or more nodes?
2 REPLIES

Re: Hadoop lib synchronisation

Master Collaborator

I'm not aware of any mechanism within Hue or CM to ship custom or add-on jars to the nodes in your cluster in order to enable extra functionality like you're describing. People would typically handle this with a simple shell script, using scp to iterate over the node list and drop the jar into the correct directory on each node. Beyond that, many customers who have large clusters manage their local filesystems, libraries, etc. with Puppet or some other configuration management tool that automates the administration of large clusters.
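
For illustration, a minimal sketch of that scp approach (the nodes.txt file, the ssh user, and the target directory shown here are assumptions to adapt to your own cluster):

#!/bin/bash
# Copy the custom serde jar to every node listed in nodes.txt (one hostname per line).
JAR=hive-serdes-1.0-SNAPSHOT.jar
TARGET_DIR=/opt/cloudera/parcels/CDH/lib/hive/lib
while read -r node; do
  echo "Copying $JAR to $node:$TARGET_DIR"
  scp "$JAR" "root@$node:$TARGET_DIR/"
done < nodes.txt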


Re: Hadoop lib synchronisation

New Contributor

Can I somehow configure this to use HDFS instead? Something like:

hdfs://namenode:8020/lib_dir
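
One illustrative sketch (not an answer given in this thread): the Flume and Hive service classpaths generally expect local jar files rather than hdfs:// paths, but HDFS can still serve as the central staging area that each node pulls from. The HDFS path and lib directory below are the ones from this thread; running the per-node step (for example via an ssh loop) is an assumption:

# Stage the jar in HDFS once:
hdfs dfs -mkdir -p /lib_dir
hdfs dfs -put hive-serdes-1.0-SNAPSHOT.jar /lib_dir/

# Then on each node, pull it into the local lib directory:
hdfs dfs -get /lib_dir/hive-serdes-1.0-SNAPSHOT.jar /opt/cloudera/parcels/CDH/lib/hive/lib/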

 

 
