Created on 03-29-2018 01:39 PM - edited 09-16-2022 07:49 AM
Preventive Maintenance Data Science Use-Case for Fleet Cost Avoidance.
Repo Info
Created on 05-02-2018 03:25 PM
This demo requires a custom library from this project: https://github.com/hortonworks/fieldeng-scythe
To install custom Scala dependencies from GitHub using IntelliJ IDEA on a Mac, perform these steps (replace the descriptions in <angle brackets> with appropriate values).
From a shell prompt:
cd ~/IdeaProjects
git clone https://github.com/<path>/<projectname>.git
(Run mkdir ~/IdeaProjects first if the directory does not exist.)
(For this demo, clone https://github.com/hortonworks/fieldeng-scythe.git.)
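The clone step above can be sketched as a small shell helper that creates the base directory when it is missing and skips the clone if the project is already present. The function names project_name_from_url and clone_into_idea_projects are hypothetical, introduced only for illustration.

```shell
# Hypothetical helper: derive the project directory name from a clone URL,
# e.g. ".../fieldeng-scythe.git" becomes "fieldeng-scythe".
project_name_from_url() {
    basename "$1" .git
}

# Hypothetical helper: clone a repository under a base directory
# (default ~/IdeaProjects), creating the directory if it does not exist.
clone_into_idea_projects() {
    repo_url=$1
    base_dir=${2:-"$HOME/IdeaProjects"}
    project=$(project_name_from_url "$repo_url")

    mkdir -p "$base_dir" || return 1
    if [ -d "$base_dir/$project" ]; then
        # Do not clone again over an existing checkout.
        echo "already cloned: $base_dir/$project"
    else
        git clone "$repo_url" "$base_dir/$project"
    fi
}

# Example for this demo (commented out so that sourcing the file has no side effects):
# clone_into_idea_projects https://github.com/hortonworks/fieldeng-scythe.git
```

Using mkdir -p means the directory check and creation happen in one step, so the helper can be run repeatedly without errors.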
Open IntelliJ and create a new project with File -> New -> Project From Existing Sources, importing the sources from the directory cloned above.
Add dependencies when prompted
In IntelliJ, add a JAR file to the build outputs using File -> Project Structure -> Artifacts. Select Create new Artifact -> JAR -> From modules with dependencies.
Check the "Include in project build" checkbox.
Path should default to: …/IdeaProjects/<projectname>/out/artifacts/<jarname>_jar. Give the JAR file an appropriate name.
Build the project by selecting Build -> Build Project (this will likely rebuild the project completely because of the changes above).
Verify creation of JAR in local directory
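One way to do this check from a shell prompt is sketched below. The helper names jar_exists and list_jar_entries are hypothetical; the second one assumes the unzip utility is installed (a JAR is a ZIP archive, so unzip -l can list its entries).

```shell
# Hypothetical helper: succeed only if the path is an existing, non-empty file.
jar_exists() {
    [ -s "$1" ]
}

# Hypothetical helper: list the entries packaged inside the JAR.
# Assumes unzip is available on the machine.
list_jar_entries() {
    unzip -l "$1"
}

# Example (placeholders as in the steps above):
# jar_exists ~/IdeaProjects/<projectname>/out/artifacts/<jarname>_jar/<jarname>.jar && echo "JAR found"
```

Listing the entries is a quick way to confirm that the library's classes were actually packaged, not just that a file was written.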
Copy JAR to Zeppelin server. Zeppelin will handle distributing the dependency to the Spark nodes.
scp -i ~/ssh/<certificate file>.pem ~/IdeaProjects/<projectname>/out/artifacts/<jarname>_jar/<jarname>.jar <username>@<zeppelin.host.ip.address>:~
ssh -i ~/ssh/<certificate file>.pem <username>@<zeppelin.host.address> sudo cp <jarname>.jar /usr/lib
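Because the two copy commands above contain several placeholders, it can help to assemble them from variables and review them before running anything. The sketch below only prints the commands; the function name print_deploy_commands and its argument order are hypothetical.

```shell
# Hypothetical helper: print (not run) the two copy commands for review.
# Arguments: key file, local JAR path, remote user name, Zeppelin host.
print_deploy_commands() {
    key=$1
    jar=$2
    user=$3
    host=$4
    jar_file=$(basename "$jar")

    # Copy the JAR to the remote user's home directory.
    echo "scp -i ${key} ${jar} ${user}@${host}:~"
    # Then copy it from the home directory into /usr/lib on the Zeppelin server.
    echo "ssh -i ${key} ${user}@${host} sudo cp ${jar_file} /usr/lib"
}
```

Printing first makes it easy to spot a missing space or a wrong path before the commands touch the server.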
In the Zeppelin interpreter configuration, edit the %spark or %spark2 interpreter as needed. At the bottom of the configuration settings, add a dependency on the JAR using its absolute path:
/usr/lib/<jarname>.jar
Then import the library in a Zeppelin notebook paragraph:
%spark2
import <com.your.libraryname>._