Created on 09-22-2020 02:00 AM - edited on 09-28-2020 03:06 AM by VidyaSargur
In the Cloudera Machine Learning (CML) experience (or CDSW for the on-prem version), projects are backed by git. You might want to use GitHub for your projects, so here is a simple way to do that.
First things first: there are basically two ways of interacting with git/GitHub, HTTPS or SSH. We'll use the latter to keep authentication easy. You might also consider SSO or 2FA to enhance security; here we'll focus on the basics.
For this authentication to happen under the hood, we need to copy our SSH key from CML to GitHub.
Find your SSH key in the Settings of CML:
Copy that key and add it to GitHub: in your github.com settings, open SSH and GPG keys and add a new SSH key.
Put cdsw in the Title and paste the content of your SSH key in the Key field:
Let's start by creating a new repository on github.com:
The important thing here is the access mode we want to use: SSH.
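To make the difference between the two access modes concrete, here is a small sketch that turns an HTTPS clone URL into its SSH form. The HTTPS URL is an assumption built from the repository name that appears in the push output later in this article; the variable names are only illustrative.

```shell
# HTTPS form of the clone URL (assumed from the repository name used in this article)
https_url="https://github.com/laurentedel/MyProject.git"

# Rewrite "https://<host>/<path>" as "git@<host>:<path>", the form git uses over SSH
ssh_url="$(printf '%s\n' "$https_url" | sed -E 's#^https://([^/]+)/#git@\1:#')"

echo "$ssh_url"
```

The SSH form is the one to use as the remote, since it is the one that authenticates with the key copied from CML.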
In CML, start a new project with a template:
Open a Terminal window in a new session:
Convert the project to a git project:
cdsw@qp7h1qllrh9dx1hd:~$ git init
Initialized empty Git repository in /home/cdsw/.git/
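The transcript goes from git init directly to git push, so the staging, commit and remote steps are not shown. Below is a minimal sketch of what they could look like, run in a scratch directory so it can be tried safely. The example.txt file, the commit message and the cdsw@example.com identity are assumptions made for the sketch; the remote URL is taken from the push output shown in this article.

```shell
# Work in a scratch directory (in CML you would stay in /home/cdsw instead)
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Placeholder file standing in for the project template files
echo "hello" > example.txt

# Stage everything and record the first commit
git add .
git -c user.name="cdsw" -c user.email="cdsw@example.com" commit -q -m "first commit"

# Point 'origin' at the GitHub repository, using the SSH form of the URL
git remote add origin git@github.com:laurentedel/MyProject.git
git remote -v
```

In the real project, only the git add, git commit and git remote add lines are needed, since the repository was already initialized above.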
Finally, once the files are committed and the origin remote points to the GitHub repository, push the changes (that is, all the files of the first commit) to the master branch on github.com:
cdsw@qp7h1qllrh9dx1hd:~$ git push -u origin master
The authenticity of host 'github.com (140.82.113.4)' can't be established.
RSA key fingerprint is SHA256:nThbg6kXUpJWGl7E1IGOCspRomTxdCARLviKw6E5SY8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'github.com,140.82.113.4' (RSA) to the list of known hosts.
Counting objects: 56, done.
Delta compression using up to 16 threads.
Compressing objects: 100% (46/46), done.
Writing objects: 100% (56/56), 319.86 KiB | 857.00 KiB/s, done.
Total 56 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), done.
To github.com:laurentedel/MyProject.git
* [new branch] master -> master
Branch 'master' set up to track remote branch 'master' from 'origin'.
There you go!
Now we can use the usual git commands. Modify file(s):
cdsw@qp7h1qllrh9dx1hd:~$ git status
On branch master
Your branch is up to date with 'origin/master'.
Untracked files:
(use "git add <file>..." to include in what will be committed)
README.md
nothing added to commit but untracked files present (use "git add" to track)
Commit/push:
cdsw@qp7h1qllrh9dx1hd:~$ git add README.md
cdsw@qp7h1qllrh9dx1hd:~$ git commit -m "adding a README"
[master 7008e88] adding a README
1 file changed, 1 insertion(+)
create mode 100644 README.md
cdsw@qp7h1qllrh9dx1hd:~$ git push -u origin master
Warning: Permanently added the RSA host key for IP address '140.82.114.4' to the list of known hosts.
Counting objects: 3, done.
Delta compression using up to 16 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 290 bytes | 18.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), completed with 1 local object.
To github.com:laurentedel/MyProject.git
5d75525..7008e88 master -> master
Branch 'master' set up to track remote branch 'master' from 'origin'.
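The modify/commit/push cycle above can also be rehearsed without touching GitHub. The following is a sketch only: a local bare repository stands in for the GitHub remote, and the directory names, README content and cdsw@example.com identity are assumptions for the example.

```shell
work="$(mktemp -d)"

# A local bare repository plays the role of the GitHub remote (assumption for this sketch)
git init -q --bare "$work/remote.git"

# The working project, with 'origin' pointing at the stand-in remote
git init -q "$work/project"
cd "$work/project"
git remote add origin "$work/remote.git"

# Modify file(s): create a README, which git reports as untracked (??)
echo "# MyProject" > README.md
git status --short

# Commit/push, same commands as in the transcript
git add README.md
git -c user.name="cdsw" -c user.email="cdsw@example.com" commit -q -m "adding a README"
git push -q -u origin HEAD    # HEAD pushes the current branch, whatever its name

# Check that the commit reached the stand-in remote
git --git-dir="$work/remote.git" log --oneline --all
```

Against the real remote, the only difference is that origin is the SSH URL of the GitHub repository and the branch is named explicitly, as in git push -u origin master.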