Summit ML Training

From SCECpedia
Revision as of 03:34, 13 October 2020 by Maechlin (talk | contribs) (→‎Training Documents)

OLCF/IBM Machine Learning Training Materials

Tutorial information is posted on GitHub

Training Documents

Running Jupyter Notebooks on Summit

The training involved cloning a GitHub repo into your account on Summit, creating an Anaconda virtual environment, and then installing all the required libraries into it with Anaconda. Those instructions are covered in the introductory documentation.

The git repo contains a number of Jupyter notebooks that can be run remotely. Basic instructions are given on the wiki pages for the IBM GitHub ML training, but a few extra steps are required to control the notebooks from your personal computer.

There are two main steps: 1) ssh into Summit, activate the right conda environment, then run the commands below to identify an open port on Summit that the web browser can connect to. 2) ssh into Summit again, this time with port forwarding, so that the port on Summit is redirected to a port on your laptop that your browser can connect to. This allows a browser on your laptop to run jobs on Summit.

The detailed steps are as follows.


First, using your two-factor login, ssh into your own account on Summit. Then run these commands on Summit from your terminal window to find an unused port.

#$myport must already hold an unused port number (7046 in this example)
%echo "ssh -NL $myport:$(hostname):$myport $USER@summit.olcf.ornl.gov"

#The command above returned on Summit:
ssh -NL 7046:login1:7046 pmaech@summit.olcf.ornl.gov
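
The command that sets $myport is not shown above. As a minimal sketch, assuming bash on the login node, one way to choose a candidate port is to probe a range for existing listeners. The function name find_free_port and the 7000-7100 range are hypothetical choices, not part of the training materials. A port that looks free can still be taken by another user before Jupyter binds it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the training materials): print the first port
# in [first, last] that has no listener on localhost.
# Relies on bash's /dev/tcp redirection, so it needs bash rather than plain sh.
find_free_port() {
    local first=$1 last=$2 port
    for (( port = first; port <= last; port++ )); do
        # The connection attempt runs in a subshell, so the descriptor is
        # closed again as soon as the subshell exits.
        if ! (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null; then
            echo "${port}"
            return 0
        fi
    done
    return 1
}

# The range 7000-7100 is an arbitrary assumption for illustration.
if myport=$(find_free_port 7000 7100); then
    echo "ssh -NL ${myport}:$(hostname):${myport} ${USER}@summit.olcf.ornl.gov"
else
    echo "no free port between 7000 and 7100" >&2
fi
```

A failed connection is treated as "nothing is listening", which is only a heuristic. If Jupyter later reports that the port is in use, rerun the check with a different range.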

#Start a jupyter notebook on Summit with this:
%jupyter-notebook --no-browser --port=$myport --ip='0.0.0.0'

which returns something like this:
(wmlce17-ornl) [pmaech@login4.summit ~]$ jupyter-notebook --no-browser --port=$myport --ip='0.0.0.0'
[I 22:31:44.001 NotebookApp] JupyterLab extension loaded from /ccs/home/pmaech/.conda/envs/wmlce17-ornl/lib/python3.6/site-packages/jupyterlab
[I 22:31:44.001 NotebookApp] JupyterLab application directory is /ccs/home/pmaech/.conda/envs/wmlce17-ornl/share/jupyter/lab
[I 22:31:44.005 NotebookApp] Serving notebooks from local directory: /autofs/nccs-svm1_home1/pmaech
[I 22:31:44.005 NotebookApp] Jupyter Notebook 6.1.1 is running at:
[I 22:31:44.005 NotebookApp] http://login4:7046/?token=654efaa66e39e2dadb80868ceafc02583b3685c386c2404c
[I 22:31:44.005 NotebookApp]  or http://127.0.0.1:7046/?token=654efaa66e39e2dadb80868ceafc02583b3685c386c2404c
[I 22:31:44.005 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 22:31:44.015 NotebookApp] 
    
    To access the notebook, open this file in a browser:
        file:///autofs/nccs-svm1_home1/pmaech/.local/share/jupyter/runtime/nbserver-40395-open.html
    Or copy and paste one of these URLs:
        http://login4:7046/?token=654efaa66e39e2dadb80868ceafc02583b3685c386c2404c
     or http://127.0.0.1:7046/?token=654efaa66e39e2dadb80868ceafc02583b3685c386c2404c

#This prints http:// links that you can paste into a browser on the Mac once the ssh forwarding is set up

Check which login node is named in the output above (login4 in this example), because it is needed in the next command.
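
The login node and port can also be read from the URL that Jupyter prints, rather than by eye. The following is a minimal sketch, assuming bash and a URL of the form http://NODE:PORT/?token=... as in the output above. The function names parse_jupyter_url and tunnel_command are hypothetical and do not come from the training materials.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative only.

# Print "NODE PORT" for a Jupyter URL of the form http://NODE:PORT/?token=...
parse_jupyter_url() {
    local hostport=${1#http://}   # drop the scheme
    hostport=${hostport%%/*}      # keep only NODE:PORT
    echo "${hostport%%:*} ${hostport##*:}"
}

# Print the port-forwarding command to run on the laptop.
tunnel_command() {
    local node=$1 port=$2 user=$3
    echo "ssh -NL ${port}:${node}.summit.olcf.ornl.gov:${port} ${user}@summit.olcf.ornl.gov"
}

# URL copied from the Jupyter output in this walkthrough.
url='http://login4:7046/?token=654efaa66e39e2dadb80868ceafc02583b3685c386c2404c'
read -r node port <<< "$(parse_jupyter_url "$url")"
tunnel_command "$node" "$port" pmaech
```

With the URL from this walkthrough, the sketch prints the forwarding command for login4 and port 7046. Substitute your own username for pmaech.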

#Now, in a second terminal window on the Mac, connect to Summit again with this:
ssh -NL 7046:login4.summit.olcf.ornl.gov:7046 pmaech@summit.olcf.ornl.gov

#This command prints nothing and appears to hang in terminal window 2. That is expected, because -N only forwards the port and runs no remote command.
#Now, in a browser on the Mac, enter the URL that was printed in terminal window 1 (on the cluster)
#when the Jupyter notebook was started:
http://127.0.0.1:7046/?token=654efaa66e39e2dadb80868ceafc02583b3685c386c2404c

The browser then shows the directories on the cluster, and you can run the .ipynb notebooks step by step from the browser.

Related Entries