1. JupyterLab Setup on the Data Scientist's Own Laptop
This page describes how to set up JupyterLab on your own laptop or desktop so that it can access InfinStor services.
1.1. Prerequisites
- Install Node.js (npm) 14.x using these instructions
- Install aws cli for your operating system
  - After installation, run 'aws configure' to set up the AWS access key id and access key secret that allow access to AWS. This configuration is needed to access the mlflow artifact S3 bucket.
- Download and install miniconda3 for your operating system.
  - The JupyterLab server will be set up on your laptop or desktop in a conda environment.
- Create a new conda environment named infinstor. Open a new terminal and run the following commands:
    conda activate base
    conda create --name infinstor python=3.8
- Activate the new conda environment, infinstor.
  - Run the command
      conda activate infinstor
  - Once the environment is activated, the prompt contains the word infinstor, similar to what is shown below. The exact prompt varies with the operating system, but it will contain the word infinstor if the conda environment was activated successfully. Verify this before proceeding further.
      (infinstor) user@laptop $
- Next, install the prerequisites in this infinstor conda environment, as shown below:
    conda activate infinstor
    pip install --upgrade mlflow
    conda install -c conda-forge -y ipywidgets
    pip install --upgrade jupyterlab
    jupyter labextension install @jupyter-widgets/jupyterlab-manager
    pip install --upgrade aiohttp
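Before moving on, it can help to confirm that the command-line tools from the list above are reachable from the activated environment. The following is a minimal sketch and not part of InfinStor: the helper name missing_tools and the executable names (node, npm, aws, conda, jupyter) are assumptions based on the tools named in this section, and the names may differ on your platform.

```python
import shutil

# Executables this section expects on PATH.
# The names are assumptions; adjust them for your operating system.
REQUIRED_TOOLS = ["node", "npm", "aws", "conda", "jupyter"]


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All prerequisite tools were found on PATH.")
```

Run it with the infinstor environment active. A tool reported as missing usually means that it was installed outside this environment or that a new terminal is needed for PATH changes to take effect.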
1.2. InfinStor Python Packages
The InfinStor service is augmented by three Python packages that are distributed through PyPI and installed using pip. These packages need to be installed on the Data Scientist's laptop or desktop, on the Cloud VM where transforms are executed, and on Cloud VMs that are part of distributed computing frameworks such as EMR.
- infinstor: The InfinStor SDK. It is installed using 'pip install infinstor'
- infinstor-mlflow-plugin: The MLflow plugin required for MLflow tracking, projects, transforms etc.
- jupyterlab_infinstor: This package supports the InfinStor JupyterLab sidebar. It is loaded in the JupyterLab server process and must be installed in that server's Python environment using the command 'pip install jupyterlab_infinstor'
The following commands uninstall older versions and install new versions of the infinstor pip packages.
conda activate infinstor
pip uninstall -y infinstor
pip install infinstor
pip uninstall -y infinstor-mlflow-plugin
pip install infinstor-mlflow-plugin
pip uninstall -y jupyterlab_infinstor
pip install jupyterlab_infinstor
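After reinstalling, you may want to confirm which versions ended up in the environment. The sketch below uses the standard library's importlib.metadata (available from Python 3.8) to look up the three distributions named above. The helper name installed_versions is hypothetical, and on some Python versions the lookup may be sensitive to the hyphen or underscore spelling of a distribution name.

```python
from importlib import metadata

# The three packages named in this section.
INFINSTOR_PACKAGES = ["infinstor", "infinstor-mlflow-plugin", "jupyterlab_infinstor"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(INFINSTOR_PACKAGES).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it with the Python interpreter of the infinstor conda environment, because the lookup only sees packages installed in the interpreter that executes it.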
1.3. InfinStor npm packages
The InfinStor service is aided by the JupyterLab sidebar npm package jupyterlab-infinstor. This package is installed on the machine where the JupyterLab server runs.
Important
This package is compatible with versions 2.x and 3.x of JupyterLab. To check which version you are running, type the command jupyter --version and look for the line jupyter lab; the version shown there should be 2.x or 3.x.
conda activate infinstor
jupyter labextension install jupyterlab-infinstor
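The version requirement above can also be checked programmatically. This is a minimal sketch, assuming that a major version of 2 or 3 is all that matters for compatibility, as stated in the note. The function name is_supported_lab_version is hypothetical.

```python
from importlib import metadata


def is_supported_lab_version(version):
    """Return True if `version` (for example "3.4.2") has major version 2 or 3."""
    major = version.strip().split(".")[0]
    return major in ("2", "3")


if __name__ == "__main__":
    try:
        lab_version = metadata.version("jupyterlab")
    except metadata.PackageNotFoundError:
        print("jupyterlab is not installed in this environment.")
    else:
        status = "supported" if is_supported_lab_version(lab_version) else "NOT supported"
        print(f"jupyterlab {lab_version}: {status} by jupyterlab-infinstor")
```

If the check reports an unsupported version, one option is to install a 2.x or 3.x release of jupyterlab in the infinstor environment before installing the extension.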
1.4. Start JupyterLab server
- Start the JupyterLab server using the command below.
  - Add --ip=0.0.0.0 to the command if you want the JupyterLab server to be accessible from other computers.
  - Ask your InfinStor platform administrator for the value of MLFLOW_TRACKING_URI. It is usually similar to infinstor://mlflow.infinstor.<your_domain>. Use this value in the export command below:
      export MLFLOW_TRACKING_URI=<value from administrator>
      jupyter lab --log-level INFO --no-browser
- After JupyterLab starts successfully, you will see a URL similar to the ones below in the output. Use this URL in your browser to access your JupyterLab server.
    To access the server, copy and paste one of these URLs:
        http://mylaptop:8888/lab?token=b291da4666ff13143421325b528e1811c3889a5957571d99
     or http://127.0.0.1:8888/lab?token=b291da4666ff13143421325b528e1811c3889a5957571d99
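A mistyped or missing MLFLOW_TRACKING_URI is easy to overlook, so it may be worth sanity-checking the value in the same terminal before starting the server. The sketch below assumes the value follows the infinstor://mlflow.infinstor.<your_domain> pattern described above. Your administrator may supply a different form, in which case the scheme check does not apply. The function name check_tracking_uri is hypothetical.

```python
import os
from urllib.parse import urlparse


def check_tracking_uri(uri):
    """Return a list of problems found in `uri`; an empty list means it looks plausible."""
    if not uri:
        return ["MLFLOW_TRACKING_URI is not set"]
    problems = []
    parsed = urlparse(uri)
    # Assumption: the tracking URI uses the infinstor:// scheme, as in the usual pattern.
    if parsed.scheme != "infinstor":
        problems.append(f"expected scheme 'infinstor', got '{parsed.scheme}'")
    if not parsed.netloc:
        problems.append("no host name found after the scheme")
    return problems


if __name__ == "__main__":
    issues = check_tracking_uri(os.environ.get("MLFLOW_TRACKING_URI", ""))
    print("Looks plausible." if not issues else "; ".join(issues))
```

The check only inspects the shape of the value. It does not contact the InfinStor service, so a plausible-looking URI can still point to the wrong host.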