Concepts and High-Level Architecture
The InfinStor ML Platform includes an MLflow service hosted in the cloud. All standard capabilities of open source MLflow are included in InfinStor. The InfinStor service comprises a hosted MLflow Tracking service, an MLflow Projects plugin for executing MLflow projects in an EC2 VM, and an MLflow Model Management service. Specifically, InfinStor Starter includes the following MLflow functionality:
MLflow Tracking Service
- Always on
- Authentication built in, with the ability to federate to external systems such as Azure Active Directory and Google OAuth
- Built on AWS Lambda and DynamoDB, so no VMs need to run 24x7
- Multi-user: each Data Scientist gets a private view of the experiments and runs they are tracking
- Highly scalable: supports thousands of concurrent Data Scientist users
- Disaster Recovery and High Availability built in. The InfinStor MLflow service is resilient to server, rack, datacenter, and whole-region failure
MLflow Projects Backend
- Runs MLflow projects on one of two targets:
  - Inline in JupyterLab
  - In a single VM in EC2. The VM is started on demand and shut down when it is detected to be idle
- Projects can be of two types:
  - Standard MLflow projects
  - Projects auto-generated from Jupyter Notebooks, captured for use in the cloud using the InfinStor platform
MLflow Model Management
- MLflow Models is a shared space where Data Scientists share models with others in the same organization. Models are not visible to users from other enterprises
- The run that created a 'Registered Model' is visible to any user who can view the model
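
Because a registered model stays linked to the run that created it, registration typically starts from a run. The following is a minimal sketch using the open source MLflow client, assuming the InfinStor service accepts the standard `mlflow.register_model` call (the page states that standard MLflow capabilities are included, but does not show this call). The helper names, the `model` artifact path, and the run ID and model name in the usage note are hypothetical.

```python
def run_model_uri(run_id, artifact_path="model"):
    """Build the standard MLflow 'runs:/' URI for a model logged by a run.

    artifact_path="model" is an assumed default, not an InfinStor requirement.
    """
    return f"runs:/{run_id}/{artifact_path}"


def register_from_run(run_id, name, artifact_path="model"):
    """Register the model logged by `run_id` under `name` (hypothetical helper).

    Requires the mlflow package and a configured tracking URI.
    """
    import mlflow  # imported lazily so the URI helper works without mlflow

    return mlflow.register_model(run_model_uri(run_id, artifact_path), name)
```

For example, `register_from_run("abc123", "churn-classifier")` would register the model stored under the `model` artifact path of run `abc123`, where both values are placeholders.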
MLflow Tracking
- This shows the MLflow Tracking components of the InfinStor Starter service
- The InfinStor MLflow Tracking server is an implementation of the MLflow REST API using AWS Lambda and DynamoDB
- Applications that use MLflow Tracking need to set the environment variable MLFLOW_TRACKING_URI to infinstor://mlflow.infinstor.com/
- The MLflow User Interface is accessed using the InfinStor JupyterLab sidebar
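
The tracking setup described above can be sketched in Python as follows. This is a minimal illustration, assuming the `mlflow` package and the InfinStor plugin that handles the `infinstor://` scheme are installed, and that authentication has already been completed. The function names and the logged parameter and metric values are hypothetical.

```python
import os

# Tracking URI taken from the documentation above.
INFINSTOR_TRACKING_URI = "infinstor://mlflow.infinstor.com/"


def configure_tracking(env=os.environ):
    """Point MLflow at the InfinStor tracking service via the environment."""
    env["MLFLOW_TRACKING_URI"] = INFINSTOR_TRACKING_URI
    return env["MLFLOW_TRACKING_URI"]


def log_example_run():
    """Log one parameter and one metric to whichever tracking URI is set.

    The values are placeholders for illustration only.
    """
    import mlflow  # imported lazily; needs mlflow plus the InfinStor plugin

    with mlflow.start_run():
        mlflow.log_param("learning_rate", 0.01)
        mlflow.log_metric("accuracy", 0.93)
```

Calling `configure_tracking()` followed by `log_example_run()` would record a run in the hosted service, given the assumptions above. Setting the variable in the shell before starting the application should have the same effect.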
MLflow Projects
- This shows the MLflow Projects components of the InfinStor Starter service
- InfinStor supplies a projects backend plugin for the open source MLflow library installed using pip
- Applications must add the parameter '-b infinstor-backend' to the 'mlflow run' command
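
As an illustration of the command line described above, the sketch below assembles an `mlflow run` invocation that selects the InfinStor backend. It assumes the `mlflow` CLI and the InfinStor plugin are installed. The helper name and the `alpha` parameter are hypothetical, and any backend-specific configuration the plugin may need is not covered here.

```python
def build_mlflow_run_command(project_uri, params=None):
    """Return the argv list for `mlflow run` using the InfinStor backend.

    project_uri: local path or Git URI of an MLflow project.
    params: optional dict of project parameters, passed as -P key=value.
    """
    cmd = ["mlflow", "run", project_uri, "-b", "infinstor-backend"]
    for key, value in (params or {}).items():
        cmd += ["-P", f"{key}={value}"]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. For a project in the current directory with no parameters, it corresponds to typing `mlflow run . -b infinstor-backend` in a shell.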