1. Install Root Stack¶
The following is a step-by-step guide to installing the InfinStor service in your own AWS account.
1.1. InfinStor root stack¶
- In your AWS Console, go to CloudFormation and choose Create stack.
- Select Amazon S3 URL as the template source. In the URL below, replace a.b.cc with the build number that was provided to you. The URL is:
https://s3.amazonaws.com/infinstorcft/a.b.cc/infinstor.yaml
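As a minimal sketch, the template URL can be assembled from the build number as shown here. The function name `template_url`, the sample build number `1.2.34`, and the dotted-digits check are illustrative assumptions; only the URL layout comes from this guide.

```python
import re

# URL layout taken from this guide; {build} stands in for the a.b.cc placeholder.
TEMPLATE_URL_FORMAT = "https://s3.amazonaws.com/infinstorcft/{build}/infinstor.yaml"


def template_url(build: str) -> str:
    """Return the root stack template URL for the given build number."""
    # Assumption: build numbers are dot-separated digits, e.g. "1.2.34".
    if not re.fullmatch(r"\d+(\.\d+)*", build):
        raise ValueError(f"unexpected build number: {build!r}")
    return TEMPLATE_URL_FORMAT.format(build=build)


if __name__ == "__main__":
    # "1.2.34" is a made-up build number used only for illustration.
    print(template_url("1.2.34"))
```

The check mainly guards against pasting the URL with the a.b.cc placeholder still in it.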
Here is a detailed explanation of the parameters:
1.2. Options¶
1.2.1. User Management¶
InfinStor MLflow includes sophisticated user management capabilities. The important choices for the user are:
1.2.1.1. Option 1: Standalone AWS Cognito User Pool:¶
Users and groups are managed in a standalone AWS Cognito User Pool that is not connected to a corporate user directory such as Microsoft Active Directory. The InfinStor MLflow administrator must create users and groups in this Cognito User Pool. The CFT options for this architecture are:
- CreateCognito true
- IsExternalAuth false
- IsDirectAzureActiveDirectory false
- AadDirectoryId empty
- AadClientId empty
Administrators may instead use a pre-existing Cognito User Pool by setting the CFT option CreateCognito to false. The pre-created user pool must have been created using another CloudFormation template and must export these outputs:
- CognitoUserPoolId (ID of the pre-created Cognito user pool)
- CliClientId (ID of the CLI client in the pre-created Cognito user pool)
- JupyterhubClientId (ID of the JupyterHub client in the pre-created Cognito user pool)
- MlflowuiClientId (ID of the MLflow UI client in the pre-created Cognito user pool)
- WebClientId (ID of the Web UI client, i.e. the dashboard, in the pre-created Cognito user pool)
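Before setting CreateCognito to false, it may help to confirm that all five outputs are exported. The sketch below is one way to do that check. It assumes the exports are published under exactly the names listed above, and the function name is hypothetical. The list of export names would come from your own account, for example from the CloudFormation Exports page.

```python
from __future__ import annotations

from typing import Iterable

# Export names listed in this guide for a pre-created Cognito user pool.
REQUIRED_COGNITO_EXPORTS = (
    "CognitoUserPoolId",
    "CliClientId",
    "JupyterhubClientId",
    "MlflowuiClientId",
    "WebClientId",
)


def missing_cognito_exports(export_names: Iterable[str]) -> list[str]:
    """Return the required exports that are absent, in the order listed above."""
    present = set(export_names)
    return [name for name in REQUIRED_COGNITO_EXPORTS if name not in present]


if __name__ == "__main__":
    # Example: a pool template that exports only two of the five values.
    print(missing_cognito_exports(["CognitoUserPoolId", "CliClientId"]))
```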
1.2.1.2. Option 2: AWS Cognito User Pool with federation to corporate directory¶
AWS Cognito can be configured to federate authentication to an external authentication system such as Active Directory or Google OAuth. Any external authentication system supported by Cognito is automatically supported by InfinStor MLflow. Note that InfinStor uses Cognito as the core authentication service; Cognito then federates to the corporate directory. The CFT settings are:
- CreateCognito true
- IsExternalAuth true
- IsDirectAzureActiveDirectory false
- AadDirectoryId empty
- AadClientId empty
Administrators may instead use a pre-existing Cognito User Pool by setting the CFT option CreateCognito to false. The pre-created user pool must have been created using another CloudFormation template and must export these outputs:
- CognitoUserPoolId (ID of the pre-created Cognito user pool)
- CliClientId (ID of the CLI client in the pre-created Cognito user pool)
- JupyterhubClientId (ID of the JupyterHub client in the pre-created Cognito user pool)
- MlflowuiClientId (ID of the MLflow UI client in the pre-created Cognito user pool)
- WebClientId (ID of the Web UI client, i.e. the dashboard, in the pre-created Cognito user pool)
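The two user management options differ only in the value of IsExternalAuth. The sketch below makes that explicit. The function name and its arguments are hypothetical, while the parameter names and values are the ones listed for Option 1 and Option 2.

```python
from __future__ import annotations


def user_management_parameters(federated: bool, create_cognito: bool = True) -> dict[str, str]:
    """Return the user management CFT parameter values for Option 1 or Option 2.

    federated=False gives Option 1 (standalone Cognito User Pool);
    federated=True gives Option 2 (Cognito federating to a corporate directory).
    create_cognito=False selects a pre-existing Cognito User Pool.
    """
    return {
        "CreateCognito": "true" if create_cognito else "false",
        "IsExternalAuth": "true" if federated else "false",
        "IsDirectAzureActiveDirectory": "false",
        "AadDirectoryId": "",  # left empty in both options
        "AadClientId": "",     # left empty in both options
    }


if __name__ == "__main__":
    print(user_management_parameters(federated=True))
```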
1.2.2. Specify Domain and Customize Hostnames¶
- InfinStor Domain: Specify the domain where the InfinStor service will be installed, e.g. yourcompany.com or infinstor.yourcompany.com. InfinStor uses five hostnames in this domain. By default they are api, mlflow, mlflowui, mlflowstatic and service. These hostnames can be customized as described in the following parameters. Without any customization, if InfinStor Domain is set to infinstor.yourcompany.com, then the InfinStor CloudFormation Template (CFT) either creates the following DNS entries automatically or expects them to be created manually:
- api.infinstor.yourcompany.com
- mlflow.infinstor.yourcompany.com
- mlflowui.infinstor.yourcompany.com
- mlflowstatic.infinstor.yourcompany.com
- service.infinstor.yourcompany.com
- mlflowDnsName:
- This is the hostname of the REST endpoint for the MLflow service. It is set in the data scientists' environment variable MLFLOW_TRACKING_URI. For example, if InfinStor Domain is set to infinstor.yourcompany.com and mlflowDnsName is customized to mlflowrest, then the environment variable MLFLOW_TRACKING_URI must be set to infinstor://mlflowrest.infinstor.yourcompany.com/
- If createDnsEntries is set to false (see further below), then set this to the output value MlflowRestApiARecord, from the nested stack with a name that contains mlflow-lambdas.
- mlflowuiDnsName:
- This is the hostname of the MLflow UI. For example, if InfinStor Domain is set to infinstor.yourcompany.com and mlflowuiDnsName is not customized, i.e. it is set to the default of mlflowui, then the URL for accessing the MLflow UI is https://mlflowui.infinstor.yourcompany.com/
- If createDnsEntries is set to false (see further below), then set this to the output value MlflowUIARecord, from the nested stack with a name that contains staticfiles.
- mlflowstaticDnsName:
- Ignore this parameter: it is deprecated and unused, and will be removed shortly.
- If createDnsEntries is set to false (see further below), then set this to the output value MlflowStaticARecord, from the nested stack with a name that contains staticfiles.yaml.
- apiDnsName:
- This is the hostname of the REST endpoint where the InfinStor Dashboard REST APIs are served.
- If createDnsEntries is set to false (see further below), then set this to the output value APIARecord, from the nested stack with a name that contains dashboard-lambdas.
- serviceDnsName:
- This is the hostname of the InfinStor Dashboard UI. For example, if InfinStor Domain is set to infinstor.yourcompany.com and serviceDnsName is not customized, i.e. it is set to the default of service, then the URL for accessing the InfinStor Dashboard UI is https://service.infinstor.yourcompany.com/
- If createDnsEntries is set to false (see further below), then set this to the output value ServiceARecord, from the nested stack with a name that contains staticfiles.yaml.
- aigatewayDnsName:
- This is the hostname of the MLflow AI gateway.
- If createDnsEntries is set to false (see further below), then set this to the output value AIGatewayCNameRecord, from the nested stack with a name that contains mlflow-lambdas.
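The hostname rules in this section can be summarized as a few lookups. The following is a sketch under stated assumptions, not part of the InfinStor tooling: the function and constant names are hypothetical, and aigatewayDnsName is left out of the defaults because its default hostname is not stated above. The output values and nested stack name fragments are copied from the parameter descriptions.

```python
from __future__ import annotations

# Default hostnames from this section, keyed by the parameter that customizes each.
DEFAULT_HOSTNAMES = {
    "apiDnsName": "api",
    "mlflowDnsName": "mlflow",
    "mlflowuiDnsName": "mlflowui",
    "mlflowstaticDnsName": "mlflowstatic",
    "serviceDnsName": "service",
}

# When createDnsEntries is false: (output value, nested stack name fragment)
# to look up for each hostname, as listed in the parameter descriptions.
MANUAL_DNS_SOURCES = {
    "mlflowDnsName": ("MlflowRestApiARecord", "mlflow-lambdas"),
    "mlflowuiDnsName": ("MlflowUIARecord", "staticfiles"),
    "mlflowstaticDnsName": ("MlflowStaticARecord", "staticfiles.yaml"),
    "apiDnsName": ("APIARecord", "dashboard-lambdas"),
    "serviceDnsName": ("ServiceARecord", "staticfiles.yaml"),
    "aigatewayDnsName": ("AIGatewayCNameRecord", "mlflow-lambdas"),
}


def fully_qualified_hostnames(domain: str, overrides: dict[str, str] | None = None) -> dict[str, str]:
    """Return each hostname parameter mapped to its full DNS name under domain."""
    names = {**DEFAULT_HOSTNAMES, **(overrides or {})}
    return {param: f"{host}.{domain}" for param, host in names.items()}


def mlflow_tracking_uri(domain: str, mlflow_dns_name: str = "mlflow") -> str:
    """Return the MLFLOW_TRACKING_URI value for the given domain and hostname."""
    return f"infinstor://{mlflow_dns_name}.{domain}/"


if __name__ == "__main__":
    domain = "infinstor.yourcompany.com"
    for param, fqdn in fully_qualified_hostnames(domain).items():
        print(param, fqdn)
    print(mlflow_tracking_uri(domain, "mlflowrest"))
```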
1.2.3. Certificate Creation Options¶
- CreateCertificates: true or false. This parameter determines whether the InfinStor CFT automatically creates certificates.
- If you set this to true, the InfinStor CFT automatically creates a certificate in AWS ACM in your AWS account for each subdomain (under InfinStor Domain) specified above. These certificates are used when configuring AWS API Gateway and AWS CloudFront.
- If you set this to false, you must have a wildcard certificate ready for the subdomains listed above and provide it in the next parameter.
- Enter the ARN of a pre-existing wildcard certificate located in the region where stack is being installed: This setting applies only if CreateCertificates is set to false; if CreateCertificates is true, it can be left blank. Specify the ARN of a wildcard certificate in the region where this stack is being installed. This certificate is used when configuring AWS API Gateway and AWS CloudFront.
- Enter the ARN of a pre-existing wildcard certificate located in us-east-1: This setting applies only if CreateCertificates is set to false; if CreateCertificates is true, it can be left blank. If you are installing in a region other than us-east-1, specify the ARN of a wildcard certificate in the us-east-1 region. This certificate is used to configure AWS CloudFront, which expects a certificate in us-east-1.
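Which certificate ARNs need to be supplied depends on CreateCertificates and on the installation region. The sketch below encodes that rule as read from the descriptions above. The function name and the labels "stack-region" and "us-east-1" are hypothetical, since this guide gives the ARN parameters by their prompts rather than by name.

```python
from __future__ import annotations


def required_certificate_arns(create_certificates: bool, stack_region: str) -> list[str]:
    """Return labels for the pre-existing wildcard certificate ARNs to supply.

    "stack-region" is the certificate in the region where the stack is installed;
    "us-east-1" is the extra certificate that CloudFront expects in us-east-1.
    """
    if create_certificates:
        # The CFT creates certificates itself, so both ARN fields can stay blank.
        return []
    needed = ["stack-region"]
    if stack_region != "us-east-1":
        # Only installs outside us-east-1 need the separate us-east-1 certificate.
        needed.append("us-east-1")
    return needed


if __name__ == "__main__":
    print(required_certificate_arns(False, "us-west-2"))
```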
1.2.4. DNS Entry Creation Options¶
- CreateDnsEntries: true or false.
- If you want the InfinStor CFT to create the required DNS entries automatically (see above for the entries), set this to true. In that case, Route53HostedZoneId below must also be set.
- If you will create the required DNS entries manually instead, set this to false. See above for details.
- Route53HostedZoneId: DNS Zone ID in Route 53 for the InfinStor Domain specified above, where InfinStor is to be installed. Only needed if CreateDnsEntries above is true.
1.2.5. Permissions Boundary Options¶
- UseBoundaryPolicy: true or false. If your corporate policy requires you to set a boundary policy, set this to true and enter the boundary policy ARN in the BoundaryPolicyARN item below.
- BoundaryPolicyARN: Required if UseBoundaryPolicy above is set to true. It must contain the ARN of the boundary policy to use.
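Both this section and the DNS section pair a true/false switch with a second parameter that becomes required only when the switch is true. A small pre-flight check, sketched below with hypothetical function and constant names, can catch a missing companion value before the stack is created. It assumes the parameter names are spelled as in the section bullets.

```python
from __future__ import annotations

# Switch parameter -> parameter that must be set when the switch is "true".
COMPANION_PARAMETERS = {
    "CreateDnsEntries": "Route53HostedZoneId",
    "UseBoundaryPolicy": "BoundaryPolicyARN",
}


def missing_companions(params: dict[str, str]) -> list[str]:
    """Return companion parameters that are required but empty or absent."""
    missing = []
    for switch, companion in COMPANION_PARAMETERS.items():
        enabled = params.get(switch, "false").strip().lower() == "true"
        if enabled and not params.get(companion, "").strip():
            missing.append(companion)
    return missing


if __name__ == "__main__":
    # Example: DNS creation is enabled but no hosted zone ID was given.
    print(missing_companions({"CreateDnsEntries": "true", "UseBoundaryPolicy": "false"}))
```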
1.2.6. Other Parameters¶
- enableDDBPointInTimeRecov: Setting this to true causes this CFT to configure the MLflow DynamoDB database for point-in-time recovery.
Be sure to tick the 'IAM Resources with Custom Names' and 'CAPABILITY_AUTO_EXPAND' checkboxes while clicking through the Stack Options page.
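If you script the installation instead of clicking through the console, the two checkboxes correspond to the CloudFormation capabilities CAPABILITY_NAMED_IAM and CAPABILITY_AUTO_EXPAND. The following is a minimal sketch using the boto3 `create_stack` call. The helper names are hypothetical, boto3 must be installed separately, and this guide itself only describes the console flow.

```python
from __future__ import annotations


def build_create_stack_kwargs(stack_name: str, template_url: str, parameters: dict[str, str]) -> dict:
    """Assemble keyword arguments for the CloudFormation create_stack call."""
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": value}
            for key, value in parameters.items()
        ],
        # Equivalent of ticking the two checkboxes on the Stack Options page.
        "Capabilities": ["CAPABILITY_NAMED_IAM", "CAPABILITY_AUTO_EXPAND"],
    }


def create_root_stack(kwargs: dict) -> dict:
    """Submit the stack; needs boto3 and AWS credentials, so it is not run here."""
    import boto3  # third-party dependency, imported only when actually used

    return boto3.client("cloudformation").create_stack(**kwargs)
```

A call would pass the template URL for your build and the parameter values chosen in the sections above; the stack name is yours to pick.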