Developing Transforms

Typical Development Cycle

  • Choose the data to work with: an InfinSnap, an InfinSlice, or a Label.
  • Create a cell from an existing transform or from an InfinStor-provided template.
  • Note that the injected cell contains a call to the InfinStor SDK function test_infin_transform.
  • Modify the transform in the injected Jupyter cell.
  • Run the cell. This calls test_infin_transform, which runs the modified transform so that the changes can be tested.
  • When the transform works as desired, capture the cell as a transform using the 'Capture' button.
  • Note that the call to test_infin_transform is stripped from the captured transform.

Create a code cell using an existing transform or a template

Pressing the 'Browse' button in the Develop subsection of the Transforms section of the sidebar opens a dialog for choosing the transform to modify. Once a transform is chosen, pressing the 'Cell' button inserts a code cell into the notebook. In this example, the 'file-by-file-template' is chosen, and the following code is inserted into a notebook cell:

# This transform is called for each file in the chosen data
def infin_transform_one_object(filename, temp_output_dir, **kwargs):
    print('infin_transform_one_object: Entered. filename=' + filename + ', temp_output_dir=' + temp_output_dir)


%reset -f
from infinstor import test_infin_transform # infinstor

input_data_spec = dict() # infinstor
input_data_spec['type'] = 'infinsnap' # infinstor
input_data_spec['time_spec'] = 'tm20200907070000' # infinstor
input_data_spec['bucketname'] = 'isstage5test' # infinstor
input_data_spec['prefix'] = '' # infinstor
rv = test_infin_transform('isstage5', input_data_spec) # infinstor

Capture process strips off lines after '%reset -f'

The lines following the '%reset -f' line are useful during development but are not needed in the captured transform, so they are stripped off when the cell is captured as a transform.
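
The effect of this stripping can be pictured with a small sketch. This is not InfinStor's implementation: strip_development_lines is a hypothetical helper written only for illustration, and the text does not say whether the '%reset -f' line itself is kept, so dropping it here is an assumption.

```python
def strip_development_lines(cell_source, marker="%reset -f"):
    """Return the cell source up to, but not including, the marker line.

    Hypothetical helper for illustration only; not part of the infinstor package.
    """
    kept = []
    for line in cell_source.splitlines():
        if line.strip() == marker:
            break  # everything from the marker onward is development-only
        kept.append(line)
    # Drop the blank lines that sat between the transform and the marker
    while kept and not kept[-1].strip():
        kept.pop()
    return "\n".join(kept) + "\n"


cell = """def infin_transform_one_object(filename, temp_output_dir, **kwargs):
    print('Entered. filename=' + filename)


%reset -f
from infinstor import test_infin_transform # infinstor
rv = test_infin_transform('isstage5', {}) # infinstor
"""

captured = strip_development_lines(cell)
print(captured)
```

Only the transform function survives in captured; the import and the test_infin_transform call are gone.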

test_infin_transform

The function test_infin_transform is provided by the InfinStor SDK (the pip package named infinstor). It calls the transform function defined in the cell, e.g. infin_transform_one_object, as a development convenience.
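
To make the calling pattern concrete, here is a purely local stand-in that calls a file-by-file transform once per file in a directory. It is a sketch under stated assumptions, not a description of how test_infin_transform works internally: run_transform_locally is a hypothetical name, and it reads from a local directory rather than from an InfinSnap.

```python
import os
import tempfile


def run_transform_locally(transform, input_dir):
    """Call transform(filename, temp_output_dir) for every file under input_dir.

    Hypothetical stand-in for illustration; not part of the infinstor package.
    Returns the list of input file paths that were passed to the transform.
    """
    processed = []
    # The output directory is temporary and is deleted when the block exits
    with tempfile.TemporaryDirectory() as temp_output_dir:
        for root, _dirs, files in os.walk(input_dir):
            for name in sorted(files):
                path = os.path.join(root, name)
                transform(path, temp_output_dir)
                processed.append(path)
    return processed


# Same body as the file-by-file template: it only reports what it was given
def infin_transform_one_object(filename, temp_output_dir, **kwargs):
    print('infin_transform_one_object: Entered. filename=' + filename + ', temp_output_dir=' + temp_output_dir)
```

Calling run_transform_locally(infin_transform_one_object, some_directory) prints one line per file, which mirrors the per-file calling pattern described above.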

Example transformation: Simply copy input file to output

import os
import shutil

# This transform is called for each file in the chosen data
def infin_transform_one_object(filename, temp_output_dir, **kwargs):
    print('infin_transform_one_object: Entered. filename=' + filename + ', temp_output_dir=' + temp_output_dir)
    shutil.copyfile(filename, os.path.join(temp_output_dir, os.path.basename(filename)))


%reset -f
from infinstor import test_infin_transform # infinstor

input_data_spec = dict() # infinstor
input_data_spec['type'] = 'infinsnap' # infinstor
input_data_spec['time_spec'] = 'tm20200907070000' # infinstor
input_data_spec['bucketname'] = 'isstage5test' # infinstor
input_data_spec['prefix'] = '' # infinstor
rv = test_infin_transform('isstage5', input_data_spec) # infinstor
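
A transform that changes file contents instead of copying them has the same shape. The following is a minimal sketch, assuming the input files are UTF-8 text; the upper-casing is purely illustrative, and the local check at the end uses throwaway temporary directories in place of test_infin_transform.

```python
import os
import tempfile


# Called once per file, with the same signature as the template
def infin_transform_one_object(filename, temp_output_dir, **kwargs):
    # Assumption: the input is UTF-8 text; binary files would need different handling
    with open(filename, encoding='utf-8') as src:
        text = src.read()
    out_path = os.path.join(temp_output_dir, os.path.basename(filename))
    with open(out_path, 'w', encoding='utf-8') as dst:
        dst.write(text.upper())


# Local check: run the transform on one throwaway file
with tempfile.TemporaryDirectory() as in_dir, tempfile.TemporaryDirectory() as out_dir:
    sample = os.path.join(in_dir, 'sample.txt')
    with open(sample, 'w', encoding='utf-8') as f:
        f.write('hello infinstor\n')
    infin_transform_one_object(sample, out_dir)
    with open(os.path.join(out_dir, 'sample.txt'), encoding='utf-8') as f:
        result = f.read()

print(result)
```

As in the copy example, the output file is written into temp_output_dir under the input file's base name.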

Capturing the newly authored transform

Press the 'Capture' button in the Develop subsection of the Transforms section of the sidebar. The newly authored transform is captured for future use.