Export ML Transform (ETL) Manifest

When the data preparation (data wrangling) described above is complete, the useful data, sometimes filtered down to what is relevant to the ML objective, is exported for training in external machine learning toolsets. The export produces a prepared dataset and also a set of transform scripts. These scripts are a subset of the transforms used during data preparation (not all of them). Because they were validated on the historical data, they can be reused to prepare new data before it is passed to the machine learning model. In short, the prepared data is used to train a model, and many, though not necessarily all, of the batch preparation transforms are selected for use in an online application. The export therefore includes the following:
  • Prepared data
  • Data Preparation Transforms for online use
  • An input/output data list that maps raw (input) data labels to transformed (output) data labels
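
As an illustration only, the three exported items can be pictured as a single manifest record. The sketch below is not the product's actual export format: the `ExportManifest` class, the `build_manifest` helper, the field names, and the sample file and label names are all hypothetical, chosen to show how a subset of the batch transforms might be selected for online use and stored alongside the prepared data and the input/output label list.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ExportManifest:
    """Hypothetical container for the three exported items."""
    prepared_data: str       # location of the prepared dataset
    online_transforms: list  # subset of batch transforms kept for online use
    io_map: dict             # raw (input) label -> transformed (output) label


def build_manifest(prepared_data, batch_transforms, online_names, io_map):
    """Keep only the batch transforms selected for online use, in order."""
    selected = [t for t in batch_transforms if t["name"] in online_names]
    return ExportManifest(prepared_data, selected, io_map)


# Assumed example values; none of these names come from the product.
batch_transforms = [
    {"name": "fill_missing", "script": "fill_missing.py"},
    {"name": "dedupe_rows", "script": "dedupe_rows.py"},
    {"name": "scale_values", "script": "scale_values.py"},
]

manifest = build_manifest(
    prepared_data="prepared.csv",
    batch_transforms=batch_transforms,
    online_names={"fill_missing", "scale_values"},
    io_map={"raw_temp": "temp_scaled"},
)

# Serialize the manifest so it can travel with the exported dataset.
manifest_json = json.dumps(asdict(manifest), indent=2)
print(manifest_json)
```

In this sketch `dedupe_rows` stays behind as a batch-only transform, reflecting the point that not every preparation step is carried into the online application.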