Export
ML Transform (ETL) Manifest
On conclusion of the data preparation (data wrangling) described
above, the useful data (sometimes filtered to the data relevant to the
ML objective) is exported for training in external machine learning
toolsets. The export produces a prepared dataset of useful data
patterns and, in addition, a set of scripts. These scripts are a
subset of the transforms used in data preparation (not all of them).
Having been validated on the historical data, they are reused to
transform incoming data before the machine learning model is run. The
prepared data is used to train a model, while the selected data
preparation transforms (again, not necessarily all, but many of the
batch preparation transforms) are used in an online application. The
exported dataset now includes the following:
- Prepared data
- Data Preparation Transforms for online use
- An input/output data list mapping raw (input) data labels to transformed (output) data labels
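
The three exported items listed above could be bundled into a single manifest file. The following is a minimal sketch in Python, not the product's actual export format: the function `build_manifest`, the field names, the transform names, and the file name `prepared_data.csv` are all hypothetical and chosen only for illustration.

```python
import json


def build_manifest(prepared_data_path, batch_transforms, io_labels):
    """Assemble the three export items into one dictionary (hypothetical layout)."""
    # Keep only the batch transforms that were selected for online use.
    online = [t["name"] for t in batch_transforms if t["online"]]
    return {
        "prepared_data": prepared_data_path,   # location of the prepared dataset
        "online_transforms": online,           # subset of the batch transforms
        "io_labels": io_labels,                # raw (input) label -> transformed (output) label
    }


# Hypothetical batch transforms; only some are flagged for online use.
batch_transforms = [
    {"name": "drop_duplicates", "online": False},
    {"name": "scale_temperature", "online": True},
    {"name": "rolling_mean_5", "online": True},
]

# Hypothetical mapping of raw data labels to transformed data labels.
io_labels = {
    "temp_raw": "temp_scaled",
    "vibration_raw": "vibration_mean_5",
}

manifest = build_manifest("prepared_data.csv", batch_transforms, io_labels)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

In this sketch the batch-only transform (`drop_duplicates`) is left out of the manifest, which mirrors the point that not every data preparation transform is carried over to the online application.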