Data Preparation

Use the importer to gather variables from data files into an internal structure called a dataset. When you save a project that contains a dataset, the dataset name is stored in the project folder. A Data Explorer project can export only data that is in a dataset.
A dataset consists of the original (raw) data values (obtained from the process history through imported data files), and a list of functions or transforms that have been applied to the data, producing a set of transformed data values. The transformed variables can include variables that are unchanged from their raw values, variables whose raw values have been modified by the transforms, and newly created variables generated by transforms. After any transforms have been applied to the dataset, the original data can still be viewed as it was before the transforms were applied.
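The relationship between raw values, the transform list, and transformed values can be illustrated with a small sketch. This is not Data Explorer's implementation or API; the `Dataset` class, its `apply` and `transformed` methods, and the two example transforms are hypothetical names chosen for illustration. The sketch assumes that transformed values are produced by replaying the transform list over a copy of the raw values, which is one simple way to keep the original data viewable.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Columns = Dict[str, List[float]]
Transform = Callable[[Columns], Columns]


@dataclass
class Dataset:
    """Hypothetical model: raw values plus an ordered list of transforms."""
    raw: Columns
    transforms: List[Transform] = field(default_factory=list)

    def apply(self, transform: Transform) -> None:
        # Record the transform; the raw values are never modified.
        self.transforms.append(transform)

    def transformed(self) -> Columns:
        # Replay every transform, in order, on a copy of the raw values.
        columns = {name: list(values) for name, values in self.raw.items()}
        for transform in self.transforms:
            columns = transform(columns)
        return columns


def scale_flow(columns: Columns) -> Columns:
    # Example transform that modifies an existing variable.
    out = dict(columns)
    out["flow"] = [value * 2.0 for value in columns["flow"]]
    return out


def add_ratio(columns: Columns) -> Columns:
    # Example transform that creates a new variable from two existing ones.
    out = dict(columns)
    out["ratio"] = [f / t for f, t in zip(columns["flow"], columns["temp"])]
    return out


dataset = Dataset(raw={"flow": [1.0, 2.0, 3.0], "temp": [10.0, 20.0, 30.0]})
dataset.apply(scale_flow)
dataset.apply(add_ratio)
result = dataset.transformed()
```

In this sketch, `result` holds all three cases described above: `temp` is unchanged from its raw values, `flow` has been modified, and `ratio` is newly created. `dataset.raw` still holds the original values.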
The terms column and variable are used interchangeably. A raw variable is, in most cases, a process variable that was read into the dataset from a data file; some other types of variables are also treated as raw variables and are discussed later. A raw variable to which a transform has been applied is still considered a raw variable, but it has both raw and transformed values. A computed variable is a variable that was created by applying a transform function to one or more variables in the dataset; the computed variable is said to depend on the variables from which it was derived. An independent variable is a variable created by a transform that generates new values without reference to any variable already in the dataset; examples include constants, row numbers, random numbers (noise), and date/time values.
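The three kinds of variable can be summarized as a small classification rule. The sketch below is illustrative only: `Kind`, `Variable`, and `classify` are hypothetical names, not part of Data Explorer. It covers only the common case in which a raw variable is one read from a data file, and it ignores the other variable types that are treated as raw.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Kind(Enum):
    RAW = "raw"
    COMPUTED = "computed"
    INDEPENDENT = "independent"


@dataclass(frozen=True)
class Variable:
    name: str
    from_file: bool = False        # read into the dataset from a data file
    inputs: Tuple[str, ...] = ()   # variables the creating transform read from
    has_transform: bool = False    # a transform was applied to this variable


def classify(variable: Variable) -> Kind:
    # A variable read from a file stays raw even after it is transformed.
    if variable.from_file:
        return Kind.RAW
    # Created by a transform that read existing variables: computed.
    if variable.inputs:
        return Kind.COMPUTED
    # Created by a transform that read no existing variable: independent.
    return Kind.INDEPENDENT


flow = Variable("flow", from_file=True, has_transform=True)
ratio = Variable("ratio", inputs=("flow", "temp"))
row_number = Variable("row_number")

kinds = {v.name: classify(v) for v in (flow, ratio, row_number)}
```

Here `flow` remains a raw variable despite having a transform applied, `ratio` is computed because it depends on `flow` and `temp`, and `row_number` is independent because it was generated without reference to any existing variable.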