Investigate Your Data
- Note the data is loaded. The number of rows varies between the different data sources (17787 and 65533). You can review the measurement/point names that describe our two turbines (Unit 1 and Unit 2). Statistics are presented for each point. For example, Unit 2 Stack NOx has 10273 valid rows (out of 17787), a minimum below 0, a maximum of 30.98, a mean of 3.16 and a standard deviation of 3.44 (which is 108.7% of the mean). Select the box outline icon (Select All). Then right mouse-click within the greyed statistics and select Plot and Trend Plot.
- All data is presented in row plots, which are not aligned across the different datasets. Change the plot presentation option from By Row to By Time using the upper right button.
- Now data from our data files is aligned in the plots by time (but no longer by rows). If you run your cursor across the data near any visible point, you can note that Unit 2 has results (running?) from September 18 through 21 and from September 27 through 28. Scroll down on this analyzer data to locate Unit 1 data using the scrollbar on the right.
- Note that the Unit 1 turbine seems to be running from September 21 through 27 (running your cursor over trend plots provides a pop-up of time and value). Drag your mouse (with the button held down) across a section of the displayed Unit 1 data (this will zoom over that range).
- Zoom is flexible: drag, or use selected ranges (row or time, as presented) with the find (flashlight) icon. In the upper left are additional zoom icons to zoom in (step-wise), zoom out (where present, to your previous zoom level) or zoom back to display all data. Next to these zoom tools are additional plotting options. Select (turn off) the line button.
- Note that points are interpolated and Unit 1 is not running the entire interim time. Gaps are filled in with lines between populated times. Experiment with Zoom and Lines/points. You may notice that average points are not plotted where there are insufficient pixels to display all of them; extremes are displayed preferentially (to support identifying events or relationships). Also, when in ‘line’ rather than point mode, if zoom is sufficient to display points separately, they are displayed with lines automatically. Scroll back to our first points, zoom fully out and turn line display back on.
- Maximize the window and change the number of plots displayed to 6 (from 4). While there are apparent spikes in the data, for now we’ll leave most of these as is. Select both Inlet O2 and Stack O2 (with mouse-click and Ctrl-mouse-click). Right mouse-click, select Cut and Cut Below Options. Ctrl-click selects individual plots and Shift-click selects a range of plots.
- Drag your cut Value below the visibly apparent real data on both plots and “Apply” your cut. The two O2 measurements are correlated and we can see (put the cursor over any trend values to understand the colors) that Stack O2 is sometimes higher than Inlet O2 (duct burner damper?). A zero excess O2 is bad data. A high O2 (up to 21.6% here) is more likely ambient, i.e. without combustion. You may notice controls in the upper left to change from cut (remove data) to clip (set data to limit), from Below (to Above or Both high and low) and to set a specific value (numerically, not graphically).
- Data is cleaned. Scroll down to check the O2 data on Unit 1. No cuts are required. Continue to check the rest of the data (scroll down). Observe your data and look for more data/time or other issues.
- There are problems with the plant weather station data. Apply cuts (multi-select, right mouse-click, Cut, Cut Below Value) on the problematic Outside Air Humidity and Temperature. Remember multi-select is Ctrl-mouse-click, or you may select a range with mouse-click, Shift-mouse-click.
- Cut the data (drag the cut line) below the least value, but above 0 (or set the cut below value to 5). Select Apply.
- Select the Dataset, Transforms option from the top icon. Select our TurbineEmissionsAnalyzer dataset (you may work with more than one dataset at the same time). Select (Ctrl-click) the three ‘Outside Air’ condition transforms where values were cut below 5. This is our transform list of how we have cleaned/processed the raw data. At the end of an exercise, if/when you export part of the dataset for developing a Machine Learning application, you will want to execute these same transforms prior to scoring your ML algorithm. In addition, you may note that in Interactive Data Exploration data is not eliminated, so almost anything is reversible. Right mouse-click and Delete these transforms. Then select Commit and close.
- Multi-select the three Outside Air plots again (if needed) and with a right mouse-click select the options Cut, Rows or Time.
- As needed, change the upper left plot option from “Drag Cut” to “Drag for Zoom”. Zoom in (by dragging) around the first data drop. You may do this multiple times and, if needed, reverse one step with the Zoom Out button.
- Change context to “Drag Cut” and drag across the rows that should be eliminated. This can be quicker and simpler if you include one or two extra ‘good’ values on each side. Zoom out and repeat (Zoom & Drag Cut). Select Apply with two cuts selected, leveraging the zoom tools. Scroll down through your data further to look for more issues or information.
- You may notice that other than the analyzer file (emissions) the ... Select the dataset overview tab (TurbineEmissionsAnalyzer) and select the Time Merge button at the top (also available under the Dataset Icon below the menu bar).
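
The difference between cut and clip, and the idea of replaying a recorded transform list before scoring a model, can be sketched outside the tool. The following is a minimal pandas sketch, not the tool's own API: the column name, the sample values, and the helper functions `cut_below`, `clip_below` and `apply_transforms` are all hypothetical. It also assumes that a "cut" can be modelled as marking values as missing rather than deleting rows, which mirrors the note above that data is not eliminated.

```python
import pandas as pd

# Hypothetical sample data; the column name and values are illustrative only.
raw = pd.DataFrame(
    {"outside_air_humidity": [45.0, 0.0, 47.5, 2.0, 50.0]},
    index=pd.date_range("2024-09-18", periods=5, freq=pd.Timedelta(hours=1)),
)


def cut_below(df, column, limit):
    """Cut: mark values below the limit as missing; rows are kept (assumption)."""
    out = df.copy()
    out[column] = out[column].where(out[column] >= limit)
    return out


def clip_below(df, column, limit):
    """Clip: raise values below the limit up to the limit."""
    out = df.copy()
    out[column] = out[column].clip(lower=limit)
    return out


def apply_transforms(df, transforms):
    """Replay a recorded list of (function, keyword arguments) steps in order."""
    for func, kwargs in transforms:
        df = func(df, **kwargs)
    return df


# A hypothetical transform list, analogous to "Cut below 5" on one point.
transforms = [(cut_below, {"column": "outside_air_humidity", "limit": 5.0})]

cleaned = apply_transforms(raw, transforms)              # cut: bad values become NaN
clipped = clip_below(raw, "outside_air_humidity", 5.0)   # clip: bad values become 5.0

print(cleaned)
print(clipped)
```

Because each helper works on a copy, `raw` is left unchanged, so dropping an entry from `transforms` and replaying the list undoes that step. Keeping the same list alongside an exported dataset is one way to apply identical cleaning to new data before scoring.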