39
PI World 2020 Lab Optimizing Analytics for Better Performance

PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

PI World 2020 Lab

Optimizing Analytics for Better Performance

Page 2: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

2 | P a g e

Operational IntelligenceCopyright Copyright & Trademark © Copyright 1995-2020 OSIsoft, LLC 1600 Alvarado Street San Leandro, CA 94577 © 2020 by OSIsoft, LLC. All rights reserved. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, mechanical, photocopying, recording, or otherwise, without the prior written permission of OSIsoft, LLC. OSIsoft, the OSIsoft logo and logotype, Managed PI, OSIsoft Advanced Services, OSIsoft Cloud Services, OSIsoft Connected Services, PI ACE, PI Advanced Computing Engine, PI AF SDK, PI API, PI Asset Framework, PI Audit Viewer, PI Builder, PI Cloud Connect, PI Connectors, PI Data Archive, PI DataLink, PI DataLink Server, PI Developers Club, PI Integrator for Business Analytics, PI Interfaces, PI JDBC Driver, PI Manual Logger, PI Notifications, PI ODBC Driver, PI OLEDB Enterprise, PI OLEDB Provider, PI OPC DA Server, PI OPC HDA Server, PI ProcessBook, PI SDK, PI Server, PI Square, PI System, PI System Access, PI Vision, PI Visualization Suite, PI Web API, PI WebParts, PI Web Services, RLINK and RtReports are all trademarks of OSIsoft, LLC. All other trademarks or trade names used herein are the property of their respective owners. U.S. GOVERNMENT RIGHTS Use, duplication or disclosure by the US Government is subject to restrictions set forth in the OSIsoft, LLC license agreement and/or as provided in DFARS 227.7202, DFARS 252.227-7013, FAR 12-212, FAR 52.227-19, or their successors, as applicable. Published: April 17, 2020

Page 3: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

Table of Contents

Contents Table of Contents .............................................................................................................................................. 3

1. Introduction................................................................................................................................................. 5

1.1 Overview of Lab .................................................................................................................................. 5

1.2 Key Tasks ........................................................................................................................................... 5

1.3 Key Resources for Improving Analysis Service Performance .............................................................. 6

2. Directed Activity - Export Analytics Performance Data and Analyze ........................................................... 7

2.1 Objective of Activity ............................................................................................................................. 7

2.2 Activity Tasks ...................................................................................................................................... 7

2.3 Step by Step Explanation .................................................................................................................... 7

2.4 Solution ............................................................................................................................................. 15

2.5 Summary ........................................................................................................................................... 16

3. Directed Activity – Build Error Checking into an Analytic........................................................................... 18

3.1 Objective of Activity ........................................................................................................................... 18

3.2 Tasks ................................................................................................................................................ 18

3.3 Step by Step Explanation .................................................................................................................. 18

3.4 Summary ........................................................................................................................................... 22

4. Directed Activity – Pre-calculate Variables used in Multiple Analytics ....................................................... 23

4.1 Objective of Activity ........................................................................................................................... 23

4.2 Identify the Tasks .............................................................................................................................. 23

4.3 Step by Step Explanation .................................................................................................................. 23

5. Directed Activity – Shift Calculations to PI Server ..................................................................................... 26

5.1 Objective of Activity ........................................................................................................................... 26

5.2 Tasks ................................................................................................................................................ 26

5.3 Step by Step Explanation .................................................................................................................. 27

6. Directed Activity – Use Exit() Function to Optimize Conditional Calculations ............................................ 28

6.1 Objective of Activity ........................................................................................................................... 28

6.2 Tasks ................................................................................................................................................ 29

6.3 Step by Step Explanation .................................................................................................................. 29

7. Review ..................................................................................................................................................... 31

7.1 Analysis of Performance Statistics Following Fixes ........................................................................... 31

8. Optional Extra – Manually Recalculate Analytics with Out of Order Data as Inputs................................... 33

Page 4: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

4 | P a g e

8.1 Objective of Activity ........................................................................................................................... 33

8.2 Tasks ................................................................................................................................................ 33

8.3 Steps ................................................................................................................................................. 33

Save the Date! ................................................................................................ Error! Bookmark not defined.

Page 5: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

1. Introduction

1.1 Overview of Lab

A modern PI System uses an asset based approach to data analysis as opposed to the older tag based approach previously used. With PI Vision, the PI Integrators, and other tools that are asset focused there is great value to be had through an asset based PI System.

One of the biggest benefits comes from the Asset Analytics system that is part of PI AF. The “one-to-many” feature of creating analytics from templates and the greater power of the analytics engine have proven very powerful compared to the older Performance Equation system that was built into the PI Data Archive. The Asset Analytics engine is also much more scalable than the Performance Equation subsystem was, allowing for large numbers of Analytics.

There is a potential danger to PI System performance when analytics are created at large scales, however. Several factors contribute to potential performance issues, including:

• Badly written analytics that do not follow best practices

• Errors in source data causing slow performance in otherwise well-written analytics

• Analytics that are written in a way that works fine at small scales that are then scaled up to the point where they can cause issues

If there are too many performance issues with Asset Analytics, analytic results can be delayed or even skipped. This can lead to a lack of confidence among users in the results of the analytics.

1.2 Key Tasks

In this lab, we will cover how to improve the efficiency of a large-scale analytics implementation. We will start with how to analyze where to focus efforts in improving analytics. We will then show some common problems that could be causing performance issues with analytics along with potential improvements.

• Export Analysis Statistics from PSE

• Analyze the Analysis Statistics to identify the Analysis(es) that is consuming resources

• Add error checking to avoid Analyses in error

• Identify the best place and time to perform a calculation

• Scheduling

• Dependencies

• Calculate in AF or on the PI Data Archive

• Use the Exit() function to calculate only when necessary

Page 6: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

6 | P a g e

• Monitor recalculations and recalculate manually if necessary

1.3 Key Resources for Improving Analysis Service Performance

1. KB Articles

a. How to Troubleshoot PI Analysis Service Performance

https://customers.osisoft.com/s/knowledgearticle?knowledgeArticleUrl=KB01664

b. Backfilling and Recalculating Analyses with Asset Analytics

https://customers.osisoft.com/s/knowledgearticle?knowledgeArticleUrl=KB01056

c. Automatic Recalculation with the Analysis Service: Technical Details

https://customers.osisoft.com/s/knowledgearticle?knowledgeArticleUrl=KB01790

2. Sebastien's blog on PI Square has a series on improving Analysis performance

a. https://pisquare.osisoft.com/people/sraposo/blog

b. Use variables

c. Data density and pattern

d. Input attributes

e. Analyses in warning or error

f. Scheduling

Page 7: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

2. Directed Activity - Export Analytics Performance Data and Analyze

2.1 Objective of Activity

When first building out a PI System with Analytics, it is rare to see performance problems due to

the small number of analytics. When performance problems do occur, they are easy to isolate

and improve. As the PI System is used and grows over time, and additional analytics are added

and grow the system to a large scale, performance problems both become more likely and

become harder to isolate. The first step to improving the performance of a PI System’s Analytics

is to identify where to focus troubleshooting efforts. In this activity we will show one method for

identifying Analytics that could use attention.

2.2 Activity Tasks

• Export the Analytics performance data from PI System Explorer

• Use PowerShell to parse the data from XML format into a CSV format for import into Excel

• Import the data into Excel

• Enhance the data with additional derivative columns

• Analyze the data in a Pivot table

2.3 Step by Step Explanation

A. Export the performance data from PI System Explorer

1. Launch PI System Explorer

2. Go to the “Management” tab

3. Right click in the lower right under “Pending Operations”

Page 8: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

8 | P a g e

4. Select “View Analysis Services Statistics”

Page 9: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

5. Click “Save” in the lower right and save the resulting file to “C:\Class”

💡

Tip

The PowerShell commands are in a file called “PowerShell and Excel content to copy.txt” located in the C:\Class folder on the lab virtual

machine.

C. Convert the data into a CSV format in PowerShell

1. Launch PowerShell 2. Change directory to “C:\Class”:

cd \Class 3. Import the XML file to the variable “XmlDocument”:

[xml]$XmlDocument = Get-Content -Path .\analysisServiceStatistics.txt

4. Store the relevant performance information into the variable “EvalSummary”: $EvalSummary = $XmlDocument.AnalysisServiceStatistics.ServiceStatistics.EvaluationStatistics.EvaluationStatisticsForCalculationGroups.ANEvaluationStatisticsForGroup.EvaluationSummary._childrenSummary.ANEvaluationSummary

Page 10: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

10 | P a g e

💡

Tip

PowerShell uses the <Tab> key to autofill based on the text already entered. For example, try typing "$XmlD" and then <Tab>. PowerShell will autofill "$XmlDocument". Next, you can type ".An" and <Tab>. PowerShell

will autofill " $XmlDocument.AnalysisServiceStatistics".

5. Export the relevant performance information to a CSV file for use in Excel

using the “Export-Csv” PowerShell function: $EvalSummary | Select-Object -Property DisplayName, @{LABEL="AvgElapsed"; Expression={$_.ElapsedMilliSeconds.Average}}, @{LABEL="MaxElapsed"; Expression={$_.ElapsedMilliSeconds.Maximum}}, @{LABEL="AvgLag"; Expression={$_.EvaluationLagMilliSeconds.Average}}, @{LABEL="MaxLag"; Expression={$_.EvaluationLagMilliSeconds.Maximum}}, @{LABEL="DataDuration"; Expression={[Xml.XmlConvert]::ToTimeSpan($_.Duration).TotalSeconds}}, @{LABEL="ExecutionCount"; Expression={$_.Count}} | Export-Csv -path AnPerfStats.csv -NoTypeInformation

D. Import the CSV file into Excel

1. Browse to the “C:\Class” folder to open the “AnPerfStats.CSV” file created above in Excel

2. Click the “Close” button to dismiss the Office Activation Wizard. 3. Type Ctrl+S to save. You will be prompted to change format. Click “No”

change format.

Page 11: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

4. Click “Browse” to open the “Save As” dialog and save the file as .xlsx and convert it to Excel format

E. Add a “Template” column

1. Insert a new column to the right of the “DisplayName” column

2. Add a header “Template” in column B1 3. In cell B2, insert the following code to extract the Template name from the

DisplayName: =MID(LEFT(A2,LEN(A2)-2),SEARCH("[",A2,SEARCH("[",A2)+1)+1,SEARCH("]",A2)-SEARCH("[",A2))

Page 12: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

12 | P a g e

4. Fill down for all of column B by highlighting cell B2 and double-clicking on the square in the lower right of the cell

F. Calculate the average triggering interval

1. Create the column heading “AvgTrigger” to the right of “ExecutionCount” 2. Calculate the average trigger interval by dividing "DataDuration" by the

"ExecutionCount": =G2/H2 3. Fill down.

G. Calculate a weighting factor to determine how significant the calculations are in terms of performance load

1. Create the column heading “WeightingFactor” to the right of “AvgTrigger” 2. Set the weighting factor to "AvgElapsed" divided by the "AvgTrigger": =C2/I2 3. Fill down

H. Build Pivot table

1. Create a new sheet 2. From the “Insert” ribbon, click on “PivotTable”

3. In the “Select a table or range” field, select the entire source data table. Click OK to close the

Page 13: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

4. In the “PivotTable Fields” docked window on the right, drag “Template” to “Rows”

5. Drag “AvgElapsed” to “Values”

a. Click the dropdown next to “AvgElapsed” and click “Value Field Settings” b. Change the value type to “Average of AvgElapsed” by selecting

“Average” from the list and clicking “OK”

6. Drag “MaxElapsed” to “Values” and change it to “Max of MaxElapsed” using the value field settings

7. Drag “AvgLag” to “Values” and change it to “Average of AvgLag” using the value field settings

8. Drag “MaxLag” to “Values” and change it to “Max of MaxLag” using the value field settings

9. Drag “AverageTrigger (Seconds)” to “Values” and change it to “Average of AverageTrigger (Seconds)” using the value field settings

10. Drag “WeightingFactor” to “Values” and change it to “Average of WeightingFactor” using the value field settings

I. Sort the Pivot table by Average of WeightingFactor

Page 14: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

14 | P a g e

1. Highlight the “Row Labels” cell and click the dropdown on the right of the cell

2. Select “More Sort Options” 3. Select “Descending (Z to A) by:” and select “Average of WeightingFactor”

from the dropdown, then click “OK”

Page 15: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

J. Add Conditional Formatting to highlight Analytics load

1. Highlight the cells in the “Average of WeightingFactor” column excepting the header and footer rows

2. From the “Home” ribbon, select “Conditional Formatting” 3. Select “Data Bars” from the dropdown, then choose the data bar fill of your

choice

2.4 Solution

Your solution should look something like this:

A sample solution file called “PerformanceAnalysis.xlsx” is available in the “C:\Class\Solution” folder. The meaning of the columns is as follows:

• Elapsed (Milliseconds): How long the Analytic took to execute. The Elapsed time should always be less than the triggering interval.

• Lag (Milliseconds): How delayed the Analytic result was from when it should have been available. High lag indicates that the Analytics engine is unable to keep up with its performance load.

• Trigger (Seconds): How often the Analytic executes.

Page 16: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

16 | P a g e

• WeightingFactor: A calculated column that is the average elapsed time divided by the trigger interval. Analytics that take longer or execute more frequently will have a higher WeightingFactor.

2.5 Summary

By inspecting the PivotTable, in particular the WeightingFactor column, it is possible to figure out where to focus efforts for optimizing Analytics to improve performance. In the example solution above, the “High Wind Speed Calculation” analytic is dramatically more expensive than most other analytics, followed closely by “Lost Power 30d Average”. In this example, since most of the analytics trigger at roughly the same interval, the “Average of AvgElapsed” column could also be used to diagnose the system. However, the WeightingFactor column takes into account both the amount of time the analytic takes to complete and how often it triggers. This means the WeightingFactor is more useful in real systems, where the triggering rate differs. Analytics that take a long time to execute but don’t execute very frequently might not need as much attention as those that trigger often.

One factor to keep in mind with this analysis is that the PI System Explorer plugin that analyzes analytic performance only shows the worst performing five instances of each analytic template. This still allows us to see which analytics are performing badly, but it may hide certain problems caused when many analytics instances are running because the analysis is part of template used for many elements. If an analytic takes over a second to evaluate and triggers every 60 seconds, it could appear expensive, but if there are only a few instances of this particular analytic it might not be problematic. On the other hand, an analytic that takes a tenth of a second to execute every 60 seconds might seem benign in this analysis since we’re only seeing the five worst instances, but if there are tens of thousands of instances, it could be problematic.

One way to deal with this would be to add a column to the raw data indicating how many instances of each analytic template exist. There is not an easy way to get this number from the data we can export currently, but it could be obtained from a count of elements of a particular template using either PI System Explorer or a PI SQL query tool.

Another way to approach a situation like this, if the analysis performed in this lab is not

sufficient, is to contact OSIsoft Technical Support. OSIsoft has tools that can be run to extract

more detailed performance data for all analytics. This data can then be used to build a more

complete PivotTable of analytics performance. In the below example, the “Count of averageLag”

column shows how many instances of each analytic exist. Additionally, the “TimeWeighting”

factor used a sum rather than an average, so that each instance of a particular analytic template

contributed to the total, adding an additional layer of weighting based on count of a particular

analytic.

Page 17: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to
Page 18: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

18 | P a g e

3. Directed Activity – Build Error Checking into an Analytic

3.1 Objective of Activity

One key measure admins can take on an ad-hoc basis to improve the performance of the

Analysis service is to stop calculations that are generating errors. One step better than stopping

calculations in error is to write calculations that do not generate errors because they incorporate

robust error checking.

Error checking has additional benefits, beyond performance:

• More accurate results from the calculations

• More reliable calculation output

• Less user confusion

The Asset Analytics engine also attempts to restart calculations in an error state every fifteen

minutes by default. This can cause significant performance issues if there are a large number of

Analytics in an error state.

3.2 Tasks

• Exclude the “Local Wind Speed” attribute on a turbine that does not have a local wind speed

sensor

• Add error checking to the “Local Wind Speed vs Site Wind Speed” analytic

• Replace the “Local Wind Speed” tag with a “Not Applicable” tag on a turbine that does not

have a local wind speed sensor

3.3 Step by Step Explanation

A. Exclude Attributes not available on certain assets

1. Launch PI System Explorer 2. Navigate to the Elements view 3. Navigate to the “RB006” element under “Scirocco\Finisterre”

Page 19: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

Select the “Local Wind Speed” attribute and add the “Excluded” property using the attribute properties pane on the right

4. Check in your changes 5. Change to the Analyses tab for the “RB006” element 6. Observe that the “Local Wind Speed” analytic is in a paused state because

an input variable is an “Excluded” attribute

💡

Tip

If the excluded attributes are not wrapped in “BadVal()” function checks, they will cause the analytic to pause. If there are other calculations that

should take place within the Analytic unrelated to the excluded attributes, make sure to use the “BadVal()” function to test the input as in section B

below.

B. Local Wind Speed vs Site Wind Speed - Check input values and Exit() when the input values do not contain good data.

1. In PI System Explorer, select the Library view 2. Navigate to the “Turbine” element template 3. On the “Analysis Templates” tab, select the “Local Wind Speed vs Site Wind

Speed Analytic”

Page 20: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

20 | P a g e

4. With a turbine GE001 selected as the Example Element, preview the results of the analytic.

5. Note that the value of the input “Local Wind Speed” sometimes evaluates to

“Comm Fail”, causing calculations to fail.

6. Select turbine RB001 as the example element, then evaluate the analytic

again. Note that the RB turbines do not have local wind speed sensors, so the analytic will always fail.

7. Add error checking and use the “Exit()” function to stop processing if there is

a bad value on the local wind speed input:

IF BadVal('Local Wind Speed') THEN Exit() ELSE 'Local Wind Speed'

💡

Tip

Use Shift+Enter to add line breaks within an expression in PSE.

Page 21: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

a. The “BadVal(‘Local Wind Speed’) function will return true if there is a problem with the input, either a missing tag or a bad state in the tag value itself.

b. The “Exit()” function will stop processing of the analytic and save processing time and further errors if there is a bad value on the input.

8. Preview the analytic results for both a GE turbine and an RB turbine, and note that now there are no errors.

9. Check in your changes in PI System Explorer.

💡

Tip

For analytics that read their own output, a “Totalizer” type functionality for example, error checking is critical! A simple error

getting into the results tag could result in continuing errors.

C. Use a “Not Applicable” tag to replace attributes that are not available for certain assets

1. In PI System Explorer, navigate to the Elements view 2. Navigate to the “RB007” element under “Scirocco\Finisterre” 3. Select the “Local Wind Speed” attribute 4. Click “Settings” to change the PI Point Data Reference settings 5. Enter “NotApplicableTag” for the tag name field and press “OK”

6. Check in your changes 7. Note that the “Local Wind Speed” attribute now displays a value of “NA” 8. Change to the Analyses tab and select the “Local Wind Speed vs Site Wind

Speed” analytic 9. Evaluate the results. Note that now the analytic will exit cleanly because of

our error checking. The “NotApplicableTag” is a double type with a system digital state of “NA”, so that the BadVal() function will evaluate to true.

💡

If the target PI Tag does not exist, the BadVal() function will also give a true result, but performance will be better when the tag exists. The

Page 22: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

22 | P a g e

Tip Analytics engine will be able to cache the value from an existing tag rather than searching for a non-existing tag (on each evaluation).

3.4 Summary

Error checking, both for data errors within the tags and for non-existing tags, can be critical to analytics performance at a large scale. The “BadVal()” function is key. Both attribute exclusion and a default “Not Applicable” tag are valid ways to solve situations where tags are not available for certain assets, but each has advantages.

Excluded attributes make it more obvious that it is a known factor that a particular asset does not have a certain tag. Excluding attributes on many assets takes extra effort however. Conversely, if the “Not Applicable” tag is set as the default on an element template, and only assets that have the tags replace the default, it can be much less effort at large scales.

For example, a transmission and distribution customer might have hundreds of transformers with one to five sensors attached, but with the sensors being moved around to different transformers month to month. In this case, it is easiest to default all five attributes to the “Not Applicable” tag, then overwrite the existing tags with the proper tag name. Then when a new set of tags comes in, the attributes can all be reset to template and the correct tags overwritten again.

Page 23: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

4. Directed Activity – Pre-calculate Variables used in Multiple Analytics

4.1 Objective of Activity

One best practice for Asset Analytics is to use variables within a single Analysis to store values

for later use. When we store the result of a calculation, that result can then be used multiple

times within the Analysis without evaluating the calculation again.

This concept can be taken a step further by considering calculations that are used in multiple

Analytics.

This is useful in a few scenarios:

1) Where both an Expression Analytic and an Event Frame Generation analytic use the

same calculation result as an input.

2) Where multiple Expression Analytics for different elements use the same calculation

from a common source variable as an input, for example a source variable on a higher

level of the hierarchy.

3) When multiple Expression Analytics use the same source calculation, but the derivative

calculations need to be calculated on a different schedule.

If an expression is used in two different Expression Analytics with similar scheduling, the two

Expression Analytics can often be combined into a single analytic and therefore use a common

source variable within that analytic.

It is worth noting that using this approach creates extra analytics, which can be a problem in

systems that are already large. However, if the precursor analytic can be made more efficient or

scheduled more slowly than the original expensive analytics it feeds data to, the approach is

worth considering.

4.2 Identify the Tasks

• Identify variable or calculation used in multiple Analytics

• Create new precursor analytic to calculate the common variable

• Modify original analytics to use results from the new output tag rather than calculating

multiple times

4.3 Step by Step Explanation

A. Identify variable or calculation used in multiple Analytics

1. In PI System Explorer, navigate to the Library, select the “Turbine” element template, then select the “Production Lost vs. Average Power Production” analytic

Page 24: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

24 | P a g e

2. Launch a 2nd instance of PI System Explorer by right clicking on it in the taskbar and selecting “PI System Explorer (64-bit)”

3. In the new instance of PI System Explorer, navigate to the Library, select the “Turbine” element template, then select the “Production Lost EF” analytic

4. Note that both analytics use the 30 day average of Active Power calculation:

5. In either instance of PI System Explorer, inspect the “Active

Power|30dAverage” attribute. Note that it is a PI Point data reference calculation, and is therefore calculated on demand, or in other words when either analytic executes.

6. Preview results on the “Production Lost vs. Average Power Production”

analytic and note that it is executing every 10 seconds. Barring extreme circumstances, a 30 day average cannot change significantly in 10 seconds, yet it must be recalculated every 10 seconds for this Analysis (and then again separately for the EF Analysis). We can improve performance if we calculate it separately on a slower schedule.

B. Create a new precursor Analytic to calculate the common variable

1. Create a new analytic called “ActivePower30dAvg”. Set the first input to the “Active Power|30dAverage” attribute, and map the output to the “Active Power|30dAverageStored” attribute (which points to a PI tag). Set the execution scheduling to periodic every hour.

Page 25: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

2. Check in your changes, and backfill the new “ActivePower30dAvg” analytic for today (start time “t”, end time “*”) for all turbine elements.

C. Modify original analytics to use results from the new output tag rather than calculating multiple times

1. Change the “AvgPower” input in the “Production Lost vs. Average Power Production” analytic to use the new “ActivePower|30dAverageStored” attribute:

2. In the same instance of PI System Explorer, Change the “AvgPower” input in the “Production Lost EF” analytic to use the new “ActivePower|30dAverageStored” attribute

3. Check in your changes.

Page 26: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

26 | P a g e

5. Directed Activity – Shift Calculations to PI Server

5.1 Objective of Activity

Certain simple calculations can be shifted to take place on the PI Data Archive using the

Archive Subsystem. This is accomplished using the summary calculation methods on the PI

Point Data Reference. These calculations include:

• Average (Event Weighted or Time Weighted)

• Count

• Delta

• Maximum/Minimum

• Population Standard Deviation

• Range

• Standard Deviation

• Total (Event Weighted or Time Weighted)

There are two primary advantages to shifting these calculations to the PI Data Archive:

1) Performance load on the Analytics server is reduced

2) Network traffic between the Analytics server and the PI Data Archive is reduced

Note that reducing the load on the Analytics server comes at a cost of increasing load on the PI

Data Archive. However, the PI Data Archive is extremely efficient at these kinds of calculations,

so it is often acceptable to share the load for these kind of calculations with the PI Data Archive.

The network traffic improvement happens because only the calculation results are sent from the

PI Data Archive to the Analytics server instead of all the raw values needed for calculation. In

cases where the source data is frequent, this can be a significant advantage when considering

large numbers of analytics. However, performance load on the PI Data Archive should be

monitored and considered as well. If a PI System Collective is present, using the secondary PI

Data Archive as the data source for Analytics can often help with performance load.

In this section, we will optimize the “Lost Power 30d Average” analytic by both reducing the

scheduling to something reasonable for a 30 day average, and by shifting the calculation itself

to the PI Data Archive.

5.2 Tasks

• Compare results for an Analytics calculation and PI Point Data Reference summary

calculation

• Modify the Analytic to use the PI Point Data Reference summary calculation

Page 27: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

5.3 Step by Step Explanation

A. Compare results for an Analytics calculation and PI Point Data Reference summary calculation

1. In PI System Explorer, navigate to the Library, select the “Turbine” element template, then the “Lost Power 30d Average” analytic

2. Inspect the two inputs, and note that “AnalyticsCalc” uses the PI Analysis function “TagAvg()” while the “ServerSideCalc” input uses a reference to a PI Point Data Reference summary calculation.

3. Preview results for the analytic and note that the resulting values are the same.

B. Modify the Analytic to use the PI Point Data Reference summary calculation

1. Delete the “AnalyticsCalc” variable and map the “ServerSideCalc” variable to the “Lost Power 30d Avg” attribute instead.

2. Note that the analytic is event scheduled and executes far more often than needed for a 30d average.

3. Change the analytic to hourly periodic scheduling.

💡

Tip

In this lab, the PI Data Archive and the Analytics Service are located on the same virtual machine, so a change in calculation loading is difficult to detect. In real world cases, the CPU loading of the Analytics Service and

PI Archive Subsystem should be monitored.

Page 28: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

28 | P a g e

6. Directed Activity – Use Exit() Function to Optimize Conditional Calculations

6.1 Objective of Activity

Some Analytics are extremely expensive, but necessary under certain conditions. Originally the

best way to deal with this was “IF-THEN-ELSE” logic to avoid expensive calculations and the

“NoOutput()” function to avoid storing an unnecessary result. In a recent release of Analytics, a

new function was added that takes this a step further. The “Exit()” function allows Analytics to

stop processing entirely when desired, skipping all further rows (variables) without evaluating.

Although this could previously be accomplished with the “IF-THEN-ELSE” logic and the

“NoOutput()” function, using the “Exit()” function can be much cleaner and even saves a few

processing cycles by skipping all further calculations in the analytic.

Although less necessary from a performance perspective, the “Exit()” function is also useful to

build calculations that execute on a slower schedule than the “Daily” option provided by the

Analytics scheduler. For example, a weekly calculation rather than daily. To accomplish this,

check the date or day using the time functions of Analytics, and use the “Exit()” function if the

calculation should not be executed at the current time.

Page 29: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

6.2 Tasks

• Modify the “High Wind Speed Calculations” analytic to only execute when the wind speed is high

6.3 Step by Step Explanation

A. Add IF THEN ELSE with Exit() to the calculation

1. In PI System Explorer, navigate to the Library, select the “Turbine” element template, then the “High Wind Speed Calculations” analytic

2. Note that the Analytic executes fully even if the Wind Speed is not higher than 20.

3. Modify the “WindSpeedHigh” variable to be “true” if wind speed is greater than 20 and to “Exit()” otherwise

4. Preview results and note that the Analytic displays “—” for results unless the

wind speed is high. This indicates that the Analytic is exiting.

5. Alter the “HighWindPercentage” line to only execute if it has been more than

a minute since the last high wind event by adding “IF HasChanged(‘High Wind Speed Calculation_HighWindPercentage’,’*-1m’) THEN Exit() ELSE “ to the variable:

Page 30: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

30 | P a g e

💡

Tip

The “HasChanged(<Attribute>,<Time>)” function can be useful if there is an expensive calculation that needs to take place sometimes, but does not

need to be executed at more than a certain frequency.

6. Check in your changes.

Page 31: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

7. Review

7.1 Analysis of Performance Statistics Following Fixes

We can repeat the directed activity from section 2 after making our performance improvements

in order to see the improved results. Although we may not have time to let the Analytics engine

run for a long period, results should still be visible.

The performance statistics are reset when the Analytics service is reset, so first reset the

Analytics Service in your lab virtual machine with the following steps:

1. Launch the “Services” application by typing “services” in the search 2. Find the service called “PI Analysis Service” 3. Restart the service by right clicking on it and selecting “Restart”

After restarting the service, wait about ten minutes to give the analytics time to run, then repeat

the steps in the directed activity to export the results and analyze them. It could be interesting to

put the new results in a new tab in your existing spreadsheet to be able to compare.

On the next page, you can see sample results from the changes we made to the analytics.

Some of the same analytics are still expensive, but they are dramatically less expensive than

previously. The average elapsed time has decreased significantly on the analytics that we

worked on, as has the weighting factor.

Page 32: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

32 | P a g e

Before:

After:

Page 33: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

8. Optional Extra – Manually Recalculate Analytics with Out of Order Data as Inputs

8.1 Objective of Activity

Automatic recalculation can help ensure your analytics take into account all of the data,

particularly when data changes unexpectedly after the initial calculation. However, automatic

recalculation may create a heavy load on the server in some situations. First, the nature of the

input data may cause recalculations: late arriving and out of order data can trigger

recalculations. If the input data arrives slowly and spread out over time it may cause many

recalculations before the final "correct" calculation is run.

To make matters worse, if there are analyses that use the output of a recalculated analysis as

inputs, they must also be recalculated, which could trigger a cascade of recalculations. This is

especially common when rolling up the results of other analytics. In the case of cascades,

manually recalculating on a schedule allows forced sequential recalculating of analytics that

have dependencies on other analytics.

Additionally, there may be some time limit after which the data should not be recalculated.

In the above scenarios, the solution is to disable automatic recalculations and recalculate

manually.

References:

• Automatic Recalculation with the Analysis Service: Technical Details

• PI Asset Framework and PI System Explorer > Asset analytics and notifications > Analysis

service configuration

8.2 Tasks

• Determine how frequently analyses are auto-recalculating

• Disable automatic recalculation

• Script a solution to recalculate the analyses at the desired frequency

8.3 Steps

A. Create a list of Analyses to recalculate manually

1. Open the recalculation log file by right clicking in the "Pending Operations" pane of the "Management" pane.

Page 34: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

34 | P a g e

In our lab systems, the path is: C:\ProgramData\OSIsoft\PIAnalysisNotifications\Data\Recalculation\re

calculation-log.csv. 2. If needed, use Excel to filter and count the recalculations for each analysis.

Filter column "Type" and remove all manual recalculation requests. 3. Create a text file (or one file for each recalculation schedule) called

C:\Class\DailyRecalculationIDs.txt and add the IDs for the Analyses that should be manually recalculated.

4. Close Excel to release the recalculation log file (if Excel locks the file, the service will not be able to write to it).

B. Disable automatic recalculation for the above Analyses

C. Create a script to recalculate

1. Create a script to add the desired Analyses to the recalculation queue. Sample below.

$filePath = "C:\Class\DailyRecalculationIDs.txt" $AFServerName = "pisrv01" $StartTime = "y" $EndTime = "t" $idsString = Get-Content -path $filePath $GuidList = new-object System.Collections.Generic.List[Guid] foreach($idString in $idsString) { $GuidList.Add(([Guid]::Parse($idString))) }

Page 35: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

$GuidArray = $GuidList.ToArray() [Reflection.Assembly]::LoadWithPartialName("OSIsoft.AFSDK") | Out-Null [OSIsoft.AF.PISystems] $AFServers = new-object OSIsoft.AF.PISystems [OSIsoft.AF.PISystem] $AFServer = $AFServers[$AFServerName] $Analyses = [OSIsoft.AF.Analysis.AFAnalysis]::FindAnalyses($AFServer,$GuidArray,$null) $TimeRange = [OSIsoft.AF.Time.AFTimeRange]::Parse($StartTime,$EndTime) $AFServer.AnalysisService.QueueCalculation($Analyses,$TimeRange,1)

D. Use task scheduler to run the script daily.

1. Open "Task Scheduler" and select "Create Task…" in the "Actions" bar on the right.

2. On the "General" tab, give the task a meaningful name, select an account to

run the task (the service account for Analysis would be a good candidate),

Page 36: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

36 | P a g e

and set it to "Run whether user is logged in or not".

3. On the "Triggers" tab, create a new trigger. Try to schedule the recalculation

for a time where load will be low. In particular, do not run at midnight since there are likely many calculations that already run at midnight.

4. On the "Actions" tab, create a new action. In the “Program/script” field enter

Powershell.exe and in the “Add arguments” field enter -file c:\class\DailyRecalcScript.ps1

Page 37: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

E. Verify recalculation is working as expected

1. Open the recalculation log file and verify that the calculations were scheduled and executed without errors.

2. Review the output data.

Page 38: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

38 | P a g e

Page 39: PI World 2020 Lab Optimizing Analytics for Better Performance World 2020 … · 1.1 Overview of Lab A modern PI System uses an asset based approach to data analysis as opposed to

© Copyright 2020

OSIsoft, LLC