  • www.wiredcity.com Page 1 of 36

    Cisco NetFlow Reporting Instruction Manual

    Version 1.0

    WiredCity
    777 Davis St, Suite 250
    San Leandro CA 94577
    Ph: +1 510 297 5874
    Fax: +1 510-357-8136

    [email protected] www.wiredcity.com

    TABLE OF CONTENTS

    1 Introduction
    2 Audience
    3 System Requirements
    4 Reference Manuals
    5 NetFlow Totals
        Example 1: Calculating Traffic Volumes
        Example 2: Calculating Traffic Breakdowns
        Example 3: Creating Reusable Breakdown Reports
        5.1.1 Example 4: Incorporating Time of Day in Traffic Reports
    6 Reporting on NetFlow Details
        Example 5: Outputting NetFlow Details
        Example 6: Host-Host Communication Matrix using PivotTables
        Example 7: Identifying Top-N Talkers using PivotCharts
        Example 8: Server-side Data Filtering
    7 PI-Datalink Tips and Tricks
        7.1 Calculation Basis
        7.2 Scaling Results
        7.3 Excel Time Functions
        7.4 Lookup Tables

    1 Introduction

    IT Monitor collects Cisco NetFlow records that provide valuable information about how users and applications utilize network infrastructure. Fast, accurate reporting on this data allows network administrators to:

      • Accurately plan network capacity upgrades
      • Verify acceptable use policy compliance by identifying unauthorized users, applications and protocols
      • Account for asset utilization
      • Design application deployments that use expensive resources (such as bandwidth) more effectively and efficiently

    Reporting of NetFlow data is an ongoing challenge because of the volumes of NetFlow records that can be produced. The purpose of this document is to show how NetFlow data can be more effectively managed and reported using IT Monitor by:

      • Utilizing PI-Datalink to create fast, re-usable traffic reports
      • Combining PI-Datalink and Excel PivotCharts for Top-N reports
      • Optimizing reporting speed using best practices

    2 Audience

    The intended audience is IT Monitor users who have deployed the NetFlow interface and wish to produce reports on the data that is collected. An understanding of NetFlow interface tag configuration is required.

    3 System Requirements

      • PI Server 3.3 and later
      • NetFlow Interface to the PI System 1.0 and later
      • PI-Datalink 3.0 and later
      • Microsoft Excel 97 and later
      • PI-SMT Module Database Builder plug-in 3.1 and later (optional)

    4 Reference Manuals

      • Cisco NetFlow (http://www.cisco.com/go/netflow)
      • OSIsoft NetFlow Interface to the PI System User Manual
      • OSIsoft PI-Datalink User Manual
      • Microsoft Excel User Manual

    5 NetFlow Totals

    The PI-NetFlow Interface records network traffic into OSIsoft's PI Server, filtered by any of the following (see footnote 1):

      • source IP address or hostname
      • destination IP address or hostname
      • source MAC (media access control) address
      • destination MAC address
      • source IP port
      • destination IP port
      • transport protocol (e.g., ICMP, UDP, or TCP)
      • type of service
      • network interface
      • autonomous system (BGP)

    Traffic volumes are periodically stored in PI (the default setting is once per minute). The total traffic for a period of time can be measured by summing the values that fall within that period. This is referred to as an Event-weighted total. For this to work, compression should be disabled on NetFlow tags, because the data stored to them consists of discrete values. Event-weighted totals can be calculated on the PI Server using the PI-SDK summary function call. This is a very efficient reporting mechanism because only the result is transmitted to the client: the request is processed on the PI Server itself, so Event-weighted totals of NetFlow tags can be calculated quickly, with minimal impact on the user's computer or the network. Reports that use Event-weighted totals provide:

      • Low data volume required in PI. A NetFlow tag using the default write interval requires less than 5 MB to store a year's data
      • Fast response, because a small number of values are required to produce reports
      • Bounded network overhead between the client and server, regardless of the time period the report covers
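As a rough illustration of what an Event-weighted total computes, the following Python sketch sums discrete samples that fall inside a time window. It only mirrors the arithmetic; it does not call the PI Server or the PI-SDK. The function name, the sample data and the choice to include the start time but exclude the end time are assumptions made for this example.

```python
from datetime import datetime, timedelta


def event_weighted_total(events, start, end):
    """Sum the values of all events with start <= timestamp < end."""
    return sum(value for ts, value in events if start <= ts < end)


# Hypothetical data: one sample per minute for a day, 100 units each,
# standing in for a tag such as NF_AllTraffic.
t0 = datetime(2024, 1, 1)
events = [(t0 + timedelta(minutes=i), 100) for i in range(1440)]

day_total = event_weighted_total(events, t0, t0 + timedelta(days=1))
hour_total = event_weighted_total(events, t0, t0 + timedelta(hours=1))
print(day_total, hour_total)
```

Because each stored value is a discrete count for its interval, a plain sum is the correct total; no interpolation between samples is involved.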

    Outlined below are examples of how to use event weighted totals to:

      • Calculate traffic volumes
      • Calculate traffic breakdowns
      • Create reusable breakdown reports
      • Incorporate time of day in traffic reports

    1 Complete NetFlow records can also be collected by the NetFlow interface, as discussed in Section 6.

    Example 1: Calculating Traffic Volumes

    A simple NetFlow report can represent the volume of traffic over a nominated period of time, shown as either a single value or a trend. For example, consider a tag called NF_AllTraffic that captures all NetFlow records. The total traffic over a one-day period can be calculated using an Event-weighted total. This is achieved by using the Advanced Calculated Data function in PI-Datalink 3.0 (Figure 1).

    Figure 1. Advanced Calculated Data feature in the PI-Datalink menu.

    To get the total traffic over a one-day period, the following features in the Advanced Calculated Data window need to be configured (Figure 2):

      • Tagname (linked from Spreadsheet)
      • PI Server (linked from Spreadsheet)
      • Start Time (linked from Spreadsheet)
      • End Time (linked from Spreadsheet)
      • Calc. Mode (total)
      • Calculation Basis (event-weighted)

    Figure 2. Configuration of Advanced Calculated Data.

    Once you have clicked OK you will see the total number of traffic flows (Figure 3). This total can be used as an input to other Excel functions.

    Figure 3. Result from the Advanced Calculated Data function.

    Time intervals can also be specified. For example, to determine how the traffic varied over each hour of the day, you specify a Time Interval of 1 hour. By checking the Show Timestamps box you will also be able to display which hour each value corresponds to.

    Figure 4. Using Advanced Calculated Data with a 1 hour interval to produce hourly traffic totals.

    By using Excel graphs, you can view the daily fluctuations. For example, the X-Y scatter plot has been used to demonstrate this in Figure 5.
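The effect of a 1 hour Time Interval can be sketched in Python as grouping samples by the hour they fall in and summing each group. This is an illustration of the idea only, not of how PI-Datalink is implemented; the function name and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_totals(events):
    """Group (timestamp, value) events by hour and sum each group."""
    totals = defaultdict(int)
    for ts, value in events:
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        totals[hour_start] += value
    return dict(sorted(totals.items()))


# Hypothetical data: 10 units per minute, rising to 50 between 09:00 and 10:00.
t0 = datetime(2024, 1, 1)
events = [
    (t0 + timedelta(minutes=i), 50 if 540 <= i < 600 else 10)
    for i in range(1440)
]

totals = hourly_totals(events)
for hour, total in totals.items():
    print(hour.strftime("%H:%M"), total)
```

Each output row pairs an hour start with its total, which corresponds to the timestamp and value columns produced when Show Timestamps is checked.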

    Figure 5. X-Y scatter plot graph of hourly traffic totals.

    Example 2: Calculating Traffic Breakdowns

    Richer reports showing the breakdown of traffic volumes can be created using multiple NetFlow tags that exploit the additive property of network traffic. Using the same procedure outlined in Example 1, the Event-weighted total can be calculated for the following NetFlow tags over a one-day period to produce the values shown in Figure 6.

      • NF_AllTraffic (captures all traffic)
      • NF_HTTP (captures traffic to or from TCP port 80)
      • NF_SMTP (captures traffic to or from TCP port 25)
      • NF_PI (captures traffic to or from TCP ports 545 and 5450)

    Figure 6. Event-weighted totals of all traffic, http, smtp and PI.

    Each tag represents a subset of NF_AllTraffic, so you can determine a percentage breakdown of traffic by protocol. Unaccounted ("other") traffic can also be shown by subtracting the totals of all accounted protocols from the total traffic. The results of these calculations are shown in Figure 7.

    Figure 7. Percentages and other traffic calculated using Excel formulas.
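The percentage and "other" calculations amount to simple arithmetic on the Event-weighted totals. A minimal Python sketch follows; the totals are made-up numbers, not values from the figures, and the function name is hypothetical.

```python
def protocol_breakdown(all_traffic, by_protocol):
    """Return the percentage share per protocol plus the unaccounted 'Other' share."""
    other = all_traffic - sum(by_protocol.values())
    shares = {name: 100.0 * value / all_traffic for name, value in by_protocol.items()}
    shares["Other"] = 100.0 * other / all_traffic
    return shares


# Hypothetical one-day totals for NF_AllTraffic and its subset tags.
all_traffic = 200000
by_protocol = {"HTTP": 90000, "SMTP": 50000, "PI": 20000}

shares = protocol_breakdown(all_traffic, by_protocol)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

The subtraction is only valid because every tag captures a non-overlapping subset of NF_AllTraffic; overlapping tag filters would double count and could make "Other" negative.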

    A breakdown of the link's utilization (by protocol) can be presented in an Excel pie chart (Figure 8). The "other" measurement is an important indicator of unexpected traffic, although its share should be kept small. It is recommended that new tags capturing additional protocols be introduced, to give you a more accurate picture of your network traffic. You can use NetFlow detail records (Section 6) to help discover what these new protocols would be.

    Figure 8. Pie chart represents the protocol breakdown.

    A protocol breakdown by time series can be created by adding a Time Interval into the Advanced Calculated Data function. By checking the row(s) function the data will be presented across the page (Figure 9).

    Figure 9. Creating a time interval of 1 hour.

    This data can be presented in a stacked area chart as shown in Figure 10. This report is useful for determining a baseline of protocol utilization. For example, the chart demonstrates that an increase in network traffic actually resulted from an increase in SMTP flows.

    Figure 10. Stacked area chart representing the time-series protocol breakdown.

    To recap, we have shown how tags can be used to separate NetFlow data into multiple, recognizable protocols. Other useful breakdowns can also be created by configuring tags, including:

      • Link usage by server
      • Protocol usage for a specific server
      • Interface usage for a specific router
      • Type of Service (TOS) usage for a specific interface

    Example 3: Creating Reusable Breakdown Reports

    The previous example showed how to use NetFlow tags to report a breakdown of traffic. It can be useful to create this type of report as a template that can be customized for each of the breakdowns listed above. A generic report can be created that is reusable for each data/tag source, and the PI Module Database is used to build it (see footnote 2). For example, the following tags can be used to produce an application breakdown for Server1 and Server2:

      • NF_Server1_AllTraffic
      • NF_Server1_HTTP
      • NF_Server1_SMTP
      • NF_Server1_PI
      • NF_Server2_AllTraffic
      • NF_Server2_HTTP
      • NF_Server2_SMTP
      • NF_Server2_PI

    The tags are organized as aliases in the Module Database hierarchy:

    2 Further information on how to use and configure the PI module database can be found in the PI SDK manual

    The Alias/Property function can be used to look up an Alias for a specific Module Path. This is shown below.

    Figure 11. Configuration of Alias/Property function to resolve an Alias for a Module Path.

    The results of the generic report are shown for Server1 in Figure 12.

    Figure 12. Generic report uses a Module Path as the context.

    A new Module Path can be set by using the Module Database function (Figure 13).

    Figure 13. Creating a new module path for the Module Database.

    5.1.1 Example 4: Incorporating Time of Day in Traffic Reports

    Some reporting scenarios should only include traffic occurring during a specific window of time. For example, bandwidth may be charged at different rates during the day according to peak and off-peak hours. In this instance, a report should present two values representing the usage during these different time periods. This can be implemented using a Filter Expression (Figure 14).

    Figure 14. Using a Filter Expression in PI-Datalink.

    Consider a tag called NF_AllTraffic which captures all NetFlow records. The total traffic during business hours over the last 30 days can be calculated using the Filter Expression '*' >= 't+9h' and '*' < 't+17h'. Excel treats a leading single quote as a text marker and removes it, so start the filter expression with either two single quotes ('') or a leading space.
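The logic of the business-hours filter can be sketched in Python: only samples whose time of day is at or after 09:00 and before 17:00 contribute to the total. This mirrors the filter's intent on the client side for illustration; in the actual report the PI Server evaluates the Filter Expression. The function name and the sample data are hypothetical.

```python
from datetime import datetime, time, timedelta


def business_hours_total(events, open_at=time(9, 0), close_at=time(17, 0)):
    """Sum values whose timestamp falls in [open_at, close_at) on any day."""
    return sum(value for ts, value in events if open_at <= ts.time() < close_at)


# Hypothetical data: one sample per hour for two days, 1 unit each.
t0 = datetime(2024, 1, 1)
events = [(t0 + timedelta(hours=h), 1) for h in range(48)]

peak = business_hours_total(events)
off_peak = sum(value for _, value in events) - peak
print(peak, off_peak)
```

The off-peak figure is obtained by subtraction from the unfiltered total, which avoids writing a second, inverted filter.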

    Figure 15. Filter expression used to limit calculation data to business hours values.

    6 Reporting on NetFlow Details

    In addition to traffic volumes, the following details of NetFlow records that meet the filtering criteria are stored by the PI-NetFlow interface into PI:

      • source IP address
      • source machine name
      • destination IP address
      • destination machine name
      • source IP port
      • destination IP port
      • transport protocol (e.g., ICMP, UDP, or TCP)
      • number of bytes

    While the NetFlow details provide much richer reports, they have the following disadvantages:

      • Large volumes of data are stored in PI, and each NetFlow record requires at least 48 bytes. The arrival rate can potentially exceed thousands of records per second, so storing all NetFlow records indefinitely may not be feasible for an enterprise network
      • All data is post-processed on the client, which can slow report generation
      • All values must be transmitted to the client, resulting in high network overhead

    As such, it is recommended that detail reports be applied to shorter time periods (for example while troubleshooting).
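To make the storage concern concrete, the following Python sketch estimates detail storage from the 48 byte minimum record size stated above. The arrival rate of 1000 records per second is an assumed figure chosen for illustration, and the function name is hypothetical.

```python
BYTES_PER_RECORD = 48  # minimum record size quoted in this section


def detail_storage_bytes(records_per_second, seconds):
    """Lower-bound storage needed for NetFlow detail records."""
    return records_per_second * BYTES_PER_RECORD * seconds


# Assumed arrival rate of 1000 records per second, sustained for one day.
per_day = detail_storage_bytes(1000, 24 * 60 * 60)
print(f"{per_day / 1e9:.2f} GB per day")
```

At that assumed rate the lower bound is roughly 4 GB per day, which is why detail reports are better suited to short troubleshooting windows than to long-term retention.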

    Example 5: Outputting NetFlow Details

    A simple report using NetFlow details involves outputting raw NetFlow records. PI-Datalink 3.0 provides two functions for importing raw data into a spreadsheet: Compressed Data (Start Time/Number) and Compressed Data (Start Time/End Time).

    Figure 16. Importing raw data using the Compressed Data function.

    Compressed Data (Start Time/Number) returns a fixed number of values and Compressed Data (Start Time/End Time) returns all values over a time range. Consider a tag called NF_AllTraffic which captures all NetFlow records and is configured to store detail records into the following tags:

      • NF_AllTraffic_detail_octet
      • NF_AllTraffic_detail_prot
      • NF_AllTraffic_detail_srcIP
      • NF_AllTraffic_detail_srcPort
      • NF_AllTraffic_detail_destIP
      • NF_AllTraffic_detail_destPort

    The last ten records can be extracted using the Compressed Data (Start Time/Number) function by configuring the following:

      • Tagname (linked from Spreadsheet)
      • PI Server (linked from Spreadsheet)
      • Start Time (linked from Spreadsheet)
      • Number of Values (linked from Spreadsheet). A negative value indicates the records should be retrieved backwards in time from the Start Time.
      • Check the Show Timestamps box

    Figure 17. Configuration of Compressed Data (Start Time/Number).

    This same feature can be used to get values from the other NetFlow detail tags. In this case, the Show Timestamps box does not need to be checked, as these values share the same timestamps as the first column (see below).

    Figure 18. The last ten NetFlow records are reconstructed in PI-Datalink.
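Reconstructing records from the separate detail tags relies on the values lining up row by row under a shared timestamp. A minimal Python sketch of that alignment is shown below; the function name, column names and values are hypothetical and stand in for the columns imported into the spreadsheet.

```python
def reconstruct_records(timestamps, **columns):
    """Combine parallel per-tag value lists into one dict per NetFlow record."""
    lengths = {len(values) for values in columns.values()} | {len(timestamps)}
    if len(lengths) != 1:
        raise ValueError("all columns must have the same number of values")
    names = list(columns)
    return [
        {"time": ts, **dict(zip(names, row))}
        for ts, row in zip(timestamps, zip(*columns.values()))
    ]


# Hypothetical values as they might be imported from three detail tags.
records = reconstruct_records(
    ["10:00:01", "10:00:02"],
    octet=[1500, 400],
    prot=[6, 17],
    srcIP=["10.0.0.1", "10.0.0.2"],
)
for rec in records:
    print(rec)
```

The length check guards against the case where one tag returns fewer values than the others, which would otherwise silently misalign the rows.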

    Reading through large volumes of NetFlow records can be tedious. Excel's AutoFilter function can be used to reduce the data that is displayed. To add filter drop-down menus, click Row 6 (the heading row) and select AutoFilter from the Data / Filter submenu. The lists of filter options are determined automatically (Figure 19).

    Figure 19. Using Auto-filter to sort data.

    Example 6: Host-Host Communication Matrix using PivotTables

    If NetFlow is stored in a LAN/PacketCapture scenario, all host-host communications can be stored in NetFlow detail records. Excel PivotTables can be used to display the Host-Host Communication Matrix. NetFlow records need to be imported into Excel so they can be used as source data (as demonstrated in Example 5). Two worksheets are created, one named Report and another named Data. Using the Compressed Data (Start Time/Number) feature, data is imported as a bounded number of records into the Data worksheet (Figure 20).

    Figure 20. Data worksheet contains the last 1000 records.

    A PivotTable can then be added to the Report worksheet by selecting PivotTable and PivotChart Report from the Data menu.

    Figure 21. A PivotTable is configured on the Report worksheet.

    The PivotTable can be configured for the Host-Host Communication Matrix using the PivotTable and PivotChart Wizard. The source data for the PivotTable is contained in the Data worksheet, so select the Microsoft Office Excel list or database option, and select PivotTable as the kind of report you want to create.

    Figure 22. The PivotTable uses Excel data for the data source.

    Select the cell range of the NetFlow records on the Data worksheet to use as the source data.

    Figure 23. The PivotTable data source is the 1000 records found on the Data worksheet.

    You can specify whether the PivotTable should be added to a new worksheet or an existing one. The PivotTable often needs several rows of space above it for Page Fields so when adding to an existing worksheet, select a position several rows down.

    Figure 24. The PivotTable is created on the Report worksheet.

    The completed PivotTable is initially blank (Figure 25). Fields need to be dragged to the appropriate locations. To create the Host-Host Communication Matrix, add the following fields to the Pivot table using the Add to button. Figure 26 demonstrates what the Pivot table will look like.

      Area          Fields
      Page Area     Protocol, Src Port, Dest Port
      Column Area   Dest IP
      Row Area      Src IP
      Data Area     Octets

    Figure 25. The blank PivotTable needs fields added.

    Figure 26. PivotTable with the appropriate fields added.

    The default configuration for the Octets field was a Count of Octets. This represents the number of flows rather than the volume of traffic. To determine the volume of traffic, right-click the Octets field and select Field Settings. Select Sum from the Summarize by list as shown in Figure 27.

    Figure 27. The Octets field is configured to represent Sum of Octets.

    The configuration of the Host-Host Communication Matrix is now complete. You can use the Freeze Panes feature in Excel to make navigation of the matrix more user-friendly. (For example, click on cell B11 and select Freeze Panes from the Window menu). The Source and Destination IPs will always be shown as row and column headers.
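What the PivotTable computes can be sketched in Python as a nested aggregation: rows are source IPs, columns are destination IPs, and each cell is either a count of flows or a sum of octets. The record field names and values below are hypothetical; the sketch illustrates why Sum rather than Count gives traffic volume.

```python
from collections import defaultdict


def host_matrix(records, summarize="sum"):
    """Aggregate octets per (source IP, destination IP) pair.

    summarize="sum" adds up octets (traffic volume);
    summarize="count" counts records (number of flows).
    """
    matrix = defaultdict(lambda: defaultdict(int))
    for rec in records:
        increment = rec["octets"] if summarize == "sum" else 1
        matrix[rec["src"]][rec["dst"]] += increment
    return {src: dict(dsts) for src, dsts in matrix.items()}


# Hypothetical detail records.
records = [
    {"src": "10.0.0.1", "dst": "10.0.0.9", "octets": 1500},
    {"src": "10.0.0.1", "dst": "10.0.0.9", "octets": 500},
    {"src": "10.0.0.2", "dst": "10.0.0.9", "octets": 40},
]

volume = host_matrix(records, "sum")
flows = host_matrix(records, "count")
print(volume)
print(flows)
```

For the first host pair the count is 2 flows while the sum is 2000 octets, which shows how the default Count setting would understate or misrepresent heavy talkers.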

    Figure 28. The completed Host-Host Communication Matrix using Freeze Panes for easy navigation.

    The fields in the page area (i.e., Protocol, Src Port and Dest Port) filter the data, restricting the matrix to show only communications that use a particular protocol or port. The options available when filtering by protocol are demonstrated in Figure 29. The numbers correspond to the following protocols: 1: ICMP, 6: TCP, 17: UDP.

    Figure 29. Filter the Host-Host Matrix data by Protocol, Source Port and Destination Port.

    Section 7.4 describes how the numbers can be replaced with the actual protocol names. The Row and Column fields can also exclude values from the PivotTable; to do this, deselect the IP addresses that you do not want included.

    Figure 30. Specific IP Addresses can be excluded from the Host-Host Matrix.

    Example 7: Identifying Top-N Talkers using PivotCharts

    NetFlow reporting can identify the top-10 users or top-10 protocols on a network and Excel PivotCharts can be used to create powerful drill-in reports that return this data. A Top-N report can be created using the same procedures as the Host-Host Communication Report. Using the same Data worksheet, select a PivotChart report (with PivotTable report) as the kind of report you want to generate.

    Figure 31. The PivotChart report is configured to use the NetFlow data.

    Again, the completed PivotChart is initially blank (Figure 32). Fields need to be dragged to the appropriate locations. For a Top-N Protocols Report, add the following fields:

      Area            Fields
      Page Area       (none)
      Data Area       Octets
      Category Axis   Protocol
      Series Axis     (none)

    Figure 32. The blank PivotChart needs fields added.

    As in the previous example, the default configuration for the Octets field was a Count of Octets, which represents the number of flows rather than the volume of traffic. To determine the volume of traffic, right-click the Octets field and select Format PivotChart Field. Select Sum from the Summarize by list as shown in Figure 33.

    Figure 33. The Octets field is configured to represent Sum of Octets.

    The protocols are listed along the bottom of the PivotChart in numeric order. When charting a large number of Ports or IP Addresses, this is not very practical. You can modify the protocol order and place a limit on the number of Protocols displayed. To change the sort and limit, right-click on the Protocol field, select Format PivotChart Field and click Advanced. To limit the protocols to the Top 10, select the Descending button in AutoSort options, set Using field to Sum of Octets, and select On in Top 10 AutoShow. This configuration is shown below in Figure 34.
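The combination of descending AutoSort on Sum of Octets and Top 10 AutoShow corresponds to a standard top-N aggregation. A short Python sketch of the same idea, using hypothetical field names and records:

```python
from collections import Counter


def top_n(records, key, n=10):
    """Sum octets per value of `key` and return the n largest, descending."""
    totals = Counter()
    for rec in records:
        totals[rec[key]] += rec["octets"]
    return totals.most_common(n)


# Hypothetical detail records; prot uses IP protocol numbers (1, 6, 17).
records = [
    {"prot": 6, "srcIP": "10.0.0.1", "octets": 1500},
    {"prot": 17, "srcIP": "10.0.0.2", "octets": 800},
    {"prot": 6, "srcIP": "10.0.0.3", "octets": 700},
    {"prot": 1, "srcIP": "10.0.0.2", "octets": 64},
]

top_protocols = top_n(records, "prot", n=2)
top_sources = top_n(records, "srcIP", n=1)
print(top_protocols)
print(top_sources)
```

Changing the `key` argument from the protocol to the source IP gives the same kind of regrouping that the PivotChart offers when a different field is placed on the category axis.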

    Figure 34. Protocol Field configured to only display the Top 10 Protocols.

    This PivotChart allows you to quickly and easily identify that the top protocol on the network is TCP, followed closely by UDP. ICMP is insignificant in comparison.

    Figure 35. Top-10 Protocols Report.

    The most powerful feature of PivotCharts is the ability to drill down into data. You can drill into a protocol and identify which IP Addresses are using it. To identify the users of ICMP traffic, right-click on the x-axis legend text for the protocol and select Show Detail 1. The detail required should be selected from the resulting list. Selecting the Src IP will show the IP Address that the ICMP traffic originated from.

    Figure 36. Drill into the ICMP protocol to identify the source.

    The IP Addresses responsible for all ICMP traffic are shown. You can continue to use the Show Detail feature to further refine and drill with the different levels of data.

    Figure 37. Results of Drill-in operation.

    Example 8: Server-side Data Filtering

    Reports on NetFlow details rely on large volumes of data being imported into Excel for processing; however, Excel is limited to 65536 rows of data. Often, not all NetFlow records are needed in a report, so it makes sense to filter the unneeded ones out at the server. Filter Expressions were first introduced in Example 4 to perform server-side filtering based upon the time of day. This example uses a Filter Expression to correlate values on the server. The NetFlow detail import techniques shown in Example 5 are extended to import only details matching certain criteria. The NetFlow details report can be restricted to TCP flows by configuring all the import functions to use the following filter expression:

      'NF_AllTraffic_detail_prot' = 6

    The Compressed Data (Start Time/Number) function, using the above filter expression, is shown below.

    Figure 38. The data import functions are configured to use a filter expression.

    The results of the import functions are shown below in Figure 39.

    Figure 39. Only NetFlow details where the protocol = 6 (TCP) are imported.

    Filter expressions can also be used to perform server side querying of NetFlow details. For example, to determine the total volume of HTTP traffic, calculate an Event-weighted Total for NF_AllTraffic_detail_octet with the following filter expression:

    'NF_AllTraffic_detail_prot' = 6 AND ('NF_AllTraffic_detail_srcPort' = 80 OR 'NF_AllTraffic_detail_destPort' = 80)

    The Advanced Calculated Data function using the above filter expression is shown in Figure 40. Excel cells used in formulas are limited to 255 characters, so filters too complex to fit within that limit must be implemented in the NetFlow tag configuration instead.

    Figure 40. Total HTTP traffic is calculated on the server by using a filter expression.
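The HTTP filter expression combines a protocol test with a port test on either side of the flow. The Python sketch below applies the same predicate to a handful of hypothetical records to show which flows would contribute to the total. It runs on the client purely for illustration; in the report the PI Server evaluates the expression and returns only the result.

```python
def is_http(rec):
    """Mirror of the filter: TCP (protocol 6) with source or destination port 80."""
    return rec["prot"] == 6 and (rec["srcPort"] == 80 or rec["destPort"] == 80)


# Hypothetical detail records keyed by the suffixes of the detail tag names.
records = [
    {"prot": 6, "srcPort": 51000, "destPort": 80, "octet": 1200},   # HTTP request
    {"prot": 6, "srcPort": 80, "destPort": 51000, "octet": 9000},   # HTTP response
    {"prot": 17, "srcPort": 53000, "destPort": 80, "octet": 300},   # UDP, excluded
    {"prot": 6, "srcPort": 52000, "destPort": 25, "octet": 700},    # SMTP, excluded
]

http_octets = sum(rec["octet"] for rec in records if is_http(rec))
print(http_octets)
```

Note the parentheses around the two port tests: without them, AND would bind more tightly than OR in most expression languages and non-TCP flows to port 80 could slip through.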

    7 PI-Datalink Tips and Tricks

    7.1 Calculation Basis

    Always use an Event-Weighted Calculation Basis when working with NetFlow tags, because their values are stored as discrete values. A Time-Weighted Calculation Basis will interpret tag values as time-normalized data and will produce incorrect results.

    7.2 Scaling Results

    NetFlow tags typically store traffic volumes in Bytes. Most PI-Datalink functions support a Conversion Factor for scaling results. The following table shows some common traffic volume conversion factors.

      Unit        Conversion Factor
      Kilobyte    0.0009766
      Megabyte    9.53674E-07
      Gigabyte    9.31323E-10
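The factors in the table are the reciprocals of powers of 1024, rounded for display. A small Python sketch that derives them and applies one to a byte total is shown below; the byte total is a made-up value and the function name is hypothetical.

```python
# Conversion factors as reciprocals of powers of 1024 (binary units).
FACTORS = {
    "Kilobyte": 1 / 1024,
    "Megabyte": 1 / 1024**2,
    "Gigabyte": 1 / 1024**3,
}


def scale(total_bytes, unit):
    """Multiply a byte total by the conversion factor for the given unit."""
    return total_bytes * FACTORS[unit]


# Hypothetical event-weighted total of 5 * 1024**3 bytes.
total_bytes = 5368709120
print(scale(total_bytes, "Gigabyte"))
for unit, factor in FACTORS.items():
    print(unit, factor)
```

Multiplying by a factor rather than dividing by 1024 matches how the Conversion Factor field is applied to the result of a PI-Datalink function.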

    7.3 Excel Time Functions

    Excel date and time functions are preferable to the PI time functions in NetFlow reports because a PI relative time such as * may change between the execution of one function and another. This may not be a problem with fast reporting functions, but it can be an issue with slower functions such as large data imports. Excel supports relative time functions such as Now() and Today(). In Excel's default date system, dates are represented as the number of days since the start of 1900, and time is represented as a fraction of a day. The following table shows how to implement some common relative time functions.

      Relative Time   Excel Function
      Now             =Now()
      Today           =Today()
      Yesterday       =Today() - 1
      1 Hour ago      =Now() - 1/24

    Pressing F9 causes the time functions to re-calculate and refreshes the PI-Datalink functions.
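The day-fraction arithmetic behind these formulas can be sketched in Python. The sketch assumes Excel's default 1900 date system, for which 30 December 1899 works as day zero for any date after February 1900; the function name and the fixed sample time are hypothetical.

```python
from datetime import datetime, timedelta

# Day zero that reproduces Excel serial numbers in the default 1900 date
# system for dates after February 1900 (an assumption of this sketch).
EXCEL_EPOCH = datetime(1899, 12, 30)


def to_excel_serial(dt):
    """Convert a datetime to an Excel serial: whole days plus fraction of a day."""
    delta = dt - EXCEL_EPOCH
    return delta.days + delta.seconds / 86400


now = datetime(2024, 1, 1, 12, 0)              # fixed sample time
today = now.replace(hour=0, minute=0)

print(to_excel_serial(now))                    # like =Now()
print(to_excel_serial(today) - 1)              # like =Today() - 1
print(to_excel_serial(now) - 1 / 24)           # like =Now() - 1/24
```

Because whole numbers are days and fractions are parts of a day, subtracting 1 moves back one day and subtracting 1/24 moves back one hour.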

    7.4 Lookup Tables

    The TCP/IP fields in NetFlow details are stored as raw numbers. A lookup table can be used to convert these to meaningful names. The protocol number is converted to the protocol name using a simple table and the VLOOKUP function (see Figure 41).

    Figure 41. Protocol lookup table accessed using the VLOOKUP function.
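The lookup idea is the same as a dictionary lookup in most programming languages. A minimal Python sketch using the three protocol numbers mentioned in this manual follows; the fallback to the raw number for unknown protocols is a design choice of this sketch, not something the manual specifies.

```python
# Protocol numbers named in this manual; extend the table as needed.
PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP"}


def protocol_name(number):
    """Return the protocol name, or the number as text if it is not in the table."""
    return PROTOCOL_NAMES.get(number, str(number))


for number in (1, 6, 17, 47):
    print(number, protocol_name(number))
```

Falling back to the raw number keeps unknown protocols visible in a report instead of producing an error value, which is worth replicating in the spreadsheet if the lookup table is incomplete.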