Dividing the Pizza
An Advanced Traffic Billing System
Christopher Lawrence Burke
The University of Queensland
Menu
- Design
- Inputs
- Process
- Outputs
Design - Overview
A short analysis was done of the existing mechanisms for collecting traffic statistics and processing them into information used for billing. The existing system relied on several Excel spreadsheets and a lot of manual processing, and produced results which were less than accurate. The following couple of slides show simplified versions of the existing and planned processes.
Design – The Process
[Diagram: data flow of the process. Labels shown: Customers and Peer Networkers; Proxies, Dialup Banks etc; Providers (e.g. Optus); Technical Contacts (other ISPs, APNIC); Information Technology Services; Data Collectors (four shown); Raw Usage Data (various formats); Standard Format Usage Data; Traffic Billing Rates; "Whois/DNS Data"; Bills; Specific Technical Data or Triggers; Periodic Usage Data or Aggregates.]
Design – The System
[Diagram: components of the system. Labels shown: Raw Data (15 min Blocks); Aggregate Processor; Trigger Processor; Aggregate Rules; Trigger Rules; Aggregate and Trigger Data; Report Writer; Networks; Who Is; Traffic Rates; Bill Writer; Customers; Aggregates; Triggers; Bills.]
Inputs – Sources
- Gateway Routers
- HTTP Proxy Logs
- Dial-in Quota Logs
- Mirror Logs
- Any chargeable traffic sink or source.
Inputs – Compression
- 200 flows/second
- Raw router data of 200 bytes per flow
- Around 1 GByte/day of data
- Several days of processing per month
The University of Queensland traffic collection from just the gateway router was so large that some form of customised compression was needed.
Inputs – Record Format
Field                                          Size
Source IP/Source Customer                      4 bytes
Destination IP/Destination Customer            4 bytes
Byte Count                                     4 bytes
Source Port                                    2 bytes
Destination Port                               2 bytes
Duration of Flow in Minutes                    2 bytes
Start of Flow in Minutes from start of file    4 bits
Protocol (e.g. UDP)                            2 bits
Source is IP/Destination is IP                 2 bits
Bit mask of 8 traffic types (e.g. international)  8 bits
The standard input data structure is a custom compressed format designed to allow a large quantity of data to be kept online. This 20-byte format can be compressed further … but this was thought sufficient for current requirements. The data structure assumes 15-minute blocks.
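As an illustration only, the 20-byte record could be packed and unpacked with Python's struct module as sketched below. The slides give only the field sizes, so the field names, the big-endian byte order and the ordering of the four bit fields inside the final 16 bits are all assumptions.

```python
import struct
from typing import NamedTuple

# Assumed layout: big-endian, three 32-bit fields, three 16-bit fields,
# then one 16-bit word holding the four bit fields (4 + 2 + 2 + 8 bits).
RECORD = struct.Struct(">IIIHHHH")  # 20 bytes, no padding


class FlowRecord(NamedTuple):
    src: int            # source IP or source customer id (4 bytes)
    dst: int            # destination IP or destination customer id (4 bytes)
    byte_count: int     # 4 bytes
    src_port: int       # 2 bytes
    dst_port: int       # 2 bytes
    duration_min: int   # duration of flow in minutes (2 bytes)
    start_min: int      # start minute within the block (4 bits)
    protocol: int       # protocol code (2 bits)
    addr_flags: int     # source is IP / destination is IP (2 bits)
    traffic_types: int  # bit mask of 8 traffic types (8 bits)


def pack_record(rec: FlowRecord) -> bytes:
    """Pack one record into 20 bytes, checking the bit-field ranges first."""
    if not (0 <= rec.start_min < 16 and 0 <= rec.protocol < 4
            and 0 <= rec.addr_flags < 4 and 0 <= rec.traffic_types < 256):
        raise ValueError("bit field out of range")
    bits = ((rec.start_min << 12) | (rec.protocol << 10)
            | (rec.addr_flags << 8) | rec.traffic_types)
    return RECORD.pack(rec.src, rec.dst, rec.byte_count, rec.src_port,
                       rec.dst_port, rec.duration_min, bits)


def unpack_record(data: bytes) -> FlowRecord:
    """Inverse of pack_record."""
    src, dst, byte_count, src_port, dst_port, duration, bits = RECORD.unpack(data)
    return FlowRecord(src, dst, byte_count, src_port, dst_port, duration,
                      (bits >> 12) & 0xF, (bits >> 10) & 0x3,
                      (bits >> 8) & 0x3, bits & 0xFF)
```

The 4-bit start field holds values 0 to 15, which is enough for a minute offset inside a 15-minute block.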
Inputs – Collectors
There should be one collector for each source or sink that is being monitored. The collectors are responsible for examining and translating the native format data (logs, router output) into 15-minute blocks of standard data format and feeding that data to the central processor.
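A minimal sketch of the bucketing step a collector might perform, assuming each native record has already been parsed into a (Unix timestamp, payload) pair. The function names and the input shape are hypothetical and not taken from the actual collectors.

```python
from collections import defaultdict

BLOCK_SECONDS = 15 * 60  # one standard block covers 15 minutes


def block_start(timestamp: int) -> int:
    """Return the Unix time at which the block containing timestamp begins."""
    return timestamp - (timestamp % BLOCK_SECONDS)


def group_into_blocks(records):
    """Group (timestamp, payload) pairs by 15-minute block.

    Each payload is stored with its minute offset inside the block (0-14),
    which fits the 4-bit start field of the standard record format.
    """
    blocks = defaultdict(list)
    for timestamp, payload in records:
        start = block_start(timestamp)
        minute_offset = (timestamp - start) // 60
        blocks[start].append((minute_offset, payload))
    return dict(blocks)
```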
Process - Overview
The process needs to analyse the input data. In theory this is a single process – which must run through a list of rules and answer the questions posed by those rules. The outputs are the answers to those questions.
Process – 5 W’s and a H
Who, What, Why, When, Where, How
The six universal questions
Process – When?
- A range of dates and times for which this rule is valid.
- How long a period should an aggregate cover, or how long should a trigger wait?
- How often should aggregates be sent, or how many trigger events should occur before someone is alerted?
Process – Where/Who?
- Source and destination IP address – list, range or net/mask.
- Source and destination customers for dynamically allocated address space.
- Source and destination port – list, range or name.
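The three address forms above (list, range, net/mask) could be matched with Python's standard ipaddress module, as in the sketch below. The textual syntax used here (commas for a list, a hyphen for a range, a slash for net/mask) is an assumption made for illustration, as is the function name; the slides do not define a rule syntax.

```python
from ipaddress import ip_address, ip_network


def make_address_matcher(spec: str):
    """Build a predicate from a list ("a, b"), a range ("a-b") or a net/mask ("a/m")."""
    spec = spec.strip()
    if "/" in spec:
        # Accepts both prefix length ("/16") and dotted mask ("/255.255.0.0").
        network = ip_network(spec, strict=False)
        return lambda addr: ip_address(addr) in network
    if "-" in spec:
        low_text, high_text = spec.split("-", 1)
        low = ip_address(low_text.strip())
        high = ip_address(high_text.strip())
        return lambda addr: low <= ip_address(addr) <= high
    allowed = {ip_address(part.strip()) for part in spec.split(",")}
    return lambda addr: ip_address(addr) in allowed
```

For example, `make_address_matcher("130.102.0.0/255.255.0.0")` returns a function that accepts any address inside that network.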
Process – What/How?
- What do we want to measure? Number of packets?
- How do we want to measure it? Sum of bytes?
- How do we want to trigger? Trigger on certain number of packets?
Process – Why?
- Why are we measuring or triggering?
- Are we aggregating for a customer? If so, which customer?
- Are we triggering for input into a monitoring system?
Process – Example Question
- Start, Stop Date & Time: e.g. 2-Jan-2001 to 2-Dec-2001
- Period of sample: e.g. 4 hourly
- Frequency of Message: e.g. every 6 periods
- Source/Destination: e.g. All 130.102.0.0
- Source Port/Dest Port: e.g. ftp All
- What to count: e.g. bytes
- How to count: e.g. sum
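The example question above can be sketched as a small rule object that filters flows and sums bytes per sample period. This is only one possible reading: it takes "All 130.102.0.0" to mean any source and a destination in 130.102.0.0, assumes a /16 mask for that network, and takes "ftp All" to mean source port 21 (the FTP control port) with any destination port. The class, field and function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass
class Rule:
    start: datetime
    stop: datetime
    period: timedelta          # period of sample
    periods_per_message: int   # frequency of message
    src_net: Optional[str]     # None stands for "All"
    dst_net: Optional[str]
    src_port: Optional[int]
    dst_port: Optional[int]
    what: str = "bytes"
    how: str = "sum"

    def matches(self, when, src, dst, src_port, dst_port) -> bool:
        """True if a flow falls inside the rule's dates, addresses and ports."""
        if not (self.start <= when <= self.stop):
            return False
        if self.src_net is not None and ip_address(src) not in ip_network(self.src_net):
            return False
        if self.dst_net is not None and ip_address(dst) not in ip_network(self.dst_net):
            return False
        if self.src_port is not None and src_port != self.src_port:
            return False
        if self.dst_port is not None and dst_port != self.dst_port:
            return False
        return True


def sum_bytes_per_period(rule, flows):
    """Sum byte counts of matching flows, keyed by the start of each sample period."""
    totals = defaultdict(int)
    for when, src, dst, src_port, dst_port, byte_count in flows:
        if rule.matches(when, src, dst, src_port, dst_port):
            index = (when - rule.start) // rule.period
            totals[rule.start + index * rule.period] += byte_count
    return dict(totals)


EXAMPLE_RULE = Rule(
    start=datetime(2001, 1, 2),
    stop=datetime(2001, 12, 2),
    period=timedelta(hours=4),
    periods_per_message=6,       # six 4-hour periods, i.e. one message per day
    src_net=None,
    dst_net="130.102.0.0/16",    # the /16 mask is an assumption
    src_port=21,
    dst_port=None,
)
```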
Outputs - Overview
There are three basic types of output produced by this system:
- Aggregation – Sums, averages or just counts over time.
- Triggers – Events that occur: too much traffic, too many connections etc.
- Billing – Aggregation post-processed to add customer detail and value.
Outputs - Aggregation
- Personal e-mails with usage data
- Departmental weekly statements of activity.
- Statistics and predictive analysis of trends in usage.
- Protocol usage (e.g. FTP vs HTTP)
- Service usage (e.g. how much use is the proxy getting)
Outputs - Triggers
- Department warnings of prospective overuse.
- Warnings within 90 minutes of growing usage of particular ports or IP addresses (possible DoS attack developing).
- Triggering can be done on the aggregate outputs, allowing warnings if daily usage is increasing beyond certain parameters.
- Trigger on irresponsible or unacceptable usage by an individual.
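Triggering on aggregate outputs could look something like the sketch below, which flags days whose total grows too quickly relative to the previous day. The growth factor of 1.5 and the function name are purely illustrative assumptions; the slides do not say which parameters the real trigger rules use.

```python
def growth_triggers(daily_totals, max_growth=1.5):
    """Return the indexes of days whose total exceeds max_growth times the previous day's."""
    alerts = []
    for day in range(1, len(daily_totals)):
        previous = daily_totals[day - 1]
        current = daily_totals[day]
        # Skip days that follow a zero total, since growth is undefined there.
        if previous > 0 and current > previous * max_growth:
            alerts.append(day)
    return alerts
```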
Outputs – Billing
- Post-processing of aggregate data, combining network structure and ownership with traffic cost and optional commercial markup.
- Optionally, automatically e-mail, fax and/or print bills based on billing periods.
- Half-period warning bills to prepare customers for likely costs.
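The core of the billing step, combining an aggregate with a traffic cost and an optional markup, might reduce to arithmetic like the following sketch. The per-megabyte pricing model, the decimal megabyte and the function name are assumptions for illustration; no actual rates are given in the slides.

```python
from decimal import Decimal, ROUND_HALF_UP

BYTES_PER_MB = 1_000_000  # assumed decimal megabytes


def bill_amount(aggregate_bytes: int, rate_per_mb: Decimal,
                markup: Decimal = Decimal("0")) -> Decimal:
    """Traffic cost plus optional commercial markup, rounded to cents."""
    megabytes = Decimal(aggregate_bytes) / BYTES_PER_MB
    cost = megabytes * rate_per_mb
    total = cost * (Decimal("1") + markup)
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Under these assumptions, 250 MB at a hypothetical rate of 0.10 per MB with a 20% markup gives `bill_amount(250_000_000, Decimal("0.10"), Decimal("0.20"))`, which evaluates to `Decimal("30.00")`. Decimal is used instead of float so that currency rounding is exact.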
Conclusion
- Currently the router collector and part of the aggregate rules processor are working.
- Although much of this system is still in the pipeline, the overall structure is very modular, allowing each new step achieved to give an immediate improvement on the existing system.
- Around 3 (wo)man-months of work remain to complete much of what is presented here.