See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/336770545 Spatial Challenges of Maritime Risk Analysis Using Big Data Conference Paper · October 2019 CITATIONS 0 READS 69 3 authors: Some of the authors of this publication are also working on these related projects: eVACUATE View project Transforming Transport View project Andrew Rawson University of Southampton 7 PUBLICATIONS 21 CITATIONS SEE PROFILE Gianluca Correndo University of Southampton 19 PUBLICATIONS 58 CITATIONS SEE PROFILE Zoheir Sabeur Bournemouth University 115 PUBLICATIONS 552 CITATIONS SEE PROFILE All content following this page was uploaded by Zoheir Sabeur on 30 October 2019. The user has requested enhancement of the downloaded file.


1 INTRODUCTION

A multitude of models have been developed to attempt to quantify the likelihood of an incident. The variability in these approaches reflects the variety of causes which can lead to an incident, including human error, adverse metocean conditions, mechanical failure and a plethora of other factors (Mazaheri et al. 2015).

It can be reasoned that, at the very least, an incident requires the presence of a vessel. By extension, all other factors being equal, more vessel transits in an area would equate to more incidents. This requires a method to calculate a measure of vessel activity, most commonly using data from the Automatic Identification System (AIS).

Due to the significant volume of AIS data generated, it has been argued that it poses the computational challenges associated with big data (AbuAlhaol et al. 2018). To meet these challenges, many authors perform risk analysis by aggregating the AIS data in some way. This might be into a traffic flow to perform geometric analysis (Pedersen, 1995; Friis-Hansen, 2008) or into a grid system to perform spatial statistics by leveraging big data technologies (AbuAlhaol et al. 2018; Filipiak et al. 2018). By binning AIS data into discrete grid cells, it is possible to efficiently combine vessel traffic data with other datasets such as metocean, bathymetry or regulatory requirements to perform data mining tasks and intelligent big data analytics to assess navigational safety.

Whilst this aggregation is useful to overcome challenges associated with the magnitude of the dataset, it has the potential to introduce errors into assessments which have not been widely considered. Therefore, in order to implement effective spatial data systems for maritime risk analysis using big data, a greater understanding of these impacts is required.

This paper seeks to quantify the incident rate in UK waters per vessel movement. In particular, the impact of the Modifiable Areal Unit Problem (MAUP) on these results is considered, and possible solutions to leverage maritime big data whilst maintaining spatial consistency are proposed.

1.1 Spatial Models of Maritime Risk

The assumed relationship between vessel traffic and accidents is embedded in some of the earliest maritime risk analyses, in which the probability of an accident per unit time ($P$) equals the product of the number of accident candidates ($P_a$) and the causation probability ($P_c$) (Fujii and Shiobara, 1971; Macduff, 1974):

$$P = P_a \times P_c \tag{1}$$

In further models, the number of accident candidates includes a function related to the number of movements. For example, Pedersen's Cat. I groundings include a function $Q_i$ which is equal to the number of movements (Pedersen 1995), contact modelling for offshore installations includes a function of $N$ for the number of movements (Mujeeb-Ahmed et al. 2018), and domain analysis would necessarily result in more encounters the more traffic is in the system (Rawson et al. 2014).

Spatial Challenges of Maritime Risk Analysis using Big Data

A. Rawson University of Southampton, Electronics and Computer Science, IT Innovation Centre, United Kingdom

Z. Sabeur Bournemouth University, Department of Computing and Informatics, United Kingdom

G. Correndo University of Southampton, Electronics and Computer Science, IT Innovation Centre, United Kingdom

ABSTRACT: The establishment of incident rates, the number of accidents per unit measurement, can be used to characterize and compare navigational safety between areas. Whilst there are a multitude of factors which influence these rates, such an approach assumes some relationship between traffic volume and incidents. This paper characterizes the incident rates across the United Kingdom at different scales of resolution. The results suggest that maritime risk analysis is significantly influenced by the scale effect of the Modifiable Areal Unit Problem. In particular, the chosen spatial resolution has a significant effect on the strength of the relationship. This paper presents the Discrete Global Grid System as a possible method for more effective big data analysis of maritime risk to address these challenges.


Furthermore, risk models for vessel transits generally include some factor relative to traffic; for example, intensity (DNV, 2003) or distribution (Mazaheri et al. 2016).

This relationship has been challenged by some studies. Mazaheri et al. (2015) assessed grounding accidents in the Gulf of Finland and found little relationship between traffic density and the number of groundings, although they identified waterway complexity and traffic distribution as contributing factors. They note that a contrary argument may exist: navigators would avoid certain areas, and therefore traffic density would be lowest where the greatest grounding risk exists. Statistical analysis by Bye and Aalberg (2018) suggested that navigational accidents occurred in areas with relatively fewer vessels than other accident types.

Yet some locations have more accidents than others. It is important to understand why this variation exists so that navigational authorities can target risk mitigation measures where they are most warranted. Furthermore, correctly characterizing the risk profile of an area can support marine spatial planning by establishing a common baseline of acceptability against which future developments can be judged.

1.2 The Value of Incident Rates

The establishment of incident rates can be a valuable contribution to better understanding navigational safety. Ports use incident rates as a Key Performance Indicator to benchmark performance against the standards of industry best practice whilst accounting for changes in trade volume. Associated British Ports in the UK publish annual Port Marine Safety Code Reports which track incidents per 1,000 movements between their ports and over time (ABP, 2017).

As a rate, there is a requirement for a standardized unit of measurement. Incidents per year are a common unit of measurement but cannot be compared between locations which have significantly different volumes of traffic. Incidents per movement are easily calculated using port statistics or AIS data, such as through a crossing gate. However, some waterways are longer, and the increased travel distance and time allow a greater potential for incidents to occur. A unit of measurement such as incidents per nautical mile might address this. Furthermore, as vessels travel at different speeds, it could be argued that a faster vessel spends less time in an area and therefore has less opportunity to be involved in an incident. This would be reflected in an incident rate per hour of exposure.

These three normalized metrics (by movement, nautical mile travelled, or hour of exposure) must also be referenced to a study area. The definition of this study area is non-trivial and may have a significant impact upon the results.
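
The three normalizations can be compared with a minimal sketch. The `Exposure` fields, the function name and the two waterways below are hypothetical illustrations, not values from this study; the point is only that two areas with identical incidents per movement can differ once distance or time is taken into account.

```python
from dataclasses import dataclass


@dataclass
class Exposure:
    """Traffic exposure for one study area over one year (illustrative fields)."""
    movements: int          # number of vessel movements
    nautical_miles: float   # total distance travelled by all vessels
    hours: float            # total time vessels spent in the area


def incident_rates(incidents: int, exposure: Exposure) -> dict:
    """Return the three normalized incident rates for one study area."""
    return {
        "per_movement": incidents / exposure.movements,
        "per_nautical_mile": incidents / exposure.nautical_miles,
        "per_hour": incidents / exposure.hours,
    }


# Two hypothetical waterways with the same incidents and movements,
# but the second is four times longer (assumed values).
short = incident_rates(10, Exposure(movements=10_000, nautical_miles=50_000, hours=5_000))
long_ = incident_rates(10, Exposure(movements=10_000, nautical_miles=200_000, hours=20_000))

print(short)
print(long_)
```

Under these assumed numbers the per-movement rates are equal, whereas the per-nautical-mile and per-hour rates of the longer waterway are a quarter of those of the shorter one.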

1.3 Modifiable Areal Unit Problem and Spatial Distortions of Grids

The issue of MAUP is a significant area of research within the geospatial community but is not often considered within safety studies. The problem occurs because spatial data is inherently continuous and therefore there are an infinite number of possible locations. Spatial analysis must reduce this complexity through the use of generalizations or approximations, typically by binning data into some form of tessellation using a finite number of discrete elements.

MAUP refers to the phenomenon that changes in the boundaries of spatial analysis can change the statistical inference of the results (Openshaw, 1984). This is typically categorized into two components: the scale effect, associated with the resolution of analysis, and the zoning effect, associated with the geometry of the analysis.

These issues have been well studied; research from the 1930s established that as the number of areal units representing data decreased, the correlation coefficients between them increased (Gehlke and Biehl, 1934).
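
The scale effect can be reproduced with a small synthetic example. The sketch below is not the analysis of this paper: the values are invented so that local noise alternates in sign between neighbouring cells and cancels when cells are merged, and the helper names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def aggregate(values, group_size):
    """Sum consecutive fine cells into coarser units of `group_size` cells."""
    return [sum(values[i:i + group_size]) for i in range(0, len(values), group_size)]


# Synthetic fine-resolution cells (assumed values): y follows x plus local
# noise that alternates in sign, so it cancels when neighbours are merged.
x_fine = [4, 5, 6, 7, 8, 9, 10, 11]
noise = [3, -3, 3, -3, 3, -3, 3, -3]
y_fine = [x + e for x, e in zip(x_fine, noise)]

r_fine = pearson_r(x_fine, y_fine)
r_coarse = pearson_r(aggregate(x_fine, 2), aggregate(y_fine, 2))

print(f"8 fine units:   r = {r_fine:.2f}")
print(f"4 coarse units: r = {r_coarse:.2f}")
```

Halving the number of units removes the local variation entirely in this contrived case, so the correlation rises from a moderate value to a perfect one. Real data will not behave this cleanly, but the direction of the effect is the one described above.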

Within the maritime risk domain, this problem has only occasionally been recognized (Pelot and Plummer, 2008; Bertazzon et al. 2014). Research in other domains such as road transportation has demonstrated the impact that this would have on risk analysis (Xu et al. 2018). Analysis by Mazaheri et al. (2014) determined that altering the grid size of analysis had an effect on the correlations between vessel traffic and groundings in the Gulf of Finland.

Related to these challenges are inevitable distortions in grid cells as a result of the geometry of the earth. A regular Cartesian grid using a two-dimensional x and y system cannot accurately reflect the spherical earth, and some form of distortion is inherent in angle, area or distance. For example, the Mercator projection maintains bearings to aid navigators at the expense of significant areal distortion.

In many projection systems, these distortions are greatest towards the poles (Sahr, 2003). A one by one degree grid cell would have a width of 111 km at the equator, but only 19 km at 80 degrees latitude. This inherently introduces the MAUP scale effect within a single resolution and undermines the ability to perform spatial calculations accurately under a single projection system. For this reason, it is crucial that these issues are properly considered before developing maritime risk studies of the Polar Regions, where there is a significant need to enhance our knowledge. In addition, a spatial data structure which meets these challenges is required to enable big data analytics of vessel traffic data in combination with other datasets.
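
The width figures quoted above follow from the cosine of the latitude. The sketch below assumes a spherical earth and takes the 111 km equatorial value from the text; the function name is illustrative.

```python
import math

KM_PER_DEGREE_AT_EQUATOR = 111.0  # approximate value quoted in the text


def degree_width_km(latitude_deg: float) -> float:
    """East-west width of one degree of longitude on a spherical earth."""
    return KM_PER_DEGREE_AT_EQUATOR * math.cos(math.radians(latitude_deg))


# Widths at the equator, at a mid latitude and near the pole, rounded to km.
widths = {lat: round(degree_width_km(lat)) for lat in (0, 50, 80)}
print(widths)
```

The north-south height of the cell stays close to 111 km at every latitude, so the cell both shrinks in area and becomes increasingly elongated towards the poles.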


2 DATASET

The United Kingdom has a varied coastline including major European ports, some of the busiest shipping lanes in the world and small harbors. Therefore, it would be expected that there would be significant spatial variation in the frequency and likelihood of navigational incidents. This paper therefore seeks to characterize the incident rate across the UK's Exclusive Economic Zone (EEZ).

To calculate incident rates, two key sources of data are required: vessel traffic data and incident data, which are described below. Furthermore, it is necessary to define a unit of study and, recognizing the limitations of regular grid systems discussed above, the use of a Discrete Global Grid System (DGGS) is proposed.

2.1 DGGS

Recognizing the difficulty of developing a global spatial model which overcomes the limitations of regular Cartesian grids, the Open Geospatial Consortium is championing the Discrete Global Grid System (DGGS). DGGS constructs a global grid of equal area platonic shapes (Barnes, 2019) and is independent of projections. Furthermore, DGGS enables resolutions to be changed by subdividing each cell into finer tessellations without compromising the spatial relationships between them.

DGGS can implement a cell geometry of any platonic shape. In this study a hexagonal DGGS is implemented, which has a number of advantages such as a more compact topology, uniformly high symmetry and uniform adjacency between neighbouring cells (Sahr and White, 1998; Tong et al. 2013). Furthermore, hexagonal grids offer some visual benefits, such as reduced ambiguity because cells adjoin at edges rather than corners, less of the regular lattice structure that can be visually distracting, and an additional axis of alignment (Birch et al. 2007).

2.2 AIS Data

AIS data is available from the Marine Management Organization (MMO), which produced anonymized vessel track data for one week per month in 2015. The AIS data was collected by the Maritime and Coastguard Agency (MCA) in the UK. The data was provided preprocessed into vessel tracks as ESRI shapefile polylines. These tracks have been further processed to provide unique transits per day per vessel, which are defined in this study as a movement (see Figure 1).

It is recognized that this will under-represent some vessel types, such as ferries and pilot boats, which make multiple transits through the same area per day. Whilst other measures such as per hour of exposure or nautical mile travelled would be preferable, it is not possible to calculate these metrics on the MMO dataset.
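
The movement definition above (one movement per vessel per day) can be sketched as a de-duplication over track records. The tuple layout, vessel identifiers and cell identifiers below are hypothetical; the MMO shapefile schema is not reproduced here.

```python
from collections import defaultdict
from datetime import date


def count_movements(track_cell_hits):
    """Count movements per cell as unique (vessel, day) pairs.

    `track_cell_hits` is an iterable of (vessel_id, day, cell_id) tuples,
    one for every time a track intersects a cell (assumed input format).
    """
    seen = defaultdict(set)
    for vessel_id, day, cell_id in track_cell_hits:
        seen[cell_id].add((vessel_id, day))
    return {cell_id: len(pairs) for cell_id, pairs in seen.items()}


hits = [
    ("ferry-1", date(2015, 1, 5), "cell-A"),
    ("ferry-1", date(2015, 1, 5), "cell-A"),  # second crossing on the same day
    ("ferry-1", date(2015, 1, 6), "cell-A"),
    ("cargo-7", date(2015, 1, 5), "cell-A"),
    ("cargo-7", date(2015, 1, 5), "cell-B"),
]
movements = count_movements(hits)
print(movements)
```

The repeated ferry crossing on 5 January counts once, which illustrates why vessels making several transits per day are under-represented by this measure.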

2.3 Incident Data

Incident data was provided by the Marine Accident Investigation Branch (MAIB) under a Freedom of Information Request for the years 2009 to 2017. The MAIB is responsible for investigating all incidents in UK waters and incidents involving UK flagged vessels. The data has been filtered to the UK EEZ and contains a total of 9,829 incidents, a rate of 1,092 per year.

The dataset includes all incident types, including navigational incidents such as collisions, contacts and groundings but also near misses, accidents to persons and pollution events. This analysis focuses on collisions (n=1,394), groundings (n=771) and all incident categories (n=9,829).

The locations of the incidents are plotted in Figure 2. The incidents are distributed across the UK, although some clusters are evident around some of the UK's major ports. The frequency is consistent between the years of analysis, although a seasonal trend is evident, with 41% more incidents per month in July and August than in January and February, likely the result of the increased number of recreational movements in the summer months. Fishing and Other vessel types account for more than half the total number of incidents.

Figure 1: AIS Data for UK (2015).


Figure 2: All incidents (2009-2017).

3 ANALYSIS

3.1 Methodology

To calculate the incident rates, pre-processing was required. Firstly, the incident data and vessel traffic data were standardized by vessel type to allow for comparison. Six vessel categories were defined; namely Cargo (e.g. Container, Bulk), Tankers (e.g. Oil/Chemical), Fishing, Recreational, Passenger (e.g. Cruise, Ferry) and Other (e.g. Tugs, Pilot Boats, SAR).

Secondly, the dggridR package (Barnes, 2018) was used with an ISEA3H system to establish three resolutions for analysis; namely resolutions 7 (23,000 km²), 9 (2,591 km²) and 11 (287 km²), see Figure 3. The number of incidents within each grid cell, by incident type and vessel type, was calculated under the three resolutions based on the recorded latitudes and longitudes. The number of movements per grid cell was calculated using unique intersections of vessel tracks derived from the MMO database.
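
The cell areas quoted for the three resolutions can be approximated from the cell count of an aperture 3 hexagonal grid built on an icosahedron, which is 10 × 3^r + 2 at resolution r. The sketch below is a back-of-envelope check in Python rather than a call into dggridR; the earth surface area constant is an assumed round value, and the result is a mean cell area that ignores the twelve slightly smaller pentagonal cells.

```python
EARTH_SURFACE_KM2 = 510_065_622  # assumed approximate surface area of the earth


def isea3h_cell_count(resolution: int) -> int:
    """Number of cells in an aperture 3 hexagonal grid on an icosahedron."""
    return 10 * 3 ** resolution + 2


def mean_cell_area_km2(resolution: int) -> float:
    """Mean cell area, treating all cells as equal in size."""
    return EARTH_SURFACE_KM2 / isea3h_cell_count(resolution)


areas = {res: mean_cell_area_km2(res) for res in (7, 9, 11)}
for res, area in areas.items():
    print(f"resolution {res}: {isea3h_cell_count(res):>9,} cells, ~{area:,.0f} km2 each")
```

Each step of two resolutions divides the cell area by roughly nine, which is why resolutions 7, 9 and 11 give three clearly separated scales of analysis.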

Figure 3: DGGRID at Resolution 7 (green), 9 (black) and 11 (red) across the English Channel.

Finally, incident rates per movement ($P_a$) are calculated for each grid cell as the number of accidents per year ($N_a$) divided by the number of movements per year ($N_m$) (Kristiansen, 2005):

$$P_a = \frac{N_a}{N_m} \tag{2}$$
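
Equation 2 can be applied cell by cell as in the following minimal sketch. The cell identifiers and counts are hypothetical, and returning `None` for cells without recorded traffic is an assumption made here for illustration, not a rule stated in the paper.

```python
def incident_rate_per_movement(incidents_per_year, movements_per_year):
    """Return P_a = N_a / N_m per cell; None where no traffic was recorded."""
    rates = {}
    for cell_id, n_m in movements_per_year.items():
        n_a = incidents_per_year.get(cell_id, 0.0)  # no record means no incidents
        rates[cell_id] = n_a / n_m if n_m > 0 else None
    return rates


# Hypothetical cells: 18 incidents over nine years in cell-A, none elsewhere.
incidents_per_year = {"cell-A": 18 / 9}
movements_per_year = {"cell-A": 40_000, "cell-B": 500, "cell-C": 0}

rates = incident_rate_per_movement(incidents_per_year, movements_per_year)
print(rates)
```

Leaving the rate undefined where the denominator is zero keeps cells with no AIS coverage from being read as either perfectly safe or infinitely risky.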

3.2 Results

Table 1 gives the national incident rates per movement. The total incident rate is highest for fishing, other and passenger vessel types. It is likely that these three rates are elevated due to the influence of accidents to persons, which are more likely on these vessel types.

The incident rates for groundings and collisions show a different relationship. Recreational and other vessel types have a high rate of collision, likely due to operating in denser inshore areas than other vessel types. Commercial vessel types such as tankers and cargo vessels have significantly lower incident rates than the other vessel types.

Table 1: National Incident Rates per Movement

Vessel type     Total Incidents    Groundings       Collisions
                per Movement       per Movement     per Movement
Cargo           1.78 × 10⁻⁴        2.27 × 10⁻⁵      2.37 × 10⁻⁵
Fishing         2.19 × 10⁻³        1.61 × 10⁻⁴      1.86 × 10⁻⁴
Other           2.25 × 10⁻³        1.89 × 10⁻⁴      3.39 × 10⁻⁴
Passenger       2.49 × 10⁻³        9.83 × 10⁻⁵      2.36 × 10⁻⁴
Recreational    1.40 × 10⁻³        1.33 × 10⁻⁴      4.62 × 10⁻⁴
Tanker          5.59 × 10⁻⁴        2.88 × 10⁻⁵      1.34 × 10⁻⁴
Total           7.27 × 10⁻⁴        5.83 × 10⁻⁵      1.04 × 10⁻⁴


Figures 4 through 6 plot the incident rate for groundings by cell under different resolutions as an example. There is significant variation in incident rates across the country. With the finer resolutions of DGGRID9 and DGGRID11, the highest incident rates are all in ports or inland waterways. In particular, Humber, Manchester and the North-West of Scotland all have notably higher incident rates. Higher grounding rates in the approaches to ports and other constrained waterways are to be expected, as vessels in these areas frequently navigate closer to shallow water than when on passage. At DGGRID7 scale, there is little variation in incident rates across the country as local hot spots are generalized.

These three figures demonstrate variation both spatially across the country and between resolutions. For example, under the larger grid cells, the heightened incident rate for Humber is not evident. Similarly, finer resolutions reveal local hot spots in the Thames Estuary which are not evident at the coarser resolutions.

Table 2 tests the correlation between vessel traffic and groundings, collisions and all incident types at varying resolutions using the Pearson correlation coefficient. A preliminary study of the results suggests that a statistical relationship can be identified between vessel traffic and the number of incidents. However, the strength of this relationship varies significantly between resolutions, vessel types and incident types.

Figure 4: Grounding Incident Rate at DGGRID7 (23,000 km²).

Figure 5: Grounding Incident Rate at DGGRID9 (2,591 km²).

Figure 6: Grounding Incident Rate at DGGRID11 (287 km²).

Firstly, groundings generally have a lower correlation with vessel traffic than collisions and total incidents. This likely reflects that a vessel must deviate from established vessel lanes to run aground, and therefore the density of traffic in locations where groundings took place would be inherently lower.

It would be expected that collisions would occur in busier areas, and the results suggest that this relationship is stronger than for groundings. Furthermore, as total incidents include mechanical failures and accidents to persons, which are often independent of spatial factors, the relationship with vessel density is relatively strong.

Secondly, significant variation exists between vessel types. In general, recreational craft have the strongest relationship, which is surprising as it is likely that both incidents and vessel activity are under-reported since there is no requirement for AIS carriage. Commercial shipping in the form of cargo and tanker vessels has the lowest correlations, suggesting that other factors may have a disproportionately greater impact on the likelihood of an accident than for other vessel types.

Thirdly, the results clearly show the MAUP scale effect, whereby increasing the size of the spatial units results in greater correlations between accidents and vessel traffic.

4 DISCUSSION

4.1 Data Quality

The results have suggested a positive relationship between vessel traffic and incidents, with considerable variation in incident rates. Data quality issues are likely to be a significant contributor to this result.

The AIS data has been sourced from the MMO and utilizes the MCA's coastal network of AIS receivers. However, in various locations it was evident that there was limited data coverage. In particular, inland waters such as rivers and ports had significantly fewer vessel tracks than expected. The MCA acts as the coastal authority for the UK, with the navigation authority for inland waterways administered by ports and harbors. As such, the MCA's AIS network is likely targeted at coastal vessel traffic, and gaps in coverage are more likely in those ports and harbors where the number of incidents was shown to be greater. This could skew the incident rates as a result of AIS coverage as opposed to risk.

The incident data provided by the MAIB also has limitations, with underreporting among certain vessel types and possible errors. Several groundings were reported in locations with deep water. Previous studies have needed to remove approximately 10% of the incident data to account for these errors (Mazaheri et al. 2015). There may also be some reporting bias, as larger vessel types or certain ports and harbors may be more diligent in collecting data than others.

Table 2: Pearson r correlation analysis for incident rates.

Groundings
                 DGGRID7         DGGRID9         DGGRID11
Vessel type      R      p        R      p        R      p
Cargo            0.22   0.07     0.06   0.21     0.06   0.00
Fishing          0.41   0.00     0.25   0.00     0.17   0.00
Other            0.58   0.00     0.58   0.00     0.47   0.00
Passenger        0.66   0.00     0.50   0.00     0.26   0.00
Recreational     0.86   0.00     0.78   0.00     0.61   0.00
Tanker           0.34   0.00     0.11   0.02     0.10   0.00
Total            0.50   0.00     0.31   0.00     0.25   0.00

Collisions
                 DGGRID7         DGGRID9         DGGRID11
Vessel type      R      p        R      p        R      p
Cargo            0.51   0.00     0.41   0.00     0.36   0.00
Fishing          0.68   0.00     0.53   0.00     0.35   0.00
Other            0.63   0.00     0.54   0.00     0.45   0.00
Passenger        0.71   0.00     0.48   0.00     0.33   0.00
Recreational     0.95   0.00     0.92   0.00     0.83   0.00
Tanker           0.58   0.00     0.33   0.00     0.26   0.00
Total            0.68   0.00     0.44   0.00     0.38   0.00

All Incidents
                 DGGRID7         DGGRID9         DGGRID11
Vessel type      R      p        R      p        R      p
Cargo            0.46   0.00     0.29   0.25     0.26   0.00
Fishing          0.66   0.00     0.57   0.00     0.50   0.00
Other            0.71   0.00     0.68   0.00     0.57   0.00
Passenger        0.75   0.00     0.63   0.00     0.45   0.00
Recreational     0.95   0.00     0.93   0.00     0.85   0.00
Tanker           0.47   0.00     0.27   0.00     0.22   0.00
Total            0.68   0.00     0.48   0.00     0.40   0.00

The AIS data was a sample from a single year and therefore would not reflect the changes in traffic volume and distribution which would have occurred during the nine years of incident data. Incidents can alter traffic patterns as new controls are put in place to prevent re-occurrence (Rawson, 2017). New developments such as wind farms would also have a significant effect on vessel traffic patterns and the types of incidents.

4.2 Scale Effect

Finally, the results also show that there is a clear scale effect which can influence the results. As the size of the spatial units of analysis increases, the correlation between vessel traffic and incidents increases.

With larger study areas, more groundings may be included in the assessment; however, the specific localized spatial factors which may have contributed to them are generalized. For example, a smaller grid cell may include a specific hazardous sand bank which demonstrates a high incident rate, but if it is considered as part of a larger grid cell that also contains open water, this feature is generalized.


Conversely, with smaller grid cells, the sample size of incidents reduces and the incident rates become more variable. One method to overcome this is to analyze only those areas where incidents occur (Mazaheri et al. 2015); however, this excludes those areas of similar characteristics where incidents have not occurred.

The choice of assessment scale is therefore an important consideration in calculating incident rates.

4.3 Zonal Effect

As discussed in Section 1.3, the use of grid systems results in significant distortion in shape and size near the poles. Within the UK EEZ, a 0.25-degree Cartesian grid system would vary in cell area between 334 km² and 527 km² (at WGS84 UTM30N). This difference would be greater at a global scale and could introduce a scale effect through variable cell distortion.
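
The order of magnitude of this variation can be checked with the area of a latitude-longitude cell on a sphere, A = R² Δλ (sin φ₂ − sin φ₁). The sketch below is a hedged approximation: it assumes a spherical earth and illustrative southern and northern latitudes for the UK EEZ, so it will not reproduce the UTM30N figures above exactly.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation


def latlon_cell_area_km2(south_lat_deg: float, size_deg: float) -> float:
    """Area of a size_deg by size_deg latitude-longitude cell on a sphere."""
    phi1 = math.radians(south_lat_deg)
    phi2 = math.radians(south_lat_deg + size_deg)
    dlam = math.radians(size_deg)
    return EARTH_RADIUS_KM ** 2 * dlam * (math.sin(phi2) - math.sin(phi1))


# Assumed latitudes near the southern and northern limits of the UK EEZ.
area_south = latlon_cell_area_km2(47.5, 0.25)
area_north = latlon_cell_area_km2(63.75, 0.25)

print(f"0.25-degree cell near 47.5N:  {area_south:.0f} km2")
print(f"0.25-degree cell near 63.75N: {area_north:.0f} km2")
```

Under these assumptions the southern cell is roughly half as large again as the northern one, which is the same order of difference as the range quoted above.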

Similarly, in a square grid system the distances from the center of a square to its edges and to its corners differ. Towards the poles these squares become rectangular in geometry, exacerbating this effect. Therefore, under this system each spatial cell would include data at greater distances in one direction than another, potentially introducing bias into the results.

To overcome this, a DGGS has been implemented to create equal area hexagons, enabling a global spatial model without significant distortion. Furthermore, a hexagon more closely approximates a circle and therefore the distance both from the center to the edge and between cells is more consistent.
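
The difference in consistency can be made concrete with elementary geometry. The following comparison is a worked illustration using a square of side $a$ and a regular hexagon of side $s$ (symbols introduced here, not taken from the paper):

```latex
% Ratio of centre-to-corner distance to centre-to-edge distance
\[
\text{square: } \frac{a\sqrt{2}/2}{a/2} = \sqrt{2} \approx 1.41,
\qquad
\text{hexagon: } \frac{s}{s\sqrt{3}/2} = \frac{2}{\sqrt{3}} \approx 1.15
\]

% Distance between the centres of neighbouring cells
\[
\text{square: } a \ \text{(4 edge neighbours)}, \quad a\sqrt{2} \ \text{(4 corner neighbours)};
\qquad
\text{hexagon: } s\sqrt{3} \ \text{(all 6 neighbours)}
\]
```

A ratio closer to 1 means that points within a cell lie at more uniform distances from its center, and the hexagon is the only one of the two with a single neighbor distance.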

4.4 Potential Solutions

Whilst this paper has demonstrated that incident rates are susceptible to changes in scale, the following approaches might address some of these challenges.

One solution would be to perform analysis without aggregating data, such as a transit-based model. There are widespread examples of these approaches, including statistical techniques such as binary logistic regression models (Knapp et al. 2011), domain analysis (Fujii and Tanaka, 1971; Wang et al. 2009) or Bayesian Networks (Mazaheri et al. 2016).

By treating vessels as individual actors, aggregation is avoided but new challenges are introduced. Models must attempt to develop a causal relationship between a multitude of factors in a highly complex system, which may not be possible (Hanninen, 2014). This is exacerbated by a lack of incident data, and therefore many studies seek to implement expert judgement. Furthermore, analysis of the conditions for each transit to build these models requires a method of combining AIS data with metocean or other datasets. This is typically achieved through combining Cartesian grids, and therefore MAUP challenges would be replicated.

A need to perform spatial mapping of maritime risk still remains. By highlighting areas with increased incident rates, system-wide factors might be identifiable which would be hidden if only individual transits are examined. In addition, it is not often possible to access the high quality AIS datasets which would enable transit-based analysis, due to their cost of collection and commercial value. Furthermore, evidence-based decision making on baseline levels of risk is necessary for decision makers when there is a spatial component, such as marine spatial planning.

To meet the challenges of MAUP and enable effective big data processing of vessel traffic data, the use of DGGS as opposed to Cartesian grid systems would yield more consistent results. This is particularly effective for developing global models, including the Polar Regions where cell distortion would be greatest.

A characteristic of this analysis is the use of arbitrary spatial units to form a grid. This means that some cells include significantly different environments, which may not be meaningfully comparable. For example, at DGGRID9 with a 67 km cross section, some cells include both small harbors and open sea. A more appropriate approach may be to define similar and comparable spatial units. Whilst not detailed here, preliminary analysis suggests that if the incident rate is compared between ports and harbors only, the correlation is far stronger.

Finally, it is recommended that sensitivity analysis is performed on spatial models of incident rates to determine the effect of MAUP, for example by repeating the analysis with different grid sizes (Mazaheri et al. 2014).

5 CONCLUSIONS AND FUTURE WORK

This assessment has investigated the relationship between vessel traffic and the number of incidents as a predictor of the incident rates within an area. The results suggest that this relationship exists, but that its strength depends upon incident type, vessel type and the scale of assessment. Challenges of the MAUP are inherent in spatial analysis and should therefore be properly considered in maritime risk analysis. The use of DGGS, as opposed to traditional Cartesian grid systems, has been proposed as a method to reduce the effects of MAUP.

This study has considered only vessel traffic activity and incident numbers, and it is recognized that there are a multitude of factors which influence the likelihood of an incident occurring. A spatial data structure has been proposed which can be effectively implemented with big data technologies to efficiently fuse heterogeneous datasets representing vessel traffic, metocean, bathymetric and historical incident databases. Through this, machine learning techniques are being investigated to more accurately characterize the navigational safety of a region.

6 ACKNOWLEDGEMENTS

This work is partly funded by the European Research Council under the European Commission, under the Horizon 2020 research and innovation program (grant agreement number: 723526), the Southampton Marine and Maritime Institute (SMMI) and the University of Southampton's School of Electronics and Computer Science.

7 REFERENCES

ABP 2017. Port Marine Safety Code: Annual Performance Re-view.

Abualhaol, I.Y., Falcon, R., Abielmona, R.S. & Petriu, E.M. 2018. Mining Port Congestion Indicators from Big AIS Da-ta. International Joint Conference on Neural Networks. Rio de Janeiro, 8-13 July 2018.

Barnes, R. 2018. dggridR: Discrete Global Grids for R. Available at: https://CRAN.R-project.org/package=dggridR.

Barnes, R. 2019. Optimal Orientations of Discrete Global Grids and the Poles of Inaccessibility. International Journal of Digital Earth. 17

Bertazzon, S., Hara, P., Barrett, O. & Serra-Sogas, N. 2014. Geospatial analysis of oil discharges observed by the National Aerial Surveillance Program in the Canadian Pacific Ocean. Applied Geography, 52: 78-89.

Birch, C.D., Oom, S.P. & Beecham, J.A. 2007. Rectangular and hexagonal grids used for observation, experiment and simulation in ecology. Ecological Modelling, 206: 347-359.

Bye, R. & Aalberg, A. 2018. Maritime navigation accidents and risk indicators: An exploratory statistical analysis using AIS data and accident reports. Reliability Engineering and System Safety 176: 174-186.

DNV 2003. Formal Safety Assessment for Large Passenger Vessels.

Filipiak, D., Strozyna, M., Wecel, K. & Abramowicz, W. 2018. Anomaly Detection in the Maritime Domain: Comparison of Traditional and Big Data Approach. NATO IST-160-RSM Specialists’ Meeting on Big Data and Artificial Intelligence for Military Decision Making. Bordeaux, France, 06/01/2018.

Friis-Hansen, P. 2008. IWRAP MK II: Working Document: Basic Modelling Principles for Prediction of Collision and Grounding Frequencies.

Fujii, Y. & Shiobara, R. 1971. The analysis of traffic accidents. Journal of Navigation, 24: 534-543.

Fujii, Y. & Tanaka, K. 1971. Traffic Capacity. Journal of Navigation, 24: 543-552.

Gehlke, C.E. & Biehl, K. 1934. Certain effects of grouping upon the size of the correlation in census tract material. Journal of American Statistical Society, 29: 169-170.

Hanninen, M. 2014. Bayesian Networks for Maritime Traffic Accident Prevention: Benefits and Challenges. Accident Analysis and Prevention, 73: 305-312.

Jeong, M., Lee, E., Lee, M. and Jung, J. 2019. Multi-criteria route planning with risk contour map for smart navigation. Ocean Engineering, 172: 72-85.

Knapp, S., Kumar, S., Sakurada, Y. & Shen, J. 2011. Econometric analysis of the changing effects in wind strength and significant wave height on the probability of casualty in shipping. Accident Analysis and Prevention, 43: 1252-1266.

Kristiansen, S. 2005. Maritime Transportation: Safety Management and Risk Analysis. Elsevier, Oxford.

Macduff, T. 1974. Probability of vessel collisions. Ocean Industry, 9: 144-148.

Mazaheri, A., Montewka, J., Kotilainen, P., Sormunen, O. & Kujala, P. 2015. Assessing Grounding Frequency using Ship Traffic and Waterway Complexity. Journal of Navigation, 68(1): 89-106.

Mazaheri, A., Montewka, J. & Kujala, P. 2016. Towards an evidence-based probabilistic risk model for ship-grounding accidents. Safety Science, 86: 195-210.

MMO 2014. Mapping UK Shipping Density and Routes Technical Annex: 1066. https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/317771/1066-annex.pdf

Mujeeb-Ahmed M.P. Seo, J.K. & Paik J.K. 2018. Probabilistic approach for collision risk analysis of powered vessel with offshore platforms. Ocean Engineering 151: 206-221.

Openshaw, S. 1984. The Modifiable Areal Unit Problem. CATMOG 38. Geobooks, Norwich.

Pedersen, P.T. 1995. Collision and Grounding Mechanics. Proceedings of WEMT 95. Copenhagen, Denmark, The Danish Society of Naval Architecture and Marine Engineering.

Pellot, R. & Plummer, L. 2008. Spatial analysis of traffic and risks in the coastal zone. Journal of Coastal Conservation, 11.

Rawson, A. Rogers, E. Foster, D. Phillips, D. 2014. Practical Application of Domain Analysis: Port of London Case Study. Journal of Navigation, 67(2): 193-209.

Rawson, A. 2017. An Analysis of Vessel Traffic Flow Before and After the Grounding of the MV Rena, 2011. Proceedings of 12th International Conference on Marine Navigation and Safety of Sea Transportation, Gdynia, 21-23 June 2017.

Riveiro, M., Pallotta, G. & Vespe, M. 2018. Maritime Anomaly Detection: A Review. Data Mining and Knowledge Discovery, 8(5).

Sahr, K.M. & White, D. 1998. Discrete Global Grid Systems. Computing Science and Statistics, 30.

Sahr, K.M., White, D. & Kimerling, A.J. 2003. Geodesic Discrete Global Grid Systems. Cartography and Geographic Information Science, 30(2): 121-134.

Tong, X., Ben, J., Wang, Y., Zhang, Y. & Pei, T. 2013. Efficient encoding and spatial operation scheme for aperture 4 hexagonal discrete global grid system. International Journal of Geographical Information Science, 27: 898-921.

Tu, E., Zhang, G., Rachmawati, L., Rajabally, E. & Huang, G. 2017. Exploiting AIS Data for Intelligent Maritime Navigation: A Comprehensive Survey from Data to Methodology. IEEE Transactions on Intelligent Transportation Systems, 19(5): 1559-1582.

Wang, N. 2013. A Novel Analytical Framework for Dynamic Quaternion Ship Domains. Journal of Navigation, 66: 265-281.

Xu, P. Huang, H. and Dong, N. 2018. The modifiable areal unit problem in traffic safety: Basic issue, potential solutions and future research. Journal of Traffic and Transportation Engineering, 5(1): 73-82.
