
Clemson University

TigerPrints

All Dissertations

August 2021

Large-Scale Optimization Models with Applications in Biological and Emergency Response Networks

Mustafa Can Camur
Clemson University, [email protected]

Follow this and additional works at: https://tigerprints.clemson.edu/all_dissertations

Recommended Citation
Camur, Mustafa Can, "Large-Scale Optimization Models with Applications in Biological and Emergency Response Networks" (2021). All Dissertations. 2844. https://tigerprints.clemson.edu/all_dissertations/2844

This Dissertation is brought to you for free and open access by the Dissertations at TigerPrints. It has been accepted for inclusion in All Dissertations by an authorized administrator of TigerPrints. For more information, please contact [email protected].


Large-Scale Optimization Models with Applications in Biological and Emergency Response Networks

A Dissertation

Presented to

the Graduate School of

Clemson University

In Partial Fulfillment

of the Requirements for the Degree

Doctor of Philosophy

Industrial Engineering

by

Mustafa Can Camur

August 2021

Accepted by:

Dr. Thomas C. Sharkey, Committee Chair

Dr. Chrysafis Vogiatzis

Dr. Yongjia Song

Dr. Emily Tucker


Abstract

In this dissertation, we present new classes of network optimization models and algorithms, including heuristics and decomposition-based methods, to solve them. Overall, our applications highlight the breadth of problems to which optimization models can be applied, including problems in protein-protein interaction networks and emergency response networks. To the best of our knowledge, this is the first study to propose an exact solution approach for the star degree centrality (SDC) problem. In addition, we are the first to introduce the stochastic pseudo-star degree centrality problem, and we design a decomposition approach for it. For both problems, we present new complexity results that classify the practical difficulty of the problems on different graph types. Moreover, we analyze an Arctic mass rescue event from an optimization perspective and create a novel network optimization model that examines the impact of the event on the evacuees and the time to evacuate them.

We first consider the problem of identifying the induced star with the largest-cardinality open neighborhood in a graph. This problem, also known as the SDC problem, has been shown to be NP-complete. In this dissertation, we propose a new integer programming (IP) formulation, which has fewer constraints and fewer non-zero coefficients than the existing formulation in the literature. We present classes of networks where the problem is solvable in polynomial time, and offer a new proof of NP-completeness showing that the problem remains NP-complete on both bipartite and split graphs. In addition, we propose a decomposition framework that is suitable for both the existing and the new formulations. We implement several acceleration techniques in this framework, motivated by those used in Benders decomposition. We test our approaches on networks generated from the Barabási–Albert, Erdős–Rényi, and Watts–Strogatz models. Our decomposition approach outperforms solving the IP formulations in most instances in terms of both solution time and solution quality; this is especially true as the graph gets larger and denser. We then test the decomposition algorithm on large-scale protein-protein interaction networks, for which SDC has been shown to be an important centrality metric.

We then introduce the stochastic pseudo-star degree centrality problem and propose methods to solve it exactly. The goal is to identify an induced pseudo-star, defined as a collection of nodes that forms a star network with a certain probability, maximizing the sum of the probability values of the unique assignments between the star and its open neighborhood. In this problem, we are specifically interested in a feasible pseudo-star, where feasibility is measured as the product of the existence probabilities of the edges between the center node and the leaf nodes and the product of one minus the existence probabilities of the edges among the leaf nodes. We show that the problem is NP-complete on general graphs, trees, and windmill graphs. We initially propose a non-linear binary optimization model to solve this problem. Subsequently, we linearize our model via McCormick inequalities and develop a branch-and-Benders-cut framework to solve it. We generate logic-based Benders cuts as alternative feasibility cuts and examine several acceleration techniques. The performance of our implementation is tested on randomly generated networks based on small-world (SW) graphs. SW networks resemble large-scale protein-protein interaction networks, for which the deterministic star degree centrality has been shown to be an efficient group-based centrality metric for detecting essential proteins. Our computational results indicate that the Benders implementation outperforms solving the model directly via a commercial solver in terms of both solution time and solution quality in the majority of the test instances.

Lastly, we turn our attention to a network optimization problem with an application in Arctic emergency response. We study a model that optimizes the response to a mass rescue event in Arctic Alaska. The model contains dynamic logistics decisions for a large-scale maritime evacuation with the objectives of minimizing the impact of the event on the evacuees and the average evacuation time. Our proposed optimization model considers two interacting networks: the network that moves evacuees from the location of the event to destinations outside the Arctic (e.g., a large city in Alaska such as Anchorage) and the logistics network that moves relief materials to evacuees during the operations. We model the concept of deprivation costs by incorporating priority levels, capturing the severity of evacuees' current medical situation, and deprivation periods, indicating the amount of time an evacuee has gone without key relief resources. Our model is capable of determining the best possible response given the current locations of response resources and is used to assess the effectiveness of an intuitive heuristic that mimics emergency response decision-making.


Dedication

to Belma and Cansu.


Acknowledgements

First and foremost, I would like to express my gratitude to my Ph.D. advisor, Dr. Thomas C. Sharkey, who has been a great supervisor and mentor throughout my four-year Ph.D. journey. I absolutely feel lucky and blessed to have worked under his supervision. Second, I would like to thank Dr. Chrysafis Vogiatzis, who was a great influence on me to pursue my Ph.D. I would also like to acknowledge the support and guidance of the rest of my committee members, Drs. Yongjia Song and Emily Tucker. Lastly, I would like to mention that I spent the first three years of my Ph.D. program at Rensselaer Polytechnic Institute, and I thank all the great faculty members with whom I took courses there. I specifically want to mention Dr. John Mitchell, whose dedication to teaching is truly admirable.

During the years I spent thousands of miles away from home, my family and my friends have been there to help and support me. I would like to especially mention my mom, Belma, to whom I owe everything I have accomplished so far. I thank both my father, Ali, and my sister, Cansu, who has been like a second mother to me besides being a great sibling. All of my other family members and friends, whose names could not appear here, should know that I am forever grateful for everything they have done for me.


Table of Contents

Title Page . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . i

Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii

Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv

Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v

List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

2 Star Degree Centrality: Definitions and Problem Statements . . . . . . . . . . . 6
   2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
   2.2 Problem Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

3 The Star Degree Centrality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
   3.1 Mathematical Formulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
   3.2 Complexity Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
   3.3 Solution Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
   3.4 Algorithmic Enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
   3.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
   3.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

4 The Stochastic Pseudo-Star Degree Centrality . . . . . . . . . . . . . . . . . . . . 47
   4.1 Complexity Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
   4.2 Mathematical Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
   4.3 Solution Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
   4.4 Algorithmic Enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
   4.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
   4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

5 Optimizing the Response for Arctic Mass Rescue Events . . . . . . . . . . . . . . 75
   5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
   5.2 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
   5.3 Problem Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
   5.4 An Optimization Model for Arctic MREs . . . . . . . . . . . . . . . . . . . . 95
   5.5 Overview of Solution Methodologies . . . . . . . . . . . . . . . . . . . . . . . 105
   5.6 Computational Study: Data Set Description and Baseline Analysis . . . . . . 108
   5.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123


6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

Appendices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
   A Appendix A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135


List of Tables

3.1 Parameter settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.2 Summary of results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 The computational results for the BA Model . . . . . . . . . . . . . . . . . . . . 39
3.4 The computational results for the ER Model . . . . . . . . . . . . . . . . . . . . 40
3.5 The computational results for the WS Model . . . . . . . . . . . . . . . . . . . . 41
3.6 The computational results for Helicobacter Pylori (n = 1,570) . . . . . . . . . . . 44
3.7 The computational results for Staphylococcus Aureus (n = 2,852) . . . . . . . . . 45

4.1 Summary of results (27 Instances) . . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.2 The computational results with θ = 0.99 . . . . . . . . . . . . . . . . . . . . . . . 69
4.3 The computational results with different θ values via BD-LB and BD-LB-WS . . 72

5.1 Data on communities in Arctic Alaska . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2 Decisions conducted at the end of time 5 and their consequences . . . . . . . . . 95
5.3 Set definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.4 Variable definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.5 Parameter definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.6 New variables defined . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.7 Populations and capacities in locations . . . . . . . . . . . . . . . . . . . . . . . . 110
5.8 List of assets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.9 Initial inventory in each location . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.10 Resource and equipment list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.11 Initial deployment locations for ships . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.12 Changes in sr when resource demand is met . . . . . . . . . . . . . . . . . . . . . 113
5.13 Jumps in ABNN (p, sr, se) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
5.14 Jumps in AESN (p, sr, se) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

A1 Comparison of the initial optimality gaps in the baseline experiment . . . . . . . 133
A2 Comparison of the initial optimality gaps in the experiment with the upgraded runways . . . 133
A3 Comparison of the solution methods in the baseline experiment [Time (in mins), Gap (%)] . . . 134
A4 Comparison of the solution methods in the experiment with the upgraded runways . . . 134


List of Figures

2.1 Examples of star graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Determining the star degree centrality of a given node where the center and leaf nodes are shown in red and blue, respectively . . . 11
2.3 A subgraph of the PPIN of Saccharomyces Cerevisiae . . . . . . . . . . . . . . . 11
2.4 Determining the stochastic pseudo-star degree centrality of a given node where the center and leaf nodes are shown in red and blue, respectively . . . 13
2.5 Calculation of the SPSDC of proteins in a real-world PPIN . . . . . . . . . . . . 14
2.6 1 - The SPSDC of a non-essential protein . . . . . . . . . . . . . . . . . . . . . . 14
2.7 2 - The SPSDC of a non-essential protein (zoomed in) . . . . . . . . . . . . . . . 14
2.8 3 - The SPSDC of a non-essential protein (more zoomed in) . . . . . . . . . . . . 14
2.9 4 - The SPSDC of an essential protein . . . . . . . . . . . . . . . . . . . . . . . . 14
2.10 5 - The SPSDC of an essential protein (zoomed in) . . . . . . . . . . . . . . . . . 14
2.11 6 - The SPSDC of an essential protein (more zoomed in) . . . . . . . . . . . . . . 14

3.1 A counterexample where the optimal solution obtained in LP[NIP] cannot be converted to a feasible solution in LP[VCIP] . . . 20
3.2 The transformation of Set Cover < U, S, k > to an instance < G(V,E), l > of Star Degree Centrality . . . 23
3.3 The impact of warm-start on the solution times in [NIP] in the BA model . . . . 35
3.4 The impact of warm-start on the optimality gaps in [NIP] in the BA model . . . 35
3.5 The impact of warm-start on the solution times in [VCIP] in the BA model . . . 35
3.6 The impact of warm-start on the optimality gaps in [VCIP] in the BA model . . 35
3.7 The impact of warm-start on the optimality gaps in [NIP] in the ER model . . . 36
3.8 The impact of warm-start on the optimality gaps in [VCIP] in the ER model . . 36
3.9 The impact of warm-start on the optimality gaps in [NIP] in the WS model . . . 37
3.10 The impact of warm-start on the optimality gaps in [VCIP] in the WS model . . 37
3.11 The impact of warm-start on the solution times in [DNIP] in the WS model . . . 37
3.12 The impact of warm-start on the solution times in [DVCIP] in the WS model . . 37
3.13 Solution time comparison between [DNIP] and [DVCIP] in the BA model . . . . 42
3.14 Solution time comparison between [DNIP] and [DVCIP] in the ER model . . . . 42
3.15 Solution time comparison between [DNIP] and [DVCIP] in the WS model . . . . 42
3.16 The optimality gap comparisons in [NIP], [VCIP], [DNIP] and [DVCIP] in the BA model . . . 43
3.17 The optimality gap comparisons in [NIP], [VCIP], [DNIP] and [DVCIP] in the ER model . . . 43
3.18 The optimality gap comparisons in [NIP], [VCIP], [DNIP] and [DVCIP] in the WS model . . . 43

4.1 The transformation of Knapsack < s, v, C, V > to an instance < G(V,E), ℓ, p, θ > of Stochastic Pseudo-Star Degree Centrality on a tree . . . 49


4.2 The transformation of Knapsack < s, v, C, V > to an instance < G(V,E), ℓ, p, θ > of Stochastic Pseudo-Star Degree Centrality on a windmill graph . . . 52
4.3 The illustration of the Benders decomposition algorithm including logic-based Benders cuts . . . 62
4.4 Distribution of Interaction Scores in HP . . . . . . . . . . . . . . . . . . . . . . . 67
4.5 Distribution of Interaction Scores in SA . . . . . . . . . . . . . . . . . . . . . . . 67
4.6 Solution time comparison between BD-LB and BD-LB-WS . . . . . . . . . . . . 70
4.7 Optimality gap comparison between BD-LB and BD-LB-WS . . . . . . . . . . . 70
4.8 Solution time comparison between BD-TB and BD-TB-WS . . . . . . . . . . . . 70
4.9 Optimality gap comparison between BD-TB and BD-TB-WS . . . . . . . . . . . 70

5.1 Visualization of the transportation network in the North Slope . . . . . . . . . . 85
5.2 Illustration of the deprivation cost function . . . . . . . . . . . . . . . . . . . . . 87
5.3 Evacuees in community 1 at time 4 . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.4 Evacuees in community 2 at time 4 . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.5 Movements when resource demand is not satisfied . . . . . . . . . . . . . . . . . . 94
5.6 Movements when resource demand is satisfied . . . . . . . . . . . . . . . . . . . . 94
5.7 Evacuees in community 1 at time 5 . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.8 Evacuees in community 3 at time 5 . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.9 Incident locations selected on the Crystal Serenity's planned routes . . . . . . . . 109
5.10 Objective values in the baseline experiment . . . . . . . . . . . . . . . . . . . . . 115
5.11 The villages used in the baseline experiment . . . . . . . . . . . . . . . . . . . . . 115
5.12 The total objective values in the baseline experiment and Experiment 1 . . . . . 117
5.13 The deprivation costs incurred during travel in the baseline experiment and Experiment 1 . . . 117
5.14 The total objective values in the baseline experiment and Experiment 2 . . . . . 119
5.15 The percentage increase in the objective in Experiment 3 compared to the baseline experiment . . . 119
5.16 The number of evacuees who stayed in the C. ship at |T| in the baseline experiment and Exp. 3 . . . 120
5.17 The total number of tours completed by the ships in the baseline experiment and Exp. 3 . . . 120
5.18 The total objective values in Experiment 3 and Experiment 4 . . . . . . . . . . . 122


Chapter 1

Introduction

Operations research (OR) has been playing a significant role in our lives since its first uses in World War II and has applications ranging from the military, to the economy, to infrastructure analysis, to biological networks. The report Operations Research: A Catalyst for Engineering Grand Challenges (Sen et al., 2014) discusses OR as the catalyst for four major challenges: i) sustainability (e.g., providing low-cost solar energy and higher water quality), ii) security (e.g., ensuring cybersecurity and nuclear safety), iii) healthcare (e.g., offering improved health services and engineering higher-quality medicines), and iv) joy of living (e.g., building smart houses and creating better online recommendation systems). While OR integrates computational and mathematical tools to overcome challenges faced in real-world applications, its progress can be further enhanced through interdisciplinary studies. Well-known problems addressed by OR include the facility location problem, the sports scheduling problem, the blending problem, the cutting stock problem, the diet problem, and the vehicle routing problem.

Network design models constitute an important class of network optimization. In most cases, network design problems aim to identify optimal location selections (e.g., a warehouse, shelter, or distribution center) and allocation decisions (e.g., commodities, evacuees, or electricity) depending on the application area. Dynamic decision-making processes over a time horizon can also be modeled as network design problems and have been studied in the literature (Nurre et al., 2012; Garrett et al., 2017; Nguyen et al., 2020). Important applications include, but are not limited to, evacuation networks (Uster et al., 2018), transportation networks (Behbahani et al., 2019), supply chain networks (Saif and Elhedhli, 2016), multi-commodity flow networks (Paraskevopoulos et al., 2016), sensor networks (Keskin, 2017), and distribution networks (De Corte and Sorensen, 2016). In this dissertation, we direct our attention specifically to the area of biological networks, more specifically protein-protein interaction networks (PPINs), and to emergency response networks uniquely designed for a remote region, Arctic Alaska. We propose novel optimization models for problems in these application areas, which broadly fit into network design problems.

Researchers have also been working on solution methodologies to tackle network design problems due to their consistent popularity over the last century. We can group the solution methodologies into three categories: i) heuristic approaches, ii) approximation algorithms, and iii) exact solution methods. Since optimization models often carry high inherent computational complexity, heuristics are widely utilized to obtain good or near-optimal solutions. Some popular approaches include neighborhood search heuristics (Eskandarpour et al., 2017; Canca et al., 2017), Lagrangian-based heuristics (Fortz et al., 2017; Alkaabneh et al., 2019), column-generation-based heuristics (Crainic et al., 2016; Keskin, 2017), and meta-heuristics (SteadieSeifi et al., 2017; Govindan et al., 2019). In addition, there exists a wide range of network design studies in which the authors design approximation algorithms, which aim to approximate the optimal solution under the conjecture that P ≠ NP (Goemans et al., 1994; Ravi et al., 2001; Bley and Rezapour, 2016; Grimmer, 2018; Friggstad et al., 2019; Govindan et al., 2019). Lastly, exact solution methods are designed to tackle network design problems when reaching the optimum is preferred over obtaining a solution within a short amount of time. Decomposition algorithms, including Benders decomposition (BD) (Gabrel et al., 1999; Uster et al., 2007; Zetina et al., 2019), Lagrangian relaxation (Aykin, 1994; Gendron, 2019), and branch-and-cut algorithms (Alibeyg et al., 2018; Leitner et al., 2020), are highly utilized exact solution methods in network design problems.

In this dissertation, we mainly rely on BD to solve the models proposed for biological networks (i.e., PPINs) at scale. On the other hand, we utilize heuristic solution methods to solve an Arctic emergency response model due to its high complexity and the non-decomposable structure of its formulation. In other words, we approach the models related to PPINs from a computational optimization perspective that combines modeling with the design of exact solution methods. For the emergency response application, we focus on modeling and policy analysis perspectives and design heuristic solution methods that can offer solutions in real time.

BD is a remarkably popular solution method often used for large-scale mixed-integer linear programming models that possess a block structure (Benders, 1962). This solution method has proven quite effective on problems in which some variables are considered 'complicating,' i.e., once these variables are fixed, the remaining optimization problem can be solved efficiently. In this case, the complicating variables become part of the 'master problem' (MP), and the remaining optimization problem is referred to as the subproblem (SP). There is a set of decision variables in the MP which may appear in the SP as well. It should be noted that it is quite common to have multiple SPs, especially when the initial SP is separable; however, for this discussion and without loss of generality, we assume that there exists one SP. We also assume that we are concerned with a maximization problem. The algorithm first solves the MP and then proceeds to solve the SP with the complicating variables fixed at the values obtained in the last MP solution. If the fixed solution does not yield a feasible SP, a cutting plane called a feasibility cut is generated to eliminate the infeasible solution. If the SP turns out to be feasible, then i) an upper bound (UB) on the objective function is obtained by solving the MP, and ii) a lower bound (LB) on the objective is produced by evaluating the actual cost of the MP decisions in the SP at that iteration. An optimality cut is then added to the MP to ensure that the MP objective value of the last solution reflects its true objective rather than just an upper bound on that solution's objective. These steps are repeated iteratively until a user-defined convergence tolerance between the LB and UB is reached. We refer the reader to Benders (1962) and Geoffrion (1972) for further details.
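The iterative loop just described can be sketched on a toy maximization instance. The numbers below are purely illustrative (not from this dissertation), and the tiny master problem is solved by enumeration so that the sketch stays solver-free; the SP here is simple enough that its dual solution is known in closed form.

```python
# Toy Benders decomposition for:  max 5*y + 2*x
# s.t. x <= 4 - 3*y, x >= 0, y in {0, 1, 2}.
# The SP (over x, with y fixed) has dual price pi* = 2 when feasible.
M = 100.0                      # initial optimistic bound on eta
c, d, u, a = 5.0, 2.0, 4.0, 3.0
Y = [0, 1, 2]
opt_cuts = []                  # optimality cuts: eta <= g(y)
feas_cuts = []                 # feasibility cuts: f(y) >= 0

def solve_master():
    """Enumerate the tiny MP: max c*y + eta subject to all cuts so far."""
    best = None
    for y in Y:
        if any(f(y) < 0 for f in feas_cuts):
            continue           # y violates a feasibility cut
        eta = min([g(y) for g in opt_cuts], default=M)
        val = c * y + eta
        if best is None or val > best[0]:
            best = (val, y, eta)
    return best

UB, LB = float("inf"), float("-inf")
while UB - LB > 1e-6:
    UB, y, eta = solve_master()          # MP gives an upper bound
    slack = u - a * y                    # SP: max 2*x s.t. 0 <= x <= slack
    if slack < 0:
        # SP infeasible: dual ray pi = 1 yields the feasibility cut u - a*y >= 0
        feas_cuts.append(lambda yy: u - a * yy)
    else:
        # SP feasible: actual cost of the MP decision gives a lower bound,
        # and dual pi* = d yields the optimality cut eta <= d*(u - a*y)
        LB = max(LB, c * y + d * slack)
        opt_cuts.append(lambda yy: d * (u - a * yy))
```

On this instance the loop first proposes y = 2, cuts it off as infeasible, then tightens the optimistic bound on eta until LB and UB meet at y = 0 with objective value 8.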

Although BD is widely used to solve large-scale problems, it often requires extra effort to obtain fast convergence. In the traditional Benders approach, since the MP is solved from scratch every time a new cut is incorporated, the solver likely visits the same nodes over and over again during the branch-and-bound process. To overcome this challenge, modern Benders decomposition (Fischetti et al., 2016, 2017), where Benders cuts are added on the fly (if violated) when the solver identifies either incumbent or fractional solutions, has been commonly utilized. This is also called the branch-and-Benders-cut approach, implying that there exists only a single enumeration tree, in which the solver never revisits the same candidate nodes. Whenever the solver identifies an incumbent solution, a callback function (the generic callback function in CPLEX) is triggered, meaning that the branch-and-bound search is paused. If the incumbent solution overestimates the objective (i.e., underestimates it for a minimization problem), meaning that there is a cut violated by the integer solution, then Benders cuts are generated through the dual solutions. Along the same lines, at a non-integer solution before branching, the same function is used to generate a Benders cut separating the fractional solution. If no violated cut exists, then branching takes place as usual. However, cut generation at fractional solutions might not be as straightforward as the cut generation taking place at an incumbent solution, implying that extra effort, including employing heuristic approaches, might be necessary.
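The single-tree idea can be illustrated without a commercial solver by a small hand-rolled enumeration tree: whenever a leaf (an integer-feasible candidate) is reached, a callback-style check adds any violated Benders cut lazily and the same leaf is re-evaluated in place, rather than restarting the search. The objective coefficients and the closed-form subproblem value below are hypothetical toy data.

```python
# Schematic branch-and-Benders-cut: max 5*y1 + 4*y2 + eta over y in {0,1}^2,
# where the true subproblem value is q(y) = 6 - 3*(y1 + y2) (toy closed form)
# and eta is only constrained by lazily added cuts.
M = 100.0
profit = [5.0, 4.0]
q = lambda y: 6.0 - 3.0 * (y[0] + y[1])   # "solving the SP" at integer y
cuts = []                                  # lazy cuts: eta <= g(y)

def eta_bound(y):
    return min([g(y) for g in cuts], default=M)

best_val, best_y, cuts_added = float("-inf"), None, 0
stack = [[]]                               # one enumeration tree, DFS
while stack:
    node = stack.pop()
    if len(node) < 2:                      # branch on the next binary variable
        stack += [node + [0], node + [1]]
        continue
    y, eta = node, eta_bound(node)         # leaf: candidate incumbent (y, eta)
    if eta > q(y) + 1e-9:
        # "callback": eta overestimates the SP value, so add a Benders cut
        # on the fly and re-examine this leaf instead of rebuilding the tree
        cuts.append(lambda yy: 6.0 - 3.0 * (yy[0] + yy[1]))
        cuts_added += 1
        eta = eta_bound(y)
    val = profit[0] * y[0] + profit[1] * y[1] + eta
    if val > best_val:
        best_val, best_y = val, y
```

Here a single cut, generated at the first incumbent, is exact for every y, so the search finishes with y = (1, 1) and objective 9 after one lazy-cut call; in a real implementation the solver's callback mechanism plays the role of the explicit check at each leaf.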

In the literature, there are several acceleration techniques for BD, one of which is utilizing valid inequalities based on constraint tightening, with which MPs are solved more efficiently (Sherali et al., 2010; Taskın et al., 2012; Frank and Rebennack, 2015). Providing initial bounds on the objective value in the MP also plays an important role in reaching faster convergence of the selected solution method. Helpful methods in that sense include introducing valid inequalities (Ahat et al., 2017), solving the relaxed version of the model (Chen and Miller-Hooks, 2012), using Lagrangian relaxation (Holmberg, 1994), and employing heuristic approaches (Contreras et al., 2011). It has also been shown that tuning certain solver parameters when solving the MP might yield faster convergence (Bai and Rubin, 2009; Botton et al., 2013; Dalal and Uster, 2017); it should be noted that, in most cases, changing the default settings does not provide significant improvement in solution times when solving the original model via the branch-and-bound process. Lastly, warm-starting the solution method can also be helpful as it supplies incumbent solutions, especially if the method is struggling to identify such solutions. Several warm-starting methods have been shown to be effective strategies: extreme points or valid cuts might be generated by solving the relaxed primal SP (Adulyasak et al., 2015), by deflecting the current master solution (Rahmaniani et al., 2018), or by designing meta-heuristic algorithms (Emde et al., 2020).

Moreover, BD can be utilized even if the SP is not a linear programming (LP) model. In this regard, Logic-based Benders Decomposition (LBBD) was formally introduced by Hooker and Ottosson (2003). It distinguishes itself from traditional BD by using the inference dual rather than LP duality to generate the Benders cuts that eliminate infeasible MP solutions. Although it has been predominantly used for scheduling problems (Hooker, 2007; Roshanaei et al., 2017; Emde et al., 2020; Guo et al., 2021), it has recently been adapted to plant location (Fazel-Zarandi and Beck, 2012), route planning (Kloimullner and Raidl, 2017), and network interdiction (Enayaty-Ahangar et al., 2019) problems as well.

The remainder of this dissertation is outlined as follows. In Chapter 2, we introduce the star degree centrality (SDC) problem, which tasks itself with identifying an induced star with the largest open neighborhood, together with its stochastic variant, the stochastic pseudo-star degree centrality (SPSDC) problem. In Chapter 3, we focus on the deterministic version and introduce a new integer programming (IP) formulation, which is more compact than the existing IP formulation in the literature in terms of the number of constraints and non-zero coefficients. Next, we present a complexity discussion in which the SDC problem is examined on certain network types. We then propose a BD framework and conduct extensive experimental studies on both randomly generated networks and real-world PPINs. In the next chapter, we propose a non-linear binary optimization model and provide complexity discussions for the SPSDC problem (see Chapter 4). We first convert the model into a linear form and design a BD framework which contains optimality cuts as well as both traditional and logic-based Benders feasibility cuts. In addition, a wide range of computational experiments is presented based on randomly generated small-world networks. Chapter 5 focuses on Arctic mass rescue events, motivated by the entrance of large cruise ships into the region in the last decade. We first provide a comprehensive overview of the background of Arctic Alaska. We discuss the changes and challenges that have occurred in the past and are to occur in the future due to environmental, geographical, political, as well as economic factors in the region. We then provide a literature review related to our work, followed by a formal problem definition containing our modelling assumptions and objective components. We next introduce an IP formulation, which includes both transportation and logistics decisions, and discuss the solution methodology created to solve this problem. We conclude this work by conducting an extensive 'what-if' analysis and answering several policy questions. Lastly, we present the conclusion and a summary of this dissertation in Chapter 6.


Chapter 2

Star Degree Centrality: Definitions and Problem Statements

In this chapter, we first review star graph terminology and then detail the centrality concept, a well-recognized metric in graph theory and network analysis. After introducing protein-protein interaction networks, we discuss both the star degree centrality and the stochastic pseudo-star degree centrality problems (see Section 2.1). Lastly, in Section 2.2, we provide formal problem definitions together with illustrative examples detailing how the star degree centrality mechanism and its two variants work.

2.1 Introduction

A star graph can be defined as a tree graph with a maximum diameter of two, where the diameter is defined as the maximum distance between any two nodes (see Fig. 2.1). Different variations of star graphs have been drawing researchers' attention since the late 1980s. Akers and Krishnamurthy (1989) were the first to introduce the notion of a star graph as a new class of networks. Day and Tripathi (1992) expanded this idea to generalized (n, k)-star graphs, where n and k are user-defined values tuning the number of nodes and the degree/diameter trade-off. The idea was then taken up by Akers et al. (1994), who proposed star graphs as an alternative to hypercube structures. Afterwards, Chou et al. (1996) proposed bubble-sort star graphs as a new interconnection network structure. Past and recent studies heavily focus on the topological and functional analysis of star graphs (Chiang and Chen, 1998; Lin et al., 2020; Li et al., 2020a).

Figure 2.1: Examples of star graphs

S_{1,2}   S_{1,3}   S_{1,4}   S_{1,5}   S_{1,6}

Centrality, on the other hand, is one of the best-studied concepts in network analysis. It has

been used in a variety of applications to quantify the importance of nodes or entities in a network.

The main idea is that the more central a node is, the more importance it has. Expectedly, not

every measure of importance is equally valid in every application. Hence, a series of simpler or

more complex notions of centrality have been proposed over the years. They range from the early

work by Bavelas (1948, 1950) and Leavitt (1951) on task-oriented group creation, as well as the

introduction of eigenvector and bargaining centrality by Bonacich (1972, 1987), to more recent ideas

about subgraph (Estrada and Rodríguez-Velázquez, 2005), residual (Dangalchev, 2006) or diffusion

(Banerjee et al., 2013) centrality. In this dissertation, we turn our focus to a concept referred to as

group centrality (Everett and Borgatti, 1999).

In a fundamental contribution, Freeman (1978) examined three distinct and recurring con-

cepts in centrality studies, namely degree, betweenness, and closeness. The basic definitions involved

with each of the concepts are as follows. Degree is related to the number of connections that a node

has (i.e., number of nodes adjacent to a given node i, often normalized by the number of nodes in

the network minus 1); betweenness can be quantified as the fraction of shortest (geodesic) paths

that use a specific node i; finally, closeness is a function of the shortest (geodesic) paths that a node

i has to every other node in the network. A common theme behind the above definitions is their

focus on a specific node.

Group extensions to centrality have been proposed to help address questions of importance for a group as a whole, as well as to distinguish importance that can be attributed to a node versus the group it belongs to. This idea was presented by Everett and Borgatti (1999, 2005)

and was immediately picked up and expanded upon by a series of researchers. Prominent extensions

include the definition of clique (cohesive subgroup) centrality (Vogiatzis et al., 2015; Rysz et al., 2018; Nasirian et al., 2020). Identifying a general group of nodes with the highest betweenness centrality is also studied by Veremyev et al. (2017), who also mention the possibility of introducing additional "cohesiveness" constraints.

More specifically, we study the recently introduced measure of star degree centrality (SDC) by Vogiatzis and Camur (2019), where SDC has been shown to be a highly efficient centrality metric for identifying essential proteins in protein-protein interaction networks (PPINs). The results indicate that it performs better than other well-known metrics (i.e., degree, closeness, betweenness, and eigenvector centrality) in determining essential proteins. The contributions of Vogiatzis and Camur (2019) are approximation algorithms for finding nodes with high SDC, whereas we contribute to the literature by providing exact solution approaches that are able to solve problems of significant size.

The SDC tasks itself with identifying the induced star centered at a given node i that

possesses the maximum cardinality open neighborhood. An induced star centered at i will include i

and a subset of its neighbors as part of the star under the condition that no two neighbors in the star

have an edge between them. The open neighborhood is the set of all nodes not in the induced star

that are adjacent to a node in the induced star. Vogiatzis and Camur (2019) study the problem in

the context of a PPIN. The authors derive the computational complexity of the problem and show it

is NP-hard; additionally, they provide an integer programming (IP) formulation and approximation

algorithms to solve it efficiently. More importantly, they show that this is indeed a viable proxy

for predicting essentiality in PPINs. Essential genes (and their essential proteins) are ones whose

absence leads to lethality or the inability of an organism to properly reproduce itself (Kamath

et al., 2003). Thus, identifying the node with the highest star degree centrality finds an important

application in PPINs.

PPINs are networks where nodes represent proteins and edges represent protein-protein interactions. Each edge is associated with an interaction score indicating the strength of the interaction

where a higher score implies a stronger interaction. These networks have been heavily studied over

the last two decades: for a series of surveys on computational methods for complex detection, clus-

tering, detecting essentiality, among others, in PPINs, we refer the interested reader to the recent

reviews by Wang et al. (2013); Bhowmick and Seah (2015), and Rasti and Vogiatzis (2019). Cen-

trality has been a staple in the study of biological networks, and specifically PPINs: CentiServer

(Jalili et al., 2015) is a database that has collected a large number of centrality-based approaches


for biological networks at https://www.centiserver.org.

Jeong et al. (2001) proposed the “lethality-centrality” rule, in which the more central a

protein is, the higher the probability it is essential. This work led to significant research interest in

centrality metrics in PPINs (see the works by Joy et al. (2005) on betweenness, Estrada (2006) on

subgraph centrality, Wuchty and Stadler (2003) on closeness centrality). An updated survey and

comparison of 27 commonly used centrality metrics (including degree, betweenness, and closeness)

is presented in the work by Ashtiani et al. (2018).

At this point, we should mention that the high computational complexity of the problem in PPINs did not allow Vogiatzis and Camur (2019) to conduct a full analysis across an entire network. For this reason, they used two different approaches to simplify the problem: i) setting extremely high thresholds to prune the edges in the networks, and ii) utilizing a probabilistic approach to create the interactions between the proteins. In addition, their essential protein analysis is performed by selecting the top k proteins (where k is a user-defined value), for each of which an individual IP is solved assuming that protein as the center. On the other hand, our decomposition implementation opens the door to a full analysis of large-scale networks by being able to identify the node with the highest SDC across the entire network. Our computational results indicate that we can avoid using high thresholds to perform analysis in real-world PPINs.

Furthermore, we introduce the stochastic pseudo-star degree centrality (SPSDC) problem, where the goal is to detect an induced pseudo-star that is truly a star with high probability; here, 'high probability' accounts for a) the probability that the center has an edge to each leaf node, and b) the probability that there are no edges between leaf nodes. The objective is to maximize the connection probability of each neighbor node to the pseudo-star. From an application perspective, the SPSDC metric may help to identify new proteins that should be investigated to determine their essentiality (see Section 2.2). It may also help to confirm that essential proteins identified through the SDC metric are important.

In PPINs, there exist interaction scores that represent the strength of the interaction between two proteins. In fact, one can normalize the interaction scores and treat them as probability values that indicate the likelihood of two proteins interacting. Our first goal is to ensure that i) the probability values between the center node and each leaf node are high, and ii) the probability values between each pair of connected leaf nodes are low, in order to ensure feasibility (i.e., the existence of a star). It is crucial to point out that we now allow leaf nodes to be connected as long as the induced pseudo-star satisfies the "feasibility condition", which will be introduced shortly. Therefore, we use the term pseudo-star rather than star.

In the SPSDC problem, the main objective is to assign each neighbor node to a single pseudo-star element (i.e., either the center or a leaf) that yields the largest probability value. In other words, our goal is to maximize the sum, over neighborhood nodes, of the maximum probability of the connection between a neighborhood node and the pseudo-star. This offers one potential way to evaluate the centrality of the pseudo-star; different metrics could be applied in the future.

2.2 Problem Definitions

Let G = (V, E) be an undirected graph consisting of a vertex set V and an edge set E, where |V| = n and |E| = m∗. We define the open neighborhood of a node i ∈ V as the set of nodes adjacent to i; in other words, N(i) = {j ∈ V : (i, j) ∈ E}. Similarly, the closed neighborhood of a node i ∈ V is defined as N[i] = N(i) ∪ {i}. For a set of nodes S, we define the open neighborhood as N(S) = {j ∈ V : i ∈ S, j ∉ S, (i, j) ∈ E}. Additionally, we define the k-neighborhood of a node i ∈ V as the set of nodes whose shortest path from i is exactly k edges and denote it as N_k(i). In other words, N_k(i) represents the set of nodes whose shortest path from i uses exactly k edge hops. Note that N_k(i) ∩ N_{k+1}(i) = ∅ for all k ≤ K, where k ∈ Z+ and K is the length of the longest shortest path from node i to any other node in the network. Finally, we let p_ij represent the probability of existence of each edge (i, j) ∈ E.
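These neighborhood definitions can be made concrete with a short breadth-first-search routine. The sketch below computes N(i), N[i], and the k-neighborhoods N_k(i) on a small graph; the graph and its node labels are hypothetical, chosen only for illustration.

```python
from collections import deque

def neighborhoods(adj, i):
    """Return N(i), N[i], and the k-neighborhoods {k: N_k(i)} of node i."""
    open_nb = set(adj[i])             # N(i): nodes adjacent to i
    closed_nb = open_nb | {i}         # N[i] = N(i) union {i}
    dist = {i: 0}
    q = deque([i])
    while q:                          # BFS yields shortest-path distances
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    levels = {}
    for v, d in dist.items():
        if d > 0:                     # N_k(i): nodes at distance exactly k
            levels.setdefault(d, set()).add(v)
    return open_nb, closed_nb, levels

# hypothetical example: a path a-b-c-d plus an edge a-e
adj = {"a": {"b", "e"}, "b": {"a", "c"}, "c": {"b", "d"},
       "d": {"c"}, "e": {"a"}}
```

By construction the levels partition the reachable nodes, so N_k(i) ∩ N_{k+1}(i) = ∅ holds automatically.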

Definition 1. The star degree centrality of node i, represented by D_i, is a centrality metric which aims to form an induced star S_i centered at i with the largest open neighborhood, where D_i = max{|N(S_i)| : S_i is an induced star}.

In the deterministic setting, we are not concerned with the probability values; in other words, we assume that all edges in the network exist with probability one.

Example 1. Below we present a small example showing how to identify the SDC of a given node (see Fig. 2.2). We select node c as the candidate center. First, note that N(c) = {l1, l2} represents the set of candidate leaf nodes. In a deterministic induced star, no two leaf nodes can be connected; therefore, l1 and l2 cannot be elements of the same star. Since the objective is to maximize the open neighborhood of the induced star, node l2 is preferable over node l1, as it gives access to more nodes. Thus, we obtain Sc = {c, l2} and N(Sc) = {l1, n3, n4, n5}.

∗We will redefine notation as needed in each chapter. We will be consistent in terms of not changing the notation used for a specific definition.

Figure 2.2: Determining the star degree centrality of a given node, where the center and leaf nodes are shown in red and blue, respectively.

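The mechanism of Example 1 can be checked by brute force: enumerate all independent subsets of N(c) as candidate leaf sets and score each induced star by the size of its open neighborhood. The edge list below is our reconstruction of Fig. 2.2 from the text of Example 1, so it should be read as an assumption.

```python
from itertools import combinations

# edge list reconstructed from Example 1 / Fig. 2.2 (an assumption on our part)
edges = [("c", "l1"), ("c", "l2"), ("l1", "l2"), ("l1", "n1"),
         ("l1", "n2"), ("l2", "n3"), ("l2", "n4"), ("l2", "n5")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def star_degree_centrality(c):
    """D_c = max |N(S_c)| over all induced stars S_c centered at c."""
    best, best_star = -1, None
    nbrs = sorted(adj[c])
    for r in range(len(nbrs) + 1):
        for leaves in combinations(nbrs, r):
            # leaves must be pairwise non-adjacent for the star to be induced
            if any(v in adj[u] for u, v in combinations(leaves, 2)):
                continue
            star = {c, *leaves}
            open_nbhd = {w for u in star for w in adj[u]} - star
            if len(open_nbhd) > best:
                best, best_star = len(open_nbhd), star
    return best, best_star
```

On this reconstructed graph the routine returns D_c = 4 with S_c = {c, l2}, matching the example; the exponential enumeration is only viable for tiny instances, which is precisely why the chapters that follow develop IP formulations and decomposition methods.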

Figure 2.3: An example of why a star structure helps identify essential proteins. In this figure, we present a subgraph of the PPIN of Saccharomyces cerevisiae (yeast) using a threshold of 92%. The node in red corresponds to non-essential protein YMR300C and is the node of highest degree; the node in green corresponds to essential protein YHL011C and is the node of highest star degree centrality.

Example 2. In Fig. 2.3, we present some of the notions in this work using a real-life example from the yeast proteome (Saccharomyces cerevisiae), keeping only interactions above a threshold of 92% (so that the induced subgraph is sparse enough for visualization purposes).

The protein with the highest degree centrality is YMR300C (marked in red); despite its central location and its many documented interactions, it is not essential. We observe that YMR300C is adjacent to two main protein complexes (dense subgraphs). This means that many of the connections that YMR300C has to other nodes are also shared among those nodes themselves. Hence, if we were to discard connections between neighbors (that is, if we enforced a "star" constraint), its importance would surely decrease.


On the other hand, the protein with the highest star degree centrality is YHL011C (marked in green), an essential protein for many cell activities, as it is used to synthesize phosphoribosyl pyrophosphate. We observe that while its degree centrality is small (it has only 7 neighbors, compared to a degree centrality of 23 for YMR300C), it is adjacent to nodes that connect different protein complexes and communities.

We now move to the SPSDC problem and first formally define the feasibility condition. For

a given pseudo-star Sk centered at node k, let L be the set of leaf nodes. Also, let θ ∈ [0, 1] be a

user-defined value.

Definition 2. Given a pseudo-star S_k, the feasibility condition is defined as

∏_{j∈L} p_kj · ∏_{i,j∈L:(i,j)∈E} (1 − p_ij) ≥ 1 − θ,   (2.1)

where the first product term focuses on the probability that an edge exists between the center and each leaf node, and the second product term focuses on the probability that no edge is realized between adjacent leaf nodes. We can apply a log transformation (i.e., take the logarithm of both sides) to replace the products in Ineq. (2.1) with sums and obtain an equivalent expression:

∑_{j∈L} log(p_kj) + ∑_{i,j∈L:(i,j)∈E} log(1 − p_ij) ≥ log(1 − θ).   (2.2)

Definition 3. The stochastic pseudo-star degree centrality of node i, represented by D_i, is a centrality metric which aims to form an induced pseudo-star S_i centered at i maximizing the sum, over the star's neighbors, of each neighbor's maximum connection probability to the pseudo-star, where

D_i = max{ ∑_{j∈N(S_i)} max_{k∈S_i} p_kj : S_i is an induced pseudo-star satisfying the feasibility condition (2.1) }.

Example 3. In Fig. 2.4, we provide an example of how the SPSDC mechanism works, where the probability values are shown on the edges and θ is given as 0.2. Considering node c as the center, we can first create a candidate pseudo-star where node l1 is the only leaf node (see the figure on the left). In this scenario, the feasibility condition is satisfied since 0.99 ≥ 1 − θ, and we obtain an objective of 0.5 + 0.5 + 0.8 = 1.8. Note that node l2 is assigned to node c, since c provides a stronger connection than node l1 does (i.e., 0.8 vs. 0.01).

However, the probability values associated with the edges between the center node and nodes l1 and l2 are relatively large. Also, even though nodes l1 and l2 share an edge, the corresponding

Figure 2.4: Determining the stochastic pseudo-star degree centrality of a given node, where the center and leaf nodes are shown in red and blue, respectively.

[Both panels show the same six-node network (c, l1, l2, n1, n2, n3) with edge probabilities 0.01, 0.5, 0.5, 0.8, 0.99, and 0.99; the left and right panels highlight the two candidate pseudo-stars discussed in Example 3.]

probability value between those two nodes shows that they are highly unlikely to interact. Therefore, we can create an alternative induced pseudo-star centered at c where the leaf nodes are l1 and l2 (see the figure on the right). Such a pseudo-star still satisfies the feasibility condition (i.e., 0.99 ∗ 0.8 ∗ (1 − 0.01) = 0.8821 > 1 − θ). In addition, it yields a better objective, calculated as 0.99 + 0.5 + 0.5 = 1.99.
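The two objective values in Example 3 can be reproduced with a few lines of code, crediting each open-neighborhood node with its strongest link into the pseudo-star. The edge probabilities below are our reconstruction of Fig. 2.4, chosen so that both scenario objectives match the text, and should be treated as an assumption.

```python
# edge probabilities reconstructed from Example 3 / Fig. 2.4 (an assumption)
prob = {("c", "l1"): 0.99, ("c", "l2"): 0.8, ("l1", "l2"): 0.01,
        ("l1", "n1"): 0.5, ("l1", "n2"): 0.5, ("l2", "n3"): 0.99}

def spsdc_objective(star, prob):
    """Sum over open-neighborhood nodes of the max probability into the star."""
    adj_p = {}
    for (u, v), p in prob.items():
        adj_p.setdefault(u, {})[v] = p
        adj_p.setdefault(v, {})[u] = p
    # open neighborhood: nodes adjacent to the star but not in it
    neighbors = {w for u in star for w in adj_p[u]} - set(star)
    # each neighbor is assigned to the star element it connects to most strongly
    return sum(max(p for k, p in adj_p[j].items() if k in star)
               for j in neighbors)
```

With this data, `spsdc_objective({"c", "l1"}, prob)` evaluates to 1.8 (the left panel) and `spsdc_objective({"c", "l1", "l2"}, prob)` to 1.99 (the right panel), matching the example.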

It is important to mention that there is no guarantee that a pseudo-star yields a better deterministic objective (i.e., a larger open neighborhood) than the deterministic induced star, since the threshold used in the feasibility condition impacts the size of the pseudo-star. Therefore, a fair comparison cannot be made between the SDC and the SPSDC even if they are associated with the same objective function. However, the goal of each of these problems in our motivating application is to identify essential proteins; therefore, it may be that each of their solutions helps to diversify the set of proteins that should be investigated to determine their essentiality, or to confirm the likelihood of certain proteins being essential (i.e., if they appear in the solutions of both the SDC and SPSDC problems).

The advantages of the SPSDC over the deterministic counterpart studied before are twofold. First and foremost, it allows us to solve the problem in a PPIN without the need to trim edges based on their probability of existence. In the deterministic version, a threshold is employed to remove edges below a certain probability of existence. This can be problematic, as certain pairs of proteins may interact with high probabilities that fall just below the threshold and hence get removed, while other edges just above the threshold are considered present. As an example showcasing the success of the SPSDC in PPINs, we point the reader's attention to Figure 2.5. There we show two pseudo-stars, one obtained for a non-essential protein (i.e., YBL072C or RPS8A) and one for an essential protein (i.e., YAL001C or TFC3) in Saccharomyces cerevisiae, a species of yeast, at different zoom levels in the first three and last three images, respectively. The pseudo-star obtained with the essential protein as the center leads to a higher overall objective function value than the pseudo-star obtained for the non-essential protein. On the other hand, had we employed a threshold of 60% (i.e., removing all edges with likelihood smaller than 60%), the objective function values of the two stars would be reversed, with the non-essential one possessing the higher value.

Figure 2.5: In this example, we show the pseudo-stars with maximum objective function value obtained for a non-essential and an essential protein in a real-world PPIN, Saccharomyces cerevisiae. The network consists of a giant connected component that includes 6,416 nodes out of the 6,418 proteins documented in STRING-DB (Szklarczyk et al., 2015), and 939,997 edges of varying reliability (probability of existence). The networks presented are the full connected component that contains the two proteins (left), a zoomed-in perspective (middle), and an even more zoomed-in perspective showing the pseudo-star centered at each protein (right). The pseudo-stars are obtained with α = 0.99; in other words, they form induced stars with probability 1%. To show the two pseudo-stars, we mark the center in red and the leaves in yellow; edges from the center to the leaves are solid, whereas edges connecting two leaves are dashed. The pseudo-star obtained for the essential protein (see 4-6) leads to a higher objective function value (equal to 1293.72) than the value obtained for the pseudo-star centered at the non-essential protein (see 1-3), which is equal to 1002.57.

1 - The SPSDC of a non-essential protein. 2 - The SPSDC of a non-essential protein (zoomed in). 3 - The SPSDC of a non-essential protein (more zoomed in). 4 - The SPSDC of an essential protein. 5 - The SPSDC of an essential protein (zoomed in). 6 - The SPSDC of an essential protein (more zoomed in).

We conclude this chapter by providing the definitions of the DSDC and SPSDC problems (see Definitions 4 and 5, respectively). We examine each problem in detail in Chapters 3 and 4, in turn.

Definition 4. The deterministic star degree centrality problem aims to identify the node which has

the largest star degree centrality in a given network.


Definition 5. The stochastic pseudo-star degree centrality problem aims to identify the node which

has the largest stochastic pseudo-star degree centrality in a given network.


Chapter 3

The Star Degree Centrality∗

In this chapter, we provide IP formulations for the SDC problem. We begin the discussion in Section 3.1 with the previously introduced formulation of Vogiatzis and Camur (2019) and then proceed to propose a new, compact formulation. Section 3.2 presents classes of networks on which the problem is solvable in polynomial time and offers a new proof of NP-completeness showing that the problem remains NP-complete on bipartite and split graphs (thus tightening the complexity analysis of Vogiatzis and Camur (2019)). In Section 3.3, we provide a branch-and-cut implementation motivated by Benders decomposition for solving the problem on real-life, large-scale networks, such as those typically encountered in computational biology and specifically in PPINs. Section 3.4 discusses acceleration techniques utilized to speed up our implementation. All our algorithmic efforts are put to the test in Section 3.5, which is divided into two subsections covering randomly generated instances and protein-protein interaction network instances. We conclude with a summary of our findings and recommendations for future work in Section 3.6.

3.1 Mathematical Formulations

First, we present the formulation that appears in the literature (the Vogiatzis and Camur (2019) integer programming (VCIP) formulation). Then, we introduce a new formulation, which is more compact in theory with respect to the number of constraints. In the original formulation, there are three sets of binary variables: (i) x_i equals 1 if and only if i ∈ V is the center of the star, (ii) y_i equals 1 if node i is in the star, and (iii) z_i equals 1 if node i is in the open neighborhood of the star. The IP model is provided in (3.1).

∗The paper has been accepted at INFORMS Journal on Computing.

[VCIP]:

max  ∑_{i∈V} z_i                                       (3.1a)
s.t. y_i + z_i ≤ 1,                   ∀i ∈ V           (3.1b)
     z_i ≤ ∑_{j∈N(i)} y_j,            ∀i ∈ V           (3.1c)
     y_i ≤ ∑_{j∈N[i]} x_j,            ∀i ∈ V           (3.1d)
     x_i ≤ y_i,                       ∀i ∈ V           (3.1e)
     y_i + y_j ≤ 1 + x_i + x_j,       ∀(i, j) ∈ E      (3.1f)
     ∑_{i∈V} x_i = 1,                                   (3.1g)
     x_i, y_i, z_i ∈ {0, 1},          ∀i ∈ V.          (3.1h)

The objective function (3.1a) maximizes the number of nodes adjacent to the star. Constraints (3.1b) indicate that no node can be both in the star and in its neighborhood. Constraints (3.1c) ensure that for a node to be a neighbor of the star, it must be adjacent to at least one node in the star. In addition, every node in the star must be in the closed neighborhood (i.e., the neighborhood containing the node itself) of the center node by constraints (3.1d). We should point out that constraints (3.1e), ensuring that the center node is part of the star, were absent in the printed version of Vogiatzis and Camur (2019). Constraints (3.1f) prevent two adjacent nodes from being in the star if neither is the center. These are computationally the most expensive constraints, since one must appear for every edge. Constraint (3.1g) makes sure that the model identifies a single star by selecting one center node. Last, constraints (3.1h) dictate the binary requirements for each variable. Note that there is a total of 4n + m + 1 constraints in [VCIP]. Further, we can examine the number of total non-zero coefficients across each type of constraint: (3.1b) has 2n; (3.1c) has n + 2m; (3.1d) has 2n + 2m (since i ∈ N[i]); (3.1e) has 2n; (3.1f) has 4m; and (3.1g) has n. These sum to a total of 8n + 8m non-zero coefficients.

In the former formulation [VCIP], though there is a specific variable used for the center node (i.e., x_i), variable y_i corresponds to any node in the star without making any distinction. An important observation is that leaf nodes in a star carry a unique characteristic which differentiates them from the center node: while a leaf node has solely one edge connecting it to the star, via the center node, the center node shares an edge with every leaf node. Hence, we remove variable y_i and introduce a new variable to represent the leaf nodes:

l_i = 1 if node i ∈ V is a leaf of the star, and l_i = 0 otherwise.

After this conversion, we can remodel the problem with a new IP (NIP) formulation.

[NIP]:

max  ∑_{i∈V} z_i                                        (3.2a)
s.t. x_i + l_i + z_i ≤ 1,                ∀i ∈ V         (3.2b)
     z_i ≤ ∑_{j∈N(i)} (l_j + x_j),       ∀i ∈ V         (3.2c)
     l_i ≤ ∑_{j∈N(i)} x_j,               ∀i ∈ V         (3.2d)
     ∑_{j∈N(i)} l_j ≤ |N(i)|(1 − l_i),   ∀i ∈ V         (3.2e)
     ∑_{i∈V} x_i = 1,                                    (3.2f)
     x_i, l_i, z_i ∈ {0, 1},             ∀i ∈ V.        (3.2g)

First of all, the objective (3.2a) and constraints (3.2f) and (3.2g) correspond to (3.1a), (3.1g), and (3.1h), respectively. Constraints (3.2b) guarantee that a node cannot be the center, a leaf, and a neighbor of the star at the same time, similar to the original constraints (3.1b). Constraints (3.2c) replace (3.1c) and indicate that a node adjacent to the star must be adjacent to either the center node or at least one of the leaf nodes. Each leaf node is connected to the center node to form a feasible star, which is enforced by constraints (3.2d). With the new variable definition (i.e., l_i), we eliminate two sets of constraints (namely, (3.1e) and (3.1f)) and no longer need to account for all edges in the graph. Constraints (3.2e) state that if a node is selected as a leaf, none of the nodes adjacent to it can also be a leaf node. Note that there is a total of 4n + 1 constraints in [NIP]. Further, we can examine the number of total non-zero coefficients across each type of constraint: (3.2b) has 3n; (3.2c) has n + 4m; (3.2d) has n + 2m; (3.2e) has n + 2m; and (3.2f) has n. These sum to a total of 7n + 8m non-zero coefficients.
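The constraint and non-zero counts above can be sanity-checked programmatically. The sketch below tallies, row by row, the counts implied by the two formulations for an arbitrary graph and compares them against the closed-form totals; it is a counting exercise only, with no solver involved, and the example graph is hypothetical.

```python
def formulation_stats(n, edges):
    """Count rows and non-zero coefficients of [VCIP] and [NIP]."""
    m = len(edges)
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    # [VCIP] rows: (1b)-(1e) once per node, (1f) once per edge, (1g) once
    vcip_rows = 4 * n + m + 1
    # non-zeros per node: (1b) 2, (1c) 1+deg, (1d) 2+deg, (1e) 2;
    # plus (1f) 4 per edge and (1g) n
    vcip_nnz = sum(2 + (1 + d) + (2 + d) + 2 for d in deg) + 4 * m + n
    # [NIP] rows: (2b)-(2e) once per node, (2f) once
    nip_rows = 4 * n + 1
    # non-zeros per node: (2b) 3, (2c) 1+2*deg, (2d) 1+deg, (2e) 1+deg;
    # plus (2f) n
    nip_nnz = sum(3 + (1 + 2 * d) + (1 + d) + (1 + d) for d in deg) + n
    return vcip_rows, vcip_nnz, nip_rows, nip_nnz

# hypothetical example: a path on 5 nodes (n = 5, m = 4)
stats = formulation_stats(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
```

Since the degree sum equals 2m, the per-node tallies collapse to exactly 8n + 8m and 7n + 8m, confirming the totals stated above for any graph.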

We now examine the tightness of the linear programming (LP) relaxations of these two

formulations.

Theorem 1. The LP relaxation of [VCIP] is stronger than the LP relaxation of [NIP].

Proof. Given two LP formulations LP_i and LP_j, let P_i and P_j be the polyhedra defined by LP_i and LP_j, respectively. LP_j is said to be stronger than LP_i if i) there exists at least one instance and one point contained in P_i but not in P_j, and ii) all points contained in P_j are also contained in P_i.

First of all, note that constraints (1g) and (2f) are equivalent and do not need an explicit comparison. Now, let li = yi − xi, ∀i ∈ V be the mapping between the variables of LP[VCIP] and LP[NIP]. When replacing each li by yi − xi in LP[NIP], it is straightforward to see that constraints (1b) and (1c) imply constraints (2b) and (2c), respectively. When we replace yi by li + xi in constraints (1d), they imply constraints (2d), since yi = li + xi ≤ ∑_{j∈N[i]} xj =⇒ li ≤ −xi + ∑_{j∈N[i]} xj = ∑_{j∈N(i)} xj. In addition, constraints (1e) imply the non-negativity of the variables li, due to the fact that xi ≤ yi =⇒ 0 ≤ yi − xi = li. If we rearrange constraints (1f) based on the mapping, we obtain li + lj ≤ 1, ∀(i, j) ∈ E. For a given node i, we then write out constraints (1f) over its neighbors and aggregate them:

(li + lj1) + · · · + (li + lj|N(i)|) ≤ |N(i)| =⇒ ∑_{j∈N(i)} lj ≤ |N(i)|(1 − li)

It can be seen that constraints (1f) imply constraints (2e) with a slight modification. Therefore, we can conclude that every point contained in the polyhedron generated by LP[VCIP] is also contained in the polyhedron generated by LP[NIP]; in other words, OBJ_LP[VCIP] ≤ OBJ_LP[NIP].

Below we present a counterexample in which a solution produced by LP[NIP] cannot be converted to a feasible solution in LP[VCIP].

For this example (see Figure 3.1), LP[NIP] sets x3, x4, and x5 to 0.2, 0.2, and 0.6, respectively, while the leaf variables of the same nodes (i.e., li) are set to 1 − xi for i = 3, 4, 5 in an optimal solution. As a result, the objective value becomes nine. On the other hand, since nodes 3 and 4 share an edge, the same solution becomes infeasible in LP[VCIP] due to constraints (1f) (i.e., 1.6 ≰ 1.4). The


Figure 3.1: A counterexample where the optimal solution obtained in LP[NIP] cannot be converted to a feasible solution in LP[VCIP].

solver returns 8.5 as the optimal objective value in LP[VCIP]. Hence, we can conclude that [VCIP] is a tighter formulation than [NIP] with respect to LP relaxations.

Even though [VCIP] is a stronger formulation than [NIP] in terms of the LP relaxation, we observe that the constraint set of [VCIP] is bounded by O(n + m), whereas the new formulation [NIP] has a constraint set bounded by O(n). Furthermore, the number of non-zero coefficients is slightly higher in [VCIP] (i.e., 8n + 8m) compared to [NIP] (i.e., 7n + 8m). It is worth mentioning that the number of non-zero coefficients can be reduced with a constraint tightening in [NIP], which is discussed in Section 3.4.1. All of these factors may impact the computational performance of solving these problems. This is further examined in Section 3.5, where we demonstrate that [NIP] is the foundation for more efficient methods to solve the problem.

3.2 Complexity Discussion

The SDC problem over general graphs was shown to be NP-complete by Vogiatzis and Camur (2019). In this section, we provide graph classes on which the SDC problem can be solved in polynomial time and prove that the SDC problem remains NP-complete on certain other networks.

3.2.1 Polynomial-Time Cases

Theorem 2. The SDC problem is solvable in polynomial time on trees.

Proof. We propose Algorithm 1, which identifies an optimal induced star with a maximum-size


neighborhood in O(m) time for a tree. For the sake of simplicity, we assume that the given graph is

connected and n ≥ 3. The algorithm goes through each edge (i, j) ∈ E and determines whether an

adjacent node is considered a leaf node or a neighbor node. For a given edge (i, j), there exist three

cases, considering each node as a center of a star.

1. If |N(i)| > 1 and |N(j)| = 1, then i would be a leaf for a star centered at j, and all nodes in N(i)\{j} would serve as the neighbors of the star. In this case, j would be placed in the neighborhood of the star centered at i, since having it as a leaf would result in no additional neighbors.

2. If |N(i)| = 1 and |N(j)| > 1, then j would be a leaf for a star centered at i, and i would be in the neighborhood of a star centered at j.

3. If both |N(i)| and |N(j)| are greater than one, then they would each be a leaf for a star centered

at the other. Note that after identifying a node i ∈ V as a leaf, we can directly compute its

contribution to the objective with |N(i)| − 1 due to the fact that the graph is acyclic.

Thus, we can conclude that the problem can be solved efficiently if the given graph is a tree.

Algorithm 1: An algorithm to solve the SDC problem on a tree

Input: G = (V,E), L, S
 1  L[i] ← ∅, ∀i ∈ V        | L[i]: list of leaf nodes connected to center i
 2  S(i) = 0, ∀i ∈ V        | S(i): number of nodes adjacent to the star whose center is i
 3  for (i, j) ∈ E do
 4      if |N(i)| > 1 and |N(j)| = 1 then
 5          S(i)++;
 6          L[j] ← L[j] ∪ {i};
 7          S(j) = S(j) + |N(i)| − 1;
 8      else if |N(i)| = 1 and |N(j)| > 1 then
 9          L[i] ← L[i] ∪ {j};
10          S(i) = S(i) + |N(j)| − 1;
11          S(j)++;
12      else
13          L[i] ← L[i] ∪ {j};
14          S(i) = S(i) + |N(j)| − 1;
15          L[j] ← L[j] ∪ {i};
16          S(j) = S(j) + |N(i)| − 1;
17  i∗ = argmax_{i∈V} S(i);
18  return i∗, L[i∗]
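For concreteness, Algorithm 1 can be sketched in Python as follows. This is an illustrative implementation, not the code used in our experiments; the edge-list input format and the function name are our own.

```python
from collections import defaultdict

def sdc_on_tree(edges):
    """Sketch of Algorithm 1: best induced star on a tree given as an edge list.

    Returns (center, leaves of the chosen star, open-neighborhood size).
    """
    adj = defaultdict(set)
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)

    leaves = defaultdict(list)  # L[i]: leaf nodes attached to center i
    size = defaultdict(int)     # S(i): neighborhood size of the star centered at i

    for i, j in edges:
        di, dj = len(adj[i]), len(adj[j])
        if di > 1 and dj == 1:
            size[i] += 1                 # j joins the neighborhood of i's star
            leaves[j].append(i)          # i is a leaf of the star centered at j
            size[j] += di - 1
        elif di == 1 and dj > 1:
            leaves[i].append(j)
            size[i] += dj - 1
            size[j] += 1
        else:                            # both degrees > 1: each is a leaf of the other
            leaves[i].append(j)
            size[i] += dj - 1
            leaves[j].append(i)
            size[j] += di - 1

    best = max(size, key=size.get)
    return best, leaves[best], size[best]
```

On the star K1,3 with center 0, for instance, the procedure returns center 0 with no leaves and a neighborhood of size 3, in line with Case 1 above.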


Definition 6. A graph Wd(k, n), where k ≥ 2 and n ≥ 2, is called a windmill graph; it consists of n copies of the complete graph Kk sharing a universal vertex.

Proposition 1. Given a windmill graph Wd(k, n), there exists a unique optimal solution solely

containing the universal vertex for the SDC problem.

Proof. By the definition of the windmill graph, there exist n identical complete graphs with k vertices

each of which is connected to the universal vertex u. A star whose center is u with no selected leaves

has a neighborhood of size |V | − 1 = (k− 1)n. Note that any node selected as a leaf node decreases

the objective by one since all its neighbors are already in the star’s neighborhood. For any node

j ∈ V \ {u} as a center, we must have the universal node u as a leaf node in order to gain access to the nodes that j does not have an edge to. If u is not a leaf node, then the maximum neighborhood

would be k − 1 (all nodes incident to j are in the neighborhood). If u is a leaf node, then the

maximum neighborhood is for all nodes besides j and u to be in it, which implies the maximum size

is |V | − 2 < |V | − 1. Hence, the optimal solution is unique and provided by the universal vertex u

with no leaf nodes.
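Proposition 1 is easy to confirm computationally on small windmill graphs by enumerating all induced stars. The following Python sketch (with our own labeling, vertex 0 as the universal vertex) does exactly that:

```python
from itertools import combinations

def windmill(k, n):
    """Wd(k, n): n copies of K_k sharing the universal vertex 0 (our labeling)."""
    edges = set()
    u, node = 0, 1
    for _ in range(n):
        clique = [u] + list(range(node, node + k - 1))
        node += k - 1
        for a, b in combinations(clique, 2):
            edges.add((a, b))
    return edges

def best_stars(edges):
    """Brute-force all induced stars; return (best size, optimal (center, leaves) pairs)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    best, argbest = -1, []
    for c in adj:
        nbrs = sorted(adj[c])
        for r in range(len(nbrs) + 1):
            for leaves in combinations(nbrs, r):
                # induced star: no two leaves may share an edge
                if any(b in adj[a] for a, b in combinations(leaves, 2)):
                    continue
                star = {c, *leaves}
                nbhd = set().union(*(adj[v] for v in star)) - star
                if len(nbhd) > best:
                    best, argbest = len(nbhd), [(c, leaves)]
                elif len(nbhd) == best:
                    argbest.append((c, leaves))
    return best, argbest
```

For Wd(3, 3), the enumeration reports a unique optimum of size |V| − 1 = 6, attained by the star consisting of the universal vertex alone, as the proposition states.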

3.2.2 NP-Complete Classes

Vogiatzis and Camur (2019) show that the SDC problem is NP-complete via a reduction from a well-recognized combinatorial problem, the Maximum Independent Set (MIS) problem. It is widely known that, according to Kőnig's theorem, the MIS can be determined efficiently if the graph is bipartite. Yet, we show that the SDC problem preserves its complexity even on bipartite graphs.

We first provide the decision versions of the SDC problem and the Set Cover Problem (SCP) via

which we will perform a reduction.

Definition 7. (Star Degree Centrality) Given an undirected graph G = (V,E) and an integer ℓ, does there exist a node i and an induced star C centered at i such that |N(C)| ≥ ℓ?

Definition 8. (Set Cover) Given a set of elements U = {u1, u2, · · · , un} (i.e., the universe), a collection of subsets S = {S1, S2, · · · , Sm} where ∪_{i=1}^{m} Si = U, and an integer k, does there exist a set I ⊆ S such that |I| ≤ k and ∪_{i∈I} Si = U?

Theorem 3. The SDC problem is NP-complete on bipartite graphs.


Proof. Given a potential induced star centered at node i, we must verify that no two leaf nodes share an edge in order to confirm that it is truly an induced star. One can then easily verify whether |N(C)| ≥ ℓ. This shows that the SDC problem is in NP when the graph is bipartite.

Now, let <U, S, k> be an instance of the SCP, where k represents the number of sets allowed to cover all the elements in U. We can then construct an instance <G, ℓ> of the SDC problem on a bipartite graph as follows:

V[G] = V1 ∪ V2, where V1 = {S1, S2, · · · , Sm, d1} and V2 = {u1, u2, · · · , un, d2, d3, d4, · · · , d|S|+3}

E[G] = (∪_{i=1}^{m} ∪_{j∈Si} (Si, uj)) ∪ (∪_{i=1}^{m} (d2, Si)) ∪ {(d1, d2)} ∪ (∪_{i=3}^{|S|+3} (d1, di)).

The proposed construction (see Figure 3.2) can be explained as follows. Each set Si ∈ S and each element ui ∈ U are represented by a node in V1 and V2, respectively. Then, we add edges between each set and all elements contained in the set. A dummy node d2 is placed in V2 and is connected with each Si ∈ V1. Another dummy node d1 is added into V1 and is connected to d2. Finally, we add |S| + 1 dummy nodes into V2, each of which shares an edge with d1. After this configuration, we obtain a bipartite graph. Lastly, we set ℓ = 2|S| + |U| − k + 1. We examine the potential size of the induced stars centered at five different types of nodes: a set node, an element node, a dummy node di with i ≥ 3, d1, and d2. This helps us show that a particular choice of the star centered at d2 corresponds to a set cover (if one exists).

Figure 3.2: The transformation of a Set Cover instance <U, S, k> to an instance <G(V,E), ℓ> of Star Degree Centrality.

1. If Si ∈ V1 is the center, then the upper bound (UB) on the size of the potential neighborhood

is (|U | − 1) + (|S| − 1) + 1 = |U |+ |S| − 1 since either d1 or d2 can be in the neighborhood and

then all other Sj and uk nodes may be in it.

2. If ui ∈ V2 is the center, then the UB on the size of the potential neighborhood is (|S| − 1) + (|U| − 1) + 1 = |U| + |S| − 1, since only d2 can be in the neighborhood and then all other Sj


and uk nodes may be in it.

3. If a dummy node di with i ≥ 3 is the center, then the size of the neighborhood is |S| + 1: every dj such that j ≥ 3 and j ≠ i, as well as d2, is a neighbor node, while d1 is a leaf.

4. If dummy node d1 is the center, then the size of the neighborhood is 2|S|+ 1 by picking d2 as

a leaf node.

5. If dummy node d2 is the center, then d1 is considered a leaf and |S| + 1 nodes become the

neighbors (i.e., ∀dj , j ≥ 3). Every Si node can appear as either a leaf or in the star’s neighbor-

hood. Consider a partition of the set nodes into leaves and those in the star’s neighborhood.

If there is a node that is a leaf such that all elements uj in it are covered by other leaf node

sets, then we can move that set node to the neighborhood of the star and increase its size. If

there is a node in the neighborhood which contains one or more uj that are not in the star’s

neighborhood, then we can move that node to be a leaf and either keep the size the same (if

exactly one uj is uncovered) or increase the size of the neighborhood. This latter point shows

that we can create another star whose neighborhood size is greater than or equal to the size

of our current star. This means that all uj nodes should be in the neighborhood of the star.

Note that if |U| ≤ k in the SCP, then the problem is solvable in polynomial time by selecting, for each element, one set that contains it. We therefore focus our analysis on situations where |U| − k > 0. Suppose there is a set cover I such that |I| ≤ k. Consider the star centered at d2 with the set of leaf nodes {d1} ∪ {Si : i ∈ I}. From Point 5, we know that all dj, j ≥ 3 are in the neighborhood, all Si′ for i′ ∉ I are in the neighborhood, and all uj are in the neighborhood since I is a cover. This means that this star has a size of |S| + 1 + |U| + |S| − |I| ≥ 2|S| + 1 + |U| − k = ℓ. Alternatively, suppose we have a star whose neighborhood is greater than or equal to ℓ. This star has to be centered at d2 by Points 1-4 above. By Point 5, we know that we can convert this star (if necessary) to one of the same or greater size in which all uj are in the neighborhood. By accounting for the dummy nodes dj, j ≥ 3 and the uj nodes, we have that |S| − k or more set nodes must be in the neighborhood. Since all uj are in the neighborhood, the set nodes that are leaves (there are at most k of these) must cover all the elements. Therefore, there exists a set cover of at most k sets.
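The construction in the proof can be sketched in Python as follows (illustrative only; node names such as "S1", "u2", and "d3" are our own labeling):

```python
def scp_to_sdc(universe, sets):
    """Build the bipartite SDC instance from a Set Cover instance (Theorem 3 sketch).

    universe: list of element labels; sets: list of subsets of the universe.
    Nodes: sets S_1..S_m and dummy d1 on one side; elements u_1..u_n and
    dummies d2, d3, ..., d_{m+3} on the other.
    """
    m = len(sets)
    V1 = [f"S{i}" for i in range(1, m + 1)] + ["d1"]
    V2 = [f"u{e}" for e in universe] + [f"d{i}" for i in range(2, m + 4)]
    E = set()
    for i, Si in enumerate(sets, start=1):
        for e in Si:
            E.add((f"S{i}", f"u{e}"))   # set-element incidences
        E.add((f"S{i}", "d2"))          # every set node sees d2
    E.add(("d1", "d2"))
    for i in range(3, m + 4):
        E.add(("d1", f"d{i}"))          # |S| + 1 pendant dummies attached to d1
    return V1, V2, E
```

Every edge produced joins V1 to V2, so the resulting graph is bipartite, and the node and edge counts match the description above.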


Definition 9. A graph is called a split graph if its vertices can be partitioned into two sets such that one induces a clique and the other is an independent set.

Theorem 4. The SDC problem is NP-complete on split graphs.

Proof. We can create a reduction via a set cover instance in the following way.

V[G] = V1 ∪ V2, where V1 = {S1, S2, · · · , Sm, d1} and V2 = {u1, u2, · · · , un, d2, d3, d4, · · · , d|S|+3}

E[G] = (∪_{i=1}^{m} ∪_{j∈Si} (Si, uj)) ∪ (∪_{j=1}^{n} ∪_{p=j+1}^{n} (uj, up)) ∪ (∪_{i=1}^{m} (d2, Si)) ∪ {(d1, d2)} ∪ (∪_{i=3}^{|S|+3} (d1, di))

Note that we connect all the elements in the universe set with one another to create a clique. With this construction, following steps similar to those in the proof of Theorem 3, solving the SDC problem would yield the dummy node d2 as the center of the star with the largest objective value, implying that we obtain a solution for the set cover instance. Hence, we conclude that the SDC problem is NP-complete on split graphs.

3.3 Solution Methodology

While both proposed models contain 3n binary variables, the number of constraints is O(n + m) in [VCIP] and O(n) in [NIP]. Solving the IP models via a commercial solver is computationally challenging (see Section 3.5), especially as the graph gets larger and/or denser. Therefore, we first examine Benders decomposition (Benders, 1962) for both formulations.

We find that the most computationally effective implementation of this decomposition approach is

a branch-and-cut framework that adds violated constraints from the original problem back into the

master problem. We propose to find a feasible induced star in the master problem (MP) and then

check the size of the neighborhood in the subproblem (SP), i.e., the z variables move to the SP in

both formulations. Hence, we are only concerned with optimality cuts.

We split the variables into (x, y) and (x, l) in the first stage for [VCIP] and [NIP], respectively.

This means that we have 5n+ 6m non-zero coefficients in the MP for the method using [VCIP] and

3n + 4m for the method based on [NIP]. Given a fixed y or (l, x), we obtain the following SPs by isolating z in the second stage:


φ^VCIP(y) := max_z ∑_{i∈V} zi
             s.t. zi ≤ 1 − yi, ∀i ∈ V
                  zi ≤ ∑_{j∈N(i)} yj, ∀i ∈ V
                  z ∈ {0, 1}^n

φ^NIP(l, x) := max_z ∑_{i∈V} zi
             s.t. zi ≤ 1 − li − xi, ∀i ∈ V
                  zi ≤ ∑_{j∈N(i)} (lj + xj), ∀i ∈ V
                  z ∈ {0, 1}^n

We first note that the primal SPs represented above are separable over each node as shown

below. As a result, multiple Benders cuts can be generated at the same time.

φ^VCIP(y) = ∑_{i∈V} φ_i^VCIP(y) := ∑_{i∈V} max_{zi∈{0,1}} { zi : zi ≤ 1 − yi, zi ≤ ∑_{j∈N(i)} yj }

φ^NIP(l, x) = ∑_{i∈V} φ_i^NIP(l, x) := ∑_{i∈V} max_{zi∈{0,1}} { zi : zi ≤ 1 − li − xi, zi ≤ ∑_{j∈N(i)} (lj + xj) }

We refer the reader to Cordeau et al. (2019) for similar Benders frameworks developed for both large-scale partial set covering and maximal covering problems, where the authors discuss different ways of generating feasibility cuts (e.g., normalized and facet-defining feasibility cuts). Note that for our methods, the procedure to generate cuts based on fractional and integer solutions is the same.

In examining both SPs for integer incumbent solutions y or (l, x), the binary decision variables zi are bounded by integer values. Therefore, we can solve these SPs by relaxing the zi variables, which is helpful in deriving Benders cuts for both integer and fractional values of y and (l, x). Moreover, whenever an integer incumbent solution is passed to the relaxed SPs, the optimal solution to these problems is indeed binary, which shows the correctness of the traditional Benders decomposition method for this problem. In particular, we can use LP duality to generate the Benders cuts.

i. For [VCIP], since 0 ≤ yi ≤ 1, (1 − yi) also lies in [0, 1], implying zi ≤ 1. Further, ∑_{j∈N(i)} yj is a non-negative integer. Taking this into consideration together with the fact that we maximize over zi, we do not need to explicitly enforce zi ≥ 0. Hence, we can relax the integrality and non-negativity requirements on zi and obtain:

φ_i^VCIP(y) = max_{zi} { zi : zi ≤ 1 − yi, zi ≤ ∑_{j∈N(i)} yj }


ii. For [NIP], using the same reasoning, (1 − li − xi) also lies in [0, 1] because a node cannot be a leaf and the center at the same time, implying zi ≤ 1. The right-hand side (RHS) ∑_{j∈N(i)} (lj + xj) is also a non-negative integer. Hence, we obtain:

φ_i^NIP(l, x) = max_{zi} { zi : zi ≤ 1 − li − xi, zi ≤ ∑_{j∈N(i)} (lj + xj) }

Both MPs guarantee that the corresponding SP is always feasible and bounded. Therefore,

the dual SP (DSP) is also feasible and bounded by strong duality. We create the following DSPs for each SP introduced above:

Φ_i^VCIP(y) = min_{αi,βi≥0} { αi(1 − yi) + βi ∑_{j∈N(i)} yj : αi + βi = 1 }

Φ_i^NIP(l, x) = min_{λi,ωi≥0} { λi(1 − li − xi) + ωi ∑_{j∈N(i)} (lj + xj) : λi + ωi = 1 }

As a result, we obtain the following Benders optimality cuts from solution y for [VCIP] and from solution (x, l) for [NIP]:

µi ≤ αi(1 − yi) + βi ∑_{j∈N(i)} yj, ∀i ∈ V

µi ≤ λi(1 − li − xi) + ωi ∑_{j∈N(i)} (lj + xj), ∀i ∈ V

Observe that the feasible regions of the DSPs are independent of the fixed master variables. In fact, we can solve these problems analytically rather than solving their linear programs. Let (1 − yi) and ∑_{j∈N(i)} yj be denoted by Φ_i^{VCIP,1} and Φ_i^{VCIP,2}, respectively. Further, let (1 − li − xi) and ∑_{j∈N(i)} (lj + xj) be denoted by Φ_i^{NIP,1} and Φ_i^{NIP,2}, respectively. Without loss of generality, we only present Algorithm 2, which solves the primal and dual formulations presented above for [NIP] (i.e., φ_i^NIP and Φ_i^NIP, respectively); the models φ_i^VCIP and Φ_i^VCIP can be solved in the same way. We then show that the algorithm satisfies the LP optimality conditions.

Proposition 2. The primal and dual variables calculated through Algorithm 2 are optimal solutions.

Proof. First of all, since constraint λi + ωi = 1 is satisfied (i.e., tight) for every (λ, ω, θ) in all

the assignment cases, the algorithm produces a dual feasible solution for a given solution vector


Algorithm 2: Solution of φ_i^NIP and Φ_i^NIP
Input: i ∈ V, 0 ≤ θ ≤ 1, l, x
 1  if Φ_i^{NIP,1} > 0 then
 2      if Φ_i^{NIP,1} > Φ_i^{NIP,2} then
 3          zi = Φ_i^{NIP,2}, λi = 0, ωi = 1;
 4      else if Φ_i^{NIP,1} < Φ_i^{NIP,2} then
 5          zi = Φ_i^{NIP,1}, λi = 1, ωi = 0;
 6      else
 7          zi = Φ_i^{NIP,1}, λi = θ, ωi = 1 − θ;
 8  else
 9      if Φ_i^{NIP,2} = 0 then
10          zi = 0, λi = θ, ωi = 1 − θ;
11      else
12          zi = 0, λi = 1, ωi = 0;

(l, x). As for the primal problem, we set zi = 0 for node i if the RHS of either constraint in φ_i^NIP(l, x) is zero. On the other hand, if the RHSs of both constraints are positive, then we set zi = min{1 − li − xi, ∑_{j∈N(i)} (lj + xj)}. Therefore, we also obtain a primal feasible solution.

In addition, the objective values of φ_i^NIP(l, x) and Φ_i^NIP(l, x) are the same (i.e., strong duality holds). In the case of primal variable zi = (1 − li − xi), we set the dual variables λi and ωi accordingly to keep the contribution to the dual objective the same. When zi = ∑_{j∈N(i)} (lj + xj), we set λi = 0 and ωi = 1, which yields the same objective in Φ_i^NIP(l, x). When zi = 0, based on the value of ∑_{j∈N(i)} (lj + xj), we keep the contribution of node i to the dual objective at zero by tuning the dual variables λi and ωi accordingly. Therefore, the algorithm produces primal/dual solutions that satisfy complementary slackness. As a result, the primal and dual variables calculated are indeed optimal solutions.
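A compact Python sketch of Algorithm 2 is given below. It returns (zi, λi, ωi) for a single node i given a (possibly fractional) incumbent (l, x); the function name and input format are our own.

```python
def solve_subproblem(i, adj, l, x, theta=1.0):
    """Analytic solution of the node-i Benders subproblem for [NIP] (Algorithm 2 sketch).

    adj[i] is N(i); l and x map nodes to incumbent master values in [0, 1].
    """
    phi1 = 1.0 - l[i] - x[i]                  # RHS of z_i <= 1 - l_i - x_i
    phi2 = sum(l[j] + x[j] for j in adj[i])   # RHS of z_i <= sum over N(i)
    if phi1 > 0:
        if phi1 > phi2:
            return phi2, 0.0, 1.0             # second constraint binds
        if phi1 < phi2:
            return phi1, 1.0, 0.0             # first constraint binds
        return phi1, theta, 1.0 - theta       # tie: any convex combination is dual optimal
    if phi2 == 0:
        return 0.0, theta, 1.0 - theta        # both RHSs zero
    return 0.0, 1.0, 0.0                      # z_i = 0 forced by the first constraint
```

By construction, the returned primal value equals the dual objective λi·Φ_i^{NIP,1} + ωi·Φ_i^{NIP,2}, so strong duality can be checked node by node.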

We note that the Benders cut generated through this algorithm carries the same violation characteristic independent of the value of θ. Ahat et al. (2017) provide a detailed discussion, including a proof, for an algorithm that solves a Benders SP in a similar fashion. However, in our problem, setting θ to one of the integral bounds (i.e., 0 or 1) is preferred over fractional values so as to avoid cuts with fractional coefficients.

Remark 1. In Algorithm 2, setting θ = 1 produces sparser Benders cuts.

In fact, our preliminary results indicated that generating Benders cuts with θ = 1 produces slightly better results compared to setting θ to fractional values (e.g., 0.5) or to 0 in terms of


the solution time.

It is worth observing that setting θ strictly between 0 and 1 yields Benders cuts that are convex combinations of the original constraints (i.e., constraints (3.1b)-(3.1c) and (3.2b)-(3.2c) in [VCIP] and [NIP], respectively) removed to obtain the MPs. This is due to the fact that there exists a one-to-one correspondence between the variables µi and zi. By setting θ to either 0 or 1, the cuts are the original constraints from the IP models. Therefore, we refer to our decomposition approach as a general branch-and-cut method and examine common acceleration techniques used in Benders decomposition.

3.4 Algorithmic Enhancements

In this section, we discuss the acceleration techniques that we utilize to speed up both

decomposition methods and directly solving the IP formulations.

3.4.1 Constraint Tightening

Recall that constraints (3.2e) ensure that no leaf node shares an edge with another leaf. The constraints also indicate that if a node i is not selected as a leaf, then any node j within its neighborhood (i.e., j ∈ N(i)) can be a potential leaf. However, it is highly likely that some nodes within N(i) are connected, which implies that we might determine a better bound on the RHS of the constraint.

Definition 10. Given a graph G = (V, E), the independence number of G is defined as the cardinality of a maximum independent set. Formally, Θ(G) = max{|U| : U ⊆ V, (i, j) ∉ E ∀i, j ∈ U}.

Definition 11. Given a graph G = (V,E) and set of nodes S ⊂ V , the induced subgraph G[S] is a

graph which contains nodes in S and all the edges that connect any two nodes contained by S.

Proposition 3. Given a graph G = (V,E), the number of leaves of any star centered at some node

i ∈ V is upper bounded by Θ(G[N(i)]).

Proof. Considering the constraint that no two leaf nodes of a star are adjacent, let us answer the following question: "What is the largest number of nodes within N(i) that can be selected as leaf nodes?". In fact, this question is equivalent to the MIS problem, which asks for the maximum number of nodes in a given graph such that


no two of them are connected. Hence, a feasible star centered at node i cannot have more leaves than the cardinality of an MIS of the induced subgraph formed by the nodes within N(i).

Remark 2. For a given graph G = (V,E), the total number of feasible stars can be computed by

enumerating the independent sets in G[N(i)],∀i ∈ V (see Kleitman and Winston (1982), Samotij

(2015) for discussions on how to count the number of independent sets).

We can interpret Proposition 3 in another way: in an induced subgraph G′, we cannot select more leaves than Θ(G′). Hence, if one solves the MIS problem for the induced subgraph generated by the neighborhood of each node, a good bound for the RHS of constraints (3.2e) is obtained. However, the MIS problem cannot be solved efficiently due to its complexity. Yet, for each induced subgraph, we can place an upper bound on the cardinality of the MIS.

For a given network G = (V, E), let I and Θ(G) be a MIS and the independence number, respectively. The number of edges incident to the nodes in I is bounded above by Θ(G)(n − Θ(G)), since each such edge must have its other endpoint in V \ I. In addition, the number of edges among the nodes in V \ I is bounded above by C(n − Θ(G), 2). Therefore, it can be stated that m ≤ Θ(G)(n − Θ(G)) + C(n − Θ(G), 2). Rearranging this inequality, one can obtain the following standard UB on Θ(G), denoted γ(G) (Schiermeyer, 2019):

Θ(G) ≤ γ(G) = (1/2)(1 + √((2n − 1)^2 − 8m)) (3.5)

For every node i, we first form the induced subgraph G[N(i)]. We then calculate the bound γ(G[N(i)]) from Inequality (3.5) and tighten constraints (3.2e) as follows:

∑_{j∈N(i)} lj ≤ γ(G[N(i)])(1 − li), ∀i ∈ V (3.6)
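Computing the tightened coefficients is straightforward; the following Python sketch (with our own helper names) evaluates γ(G[N(i)]) for every node:

```python
import math

def gamma_bound(n, m):
    """Standard UB gamma(G) on the independence number from n nodes and m edges (Eq. 3.5)."""
    return 0.5 * (1 + math.sqrt((2 * n - 1) ** 2 - 8 * m))

def neighborhood_bounds(adj):
    """gamma(G[N(i)]) for every node i: a UB on the number of leaves of a star centered at i.

    adj maps each node to its set of neighbors; a sketch, not tuned for speed.
    """
    bounds = {}
    for i, nbrs in adj.items():
        # count the edges of the subgraph induced by N(i) (each edge seen twice)
        sub_m = sum(1 for u in nbrs for v in adj[u] if v in nbrs) // 2
        bounds[i] = gamma_bound(len(nbrs), sub_m)
    return bounds
```

For a triangle, the bound at each node is exactly 1, matching the fact that at most one leaf can be chosen from two adjacent neighbors.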

3.4.2 Upper Bounds

It is important to initially bound the objective function ∑_{i∈V} µi in order to obtain high-quality initial solutions and thereby faster convergence. We first state a few natural UBs on the objective value through valid inequalities, and then propose a heuristic approach that approximates the objective value per candidate center. The first natural UB on the objective value is n − 1, since a star can have at most n − 1 adjacent nodes; such a star consists of a single center node. Then, the UB


can be stated as:

∑_{i∈V} µi ≤ n − 1 (3.7)

Another important point is that the objective function (i.e., the size of the neighborhood of a star) is only affected by the first- and second-degree nodes of the center node. Hence, we can introduce another UB, which changes according to the node selected as the center and is calculated as the sum of the numbers of first- and second-degree nodes of the center:

∑_{i∈V} µi ≤ ∑_{i∈V} (|N(i)| + |N2(i)|) xi (3.8)

Note that once a first-degree node j ∈ N(i) is accepted as a leaf node, the RHS presented in inequality (3.8) decreases by one. The key observation is that if node j provides a unique path to some second-degree node, then it can be considered a leaf node. In this case, we can decrease |N(i)| + |N2(i)| by one, thereby tightening the RHS. If node j is not a leaf node in a feasible solution, then its contribution to the objective value is one, which is bounded above by the contribution of the second-degree nodes uniquely reached via node j; hence, the bound remains valid. Based on this argument, we propose Algorithm 3, which computes a bound on the objective value for every candidate center node.

After running Algorithm 3, a new bound δi, ∀i ∈ V, which is in practice tighter than the former ones, is obtained. Then, the following is a valid inequality for the IPs and the MPs of the Benders decomposition algorithms:

∑_{i∈V} µi ≤ ∑_{i∈V} δi xi (3.9)

Notice that µi replaces zi from the original formulations, where zi is a binary variable. Therefore, the next natural UB is to bound each individual µi based on this binary restriction. We note that the one-to-one correspondence between µi and zi also indicates that the Benders cuts generated are convex combinations of the original constraints removed from the model to obtain a restricted MP. In other words, our Benders framework can be seen as a cutting-plane algorithm. The upper bound constraints are:

µi ≤ 1, ∀i ∈ V (3.10)


Algorithm 3: Bound strengthening at a given star-center i ∈ V
Input: i ∈ V
 1  δi = σ = 0;
 2  for k ∈ N2(i) do
 3      pred[k] = −1;
 4      visited[k] = 0;
 5  for j ∈ N(i) do
 6      unique[j] = |{(j, k) ∈ E : k ∈ N2(i)}|;
 7      for k ∈ N2(i) do
 8          if (j, k) ∈ E then
 9              if visited[k] = 0 then
10                  pred[k] = j;
11              else if visited[k] = 1 then
12                  unique[j]−−;
13                  unique[pred[k]]−−;
14              else
15                  unique[j]−−;
16              visited[k]++;
17  for j ∈ N(i) do
18      if unique[j] > 0 then
19          σ++;
20  if σ > 0 then
21      δi = |N(i)| + |N2(i)| − σ;
22  else
23      if |N2(i)| = 0 then
24          δi = |N(i)|;
25      else
26          δi = |N(i)| + |N2(i)| − 1;
27  return δi
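A Python sketch of Algorithm 3 is given below (our own function name; pred is initialized to None instead of −1):

```python
def delta_bound(i, adj):
    """Algorithm 3 sketch: per-center UB delta_i on the star neighborhood size.

    adj maps nodes to neighbor sets; N2(i) is the set of second-degree nodes of i.
    """
    n1 = adj[i]
    n2 = (set().union(*(adj[j] for j in n1)) - n1 - {i}) if n1 else set()
    pred = {k: None for k in n2}      # pred[k]: first neighbor of i seen reaching k
    visited = {k: 0 for k in n2}
    unique = {}
    for j in n1:
        reach = adj[j] & n2
        unique[j] = len(reach)        # second-degree nodes adjacent to j
        for k in reach:
            if visited[k] == 0:
                pred[k] = j
            elif visited[k] == 1:
                unique[j] -= 1        # k is shared: no neighbor reaches it uniquely
                unique[pred[k]] -= 1
            else:
                unique[j] -= 1
            visited[k] += 1
    sigma = sum(1 for j in n1 if unique[j] > 0)
    if sigma > 0:
        return len(n1) + len(n2) - sigma
    if not n2:
        return len(n1)
    return len(n1) + len(n2) - 1
```

On a 4-cycle, for example, σ = 0 and |N2(i)| = 1 for every center i, so δi = 2, which coincides with the true optimal neighborhood size.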

Although constraints (3.10) are the tightest UB one can obtain for each individual µi, we emphasize that incorporating this UB increases the solution time and decreases the solution quality in every single instance of the decomposition implementation. We believe this is attributable to the fact that its addition changes the pre-solve and heuristic routines of the solver, and that this tight UB is simple enough for the solver to identify on its own. Therefore, the benefits of its potential addition are outweighed by its drawbacks. Note that we could take a similar approach and remove the binary restriction on zi in the IP models; however, we observed that the average optimality gap across instances increases in this situation. Therefore, our discussion remains valid only for the restricted MPs.


3.4.3 Parameter Tuning

For our decomposition implementation, we first switch the MIP emphasis to optimality: since finding a feasible star is a relatively easy task, we prefer that CPLEX focus on optimality over feasibility. Second, the variable selection strategy is changed to strong branching, with which CPLEX puts more effort into identifying the most favorable branch. Note that strong branching evaluates each candidate branch to identify the best one in terms of its contribution to the objective value; in certain scenarios, this operation might be computationally challenging. Last, we set the relaxation induced neighborhood search (RINS) parameter to 1,000, so that CPLEX applies the RINS heuristic at every 1,000 nodes. When solving the IPs directly, we prefer the default CPLEX settings, since no consistent improvement in terms of solution time and/or quality is observed.

3.4.4 Warm-Start

In our experiments, we use the ratio-based greedy approach proposed by Vogiatzis and Camur (2019) to generate a set of high-quality initial solutions. The heuristic is shown to have an approximation guarantee of O(∆i) for node i, where ∆i is the degree of node i ∈ V, the center of a candidate induced star.

The algorithm has two phases and continuously checks the ratio between the possible gain

and loss of adding a node into a star in terms of the cardinality of the open neighborhood. In the

first phase, we pick a node with the highest contribution to the objective where placing the node

into the star does not decrease the contribution of the other candidate leaves. In the second phase,

we look for the node that yields the highest ratio, whose denominator keeps track of the potential loss that could occur due to the adjacent nodes. For more details about the heuristic and its pseudocode, we refer the reader to Vogiatzis and Camur (2019).

While the UBs introduced in Section 3.4.2 help the solver to tighten the dual bounds, our

intention with using warm-start is to help with the primal bounds. It is crucial to point out that

we use the valid inequalities (see Sections 3.4.1 and 3.4.2) if applicable for both IP models for a fair

comparison. For the warm-start strategy, we have a set of experiments to see its impact on each

model in Section 3.5.1.1.


3.5 Experimental Results

All the experiments are conducted using Java and CPLEX 12.8.1 on a laptop with an Intel Core i7-6500 CPU at 3.10 GHz and 16 GB of RAM. In the implementation of the decomposition algorithm, we utilize the callback function feature to add the Benders cuts as lazy constraints and user cuts. While Algorithm 3 and the ratio-based heuristic are implemented in Java, the UB (3.5) introduced in Section 3.4.1 is calculated in R using the igraph library. All data sets and source code used in our study are available online at https://github.com/mcamur/SDC.

3.5.1 Randomly Generated Instances

We first randomly generate test cases according to three well-known models through igraph (Igraph, 2020): i) Barabási–Albert (BA) (i.e., scale-free networks), ii) Erdős–Rényi (ER) (i.e., random networks), and iii) Watts–Strogatz (WS) (i.e., small-world networks). We consider instances with n ∈ {500, 600, 700, 800, 900, 1000} regardless of the model type. Each model has its own parametric settings, which are summarized in Table 3.1.

Table 3.1: Parameter settings

Model | Parameter | Definition
BA    | g         | the number of edges generated at each step
ER    | pr        | the probability of adding an edge between two randomly selected nodes
WS    | r         | the rewiring probability
WS    | nei       | the average degree of each node

In the BA model, we consider g in the set {10, 12, 14, 16}. For the ER model, we set pr as i/n, where i ∈ {10, 20, 30, 40, 50} for 500, 600, 700 nodes and i ∈ {20, 30, 40, 50, 60} for 800, 900, 1000 nodes. Finally, in the WS model, r is drawn from the set {0.3, 0.5, 0.7} in every instance, and nei is in the set {12, 14, 16} for 500, 600, 700 nodes and {14, 16, 18} for 800, 900, 1000 nodes. Overall, the total numbers of instances generated in the BA, ER, and WS models are 24, 30, and 54, respectively.

During our computational studies, we set a time limit of 3,600 seconds, which also accounts for the time required by Algorithm 3. We first test the impact of warm-start on each solution technique and then proceed to the full set of analyses conducted on the randomly generated networks. We present comparisons between [NIP], [VCIP], [DNIP], and [DVCIP] for each model, where [DNIP] and [DVCIP] represent the decomposition implementations for the IP


models [NIP] and [VCIP], respectively.

3.5.1.1 Warm-Start Analysis

We examine the impact of warm-start on the randomly generated networks where n ∈ {500, 700, 900}. The main goal is to decide whether the full analysis for each solution technique (i.e., [NIP], [VCIP], [DNIP], and [DVCIP]) should be performed with or without warm-start. Note that in each instance we take into consideration the time taken to run the ratio-based greedy approach.

Figure 3.3: The impact of warm-start in the solution times in [NIP] in the BA model (time difference in seconds, by g / n)

Figure 3.4: The impact of warm-start in the optimality gaps in [NIP] in the BA model (gap difference, by g / n)

Figure 3.5: The impact of warm-start in the solution times in [VCIP] in the BA model (time difference in seconds, by g / n)

Figure 3.6: The impact of warm-start in the optimality gaps in [VCIP] in the BA model (gap difference, by g / n)

We compare the solutions obtained with and without warm-start from two different perspectives: (i) the difference between the solution times when either produces a feasible solution, and (ii) the difference between the optimality gaps when either produces an optimal solution. We set thresholds of 30 seconds and 0.5% for (i) and (ii), respectively. If the absolute value of a difference is less than the corresponding threshold, we do not report that result. Note that a negative difference in either solution time or optimality gap indicates that warm-start improves the performance of the solution technique utilized.
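The reporting rule above can be sketched as a small filter; a minimal illustration (constants and method names are ours, not from the study's code):

```java
// Illustrative implementation of the reporting rule described above: a time
// difference is reported only if |diff| >= 30 seconds, and a gap difference
// only if |diff| >= 0.5 percentage points. Negative differences mean that
// warm-start helped.
public class WarmStartFilter {
    static final double TIME_THRESHOLD = 30.0;   // seconds
    static final double GAP_THRESHOLD  = 0.005;  // 0.5%, as a fraction

    /** diff = (time with warm-start) - (time without warm-start). */
    public static boolean reportTimeDiff(double withWs, double withoutWs) {
        return Math.abs(withWs - withoutWs) >= TIME_THRESHOLD;
    }

    /** diff = (gap with warm-start) - (gap without warm-start), both as fractions. */
    public static boolean reportGapDiff(double withWs, double withoutWs) {
        return Math.abs(withWs - withoutWs) >= GAP_THRESHOLD;
    }
}
```

Only the differences passing these filters appear in Figs. 3.3 through 3.12.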


In the BA model, we observe that while warm-start helps [NIP] improve the solution time considerably in three instances out of 12, an inconsistent pattern emerges in terms of optimality gaps (see Figs. 3.3 and 3.4). Furthermore, [VCIP] does not show a clear trend in either solution times or optimality gaps, as depicted in Figs. 3.5 and 3.6.

In the ER model, warm-start increases the solution time in [NIP] in only one instance, by roughly 2,300 seconds (i.e., n = 900, pr = 0.033), and we do not observe any instance where it helps with the solution time. As for [VCIP], no instance meets our threshold definition of improvement (30 seconds) with respect to the solution time. Furthermore, similar to the BA model, no consistent pattern appears in terms of optimality gaps in either IP model, as depicted in Figs. 3.7 and 3.8.

Figure 3.7: The impact of warm-start in the optimality gaps in [NIP] in the ER model (gap difference, by pr / n)

Figure 3.8: The impact of warm-start in the optimality gaps in [VCIP] in the ER model (gap difference, by pr / n)

Lastly, in the WS model, we observe that warm-start helps [NIP] with the solution time to a great extent in two instances (i.e., n = 500, nei = 12, r = 0.3 and n = 700, nei = 12, r = 0.5), with a decrease of nearly 3,500 seconds. On the other hand, while [VCIP] shows a worse performance in one instance (n = 500, nei = 12, r = 0.5), with an increase of around 1,200 seconds via warm-start, no apparent improvement is seen in any of the instances. Similar to the other network models, we cannot see a distinguishable performance with respect to the optimality gaps in either IP formulation when warm-starting (see Figs. 3.9 and 3.10). Therefore, it becomes hard to reach a solid conclusion.

As for the decomposition implementations, we do not observe large changes with respect to solution time or optimality gap either, especially in the BA and ER models, in the majority of the instances. The changes that do occur follow more erratic patterns than those observed for the IP models. As an example, we share Figs. 3.11 and 3.12, which illustrate the solution time changes in the WS model with warm-start in [DNIP] and [DVCIP], respectively.


Figure 3.9: The impact of warm-start in the optimality gaps in [NIP] in the WS model (gap difference, by r - nei - n)

Figure 3.10: The impact of warm-start in the optimality gaps in [VCIP] in the WS model (gap difference, by r - nei - n)

Figure 3.11: The impact of warm-start in the solution times in [DNIP] in the WS model (time difference in seconds, by r - nei - n)

Figure 3.12: The impact of warm-start in the solution times in [DVCIP] in the WS model (time difference in seconds, by r - nei - n)

Our results have three main findings: i) the solver does not face difficulty in improving the primal bounds, which can also be observed in practice when the engine logs are analyzed; ii) warm-start does not improve the solution quality in terms of optimality gaps in many instances; and iii) one cannot reach a sharp conclusion on whether warm-starting both the IP models and the MPs via an effective heuristic solution works well. As a result, we decide to move into the full analysis without using warm-start as an acceleration technique.

3.5.1.2 Full Analysis

In this section, we compare the performance of the solution techniques on all randomly generated networks. If the optimal solution is not obtained within the time limit (TL), we report the optimality gap provided by CPLEX. For each instance, we share: i) the time taken to reach the solution in seconds, ii) the optimality gap returned in %, and iii) the number of branch-and-bound nodes explored by the solver. In addition, we show n, m, the density of the graph represented by D (i.e., 2m/[n(n − 1)]), and the corresponding parameters (see Table 3.1). Tables 3.3, 3.4 and 3.5 show the results for the BA, ER, and WS models, respectively.
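The density column in the tables follows directly from the definition D = 2m/[n(n − 1)] for a simple undirected graph with n nodes and m edges. A one-line helper (the class and method names are ours, for illustration only):

```java
// Density of a simple undirected graph: D = 2m / [n(n - 1)],
// i.e., the fraction of all possible node pairs that are edges.
public class GraphDensity {
    public static double density(int n, long m) {
        if (n < 2) return 0.0; // fewer than two nodes: no possible edges
        return 2.0 * m / ((double) n * (n - 1));
    }
}
```

For example, the first BA instance (n = 500, m = 4,945) gives D ≈ 0.04, matching Table 3.3.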


Table 3.2: Summary of results

         BA Model - 24 instances       ER Model - 30 instances       WS Model - 54 instances
         [NIP] [VCIP] [DNIP] [DVCIP]   [NIP] [VCIP] [DNIP] [DVCIP]   [NIP] [VCIP] [DNIP] [DVCIP]
Optimal  10    14     20     19        11    12     14     14        13    21     35     34
Pct      42    58     83     79        37    40     47     47        24    39     65     63
Ave Gap  8.82  7.16   0.44   1.66      12.06 10.97  4.02   4.29      24.71 20.95  2.22   2.61
Best     3     6      12     3         3     4      14     9         12    6      30     6

We start our analysis with a summary of the computational results in Table 3.2. For

each network model, we compare all four methods in terms of: i) the number of instances solved

to optimality, ii) the percentage of instances where optimal solutions were found, iii) the average

optimality gap over all instances, and iv) the number of instances where a method shows the best

performance. Note that the best performance is first identified based on the optimality gaps. If

more than one method reaches the optimal solution for the same instance, then we compare the

solution times.
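The tie-breaking rule just described (rank by optimality gap first, then by solution time) can be sketched as a comparator; a minimal illustration (the record and names are ours, and the sample values are taken from the first row of Table 3.3):

```java
import java.util.Comparator;

// Illustrative "best performance" ranking: smaller optimality gap wins,
// and ties on the gap are broken by the smaller solution time.
public class BestMethod {
    public record Result(String method, double gapPct, double timeSec) {}

    public static final Comparator<Result> RANKING =
        Comparator.comparingDouble(Result::gapPct)
                  .thenComparingDouble(Result::timeSec);

    public static Result best(Result... results) {
        Result best = results[0];
        for (Result r : results)
            if (RANKING.compare(r, best) < 0) best = r;
        return best;
    }
}
```

On the first BA instance (n = 500, g = 10), all methods reach gap 0, so [VCIP] wins on its 19.95-second solution time.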

We observe that the decomposition implementations significantly outperform [NIP] and [VCIP]. We do note that [VCIP] turns out to be the slightly better IP formulation; however, our analysis indicates that [DNIP] outperforms [DVCIP].

To start with, both decomposition algorithms show a considerably high performance in the BA model, where [DNIP] and [DVCIP] solve twice and roughly one-third more instances to optimality than their corresponding IPs, respectively. However, when it comes to the ER model, the performance of the two algorithms worsens, yet is still better than that of the IPs, and they can only solve 14 of the instances, which is roughly half of the total number of ER instances. It is important to mention that the instances that cannot be solved to optimality are the same in both algorithms, with two exceptions (n = 800, pr = 0.038 and n = 1000, pr = 0.03). Furthermore, it is worth mentioning that there is no single instance in either the BA or the ER model where one of the IP models reaches the optimal solution while the decomposition methods do not.

The lower performance of the decomposition implementations in the ER model compared to the BA model can be explained from two perspectives. First, the average edge numbers and the average graph densities are 9,656/0.036 and 13,016/0.046 in the BA and ER models, respectively. In other words, the problem gets harder to solve with higher edge numbers and/or a denser graph. Also, the density of graphs in the ER model increases at a faster rate than in the other models for our selected parameters. Second, we examine the number of clique inequalities added by the solver. For instance, while the solver generates 184 clique inequalities on average in the


Table 3.3: The computational results for the BA Model

                        [NIP]                      [VCIP]                     [DNIP]                     [DVCIP]
n     m      D      g   Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes
500   4945   0.04   10  58.16     0      9639      19.95     0      3144      78.97     0      1678      140.22    0      1517
500   5922   0.048  12  88.61     0      17839     228.01    0      21646     133.51    0      2992      233.48    0      2482
500   6895   0.055  14  TL        11.08  238608    309.08    0      22597     593.65    0      9663      569.76    0      4364
500   7864   0.063  16  TL        9.48   363414    3292.34   0      162751    1328.52   0      21795     1668.44   0      11623
600   5945   0.033  10  18.43     0      3910      260.31    0      10526     89.04     0      2024      106.83    0      1053
600   7122   0.034  12  1824.26   0      139597    310.81    0      16615     203.28    0      3361      324.7     0      2580
600   8295   0.046  14  171.11    0      22949     459.01    0      22185     641.98    0      8188      624.58    0      3950
600   9464   0.053  16  TL        10.81  169044    TL        13.21  58488     1605.87   0      24178     2777.47   0      16317
700   6945   0.028  10  141.95    0      13754     363.34    0      18020     316.04    0      4811      169.49    0      1785
700   8322   0.034  12  3519.86   0      183841    485.29    0      29795     700.25    0      8734      442.82    0      3670
700   9695   0.04   14  TL        13.95  131019    TL        14.50  65264     1883.1    0      23709     2021.94   0      14561
700   11064  0.045  16  TL        13.47  148775    TL        15.82  49246     TL        2.15   37234     TL        3.09   18598
800   7945   0.025  10  201.33    0      9630      51.12     0      4839      154.34    0      2390      100.9     0      816
800   9522   0.03   12  3059.28   0      125518    TL        17.33  51590     818.19    0      7947      596.68    0      3782
800   11095  0.035  14  TL        19.08  102405    TL        20.75  54750     1528.24   0      15288     2311.62   0      10275
800   12664  0.04   16  TL        17.09  111822    TL        20.13  57051     TL        1.65   34356     TL        11.77  8614
900   8945   0.022  10  1018.83   0      58500     122.62    0      4480      275.33    0      3626      339.92    0      2156
900   10722  0.027  12  TL        3.00   135767    TL        10.80  49816     961.72    0      8640      1393.53   0      7405
900   12495  0.031  14  TL        16.41  90017     946.41    0      36842     1964.96   0      19329     2432.43   0      11498
900   14264  0.035  16  TL        19.04  130576    TL        17.77  41565     TL        1.15   29232     TL        10.71  8375
1000  9945   0.02   10  TL        20.45  82900     589.65    0      21920     631.54    0      5953      503.15    0      2979
1000  11922  0.024  12  TL        15.68  80103     2993.83   0      62964     1596.08   0      16203     1925.37   0      10223
1000  13895  0.028  14  TL        22.97  94927     TL        23.15  33999     2416.05   0      20431     TL        2.63   11192
1000  15864  0.032  16  TL        19.12  90223     TL        18.40  38594     TL        5.66   20919     TL        11.65  6166

BA model in [DNIP], this average drops to 10 in the ER model. For [DVCIP], the solver produces, on average, 2 clique inequalities in the BA model and only 0.8 in the ER model. As a potential future research direction, one might be interested in incorporating clique inequalities for each triangle in a cutting-plane manner to test whether doing so would strengthen the decomposition implementations.

In the WS model, while [DNIP] solves nearly three times as many instances as [NIP], [DVCIP] solves roughly one and a half times as many instances as [VCIP]. For the instances that are not solved to optimality, [DNIP] and [DVCIP] give average optimality gaps of 6.30% and 7.05%, respectively. While both decomposition implementations far outperform the corresponding IPs in the majority of the instances with respect to the solution status, we observe only two instances where they fail to reach the optimal solution while [VCIP] does (see the instances (n = 1000, nei = 16, r = 0.5) and (n = 1000, nei = 16, r = 0.7) in Table 3.5).

Note that both IP formulations show poorer performance on the WS model than on the other network models. First, we believe that the number of clique inequalities is again a driving factor in reaching the optimal solution, especially in [VCIP]. For example, for the instances solved to


Table 3.4: The computational results for the ER Model

                         [NIP]                      [VCIP]                     [DNIP]                     [DVCIP]
n     m      D      pr     Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes
500   2469   0.02   0.02   7.98      0      159       5.13      0      139       0.52      0      0         0.77      0      0
500   4999   0.041  0.04   52.88     0      1047      31.19     0      857       21.61     0      231       39.86     0      147
500   7537   0.061  0.06   TL        19.21  82190     TL        18.81  79125     1611.1    0      12939     2406.97   0      11377
500   9870   0.08   0.08   TL        11.13  99443     TL        11.10  72571     TL        3.90   22869     TL        6.15   5690
500   12466  0.1    0.1    TL        7.71   109527    TL        7.69   75616     TL        7.36   10978     TL        7.30   3426
600   2948   0.017  0.017  5.05      0      0         7.34      0      0         1.01      0      0         1.09      0      0
600   6009   0.034  0.033  22.81     0      1215      34.24     0      971       101.22    0      721       121.21    0      437
600   8993   0.051  0.05   TL        26.15  50422     TL        24.34  89329     2056.21   0      11162     1578.78   0      5446
600   11967  0.067  0.067  TL        13.97  113537    TL        15.06  55589     TL        7.33   10120     TL        8.55   4122
600   14993  0.084  0.083  TL        7.26   115613    TL        11.78  39343     TL        7.64   10802     TL        10.11  3057
700   3483   0.015  0.014  11.7      0      57        7.12      0      0         0.78      0      0         1.28      0      0
700   6895   0.029  0.029  35.81     0      1064      31.77     0      973       30.94     0      186       54.6      0      265
700   10526  0.044  0.043  TL        30.30  22403     TL        33.38  98316     3182.75   0      15534     2024.99   0      5393
700   13943  0.057  0.057  TL        18.36  48110     TL        17.07  33905     TL        5.80   5886      TL        6.47   6557
700   17713  0.073  0.071  TL        11.60  48468     TL        11.29  26307     TL        7.36   7383      TL        9.12   3477
800   7890   0.025  0.025  28.81     0      903       7.34      0      0         7.89      0      50        12.11     0      25
800   11969  0.038  0.038  TL        33.70  102977    34.24     0      971       TL        1.45   10888     3440.81   0      6737
800   15859  0.05   0.05   TL        24.06  62404     TL        24.34  89329     TL        9.63   0         TL        10.44  0
800   20003  0.063  0.063  TL        15.14  52575     TL        15.06  55589     TL        9.12   3269      TL        8.48   2920
800   19910  0.063  0.075  TL        14.47  36246     TL        11.78  39343     TL        8.21   3842      TL        7.55   2873
900   9064   0.023  0.022  50.42     0      1241      39.3      0      1025      50.43     0      343       60.67     0      202
900   13418  0.034  0.033  1285.1    0      14926     265.01    0      1737      2182.82   0      0         1957.7    0      2250
900   17979  0.045  0.044  TL        29.64  28104     TL        23.34  21991     TL        8.47   2585      TL        8.90   4589
900   22397  0.056  0.056  TL        19.28  17336     TL        16.33  9846      TL        8.84   3976      TL        8.32   1829
900   22349  0.056  0.067  TL        15.60  36047     TL        16.58  4784      TL        8.60   3742      TL        8.46   2179
1000  10003  0.021  0.02   35.65     0      1408      32.41     0      838       8.28      0      34        8.82      0      28
1000  14926  0.03   0.03   164.97    0      5218      333.14    0      1918      3540.07   0      5114      TL        3.26   9655
1000  20008  0.041  0.04   TL        25.81  50235     TL        33.36  11367     TL        8.66   2083      TL        9.69   4013
1000  24896  0.05   0.05   TL        18.53  30891     TL        20.61  3325      TL        9.84   1850      TL        8.04   2432
1000  25015  0.051  0.06   TL        19.97  31503     TL        17.24  3470      TL        8.43   1846      TL        7.90   1784

optimality by [VCIP], the solver produces 364 clique inequalities on average. On the other hand, this number drops to 30 for the instances that fail to solve to optimality. Further, we expect to have more feasible stars in the WS model than in the BA and ER models. We believe this is due to the fact that the small-world nature of the WS model implies that there are many stars with open neighborhoods of similar size centered at different nodes, because nodes tend to share a common neighbor. Potentially, this symmetry may cause issues in solving the IP models. One might be interested in examining symmetry-breaking techniques during the search process for WS networks in the future. Lastly, since [NIP] is not as tight as [VCIP] (please see the proof of Theorem 1 in the online supplement), we believe that the graphs generated by the WS model may be more challenging for [NIP].

We now look at the cases where both decomposition algorithms reach the optimal solution

and make a comparison in terms of the solution time. As shown in Fig. 3.13, [DNIP] outperforms


Table 3.5: The computational results for the WS Model

                              [NIP]                      [VCIP]                     [DNIP]                     [DVCIP]
n     m      D      nei  r    Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes
500   6000   0.049  12   0.3  TL        9.02   25308     TL        21.28  91678     51.7      0      227       24.28     0      105
500   6000   0.049  12   0.5  TL        23.43  77774     2417.97   0      128770    574.56    0      4370      663.63    0      3293
500   6000   0.049  12   0.7  TL        20.69  84833     2151.62   0      73650     232.84    0      1520      358.92    0      1173
500   7000   0.057  14   0.3  TL        28.14  76538     TL        28.50  72283     482.08    0      3171      612.67    0      2075
500   7000   0.057  14   0.5  TL        18.96  106315    TL        19.36  127357    221.26    0      0         278.13    0      1355
500   7000   0.057  14   0.7  TL        20.04  86474     TL        17.37  86253     752.48    0      4378      873.46    0      3004
500   8000   0.065  16   0.3  TL        24.59  195299    TL        25.43  77650     1831.41   0      14462     2941.25   0      12349
500   8000   0.065  16   0.5  TL        18.76  177452    TL        20.46  78010     2915.64   0      21736     TL        5.16   10613
500   8000   0.065  16   0.7  TL        20.17  199549    TL        20.96  89842     3462.82   0      24098     TL        4.97   9936
600   7200   0.041  12   0.3  62.47     0      3084      63.42     0      1209      240.25    0      1472      343.41    0      1123
600   7200   0.041  12   0.5  TL        20.20  62763     63.52     0      1149      145.35    0      681       114.59    0      353
600   7200   0.041  12   0.7  37.89     0      2018      60.7      0      1153      357.76    0      1785      355.04    0      1176
600   8400   0.047  14   0.3  TL        24.59  36714     TL        34.15  80105     114.52    0      390       118.75    0      201
600   8400   0.047  14   0.5  TL        32.60  58742     TL        30.85  110016    2154.81   0      12498     3193.15   0      10954
600   8400   0.047  14   0.7  TL        30.73  31864     TL        32.09  119002    2175.71   0      11386     1162.07   0      5026
600   9600   0.054  16   0.3  TL        36.59  69550     TL        31.87  67384     2617.79   0      13338     3161.54   0      11501
600   9600   0.054  16   0.5  TL        20.88  128234    TL        19.55  67171     1370.85   0      0         2021.53   0      0
600   9600   0.054  16   0.7  TL        21.17  108631    TL        24.01  61322     2842.7    0      14824     TL        2.39   8895
700   8400   0.035  12   0.3  48.16     0      1368      79.6      0      1339      10.64     0      71        41.4      0      173
700   8400   0.035  12   0.5  TL        20.46  73254     93.55     0      1319      310.13    0      1105      385.78    0      1608
700   8400   0.035  12   0.7  43.91     0      2246      93.88     0      1314      241.96    0      1209      327.95    0      795
700   9800   0.041  14   0.3  TL        36.23  101988    TL        50.48  86444     468.17    0      1091      300.63    0      479
700   9800   0.041  14   0.5  TL        31.33  63199     183.8     0      1379      795.9     0      2575      835.1     0      1843
700   9800   0.041  14   0.7  TL        25.24  55767     125.84    0      1363      195.25    0      889       513.84    0      1325
700   11200  0.046  16   0.3  TL        26.22  37692     TL        27.42  56269     105.26    0      202       131.25    0      170
700   11200  0.046  16   0.5  TL        33.45  40710     TL        30.36  61073     TL        4.99   10109     TL        5.57   6050
700   11200  0.046  16   0.7  TL        29.74  45132     TL        23.14  57484     1399.7    0      0         3391.95   0      2902
800   11200  0.036  14   0.3  98.88     0      3825      286.84    0      1602      1306.57   0      3667      1412.73   0      3405
800   11200  0.036  14   0.5  105.97    0      6467      172.33    0      1576      TL        4.00   7536      2785.38   0      8209
800   11200  0.036  14   0.7  106.27    0      4124      169.39    0      1559      1188.81   0      3737      1340.02   0      2292
800   12800  0.041  16   0.3  TL        51.67  72137     TL        51.78  52088     TL        2.45   9949      2029.77   0      6406
800   12800  0.041  16   0.5  TL        39.25  36231     TL        33.31  60910     TL        3.48   5719      TL        3.66   7985
800   12800  0.041  16   0.7  TL        32.89  58367     TL        34.33  52240     TL        6.27   9798      TL        8.36   6806
800   14400  0.046  18   0.3  TL        41.58  49452     TL        45.56  42441     TL        6.88   5977      TL        7.96   7436
800   14400  0.046  18   0.5  TL        26.99  80005     TL        26.96  41281     TL        4.82   5382      TL        5.50   7904
800   14400  0.046  18   0.7  TL        30.80  50056     TL        25.96  45543     TL        7.06   4540      TL        7.90   6258
900   12600  0.032  14   0.3  108.51    0      3025      280.17    0      1793      1591.75   0      3298      1441.56   0      1868
900   12600  0.032  14   0.5  107.42    0      4075      258.75    0      1749      1136.86   0      2300      999.72    0      3231
900   12600  0.032  14   0.7  104.11    0      5085      252.56    0      1733      1641.69   0      4338      1950.74   0      5006
900   14400  0.036  16   0.3  TL        56.89  90708     TL        57.94  50475     TL        3.86   6021      TL        2.94   9397
900   14400  0.036  16   0.5  TL        33.72  69199     TL        35.05  44805     1333.74   0      2868      1093.58   0      3233
900   14400  0.036  16   0.7  TL        39.68  66614     TL        42.36  47839     TL        5.48   3897      TL        6.89   7582
900   16200  0.041  18   0.3  TL        46.37  56945     TL        47.88  30751     TL        8.94   4912      TL        9.60   5537
900   16200  0.041  18   0.5  TL        34.79  74381     TL        34.74  29603     TL        10.20  3297      TL        10.96  3975
900   16200  0.041  18   0.7  TL        32.97  51064     TL        35.54  27628     TL        5.68   2982      TL        6.55   5652
1000  14000  0.029  14   0.3  127.2     0      2867      241.73    0      1978      2027.07   0      3646      1490.76   0      4950
1000  14000  0.029  14   0.5  100.75    0      3070      217.52    0      1922      1328.25   0      2301      894.33    0      2560
1000  14000  0.029  14   0.7  84.9      0      1996      184.2     0      1781      202.7     0      375       281.6     0      719
1000  16000  0.033  16   0.3  TL        77.90  82180     TL        75.20  35581     TL        9.31   4730      TL        10.46  7621
1000  16000  0.033  16   0.5  TL        50.80  113819    505.56    0      1992      TL        8.35   3539      TL        8.35   6967
1000  16000  0.033  16   0.7  TL        39.52  121941    317.29    0      1961      TL        4.19   4891      TL        5.47   7186
1000  18000  0.037  18   0.3  TL        48.99  54205     TL        47.41  23574     TL        7.15   4069      TL        9.52   7427
1000  18000  0.037  18   0.5  TL        43.92  88314     TL        42.44  18365     TL        10.69  3309      TL        11.64  5319
1000  18000  0.037  18   0.7  TL        32.27  61227     TL        37.70  23845     TL        5.87   2756      TL        7.08   6828

[DVCIP] solution-time-wise and reaches the optimal solution quicker in 12 instances. As for the ER model, we observe a slightly different trend. For the instances where both methods take more than


1,000 seconds to solve (i.e., four instances), [DVCIP] performs better and outperforms [DNIP] in three instances (see Fig. 3.14). Even though overall [DNIP] produces a better solution time in more instances (i.e., 10 out of 13 instances), [DVCIP] is 75 seconds faster than [DNIP] on average. Lastly, as for the WS model, [DNIP] notably outperforms [DVCIP], as depicted in Fig. 3.15, and reaches the optimal solution faster in 22 instances out of 32. On average, [DNIP] is 139 seconds faster than [DVCIP].

Figure 3.13: Solution time comparison between [DNIP] and [DVCIP] in the BA model (solution time in seconds, by g / n)

Figure 3.14: Solution time comparison between [DNIP] and [DVCIP] in the ER model (solution time in seconds, by pr / n)

Figure 3.15: Solution time comparison between [DNIP] and [DVCIP] in the WS model (solution time in seconds, by r - nei - n)

Although the new IP formulation [NIP] could not compete with the formulation [VCIP], the decomposition implementation [DNIP] shows a better performance than [DVCIP] in terms of both solution time and solution quality in more instances. First, as mentioned earlier, the number of constraints is bounded by O(n) in [NIP], and its number of non-zero coefficients is lower compared to [VCIP]. Second, the number of non-zero coefficients is further decreased in [NIP] by constraint tightening (Section 3.4.1). Third, when decomposing [NIP], the two constraints causing the increase in the number of non-zero coefficients – constraints (3.2b) and (3.2c) – are placed in the SP. In fact, as discussed previously, the MPs of [VCIP] and [NIP] have i) 5n + 6m and 3n + 4m non-zero coefficients, and ii) 2n + m + 1 and 2n + 1 constraints, respectively. All these facts imply that the restricted MP generated via [NIP] is more efficient than the MP generated via [VCIP]. Note that


even though Theorem 1 states that [VCIP] is stronger than [NIP] with respect to LP-relaxations, we observe that the root node relaxations turn out to be the same in all randomly generated instances, implying that the size of the formulations likely plays an important role in the quality of solving them. Lastly, the number of clique inequalities created by the solver in [DNIP] is significantly higher than in [DVCIP] on average in all three network models. Taking all of this into consideration, it makes sense that [DNIP] produces more fruitful results than [DVCIP].
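The MP size comparison above is straightforward arithmetic; as an illustration (the helper class is ours), plugging in the largest BA instance (n = 1000, m = 15,864) shows the gap in formulation size:

```java
// Master-problem (MP) sizes quoted in the text:
// [VCIP]'s MP: 5n + 6m non-zero coefficients, 2n + m + 1 constraints.
// [NIP]'s  MP: 3n + 4m non-zero coefficients, 2n + 1 constraints.
public class MpSize {
    public static long vcipNonZeros(long n, long m)    { return 5 * n + 6 * m; }
    public static long nipNonZeros(long n, long m)     { return 3 * n + 4 * m; }
    public static long vcipConstraints(long n, long m) { return 2 * n + m + 1; }
    public static long nipConstraints(long n)          { return 2 * n + 1; }
}
```

For n = 1000 and m = 15,864, the [VCIP] MP has 100,184 non-zeros and 17,865 constraints, versus 66,456 non-zeros and 2,001 constraints for the [NIP] MP.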

Figure 3.16: The optimality gap comparisons in [NIP], [VCIP], [DNIP] and [DVCIP] in the BA model (optimality gap, by g / n)

Figure 3.17: The optimality gap comparisons in [NIP], [VCIP], [DNIP] and [DVCIP] in the ER model (optimality gap, by pr / n)

Figure 3.18: The optimality gap comparisons in [NIP], [VCIP], [DNIP] and [DVCIP] in the WS model (optimality gap, by r - nei - n)

Lastly, to solidify our point, we compare all four methods in terms of the optimality gaps when one of the methods cannot reach the optimal solution. We present Figs. 3.16, 3.17, and 3.18, where it can be clearly seen that both decomposition implementations show a better performance than their corresponding IPs. Fig. 3.16 illustrates that [DNIP] is the best method when we have a graph following the properties of the BA model. When we cannot reach the optimal solution with it, the optimality gap does not exceed 5.66%. On the other hand, both IP models return optimality gaps over 12.5% for the instances shown in Fig. 3.16. The ER model turns out to be the most challenging model, where even the decomposition methods have a hard time converging to the optimal solution for certain instances (see Fig. 3.17), for the potential reasons discussed earlier. Yet, [DNIP] and [DVCIP] never return an optimality gap larger than 9.84% and 10.44%, respectively.


As for the WS model, Fig. 3.18 depicts that as the number of nodes goes up, both IP models start returning poorer optimality gaps, with few exceptions. On the other hand, both decomposition implementations show a strong performance on the instances with fewer than 800 nodes. When the number of nodes is 800 or more, the average optimality gaps become 6% and 6.4% in [DNIP] and [DVCIP], respectively, which in certain cases is still better than solving the IP model directly.

3.5.2 Protein-Protein Interaction Networks (PPINs)

In this section, we analyze the data sets of two organisms, i) Helicobacter pylori (HP) and ii) Staphylococcus aureus (SA), obtained from Szklarczyk et al. (2015). Each data set is converted into a PPIN as follows: each protein is represented by a node, and two nodes are connected by an edge if there exists an interaction between the corresponding proteins. Each interaction is associated with an interaction score defined within the range of [0, 1000].

With this configuration, the networks created turn out to be highly dense graphs with diameter equal to six. The numbers of nodes and edges are (n = 1,570, m = 89,507) and (n = 2,852, m = 146,783) for HP and SA, respectively. Hence, we prune the interactions whose scores fall below a certain threshold. In this study, we set the interaction threshold κ to {600, 500, 400, 300} and {500, 400, 300, 200} for the organisms HP and SA, respectively. As a result, we obtain four networks per organism studied. In addition, we increase the time limit to 10,800 seconds (i.e., 3 hours) due to the size of the networks.
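The pruning step described above can be sketched as a simple score filter; a minimal illustration (the Interaction record and method names are our own assumptions, not from the study's code):

```java
import java.util.List;

// Illustrative pruning step: keep only protein-protein interactions
// whose score meets the threshold kappa (scores lie in [0, 1000]).
public class PpinPruner {
    public record Interaction(String proteinA, String proteinB, int score) {}

    /** Retains interactions with score in [kappa, 1000]. */
    public static List<Interaction> prune(List<Interaction> edges, int kappa) {
        return edges.stream().filter(e -> e.score() >= kappa).toList();
    }
}
```

Lowering κ keeps more interactions, which is why the pruned networks become denser, and the instances harder, as κ decreases.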

Table 3.6: The computational results for Helicobacter pylori (n = 1,570)

               [NIP]                      [VCIP]                     [DNIP]                     [DVCIP]
κ     m        Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes
600   17735    888.09    0      7617      3117.88   0      43721     78.19     0      595       415.9     0      1522
500   27570    TL        16.20  59510     TL        17.28  63273     741.73    0      2709      3356.8    0      8329
400   33663    TL        18.32  68789     TL        21.16  55947     9572.51   0      28965     TL        5.69   11412
300   45123    TL        15.03  53843     TL        13.53  36859     TL        4.32   12271     TL        6.32   9815

We first share the computational results for HP (see Table 3.6). As κ decreases, the difficulty of solving the problem increases since the graph gets denser. We initially point out that [VCIP] shows the worst performance: when κ = 600, it takes 51 minutes to reach an optimal solution, whereas all other methods converge to optimality in under 15 minutes. In addition, when κ is set to 500 and 400, we obtain the worst optimality gaps when employing [VCIP]. This is an interesting finding, since [VCIP] showed a marginally better performance than [NIP] on the randomly generated graphs, as discussed


in the previous section. On the other hand, [DNIP] outperforms all three other methods by reaching the optimal solution in three instances out of four. Even though none of the methods reaches the optimal solution when κ is 300, [DNIP] provides the best optimality gap (4.32%).

We now share Table 3.7 and the results for SA. Once again, we observe that [VCIP] shows a poorer performance compared to the others. For instance, when κ is set to 400, even though all three other methods converge to the optimal solution, [VCIP] returns an optimality gap of 16.37%. Similar to the results seen for HP, [DNIP] produces the best optimality gaps when no other method can reach the optimal solution. Yet, even though [DNIP] gives the best optimality gap when κ = 200, the result (i.e., 18.09%) is not as good as for the other instances. Therefore, it might be better to increase the solution time limit when κ ≤ 200. Lastly, it is worth mentioning that [NIP] reaches the optimal solution roughly two times faster than both decomposition methods when κ = 400.

Table 3.7: The computational results for Staphylococcus aureus (n = 2,852)

               [NIP]                      [VCIP]                     [DNIP]                     [DVCIP]
κ     m        Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes  Time(sec) Gap(%) BB Nodes
500   21549    65.18     0      415       89.39     0      621       94.19     0      0         45.34     0      43
400   30276    202.13    0      2576      TL        16.37  29888     429.81    0      1557      504.22    0      957
300   45645    TL        37.30  48008     TL        32.64  36084     TL        3.40   10957     TL        13.20  6671
200   87607    TL        26.54  21999     TL        27.93  18250     TL        18.09  5873      TL        27.99  2687

Our computational results on the real-world PPINs indicate that [DNIP] is the best method among all the methods tested, reaching the optimal solution for most of the instances for both organisms (i.e., a 75% and 50% success rate for HP and SA, respectively). On the other hand, the new IP formulation shows a better performance compared to the one existing in the literature, which differs from the observation made in the previous section. We can interpret this from two different points of view: i) [NIP] might be more effective in larger and denser graphs, and/or ii) [NIP] works better specifically in PPINs, which carry different characteristics (e.g., following different probability distributions) than the well-known network models.

3.6 Conclusion

In this chapter, we first introduce a new IP formulation for the SDC problem where the goal

is to identify the induced star with the largest open neighborhood. We then show that while the

SDC can be efficiently solved in tree graphs, it remains NP-complete in bipartite and split graphs


via a reduction from the set cover problem. In addition, we implement a decomposition algorithm inspired by Benders Decomposition, together with several acceleration techniques, applied to

both the new IP formulation and the existing formulation in the literature. Finally, we share extensive computational results on three well-known network models (Barabási–Albert, Erdős–Rényi, and Watts–Strogatz), and on large-scale PPINs generated for two organisms (Helicobacter Pylori and Staphylococcus Aureus).

Our findings include: i) the existing formulation performs better with respect to the solution

time and solution quality when solving the IP models via a branch-and-cut process on randomly

generated graphs; ii) the new formulation starts showing its effectiveness in real networks as the size

and density increase; iii) the decomposition approaches significantly outperform both IP models in

every network model; and iv) the decomposition approach based on the new IP model is shown to be

a more effective decomposition framework than the one designed based on the previously proposed

IP model.

In the future, it might be interesting to investigate the weighted SDC problem and analyze

the impact of the weights on the identification of the essential proteins, rather than employing

thresholds to cut off less frequent protein-protein interactions. In addition, from an algorithmic

perspective, it could be a good direction to accelerate the decomposition implementations by: i)

working on determining new valid inequalities and ii) incorporating clique inequalities especially for

triangles.


Chapter 4

The Stochastic Pseudo-Star Degree Centrality

We show that the SPSDC problem is NP-complete on general graphs, trees, and windmill

graphs in Section 4.1. Next, we introduce a non-linear binary optimization model for the SPSDC

problem and convert it into a linear form via McCormick inequalities (see Section 4.2). While Section 4.3 discusses the solution methodology that we adopt (i.e., Benders Decomposition), we present the algorithmic enhancements in Section 4.4. We focus on the data generation phase and provide a wide range of computational experiments in Section 4.5. Lastly, we summarize our contributions and share our insights for future research in Section 4.6.

4.1 Complexity Discussion

We first discuss the computational complexity of detecting the node that is the center of the stochastic pseudo-star with the maximum connection probability. Below, we present the decision version of the problem.

Definition 12. (Stochastic Pseudo-Star Degree Centrality) Given an undirected graph G = (V, E), a probability vector ~p, a user-defined value θ, and a positive real number ℓ, does there exist an induced pseudo-star S_k centered at some node k such that the total assignment probability is at least ℓ?


We show that the SPSDC problem is NP-complete under the assumption that P ≠ NP.

Theorem 5. The stochastic pseudo-star degree centrality is NP-complete.

Proof. Given an instance <G, ℓ> of SDC, let us generate an instance <G′, ℓ′, ~p, θ> of SPSDC where G′ = G, ℓ′ = ℓ, ~p = ~1, and θ = 0. With this construction, one can see that solving the SPSDC instance solves the SDC instance, due to the fact that (i) no two leaves can share an edge according to Ineq. (2.1), and (ii) the objective function becomes the maximization of the number of nodes in the open neighborhood. Since a candidate pseudo-star can be verified in polynomial time, the problem is in NP; hence, we conclude that the problem at hand is NP-complete.

The SDC problem is shown in Chapter 3 to be solvable in polynomial time on trees via an algorithm running in O(m). However, we show that the SPSDC problem remains NP-complete even when the given graph is a tree, by a reduction from the knapsack problem.

Definition 13. (Knapsack) Given a set of q items (I = {1, 2, · · · , q}) with sizes s1, s2, · · · , sq and values v1, v2, · · · , vq, a capacity C, and a value V, does there exist a subset K ⊆ {1, 2, · · · , q} such that the total size and total value of the subset are less than or equal to C and greater than or equal to V, respectively?

Theorem 6. The stochastic pseudo-star degree centrality is NP-complete on trees.

Proof. Given an instance <~s, ~v, C, V> of the knapsack problem, we create an instance <G(V,E), ℓ, ~p, θ> of the SPSDC problem. The underlying graph G(V,E) is a tree whose nodes and edges are defined as follows (see Fig. 4.1 for a visual representation of the network):

V = {d1, d2, d3, d4, d_2^1, d_2^2, · · · , d_2^q, d_3^1, d_3^2, · · · , d_3^q, d_4^1, d_4^2, · · · , d_4^q, 1, 2, · · · , q, 1′, 2′, · · · , q′, 1′′, 2′′, · · · , q′′}

E = ∪_{i=1}^{q} {(d1, i)} ∪ {(d1, d2), (d1, d3), (d1, d4)} ∪ ∪_{i=1}^{q} {(d2, d_2^i)} ∪ ∪_{i=1}^{q} {(d3, d_3^i)} ∪ ∪_{i=1}^{q} {(d4, d_4^i)} ∪ ∪_{i=1}^{q} {(i, i′)} ∪ ∪_{i=1}^{q} {(i, i′′)}

where each i ∈ I represents an item and the rest of the nodes are considered dummy nodes, whose total number is 5q + 4. For the sake of simplicity, let us assume that q ≥ 2 and vi ∈ Z+, which does not change the complexity of the knapsack problem. We set ℓ = V/vmax + 3q + Σ_{j∈I} e^{−sj/C}, where vmax := max{vi : i ∈ I}.

Let us consider the SPSDC problem for a given node d on a tree, which helps us to define both ~p and θ. Since trees are acyclic graphs, we are not concerned with the connections between leaf nodes. In such a scenario, any node j adjacent to d can contribute to the objective in two different ways: if it is a leaf node, its contribution is Σ_{k∈N(j): k≠d} p_kj; otherwise, it is p_dj. Moreover, the only constraint that should be satisfied is the feasibility condition, defined as

∏_{j∈L} p_dj ≥ 1 − θ,

where L denotes the set of leaf nodes selected from N(d).

We are now ready to determine the probability values used in network G. First, we assign p_{d1,i} = e^{−si/C}, ∀i ∈ I. We then set p_{ii′} = vi/vmax and p_{ii′′} = e^{−si/C}, ∀i ∈ I. All remaining probability values are set equal to one. Lastly, we set θ = 1 − 1/e. Our transformation is presented in Fig. 4.1, where edges are labeled with their probability values.
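To make the gadget concrete, the construction can be built in a few lines of plain Python. This is our own illustrative sketch; the function name, node labels, and dictionary representation are not from the dissertation.

```python
import math

def knapsack_to_tree(sizes, values, C, V):
    """Build the tree gadget of Theorem 6 from a knapsack instance.

    Returns (edges, prob, ell, theta): the edge list, the edge-probability
    map, the centrality threshold, and the feasibility threshold."""
    q = len(sizes)
    vmax = max(values)
    items = list(range(1, q + 1))
    edges, prob = [], {}

    def add(u, w, p):
        edges.append((u, w))
        prob[(u, w)] = p

    # Center gadget: d1 is adjacent to every item node and to d2, d3, d4.
    for i in items:
        add('d1', f'i{i}', math.exp(-sizes[i - 1] / C))
    for j in (2, 3, 4):
        add('d1', f'd{j}', 1.0)
        # Each d_j carries q pendant dummy nodes, all with probability one.
        for i in items:
            add(f'd{j}', f'd{j}_{i}', 1.0)
    # Item gadgets: i' encodes the value, i'' encodes the size.
    for i in items:
        add(f'i{i}', f"i{i}'", values[i - 1] / vmax)
        add(f'i{i}', f"i{i}''", math.exp(-sizes[i - 1] / C))

    ell = V / vmax + 3 * q + sum(math.exp(-s / C) for s in sizes)
    theta = 1 - 1 / math.e
    return edges, prob, ell, theta
```

A quick consistency check: the gadget has 6q + 4 nodes and, being a tree, 6q + 3 edges.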

Figure 4.1: The transformation of Knapsack <~s, ~v, C, V> to an instance <G(V,E), ℓ, ~p, θ> of Stochastic Pseudo-Star Degree Centrality on a tree

We now examine the problem to demonstrate that the induced pseudo-star with the largest pseudo-star degree centrality is centered at d1. We consider each possible center:

i. If dummy node d1 is selected as the center, then d2, d3, and d4 are directly selected as leaf nodes, which adds 3q to the objective. The other candidate leaf nodes are in set I. Recall that for any node i ∈ I, the objective value is guaranteed to increase by at least e^{−si/C} regardless of whether i is a leaf or a neighbor node. Thus, one can see that the objective is bounded below by 3q.

ii. If node dj is selected as the center where j = 2, 3, 4, then the pseudo-star centered at dj selects d1 as its only leaf, since adding another node into the star would only decrease the objective value. Such a pseudo-star clearly satisfies the feasibility condition, since 1 > 1 − θ, and the pseudo-star degree centrality becomes (q + 2) + Σ_{i∈I} e^{−si/C} < 2q + 2.

iii. If node d_j^i is selected as the center node, where j = 2, 3, 4 and i ∈ I, then node dj is the only possible leaf node, and the pseudo-star degree centrality would be q.

iv. If node i ∈ I is selected as the center node, then the objective can become at most vi/vmax + 3 + e^{−si/C} + Σ_{j∈I: j≠i} e^{−sj/C} < q + 4.

v. If node i′, where i ∈ I, is selected as the center node, then there are two possibilities: the objective is either vi/vmax or 2e^{−si/C}. As a result, the objective is bounded above by two.

vi. If node i′′, where i ∈ I, is selected as the center node, then, similar to the previous point, there are two possibilities: the objective is either e^{−si/C} or e^{−si/C} + vi/vmax. Thus, the objective is bounded above by two.

The points above indicate that the best pseudo-star is centered at d1; thus, we continue our analysis using d1 as our basis.

Now, suppose there exists a feasible knapsack solution K such that the total value is greater than or equal to V. We argue that there is a feasible pseudo-star with centrality greater than or equal to ℓ. Consider the pseudo-star centered at d1 with leaf nodes d2, d3, d4, and j ∈ K. First, we examine the feasibility condition of the star:

∏_{j∈K} e^{−sj/C} ≥ 1 − (1 − 1/e)   ⟺ (take the log of both sides)   Σ_{j∈K} log(e^{−sj/C}) ≥ log(1/e)

We then examine the generic knapsack capacity constraint below:

Σ_{j∈K} sj ≤ C   ⟺ (divide by C)   Σ_{j∈K} sj/C ≤ 1   ⟺ (raise e to the value on each side)   e^{Σ_{j∈K} sj/C} = ∏_{j∈K} e^{sj/C} ≤ e   ⟺ (take the log)   Σ_{j∈K} log(e^{sj/C}) ≤ log(e)   ⟺ (multiply by −1)   Σ_{j∈K} log(e^{−sj/C}) ≥ log(1/e)

One can see that the feasibility condition is satisfied since the knapsack feasibility is satisfied. For each j ∈ K, the contribution to the objective is vj/vmax + e^{−sj/C}, which in total is at least V/vmax + Σ_{j∈K} e^{−sj/C}. Each j ∉ K becomes a neighbor node covered by d1, which increases the objective by Σ_{j∉K} e^{−sj/C}. Nodes d2, d3, and d4 increase the objective by 3q, as explained in Point i. Overall, the pseudo-star centered at d1 produces an objective of at least 3q + V/vmax + Σ_{j∈I} e^{−sj/C} ≥ ℓ.
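The chain of equivalences above is easy to confirm numerically. A small sketch (ours, with illustrative data; the tolerances guard against floating-point edge cases):

```python
import math

def star_feasible(sizes_K, C):
    # Star feasibility: prod_{j in K} e^{-s_j/C} >= 1 - theta = 1/e.
    return math.prod(math.exp(-s / C) for s in sizes_K) >= 1 / math.e - 1e-12

def knapsack_feasible(sizes_K, C):
    # Knapsack capacity: sum_{j in K} s_j <= C.
    return sum(sizes_K) <= C + 1e-12
```

Since log is monotone, `star_feasible` and `knapsack_feasible` agree on every subset K.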


Alternatively, suppose we have a pseudo-star which yields an objective greater than or equal to ℓ. This pseudo-star must be centered at d1 by Points ii-vi. We can extract a knapsack solution by isolating the leaf nodes selected in set I. The leaf nodes selected satisfy the feasibility condition; in other words, the knapsack capacity constraint is satisfied, which can be verified by working the steps above 'backwards'.

We know that nodes d2, d3, and d4 contribute 3q to the objective, implying that the rest of the objective value comes from nodes in I, each being either a leaf node or a neighbor node. Let L be the set of leaf nodes selected from {1, 2, . . . , q}. Then, the objective increases by Σ_{j∈L} e^{−sj/C} + Σ_{j∉L} e^{−sj/C} = Σ_{j∈I} e^{−sj/C}, whose value is always the same and completely independent of which or how many nodes are in L. In addition, for each j ∈ L, node j adds vj/vmax to the objective, and this total contribution must be at least V/vmax since the pseudo-star is assumed to have an objective bounded below by ℓ. Thus, we can isolate all those nodes in L and rescale the summation of the vj/vmax terms by multiplying it by vmax. As a result, we obtain a knapsack solution, based on the nodes in L, whose value is at least V.

In addition, it is shown in Chapter 3 that the SDC problem has a trivial unique optimal solution on a windmill graph. However, we present a reduction from the knapsack problem to prove that the SPSDC problem preserves its complexity: the problem remains NP-complete on windmill graphs.

Theorem 7. The stochastic pseudo-star degree centrality is NP-complete on windmill graphs.

Proof. Given a knapsack instance, we create an SPSDC instance on a windmill graph whose construction is presented below. We create q + 1 cliques of size three, all of which are connected to a universal vertex named d (see Fig. 4.2 for the visualization).

V = {d, d1, d2, d3, 1, 2, · · · , q, 1′, 2′, · · · , q′, 1′′, 2′′, · · · , q′′}

E = ∪_{i=1}^{3} {(d, di)} ∪ {(d1, d2), (d1, d3), (d2, d3)} ∪ ∪_{i=1}^{q} {(d, i)} ∪ ∪_{i=1}^{q} {(d, i′)} ∪ ∪_{i=1}^{q} {(d, i′′)} ∪ ∪_{i=1}^{q} {(i, i′)} ∪ ∪_{i=1}^{q} {(i, i′′)} ∪ ∪_{i=1}^{q} {(i′, i′′)}

Prior to setting the probabilities in our reduction, we first identify ω ∈ (0, 1) such that 2qω < min{1/vmax, e^{−s1/C}, · · · , e^{−sq/C}, 1/e}. This helps us to ensure that we cannot create a feasible pseudo-star i) when it is centered at node d with any node i′ and/or i′′ selected as a leaf, and ii) when it is centered at node i′ or i′′ with d being a leaf node. We then set ℓ = V/vmax + Σ_{i∈I} e^{−si/C} + 2 and θ = 1 − 1/e.
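Any sufficiently small ω satisfies the stated bound; a sketch of one valid choice (the helper name and the extra factor-of-two safety margin are our own):

```python
import math

def pick_omega(sizes, values, C):
    """Pick omega in (0,1) with 2*q*omega strictly below every probability
    named in the windmill reduction (an illustrative helper)."""
    q = len(sizes)
    vmax = max(values)
    bound = min([1 / vmax, 1 / math.e] + [math.exp(-s / C) for s in sizes])
    return bound / (4 * q)   # then 2*q*omega = bound/2 < bound
```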

Now, we assign the probability values. We set p_{di} = p_{ii′′} = e^{−si/C}, ∀i ∈ I, p_{ii′} = vi/vmax, ∀i ∈ I, p_{dd1} = 1, and p_{jd1} = 1, ∀j ∈ N(d1). Lastly, ω is assigned to the rest of the edges, as depicted in Fig. 4.2. Let us examine the potential objective values for pseudo-stars centered at each node.

Figure 4.2: The transformation of Knapsack <~s, ~v, C, V> to an instance <G(V,E), ℓ, ~p, θ> of Stochastic Pseudo-Star Degree Centrality on a windmill graph

i. If node d is selected as the center of the pseudo-star, then node d1 is selected as a leaf node, which does not strain the feasibility condition and increases the objective by two. Then, node i ∈ I in each clique becomes the only possible node to be selected as a leaf. In this scenario, the objective is bounded below by Σ_{i∈I} e^{−si/C} + 2qω + 2 + min_{i∈I} vi/vmax.

ii. If node i ∈ I is the center node, then node d is preferred as a leaf node over nodes i′ and i′′ in order to have access to the rest of the network. The objective value can be at most Σ_{j∈I} e^{−sj/C} + 2qω + vi/vmax.

iii. If the pseudo-star is centered at node d1, then node d becomes the only leaf node, and the objective becomes Σ_{i∈I} e^{−si/C} + 2qω + 2.

iv. A pseudo-star centered at any other node cannot have d as a leaf and thus cannot compete with the other pseudo-stars in terms of the objective value.

Our discussion above shows that the pseudo-star with the largest SPSDC is obtained when it is centered at node d. Also, note that both the pseudo-star and knapsack instances obtained in each direction of the proof must satisfy the feasibility conditions for the same reasons presented in Theorem 6.

Suppose we are given a knapsack solution K whose total value is greater than or equal to V. Let us examine the pseudo-star centered at d with leaf nodes d1 and j ∈ K. While d1 contributes two to the objective, each node j ∈ K contributes vj/vmax + e^{−sj/C}, as discussed in Theorem 6. Hence, we obtain a pseudo-star whose objective is at least ℓ.

Now, suppose we have a pseudo-star instance with an objective of at least ℓ. Such a pseudo-star must be centered at node d due to Points ii-iv and the selection of ω, where ω < 1/vmax. The pseudo-star cannot have any node i′, i′′, d2, or d3 as a leaf, since ω is guaranteed to be less than 1 − θ. Let us now show how the objective value is calculated by considering the rest of the nodes as candidate leaves. The objective value increases by Σ_{i∈I} e^{−si/C} regardless of whether i ∈ I is a leaf or a neighbor, due to the way we constructed the network. In addition, d1 becomes a leaf, since it increases the objective by two as a leaf while only increasing it by one as a neighbor. Then, the leaf nodes included from I must increase the objective by V/vmax to ensure that the total objective is at least ℓ. We can therefore isolate the leaf nodes selected in I and obtain a knapsack solution whose value is at least V, similar to the previous proof.

4.2 Mathematical Formulation

In this section, we propose an optimization model to solve the SPSDC problem that extends the improved formulation proposed for the SDC problem in Chapter 3. The model contains three sets of binary variables: i) xi is 1 if node i is selected as the center, and 0 otherwise; ii) yi is 1 if node i is selected as a leaf node, and 0 otherwise; and iii) zij is 1 if pseudo-star element i covers node j in the pseudo-star's open neighborhood, and 0 otherwise. The formulation is:

IP:

max Σ_{(i,j)∈E} pij zij                                                          (4.1a)
s.t. xi + yi + Σ_{j∈N(i)} zji ≤ 1                       ∀i ∈ V                   (4.1b)
     zij ≤ xi + yi                                      ∀(i,j) ∈ E               (4.1c)
     yi ≤ Σ_{j∈N(i)} xj                                 ∀i ∈ V                   (4.1d)
     Σ_{i∈V} xi = 1                                                              (4.1e)
     Σ_{(i,j)∈E} log(pij) xi yj + Σ_{i<j:(i,j)∈E} log(1 − pij) yi yj ≥ log(1 − θ)   (4.1f)
     xi, yi ∈ {0, 1}                                    ∀i ∈ V                   (4.1g)
     zij ∈ {0, 1}                                       ∀(i,j) ∈ E               (4.1h)

The objective function (4.1a) maximizes the total probability of neighborhood assignments.

Constraints (4.1b) indicate that a node i can be either i) the center, ii) a leaf or iii) selected in the

open neighborhood and assigned to a node j that has an edge into i. For case (iii) to hold, node j

has to be connected to the center or a leaf, which is guaranteed by Constraints (4.1c). Note that

case (iii) ensures the unique assignment of a neighbor node to the pseudo-star. While Constraints

(4.1d) make sure that each leaf node is connected to the center node, Constraint (4.1e) enforces the

model to select a single pseudo-star. Constraint (4.1f) states that the pseudo-star selected satisfies

the feasibility condition. Lastly, Constraints (4.1g)-(4.1h) enforce the binary conditions on the variables.

The proposed model is a non-linear binary optimization problem where the numbers of variables and constraints are both O(m). Thus, it remains a challenging problem to solve even if the given graph is small and sparse. However, we can linearize Constraint (4.1f) with the well-known McCormick inequalities. We introduce variables to represent the products of binary variables as follows: aij = xi yj, ∀(i,j) ∈ E and bij = yi yj, ∀i < j : (i,j) ∈ E. We then obtain the following linear model, which is equivalent to IP.

LIP:

max (4.1a)                                                                       (4.2a)
s.t. (4.1b), (4.1c), (4.1d), (4.1e), (4.1g), (4.1h)
     Σ_{(i,j)∈E} log(pij) aij + Σ_{i<j:(i,j)∈E} log(1 − pij) bij ≥ log(1 − θ)    (4.2b)
     aij ≥ xi + yj − 1                                  ∀(i,j) ∈ E               (4.2c)
     bij ≥ yi + yj − 1                                  ∀i < j : (i,j) ∈ E       (4.2d)
     aij ∈ {0, 1}                                       ∀(i,j) ∈ E               (4.2e)
     bij ∈ {0, 1}                                       ∀i < j : (i,j) ∈ E       (4.2f)

We first note that since 0 ≤ pij ≤ 1, each log(pij) value is non-positive. This implies that whenever the bij and/or aij variables take a positive value, the left-hand side (LHS) of Constraint (4.2b) decreases. As a result, assigning a positive value to either variable when it is not 'necessary' would only strain the feasibility condition and does not impact the objective function. That is why the McCormick upper bound (UB) constraints (e.g., aij ≤ xi and aij ≤ yj, ∀(i,j) ∈ E) are not needed during the linear transformation, and we omit them. Also, we will be using the same aij and bij variables whenever McCormick inequalities are introduced for the same transformations. Although we end up with a linear model, the numbers of variables and constraints are still bounded by O(m). We propose a decomposition algorithm to solve the model at scale, which is discussed in detail in the next section.
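As a sanity check on the omitted upper-bound constraints, note that for binary x and y the lower envelope max(0, x_i + y_j − 1) already equals the product x_i y_j, so the linearized LHS of (4.2b) matches the nonlinear LHS of (4.1f). A small sketch of this check (our own; each listed edge is treated as a single orientation):

```python
import math
from itertools import product

def lhs_nonlinear(edges, p, x, y):
    # Constraint (4.1f): sum log(p_ij) x_i y_j + sum_{i<j} log(1-p_ij) y_i y_j.
    total = 0.0
    for (i, j) in edges:
        total += math.log(p[(i, j)]) * x[i] * y[j]
        if i < j:
            total += math.log(1 - p[(i, j)]) * y[i] * y[j]
    return total

def lhs_mccormick(edges, p, x, y):
    # Same LHS with a_ij and b_ij at their lower envelopes max(0, . + . - 1).
    total = 0.0
    for (i, j) in edges:
        a = max(0, x[i] + y[j] - 1)
        total += math.log(p[(i, j)]) * a
        if i < j:
            b = max(0, y[i] + y[j] - 1)
            total += math.log(1 - p[(i, j)]) * b
    return total
```

Enumerating all binary assignments on a toy graph confirms the two functions agree everywhere.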

4.3 Solution Methodology

We use Benders Decomposition (BD) as our solution methodology. Our method removes Constraints (4.1b)-(4.1c) and (4.1f) to design a master problem (MP) whose aim is to identify a candidate pseudo-star. We then obtain two different subproblems (SPs) that focus on the feasibility (i.e., Constraint (4.1f)) and open-neighborhood (i.e., Constraints (4.1b)-(4.1c)) components of the problem separately, in that order; this is because the feasibility component has no impact on the objective.

At every candidate solution, we first check the feasibility condition. If the condition does not hold, then we eliminate the current solution via either Benders feasibility cuts or logic-based Benders cuts (LBBCs). If the pseudo-star is feasible, then we proceed to the next SP to check whether an optimality cut that aims to approximate the objective value (i.e., the open neighborhood with the maximum total probability assignment) can be generated. Below we present the MP without the optimality, feasibility, and LBB cuts, where t represents the estimate of the true objective. Benders cuts will be presented shortly and are incorporated into the MP at every iteration as needed.

MP = max_t {t : (4.1d), (4.1e), (4.1g), t ≤ UB}
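The resulting loop can be sketched generically: the MP proposes a star and an estimate t, and the subproblems either cut the star off or certify the estimate. This skeleton is our own schematic with all solver calls stubbed out as caller-supplied functions, not the dissertation's implementation:

```python
def benders_loop(solve_master, check_feasibility, feasibility_cut,
                 optimality_value, optimality_cut, tol=1e-6, max_iter=1000):
    """Generic Benders skeleton: returns the best star found.

    solve_master() -> (star, t_estimate); the *_cut callables append
    constraints to the master problem."""
    best = None
    for _ in range(max_iter):
        star, t = solve_master()
        if not check_feasibility(star):
            feasibility_cut(star)          # Benders feasibility cut or LBBC
            continue
        true_obj = optimality_value(star)  # solve the open-neighborhood SP
        if t > true_obj + tol:
            optimality_cut(star)           # t over-estimates: tighten the MP
            continue
        best = star                        # estimate matches: star is optimal
        break
    return best
```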


4.3.1 Benders Feasibility Cuts

An important observation is that the McCormick variables (i.e., aij and bij) used to linearize IP can be relaxed and still take binary values, for the same reason that we do not include the UB constraints.

Proposition 4. Variables aij and bij take binary values when they are relaxed.

Proof. Without loss of generality (WLOG), let us examine aij. If both xi and yj take the value of one, then aij = 1 by Constraint (4.2c). If (a) both are zero or (b) either of them is zero, then aij becomes free. However, since increasing aij would only decrease the LHS of Constraint (4.2b), the model would not prefer to assign a positive value to aij. Even if there could be a degenerate case where Constraint (4.2b) is satisfied with aij assigned a positive value, we can set aij = 0 and obtain the same objective value. As a result, we obtain a binary optimal solution when variable aij is relaxed.

Proposition 4 enables us to generate traditional Benders feasibility cuts. The second important observation is that the feasibility condition has two components: the connections between the center and the leaf nodes (i.e., xi yj) and those among the leaf nodes (i.e., yi yj). Hence, one can generate two different cuts: i) a local feasibility cut, where infeasibility involves both the center and leaf nodes (i.e., ∏_{j∈L} pkj ∏_{i,j∈L:(i,j)∈E} (1 − pij) < 1 − θ), or ii) a global feasibility cut, where infeasibility occurs directly within the set of leaf nodes (i.e., ∏_{i,j∈L:(i,j)∈E} (1 − pij) < 1 − θ). The reason we refer to the latter as global is that it is applicable to any potential center that could be connected to that set of leaf nodes. We first present the traditional Benders local feasibility problem. Let δ, νij, and µij be the penalty variables defined for Constraints (4.2b), (4.2c), and (4.2d), respectively. They approximate how much we should perturb the current fixed solution to ensure it satisfies the subproblem constraints.

BLF:

min δ + Σ_{(i,j)∈E} νij + Σ_{i<j:(i,j)∈E} µij                                    (4.3a)
s.t. Σ_{(i,j)∈E} log(pij) aij + Σ_{i<j:(i,j)∈E} log(1 − pij) bij + δ ≥ log(1 − θ)   (4.3b)
     aij + νij ≥ xi + yj − 1                            ∀(i,j) ∈ E               (4.3c)
     bij + µij ≥ yi + yj − 1                            ∀i < j : (i,j) ∈ E       (4.3d)
     aij, νij ∈ R+                                      ∀(i,j) ∈ E               (4.3e)
     bij, µij ∈ R+                                      ∀i < j : (i,j) ∈ E       (4.3f)
     δ ∈ R+                                                                      (4.3g)

We then take the dual of the problem, where dual variables ρ, υ, and Υ correspond to

Constraints (4.3b), (4.3c), and (4.3d), respectively.

DBLF:

max log(1 − θ) ρ + Σ_{(i,j)∈E} (xi + yj − 1) υij + Σ_{i<j:(i,j)∈E} (yi + yj − 1) Υij   (4.4a)
s.t. log(pij) ρ + υij ≤ 0                               ∀(i,j) ∈ E               (4.4b)
     log(1 − pij) ρ + Υij ≤ 0                           ∀i < j : (i,j) ∈ E       (4.4c)
     0 ≤ ρ ≤ 1                                                                   (4.4d)
     0 ≤ υij ≤ 1                                        ∀(i,j) ∈ E               (4.4e)
     0 ≤ Υij ≤ 1                                        ∀i < j : (i,j) ∈ E       (4.4f)

The following, which we call a local Benders feasibility cut, can be added to the MP to eliminate the infeasible candidate solution:

log(1 − θ) ρ + Σ_{(i,j)∈E} υij (xi + yj − 1) + Σ_{i<j:(i,j)∈E} Υij (yi + yj − 1) ≤ 0     (4.5)

However, if the infeasibility takes place because of the leaf nodes selected, even without taking the center node into consideration, then we solve a smaller LP to obtain a global feasibility cut. WLOG, let us use the same penalty variables (i.e., δ and µij) and define the following feasibility problem.

BGF:

min δ + Σ_{i<j:(i,j)∈E} µij                                                      (4.6a)
s.t. Σ_{i<j:(i,j)∈E} log(1 − pij) bij + δ ≥ log(1 − θ)                           (4.6b)
     bij + µij ≥ yi + yj − 1                            ∀i < j : (i,j) ∈ E       (4.6c)
     bij, µij ∈ R+                                      ∀i < j : (i,j) ∈ E       (4.6d)
     δ ∈ R+                                                                      (4.6e)

WLOG, let ρ and Υ be the dual variables corresponding to Constraints (4.6b) and (4.6c). The dual of BGF can be presented as follows.

DBGF:

max log(1 − θ) ρ + Σ_{i<j:(i,j)∈E} (yi + yj − 1) Υij                             (4.7a)
s.t. log(1 − pij) ρ + Υij ≤ 0                           ∀i < j : (i,j) ∈ E       (4.7b)
     0 ≤ ρ ≤ 1                                                                   (4.7c)
     0 ≤ Υij ≤ 1                                        ∀i < j : (i,j) ∈ E       (4.7d)

In this case, we obtain a tighter feasibility cut than Ineq. (4.5), since it is not tied to any center node. The constraint is:

log(1 − θ) ρ + Σ_{i<j:(i,j)∈E} (yi + yj − 1) Υij ≤ 0                             (4.8)

Note that both Benders feasibility cuts introduced are associated with dual solutions, which are mostly fractional. Our preliminary results indicate that such feasibility cuts are not able to yield quick convergence even on small-scale instances. Hence, in the following section, we examine LBBCs. In Section 4.5.3, we test both sets of cuts and discuss their impacts on the solution time.

4.3.2 Logic-Based Benders Cuts

For a fixed pseudo-star Sk centered at node k with a set of leaf nodes denoted by L, if the feasibility condition is not satisfied, then we can generate a generic no-good cut that aims to change the current solution by removing a single leaf node from Sk. Note that the cut should not consider adding another leaf node, since adding a new leaf node would only decrease the LHS of (2.2). Hence, we define the following LBBC:

Σ_{j∈L} yj ≤ (|L| − 1) xk + |L| (1 − xk)                                         (4.9)


Theorem 8. The LBB feasibility cut (4.9) is valid.

Proof. To prove that an LBBC is valid, we show that (i) the constraint cuts off the current master solution, since it is infeasible, and (ii) it does not eliminate a globally feasible solution. We use the same methodology to prove the similar theorems presented in the rest of this chapter.

Note that if node k is selected as the center node (i.e., xk = 1), then the right hand side

(RHS) implies that at least one of the leaf nodes in L of Sk must be turned off, thereby eliminating

the current solution. Otherwise, the center node alternates without enforcing any restriction on the

nodes in L, thus, the infeasible solution is eliminated. As a result, pseudo-star Sk is guaranteed to

be removed from consideration.

In the following iterations, when we obtain a new candidate pseudo-star (feasible or not), if it is centered at a node different than k, then the cut clearly does not eliminate a feasible solution, since the RHS becomes the aggregation of the binary restrictions for the leaf nodes, in other words, a trivial constraint. If k is the candidate center node in an alternative S′k, since the RHS forces at least one leaf node in L to be dropped, it guarantees that S′k ≠ Sk. It also makes sure that the only solution removed is Sk; hence, no global feasible solution is removed.
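Evaluating cut (4.9) for a fixed (k, L) is straightforward; a small helper (ours) that mirrors the two cases in the proof:

```python
def no_good_cut_satisfied(k, L, x, y):
    """Evaluate LBBC (4.9): sum_{j in L} y_j <= (|L|-1) x_k + |L| (1 - x_k).

    x and y map nodes to {0, 1}; k is the fixed center and L the fixed leaf
    set of the eliminated pseudo-star (helper name is ours)."""
    lhs = sum(y[j] for j in L)
    rhs = (len(L) - 1) * x[k] + len(L) * (1 - x[k])
    return lhs <= rhs
```

The eliminated star itself violates the cut, while dropping a leaf or moving the center satisfies it.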

Note that cut (4.9) does not aggressively change the current solution and is not effective in general, because it only targets eliminating Sk rather than identifying the subset of nodes at the 'root' of its infeasibility. Thus, we can design an integer SP with a fixed center k to check whether we can eliminate more leaf nodes for Sk to become feasible.

WF := max_{~y∈{0,1}} { Σ_{j∈L} yj : Σ_{j∈L} log(pkj) xk yj + Σ_{i<j: i,j∈L} log(1 − pij) yi yj ≥ log(1 − θ) }

Model WF aims to identify the maximum number of leaf nodes that could be selected from L to obtain a feasible pseudo-star structure via a knapsack-type constraint. Since its nonlinearities come from products of two binary variables, we can use the McCormick inequalities. We then obtain an equivalent linear formulation:

LWF := max_{~y,~b∈{0,1}} { Σ_{j∈L} yj : Σ_{j∈L} log(pkj) xk yj + Σ_{i<j: i,j∈L} log(1 − pij) bij ≥ log(1 − θ), bij ≥ yi + yj − 1, ∀i < j : i, j ∈ L }


Let δ∗ be the optimal objective value of LWF. Then we define a new LBBC:

Σ_{j∈L} yj ≤ δ∗ xk + |L| (1 − xk)                                                (4.10)

Theorem 9. The LBB feasibility cut (4.10) is valid.

Proof. Similar to Theorem 8, the second component of the RHS (i.e., |L|(1 − xk)) guarantees that, when the center changes, no restriction is imposed and no global feasible solution is eliminated. Therefore, we examine the non-trivial scenario when xk = 1.

The objective of LWF selects as many leaf nodes as possible while ensuring feasibility via the constraint defined in WF. First, it makes sure that at least one leaf node is removed from the candidate pseudo-star; hence, δ∗ ≤ |L| − 1. As a result, the cut removes the current infeasible solution. Second, it implies that any alternative candidate pseudo-star S′k that has more than δ∗ leaf nodes from L is infeasible. Since the cut only prevents S′k from having more than δ∗ such leaf nodes, and all solutions having at most δ∗ of them remain feasible, it does not cut off any global feasible solution.

This cut is stronger than cut (4.9) since δ∗ ≤ |L| − 1; however, it still depends on the selection of k as the center, meaning that once the center node changes, the cut does not help. Therefore, we name (4.10) a local LBBC. The |L| term plays the role of a big-M; hence, the question becomes whether we can further improve cut (4.10).

Based on the same argument provided in the previous section, if the infeasibility occurs directly due to the connections among the leaf nodes selected (i.e., ∏_{i,j∈L} (1 − pij) < 1 − θ), then we can focus on a smaller IP model to obtain a better cut. This leads to a different integer SP, as well as a more general and stronger cut:

SF := max_{~y∈{0,1}} { Σ_{j∈L} yj : Σ_{i<j: i,j∈L} log(1 − pij) yi yj ≥ log(1 − θ) }

Similar to model WF, we have non-linear terms and again use the McCormick inequalities:

LSF := max_{~y,~b∈{0,1}} { Σ_{j∈L} yj : Σ_{i<j: i,j∈L} log(1 − pij) bij ≥ log(1 − θ), bij ≥ yi + yj − 1, ∀i < j : i, j ∈ L }

Let ∆∗ be the optimal objective value of LSF. Then, we define the following LBBC, which does not depend on variable x and is called a global LBBC:

Σ_{j∈L} yj ≤ ∆∗                                                                  (4.11)

Theorem 10. The LBB feasibility cut (4.11) is valid.

Proof. LSF guarantees that ∆∗ < |L|, as a result of which the current infeasible solution is eliminated. The cut also carries the information of how many nodes in L can be selected by any pseudo-star. Note that it does not necessarily guarantee feasibility, since depending on the selection of the center node we can still face a feasibility problem. Yet, it makes sure that no global feasible solution is removed, since any scenario with more than ∆∗ leaf nodes from L is directly infeasible regardless of which node is selected as the center.

One can see that all LBB feasibility cuts (4.11) could be pre-populated and added into LIP. Yet, since the number of such cuts is bounded by O(n2^m), it is not practical to incorporate them all in advance, and we instead generate them on the fly in our solution method.
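For small leaf sets, the value ∆* of LSF can be validated by brute force over subsets of L, which is handy when testing an implementation of cut (4.11). A sketch (ours; pairs absent from p are treated as non-edges, contributing zero to the sum):

```python
import math
from itertools import combinations

def delta_star(L, p, theta):
    """Max number of leaves from L whose pairwise edges keep
    sum log(1 - p_ij) >= log(1 - theta); brute force over subsets."""
    target = math.log(1 - theta)
    for r in range(len(L), 0, -1):
        for S in combinations(L, r):
            lhs = sum(math.log(1 - p[(i, j)])
                      for a, i in enumerate(S) for j in S[a + 1:]
                      if (i, j) in p)
            if lhs >= target:
                return r
    return 0
```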

4.3.3 Optimality Cuts

Once the fixed pseudo-star satisfies the feasibility condition, we proceed to solve an SP to generate an optimality cut. Given a fixed solution (~x, ~y), we define the following primal problem.

φ(x, y):

max Σ_{(i,j)∈E} pij zij                                                          (4.12a)
s.t. Σ_{j∈N(i)} zji ≤ 1 − xi − yi                       ∀i ∈ V                   (4.12b)
     zij ≤ xi + yi                                      ∀(i,j) ∈ E               (4.12c)
     zij ∈ {0, 1}                                       ∀(i,j) ∈ E               (4.12d)

Here we can use LP duality to generate the dual formulation by relaxing the variables zij. The relaxation of zij produces binary solutions when passing an incumbent solution to φ(x, y), because the constraint matrix is totally unimodular. Let βi and γij be the dual variables corresponding to Constraints (4.12b) and (4.12c), respectively. The dual of φ(x, y) is presented as follows.


Φ(x, y) := min_{β≥0, γ≥0} { ∑_{i∈V} (1 − x_i − y_i) β_i + ∑_{(i,j)∈E} (x_i + y_i) γ_{ij} : β_i + γ_{ji} ≥ p_{ji}, ∀(j, i) ∈ E }

We observe that the constraint set of the dual formulation Φ(x, y) does not depend on the fixed MP solution. Also, the constraint set is always closed and bounded; in other words, we are not concerned with feasibility, as expected. Whenever a violated solution is identified, we generate the following optimality cut and add it into the MP.

t ≤ ∑_{i∈V} β_i (1 − x_i − y_i) + ∑_{(i,j)∈E} γ_{ij} (x_i + y_i) (4.13)

Lastly, we illustrate our Benders implementation in Fig. 4.3. Note that here we show LBBCs (i.e., constraints (4.10) and (4.11)) as feasibility cuts. The only change required to use the traditional Benders feasibility cuts instead is the type of SP solved and the cut generated in the lower portion of the figure.

Figure 4.3: The illustration of the Benders Decomposition algorithm including Logic-based Benders cuts. [Flowchart: solve the MP using the branch-and-Benders approach and utilize the generic callback function; if the feasibility condition is met, solve Φ(x, y) and add the optimality cut; otherwise, if only leaf nodes cause the infeasibility, solve LSF, else solve LWF, and add the corresponding feasibility cut.]


4.4 Algorithmic Enhancements

In this section, we present the acceleration techniques that we adapt to speed up our Benders implementation. We note that any technique applicable to the full LIP is directly adopted there as well, to make a fair comparison in our computational testing.

4.4.1 Algorithmic Approach for Optimality Cuts

We observe that Φ(x, y) can be solved by a direct algorithm rather than by utilizing a commercial solver. More importantly, we can separate the problem over each node i, thereby enabling ourselves to generate multiple cuts at every iteration. Below we first show how to divide Φ(x, y) over each node as Φ_i(x, y) and then propose an algorithm that identifies the optimal solution of Φ_i(x, y) for a given i. This algorithm works for both incumbent and fractional solutions, i.e., it follows Modern BD. The problem over node i is:

Φ_i(x, y) := min_{β_i, γ ≥ 0} { (1 − x_i − y_i) β_i + ∑_{j∈N(i)} (x_j + y_j) γ_{ji} : β_i + γ_{ji} ≥ p_{ji}, ∀j ∈ N(i) }

We first restate the objective function of the MP as ∑_{i∈V} t_i and separate cut (4.13) over each node as shown below.

t_i ≤ β_i (1 − x_i − y_i) + ∑_{j∈N(i)} γ_{ji} (x_j + y_j), ∀i ∈ V (4.14)

In order to solve each Φ_i(x, y), we follow this procedure. First, for the sake of simplicity, let us assume that for a fixed node i, every node in N(i) is indexed from 1 to l, where p_{1i} ≥ p_{2i} ≥ · · · ≥ p_{li}. If there exists an index j such that ∑_{k=1}^{j} (x_k + y_k) > 1 − x_i − y_i, then we identify the minimum j satisfying the inequality and set β_i = p_{ji}; otherwise, we set β_i = 0. Then, we assign γ_{ji} = max{p_{ji} − β_i, 0}, ∀j ∈ N(i).
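A minimal Python sketch of this closed-form procedure (the `cap_i`/`neighbors` layout and the return format are illustrative choices on our part, not the dissertation's actual data structures):

```python
def solve_dual_subproblem(cap_i, neighbors):
    """Closed-form solution of the per-node dual subproblem Phi_i(x, y).

    cap_i:     the capacity term 1 - x_i - y_i of node i
    neighbors: list of (p_ji, x_j + y_j) pairs over j in N(i)
    Returns (beta_i, gammas), with gammas aligned to the neighbors
    sorted by probability in non-increasing order.
    """
    # Sort neighbors so that p_1i >= p_2i >= ... >= p_li.
    nbrs = sorted(neighbors, key=lambda t: t[0], reverse=True)
    beta, running = 0.0, 0.0
    for p, weight in nbrs:
        running += weight
        if running > cap_i:   # minimum prefix sum exceeding the capacity
            beta = p          # set beta_i = p_ji for that minimum j
            break
    # gamma_ji = max{p_ji - beta_i, 0} for every neighbor j
    gammas = [max(p - beta, 0.0) for p, _ in nbrs]
    return beta, gammas
```

For example, with cap_i = 1 (node i is neither a center nor a leaf) and neighbors [(0.9, 1), (0.6, 1), (0.3, 0)], the prefix sums first exceed 1 at the second neighbor, so β_i = 0.6 and only the first neighbor receives a positive γ.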

4.4.2 Greedy Heuristic and Warm-Start

Our preliminary experiments on the SPSDC problem indicate that our selected commercial solver, CPLEX, has difficulty in determining an initial feasible solution, as well as in improving the optimality gaps, both when solving the LIP directly and when solving the problem via BD. Therefore, we


design a greedy heuristic that produces an induced pseudo-star for every node in order to test the

impact of warm-start (see Alg. 4).

Given a node i and a pseudo-star S_i centered at i, let the uncovered (i.e., not yet covered by an element of S_i) first- and second-degree nodes of i be represented by R ⊆ (N(i) ∪ N²(i)) \ ∪_{j∈S_i} N(j). We then define R̄ as the complement of R. We let h_k be the index of the element in S_i that is assigned to a covered neighbor node k in R̄. For a node j ∈ N(i), we define u_j, the contribution of node j, as the total increase in the objective in case j is selected as a leaf node, where u_j := ∑_{k∈R∩N(j)} p_{jk} + ∑_{k∈R̄∩N(j)} max{p_{jk} − p_{h_k k}, 0}. Finally, we let ζ_j be the total probability value that node j adds into S_i if j is selected as a leaf node (i.e., ζ_j = p_{ij} ∏_{k∈S_i\{i}} (1 − p_{kj})).

Algorithm 4: Greedy Heuristic

Input: i ∈ V
1   S_i ← {i};
2   C ← N(i);                       # candidate leaf nodes
3   π = 1;                          # total probability of S_i
4   z_{ij} = 1, ∀j ∈ N(i);          # initially assign the center to every node in N(i)
5   R = N²(i);
6   while C ≠ ∅ do
7       j* = arg max_{j∈C} u_j ζ_j;
8       if ζ_{j*} π < 1 − θ then
9           C ← C \ {j*};
10      else
11          S_i ← S_i ∪ {j*};
12          z_{ij*} = 0;
13          C ← C \ {j*};
14          R ← N²(i) \ N(j*);
15          π = ζ_{j*} π;
16          for k ∈ N(j*) do
17              if ∃ h_k ∈ S_i : z_{h_k k} = 1 then
18                  if p_{h_k k} < p_{j*k} then
19                      z_{h_k k} = 0;
20                      z_{j*k} = 1;
21                  else
22                      continue;
23              else
24                  z_{j*k} = 1;
25  return S_i, ~z

Given a node i, the heuristic identifies the candidate leaf node that has the highest weight function (i.e., w_j = u_j ζ_j, j ∈ C) among the candidate leaf nodes, and the node is added into S_i as long as it does not violate the feasibility condition. If it is violated, then the node is removed from C. Once


we obtain one candidate pseudo-star centered at each node, we evaluate the objective value of each (i.e., ∑_{(i,j)∈E} p_{ij} z_{ij}) and warm-start both the IP and the MP via the best solution.
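A condensed Python sketch of the core loop of Alg. 4 follows. It is a simplification under stated assumptions: candidates are scored by ζ_j alone (the full heuristic uses w_j = u_j ζ_j), the assignment bookkeeping for ~z is omitted, and the graph layout (`adj`, frozenset-keyed `p`) is illustrative:

```python
from math import prod

def greedy_pseudo_star(i, adj, p, theta):
    """Greedily grow a pseudo-star centered at node i.

    adj[v] = iterable of neighbors of v
    p[frozenset({u, v})] = probability on edge (u, v)
    A candidate j is accepted only if zeta_j * pi >= 1 - theta,
    mirroring the feasibility check on line 8 of Alg. 4.
    """
    S, pi = {i}, 1.0
    C = set(adj[i])
    while C:
        # zeta_j: probability node j adds to S if selected as a leaf
        zeta = {j: p[frozenset((i, j))]
                   * prod(1 - p.get(frozenset((k, j)), 0.0)
                          for k in S if k != i)
                for j in C}
        j_star = max(C, key=zeta.get)
        C.discard(j_star)
        if zeta[j_star] * pi >= 1 - theta:
            S.add(j_star)
            pi *= zeta[j_star]
    return S, pi
```

On a triangle with p(0,1) = 0.9, p(0,2) = 0.8, p(1,2) = 0.5 and θ = 0.5, node 1 is accepted first; node 2 then contributes only ζ₂ = 0.8 · (1 − 0.5) = 0.4, which would drop the star's probability below 1 − θ, so it is rejected.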

4.4.3 Valid Inequalities

While we aim to help the solver with improving the primal bounds via warm-start, it is

also important to use valid inequalities to help with the dual bounds. With this purpose, we use

the heuristic algorithm proposed in Chapter 3 and adapt it to our problem. While more details

and pseudocode can be found in Section 3.4.2 and Appendix A, here we informally explain the heuristic and our slight modification.

First, we note that the heuristic remains a valid UB even if we are concerned with a deterministic objective in the SPSDC problem. For a given node i and candidate induced star S_k, let δ_{S_k} be the UB. We initially set δ_{S_k} = |N(i) ∪ N²(i)|. The heuristic identifies each node j in N(i) which creates a unique path to a node in N²(i) and decreases δ_{S_k} for each j identified.

Once the bound is obtained, we sum up the δ_{S_i} largest probability values in the set V_i = {p_{ij} : j ∈ N(i)} ∪ {p_{jk} : j ∈ N(i), k ∈ N(i) ∪ N²(i)}. Let the summation be represented by τ_i. Then, the following is a valid inequality which can be placed in the MP.

∑_{i∈V} t_i ≤ ∑_{i∈V} τ_i x_i (4.15)

Note that we use the same bound for the objective function (4.2a) in LIP. In addition, since we look for a unique assignment between a neighbor node and a pseudo-star element, the contribution of each node to the objective is bounded above by its largest probability connection. The following is a valid inequality that can only be used in the MP:

t_i ≤ max_{j∈N(i)} p_{ji}, ∀i ∈ V (4.16)
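Both bounds can be computed cheaply from the edge probabilities alone. A sketch, assuming edge probabilities are stored in a dict keyed by frozenset pairs and that the bound δ_{S_i} is supplied externally:

```python
import heapq

def max_prob_bound(i, adj, p):
    """Bound (4.16): t_i <= max over j in N(i) of p_ji."""
    return max(p[frozenset((i, j))] for j in adj[i])

def tau(values, delta):
    """Bound tau_i of (4.15): the sum of the delta largest
    probability values in the candidate set V_i."""
    return sum(heapq.nlargest(delta, values))
```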

4.4.4 Separation of Fractional Solutions

Our preliminary experiments indicate that the initial MP quickly ends up being overloaded

with feasibility cuts, thus limiting its ability to solve the problem. In addition, having fractional

values for the center variable increases the difficulty of the feasibility separation problem. Therefore,

in our implementation, fractional solutions are only separated when all variables xi are binary and

the leaf variables are fractional. Otherwise, we let the solver continue its branching process.


When it comes to separating fractional y solutions, we adopt two different strategies. First, we treat each y_i having a fractional value as a leaf node and conduct the feasibility test accordingly. In other words, we apply a rounding heuristic to turn the fractional solution into an integer solution. If the current solution is not feasible, then we proceed to solve a feasibility problem. If the feasibility condition is met, then we focus on the dual problem with the original fractional solutions. As a second approach, we follow the standard procedure and perform the feasibility test with the original fractional values. Employing the latter strategy turns out to be the most effective, since the solver generates fewer user-defined cuts, as well as branching into fewer nodes to reach the optimal in most of the instances.

4.5 Experimental Results

We perform all the experiments using the Java API of the CPLEX 12.8.1 solver on a laptop with a 3.10GHz Intel Core i7-6500 processor and 16 GB of RAM. We change the default CPLEX settings during the decomposition implementation. Similar to Chapter 3, we switch the MIP emphasis to optimality over feasibility, use strong branching (i.e., VarSel = 3), and set the heuristic frequency to 1000 (i.e., RINSHeur = 1000). Furthermore, we set the number of threads to the number of cores on the laptop (i.e., 4), both when solving the IP directly and when solving the model via BD.

4.5.1 Networks Based on the Watts-Strogatz Model

Due to the complexity of our model, it becomes challenging to directly apply it to PPINs

of the scale available. Therefore, we randomly generate network instances for testing purposes.

The instances are created based on the Watts-Strogatz (WS) model, which is also called the small-world model. In such models, we observe local clusters and a small average path length that is tuned by the rewiring probability. The first reason for selecting the small-world network as our choice is that one can observe a large number of local clusters in PPINs. Second, the diameter of PPINs is relatively small. For instance, we take the datasets of two organisms, Helicobacter pylori (HP) with 1,570 nodes and Staphylococcus aureus (SA) with 2,853 nodes, as our reference (Szklarczyk et al., 2015). In both networks, the diameter is six.

In WS models, one can tune the neighborhood parameter (nei) and the rewiring probability (rp) to generate different network instances. We consider instances with |V| ∈ {500, 750, 1000}, nei ∈ {12, 14, 16}, and rp ∈ {0.3, 0.5, 0.7}, which in total produces 27 instances.
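A self-contained sketch of how such instances can be generated (a standard WS construction: a ring lattice followed by edge rewiring; reading nei as the number of neighbors on each side of a node is one common convention and an assumption on our part):

```python
import random

def watts_strogatz(n, nei, rp, seed=0):
    """Generate a small-world graph as a set of frozenset edges:
    a ring lattice where each node links to its nei nearest
    neighbors on each side, then every edge is rewired with
    probability rp (avoiding self-loops and duplicate edges)."""
    rng = random.Random(seed)
    edges = {frozenset((i, (i + off) % n))
             for i in range(n) for off in range(1, nei + 1)}
    for u, v in sorted(tuple(sorted(e)) for e in edges):
        if rng.random() < rp:
            # candidate new endpoints that keep the graph simple
            candidates = [w for w in range(n)
                          if w != u and frozenset((u, w)) not in edges]
            if candidates:
                edges.remove(frozenset((u, v)))
                edges.add(frozenset((u, rng.choice(candidates))))
    return edges
```

Since rewiring replaces one edge by another, the edge count n · nei is preserved; e.g., (|V|, nei) = (500, 12) always yields 6,000 edges.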

4.5.2 Calculation of Probability Values

In this section, we present the methodology that we use to identify the probability values associated with the edges. In PPINs, there exist interaction scores in (1, 1000), where a higher score implies a stronger interaction between two proteins. We normalize the interaction scores and plot the distribution of the normalized scores (see Figures 4.4 and 4.5 for HP and SA, respectively).

Figure 4.4: Distribution of Interaction Scores in HP Figure 4.5: Distribution of Interaction Scores in SA

We observe that the normalized scores show a right-skewed distribution, which resembles both the gamma distribution with a shape parameter less than or equal to one and the exponential distribution with a rate parameter around 1.5. Therefore, generating probability values according to either distribution is acceptable. We use the exponential distribution with rate parameter 1.5 to generate the probability values associated with the edges. It is important to mention that one could also use Monte-Carlo sampling on the real data sets in order to generate both network samples and probability values. However, sampling from a PPIN would favor that specific network. Hence, we prefer our proposed random generation process over Monte-Carlo sampling in order to demonstrate its wider applicability.
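A sketch of this generation step (how exponential draws falling outside (0, 1) are handled is not specified in the text; rejection sampling is our assumption):

```python
import random

def edge_probabilities(edges, rate=1.5, seed=0):
    """Draw an interaction probability for every edge from an
    Exponential(rate) distribution, truncated to (0, 1) by
    rejection sampling so each value is a valid probability."""
    rng = random.Random(seed)
    probs = {}
    for e in edges:
        x = rng.expovariate(rate)
        while not 0.0 < x < 1.0:
            x = rng.expovariate(rate)
        probs[e] = x
    return probs
```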

4.5.3 Computational Experiments

We set a time limit of 5,400 seconds. In initial testing, we set θ = 0.99. Before getting into

a detailed analysis, we share a macro table which summarizes the experimental results. In Table

4.1, we compare six methods including i) LIP, ii) LIP with warm-start (LIP-WS), iii) BD with the


LBB cuts (BD-LB) used as feasibility cuts, iv) BD-LB with warm-start (BD-LB-WS), v) BD with the traditional Benders feasibility cuts (BD-TB), and vi) BD-TB with warm-start (BD-TB-WS). For each method, we report the number of instances solved to optimality, the ratio between the number of optimal solutions and the total number of instances, the average optimality gap calculated over all the instances, as well as the number of times the method achieved the best performance across all six methods. The best performance is first evaluated according to the optimality gaps. If more than one method returns the optimal solution for the same test instance, then we examine the time spent to reach the optimal. In addition, we use bold font to indicate the best method for each criterion.

Table 4.1: Summary of results (27 instances)

                      LIP     LIP-WS  BD-LB  BD-LB-WS  BD-TB  BD-TB-WS
Optimal               8       7       17     16        11     13
Percentage (%)        30      26      63     60        41     49
Average Gap (%)       168.88  181.56  7.5    8.51      64.98  54.33
Best Performance      2       0       17     7         0      1

One can clearly observe that our BD implementation including the LBBCs shows the best performance. Although warm-start does not seem to be effective in improving the performance here, we further examine its impact as θ varies. On the other hand, solving the LIP with or without warm-start does not even produce an average optimality gap below 100%. Even though BD with the traditional feasibility cuts performs relatively better than solving the problem directly via LIP, it still cannot compete with either BD-LB or BD-LB-WS. This implies that LBBCs perform better than the traditional Benders cuts in dealing with the infeasibility. We believe that this is because LBBCs carry more specific information and tell the model exactly which leaf nodes cause the infeasibility (i.e., see LBBC (4.11)).

We now move into a detailed analysis and share the computational results obtained through

all six methods for each instance. In Table 4.2, we report the following outputs: i) time spent to

reach the solution in seconds, ii) the final optimality gap in percentage, and iii) the number of branch

and bound (BB) nodes visited by the solver. Note that if the optimal solution is not reached within

the time limit (TL), then we report TL in the table. Also, similar to Table 4.1, we use bold font to indicate which method performs best on each network instance.

Overall, we observe that warm-start does not show a fruitful impact on the performance of either LIP or BD-LB. The model LIP shows a better performance than LIP-WS in nearly all the instances, with four exceptions for which the optimality gap differences are negligible (e.g.,


Table 4.2: The computational results with θ = 0.99

[For each of the 27 instances (|V|, nei, rp), the table reports Time (sec), Gap (%), and BB Nodes for each of the six methods: LIP, LIP-WS, BD-LB, BD-LB-WS, BD-TB, and BD-TB-WS.]


Figure 4.6: Solution time comparison between BD-LB and BD-LB-WS

Figure 4.7: Optimality gap comparison between BD-LB and BD-LB-WS

Figure 4.8: Solution time comparison between BD-TB and BD-TB-WS

Figure 4.9: Optimality gap comparison between BD-TB and BD-TB-WS

(1000, 12, 0.5), (1000, 14, 0.5), and (1000, 16, 0.5)). As for BD including the LBB cuts, we present two figures for illustration purposes. Fig. 4.6 presents the solution time comparisons between BD-LB and BD-LB-WS, where we compare the instances solved to optimality by both methods. BD-LB reaches the optimal solution faster than BD-LB-WS in 11 out of 16 instances. Warm-start results in visiting more BB nodes, which might be the reason behind the higher solution times. Fig. 4.7 illustrates the comparison of optimality gaps when both BD-LB and BD-LB-WS fail to reach the optimal solution. BD-LB returns a better optimality gap than BD-LB-WS in 80% of the samples presented in the figure. However, when we look at BD with the traditional feasibility cuts, we observe a completely reversed trend, where BD-TB-WS outperforms BD-TB in most of the instances (i.e., 18 out of 27), which implies that warm-start does impact the performance in a positive way. Moreover, Figs. 4.8 and 4.9 illustrate the comparison of solution times and optimality gaps between BD-TB and BD-TB-WS, respectively. While the warm-start version beats BD-TB in eight out of eleven instances in terms of solution time as presented in Fig. 4.8, it also performs better with respect to the optimality gaps in 60% of the instances presented in Fig. 4.9.


outperforms the other two with respect to both solution time and solution quality. LIP shows quite poor performance, especially in the networks with more than 500 nodes, where the average optimality gap turns out to be 244.47%. The reason behind this could be two-fold. First, the number of BB nodes pruned by the solver is relatively small (see Table 4.2), which indicates that the size of the model becomes an issue for the solver to detect new branches. We believe that specifically Eq. (4.2b) defined in LIP (i.e., the feasibility condition) might cause numerical issues due to the existence of a high number of non-zero coefficients. Second, by looking at the engine logs, we observe that the number of feasible solutions identified by the solver within the TL is quite small. Thus, the solver has a hard time both reaching the optimal solution and determining a feasible solution. When analyzing the results returned by BD-TB-WS, we see that instances containing more than 500 nodes and having nei parameter 14 and higher are the most challenging ones, where only a single instance is solved to optimality (i.e., (750, 14, 0.7)) and the average optimality gap is 116.58% among those 12 instances.

Since the performance of the warm-start variant was somewhat "close" to that of the best algorithm (i.e., BD-LB) for the initial θ value, we further investigate the impact of warm-start. We now solve the problem for θ ∈ {0.95, 0.9, 0.8}. In Table 4.3, we present the computational experiments conducted via BD-LB and BD-LB-WS with varying θ values.

One can see that the warm-start strategy becomes quite useful as we decrease the θ value.

This is because the feasibility condition becomes harder to satisfy and, therefore, identifying feasible

integer solutions to the problem that have objectives close to the dual bounds becomes more difficult.

BD-LB-WS outperforms BD-LB in 59%, 78%, and 82% of the instances with θ = 0.95, θ = 0.9, and

θ = 0.8, respectively. More importantly, we do not have a single instance for which BD-LB reaches

the optimal while BD-LB-WS cannot. With θ = 0.95, when BD-LB returns optimality gaps over

200% for instances (1000, 14, 0.3) and (1000, 16, 0.7), we obtain optimality gaps of 28.79% and 40.05%

for those two instances via BD-LB-WS. Furthermore, with θ = 0.9, while BD-LB produces an average

optimality gap of 406.31% for instances (500, 16, 0.5), (750, 14, 0.3), (750, 16, 0.7), and (1000, 16, 0.7),

BD-LB-WS reaches the optimal within 3,478 seconds on average for the same instances. Lastly, with

θ = 0.8, BD-LB-WS reaches the optimal while BD-LB fails to do so for instance (1000, 16, 0.5). BD-

LB-WS performs better than BD-LB especially with respect to the solution time for the instances

where both methods reach the optimal. For such 16 instances, BD-LB-WS returns the optimal

solution faster than BD-LB in 11 of them. As a result, we continue our analysis with BD-LB-WS


Table 4.3: The computational results with different θ values via BD-LB and BD-LB-WS

[For each of the 27 instances (|V|, nei, rp), the table reports Time (sec), Gap (%), and BB Nodes for BD-LB and BD-LB-WS at θ = 0.95, θ = 0.9, and θ = 0.8.]


for Table 4.3.

On one hand, the BD implementation with the warm-start performs quite well when θ = 0.95 and θ = 0.90, where the numbers of instances solved to optimality are 22 and 23, respectively. While BD-LB on average branches into 35,249 nodes with θ = 0.99, in both of these cases the average number of BB nodes visited is roughly 24,500. This could be a good indication of the smaller average solution time for the instances solved to optimality. For example, our Benders implementations produce average solution times of 2,426 and 1,394 seconds when θ is 0.99 and 0.9, respectively. On the other hand, the optimality gaps returned for the instances where we fail to attain the optimal turn out to be quite bad with θ = 0.95 and θ = 0.90, where the average gaps are 255.77% and 428.05%, respectively. We note that the reason behind this failure is that CPLEX is not able to improve the dual bounds. Hence, it could be quite useful to identify new valid inequalities, especially upper bounds, similar to the ones proposed in Section 4.4.3, in future research. We expect that tighter upper bounds could increase the performance of BD to a great extent.

However, as we decrease the θ value further (i.e., to 0.8), the performance of BD again decreases, with the number of instances solved to optimality becoming 17. First, the average number of BB nodes pruned by the solver is 26,563; it is worth mentioning that this is still better than the figure obtained with θ = 0.99. Next, the average optimality gap for the instances where the optimal is not obtained turns out to be 552.82%, which indicates that we are far away from the optimal and extra computational effort is required for such instances. It is important to note that during our preliminary experiments we observed that the solver starts performing relatively better in the branching process once the optimality gap goes below 100%, and converges to the optimal quicker. However, it is hard to gain insight into the performance of a black-box solver. At this point, we believe that the more we decrease the θ value, the worse the performance we will observe, since the problem gets harder. Our intuition about what makes the problem harder is the existence of the feasibility constraint, which plays the role of a chance constraint: whenever we decrease θ, the RHS also decreases when the inequality is considered as a less-than-or-equal constraint.

4.6 Conclusion

In this chapter, we introduce a new centrality metric called the stochastic pseudo-star degree

centrality (SPSDC) for which we propose a non-linear binary optimization model. We study the


complexity of the problem and show that it is NP-complete on general graphs, as well as on trees and windmill graphs. We implement a branch-and-Benders approach strengthened by logic-based Benders cuts and several other acceleration techniques (e.g., valid inequalities and the generation of multi-cuts). Our decomposition approach outperforms solving the model via a commercial solver to a great extent in terms of both solution time and quality. Our test cases are generated according to small-world networks, which resemble real-world protein-protein interaction networks (PPINs). The deterministic star degree centrality concept was shown to be an effective centrality metric for detecting essential proteins in PPINs, and our proposed centrality metric can add to the set of proteins to explore for essentiality.

In a future study, it might be worth examining new acceleration techniques so that BD can be used to solve large-scale PPINs. This would open a door to analyzing large-scale biological networks in order to test the performance of this new centrality metric with respect to detecting essential proteins. In addition, it might be interesting to identify a new application area where the SPSDC can be utilized. One good example might be to investigate network resilience in financial networks in order to detect the most important financial entities in a market.


Chapter 5

Optimizing the Response for

Arctic Mass Rescue Events∗

In this chapter, we propose an integer programming (IP) model to to respond to a large-scale

mass rescue event in the Arctic. In Section 5.1, we first motivate our work and discuss the necessity of

an optimization model for an Arctic mass rescue event. Section 5.2 summarizes the related research

and explains our contribution. Section 5.3 gives a brief problem description and provides the reader

with an illustrative example. The details of our optimization problem are provided in Section 5.4

where we explain each constraint in detail. We then introduce the solution methodologies and discuss

our attempts to solve the model in Section 5.5. Our experimental results and our findings are shared

in Section 5.6. The chapter is summarized and future research is presented in Section 5.7.

5.1 Introduction

The Arctic has been experiencing large-scale changes along several dimensions in the last decade. Due to climate change and global warming, while air temperatures increase

(Przybylak and Wyszynski, 2020), sea ice thickness faces a rapid decrease (Shalina et al., 2020). In

addition, demographic changes are observed, including increases in both the number of non-indigenous people living in the region and in birth rates (Heleniak, 2020). Researchers indicate that unless

proper measures are taken, the infrastructure systems of the Arctic will be at risk by 2050 (Hjort

∗The paper has been accepted at Transportation Research Part E: Logistics and Transportation Review.


et al., 2018).

Maritime activities regarding tourism and the economy have been advancing as a result of

longer ice-free seasons (Messner, 2020; Østhagen, 2020). For instance, the Crystal Serenity, the

largest cruise ship to date to voyage in coastal Arctic waters, sailed between Anchorage, Alaska and

New York City through the Northwest Passage with 1000 passengers and over 600 crew members in

August 2016 and 2017 (Waldholz, 2016). In preparation for this event, a tabletop exercise (McNutt,

2016) was organized in collaboration with Crystal Cruises, the Canadian Coast Guard, Transport

Canada, the Department of Defense (U.S. Air Force), and the U.S. Coast Guard (USCG) in

2016. This exercise identified gaps in Arctic maritime search and rescue resources and highlighted

the impacts of the resource gaps on evacuees being rescued. The conversations and activities suggested the need for greater attention to Arctic mass rescue operations, and for greater visibility and

coordination of Arctic emergency response. We refer the interested reader to Elmhadhbi et al. (2020)

and Sarma et al. (2020) who highlight the importance of coordination between different emergency

responders during disaster response.

Recent changes in Arctic industrial activities, defense and tourism have amplified the need

for attention to resource availability and evacuee impacts during an Arctic mass rescue event (MRE).

Ship traffic and maritime activity in the region have increased, and will likely continue to increase in

the future (Østhagen, 2020) without agreements to limit the number of ships entering the region.

In 2021, in recognition of these trends, the U.S. Navy and the USCG, for the first time, issued a

joint Arctic strategy that cites expectations for increased Arctic maritime traffic due to commercial

shipping, natural resource exploration, tourism and military presence (Eckstein, 2021). Increased

Arctic maritime traffic occurs in waters that are largely uncharted because they have never been

ice-free in modern times. Only 4.1% of Arctic waters have been charted using modern multi-beam

sonar techniques (National Oceanic & Atmospheric Administration, 2021). Some waters were last

surveyed by Captain Cook using hand-held ropes and lead lines in the 18th century (Hoag, 2016).

Risks associated with maritime trade, and needs to consider personnel evacuation on ships, are

therefore significant and rising as maritime traffic increases, uncharted waters are increasingly ice-

free, and the size of passenger vessels is increasing (Statista Research Department, 2020).

Meanwhile, the oil and gas industry plays a major role in the economy of the region

and the lives of the people who live there (Morgunova, 2020). From a political perspective, both

Russia and China are seeking economic benefits by expanding oil and gas exploration activities


and are making new investments in the region (Stepien et al., 2020; Ilinova and Chanysheva, 2020).

Yet, the oil price war between Russia and Saudi Arabia that took place in 2020, together with the fallout from the COVID-19 pandemic, has negatively impacted oil drilling activities in the region for United States (U.S.)-based companies. For example, one of the largest oil companies, ConocoPhillips, announced that it would halt all drilling operations on the North Slope of Alaska (Hanlon,

2020). The Spring 2020 Revenue report released by the Alaska Department of Revenue forecasts

a $1.15 billion loss in revenue from oil in the current and next fiscal years (Alaska Department of

Revenue, 2020). Hence, on the U.S. side, new drilling activities are not expected to take place

in the near future although most of the existing oil-based activities continue to operate.

Arctic emergency response occurs in a setting that requires balancing activities related to

territorial disputes (Schofield and Østhagen, 2020); fishing and subsistence economies; endangered

species and wildlife habitats; industrial and commercial activity; and military operations (Allison

and Mandler, 2018; Ruskin, 2018; Humpert, 2019). Impacts from these activities can be particularly

significant in remote, seasonably variable, and infrastructure-poor settings with sparse populations

such as the Arctic.

The increasing number of visitors to the region is concerning due to the size of Arctic

communities. In Arctic Alaska, the largest community is Utqiagvik (formerly known as Barrow),

which has a population of 4335. The number of people on the Crystal Serenity was 34.6% of

Utqiagvik’s population and would exceed the population of most Arctic communities (see Table

5.1). Of further concern is that the health care system in Alaska was not designed for surges

resulting from potential Arctic MREs. There are currently 17 trauma centers in Alaska, and only two are Level II (Alaska Department of Health and Social Services, 2018) (Level I handles the highest emergencies); both are located in Anchorage (700 miles away from Utqiagvik). The only

trauma centers in Arctic Alaska are in Nome, Kotzebue, and Utqiagvik (see Table 5.1). It is neither

reasonable nor desirable for the evacuees to stay in the Arctic Alaska communities for a long time

during an MRE. Communities in the Arctic are not equipped to host a large number of evacuees for an extended period of time. In essence, responding to an MRE in Arctic Alaska becomes much

more difficult than responding to one in the continental United States since an influx of 1,600 people

would significantly strain the infrastructure of Arctic communities.

Maritime response operations require two sets of activities: (i) evacuating people from an

affected area to ‘safe zones’ (e.g., in our case, out of the Arctic), and (ii) providing them with


Table 5.1: Data on communities in Arctic Alaska (U.S. Bureau of the Census, 2019; Alaska Department of Health and Social Services, 2018)

Location          Population   % of Pop. of Passengers   Trauma Center Status
Nome              3797         39.50%                    Level IV
Kotzebue          3245         46.22%                    Level IV
Point Hope        692          216.76%                   —
Point Lay         247          607.28%                   —
Atqasuk           237          632.92%                   —
Wainwright        584          34.60%                    —
Utqiagvik         4335         34.60%                    Level IV
Crystal Serenity  1600         —                         —

the logistics support (i.e., relief commodities) throughout a period of time. In most maritime mass

rescues, once evacuees in distress are brought to shore, the response is often considered complete

since existing infrastructure typically has the ability to handle the influx of passengers. However, in

an Arctic MRE, two steps are required because of limited Arctic shelter, medical, food, and sanitary

infrastructure. Transporting evacuees from the cruise ship out of the Arctic by sea is neither feasible

nor preferred; for example, moving evacuees by sea to Anchorage from the North Slope of Alaska

could take more than 10 days, and this assumes the ship could hold and support the evacuees for that

length of time. As a result, maritime evacuation during this type of event comprises two aspects:

moving evacuees from the location of the evacuation (e.g., cruise ship) to local Arctic communities

and then out of the Arctic (e.g., into Anchorage, Alaska); and providing evacuees with their basic needs

through allocating resources and equipment. Such an evacuation process was seen most recently

in the grounding of the Akademic Ioffe, which ran aground about 45 miles away from Kugaaruk,

Canada (in the Arctic) on August 24, 2018 (Struzik, 2018). The sister ship of the Akademic Ioffe

reached it in 16 hours and brought all passengers to Kugaaruk (Humpert, 2018).

This chapter, which is the first work to model both maritime mass rescue evacuation and

logistics support, highlights the impacts and costs of resource constraints and unavailability, and

the impact on evacuees of those resource constraints during an Arctic MRE. Because an infusion of

evacuees in Arctic communities will strain the communities’ existing infrastructure and resources,

our model considers the communities’ capacities to handle the evacuees, given available shelter,

medical facilities and airport capacity, as well as system capabilities to bring resources and equipment

into the area to support the evacuees during the Arctic MRE. This work, therefore, captures the

characteristics of an infrastructure-poor setting such as the Arctic, and models the two requirements

of MREs (i.e., evacuation and logistics support), which are unique research contributions. It is the first

work, to the best of our knowledge, to quantitatively assess disaster response to Arctic MREs, falling

into the broad area of ‘smart’ disaster management (Neelam and Sood, 2020), where quantitative

tools are used to assess disaster response.

Outside of Arctic Alaska, the situation where there may be two phases of transportation

required for an evacuation would arise in other applications in remote regions, especially when


considering tourism. For example, evacuating tourists from sudden onset wildfires may involve

moving them immediately out of the area impacted by the event (e.g., using buses or cars) and

then sending them home from these safe locations using aircraft. A similar situation could arise in

popular remote trekking areas (e.g., the Himalayas) should avalanches occur, preventing the trekkers

from leaving the remote area. In this case, helicopters may be used to move the trekkers out of the

remote region to local communities prior to sending them home. A major finding of our analysis

on Arctic MREs is that the transportation resources are a major bottleneck in the process, which

would also provide insights into these other applications.

The remainder of this chapter is organized as follows: Section 5.2 summarizes the related

research. Section 5.3 gives a brief problem description. The details of our optimization problem

are provided in Section 5.4. We then introduce the solution methodologies in Section 5.5. Our

experimental results and our findings are shared in Section 5.6. The paper is summarized and future

research is presented in Section 5.7.

5.2 Literature Review

An Arctic mass rescue operation is similar to evacuating people from an area either before,

during, or after a disaster, with important distinctions, especially since the closest communities to

incident sites are relatively small and we still need to move the evacuees out of the Arctic due to

the reasons discussed in Section 5.1. The following areas of study are most closely related to our

work.

5.2.1 Evacuation Models with Relief Distribution

At a high level in evacuation models, evacuees are transported from an affected area to safe

zones, such as shelters, hospitals or distribution centers, and the required commodities are delivered

from major supply centers to support them. While Uster and Dalal (2017) develop a multi-objective mixed integer linear programming model to help integrate the evacuation process and relief material distribution after a foreseeable natural disaster (e.g., a hurricane), Stauffer and Kumar (2021) analyze the importance of taking the disposal cost of unused items into consideration when making initial resource deployment decisions before a predictable disaster. Sabouhi et al.

(2019) design an optimization model whose goal is to provide relief commodities to evacuees and


transport them to shelters in the aftermath of a natural disaster, along with making routing and

scheduling decisions for the vehicles used during the evacuation. Setiawan et al. (2019) propose

three different models to determine the best distribution center locations to obtain the optimal relief

resource deployment after a sudden-onset disaster (e.g., an earthquake). In another study, Li et al.

(2020b) address a scenario-based hybrid robust and stochastic network design problem to identify

the best integrated logistics decisions in terms of relief commodity and casualty distribution. Shu

et al. (2021) propose a network design model making emergency support location and supply pre-

positioning decisions and design a cutting plane algorithm to solve it. Zhong et al. (2020) similarly

look at a network design model and a detailed vehicle routing problem to deliver pre-positioned

goods to key distribution points (which could include shelters).

There are several shortcomings of applying this previous work to an Arctic MRE. First,

none of these studies considers deprivation costs, which are critical in post-disaster humanitarian logistics models in order to capture the actual impact of the event on people (Holguín-Veras et al.,

2013). Second, they do not consider the potential to transport relief commodities between the ‘safe

zones’ during the response, which is important in our situation since we can move existing stockpiles

between Arctic communities. Third, these previous studies do not consider moving evacuees out of

the ‘safe zones’ (Arctic communities) towards another location (Anchorage) and measure the time

to reach this final location. To the best of our knowledge, our study is the first to consider all these

features in an optimization model for Arctic MREs.

5.2.2 Prioritizing Victims During a Disaster

The concept of effectively prioritizing victims from a disaster has been well-studied. The

idea is to quickly triage victims in order to group them together and prioritize who receives relief

commodities. Existing triage methods include START (Elbaih and Alnasser, 2020) and SALT (McKee et al., 2020). Sung and Lee (2016) use a survival probability function to prioritize victims in

order to optimize the transport of victims in ambulances to available hospitals in a mass casualty

incident. Liu et al. (2019) develop a multi-objective optimization model that identifies temporary

medical service facility locations and distributes the casualties to those facilities by taking casualty

triage and limited resources into consideration. Rambha et al. (2021) propose a stochastic model to identify the optimal patient distribution at a hospital after a hurricane, where patients are categorized based on risk levels. Finally, Farahani et al. (2020) survey the operations research literature


on mass casualty management and express the importance of on-site triage for successful disaster

management.

The limitation of this previous work is that it does not model how relief commodity

allocation decisions can impact the priority level of the victims (in our case, the evacuees). We

believe that modeling the role that deprivation time plays in increasing priority levels is important and,

further, will help to better capture the impact of the event on the evacuees.

5.2.3 Modeling the Impact of Relief Commodities

It is likely that during a large-scale, non-routine event there will be a surge in demand for relief commodities and, therefore, the allocation of the scarce relief commodities is of the utmost importance in order to minimize the impact of the event. For example, Rodríguez-Espíndola et al.

(2020) propose a multi-objective, stochastic optimization model to mitigate the shortage seen in relief

aid, shelter and healthcare support during the disaster preparedness process. The authors show that

shelter allocation decisions play a significant role in coping with the deprivation of relief resources and

its impact on evacuees. Li et al. (2018) employ a simulation model to emphasize the importance

of having explicit knowledge of the scarce vaccine inventory at hand in the case of an influenza

pandemic. The authors indicate that enhancing the visibility of inventory levels in vaccines brings

several benefits including increasing the vaccine allocation efficiency and decreasing the impact of

the pandemic.

Doan and Shaw (2019) discuss stochastic optimization techniques to allocate scarce relief

resources among multiple locations in the face of multiple, simultaneous disasters. This work highlights the influence of political aspects (e.g., inequities between different regions) during resource

allocation. Ramirez-Nafarrate et al. (2021) study a location-allocation problem to overcome the

trade-off between insufficient relief resources and limited response time, and provide a heuristic algorithm to solve it. Lastly, we refer the reader to Ye et al. (2020), who provide an extensive review

on successful management of disaster relief inventory.

This previous literature demonstrates that relief allocation plays an important role in the

aftermath of a disaster. This is especially important in the Arctic context since it is expected that

existing resources and equipment in Arctic communities will not be able to support the evacuees

and, therefore, we must correctly plan how to allocate resources and equipment from a central hub

(such as Anchorage). It further stresses the importance of dynamically updating our allocations


over the duration of the response, factoring in the planned movements of evacuees out of the Arctic

communities.

5.2.4 Deprivation Costs in Humanitarian Logistics

In our application, the evacuees have demand for relief commodities and it is likely that

we will not be able to fulfill all demand. Holguín-Veras et al. (2013) were the first to argue that

deprivation costs should be used instead of simply penalizing unmet demand as the former better

captures the true costs of human suffering. The authors discuss the ethical implications of prioritizing

the deprivation costs of the response as opposed to the logistics cost of the response. A key finding

is that the actual estimation of the true parameters of the deprivation cost is not a primary concern; simply including a deprivation cost function is important. Following up on this work, Pérez-Rodríguez and Holguín-Veras (2015) propose an innovative mathematical model to address the

challenges during inventory allocation in the aftermath of a disaster based on the notion of welfare

economics and deprivation costs. The objective of the model is to minimize the social cost incurred

during the response time, and the authors examine a heuristic method to solve this problem. In addition, Yu

et al. (2019) propose a nonlinear integer programming model to measure the performance of resource

allocation after a large-scale disaster by considering three metrics: efficiency, effectiveness, and

equity. The authors capture the effectiveness component through deprivation costs.

We will incorporate the concept of deprivation cost since it is more suitable and realistic than penalizing unmet demand in a large-scale disaster. We discretize the deprivation cost function

and further consider situations in which fulfilling resource demands does not eliminate the entire

deprivation cost. While the model introduced by Perez-Rodrıguez and Holguın-Veras (2015) has a

non-linear and non-convex objective function, we propose an integer linear programming model (by

discretization) having a similar objective component which aims to minimize the impact of unmet

demands on the evacuees.

5.2.5 Arctic Alaska and Emergency Response

Any tactical operation performed in Arctic Alaska would face major challenges due to (i) the

remoteness of the region, (ii) the lack of infrastructure throughout the Arctic, and (iii) the difficulty

of operating in Arctic conditions. Thus, existing policies and approaches for an MRE would not be


fully applicable and must be adapted to understand an Arctic event. In the literature, there are a

few social (i.e., non-operations research based) studies conducted specifically for Arctic emergency

response events. Fjørtoft and Berg (2020) discuss and emphasize the importance of preparedness

to sustain safer maritime and offshore operations in the Arctic Ocean. While Rogers et al. (2020)

provide arguments on the potential challenges that could be faced during Arctic search and rescue (SAR) events,

Pavlov (2020) discusses the issues and limitations expected to occur in oil spill incidents. Afenyo

et al. (2020) review risk assessment techniques for oil spills in the Arctic. Kelman (2020) examines

the need for and importance of settlement and shelter after an emergency response event in the

Arctic.

To the best of our knowledge, Garrett et al. (2017) are the first to develop an optimization model

for an Arctic emergency response event. The authors create a mixed-integer linear programming

model to understand how to site oil spill response resources to increase response capabilities in

Arctic Alaska. The oil spill response modeling introduced the concepts of follow-up tasks to deal

with the likely situation of missing deadlines of certain key response tasks. This directly models

the remoteness of the region where previous research would not be applicable. The researchers

address some policy questions, such as stockpile and infrastructure investments, that can be utilized

in long-term planning efforts. We complement this work by examining a different type of emergency

response, namely Arctic MREs. Future work in Arctic emergency response could consider the role

of unmanned vehicles (Aiello et al., 2020), especially given the harsh environments that the response

may be operating in.

5.2.6 Our Contribution

In this research, we create a mass rescue model whose objectives are to minimize the impact

of a maritime accident in the Arctic on the evacuees and minimize the average time required for the

evacuees to move out of the Arctic. We believe that this is the first work presenting an optimization

model designed specifically for an Arctic MRE, which is increasingly important as maritime activities

in the area are projected to increase in the near future.

Most importantly, our model and quantitative analysis can be used to assess gaps in Arctic

MRE capabilities and can thus be used to prioritize investments to improve these capabilities.

Beyond these technical contributions, our work is important since it introduces an important area

where future transportation will likely take place due to changes in the Arctic.


Although detailed passenger evacuation aboard vessels has been well-studied (e.g., Hu et al.

(2019)), models to assess the gaps in passenger evacuation in remote and infrastructure-poor settings have received less attention despite their practical importance. In Arctic workshops and tabletop exercises, emergency response leadership acknowledged that an Arctic MRE would likely not

accommodate all passengers and would overwhelm Arctic villages because of inadequacies in evacuation transport, support logistics, and medical, berthing, sanitary and housing requirements (Arctic

Domain Awareness Center, 2016). Tabletop exercises, such as the Arctic Incident of National Significance (Arctic Domain Awareness Center, 2016) and the Arctic Maritime Horizons Workshop (Arctic

Domain Awareness Center, 2021), help to lay out the challenges of Arctic emergency response. Our

work contributes to these exercises since it seeks to quantify the impact of inadequacies. It also

determines the gaps in planning exercises and preparation phases in terms of transportation and

logistics operations and reveals the importance of the role of optimization in emergency response in

the Arctic. Therefore, it moves beyond tabletop exercises and highlights the human costs associated

with large-scale disaster response in infrastructure-poor settings: evacuees will not be evacuated in

a timely manner, or at all, and there could be significant strain on local communities.

5.3 Problem Description

Arctic mass rescue events (MREs) require moving evacuees from a distressed ship, transporting them to Arctic communities, and then transporting them out of the Arctic (we focus specifically

on moving them to Anchorage) to complete the operations. This needs to occur while supporting the

evacuees as well. Movement within our problem can be represented with a transportation network,

an example of which is in Fig 5.1.

The modeling process for our study involved observing tabletop planning exercises and

stakeholder interviews to form some of the core assumptions of our model. We were able to observe

the Northwest Passage Tabletop Exercise in 2017, which involved a variety of stakeholders from Canada and the United States and served as a planning exercise to understand the response to an MRE.

This helped to highlight some of the considerations that would go into decision-making in real-time.

We also asked initial scoping questions to officials in District 17 of the United States Coast Guard

(USCG), which covers the entire state of Alaska. These officials had significant experience in search

and rescue (including participating in the aforementioned tabletop exercise). We were also able


[Figure 5.1 depicts a transportation network linking Anchorage with the Arctic communities of Nome, Kotzebue, Point Hope, Point Lay, Atqasuk, Wainwright, and Utqiagvik; the labeled leg distances are 670, 695, 545, 535, and 717 miles.]

Figure 5.1: Visualization of transportation network in North Slope

to obtain answers to important questions from the practitioner's perspective in building our model and data, including:

• What would be the process of moving evacuees out of the Arctic? Answer: evacuate them using

vessels to Arctic communities and then use air assets to move them out of these communities.

• What type of assets would be used to transport evacuees to shore? Answer: a combination of

USCG vessels and vessels of opportunity.

• Where and how would evacuees be transported once on-shore? Answer: A combination of

federal, state, and privately-owned aircraft.

• How would the Air National Guard and the U.S. Air Force be involved? Answer: They would

be significant in terms of the logistics required to support evacuees with resources and assets.

5.3.1 Important Concepts Used in Modeling

In order to model the impact of the event on the evacuees, we introduce three important

discussions on priority levels of the evacuees, the relief commodities (classified as either resources

or equipment) and their role, and then how to model when evacuees are deprived of those relief

commodities.

5.3.1.1 Priority Level

The priority level of an evacuee is meant to model his/her medical status, where a lower priority level is associated with lower severity. If the demands (needs) of an evacuee are not fulfilled, then their priority level may increase. Alternatively, the level may decrease with appropriate


medical care (although we note that is not likely to occur during the event given the limitations of

health care facilities and number of medical personnel in Arctic Alaska). We aim to make logistics

decisions in order to minimize the deterioration of evacuees' existing medical states and transport

them to Anchorage as soon as possible in order to provide service there. It is important to note that

having a higher priority level does not necessarily mean that it is best to provide relief commodities

to a person since it may be important to be proactive and to prevent the medical status of the

other evacuees from getting worse. Further, certain relief commodities may only be necessary for

certain priority levels. Our proposed modeling will focus on allocating relief commodities in order

to minimize the cumulative impact of the event across all evacuees.
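As an illustration of these priority-level dynamics, the sketch below encodes one plausible update rule in Python; the three-period threshold and the three-level scale are assumptions for illustration only, not parameters of our model.

```python
# Hypothetical sketch of priority-level dynamics (threshold and scale are
# illustrative assumptions, not values from the model in this chapter).
def update_priority(level, unmet_periods, threshold=3, max_level=3):
    """Raise an evacuee's priority level by one once their resource demand
    has gone unmet for `threshold` consecutive periods, capped at max_level."""
    if unmet_periods >= threshold:
        return min(level + 1, max_level)
    return level

# An evacuee at level 1 whose demand has been unmet for 4 periods escalates,
# while briefly unmet (or fulfilled) demand leaves the level unchanged.
assert update_priority(1, 4) == 2
assert update_priority(2, 1) == 2
```

Under such a rule, proactively serving lower-priority evacuees can prevent escalation, which is exactly the trade-off the allocation decisions must weigh.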

5.3.1.2 Relief Commodities

Relief commodities are defined as items given to evacuees in order to meet their basic needs.

Example commodities include food, water, shelter, and bedding. Based on these examples, it is

clear that a finer categorization into resources and equipment is necessary to capture the differences

between consumable and non-consumable commodities. Resources are defined as commodities where

the evacuees will have a recurring demand for them. Equipment can be viewed as a 'one-time' demand that, once fulfilled, does not recur. Further, equipment will become available once someone

assigned the equipment leaves the particular Arctic community, e.g., a bed can be reassigned to

another person. The re-allocation of equipment plays an important role in our model due to i) the

limited number of stock in the region, ii) non-consumability, and iii) the non-transportability of

certain equipment.

The demand for resources is likely similar for all priority levels, although missing the demand

may result in more severe impacts for higher priority levels (which we will discuss in the next section)

or may result in an evacuee increasing their priority level. However, the equipment needs for priority

levels will change since medical support (via a bed in a medical center) is necessary for the highest

priority level. This fact will complicate our models as the evacuee may be using equipment (e.g., a

normal bed) when they enter the highest priority level and only release the equipment once their new

equipment demand is met. We assume equipment demand is satisfied (except for medical support)

while the evacuees are in transit since assets are already equipped to a certain extent.
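To make the equipment re-allocation logic concrete, the following minimal Python sketch tracks a per-community pool of non-consumable equipment; the item names and stock levels are hypothetical, not data from our study.

```python
# Illustrative sketch: equipment is a reusable pool per community, and a
# unit becomes available again once its assignee leaves the community.
class EquipmentPool:
    def __init__(self, stock):
        self.free = dict(stock)   # equipment type -> units available
        self.assigned = {}        # evacuee id -> equipment type held

    def assign(self, evacuee, item):
        """Fulfill a one-time equipment demand if stock remains."""
        if self.free.get(item, 0) > 0:
            self.free[item] -= 1
            self.assigned[evacuee] = item
            return True
        return False              # demand stays unmet for now

    def depart(self, evacuee):
        """Release the evacuee's equipment, e.g., a bed, for reassignment."""
        item = self.assigned.pop(evacuee, None)
        if item is not None:
            self.free[item] += 1

pool = EquipmentPool({"bed": 1})             # hypothetical single-bed stock
assert pool.assign("evacuee_1", "bed")
assert not pool.assign("evacuee_2", "bed")   # stock exhausted
pool.depart("evacuee_1")                     # evacuee flies out of the Arctic
assert pool.assign("evacuee_2", "bed")       # freed bed is reassigned
```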


5.3.1.3 Modeling the Impact of Deprivation on the Evacuees

The idea of deprivation-based penalty costs (Pérez-Rodríguez and Holguín-Veras, 2015) is

to capture the fact that the longer an evacuee goes without having their basic needs (e.g., food and

water) met, the more impactful it is on the evacuee. For instance, a six-hour lack of water does not have one-fourth of the impact on a human body of a 24-hour lack of water. Hence,

the deprivation cost is computed as an exponential-like function of the discrete deprivation time

(Holguín-Veras et al., 2013). Furthermore, note that it is not realistic to assume that met demands fully eliminate the deprivation cost, which would imply that all the impact of being without resources is alleviated.

In order to best illustrate the hysteretic behavior of the deprivation cost function,

Fig 5.2 depicts the costs as the time without a resource increases. Suppose an evacuee has not been

provided with water for eight time periods (from A to C). For simplicity, also assume that if the

demand is satisfied at time t corresponding period 8 (at point C), the deprivation time declines to

period 5 - in other words, to point B. Note that the curves D-E and C-B are identical. This implies

that though the demand is met, some amount of deprivation cost is still incurred due to the human

suffering as a result of high deprivation time.

Figure 5.2: Illustration of the deprivation cost function (adapted from Pérez-Rodríguez and Holguín-Veras (2015)); the curve plots deprivation cost against deprivation time (in periods), with points A through E marking the hysteresis path between s = 8 and s = 5 (a drop of 3 time periods).

Holguín-Veras et al. (2013) propose a continuous generic deprivation cost function, shown in Eq 5.1:

γ(δ_it) = e^{1.5031 + 0.1172 δ_it} − e^{1.5031}   (5.1)


where δ_it is the deprivation time at time t for node i, which represents an evacuee. We adapt this idea to account for both the length of resource deprivation (defined as sr) and equipment deprivation (defined as se), as well as a priority level adjustment (defined as p). We note that when an evacuee's resource demand is met, we may not decrease sr all the way to one, in order to capture the hysteretic behavior.

Eq 5.2 shows how to compute the deprivation time as a function of sr and se, and Eq 5.3 defines the adapted deprivation cost function.

δ_t = α sr + (1 − α) se,   0 ≤ α ≤ 1   (5.2)

κ(p, δ_t) = e^{1.5031 + 0.1172 p δ_t} − e^{1.5031}   (5.3)

where α is a constant in [0, 1], preferably set close to 1 to emphasize the importance of sr, since equipment deprivation is not nearly as impactful as resource deprivation. During so-called shoulder-season MREs, when a lack of access to heat and shelter can have detrimental health impacts (Mak et al., 2011), we can tune α appropriately.
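To make these cost computations concrete, the following sketch (Python) evaluates Eqs 5.1 through 5.3; the constants 1.5031 and 0.1172 come from Eq 5.1, while the default α = 0.9 is an illustrative choice rather than a value from this chapter:

```python
import math

E0 = math.exp(1.5031)  # baseline term shared by Eqs 5.1 and 5.3

def gamma(delta):
    """Eq 5.1: generic deprivation cost for deprivation time delta."""
    return math.exp(1.5031 + 0.1172 * delta) - E0

def deprivation_time(s_r, s_e, alpha=0.9):
    """Eq 5.2: weighted combination of resource (s_r) and equipment (s_e)
    deprivation periods; alpha close to 1 emphasizes resource deprivation."""
    return alpha * s_r + (1 - alpha) * s_e

def kappa(p, s_r, s_e, alpha=0.9):
    """Eq 5.3: deprivation cost adjusted by the priority level p."""
    delta = deprivation_time(s_r, s_e, alpha)
    return math.exp(1.5031 + 0.1172 * p * delta) - E0
```

Note that gamma(0) = 0 and that the cost at a deprivation time of 24 far exceeds four times the cost at 6, matching the non-linear impact discussed above.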

5.3.2 Objectives

There are many different criteria that may be used to evaluate the response. First, it is

necessary to examine the average evacuation time of the evacuees through the different ‘stages’ of the

response efforts (i.e., off the ship and then out of the Arctic). Second, it is necessary to understand

the impact of the response on the evacuees, which will be measured through the use of deprivation

costs. Third, it may be necessary to understand the variable costs incurred during the response.

We now discuss each of these in more detail.

The average evacuation time consists of the time evacuees leave the cruise ship and the time

evacuees arrive at Anchorage. Given the fact that we are evacuating a distressed cruise ship, we

will enforce a penalty cost (in terms of time) for evacuees left on the cruise ship at the end of the

response horizon. The evacuees that are left on the cruise ship or in the Arctic communities would

still be evacuated but outside of the ‘desired’ target time of our planning horizon. In addition, we

seek to move evacuees out of the Arctic communities and, therefore, we impose a similar penalty for

evacuees in an Arctic community at the end of the response horizon. In most situations, it is likely

that the evacuation time criteria will be quite important since it helps measure when people return

88

Page 100: Large-Scale Optimization Models with Applications in

to stable conditions.

During the evacuation, we aim to make sure that evacuees are properly taken care of, as best as possible. To this end, we examine the current status of evacuees in each time period. The current status of evacuees is modeled by a network called the "status network," consisting of nodes (p, sr, se), where p represents the priority level, sr the time without resources, and se the time without equipment. Each status is associated with a deprivation cost, where higher priority levels and deprivation times imply higher costs (using Eq. (5.3)).

Based on examining just these two criteria, we have a multi-criteria decision making problem

(MCDMP). We refer the reader to Triantaphyllou (2000) and Chankong and Haimes (2008) for more

details and further discussions on MCDMPs. In the MCDMP evacuation literature, work has used

the weighted sum method (Stepanov and Smith, 2009) and the ε-constraint method (Jenkins et al.,

2019). In preliminary modeling efforts, we considered the deprivation costs and evacuation times as separate objectives and explored the efficient frontier between them using a weighted objective. However, there were only two efficient solutions: (1) the one we present in this paper, which focuses on the evacuation objective and then does its best to support the evacuees during the response, and (2) one in which evacuees stay on the ship as long as possible to consume resources/equipment there, since it is well-stocked. Solution (1) found that there was enough time and available air cargo capacity to 'prep' the villages for the incoming evacuees in order to support their basic needs; solution (2) was not practical since there is a desire to move the evacuees off the distressed ship as quickly as possible, which was confirmed by our partners. In rare events,

the evacuation might not begin immediately upon rescue ships arriving at the incident location since

hasty evacuation might cause detrimental cascading events (e.g., during poor weather). In this case,

no evacuation decisions could be made until the poor weather lifted and our model would ‘start’

once these decisions begin.

In general, passenger vessel evacuation principles and operations are codified in interna-

tional agreements through the International Maritime Organization (IMO), the branch of the United

Nations that regulates global maritime shipping. The IMO Polar Code (International Maritime Organization, 2016), to which the U.S. is a signatory, defines the international regulations for maritime

operations in the Arctic. The IMO (United Nations, 2020), in its Safety of Life at Sea principle, states that human life takes precedence over all other considerations in an evacuation. In general, we have followed this code in our modeling process, although we should also discuss MRE costs.

We now discuss the operational/physical costs incurred during the transportation of evacuees. We note that many of these costs are paid pre-response (e.g., if the USCG responds, the personnel in the response are salaried and, as a second example, stockpiles of dedicated response resources are often maintained). In terms of variable costs, it is initially assumed that the responsible party (RP), likely the operator of the cruise ship, will assume the costs of search, rescue, recovery, and salvage operations. However, the U.S. Oil Pollution Act of 1990 requires that when the RP is not solvent, is unable to assume the costs, or cannot be located, the event is federalized and the response and rescue operations are funded federally, including through the Harbor Maintenance Trust Fund (US EPA, 2020).

In terms of examining costs, we focus on variable costs associated with the response. For example, if a plane carries relief commodities to a village and leaves the location without taking any evacuees, we consider such an operation a 'cost' since it has not moved evacuees out of the Arctic. On the other hand, if the plane leaves its location with evacuees on board, then incorporating operating costs into the objective would only change the trade-offs between costs and evacuation times if we chose to increase the evacuation time portion of the objective. We ran some experiments and discovered that when we restrict the number of air operations in which a plane goes into a village with resources and/or equipment and leaves without picking anyone up, the solutions obtained remain the same in every incident compared to the 'original' setting (see Section 5.6.2). This implies that the model produces the same objective with or without incorporating the operational costs of air flights. In other words, even if we associated each air operation with a cost, the model would produce the same or similar solutions in each incident as the ones we have obtained, unless we prioritized cost above evacuation. A similar observation would occur should we begin limiting the number of ships used in moving passengers from the cruise ship to the villages. Therefore, despite these costs being a potential criterion for evaluating the response, they do not need to be examined in more detail in our experiments.

5.3.3 Assumptions

We now discuss some of the underlying assumptions within our model. We examine a

deterministic planning environment which implies that the priority levels of evacuees as well as


the number of assets involved in the response are known in advance. We assume that there are

deployment times for the assets to model the fact that they may need to prepare to help with the

response. We assume partial allocation amongst evacuees within the same group (in a flow network)

and that equipment demand may be met in transit with the exception of medical equipment demand.

We also assume that there is no financial restriction on procurement and transportation of any

resource and equipment (e.g., see the discussion in Section 5.1). We also assume that there is a

location (for example, Anchorage, Alaska) that has enough resources to fully support evacuees once

they arrive there. This means the response for that evacuee is ‘complete.’ As for the cruise ship,

we assume that there is an adequate amount of resource and equipment stock for a certain amount

of time to take care of the evacuees’ needs on board. We lastly assume that we will not distribute

resources for consumption during travel (i.e., in transit).

Transportation and allocation decisions are performed at the end of each time period. Hence,

the evacuation event is initiated at t = 1. If the resource demand is not satisfied for an evacuee in

a time period, then sr will increase by 1 and may cause a ‘jump’ in priority level. This assumption

is considered realistic since resources to be dispatched (i.e., water and food ) have vital importance

in terms of the impact on a human. If se > 1 and equipment demand is not met, then se will

increase by one; however, equipment will not cause an increase in medical status. If se = 1, then

the evacuee has equipment and, therefore, we can view se = 1 as an absorbing state, i.e., the equipment demand will remain satisfied. If resource demand is satisfied, then sr will

decrease according to the flow arcs (Section 5.3.4) connecting (p,sr,se) nodes in the status network.

If equipment demand is satisfied, then se is set to 1.

There are five decisions that can be implemented for an evacuee in a time period, which determine their status in the next period: i) an evacuee may receive all the required resources and equipment; ii) an evacuee may receive neither resources nor have their equipment demand met; iii) an evacuee may receive only the required resources but not have their equipment demand met; iv) an evacuee may have their equipment demand met but not be provided with resources; or v) an evacuee may be transported to another location - either a community or Anchorage - via an asset, which implies that the equipment demand is satisfied. We create five sets of flow arcs to utilize in the balance constraints (Section 5.4.2) in order to model the impact of these situations on the evacuees.

In the next section, the flow arcs designed to model the status of evacuees are introduced. We

further explicitly discuss how priority levels might change after each decision.


5.3.4 Flow Arcs

We design five different sets of flow arcs to understand how the five possible decisions impacting an evacuee (based on resource and equipment allocation decisions) will affect their status. Recall that each (p, sr, se) is represented as a node in the status network. An arc is present between node (p, sr, se) and node (p′, sr′, se′) if the decision represented by the corresponding arc set causes a status change from (p, sr, se) to (p′, sr′, se′); we use (p′, sr′, se′) to denote the evacuee's updated status. The first four arc sets apply when an evacuee is in a location. The sets of flow arcs are:

i. The Resource Satisfied Set (ERSS): When an evacuee receives only the required resources, the ERSS is utilized to decide the evacuee's (p′, sr′, se′) status in the following time period. We always have p′ = p (since the priority level cannot increase due to unmet equipment demand) and sr′ < sr. If se = 1, then se′ = 1; otherwise, se′ = se + 1.

ii. The Equipment Satisfied Set (EESS): The EESS is utilized when equipment is the only commodity allocated to an evacuee. In this case, we have se′ = 1 and sr′ = sr + 1. The priority level, however, may increase by 1, i.e., p ≤ p′ ≤ p + 1, where p′ = p + 1 if sr was the last time period before a priority level jump. There is an exception, though: if p is the transition priority level (meaning that a person needs medical support but has yet to be assigned to a medical shelter), then satisfying the equipment demand also causes the priority level to jump to a different priority level.

iii. Both Resource and Equipment Satisfied Set (EBSS): Both resource and equipment demands are met. In this case, we have p′ = p, sr′ ≤ sr (equality may hold if sr = 1), and se′ = 1.

iv. Both Resource and Equipment Non-Satisfied Set (EBNS): The worst-case scenario is not being able to allocate any resources or equipment to an evacuee. The EBNS is utilized to determine the (p′, sr′, se′) status of an evacuee if no resources and no equipment are provided to them. We have sr′ = sr + 1, se′ = se + 1, and p ≤ p′ ≤ p + 1, where p′ = p + 1 if sr was the threshold for the priority level jump.

v. Travel Set (ETS): It is unlikely that evacuees would receive any resources while traveling on rescue ships. Therefore, when an evacuee is being transported with an asset, the corresponding sr increases based on the travel time. If an evacuee is transported from location i to location j via asset a with travel time τija ∈ Z+, then the evacuee's resource period becomes sr + τija once the evacuee reaches location j. Note that se keeps increasing only for those in either the highest priority level or the transition priority level, since no medical service can be provided on an asset. Lastly, the priority level might go up if sr reaches the same bound set in the EESS.
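As a sanity check on these transition rules, the sketch below (Python) encodes one simplified reading of the five arc sets. The jump threshold, the handling of hysteresis (here sr simply resets to 1 when resources are satisfied), the absorbing-state treatment of se = 1, and the in-transit handling of se are our own simplifying assumptions rather than the exact arc definitions used in the model:

```python
def successor(p, s_r, s_e, decision, jump_at, p_max, travel=1):
    """Return the next status node (p', s_r', s_e') for one evacuee.

    decision is one of: 'resource_only' (ERSS), 'equipment_only' (EESS),
    'both' (EBSS), 'none' (EBNS), 'travel' (ETS). `jump_at` is the
    resource-deprivation period that triggers a priority jump.
    Hysteresis is omitted: satisfied resource demand resets s_r to 1.
    """
    jump = lambda q: min(q + 1, p_max)  # priority jump, capped at p_max
    if decision == "resource_only":     # ERSS: p fixed, s_r resets, s_e grows
        return (p, 1, 1 if s_e == 1 else s_e + 1)
    if decision == "equipment_only":    # EESS: s_e resets, s_r grows
        return (jump(p) if s_r == jump_at else p, s_r + 1, 1)
    if decision == "both":              # EBSS: everything satisfied
        return (p, 1, 1)
    if decision == "none":              # EBNS: both deprivations grow
        return (jump(p) if s_r == jump_at else p,
                s_r + 1, 1 if s_e == 1 else s_e + 1)
    if decision == "travel":            # ETS: s_r grows with travel time
        crossed = s_r <= jump_at < s_r + travel   # threshold crossed en route
        return (jump(p) if crossed else p, s_r + travel, s_e)
    raise ValueError(decision)
```

For the single-resource example of Section 5.3.4.1 (jump at sr = 3, two priority levels), an unserved evacuee at (p = 1, sr = 3) moves to p = 2, sr = 4, and an evacuee transported for one period from (p = 1, sr = 3) likewise arrives at p = 2, sr = 4.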

5.3.4.1 Illustrative Example

We provide an illustrative example to elaborate on how the allocation decisions are conducted, and we start our discussion with the assumptions made specifically for this example. We only consider allocating one resource for the sake of simplicity. Thus, only resource allocation decisions are taken into consideration: (i) an evacuee receives the required resource or (ii) an evacuee does not receive the required resource.

Note that all the decisions (i.e., transportation and allocation) are made at the end of each

time period. We are currently in time period 4 of the evacuation and seven evacuees have reached

communities. There are two priority levels and three communities. We focus on a single type of

resource (e.g., food) so we are only keeping track of sr and travel between the communities requires

one time period.

Figure 5.3: Evacuees in community 1 at time 4 (status grid over sr = 1, ..., 5 and p = 1, 2)

Figure 5.4: Evacuees in community 2 at time 4 (status grid over sr = 1, ..., 5 and p = 1, 2)

Fig 5.3 and Fig 5.4 illustrate the status of the seven evacuees across the two communities in which they are located. In Community 1, we have one evacuee with (p = 1, sr = 2), three evacuees

with (p = 1, sr = 3), and one with (p = 2, sr = 4). In Community 2, we have two evacuees with

(p = 2, sr = 3). In Community 1, there is enough food to satisfy three evacuees’ demands. There is

not enough food in Community 2 to satisfy the demands of the evacuees.

In order to understand the (p,sr) status of the evacuees in the next time period, t = 5,

resource allocation decisions are made and evacuees move along arcs represented in Fig 5.5 and Fig

5.6. Note that the ‘priority jump’ occurs when sr = 3.

The following allocation decisions were made in time period t = 4 resulting in the movements

pictured in Fig 5.7 and Fig 5.8:


Figure 5.5: Movements when resource demand is not satisfied (status grid over sr = 1, ..., 5 and p = 1, 2)

Figure 5.6: Movements when resource demand is satisfied (status grid over sr = 1, ..., 5 and p = 1, 2)

Figure 5.7: Evacuees in community 1 at time 5 (status grid over sr = 1, ..., 5 and p = 1, 2)

Figure 5.8: Evacuees in community 3 at time 5 (status grid over sr = 1, ..., 5 and p = 1, 2)

• In Community 1, food is allocated to the evacuee at node (p = 2, sr = 4) due to their higher

priority level and this person will transition to (p = 2, sr = 2) in the next time period. The

two other units of food are distributed to two evacuees in (p = 1, sr = 3), which prevents them from jumping to the next priority level (see Fig 5.7).

• The third evacuee in Community 1 with (p = 1, sr = 3) is transported to Community 3 by a plane. When this evacuee arrives in Community 3, their status will be (p = 2, sr = 4) since they reached the jump (see Fig 5.8). The evacuee in Community 1 with (p = 1, sr = 2) will not receive food and will therefore transition to (p = 1, sr = 3) in the next time period.

• The two evacuees in Community 2 depart the community toward Community 1. They do not receive food and, therefore, will arrive in Community 1 in the next time period at (p = 2, sr = 4). The logic behind such a transportation decision could be that 'grouping' evacuees makes it easier (and quicker) to move the group to Anchorage in the coming time periods, thus making better use of plane capacities. For example, we may then choose to transport all 6 evacuees in Community 1 to Anchorage in time period 5 (thus arriving in time period 6), whereas moving the 2 evacuees directly to Anchorage would result in the other 4 (currently in Community 1) not arriving in Anchorage until time period 7, since the plane must travel from Community 2 to Anchorage, then to Community 1, and then back to Anchorage. The average evacuation time with the 'grouping' would be 6, while without grouping it would be 6.33.

The summary of all the decisions conducted between time 4 and 5, along with the consequences of these decisions, is provided in Table 5.2. Note that a transportation decision also implies a non-satisfied resource demand.

Table 5.2: Decisions conducted at the end of time 4 and their consequences

                      Evacuee 1  Evacuee 2  Evacuee 3  Evacuee 4    Evacuee 5  Evacuee 6    Evacuee 7
Beginning of Time 4
  Location            Comm. 1    Comm. 1    Comm. 1    Comm. 1      Comm. 1    Comm. 2      Comm. 2
  Priority            1          1          1          1            2          2            2
  Period              2          3          3          3            4          3            3
End of Time 4
  Decision            Non-Sat.   Satisfied  Satisfied  Transported  Satisfied  Transported  Transported
Beginning of Time 5
  Location            Comm. 1    Comm. 1    Comm. 1    Comm. 3      Comm. 1    Comm. 1      Comm. 1
  Priority            1          1          1          2            2          2            2
  Period              3          1          1          4            2          4            4

The importance of modeling these allocation decisions, rather than allowing a greedy allocation of resources, is that it helps decrease the impact of the event on the evacuees. For example, we examined a test scenario in which 75 evacuees in the lowest priority level are in a village with 144 available units of a relief resource over a horizon of six periods. If a greedy allocation is used, i.e., allocating this resource whenever there is a demand for it, then we obtain a deprivation cost of 438.87 and 69 evacuees jump to the next priority level. On the other hand, when using our optimization model to allocate resources, we observe that not a single evacuee jumps to the next priority level and the total deprivation cost turns out to be nearly half of the greedy one (i.e., 283.89). The use of the model allows us to allocate relief resources efficiently and identify the bottlenecks in the logistical decisions.
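The effect can be reproduced in miniature on the Community 1 snapshot from the illustrative example (three units of food, priority jump at sr = 3). The sketch below (Python) compares the next-period deprivation cost under a naive lowest-deprivation-first allocation versus the priority-aware allocation chosen above; the one-period lookahead and the reset of sr to 1 upon satisfaction are simplifications for illustration only:

```python
import math

def kappa(p, s_r):
    """Eq 5.3 with a single resource (alpha = 1, so delta = s_r)."""
    return math.exp(1.5031 + 0.1172 * p * s_r) - math.exp(1.5031)

def next_cost(statuses, served, jump_at=3, p_max=2):
    """Total deprivation cost one period ahead, given which evacuees
    (by index) receive the single resource this period."""
    total = 0.0
    for idx, (p, s_r) in enumerate(statuses):
        if idx in served:
            p2, s2 = p, 1                         # satisfied: s_r resets
        else:
            p2 = min(p + 1, p_max) if s_r == jump_at else p
            s2 = s_r + 1                          # unsatisfied: s_r grows
        total += kappa(p2, s2)
    return total

# Community 1 at time 4: statuses (p, s_r); 3 units of food available
statuses = [(1, 2), (1, 3), (1, 3), (1, 3), (2, 4)]
naive = next_cost(statuses, served={0, 1, 2})     # lowest s_r first
aware = next_cost(statuses, served={1, 2, 4})     # highest need first
```

The priority-aware allocation yields a strictly lower one-period cost, mirroring in miniature the greedy-versus-model gap reported above.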

5.4 An Optimization Model for Arctic MREs

We present the optimization model for Arctic MREs in this section. Our model and analysis

assumes that there is a centralized decision-maker (or, equivalently, full coordination and awareness

by all involved agencies). This is reasonable as we are using it to assess capability gaps and under-

stand where vulnerabilities exist in potential response efforts. This further means that we do not

need to specifically consider the areas of responsibility for an individual organization.

It is our goal to capture all features of the problem to truly identify 'gaps' in response capabilities. In our study, the majority of the parameters presented in Table 5.5 (e.g., airport capacities, hosting capacities) can be gathered from existing data sources. In this regard, we provide a wide range of what-if analyses (see Section 5.6.3) to understand key factors surrounding policies within Arctic Alaska. The deprivation cost function and its parameters are hard to estimate, but Holguín-Veras et al. (2013) discuss that simply including this type of cost function is often sufficient for modeling purposes (as opposed to capturing its exact parameters).


The definitions of sets, variables, and parameters are shown in Table 5.3, Table 5.4, and Table 5.5, respectively. Note that we use C. Ship, Anc., P. Shelter, and Med. as abbreviations for the cruise ship, Anchorage, portable shelter, and medical support, respectively.

Table 5.3: Set definitions

Set     Definition
A       Transportation assets (fixed-wing aircraft, large and small ships)
Aa      Planes (fixed-wing aircraft)
CR      Consumable resources (water, food)
RE      Reusable equipment (portable shelters, sleeping bags, medical support)
T       Time periods
Sr      Periods representing the amount of time passed without access to resources
Se      Periods representing the amount of time passed without access to equipment
C       Locations (the cruise ship, communities, and Anchorage)
V       Communities
P       Priority levels
V       Set of nodes of the status network, where each node is represented by a priority level p ∈ P and periods sr ∈ Sr, se ∈ Se
ERSS    Set of arcs showing the transitions between each pair of nodes u = (p_i, sr_j, se_k) and v = (p_l, sr_m, se_n), where u, v ∈ V, for satisfied resource and non-satisfied equipment demands
EESS    Set of arcs showing the transitions between each pair of nodes u = (p_i, sr_j, se_k) and v = (p_l, sr_m, se_n), where u, v ∈ V, for non-satisfied resource and satisfied equipment demands
EBSS    Set of arcs showing the transitions between each pair of nodes u = (p_i, sr_j, se_k) and v = (p_l, sr_m, se_n), where u, v ∈ V, for satisfied resource and equipment demands
EBNS    Set of arcs showing the transitions between each pair of nodes u = (p_i, sr_j, se_k) and v = (p_l, sr_m, se_n), where u, v ∈ V, for non-satisfied resource and equipment demands
ETSτ    Set of arcs showing the transitions between each pair of nodes u = (p_i, sr_j, se_k) and v = (p_l, sr_m, se_n), where u, v ∈ V, in τ time periods of transit
ARSN(p, sr, se)   {(p′, sr′, se′) | ((p′, sr′, se′), (p, sr, se)) ∈ ERSS}
AESN(p, sr, se)   {(p′, sr′, se′) | ((p′, sr′, se′), (p, sr, se)) ∈ EESS}
ABSN(p, sr, se)   {(p′, sr′, se′) | ((p′, sr′, se′), (p, sr, se)) ∈ EBSS}
ABNN(p, sr, se)   {(p′, sr′, se′) | ((p′, sr′, se′), (p, sr, se)) ∈ EBNS}
ATNτ(p, sr, se)   {(p′, sr′, se′) | ((p′, sr′, se′), (p, sr, se)) ∈ ETSτ}, where d((p′, sr′, se′), (p, sr, se)) = τ states that there exist paths of length τ from (p′, sr′, se′) to (p, sr, se)

5.4.1 Objective function

The objective function of our mass rescue operation model is:


Table 5.4: Variable definitions

Variable          Definition
I_{r,i,t}         The amount of resource r ∈ CR in location i ∈ C at time t ∈ T
B_{e,i,t}         The amount of equipment e ∈ RE in location i ∈ C at time t ∈ T
g_{r,i,j,a,t}     The amount of resource r ∈ CR sent from location i ∈ C to location j ∈ C via asset a ∈ A at time t ∈ T
h_{e,i,j,a,t}     The amount of equipment e ∈ RE sent from location i ∈ C to location j ∈ C via asset a ∈ A at time t ∈ T
f_{p,sr,se,i,j,a,t}  The number of people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se sent from location i ∈ C to location j ∈ C via asset a ∈ A at time t ∈ T
X_{a,i,t}         Whether asset a ∈ A is in location i ∈ C at time t ∈ T (X_{a,i,t} = 1 if a is in i; 0 otherwise)
Z_{a,i,t}         Whether asset a ∈ A stays in location i ∈ C at time t ∈ T (Z_{a,i,t} = 1 if a stays in i; 0 otherwise)
Y_{a,i,j,t}       Whether asset a ∈ A leaves location i ∈ C at time t ∈ T heading toward location j ∈ C (Y_{a,i,j,t} = 1 if a departs; 0 otherwise)
D_{e,i,p,sr,se,t} The amount of equipment e ∈ RE in location i ∈ C used for people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se at time t ∈ T
K_{r,i,p,sr,se,t} The amount of resource r ∈ CR in location i ∈ C used for people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se at time t ∈ T
Q_{p,sr,se,i,t}   The number of people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se who require resources and equipment in location i ∈ C at time t ∈ T
BS_{p,sr,se,i,t}  The number of people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se whose resource and equipment demands are met in location i ∈ C at time t ∈ T
BN_{p,sr,se,i,t}  The number of people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se whose resource and equipment demands are not met in location i ∈ C at time t ∈ T
ES_{p,sr,se,i,t}  The number of people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se whose resource demand is not met while equipment demand is met in location i ∈ C at time t ∈ T
RS_{p,sr,se,i,t}  The number of people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se whose resource demand is met while equipment demand is not met in location i ∈ C at time t ∈ T

Table 5.5: Parameter definitions

Parameter        Definition
α_{r,p}          The amount of resource r ∈ CR required to satisfy an evacuee's demand in priority p ∈ P
ζ_{e,p}          The amount of equipment e ∈ RE required to satisfy an evacuee's demand in priority p ∈ P
ρ_{r,i}          The amount of resource r ∈ CR positioned in location i ∈ C at time t = 1
ξ_{e,i}          The amount of equipment e ∈ RE positioned in location i ∈ C at time t = 1
ν_{p,sr,se,i}    The number of evacuees in priority p ∈ P with periods sr ∈ Sr, se ∈ Se in location i ∈ C at time t = 1
μ_a              The maximum cargo capacity of asset a ∈ A
Ψ_a              The maximum passenger capacity of asset a ∈ A
π_{a,i}          Whether location i ∈ C is the closest location to asset a ∈ A at time t = 1
Ω_a              The travel time of asset a ∈ A to its closest location
θ_{a,i}          Whether asset a ∈ A can land at location i ∈ C (θ_{a,i} = 1 if a can land; 0 otherwise)
τ_{a,i,j}        The travel time of asset a ∈ A from location i ∈ C to location j ∈ C
ω_r              The weight of one unit of resource r ∈ CR
ε_e              The weight of one unit of equipment e ∈ RE
φ_i              The ground capacity of location i ∈ C
κ_{p,sr,se}      The deprivation cost for a person in priority p ∈ P with periods sr ∈ Sr, se ∈ Se
Γ_{p,sr,se,i,j,l}  The cumulative in-transit deprivation cost for an evacuee in priority p ∈ P with periods sr ∈ Sr, se ∈ Se traveling from location i ∈ C to location j ∈ C, which takes l time units
ϑ_i              The available capacity of location i ∈ C to host evacuees
γ_i              The available capacity of public spaces in location i ∈ C
p′_max           The transition priority level, in which an evacuee needs medical support but has yet to be assigned to a medical facility
t_lim            The earliest time period when an evacuee can be in the transition priority
p_max            The highest priority level

97

Page 109: Large-Scale Optimization Models with Applications in

minimize:

  ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} ∑_{i∈C\{"Anc."}} ∑_{t∈T} κ_{p,sr,se} Q_{p,sr,se,i,t}
+ ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} ∑_{i∈C} ∑_{j∈C} ∑_{a∈A} ∑_{t∈T} Γ_{p,sr,se,i,j,τ_{a,i,j}} f_{p,sr,se,i,j,a,t}
+ ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} ∑_{i∈V} 2|T| Q_{p,sr,se,i,|T|}
+ ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} 3|T| Q_{p,sr,se,"C. Ship",|T|}
+ ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} ∑_{i∈C} ∑_{a∈A} ∑_{t∈T} (t + τ_{a,i,"Anc."}) f_{p,sr,se,i,"Anc.",a,t}
+ ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} ∑_{j∈C} ∑_{a∈A} ∑_{t∈T} t f_{p,sr,se,"C. Ship",j,a,t}   (5.1)

The objective function has six components. The first two components examine the total

deprivation costs associated with evacuees in each location excluding Anchorage and in transit,

respectively. The following two components help to drive evacuees, if possible, to Anchorage and off

of the cruise ship, respectively, by incurring penalties for those that remain in the communities or

on the cruise ship at the end of the planning horizon. The fifth component is focused on the total

evacuation time of evacuees arriving in Anchorage while the sixth component is focused on the total

evacuation time of moving people off of the cruise ship.
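To fix ideas, the following sketch (Python, plain dictionaries rather than a MIP solver; the index layout and the toy data are ours) evaluates the six-term objective for a candidate solution, with Q mapping (status, location, t) to head-counts of evacuees with unmet demand and F mapping (status, i, j, asset, t) to transported head-counts:

```python
def objective_value(Q, F, kappa, Gamma, tau, horizon, communities,
                    anc="Anc.", ship="C. Ship"):
    """Evaluate the six components of the objective for a candidate solution."""
    # (1) deprivation cost at every location except Anchorage
    c1 = sum(kappa[s] * q for (s, i, t), q in Q.items() if i != anc)
    # (2) cumulative in-transit deprivation cost
    c2 = sum(Gamma[s, i, j, tau[i, j, a]] * f
             for (s, i, j, a, t), f in F.items())
    # (3)-(4) end-of-horizon penalties for the communities and the ship
    c3 = sum(2 * horizon * q for (s, i, t), q in Q.items()
             if i in communities and t == horizon)
    c4 = sum(3 * horizon * q for (s, i, t), q in Q.items()
             if i == ship and t == horizon)
    # (5) arrival times into Anchorage, (6) departure times off the ship
    c5 = sum((t + tau[i, anc, a]) * f
             for (s, i, j, a, t), f in F.items() if j == anc)
    c6 = sum(t * f for (s, i, j, a, t), f in F.items() if i == ship)
    return c1 + c2 + c3 + c4 + c5 + c6
```

For example, a single status with κ = 1 and three unserved evacuees in a community at the final period of a two-period horizon contributes 3 from component (1) plus 12 from the community penalty (3).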

5.4.2 Constraints

We present the constraints in two categories: those governing how we use the assets to move evacuees and resources, and those modeling the allocation decisions and their impact on the status of the evacuees.

5.4.2.1 Asset Constraints

The asset constraints presented in this section are grouped into two categories: the first comprises capacity-based constraints, and the second focuses on initial assignments and routing of the assets.

Capacity Constraints

∑_{e∈RE\{"Med."}} ε_e h_{e,i,j,a,t} + ∑_{r∈CR} ω_r g_{r,i,j,a,t} ≤ μ_a Y_{a,i,j,t}   ∀a ∈ Aa, ∀i ∈ C, ∀j ∈ C, ∀t ∈ T   (5.2)

Constraint (5.2) ensures that the total weight of resources and equipment carried by a plane does


not exceed its capacity. Medical support is not considered because we assume it cannot be transported.

∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} f_{p,sr,se,i,j,a,t} ≤ Ψ_a Y_{a,i,j,t}   ∀a ∈ A, ∀i ∈ C, ∀j ∈ C, ∀t ∈ T   (5.3)

fpsrseijat ≤ ΨaYaijt ∀a ∈ A,∀i ∈ C, ∀j ∈ C,∀t ∈ T (5.3)

Constraint (5.3) ensures that an asset cannot carry more evacuees than its passenger capacity.

∑_{a∈Aa} X_{a,i,t} ≤ φ_i   ∀i ∈ C, ∀t ∈ T   (5.4)

Constraint (5.4) guarantees that the number of planes landing at an airport does not violate the

airport capacity in a location (i.e., the communities and Anchorage) during a time period.

Positioning and Travel Constraints

∑_{t∈T: t<Ω_a} X_{a,i,t} = 0   ∀a ∈ A, ∀i ∈ C   (5.5)

X_{a,i,t=Ω_a} = π_{a,i}   ∀a ∈ A, ∀i ∈ C   (5.6)

Constraints (5.5) and (5.6) make the initial assignment of each asset by taking the deployment times into consideration. This ensures that the asset goes to the closest (acceptable) community to prepare for deployment.

∑_{i∈C} X_{a,i,t} ≤ 1   ∀a ∈ A, ∀t ∈ T   (5.7)

Constraint (5.7) implies that an asset can be located in at most one location during a time period.

Y_{a,i,j,t} ≤ θ_{a,j}   ∀a ∈ A, ∀i ∈ C, ∀j ∈ C, ∀t ∈ T   (5.8)

Constraint (5.8) prevents an asset from landing at locations not meeting its required specifications.

X_{a,i,t} = Z_{a,i,t} + ∑_{j∈C} Y_{a,i,j,t}   ∀a ∈ A, ∀i ∈ C, ∀t ∈ T   (5.9)

Constraint (5.9) ensures that in each time period, an asset either stays in its location or travels to another one.

X_{a,i,t} = Z_{a,i,t−1} + ∑_{j∈C} Y_{a,j,i,t−τ_{a,j,i}}   ∀a ∈ A, ∀i ∈ C, ∀t ∈ T \ {1}   (5.10)

Constraint (5.10) ensures that if asset a is at location i at time t, then either it stayed at location i at time t − 1 or it left some location j at time t − τ_{a,j,i} to arrive at location i.
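The routing logic of constraints (5.7), (5.9), and (5.10) can be unit-tested on a toy schedule. The sketch below (Python, hand-built indicator dictionaries for a single asset; the locations, times, and travel times are illustrative, not data from the study) raises an assertion if a schedule is inconsistent:

```python
def check_routing(X, Z, Y, tau, locations, times):
    """Verify constraints (5.7), (5.9), and (5.10) for one asset.

    X[i, t] / Z[i, t] are 0/1 presence and stay indicators, Y[i, j, t]
    departure indicators, and tau[i, j] integer travel times."""
    for t in times:
        # (5.7): in at most one location per period
        assert sum(X.get((i, t), 0) for i in locations) <= 1
        for i in locations:
            # (5.9): presence splits into staying or departing
            departs = sum(Y.get((i, j, t), 0) for j in locations)
            assert X.get((i, t), 0) == Z.get((i, t), 0) + departs
            # (5.10): presence explained by staying or arriving
            if t > times[0]:
                arrives = sum(Y.get((j, i, t - tau[j, i]), 0)
                              for j in locations if (j, i) in tau)
                assert X.get((i, t), 0) == Z.get((i, t - 1), 0) + arrives
    return True

# toy schedule: depart V1 at t = 1, arrive V2 at t = 2, stay through t = 3
X = {("V1", 1): 1, ("V2", 2): 1, ("V2", 3): 1}
Z = {("V2", 2): 1, ("V2", 3): 1}
Y = {("V1", "V2", 1): 1}
tau = {("V1", "V2"): 1, ("V2", "V1"): 1}
ok = check_routing(X, Z, Y, tau, ["V1", "V2"], [1, 2, 3])
```

Dropping the stay indicator Z[("V2", 3)] from this schedule would violate (5.9) at t = 3, which is exactly the kind of inconsistency the check catches.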

5.4.2.2 Resource and equipment allocation and its impact on the status of the evacuees

The constraints introduced in this section capture the allocation of resources and equipment to the evacuees and the influence of these decisions on their status.

Resource and Equipment Balance Constraints

I_{r,i,t=1} = ρ_{r,i} − ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} K_{r,i,p,sr,se,t=1}   ∀r ∈ CR, ∀i ∈ C   (5.11)

Constraint (5.11) initiates the resource inventories in each location at t = 1. Note that no transportation decision is made during the first time period because each asset takes at least one time unit to be assigned to its initial location (i.e., deployment time).

B_{ei(t=1)} = ξ_{ei} − ∑_{p ∈ P} ∑_{s_r ∈ S_r} ∑_{s_e ∈ S_e} D_{eips_rs_e(t=1)}   ∀e ∈ RE \ {“P. Shelter”}, ∀i ∈ C   (5.12)

B_{“P. Shelter”,i,(t=1)} = γ_i + ξ_{“P. Shelter”,i} − ∑_{p ∈ P} ∑_{s_r ∈ S_r} ∑_{s_e ∈ S_e} D_{“P. Shelter”,i,p,s_r,s_e,(t=1)}   ∀i ∈ C   (5.13)

Constraints (5.12) and (5.13) position equipment in each location at t = 1, incorporating the public spaces located in each community into that community's ‘shelter’ inventory level (Constraint (5.13)).

I_{rit} + ∑_{p ∈ P} ∑_{s_r ∈ S_r} ∑_{s_e ∈ S_e} K_{rips_rs_et} + ∑_{j ∈ C} ∑_{a ∈ A} g_{rijat} = I_{ri(t−1)} + ∑_{j ∈ C} ∑_{a ∈ A} g_{rjia(t−τ_{aji})}   ∀r ∈ CR, ∀i ∈ C, ∀t ∈ T \ {1}   (5.14)

Constraint (5.14) is the resource inventory balance equation. At time t, the inventory level in each

location is equal to the amount of the resources remaining from the previous time period and the

resources transported from other locations. Furthermore, resources in the current location can be


carried to the other locations at time t and can be distributed to the evacuees.

B_{eit} + ∑_{p ∈ P} ∑_{s_r ∈ S_r} ∑_{s_e ∈ S_e} D_{eips_rs_et} + ∑_{j ∈ C} ∑_{a ∈ A} h_{eijat} = B_{ei(t−1)} + ∑_{j ∈ C} ∑_{a ∈ A} h_{ejia(t−τ_{aji})}
+ ∑_{p ∈ P \ {p_max, p_max′}} ∑_{s_r ∈ S_r} ∑_{j ∈ C} ∑_{a ∈ A} f_{ps_r(s_e=1)ija(t−1)} + ∑_{s_r ∈ S_r} ∑_{s_e ∈ S_e} ∑_{j ∈ C} ∑_{a ∈ A} f_{p_max′s_rs_eija(t−1)}
+ ∑_{s_r ∈ S_r} ∑_{s_e ∈ S_e} BS_{p_max′s_rs_ei(t−1)} + ∑_{s_r ∈ S_r} ∑_{s_e ∈ S_e} ES_{p_max′s_rs_ei(t−1)}   ∀e ∈ RE \ {“Med.”}, ∀i ∈ C, ∀t ∈ T \ {1}   (5.15)

B_{“Med.”,i,t} + ∑_{p ∈ P} ∑_{s_r ∈ S_r} ∑_{s_e ∈ S_e} D_{“Med.”,i,p,s_r,s_e,t} = B_{“Med.”,i,(t−1)} + ∑_{s_r ∈ S_r} ∑_{j ∈ C} ∑_{a ∈ A} f_{p_max s_r (s_e=1) ija(t−1)}   ∀i ∈ C, ∀t ∈ T \ {1}   (5.16)

Constraints (5.15) and (5.16) are equipment inventory balance equations similar to Constraint (5.14). However, since equipment is considered non-consumable, the equipment of those who depart the location during the previous time period becomes available at time t. Further, recall that those in the transition priority (i.e., p_max′) will release their ‘normal’ equipment once they are assigned the medical support necessary for their priority level (e.g., they will move from a bed to a bed in the medical center). Further, as mentioned previously, medical support is only provided in medical centers, which are non-transportable. Hence, an individual equipment balance constraint (see Constraint (5.16)) is generated for medical support.

B_{“P. Shelter”,i,t} + ∑_{p ∈ P} ∑_{s_r ∈ S_r} ∑_{s_e ∈ S_e} D_{“P. Shelter”,i,p,s_r,s_e,t} + ∑_{p ∈ P \ {p_max, p_max′}} ∑_{s_r ∈ S_r} BS_{ps_r(s_e=1)it}
+ ∑_{p ∈ P \ {p_max, p_max′}} ∑_{s_r ∈ S_r} ES_{ps_r(s_e=1)it} + ∑_{s_r ∈ S_r} ∑_{s_e ∈ S_e} BN_{p_max′s_rs_eit} + ∑_{s_r ∈ S_r} ∑_{s_e ∈ S_e} RS_{p_max′s_rs_eit} ≥ γ_i   ∀i ∈ C, ∀t ∈ T   (5.17)

Constraint (5.17) ensures that public spaces are not transported to other communities by ensuring that the total shelter capacity of a location never falls below its true capacity. In particular, the left hand side of the constraint sums the inventory of shelter at location i carrying over into the next period, the amount of shelter assigned in t, the number of people in normal priority levels (not p_max or p_max′) that currently have shelter (s_e = 1), and the number of people at p_max′ that are currently using normal shelter. The constraint ensures this summation is greater than or equal to the capacity of the public space.

Evacuees Balance Constraints

Q_{ps_rs_ei(t=1)} = ν_{ps_rs_ei}   ∀p ∈ P, ∀s_r ∈ S_r, ∀s_e ∈ S_e, ∀i ∈ C   (5.18)

Constraint (5.18) assigns the initial populations in each location. Clearly, the ship is the only

location where evacuees are located at t = 1. We further constrain the number of evacuees that can

be in an Arctic community at a particular time:

∑_{p ∈ P} ∑_{s_r ∈ S_r} ∑_{s_e ∈ S_e} Q_{ps_rs_eit} ≤ ϑ_i   ∀i ∈ C, ∀t ∈ T   (5.19)

We now describe the constraints governing the transitions of the evacuees into different statuses, both out of one time period and into the next. We first present the constraints where s_e = 1, i.e., the equipment demand has already been met, for all priorities besides the transition priority (p_max′).

Q_{ps_r(s_e=1)it} = BS_{ps_r(s_e=1)it} + ES_{ps_r(s_e=1)it} + ∑_{j ∈ C} ∑_{a ∈ A} f_{ps_r(s_e=1)ijat}   ∀p ∈ P \ {p_max′}, ∀s_r ∈ S_r, ∀i ∈ C, ∀t ∈ T   (5.20)

Q_{ps_r(s_e=1)it} = ∑_{(p′,s_r′,s_e′) ∈ A^{BS}_N(p,s_r,s_e=1)} BS_{p′s_r′s_e′i(t−1)} + ∑_{(p′,s_r′,s_e′) ∈ A^{ES}_N(p,s_r,s_e=1)} ES_{p′s_r′s_e′i(t−1)}   ∀p ∈ P \ {p_max′}, ∀s_r ∈ S_r, ∀i ∈ C, ∀t ∈ T \ {1}   (5.21)

Constraint (5.20) implies that after receiving the required equipment, evacuees either stay in se = 1

by continuing to “receive equipment” (i.e., once equipment demand is met, no extra allocation

decision is done after the first assignment) or they move to another location. Constraint (5.21)

states that evacuees can be in the absorbing equipment state if and only if they receive the required

equipment (i.e., BSpsrseit and ESpsrseit) during the previous time period. As mentioned before,

evacuees cannot arrive at a location with s_e = 1 and, therefore, transportation decisions are not included in Constraints (5.20)–(5.21). We now turn our attention to the constraints for s_e ≠ 1 and


all priorities besides the transition priority (p′max).

Q_{ps_rs_eit} = BS_{ps_rs_eit} + BN_{ps_rs_eit} + ES_{ps_rs_eit} + RS_{ps_rs_eit} + ∑_{j ∈ C} ∑_{a ∈ A} f_{ps_rs_eijat}   ∀p ∈ P \ {p_max′}, ∀s_r ∈ S_r, ∀s_e ∈ S_e \ {1}, ∀i ∈ C, ∀t ∈ T   (5.22)

Q_{ps_rs_eit} = ∑_{(p′,s_r′,s_e′) ∈ A^{BN}_N(p,s_r,s_e)} BN_{p′s_r′s_e′i(t−1)} + ∑_{(p′,s_r′,s_e′) ∈ A^{RS}_N(p,s_r,s_e)} RS_{p′s_r′s_e′i(t−1)}
+ ∑_{j ∈ C} ∑_{a ∈ A} ∑_{(p′,s_r′,s_e′) ∈ A^{T}_{N,τ_{aji}}(p,s_r,s_e)} f_{p′s_r′s_e′jia(t−τ_{aji})}   ∀p ∈ P \ {p_max′}, ∀s_r ∈ S_r, ∀s_e ∈ S_e \ {1}, ∀i ∈ C, ∀t ∈ T \ {1}   (5.23)

Constraint (5.22) indicates that any of the five allocation and/or transportation decisions can be made for evacuees with p ≠ p_max′ and s_e ≠ 1: they can have both their demands satisfied, BS_{ps_rs_eit}; they can have neither demand satisfied, BN_{ps_rs_eit}; they can have just their equipment demand satisfied, ES_{ps_rs_eit}; they can have just their resource demand satisfied, RS_{ps_rs_eit}; or they can be transported out of i, f_{ps_rs_eijat}. Constraint (5.23) captures how evacuees can end up in location i at time t with a particular status where p ≠ p_max′ and s_e ≥ 2: they can have both demands unsatisfied,

they can have just their resource demand satisfied, or they can arrive from another location. We

now present the constraints governing the behavior of evacuees with the transition priority level,

p′max.

Q_{p_max′s_rs_eit} = BS_{p_max′s_rs_eit} + BN_{p_max′s_rs_eit} + RS_{p_max′s_rs_eit} + ES_{p_max′s_rs_eit} + ∑_{j ∈ C} ∑_{a ∈ A} f_{p_max′s_rs_eijat}   ∀s_r ∈ S_r, ∀s_e ∈ S_e, ∀i ∈ C, ∀t ∈ T \ {1, …, t_lim}   (5.24)

Q_{p_max′s_rs_eit} = ∑_{(p′,s_r′,s_e′) ∈ A^{BN}_N(p_max′,s_r,s_e)} BN_{p′s_r′s_e′i(t−1)} + ∑_{(p′,s_r′,s_e′) ∈ A^{RS}_N(p_max′,s_r,s_e)} RS_{p′s_r′s_e′i(t−1)}
+ ∑_{(p′,s_r′,s_e′) ∈ A^{ES}_N(p_max′,s_r,s_e)} ES_{p′s_r′s_e′i(t−1)}   ∀s_r ∈ S_r, ∀s_e ∈ S_e, ∀i ∈ C, ∀t ∈ T \ {1, …, t_lim}   (5.25)

The first difference is that once equipment demand is met in the transition priority level, then we

move the evacuee into the highest demand level (recall that the transition priority level is meant to


represent an evacuee who already has ‘normal’ equipment demand met but then requires ‘normal

plus medical’ equipment demand). The second difference in these constraints is that no evacuee can reach a location in the transition priority via transportation, which alters Constraint (5.25). There are two reasons for this: i) the equipment demand of evacuees in normal priority levels is satisfied in transit, and ii) evacuees in the transition priority who board an asset transition to the highest priority level because they release the equipment they currently hold. One can only arrive in this status by having one's equipment demand satisfied while making the jump to the transition priority level.

Allocating Resources and Equipment to Demand Constraints

α_{rp}(BS_{ps_rs_eit} + RS_{ps_rs_eit}) = K_{rips_rs_et}   ∀p ∈ P \ {p_max′}, ∀r ∈ CR, ∀s_r ∈ S_r, ∀s_e ∈ S_e \ {1}, ∀i ∈ C, ∀t ∈ T   (5.26)

α_{rp} BS_{ps_r(s_e=1)it} = K_{rips_r(s_e=1)t}   ∀p ∈ P \ {p_max′}, ∀r ∈ CR, ∀s_r ∈ S_r, ∀i ∈ C, ∀t ∈ T   (5.27)

Constraints (5.26) and (5.27) connect the satisfied flow decisions for resources and equipment, respectively, for evacuees in location i at time t with a certain status to the allocation decisions made for evacuees in that location at that time with that status. Note that since Constraint (5.27) is created for those who are in s_e = 1, it does not contain the RS component, as the equipment demand is already satisfied for those with s_e = 1.

α_{r p_max′}(BS_{p_max′s_rs_eit} + RS_{p_max′s_rs_eit}) = K_{ri p_max′ s_r s_e t}   ∀r ∈ CR, ∀s_r ∈ S_r, ∀s_e ∈ S_e, ∀i ∈ C, ∀t ∈ T \ {1, …, t_lim}   (5.28)

ζ_{e p_max′}(BS_{p_max′s_rs_eit} + ES_{p_max′s_rs_eit}) = D_{ei p_max′ s_r s_e t}   ∀e ∈ RE, ∀s_r ∈ S_r, ∀s_e ∈ S_e, ∀i ∈ C, ∀t ∈ T \ {1, …, t_lim}   (5.29)

Constraints (5.28) and (5.29) connect the satisfied flow decisions for resources and equipment, respectively, for evacuees at a location at time t in a certain status in the transition priority level to the amount that is allocated to evacuees with that status in the transition priority level at that location


at that time.

ζ_{ep}(BS_{ps_rs_eit} + ES_{ps_rs_eit}) = D_{eips_rs_et}   ∀p ∈ P \ {p_max′}, ∀s_r ∈ S_r, ∀s_e ∈ S_e \ {1}, ∀e ∈ RE, ∀i ∈ C, ∀t ∈ T   (5.30)

Constraint (5.30) ensures that equipment is allocated to satisfy the demand of those evacuees who

will have their equipment demands satisfied in location i at time t. The last constraints focus on

variable restrictions.

I_{rit}, T_{eit}, K_{rips_rs_et}, D_{eips_rs_et}, g_{rijat}, h_{eijat}, f_{ps_rs_eijat}, Q_{ps_rs_eit}, BS_{ps_rs_eit}, BN_{ps_rs_eit}, ES_{ps_rs_eit}, RS_{ps_rs_eit}, m_{ijat}, w_{it} ∈ Z₊   ∀r ∈ CR, ∀e ∈ RE, ∀i ∈ C, ∀p ∈ P, ∀s_r ∈ S_r, ∀s_e ∈ S_e, ∀a ∈ A, ∀t ∈ T   (5.31)

X_{ait}, Z_{ait}, Y_{aijt} ∈ {0, 1}   ∀a ∈ A, ∀i ∈ C, ∀j ∈ C, ∀t ∈ T   (5.32)

5.5 Overview of Solution Methodologies

The mathematical model is a large-scale IP that has characteristics similar to problems in

evacuation and resource allocation. It is, therefore, important to recognize that solving our model

directly with a commercial solver may be time-prohibitive and that customized solution approaches

may be necessary. We describe two heuristic approaches for identifying quality solutions quickly.

As shown in the Appendix, solving the IP using a warm-start heuristic solution outperforms solving

the IP directly (see Section A.2).

5.5.1 Conservative One-by-One Heuristic (COBOH)

We approach the problem by asking the following question: How can the model be solved if we consider the problem through a practitioner's eyes? The focus would likely be on allocating assets to move the evacuees around and then using available capacity to bring relief commodities when possible. A practitioner would naturally carry the evacuees to the closest available villages via the available ships in a greedy manner. The practitioner would then use the planes to transport everyone to Anchorage. In other words, first a ship carries a certain number of evacuees to a village,


then a plane takes action and carries the evacuees to Anchorage at some point later in time. This

pair of operations can be repeated in an iterative way by taking all the capacity constraints into

consideration.

This heuristic focuses on the transportation decisions only; we then optimize the resource allocation decisions with the transportation decisions fixed. In other words, we look at the best possible use of the response resources once we have planned when to evacuate passengers from the cruise ship to the villages and from the villages to Anchorage. Therefore, we are examining the best possible resource allocation decisions whereas, in practice, triage and rationing may be implemented to make these decisions. The insights that we expect to obtain from the heuristic approach are: i) an understanding of when an OR model is truly needed for an Arctic MRE, and ii) the benefits of applying the complex model to determine the response decisions rather than focusing solely on the transportation decisions. The

pseudocode of the heuristic can be found in the Appendix (see Section A.1). Here, we focus on

explaining the heuristic in an informal way.

Each asset is assigned to its initial location by taking the deployment times into consideration. As a second step, all available ships are routed towards the cruise ship to assist with the

evacuation. We then start our iterative method and examine the ship and plane sets in sequence.

For every ship, we calculate, for each village reachable from the cruise ship, the maximum number of evacuees that can be transported together with the earliest arrival time, and form the ratio of the two. A higher ratio implies that we can carry more evacuees to a village within a shorter time period. Each ship is linked with the village that has the highest ratio for that ship. The number of evacuees that can be carried to a village depends on the hosting capacity of the village, the passenger capacity of the ship, and the number of evacuees currently at the village. Once every ship is associated with a village, we pick the ship with the earliest arrival time. In the case of a tie, we prefer the ship with the higher

ratio.

We then proceed to make a transportation decision for a plane. Each plane is examined

one by one and a similar analysis is conducted with slight changes. For every village, we compute the minimum evacuee population over the remainder of the evacuation time horizon after the plane's earliest possible departure time. Calculating the minimum number of evacuees that can be transported from a village is a significant step since it ensures feasibility with respect to the population numbers in the locations.

The number of evacuees that can be carried by the plane is set as the minimum of the evacuee population at the village and the passenger capacity of the plane. Then, we essentially proceed in

the same way as the ship portion of the heuristic except that we also examine the airport capacities

and make sure that the constraint related to airport capacities is not violated. If a plane cannot

be associated with any village, then there is no way to utilize the plane for the remainder of the

horizon. In this case, we make an idle transportation decision for that plane. The asset either stays

in its current location or moves to another location while checking the airport capacity.

If a ship or a plane reaches the end of the time horizon, we eliminate the corresponding asset from consideration. The iterative method continues until both the ship and plane sets become empty, implying that no more transportation decisions are required or all evacuees have arrived in

Anchorage. Once the heuristic is over, we obtain a full set of transportation variables. We refer

to this heuristic as the one-by-one heuristic since we are allocating assets individually, in what is

essentially a greedy manner.
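As a concrete illustration, the ship-selection step described above can be sketched in a few lines. The data structures, names, and numbers below are hypothetical stand-ins for illustration only, not the dissertation's implementation (which was written in OPL/CPLEX):

```python
from dataclasses import dataclass

@dataclass
class Ship:
    name: str
    capacity: int    # passenger capacity of the ship
    arrival: dict    # village -> earliest arrival period from the cruise ship

def pick_next_ship(ships, village_slack):
    """Greedy step of COBOH: link each ship to the village with the highest
    (evacuees carried) / (arrival time) ratio, then pick the ship with the
    earliest arrival time, breaking ties by the higher ratio."""
    links = []
    for ship in ships:
        best = None
        for village, t in ship.arrival.items():
            carried = min(ship.capacity, village_slack[village])
            if carried == 0:
                continue
            ratio = carried / t
            if best is None or ratio > best[0]:
                best = (ratio, t, village, carried)
        if best is not None:
            links.append((ship.name, *best))
    if not links:
        return None
    # earliest arrival first; ties broken by the higher ratio
    name, ratio, t, village, carried = min(links, key=lambda L: (L[2], -L[1]))
    return name, village, carried, t

# Illustrative inputs: remaining hosting slack per village and arrival times.
ships = [Ship("WLB 206", 86, {"Point Lay": 2, "Wainwright": 3}),
         Ship("WLM 175", 24, {"Point Lay": 1})]
slack = {"Point Lay": 107, "Wainwright": 233}
print(pick_next_ship(ships, slack))  # → ('WLM 175', 'Point Lay', 24, 1)
```

The selected ship's departure would then be committed, the village slack reduced, and the procedure repeated, alternating with the analogous plane step.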

5.5.2 Optimizing Transportation Heuristic (OTH)

In the second method, we optimize just the transportation decisions, focusing on the evacuation first, and then optimize the support decisions based on these ‘greedy’ evacuation decisions. In particular, we move all the transportation variables related to the assets and evacuees (i.e., X_{ait}, Y_{aijt}, and f_{ps_rs_eijat}) into an optimization problem and ignore the ones related to the relief materials (i.e., g_{rijat} and h_{eijat}). With this conversion, we intend to optimize the transportation decisions. We first define two new variables, shown in Table 5.6, that focus on the number of people in a location and/or being transported, and we ‘remove’ the f_{ps_rs_eijat} variables since they are dictated by relief/support decisions.

Table 5.6: New variables defined
Variable | Definition
w_{it} | the number of people staying in location i ∈ C at time t ∈ T
m_{ijat} | the number of people leaving location i ∈ C to go to location j ∈ C via asset a ∈ A at time t ∈ T

We rearrange the last four components of the Objective Function (5.1) and update the related constraints. Then, the modified evacuation IP (EvacIP) can be represented as:

(EvacIP): min ∑_{i ∈ C} ∑_{a ∈ A} ∑_{t ∈ T} (t + τ_{i,“Anc”,a}) m_{i,“Anc”,a,t} + ∑_{j ∈ C} ∑_{a ∈ A} ∑_{t ∈ T} t · m_{“C. Ship”,j,a,t} + ∑_{i ∈ V} 2|T| w_{i,|T|} + 3|T| w_{“C. Ship”,|T|}   (5.33)

s.t.  m_{ijat} ≤ Ψ_a Y_{aijt}   ∀a ∈ A, ∀i ∈ C, ∀j ∈ C, ∀t ∈ T   (5.34)

w_{i(t=1)} = ∑_{p ∈ P} ∑_{s_r ∈ S_r} ∑_{s_e ∈ S_e} ν_{ps_rs_ei}   ∀i ∈ C   (5.35)

w_{it} + ∑_{j ∈ C} ∑_{a ∈ A} m_{ijat} ≤ ϑ_i   ∀i ∈ C, ∀t ∈ T   (5.36)

w_{it} + ∑_{j ∈ C} ∑_{a ∈ A} m_{ijat} = w_{i(t−1)} + ∑_{j ∈ C} ∑_{a ∈ A} m_{jia(t−τ_{aji})}   ∀i ∈ C, ∀t ∈ T   (5.37)

Constraints (5.4)–(5.10) and (5.32)

w_{it} ∈ Z₊   ∀i ∈ C, ∀t ∈ T   (5.38)

m_{ijat} ∈ Z₊   ∀i ∈ C, ∀j ∈ C, ∀a ∈ A, ∀t ∈ T.   (5.39)

Note that Constraints (5.34), (5.35), and (5.36) replace Constraints (5.3), (5.18), and (5.19), respectively, and play the same role. The left hand side of Constraint (5.37) captures the number of evacuees staying in and leaving location i at time t. The right hand side of the constraint determines: i) how many evacuees stayed in location i at time t − 1, and ii) how many evacuees left other locations at time t − τ_{aji} to arrive at i at time t.
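The flow-balance logic of Constraint (5.37) can be sanity-checked on a toy instance. Everything below (two locations, one asset, a one-period travel time, and the population numbers) is invented for illustration; the check simply verifies that staying plus outflow equals carry-over plus delayed inflow:

```python
# Toy verification of the evacuee flow-balance in Constraint (5.37).
def balance_holds(w, m, tau, locations, assets, horizon):
    """w[i][t]: people staying at i in period t; m[i][j][a][t]: people
    leaving i for j on asset a in period t; tau[a][j][i]: travel periods
    from j to i on asset a. The balance is checked for t >= 2."""
    for i in locations:
        for t in range(2, horizon + 1):
            out_flow = sum(m[i][j][a][t] for j in locations for a in assets)
            in_flow = 0
            for j in locations:
                for a in assets:
                    t_dep = t - tau[a][j][i]
                    if t_dep >= 1:
                        in_flow += m[j][i][a][t_dep]
            if w[i][t] + out_flow != w[i][t - 1] + in_flow:
                return False
    return True

locations = ["ship", "village"]
assets = ["boat"]
horizon = 3
tau = {"boat": {"ship": {"ship": 0, "village": 1},
                "village": {"ship": 1, "village": 0}}}
m = {i: {j: {a: {t: 0 for t in range(1, horizon + 1)}
             for a in assets} for j in locations} for i in locations}
m["ship"]["village"]["boat"][2] = 4   # 4 evacuees depart in period 2, arrive in period 3
w = {"ship": {1: 10, 2: 6, 3: 6}, "village": {1: 0, 2: 0, 3: 4}}
print(balance_holds(w, m, tau, locations, assets, horizon))  # → True
```

Changing any single entry of `w` or `m` breaks the equality, which is exactly what Constraint (5.37) enforces at every location and period.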

In particular, EvacIP answers the following question: What happens when we prefer to focus on the evacuation decisions without worrying about distributing any relief resources? We then use the model as a heuristic approach and warm start the original IP model via the partial solution obtained through EvacIP.

5.6 Computational Study: Data Set Description and Baseline

Analysis

The objective of our computational study is to analyze different potential response events

in Arctic Alaska and obtain insights into policy questions by applying our novel MRE model. All

the experiments are conducted in the Optimization Programming Language (OPL) using CPLEX

12.8.1 as the IP solver on a Dell machine with an Intel Core i7-8700 CPU at 3.20GHz and 64 GB of RAM.


5.6.1 Case Study Description

In this section, we discuss the data collected for our case study. We utilize online sources

that are publicly available from institutions operating in Arctic Alaska. We have discussed the

application and data with experts from the region. The case studies were also created based on

discussions with District 17 of the USCG, although the data and model have yet to be fully verified

and validated with them.

We separate our test cases into five incident locations. The purpose is to identify the areas

where there are ‘capability gaps’ for MREs from Anchorage through the Northwest Passage. The largest cruise ship that has entered the region is the Crystal Serenity (Waldholz, 2016), and we have used its planned route, shown in Figure 5.9, to select the five incident locations. We are interested in

the region starting from the Bering Strait, through the Chukchi Sea, and into the Beaufort Sea. The

number of evacuation time periods is set equal to sixteen and there are three priority levels, where we assume that 65%, 25%, and 10% of the evacuees are at levels 1, 2, and 3, respectively. We consider MREs where there are 800, 1200, and 1600 people on the cruise ship.

Incident | Location | Coordinates
1 | Bering Strait | 65°50'59.9"N 168°27'31.7"W
2 | Chukchi Sea | 67°24'08.0"N 167°56'48.3"W
3 | Chukchi Sea | 69°44'00.7"N 166°54'59.8"W
4 | Chukchi Sea | 71°15'24.1"N 160°23'27.4"W
5 | Beaufort Sea | 71°26'05.8"N 154°57'50.3"W

Figure 5.9: Incident locations selected on the Crystal Serenity's planned route (Waldholz, 2016)

In our data, Utqiagvik, Nome, Kotzebue, Point Hope, Point Lay, and Wainwright are the

communities where the evacuees can be hosted in an Arctic MRE. Note that Point Hope, Point Lay,

and Wainwright are relatively small (i.e., ones that have a population of fewer than 1000 people)

but are included since they are located in the North Slope Borough, which has a robust emergency

management department (Brooks, 2020). Each community has an airport implying that it is feasible

to take off and land there via certain planes (Federal Aviation Administration, 2019). It would be

inappropriate for large planes to use airports in the small villages due to their short runways. For

instance, Point Hope, Point Lay, and Wainwright each have a single runway no longer than 4,500 feet (GCR, 2017), which does not meet the normal landing requirements of the HC-130H or Boeing

737-700. Communities also have a number of small boats and vehicles that can be used for local transport (i.e., to shuttle evacuees from offshore ships to shore or to help move them from the shore to the airport). Since there is no capacity issue regarding such local transportation operations, we

do not model them. We also consider the potential use of the inland community of Atqasuk as a

pre-positioning site for resources and equipment. Each community has a carrying capacity, representing the number of evacuees that can be hosted, which is set equal to 40% of its population (see Table 5.7).

Table 5.7: Populations and capacities in locations
Location | Nome | Kotzebue | Point Hope | Point Lay | Atqasuk | Wainwright | Utqiagvik | Anchorage
Num. of People | 3,841 | 3,266 | 709 | 269 | 244 | 584 | 4,438 | 294,356
Carrying Capacity | 1,536 | 1,306 | 283 | 107 | 97 | 233 | 1,775 | ∞
Airport Capacity | 3 | 3 | 1 | 1 | 1 | 1 | 3 | 5
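The carrying capacities in Table 5.7 are consistent with taking 40% of each population and rounding down; a quick sketch of that rule (the rounding-down step is inferred from the table, not stated in the text, and Anchorage is treated as uncapacitated):

```python
import math

# Populations from Table 5.7; capacity = floor(0.4 * population).
population = {"Nome": 3841, "Kotzebue": 3266, "Point Hope": 709,
              "Point Lay": 269, "Atqasuk": 244, "Wainwright": 584,
              "Utqiagvik": 4438}

carrying_capacity = {c: math.floor(0.4 * n) for c, n in population.items()}
print(carrying_capacity["Nome"], carrying_capacity["Utqiagvik"])  # → 1536 1775
```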

The distance between each location is calculated by Google Maps in miles. Travel routes

are separated into two categories: i) sea distance (i.e., travel that accounts for the shoreline) and ii)

air distance. Discussions with USCG suggested that transportation directly from the cruise ship to

Anchorage is undesirable since (1) the ships that the evacuees would be moved to (including sister

ships) are not designed for passenger travel and (2) it could take a significant amount of time to

reach Anchorage from the Arctic via the sea. To obtain the number of time periods travel requires,

we calculate the travel time between two locations via each asset as ⌈distance / (6 × cruise speed)⌉. We assume there are 6 hours per time period.
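This discretization can be expressed directly; the speed and distance in the example below are illustrative, not taken from the case study data:

```python
import math

def travel_periods(distance_miles, cruise_speed_mph):
    """Number of 6-hour time periods needed to cover the distance,
    rounding any partial period up (the ceiling in the formula above)."""
    return math.ceil(distance_miles / (6 * cruise_speed_mph))

# e.g., a ship cruising at 17.3 mph covers 103.8 miles per period,
# so a 300-mile leg consumes 3 time periods.
print(travel_periods(300, 17.3))  # → 3
```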

The available assets play an important role in transportation and logistics operations. Examining various tabletop exercises (Coast Guard News, 2016; McNutt, 2016) has led us to incorporate a set of available assets owned by the USCG, Alaska Air National Guard (AANG), North Slope

Borough (NSB), U.S. Air Force (USAF), and the commercial airlines operating in the region (i.e.,

Alaska Airlines and Ravn Alaska).

Moving all evacuees directly out of the Arctic (where evacuees can be supported), or even

into a single Arctic community, is not practical since the existing infrastructure and transportation

assets may not be capable of providing sufficient support or it may not be desirable to have evacuees

on those assets for long periods of time. While planes are utilized to transfer people from the

communities to Anchorage, as well as to deliver commodities to the communities, rescue ships are

only used to carry people from the cruise ship to the communities. As a result, the cargo capacity of ships is set to zero. Aircraft are not considered in the operation of taking people off the cruise

ship. Note that helicopters are not specifically modeled in the set of air assets, although they would

play an important role in the response in transporting high-priority evacuees off the cruise ship,

possibly lightering passengers from the cruise ship to rescue ships, and moving responders onto the

ship. The main reason for not including helicopters is that they would be used in conjunction with

rescue ships except in cases of extreme medical duress.

Since military and commercial assets require mobilization time, we assume that planes

owned by USAF and commercial airlines will be available to support the MRE within 24 hours of

the event. USCG, AANG, and NSB’s assets tend to be dedicated to this type of emergency more

than USAF and commercial airlines.

Given the assumptions mentioned, we include relevant planes that we believe would be

available for the response (see Table 5.8). In our baseline experiment, we have examined the length

of the runway required for each plane to land and have disallowed the landing of large planes, such as the HC-130H, Lockheed HC-130, and Boeing 737-700, in the small villages (Point Lay, Point

Hope, and Wainwright). This restriction will be lifted in certain analyses, which would represent

investments to the runways of these small villages. However, we do note that who pays the costs

associated with the response is a question outside the scope of this paper (i.e., the cruise ship

company or its insurance may pay the costs of the response back to the federal government). We

will create scenarios in which only a subset of these assets is ready to use in order to observe the impact of different asset types. For ships, the passenger capacity is set as the maximum number of crew members allowed on board. For each plane, we use 60% of its cargo capacity to ensure that no problem is faced when loading resources and/or equipment without examining a detailed packing plan.

We now discuss some remaining assumptions used when creating this data set. We focus

on situations where ships are the only asset type that can carry evacuees from a cruise ship to the

communities, because planes cannot land on ships. We assume evacuees need equipment such as

shelter (either in ‘public space’, such as a school or portable shelter) and sleeping bags. Public space

is different than a portable shelter since it cannot be transported. We further assume that the time

to refuel an asset is sufficiently small compared to the travel time of the asset and, therefore, does

not need to be accounted for in our model.

Assets are assigned to the closest locations, which are assumed to be known a priori, when


Table 5.8: List of assets (Griner, 2013; USCG, 2016; Office of Aviation Forces, 2019; Sherman, 2000; United States Air Force, 2008; Alaska Airlines, 2020; Brady, 2019; RavnAir Alaska, 2020; Cessna, 2019)

Asset | Type | Owner | Available Num. | Num. in Baseline | Passenger Cap. | Cargo Cap. (lbs) | Cruise Speed (mi.)
HC 130H | Aircraft | USCG | 2 | 2 | 92 | 51,000 | 374
Lockheed HC-130 | Aircraft | AANG | 1 | 1 | 20 | 30,000 | 251
Learjet 31A | Aircraft | NSB | 2 | 2 | 6 | 2,000 | 441
Boeing 737-700 | Aircraft | Alaska Airlines | 1 | 1 | 124 | 16,505 | 460
Beechcraft 1900C | Aircraft | Ravn Alaska | 1 | 1 | 12 | 2,030 | 250
WLB 206 | Buoy Tender | USCG | 1 | 1 | 86 | 0 | 17.3
WLB 212 | Buoy Tender | USCG | 1 | 1 | 86 | 0 | 17.3
WLM 175 | Buoy Tender | USCG | 1 | 1 | 24 | 0 | 13.8
282 WMEC | Endurance Cutter | USCG | 1 | 1 | 99 | 0 | 13.8
378 WHEC | Endurance Cutter | USCG | 1 | 1 | 160 | 0 | 12.7
154 WPC | Fast Response Cutter | USCG | 2 | 1 | 24 | 0 | 32.2

the rescue event is initiated. The majority of the planes are located around Anchorage and Kodiak

with a few exceptions. For instance, the Learjet 31A type aircraft is often positioned in Utqiagvik

(Griner, 2013) and will be deployed there. These initial locations for the planes are kept the same

regardless of the incident area throughout our analysis. On the other hand, since Coast Guard ships

are actively used for normal operations, we prefer not to fix locations for the ships. Ships are deployed

to their initial locations based on the incident area meaning that initial locations may vary across

instances. For example, data for the incidents can capture the case where certain ships (e.g., sister

ship(s)) ‘move’ with the cruise ship in order to respond to an incident.

The stock levels of available resources and equipment in each location are illustrated in Table

5.9. We assume that the cruise ship would have enough supplies to satisfy all evacuees’ demands for

the first six time periods of the response. We assume that no resource can be taken out from the

ship when an evacuee is placed on a ship. Equipment demand will be satisfied while the evacuees are

on the ship. Given the relative population sizes of Utqiagvik, Kotzebue and Nome, we assume that

there is some level of water and food that can be used in the MRE. In addition, as a result of having

Level 4 Trauma Centers, there is medical support in Utqiagvik, Nome, and Kotzebue. We assume

a large stockpile in Anchorage for all the commodity types. Public facilities (e.g., churches, sport

centers etc.) could be utilized in response events in lieu of portable shelters and will be included in

our analysis. We also note that the level of resources available in Anchorage are at least an order

of magnitude larger than those available in the villages and, therefore, for the purposes of modeling

the MRE, we do not need to capture allocation decisions there.

We then share the list of resources and equipment together with the unit weight and the re-

quired amount for each priority level in Table 5.10. Lastly, Table 5.11 presents the initial deployment

locations for each ship used in the baseline experiment according to each incident location.


Table 5.9: Initial inventory in each location
Name | Nome | Kotzebue | Point Hope | Point Lay | Atqasuk | Wainwright | Utqiagvik | Anchorage
Water | 150 | 150 | 0 | 0 | 300 | 0 | 200 | 5000
Food | 150 | 150 | 0 | 0 | 300 | 0 | 200 | 5000
Sleeping Bag | 0 | 0 | 0 | 0 | 75 | 0 | 0 | 500
Portable Shelter | 0 | 0 | 0 | 0 | 50 | 0 | 0 | 150
Public Space | 200 | 200 | 50 | 50 | 50 | 50 | 250 | 2000
Medical Support | 25 | 25 | 0 | 0 | 0 | 0 | 45 | 400

Table 5.10: Resource and equipment list (Division of Homeland Security & Emergency Management, 2019; World Health Organization, 2019)
Name | Type | Weight (lbs) | Priority Level 1 | Priority Level 2 | Priority Level 3′ | Priority Level 3
Water | Resource | 1.54 | 1 | 2 | 3 | 3
Food | Resource | 2.35 | 1 | 1 | 1 | 1
Sleeping Bag | Equipment | 7.50 | 1 | 1 | 0 | 0
Portable Shelter | Equipment | 26.40 | 1 | 1 | 0 | 0
Medical Support | Equipment | — | 0 | 0 | 1 | 1

Table 5.11: Initial deployment locations for ships
Ship | Incident 1 | Incident 2 | Incident 3 | Incident 4 | Incident 5
WLB 206 | Point Lay | Nome | Nome | Utqiagvik | Utqiagvik
WLB 207 | Utqiagvik | Point Lay | Utqiagvik | Nome | Kotzebue
WLB 212 | Point Hope | Point Lay | Wainwright | Point Lay | Point Lay
WLM 175 | Point Hope | Point Hope | Point Hope | Point Hope | Point Lay
282 WMEC | Kotzebue | Point Hope | Point Hope | Wainwright | Wainwright
378 WHEC | Nome | Nome | Utqiagvik | Utqiagvik | Utqiagvik
154 WPC | Kotzebue | Kotzebue | Kotzebue | Kotzebue | Kotzebue

Lastly, we describe the flow arcs designed for the evacuation balance constraints. If an evacuee receives equipment in any time period, the corresponding s_e becomes one regardless of the previous s_r and s_e. Recall that this implies that the evacuee's equipment demand is fully met. On the other hand, if the resource demand is satisfied, then the changes demonstrated in Table 5.12 take place according to the evacuee's priority level. Since a priority level symbolizes the seriousness of an evacuee's medical situation, decreases in s_r occur more slowly for higher priority levels. As for the changes in priority levels, transitions take place mainly based on the value of s_r. The situations where an evacuee's priority level increases are shown in Tables 5.13 and 5.14 (i.e., arc jumps taking place between the layers).

Table 5.12: Changes in s_r when resource demand is met
Priority Level | Transition
Priority Level 1 | s_r ← 1
Priority Level 2 | If s_r ≤ 4, then s_r ← 1; otherwise s_r ← s_r − 3
Priority Level 3 | If s_r ≤ 3, then s_r ← 1; otherwise s_r ← s_r − 2
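Table 5.12's transition rule can be written out as a small function. The integer encoding of priority levels (1, 2, 3) is an assumption for illustration, and the transition priority 3′ is deliberately not covered, since the table does not define a rule for it:

```python
def next_sr(priority, sr):
    """New s_r after an evacuee's resource demand is met (Table 5.12)."""
    if priority == 1:
        return 1
    if priority == 2:
        return 1 if sr <= 4 else sr - 3
    if priority == 3:
        return 1 if sr <= 3 else sr - 2
    raise ValueError("priority level not covered by Table 5.12")

# A level-2 evacuee at s_r = 8 drops to 5; a level-3 evacuee at s_r = 9 drops to 7.
print(next_sr(2, 8), next_sr(3, 9))  # → 5 7
```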


Table 5.13: Jumps in A^{BN}_N(p, s_r, s_e)
Priority Level | Jump (from → to)
Priority Level 1 | (p = 1, s_r = 4, s_e ≥ 2) → (p = 2, s_r = 5, s_e + 1)
Priority Level 2 | (p = 2, s_r = 8, s_e ≥ 2) → (p = 3, s_r = 9, s_e + 1)
Priority Level 3 | —

Table 5.14: Jumps in A^{ES}_N(p, s_r, s_e)
Priority Level | Jump (from → to)
Priority Level 1 | (p = 1, s_r = 5, s_e ≥ 1) → (p = 2, s_r = 6, s_e = 1)
Priority Level 2 | (p = 2, s_r = 9, s_e ≥ 1) → (p = 3, s_r = 10, s_e = 1)
Priority Level 3 | —

5.6.2 Baseline Experiment

During our experiments, we set a time limit of 60 minutes. If the solution method does not

converge to the optimal solution within the time limit, the best solution obtained by then together

with its optimality gap is reported. Lastly, for each experiment, we examine a total of fifteen different

scenarios consisting of five different incident locations along with three different numbers of evacuees.

We start our discussion with a baseline experiment where we assume that there is a sufficient set of

resources and conditions (e.g., travel times) are in an ideal setting. We will vary certain parameters

from this baseline (e.g., when planes are available) in examining critical aspects of the response. For

this baseline, we consider only a subset of the previously described assets during the response (see

Table 5.8).

We discuss the computational performance of various approaches to solve the problem in

the Appendix (see Section A.2). Of note, warm-starting CPLEX with either heuristic significantly

outperforms directly solving the model. Further, the intuitive COBOH results in solutions with gaps

well over 10%, thus indicating the importance of using optimization to examine response efforts.

Based on this analysis, we will conduct the remaining experiments by warm-starting the IP with the

solution identified by OTH.

We now provide detailed analysis on the baseline experiment. Fig 5.10 depicts the objective

values for each scenario. The total average evacuation time is computed as the sum of the average

time to leave the cruise ship and the average time to arrive at Anchorage. Recall that lower objectives

indicate a more ‘successful’ response since we are focusing on minimizing the total of evacuation time

and the impact on the evacuees.
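The objective component described above can be illustrated with a small sketch (our own code; the list-based interface is an assumption, not the model's notation):

```python
def total_average_evacuation_time(leave_ship_times, arrive_anchorage_times):
    """Total average evacuation time: the average time to leave the
    cruise ship plus the average time to arrive at Anchorage."""
    avg_leave = sum(leave_ship_times) / len(leave_ship_times)
    avg_arrive = sum(arrive_anchorage_times) / len(arrive_anchorage_times)
    return avg_leave + avg_arrive
```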

It is important to mention that the model is able to successfully complete the response

event in all the scenarios except Incident 1-1600 and Incident 3-1600. The failure in Incident 1-1600 stems from the fact that the travel times from Incident 1 to the closest communities


[Figure 5.10: Objective values in the baseline experiment, grouped by incident (1–5) and passenger count (800, 1200, 1600); components: total average evacuation, staying in ship, staying in a village, deprivation cost for commodities, deprivation cost during travel.]

[Figure 5.11: The villages used in the baseline experiment; each column shows the number of evacuees transported to Nome, Kotzebue, Point Hope, Point Lay, Wainwright, and Utqiagvik for each incident and passenger count.]

of Kotzebue and Nome are longer compared to the other incident areas. In fact, the model does not

transport any evacuee to other northern communities (see Fig 5.11) since there exist sufficient hosting

and airport capacities in both communities. As for Incident 3, since large planes, which comprise

87% of the total capacity provided by all the planes, cannot be utilized in the small villages, the

model transports a number of evacuees to the farther communities (e.g., Kotzebue and Utqiagvik).

This results in higher travel times when using ships, which has the further negative effect of increasing the time needed to evacuate people from the cruise ship.

Further, we observe high penalty costs due to leaving some evacuees in the villages in

Incident 3 when there are 1200 and 1600 passengers. The underdeveloped runways in the airports

prevent large planes from landing in the small villages close to Incident 3, thus delaying transport of evacuees or causing them to go to large villages far away from the incident location. For instance, the number of evacuees who cannot make it to Anchorage and have to stay in Point Hope and Point Lay by the end of the rescue operation is equal to 200 and 267 for 1200 and

1600 passengers, respectively. These people would not stay in these villages indefinitely but there

are significant penalties for them being there at the end of the horizon.

Overall, the transportation decisions have the greatest influence on the objective. When

evacuating people from the incident and from the local communities is delayed, it not only increases

the total evacuation time, but it exponentially increases the total deprivation costs due to the limited

amount of available resources. However, the bottleneck is the transportation decisions concerning

passengers. We observed that the planes are able to move resources and equipment into the villages

at or before the time evacuees arrive into the village and, therefore, the ‘arrival time’ into the village

has the most impact on deprivation costs. This suggests that resources and equipment can enter the Arctic quickly enough to support a rescue operation. Although we did not specifically model the concept

of an Arctic fulfillment package, our results show that if these packages are the quickest way to


provide resources and equipment, then they play an important role in the response.
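The exponential growth of deprivation costs with time noted above is commonly modeled in the humanitarian-logistics literature with a function of the form γ(t) = a·e^(b·t). The sketch below illustrates this compounding effect only; the parameter values are made up for illustration and are not the costs calibrated in this chapter.

```python
import math

def deprivation_cost(t, a=1.0, b=0.5):
    """Illustrative exponential deprivation cost after t periods
    without a resource (parameters a and b are illustrative)."""
    return a * math.exp(b * t)
```

With this form, delaying arrival by a fixed number of periods multiplies the cost by a fixed factor rather than adding a fixed amount, which is why late arrivals into the villages dominate the deprivation component.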

The worst response performances are observed in Incidents 1 and 3. This indicates that response performance depends strongly on both the proximity of the incident area to the local communities and the capacities of those communities. Even though we have plenty of capacity around Incident

1, as a result of long travel distances, the evacuees suffer and the rescue event is challenging. While

we have close communities located around Incident 3, these small villages have limitations on how

they can be used during the response (e.g., the types of planes that can land there). Hence, the

rescue event is still challenging.

Another significant finding is that the response to Incident 4 is slightly worse than the

response to Incident 5 in each scenario (i.e., 800, 1200, and 1600 passengers). This is somewhat

counter-intuitive in the sense that Incident 4 could more easily take advantage of Point Lay, Wain-

wright and Utqiagvik. However, the response to Incident 5 performs better since the incident is

closer to the larger community of Utqiagvik.

Lastly, we provide the list of the villages used in each incident location in Fig 5.11, where

each column shows the villages together with the number of evacuees transported there. Overall,

Utqiagvik and Kotzebue are important large communities and Point Hope stands as a significant

small community. We will now analyze how the transportation decisions are affected based on

different situations faced by the response.

Managerial insights: It would be important to either increase the number of ships around

Nome and Kotzebue or locate ships that improve upon capacity or speed in order to address the

‘response gap’ in this area. It could also be quite useful to incorporate some infrastructure devel-

opments in small villages to be able to utilize larger aircraft and/or host more evacuees. In our

remaining analysis, we will focus on the impacts of such decisions.

5.6.3 ‘What If’ Analysis

In this section, we focus on examining various what-if scenarios that alter the data associated

with our baseline experiment to understand key issues around response capabilities. Our experiments

address: (i) the improvement in response when new infrastructure is developed in the Arctic, (ii)

the impact on response when it faces challenges (e.g., weather), and (iii) situations that combine (i)

and (ii).


5.6.3.1 Experiment 1: Improving Infrastructure in the Arctic

There may be opportunities to invest in improving infrastructure in order to increase the

‘slack’ in these systems so that they may be able to better handle emergency response. We look

to answer the following question: “How much positive effect may be seen when airport and hosting

capacities are increased and the runway lengths are upgraded in the small villages?”. Here, the

airport and hosting capacities are increased by one and by 20%, respectively, and runway lengths are upgraded so that any type of aircraft can land.
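These what-if modifications amount to a small transformation of the village data. The dictionary fields below are illustrative names of ours, not the model's notation:

```python
def upgrade_small_village(village):
    """Apply the Experiment 1 upgrades to a small village: airport
    capacity +1, hosting capacity +20%, and a runway long enough
    for any aircraft class. Returns a new dict; the input is kept."""
    upgraded = dict(village)
    upgraded["airport_capacity"] = village["airport_capacity"] + 1
    upgraded["hosting_capacity"] = int(round(village["hosting_capacity"] * 1.2))
    upgraded["max_aircraft_class"] = "any"  # runway upgraded
    return upgraded
```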

[Figure 5.12: The total objective values in the baseline experiment and Experiment 1, for each incident (1–5) and passenger count (800, 1200, 1600).]

[Figure 5.13: The deprivation costs incurred during travel in the baseline experiment and Experiment 1, for Incidents 2 and 3.]

First, we point out that the investments that are proposed for the small villages did not

improve the response for Incidents 1 and 5 (see Fig 5.12). Reaching the small villages still takes

the same amount of time via the ships. Hence, utilizing the closest villages, which are known to

have high capacities, under the baseline is still preferred. For example, although Wainwright has

improved its capabilities, Utqiagvik still has significant response capacity and we use it as the center

of the response in Incident 5.

We do see an improvement of between 15%-25% and 30%-45% in response capabilities

for Incidents 2 and 3, respectively (see Fig 5.12). For Incident 2, while on average 84% of the

evacuees are transported to Kotzebue in the baseline experiment, this ratio drops sharply to 7% in

Experiment 1. Point Hope becomes a more appealing location to move the evacuees since there are

major infrastructural improvements in the small villages. We observe a significant decrease in the

deprivation costs during travel (see Fig 5.13). This is because all the ships can reach the cruise ship

from Point Hope within one time period while it takes, on average, 1.8 periods to reach Kotzebue.

We observe a very similar pattern in Incident 3 and the model no longer transports any evacuee to

Utqiagvik. Further, improvements in the objective value in Incident 3 occur due to the fact that no


evacuee is left in the villages as a result of the improvements to the airports.

As for Incident 4, Wainwright becomes nearly as important as Utqiagvik by hosting roughly

half of the total evacuees. As a result, the total average evacuation time and the deprivation costs for

commodities decrease. Yet, we observe only a 3.5% decline on average in terms of the total objective.

We believe the reason behind such a small decrease is that, though Wainwright is highly utilized, it takes longer to reach Wainwright than Utqiagvik from the cruise ship. For

example, while we do not observe any deprivation costs during travel in the baseline experiment in

Incident 4, this trend changes in Experiment 1.

Managerial insights: We believe that infrastructure investments in terms of both improving

the runways and increasing the hosting capacities in Point Hope and Wainwright would be quite

beneficial. Both communities could play an important role in different incidents due to their central

locations in areas between larger Arctic communities. However, our results suggest that additional

infrastructure investment in some communities may have limited benefit, as incidents close to Nome,

Kotzebue, and Utqiagvik are responded to better than those incidents in more remote areas.

5.6.3.2 Experiment 2: Restricting Air Transportation as a Result of Bad Weather Conditions

We ask the following question: “What is the (negative) impact to response capabilities when

air operations are impacted by weather conditions?”. To answer this, we introduce a new constraint

such that no flight is operated between t = 1 and t = 8, which helps to model a storm that would

ground air operations.
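In an IP model, this grounding can be imposed by fixing every flight variable with a departure period in the storm window to zero. A solver-independent sketch (the variable-key layout `(asset, origin, destination, t)` is our assumption):

```python
def ground_air_operations(flight_vars, t_start=1, t_end=8):
    """Return the set of flight-variable keys (asset, origin, dest, t)
    that must be fixed to zero because departures in periods
    t_start..t_end are grounded by the storm."""
    return {key for key in flight_vars if t_start <= key[3] <= t_end}
```

In a solver such as CPLEX, the returned keys would then have their upper bounds set to zero; here we only enumerate them.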

We provide a comparison of the objective values with the baseline experiment in Fig 5.14.

When air operations are restricted as a result of bad weather conditions seen in the region, the model

fails to bring everyone to Anchorage when there are 1600 passengers in every incident, which is why

there is a high penalty cost due to leaving some evacuees in the villages. In addition, the model

leaves 57 more evacuees in the cruise ship in Incident 3 when there are 1600 passengers. It is worth

mentioning that airport restrictions in the small villages create a bottleneck and remain tight after

the air operations are started. Therefore, this indicates that investments to improve the airports in

the small villages would be quite beneficial in a response.

Restricting the air transportation has another negative impact since we can no longer move

resources and equipment into villages. This results in a significant increase in the deprivation costs


[Figure 5.14: The total objective values in the baseline experiment and Experiment 2, for each incident and passenger count.]

[Figure 5.15: The percentage increase in the objective in Experiment 3 compared to the baseline experiment, for each incident and passenger count.]

for commodities in every scenario (i.e., 63% on average). For instance, the deprivation cost increases more than fivefold in Incident 5-1600 (from 2,000.83 to 10,620.59). This helps to indicate that

resource and equipment stockpiles are not sufficient to support evacuees for long periods of time

without replenishment.

In terms of the impact on response capabilities, Fig 5.15 provides the increase in the objective

functions across all incidents from the baseline to this particular situation. We can view large gaps

as significantly decreasing response capabilities. Incidents 4 and 5 are most impacted in terms of

an increase to the objective. This is because we were able to evacuate people quickly through

Utqiagvik for these incidents in the baseline but since the planes are now grounded, we now need

to have evacuees wait in this community. Incident 3 experiences the smallest relative increase. This is because the arrival times into the villages from the distressed ship were a significant part of the objective and do not change as air operations are grounded. We observe that the percentage increase decreases roughly linearly in Incidents 1, 2, and 3 as the number of passengers increases. Meanwhile, the objective value in Incident 3 rises by approximately half, and the response becomes nearly identical to Incident 1 in terms of the objective values.

Managerial insights: The response tends to favor utilizing larger villages to transport evacuees. Hence, if infrastructure development is not possible in the region, then stockpiling extra relief commodities in larger villages, including both Kotzebue and Utqiagvik, would be a preferred alternative, ensuring longer support for evacuees as they arrive into the larger communities or as resources are transported to the smaller communities where evacuees may be.


5.6.3.3 Experiment 3: Decreasing the Speed of Ships Due to Navigating with Sea Ice

Weather conditions cause problems not only for air operations but also for ships traveling at sea. In particular, there may be sea ice in and around the ships as they travel in the Arctic.

The USCG owns polar-class icebreakers should ships become iced-in. Navigation in the uncertain

conditions surrounding sea ice may also reduce the speed at which ships can travel. We, therefore,

examine the response under conditions where the ship travel times would increase to move between

the cruise ship and the villages. In this case, the travel time of each ship is increased by one. One

important observation in this case is that the model fails to evacuate everyone from the cruise ship

in all the incident locations with 1600 passengers, as shown in Fig 5.16. Note that the model does not utilize a different transportation path for the evacuees in any of the incidents and uses the same villages as presented in Fig 5.11, but may leave more evacuees in the cruise ship.

[Figure 5.16: The number of evacuees remaining in the cruise ship at |T| in the baseline experiment and Experiment 3, by incident and passenger count.]

[Figure 5.17: The total number of tours completed by the ships in the baseline experiment and Experiment 3, with 1200 and 1600 passengers.]

The impact to response capabilities can be explained by examining the number of ‘tours’

that are made from the cruise ship to the villages by ships. We define a tour as the travel from a

village to cruise ship and from cruise ship to a village for a ship. Fig 5.17 compares the number of

tours under the baseline and this experiment with 1200 and 1600 passengers. Note that the number

of tours does not really change with 800 passengers due to the available total passenger capacity

provided by the ships. Under Experiment 3, the total number of tours made by the ships decreases significantly from the baseline when there are 1600 evacuees for Incidents 1, 2, and 3 (i.e., a 35% decrease) and decreases slightly for Incidents 4 and 5. In these cases, we no longer have

the capacity to evacuate everyone from the cruise ship within the planning horizon. This implies

that a larger number of ships may need to be present in challenging navigation conditions (which

causes its own problems) in order to achieve the same response as our baseline experiments.
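Counting tours from a ship's movement record can be sketched as follows. The log format (an ordered list of visited locations) is our own assumption; per the definition above, each visit to the cruise ship that sits between two village visits completes one tour.

```python
def count_tours(location_log, cruise_ship="CruiseShip"):
    """Count completed tours: a village-to-cruise-ship leg followed
    by a cruise-ship-to-village leg counts as one tour."""
    tours = 0
    for prev, curr, nxt in zip(location_log, location_log[1:], location_log[2:]):
        if curr == cruise_ship and prev != cruise_ship and nxt != cruise_ship:
            tours += 1
    return tours
```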


Furthermore, Incidents 2 and 4 are negatively impacted as a result of the evacuees who are

left in the villages. For instance, the number of evacuees sent to Point Hope increases by around

40% with 1200 and 1600 passengers compared to the baseline experiment in Incident 2. Since the

ship speeds are decreased, the model sends more evacuees to Point Hope due to its closeness to

the incident area (i.e., Incident 2) in spite of its limitations. As for Incident 4, the number of the

evacuees sent to Wainwright doubles when there are 1600 passengers. It is worth mentioning that

no hosting capacity of a village is fully utilized here, confirming that the airport limitations are the

bottleneck (not hosting capacity) of the small villages.

Managerial insights: A larger number of ships may need to be present in challenging navigation conditions (which causes its own problems) in order to achieve the same response as in our baseline experiments. In addition, increasing airport capacities in terms of upgrading

the runways is more critical than investing in the hosting capacities. Therefore, our results suggest

airport improvements as a critical aspect of potentially improving emergency response.

5.6.3.4 Experiment 4: Increasing Infrastructure in Small Villages and Decreasing the

Speed of Ships

We now examine whether improving the infrastructure systems in the small villages improves response capabilities (similar to Experiment 1) when ships are moving slower than their ideal speed

(similar to Experiment 3). We will increase the airport capacities by one, improve the length of their

runways, and decrease the speed of ships by one time unit. We will then determine the improvement in response capabilities over Experiment 3 that results from the infrastructure improvements. We do not see

any improvement in response capabilities from Experiment 3 for Incidents 1 and 5 (see Fig 5.18),

since almost all evacuees in these incidents are routed through Nome, Kotzebue, and Utqiagvik. We

see slight improvements in response capabilities in Incident 4 and major improvements in Incidents

2 and 3 (on average a 5.84%, 20.66% and 24.12% decrease, respectively), which are similar to the

ones from the baseline to Experiment 1.

The impact of the extra airport capacity together with the upgraded runways is twofold.

First, the model completely changes the transportation path for the evacuees in Incidents 2, 3 and

4. For example, while there are evacuees transported to Kotzebue and Utqiagvik in Incident 3, all the evacuees leaving the cruise ship are transported only to Point Hope and Point Lay in Experiment

4. Second, in Incident 4, although it does not change the number of evacuees left in the cruise ship,


[Figure 5.18: The total objective values in Experiment 3 and Experiment 4, for each incident and passenger count.]

it decreases the number of evacuees (i.e., from 184 to 24) that cannot make it out of the Arctic due

to the improvements in Wainwright.

We emphasize that Point Hope is the key village for both Incidents 2 and 3. When improving

the airport infrastructure in Point Hope, the model no longer uses Kotzebue and Utqiagvik in Incident

3. More evacuees are carried to Point Hope in Incident 2 while Kotzebue is less used. This is

an important observation indicating that airport investments in Point Hope might be critical in

improving response capabilities.

Managerial insights: This experiment reveals that airport investments in Point Hope would

likely be important (depending on their feasibility) in improving response capabilities in the face

of challenging situations. Thus, we believe that Point Hope could be the key location in the entire

region for the infrastructure investments.

5.6.3.5 Experiment 5: Increasing the Number of Evacuees in Higher Priority Levels

Here, we increase the number of evacuees in Priority 2 (to 35% of the total evacuees) and decrease the number in Priority 1 (to 55% of the total evacuees). Our goal is to test whether a) transportation

decisions would change, and b) logistic decisions would experience major changes. In this analysis,

the core transportation decisions remain the same and, therefore, the evacuation portion of the

objective remains the same. Therefore, priority is still given to this piece of the MRE. As expected,

we do see a slight increase in deprivation costs since more evacuees are at a higher priority level.

The only major increase occurs in Incident 3 (e.g., an increase of 11% for the deprivation costs when

evacuating 1600 people). This is because the evacuees stay longer in the Arctic and the penalty

from the change to the priority levels in this experiment accumulates.

Managerial insights: Although each air asset uses its maximum capacity, enough relief commodities cannot be carried when the priority levels have shifted. Hence, improving the infrastructure

in one of the small villages and/or pre-positioning relief commodities might be an effective solution

for such a scenario.

5.7 Conclusion

In this chapter, we have focused on how to respond to an MRE in Arctic Alaska. Our contribution to this area is that we propose a novel IP model whose main objective is to evacuate people

from the distressed ship to the local villages around the Arctic and transport them to Anchorage

while minimizing the negative impact of the event on them. We conduct extensive analysis of poten-

tial MREs along the route the Crystal Serenity traveled around Arctic Alaska. Our work helps to

focus on concerns about Arctic MREs that are increasingly likely to occur given the shift in Arctic

maritime transportation and tourism.

The human costs we identify and the emergency response gaps we model apply to situations broader than the U.S. Arctic and impact all Arctic nations to varying degrees.

This work helps to make a case that optimization models can help to address operational gaps for

Arctic MREs, where these gaps have been practically recognized but not modeled previous to our

efforts. Our paper models the tradeoffs that policy makers, regulators and transportation logistics

professionals must consider as transportation in remote and infrastructure-poor settings increases

due to climate and ecosystem changes.

Highlights obtained from our computational analysis are as follows. For the accidents occurring around Nome and Kotzebue (Incidents 1 and 2), a major issue is that not everyone would be able to evacuate from the cruise ship within a reasonable evacuation horizon due to the long distances

between the incident and Arctic villages. This is due both to the speed at which ships can travel and

the number of ships involved in the response. Therefore, in order to mitigate this vulnerability, it

is suggested that additional ships are made available to respond to an incident in this area (near

the Bering Strait). In addition, the response capabilities for these incidents are the least sensitive

to both infrastructure improvements and challenges in the response.

The most impactful change in improving response capabilities for Incident 2 is in improving

airport capacity and upgrading the runway of Point Hope since it is closer to Incident 2 than

Kotzebue but has significantly fewer people. In addition, this investment would help the response


to Incident 3. When infrastructure investments are made into the small villages, evacuees no longer

travel to the farther communities and Point Hope plays the central role during the rescue operation.

One recommendation to help improve response capabilities in the Arctic would be to invest in the

necessary capacities to have Point Hope (or a similar village in the area) play a more significant role

in the response. Note that Point Hope may not be the only option (it was in our case study since it

is in the North Slope Borough) for these potential upgrades. Wales sits at the smallest part of the

Bering Strait and could also significantly impact response capabilities.

We further observed how critical a role the village of Utqiagvik (the largest village in Arctic Alaska) plays in responding to MREs. The responses to Incidents 4 and 5 route the majority of evacuees

through this village (although for Incident 4, it collaborates with Wainwright in the response).

These experiments indicate that the ‘ground’ capacity of Utqiagvik is sufficient to move evacuees

through it during the response. We do assume that Coast Guard ships are relatively close to the

incidents when they occur, which helps to indicate that these ships, or others of similar size, should be in the area while a cruise ship travels through it.

In terms of future work, it will be critical to understand the practical feasibility of the

optimized responses. Although the model was built based on discussions with subject matter experts,

the output of the model has not been carefully vetted with those involved in the response or with those who represent the villages that would be impacted. This type of vetting may lead to the

discovery that certain core assumptions in the model should be updated. Community buy-in to the

optimization model will allow for its practical deployment. Further, we can improve upon this work

by modeling how infrastructure investments should be made across Arctic Alaska to best improve

our overall response capabilities. It is our long-term goal to build such infrastructure investment

models that not only account for response capabilities but also capture the benefits (or negative

impacts) of the infrastructure development on the communities in which it is built.


Chapter 6

Conclusion

In this dissertation, we study i) a group-based centrality metric called star degree centrality

(SDC) under both deterministic and stochastic settings, and ii) an Arctic emergency response event.

We first introduce the SDC and stochastic pseudo-SDC (SPSDC) problems and then examine each

problem by proposing integer programming (IP) formulations, studying their complexity for different

network structures, as well as developing decomposition-based exact solution methods. We then move to emergency response and examine Arctic mass rescue events. We design a

large-scale IP model which integrates transportation and logistics decisions to rescue evacuees after

a maritime accident. We propose a heuristic solution method and provide a wide range of what-if

analysis.


Appendices


Appendix A

A.1 Pseudocode of Conservative One-by-One Heuristic

In this section, we present the pseudocode of the Conservative One-by-One Heuristic.

Algorithm 5: Initialization

Input: A, C
lastTime[i] := the last time period when asset i is used
lastLoc[i] := the location of asset i at lastTime[i]
for a ∈ A do
    for c ∈ C do
        if πa,c = 1 then
            Xa,c,ωa ← 1
            lastTime[a] ← ωa
            lastLoc[a] ← c

Algorithm 6: ConservativeOneByOneHeuristic

Input: A, Aa, C
Initialization(A, C)
for v ∈ A \ Aa do
    SendAsset(v)
nextShip ← true
nextPlane ← true
while nextShip or nextPlane do
    if nextShip then
        nextShip ← ShipAssignment(A \ Aa)
    if nextPlane then
        nextPlane ← PlaneAssignment(A)
Finalize(A, Aa)

Algorithm 7: PopulationUpdate

Input: i ∈ C, depart, j ∈ C, arrival, carry
for depart ≤ t ≤ |T| do
    wi,t ← wi,t − carry
for arrival ≤ t ≤ |T| do
    wj,t ← wj,t + carry
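PopulationUpdate above can be transcribed directly into Python. The dict-of-lists storage for w (one population time series per location) is our assumption; the pseudocode leaves the data structure implicit.

```python
def population_update(w, i, depart, j, arrival, carry, horizon):
    """Move `carry` evacuees out of location i from period `depart` on
    and into location j from period `arrival` on, as in Algorithm 7.
    w maps each location to a list of populations indexed by period."""
    for t in range(depart, horizon + 1):
        w[i][t] -= carry
    for t in range(arrival, horizon + 1):
        w[j][t] += carry
```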


Algorithm 8: AirportCheck

Input: a ∈ Aa, ν ∈ V, depart, t, transit
decision ← true
if transit then
    airportTime ← depart − t
    num ← number of planes located in ν at airportTime
    if num + 1 > κν then
        decision ← false
    else
        while airportTime > lastTime[a] do
            num ← number of planes located in lastLoc[a] at airportTime
            if num + 1 > κlastLoc[a] then
                decision ← false
                break
            else
                airportTime −= 1
else
    airportTime ← lastTime[a]
    while airportTime ≤ depart do
        num ← number of planes located in ν at airportTime
        if num + 1 > κν then
            decision ← false
            break
        else
            airportTime += 1
return decision

Algorithm 9: SendAsset

Input: a ∈ A
if a ∈ A \ Aa then
    target ← CruiseShip
else
    target ← Anchorage
arrival ← lastTime[a] + τa,lastLoc[a],target
if arrival ≤ |T| then
    YlastLoc[a],target,a,lastTime[a] ← 1
    Xa,target,arrival ← 1
    lastTime[a] ← arrival
    lastLoc[a] ← target
else
    while lastTime[a] < |T| do
        Xa,lastLoc[a],lastTime[a]+1 ← 1
        lastTime[a] += 1
    A ← A \ a


Algorithm 10: ShipAssignment

Input: A \ Aa, V
map[] ← null
for v ∈ A \ Aa do
    stay ← false
    rv ← M
    if lastTime[v] = |T| then
        A ← A \ v
        next v
    for ν ∈ V do
        t ← τv,CruiseShip,ν
        if ∄ t > 0 then
            next v
        popCruiseShip ← min. population in CruiseShip between lastTime[v] and |T|
        minCap ← the last min. non-zero available capacity in ν between lastTime[v] + t and |T|
        minTime ← the corresponding time period of minCap
        arrival ← max(lastTime[v] + t, minTime)
        if minCap = 0 or arrival > |T| then
            next ν
        if lastTime[v] + t < minTime then
            stay ← true
        carry ← min(µv, popCruiseShip, minCap)
        if rv < carry/arrival then
            rv ← carry/arrival
            map[v] ← (ν, rv, arrival, carry, stay)
v* ← argmin over m ∈ map of m.get(arrival); if there is a tie, then v* ← argmax over m ∈ map of m.get(rv)
(ν*, rv*, arrival*, carry*, stay*) ← map[v*]
t* ← τv*,CruiseShip,ν*
if stay* then
    depart ← arrival* − t*
    YCruiseShip,ν*,v*,depart ← 1
    Xv*,ν*,arrival* ← 1
    PopulationUpdate(CruiseShip, depart, ν*, arrival*, carry*)
    while lastTime[v*] < depart do
        Xv*,CruiseShip,depart ← 1
        depart −= 1
else
    YCruiseShip,ν*,v*,lastTime[v*] ← 1
    Xv*,ν*,arrival* ← 1
    PopulationUpdate(CruiseShip, lastTime[v*], ν*, arrival*, carry*)
lastTime[v*] ← arrival*
lastLoc[v*] ← ν*
SendAsset(v*)

129

Page 141: Large-Scale Optimization Models with Applications in

42 rem← the remaining population in CruiseShip after arrival∗

43 if rem = 0 or A \ Aa = ∅ then44 return false45 else46 return true
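The selection rule in step 24 — earliest arrival first, ties broken by the largest people-per-period ratio r_v — can be expressed compactly. The function name and the tuple layout of the candidate map are illustrative.

```python
def select_asset(candidates):
    """Pick the asset with the smallest arrival time; on a tie,
    prefer the largest ratio r = carry / arrival.
    `candidates` maps asset -> (arrival, r)."""
    return min(candidates,
               key=lambda v: (candidates[v][0], -candidates[v][1]))
```

Negating the ratio in the second key component turns the argmax tie-break into a single lexicographic `min`, which is a common idiom for two-level selection rules like this one.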

Algorithm 11: PlaneAssignment

Input: A_a, V
 1  map[] ← null
 2  for a ∈ A_a do
 3      transit ← false
 4      r_a ← −M
 5      if lastTime[a] = |T| then
 6          A_a ← A_a \ {a}
 7          next a
 8      for ν ∈ V do
 9          t ← 0
10          if ν = lastLoc[a] then
11              minPop ← the last min. non-zero available population in ν between lastTime[a] + t and |T|
12              depart ← the corresponding time period of minPop
13          else if θ_{ν,a} = 1 then
14              t ← τ_{a,lastLoc[a],ν}
15              minPop ← the last min. non-zero available population in ν between lastTime[a] + t and |T|
16              depart ← the corresponding time period of minPop
17              transit ← true
18          else
19              next ν
20          canLand ← AirportCheck(a, ν, depart, t, transit)
21          if depart ≥ |T| or minPop = 0 or not canLand then
22              next ν
23          carry ← min(minPop, ψ_a)
24          if r_a < carry/depart then
25              r_a ← carry/depart
26              map[a] ← (ν, r_a, depart, carry, transit)
27  for a ∈ A_a do
28      if a ∉ map then
29          num ← number of planes located in lastLoc[a] at lastTime[a] + 1
30          if num + 1 ≤ κ_{lastLoc[a]} then
31              X_{a,lastLoc[a],lastTime[a]+1} ← 1
32              lastTime[a] ← lastTime[a] + 1
33          else
34              for ν ∈ V do
35                  t ← lastTime[a] + τ_{a,lastLoc[a],ν}
36                  totalPop ← total population in ν after t
37                  num ← number of planes located in ν at t
38                  if totalPop = 0 and num + 1 ≤ κ_ν then
39                      Y_{lastLoc[a],ν,a,lastTime[a]} ← 1
40                      X_{a,ν,t} ← 1
41                      lastTime[a] ← t
42                      lastLoc[a] ← ν
43                      break
44  a* ← argmin_{m ∈ map} m.get(depart); if there is a tie, then a* ← argmax_{m ∈ map} m.get(r_a)
45  (ν*, r_{a*}, depart*, carry*, transit*) ← map[a*]
46  if transit* then
47      t ← τ_{a*,lastLoc[a*],ν*}
48      leave ← depart* − t
49      Y_{lastLoc[a*],ν*,a*,leave} ← 1
50      X_{a*,ν*,leave} ← 1
51      lastLoc[a*] ← ν*
52      arrival ← depart* + τ_{a*,ν*,Anchorage}
53      PopulationUpdate(ν*, depart*, Anchorage, arrival, carry*)
54      while lastTime[a*] < depart* do
55          X_{a*,ν*,depart*} ← 1
56          depart* ← depart* − 1
57  else
58      arrival ← depart* + τ_{a*,ν*,Anchorage}
59      while lastTime[a*] ≤ depart* do
60          X_{a*,ν*,depart*} ← 1
61          depart* ← depart* − 1
62      PopulationUpdate(ν*, depart*, Anchorage, arrival, carry*)
63  lastTime[a*] ← depart*
64  SendAsset(a*)
65  if A_a = ∅ then
66      return false
67  else
68      return true


Algorithm 12: Finalize

Input: A, A_a
 1  for a ∈ A_a do
 2      while lastTime[a] ≤ |T| do
 3          num ← number of planes located in lastLoc[a] at lastTime[a]
 4          if num + 1 ≤ κ_{lastLoc[a]} then
 5              X_{a,lastLoc[a],lastTime[a]} ← 1
 6              lastTime[a] ← lastTime[a] + 1
 7          else
 8              loc ← a location with enough airport capacity at t = lastTime[a] + τ_{a,lastLoc[a],loc} where plane a can land (i.e., θ_{loc,a} = 1)
 9              Y_{lastLoc[a],loc,a,lastTime[a]} ← 1
10              X_{a,loc,t} ← 1
11              lastTime[a] ← t
12              lastLoc[a] ← loc
13  for v ∈ A \ A_a do
14      while lastTime[v] ≤ |T| do
15          lastTime[v] ← lastTime[v] + 1
16          X_{v,lastLoc[v],lastTime[v]} ← 1

A.2 Method Selection

In this section, we compare the performance of solving the IP directly against warm-starting it with either the OTH or the COBOH. We first conduct our comparison in the baseline experiment. We then create a second setting in which the runways of the airports located in the small villages are assumed to be upgraded, implying that all plane types can land at and take off from those airports. With this latter experiment, our goal is to examine how the solution methods perform when the baseline setup changes; we observed that the decisions become more difficult in this setting.

We present how the two heuristic approaches performed with respect to their initial optimality gaps in Tables A1 and A2, which correspond to the baseline experiment and the experiment with the upgraded runways, respectively. The optimality gaps are calculated as follows. After obtaining a solution vector via a heuristic method, the IP model is warm-started with the transportation variables X_{ait} and Y_{ijat}. Note that even though we could also include the number of evacuees transported (i.e., m_{ijat}) in the solution vector, our preliminary experiments indicated that providing only the former variables yields better initial objective values. After warm-starting the model, the initial objective value and the best bound reported by the solver at the end of the time limit are used as an upper bound (UB) and a lower bound (LB), respectively. The optimality gap is then computed as (UB − LB)/LB.
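As a minimal sketch, the gap computation used throughout the tables can be written as follows; the function name and the convention of returning None for the no-incumbent case (reported as a dash in the tables) are illustrative assumptions.

```python
def optimality_gap(ub, lb):
    """Return the optimality gap (UB - LB) / LB as a percentage.
    `ub` is the incumbent objective value from the warm-started model,
    `lb` is the solver's best bound at the time limit.
    Returns None when the solver found no incumbent solution."""
    if ub is None:
        return None
    return 100.0 * (ub - lb) / lb
```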

Recall that the proposed heuristic approaches focus solely on the transportation decisions and,


Table A1: Comparison of the initial optimality gaps in the baseline experiment

Incident        1      1      1      2      2      2      3      3      3      4      4      4      5      5      5
Num. of People  800    1200   1600   800    1200   1600   800    1200   1600   800    1200   1600   800    1200   1600
OTH             0.54%  0.01%  0.05%  0.22%  1.01%  0.16%  2.53%  3.09%  2.65%  0.01%  0.08%  0.77%  0.40%  0.09%  0.36%
COBOH           0.38%  0.88%  0.65%  12.36% 18.81% 6.45%  52.78% 24.99% 16.57% 23.43% 15.57% 9.89%  0.05%  0.09%  0.33%

Table A2: Comparison of the initial optimality gaps in the experiment with the upgraded runways

Incident        1      1      1      2      2      2      3      3      3      4      4      4      5      5      5
Num. of People  800    1200   1600   800    1200   1600   800    1200   1600   800    1200   1600   800    1200   1600
OTH             0.06%  0.39%  0.20%  0.62%  2.41%  0.66%  0.09%  0.05%  0.16%  3.99%  1.71%  1.71%  4.63%  3.84%  2.10%
COBOH           0.38%  0.95%  1.29%  4.10%  5.72%  7.69%  6.51%  6.64%  8.06%  6.69%  7.13%  7.90%  0.05%  0.09%  0.33%

in essence, the OTH solves the transportation version of the model to optimality. In both experiments, the OTH produces a better initial optimality gap than the COBOH in most of the instances. However, the COBOH outperforms the OTH with respect to the initial optimality gaps in Incident 5 in both experiments. In addition, it produces a better initial gap in Incident 1-800 in the baseline experiment. This can be explained by the observation that if there are large local communities, capacity-wise, around an incident area (e.g., Incident 1 and Incident 5), the COBOH gives a good approximation of the optimal solution. An important observation with respect to these tables is that significant improvements in terms of the objective can be obtained by using optimization-based approaches (i.e., the OTH) as opposed to relying on intuitive approaches when responding to a mass rescue event.

We now focus on the final results obtained with the three methods in the two experiments. Tables A3 and A4 summarize the comparisons in terms of the solution times and the final optimality gaps for the baseline experiment and the experiment with the upgraded runways, respectively. In the baseline experiment, while the number of instances solved to optimality is similar across the three methods, the IP model solved with CPLEX did not return any incumbent solution for Incident 3-1200. As for the experiment with the upgraded runways, while the IP model reaches the optimal solution via CPLEX in six scenarios, the problem is solved to optimality in seven scenarios with the support of the heuristic approaches. However, CPLEX did not produce any solution within an hour for Incident 4-1600, and it returned a poor solution (i.e., an optimality gap of 5.90%) for Incident 4-1200. Overall, we observe that since the larger local communities with higher capacities (i.e., Kotzebue, Nome, and Utqiagvik) are closer to Incidents 1, 2, and 5, it becomes relatively easier to obtain a high-quality solution with CPLEX for those specific incident areas in both experiments.

As for warm-starting the model with the solutions generated by the heuristic approaches,


Table A3: Comparison of the solution methods in the baseline experiment [Time (in mins), Gap (%)]

                            IP             OTH            COBOH
Incident    Num. of People  Time    Gap    Time    Gap    Time    Gap
Incident 1  800             16.43   0.00    9.72   0.00   13.05   0.00
Incident 1  1200            60.00   0.26   51.00   0.00   60.00   0.07
Incident 1  1600            60.00   0.12   60.00   0.03   60.00   0.15
Incident 2  800             12.22   0.00    4.49   0.00   14.41   0.00
Incident 2  1200            13.89   0.00   13.49   0.00    6.42   0.00
Incident 2  1600            51.55   0.00   60.00   0.14   60.00   0.19
Incident 3  800             60.00   1.65   60.00   1.55   60.00   4.00
Incident 3  1200            60.00   —      60.00   0.47   60.00   0.59
Incident 3  1600            60.00   0.89   60.00   1.19   60.00   0.87
Incident 4  800             41.10   0.00    7.62   0.00   13.78   0.00
Incident 4  1200            24.29   0.00    6.61   0.00    7.02   0.00
Incident 4  1600            28.31   0.00    4.82   0.00    8.41   0.00
Incident 5  800              3.69   0.00    3.01   0.00    2.53   0.00
Incident 5  1200             3.57   0.00    3.13   0.00    2.79   0.00
Incident 5  1600             4.46   0.00    3.07   0.00    2.86   0.00

even though the OTH produces a better initial solution in most of the cases (see Tables A1 and A2), we do not observe a consistent difference in overall performance with respect to solution time. That said, the warm-start with the OTH produces better final results in more scenarios than the warm-start with the COBOH (e.g., Incident 1-1200 and Incident 1-1600 in both Tables A3 and A4).

Table A4: Comparison of the solution methods in the experiment with the upgraded runways

                            IP             OTH            COBOH
Incident    Num. of People  Time    Gap    Time    Gap    Time    Gap
Incident 1  800             45.28   0.00   17.78   0.00   26.02   0.00
Incident 1  1200            60.00   0.31   60.00   0.14   60.00   0.24
Incident 1  1600            60.00   0.29   60.00   0.12   60.00   0.27
Incident 2  800              7.60   0.00   11.23   0.00    9.53   0.00
Incident 2  1200            60.00   0.85   60.00   0.05   60.00   0.05
Incident 2  1600            60.00   1.01   60.00   0.19   52.44   0.00
Incident 3  800             39.74   0.00   60.00   0.04   60.39   0.04
Incident 3  1200            60.00   1.19   60.00   0.02   60.00   0.02
Incident 3  1600            60.00   1.54   60.00   0.16   60.00   0.30
Incident 4  800             60.00   0.02    7.20   0.00   33.34   0.00
Incident 4  1200            60.00   5.90   10.35   0.00   60.35   0.02
Incident 4  1600            60.00   —      60.00   0.05   60.39   0.04
Incident 5  800              5.90   0.00    5.71   0.00    3.80   0.00
Incident 5  1200             7.39   0.00    5.12   0.00    4.07   0.00
Incident 5  1600             8.11   0.00    5.77   0.00    5.27   0.00


Bibliography

Adulyasak, Y., Cordeau, J.-F., and Jans, R. (2015). Benders decomposition for production routingunder demand uncertainty. Operations Research, 63(4):851–867.

Afenyo, M., Khan, F., and Ng, A. K. (2020). Assessing the risk of potential oil spills in the Arcticdue to shipping. In Maritime Transport and Regional Sustainability, pages 179–193. Elsevier.

Ahat, B., Ekim, T., and Taskın, Z. C. (2017). Integer programming formulations and Bendersdecomposition for the maximum induced matching problem. INFORMS Journal on Computing,30(1):43–56.

Aiello, G., Hopps, F., Santisi, D., and Venticinque, M. (2020). The employment of unmanned aerialvehicles for analyzing and mitigating disaster risks in industrial sites. IEEE Transactions onEngineering Management, 67(3):519–530.

Akers, S. B., Harel, D., and Krishnamurthy, B. (1994). The star graph: An attractive alternative tothe n-cube. Proceedings of the International Conference on Parallel Processing, pages 393–400.

Akers, S. B. and Krishnamurthy, B. (1989). A group-theoretic model for symmetric interconnectionnetworks. IEEE Transactions on Computers, 38(4):555–566.

Alaska Airlines (2020). Our aircraft. https://www.alaskaair.com/content/travel-info/our-

aircraft/. (Accessed on 11/03/2020).

Alaska Department of Health and Social Services (2018). Trauma system in Alaska. http://dhss

.alaska.gov/dph/Emergency/Pages/trauma/default.aspx. (Accessed on 12/29/2019).

Alaska Department of Revenue (2020). tax.alaska.gov/programs/documentviewer/viewer.aspx?1583r.http://tax.alaska.gov/programs/documentviewer/viewer.aspx?1583r. (Accessed on09/05/2020).

Alibeyg, A., Contreras, I., and Fernandez, E. (2018). Exact solution of hub network design problemswith profits. European Journal of Operational Research, 266(1):57–71.

Alkaabneh, F., Diabat, A., and Elhedhli, S. (2019). A Lagrangian heuristic and GRASP for thehub-and-spoke network system with economies-of-scale and congestion. Transportation ResearchPart C: Emerging Technologies, 102:249–273.

Allison, E. and Mandler, B. (2018). Oil and gas in the U.S. Arctic. https://www.americangeosciences.org/geoscience-currents/oil-and-gas-us-arctic. (Accessed on 01/07/2020).

Arctic Domain Awareness Center (2016). Arctic-related incidents of national significance workshop.https://arcticdomainawarenesscenter.org/Downloads/PDF/Arctic%20IoNS/ADAC Arctic%

20IoNS%202016 Report 160906.pdf. (Accessed on 03/03/2021).

135

Page 147: Large-Scale Optimization Models with Applications in

Arctic Domain Awareness Center (2021). Arctic maritime horizons workshop. https://arcticdo

mainawarenesscenter.org/Events. (Accessed on 03/22/2021).

Ashtiani, M., Salehzadeh-Yazdi, A., Razaghi-Moghadam, Z., Hennig, H., Wolkenhauer, O., Mirzaie,M., and Jafari, M. (2018). A systematic survey of centrality measures for protein-protein interac-tion networks. BMC Systems Biology, 12(1):80.

Aykin, T. (1994). Lagrangian relaxation based approaches to capacitated hub-and-spoke networkdesign problem. European Journal of Operational Research, 79(3):501–523.

Bai, L. and Rubin, P. A. (2009). Combinatorial Benders cuts for the minimum tollbooth problem.Operations Research, 57(6):1510–1522.

Banerjee, A., Chandrasekhar, A. G., Duflo, E., and Jackson, M. O. (2013). The diffusion of micro-finance. Science, 341(6144):1236498.

Bavelas, A. (1948). A mathematical model for group structures. Applied Anthropology, 7(3):16–30.

Bavelas, A. (1950). Communication patterns in task-oriented groups. The Journal of the AcousticalSociety of America, 22(6):725–730.

Behbahani, H., Nazari, S., Kang, M. J., and Litman, T. (2019). A conceptual framework to formulatetransportation network design problem considering social equity criteria. Transportation ResearchPart A: Policy and Practice, 125:171–183.

Benders, J. F. (1962). Partitioning procedures for solving mixed–variables programming problems.Numerische Mathematik, 4(1):238–252.

Bhowmick, S. S. and Seah, B. S. (2015). Clustering and summarizing protein-protein interactionnetworks: A survey. IEEE Transactions on Knowledge and Data Engineering, 28(3):638–658.

Bley, A. and Rezapour, M. (2016). Combinatorial approximation algorithms for buy-at-bulk con-nected facility location problems. Discrete Applied Mathematics, 213:34–46.

Bonacich, P. (1972). Factoring and weighting approaches to status scores and clique identification.Journal of Mathematical Sociology, 2(1):113–120.

Bonacich, P. (1987). Power and centrality: A family of measures. American Journal of Sociology,92(5):1170–1182.

Botton, Q., Fortz, B., Gouveia, L., and Poss, M. (2013). Benders decomposition for the hop-constrained survivable network design problem. INFORMS Journal on Computing, 25(1):13–26.

Brady, C. (2019). Boeing 737 Detailed Technical Data. http://www.b737.org.uk/techspecsdet

ailed.htm. (Accessed on 12/04/2019).

Brooks, A. D. (2020). Search & Rescue in The North Slope Borough. http://www.north-slope.

org/departments/search-rescue. (Accessed on 01/20/2020).

Canca, D., De-Los-Santos, A., Laporte, G., and Mesa, J. A. (2017). An adaptive neighborhoodsearch metaheuristic for the integrated railway rapid transit network design and line planningproblem. Computers & Operations Research, 78:1–14.

Cessna (2019). Cessna Caravan. https://cessna.txtav.com/en/turboprop/caravan. (Accessedon 12/04/2019).

Chankong, V. and Haimes, Y. Y. (2008). Multiobjective decision making: Theory and methodology.Courier Dover Publications.

136

Page 148: Large-Scale Optimization Models with Applications in

Chen, L. and Miller-Hooks, E. (2012). Resilience: an indicator of recovery capability in intermodalfreight transport. Transportation Science, 46(1):109–123.

Chiang, W.-K. and Chen, R.-J. (1998). Topological properties of the (n, k)-star graph. InternationalJournal of Foundations of Computer Science, 9(02):235–248.

Chou, Z.-T., Hsu, C.-C., and Sheu, J.-P. (1996). Bubblesort star graphs: A new interconnectionnetwork. In Proceedings of 1996 International Conference on Parallel and Distributed Systems,pages 41–48. IEEE.

Coast Guard News (2016). Coast guard partners industry conduct mass rescue tabletop exercisein Anchorage Alaska. https://coastguardnews.com/coast-guard-partners-industry-con

duct-mass-rescue-tabletop-exercise-in-anchorage-alaska/2016/04/21/. (Accessed on12/29/2019).

Contreras, I., Cordeau, J.-F., and Laporte, G. (2011). Benders decomposition for large-scale unca-pacitated hub location. Operations Research, 59(6):1477–1490.

Cordeau, J.-F., Furini, F., and Ljubic, I. (2019). Benders decomposition for very large scale partialset covering and maximal covering location problems. European Journal of Operational Research,275(3):882–896.

Crainic, T. G., Hewitt, M., Toulouse, M., and Vu, D. M. (2016). Service network design withresource constraints. Transportation Science, 50(4):1380–1393.

Dalal, J. and Uster, H. (2017). Combining worst case and average case considerations in an integratedemergency response network design problem. Transportation Science, 52(1):171–188.

Dangalchev, C. (2006). Residual closeness in networks. Physica A: Statistical Mechanics and itsApplications, 365(2):556–564.

Day, K. and Tripathi, A. (1992). Arrangement graphs: a class of generalized star graphs. InformationProcessing Letters, 42(5):235–241.

De Corte, A. and Sorensen, K. (2016). An iterated local search algorithm for water distributionnetwork design optimization. Networks, 67(3):187–198.

Division of Homeland Security & Emergency Management (2019). Resource catalog. https://ww

w.ready.alaska.gov/SEOC/ResourceCatalog. (Accessed on 12/04/2019).

Doan, X. V. and Shaw, D. (2019). Resource allocation when planning for simultaneous disasters.European Journal of Operational Research, 274(2):687–709.

Eckstein, M. (2021). New Arctic strategy calls for regular presence as a way to compete with Russia,China . https://news.usni.org/2021/01/05/new-arctic-strategy-calls-for-regular-p

resence-as-a-way-to-compete-with-russia-china. (Accessed on 03/08/2021).

Elbaih, A. H. and Alnasser, S. R. (2020). Teaching approach for START triage in disaster manage-ment. Medicine, 9(4):4.

Elmhadhbi, L., Karray, M.-H., Archimede, B., Otte, J. N., and Smith, B. (2020). A semantics-basedcommon operational command system for multiagency disaster response. IEEE Transactions onEngineering Management, pages 1–15.

Emde, S., Polten, L., and Gendreau, M. (2020). Logic-based Benders decomposition for schedulinga batching machine. Computers & Operations Research, 113:104777.

137

Page 149: Large-Scale Optimization Models with Applications in

Enayaty-Ahangar, F., Rainwater, C. E., and Sharkey, T. C. (2019). A logic-based decompositionapproach for multi-period network interdiction models. Omega, 87:71–85.

Eskandarpour, M., Dejax, P., and Peton, O. (2017). A large neighborhood search heuristic for supplychain network design. Computers & Operations Research, 80:23–37.

Estrada, E. (2006). Virtual identification of essential proteins within the protein interaction networkof yeast. Proteomics, 6(1):35–40.

Estrada, E. and Rodrıguez-Velazquez, J. A. (2005). Subgraph centrality in complex networks. Phys.Rev. E, 71:056103.

Everett, M. G. and Borgatti, S. P. (1999). The centrality of groups and classes. The Journal ofMathematical Sociology, 23(3):181–201.

Everett, M. G. and Borgatti, S. P. (2005). Extending centrality. Models and Methods in SocialNetwork Analysis, 35(1):57–76.

Farahani, R. Z., Lotfi, M., Baghaian, A., Ruiz, R., and Rezapour, S. (2020). Mass casualty man-agement in disaster scene: A systematic review of OR&MS research in humanitarian operations.European Journal of Operational Research, 287(3):787–819.

Fazel-Zarandi, M. M. and Beck, J. C. (2012). Using logic-based Benders decomposition to solvethe capacity-and distance-constrained plant location problem. INFORMS Journal on Computing,24(3):387–398.

Federal Aviation Administration (2019). Alaskan Region Airports Division. https://www.faa.go

v/airports/alaskan/. (Accessed on 03/11/2020).

Fischetti, M., Ljubic, I., and Sinnl, M. (2016). Benders decomposition without separability: Acomputational study for capacitated facility location problems. European Journal of OperationalResearch, 253(3):557–569.

Fischetti, M., Ljubic, I., and Sinnl, M. (2017). Redesigning Benders decomposition for large-scalefacility location. Management Science, 63(7):2146–2162.

Fjørtoft, K. and Berg, T. E. (2020). Handling the preparedness challenges for maritime and offshoreoperations in Arctic waters. In Arctic Marine Sustainability, pages 187–212. Springer.

Fortz, B., Gorgone, E., and Papadimitriou, D. (2017). A Lagrangian heuristic algorithm for thetime-dependent combined network design and routing problem. Networks, 69(1):110–123.

Frank, S. M. and Rebennack, S. (2015). Optimal design of mixed AC-DC distribution systemsfor commercial buildings: A nonconvex generalized Benders Decomposition approach. EuropeanJournal of Operational Research, 242(3):710–729.

Freeman, L. C. (1978). Centrality in social networks conceptual clarification. Social Networks,1(3):215–239.

Friggstad, Z., Rezapour, M., Salavatipour, M. R., and Soto, J. A. (2019). LP-based approximationalgorithms for facility location in buy-at-bulk network design. Algorithmica, 81(3):1075–1095.

Gabrel, V., Knippel, A., and Minoux, M. (1999). Exact solution of multicommodity network opti-mization problems with general step cost functions. Operations Research Letters, 25(1):15–23.

Garrett, R. A., Sharkey, T. C., Grabowski, M., and Wallace, W. A. (2017). Dynamic resourceallocation to support oil spill response planning for energy exploration in the Arctic. EuropeanJournal of Operational Research, 257(1):272–286.

138

Page 150: Large-Scale Optimization Models with Applications in

GCR (2017). AirportIQ 5010. https://www.airportiq5010.com/5010web/. (Accessed on01/31/2020).

Gendron, B. (2019). Revisiting Lagrangian relaxation for network design. Discrete Applied Mathe-matics, 261:203–218.

Geoffrion, A. M. (1972). Generalized Benders decomposition. Journal of Optimization Theory andApplications, 10(4):237–260.

Goemans, M. X., Goldberg, A. V., Plotkin, S., Shmoys, D. B., Tardos, E., and Williamson, D. P.(1994). Improved approximation algorithms for network design problems. In Proceedings of thefifth annual ACM-SIAM symposium on Discrete algorithms, pages 223–232. Society for Industrialand Applied Mathematics.

Govindan, K., Jafarian, A., and Nourbakhsh, V. (2019). Designing a sustainable supply chain net-work integrated with vehicle routing: A comparison of hybrid swarm intelligence metaheuristics.Computers & Operations Research, 110:220–235.

Grimmer, B. (2018). Dual-based approximation algorithms for cut-based network connectivity prob-lems. Algorithmica, 80(10):2849–2873.

Griner, C. (2013). Learjet 31A Rescue Bird in search and rescue. https://www.flickr.com/photos/air traveller/10392962794. (Accessed on 12/04/2019).

Guo, C., Bodur, M., Aleman, D. M., and Urbach, D. R. (2021). Logic-based Benders decompo-sition and binary decision diagram based approaches for stochastic distributed operating roomscheduling. INFORMS Journal on Computing.

Hanlon, T. (2020). ConocoPhillips shuts down North Slope drilling over coronavirus concerns.https://www.alaskapublic.org/2020/04/08/conocophillips-shuts-down-north-slope-dr

illing-over-coronavirus-concerns/. (Accessed on 09/05/2020).

Heleniak, T. (2020). The future of the Arctic populations. Polar Geography, pages 1–17.

Hjort, J., Karjalainen, O., Aalto, J., Westermann, S., Romanovsky, V. E., Nelson, F. E., Etzelmuller,B., and Luoto, M. (2018). Degrading permafrost puts Arctic infrastructure at risk by mid-century.Nature Communications, 9(1):1–9.

Hoag, H. (2016). NOAA is Updating its Arctic Charts to Prevent a Nautical. https://deeply.t

henewhumanitarian.org/arctic/community/2016/08/29/noaa-is-updating-its-arctic-ch

arts-to-prevent-a-nautical-disaster. (Accessed on 03/08/2021).

Holguın-Veras, J., Perez, N., Jaller, M., Van Wassenhove, L. N., and Aros-Vera, F. (2013). Onthe appropriate objective function for post–disaster humanitarian logistics models. Journal ofOperations Management, 31(5):262–280.

Holmberg, K. (1994). On using approximations of the Benders master problem. European Journalof Operational Research, 77(1):111–125.

Hooker, J. N. (2007). Planning and scheduling by logic-based Benders decomposition. OperationsResearch, 55(3):588–602.

Hooker, J. N. and Ottosson, G. (2003). Logic-based Benders decomposition. Mathematical Program-ming, 96(1):33–60.

Hu, M., Cai, W., and Zhao, H. (2019). Simulation of passenger evacuation process in cruise shipsbased on a multi-grid model. Symmetry, 11(9):1166.

139

Page 151: Large-Scale Optimization Models with Applications in

Humpert, M. (2018). Arctic cruise ship runs aground in Canada’s northwest passage. https://www.highnorthnews.com/en/arctic-cruise-ship-runs-aground-canadas-northwest-passage.(Accessed on 12/29/2019).

Humpert, M. (2019). New satellite images reveal extent of Russia’s military and economic build-upin the Arctic. https://www.highnorthnews.com/en/new-satellite-images-reveal-extent-

russias-military-and-economic-build-arctic. (Accessed on 12/04/2019).

Igraph (2020). R igraph manual pages. https://igraph.org/r/doc. (Accessed on 12/07/2020).

Ilinova, A. and Chanysheva, A. (2020). Algorithm for assessing the prospects of offshore oil and gasprojects in the arctic. Energy Reports, 6:504–509.

International Maritime Organization (2016). International code for ships operating in polar waters.http://www.imo.org/en/MediaCentre/HotTopics/polar/Documents/POLAR%20CODE%20TEXT

%20AS%20ADOPTED.pdf. (Accessed on 09/14/2020).

Jalili, M., Salehzadeh-Yazdi, A., Asgari, Y., Arab, S. S., Yaghmaie, M., Ghavamzadeh, A., andAlimoghaddam, K. (2015). Centiserver: A Comprehensive Resource, Web-Based Application andR Package for Centrality Analysis. PLOS ONE, 10(11):1–8.

Jenkins, P. R., Lunday, B. J., and Robbins, M. J. (2019). Robust, multi-objective optimization forthe military medical evacuation location-allocation problem. Omega, page 102088.

Jeong, H., Mason, S. P., Barabasi, A.-L., and Oltvai, Z. N. (2001). Lethality and centrality in proteinnetworks. Nature, 411(6833):41–42.

Joy, M. P., Brock, A., Ingber, D. E., and Huang, S. (2005). High-betweenness proteins in the yeastprotein interaction network. BioMed Research International, 2005(2):96–103.

Kamath, R. S., Fraser, A. G., Dong, Y., Poulin, G., Durbin, R., Gotta, M., Kanapin, A., Le Bot,N., Moreno, S., Sohrmann, M., et al. (2003). Systematic functional analysis of the Caenorhabditiselegans genome using RNAi. Nature, 421(6920):231–237.

Kelman, I. (2020). Arctic humanitarianism for post-disaster settlement and shelter. Disaster Pre-vention and Management: An International Journal, 29(4):471–480.

Keskin, M. E. (2017). A column generation heuristic for optimal wireless sensor network design withmobile sinks. European Journal of Operational Research, 260(1):291–304.

Kleitman, D. J. and Winston, K. J. (1982). On the number of graphs without 4-cycles. DiscreteMathematics, 41(2):167–172.

Kloimullner, C. and Raidl, G. R. (2017). Full-load route planning for balancing bike sharing systemsby logic-based Benders decomposition. Networks, 69(3):270–289.

Leavitt, H. J. (1951). Some effects of certain communication patterns on group performance. TheJournal of Abnormal and Social Psychology, 46(1):38.

Leitner, M., Ljubic, I., Riedler, M., and Ruthmair, M. (2020). Exact approaches for the directednetwork design problem with relays. Omega, 91:102005.

Li, C., Lin, S., and Li, S. (2020a). Structure connectivity and substructure connectivity of stargraphs. Discrete Applied Mathematics, 284:472–480.

Li, Y., Zhang, J., and Yu, G. (2020b). A scenario-based hybrid robust and stochastic approachfor joint planning of relief logistics and casualty distribution considering secondary disasters.Transportation Research Part E: Logistics and Transportation Review, 141:102029.

140

Page 152: Large-Scale Optimization Models with Applications in

Li, Z., Swann, J. L., and Keskinocak, P. (2018). Value of inventory information in allocating alimited supply of influenza vaccine during a pandemic. PLOS One, 13(10):e0206293.

Lin, L., Huang, Y., Hsieh, S.-Y., and Xu, L. (2020). Strong reliability of star graphs interconnectionnetworks. IEEE Transactions on Reliability.

Liu, Y., Cui, N., and Zhang, J. (2019). Integrated temporary facility location and casualty allocationplanning for post-disaster humanitarian medical service. Transportation Research Part E: Logisticsand Transportation Review, 128:1–16.

Mak, L., Farnworth, B., Wissler, E. H., DuCharme, M. B., Uglene, W., Boileau, R., Hackett, P.,and Kuczora, A. (2011). Thermal requirements for surviving a mass rescue incident in the Arctic:Preliminary results. In ASME 2011 30th International Conference on Ocean, Offshore and ArcticEngineering, pages 375–383. American Society of Mechanical Engineers Digital Collection.

McKee, C. H., Heffernan, R. W., Willenbring, B. D., Schwartz, R. B., Liu, J. M., Colella, M. R.,and Lerner, E. B. (2020). Comparing the accuracy of mass casualty triage systems when used inan adult population. Prehospital Emergency Care, 24(4):515–524.

McNutt, C. (2016). Northwest Passage 2016 Exercise, After Action Report. https://www.hsdl.o

rg/?abstract&did=802138. (Accessed on 12/29/2019).

Messner, S. (2020). Future Arctic shipping, black carbon emissions, and climate change. In MaritimeTransport and Regional Sustainability, pages 195–208. Elsevier.

Morgunova, M. (2020). The global energy system through a prism of change: The oil & gas industryand the case of the Arctic. PhD thesis, KTH Royal Institute of Technology.

Nasirian, F., Pajouh, F. M., and Balasundaram, B. (2020). Detecting a most closeness-central cliquein complex networks. European Journal of Operational Research, 283(2):461–475.

National Oceanic & Atmospheric Administration (2021). NOAA surveys the unsurveyed, leadingthe way in the U.S. Arctic. https://nauticalcharts.noaa.gov/updates/noaa-surveys-the

-unsurveyed-leading-the-way-in-the-u-s-arctic/. (Accessed on 03/08/2021).

Neelam, S. and Sood, S. K. (2020). A scientometric review of global research on smart disastermanagement. IEEE Transactions on Engineering Management, 68(1):317–329.

Nguyen, H., Sharkey, T. C., Mitchell, J. E., and Wallace, W. A. (2020). Optimizing the recoveryof disrupted single-sourced multi-echelon assembly supply chain networks. IISE Transactions,52(7):703–720.

Nurre, S. G., Cavdaroglu, B., Mitchell, J. E., Sharkey, T. C., and Wallace, W. A. (2012). Restoringinfrastructure systems: An integrated network design and scheduling (INDS) problem. EuropeanJournal of Operational Research, 223(3):794–806.

Office of Aviation Forces (2019). USCG Fixed Wing & Sensors Division (CG-7113). https://ww

w.dco.uscg.mil/Our-Organization/Assistant-Commandant-for-Capability-CG-7/Off

ice-of-Aviation-Force-CG-711/Fixed-Wing-Sensors-Division-CG-7113/. (Accessed on12/04/2019).

Østhagen, A. (2020). Maritime Tasks and Challenges in the Arctic. In Coast Guards and Ocean Politics in the Arctic, pages 25–32. Springer.

Paraskevopoulos, D. C., Bektas, T., Crainic, T. G., and Potts, C. N. (2016). A cycle-based evolutionary algorithm for the fixed-charge capacitated multi-commodity network design problem. European Journal of Operational Research, 253(2):265–279.

Pavlov, V. (2020). Arctic marine oil spill response methods: Environmental challenges and technological limitations. In Arctic Marine Sustainability, pages 213–248. Springer.

Pérez-Rodríguez, N. and Holguín-Veras, J. (2015). Inventory–allocation distribution models for postdisaster humanitarian logistics with explicit consideration of deprivation costs. Transportation Science, 50(4):1261–1285.

Przybylak, R. and Wyszynski, P. (2020). Air temperature changes in the Arctic in the period 1951–2015 in the light of observational and reanalysis data. Theoretical and Applied Climatology, 139(1-2):75–94.

Rahmaniani, R., Crainic, T. G., Gendreau, M., and Rei, W. (2018). Accelerating the Benders decomposition method: Application to stochastic network design problems. SIAM Journal on Optimization, 28(1):875–903.

Rambha, T., Nozick, L. K., Davidson, R., Yi, W., and Yang, K. (2021). A stochastic optimization model for staged hospital evacuation during hurricanes. Transportation Research Part E: Logistics and Transportation Review, 151:102321.

Ramirez-Nafarrate, A., Araz, O. M., and Fowler, J. W. (2021). Decision assessment algorithms for location and capacity optimization under resource shortages. Decision Sciences, 52(1):142–181.

Rasti, S. and Vogiatzis, C. (2019). A survey of computational methods in protein–protein interaction networks. Annals of Operations Research, 276(1-2):35–87.

Ravi, R., Marathe, M. V., Ravi, S., Rosenkrantz, D. J., and Hunt III, H. B. (2001). Approximation algorithms for degree-constrained minimum-cost network-design problems. Algorithmica, 31(1):58–78.

RavnAir Alaska (2020). The Ravn Aircraft Fleet Specifications. https://www.flyravn.com/about-us/aircraft-fleet/. (Accessed on 12/04/2019).

Rodríguez-Espíndola, O., Alem, D., and Da Silva, L. P. (2020). A shortage risk mitigation model for multi-agency coordination in logistics planning. Computers & Industrial Engineering, 148:106676.

Rogers, D. D., King, M., and Carnahan, H. (2020). Arctic search and rescue: A case study for understanding issues related to training and human factors when working in the north. In Arctic Marine Sustainability, pages 333–344. Springer.

Roshanaei, V., Luong, C., Aleman, D. M., and Urbach, D. (2017). Propagating logic-based Benders' decomposition approaches for distributed operating room scheduling. European Journal of Operational Research, 257(2):439–455.

Ruskin, L. (2018). China seeks bigger role in Arctic. https://www.alaskapublic.org/2018/02/06/china-seeks-bigger-role-in-arctic/. (Accessed on 02/05/2020).

Rysz, M., Pajouh, F. M., and Pasiliao, E. L. (2018). Finding clique clusters with the highest betweenness centrality. European Journal of Operational Research, 271(1):155–164.

Sabouhi, F., Bozorgi-Amiri, A., Moshref-Javadi, M., and Heydari, M. (2019). An integrated routing and scheduling model for evacuation and commodity distribution in large-scale disaster relief operations: a case study. Annals of Operations Research, 283(1):643–677.

Saif, A. and Elhedhli, S. (2016). Cold supply chain design with environmental considerations: A simulation-optimization approach. European Journal of Operational Research, 251(1):274–287.

Samotij, W. (2015). Counting independent sets in graphs. European Journal of Combinatorics, 48:5–18.

Sarma, D., Das, A., Dutta, P., and Bera, U. K. (2020). A cost minimization resource allocation model for disaster relief operations with an information crowdsourcing-based MCDM approach. IEEE Transactions on Engineering Management, pages 1–21.

Schiermeyer, I. (2019). Maximum independent sets near the upper bound. Discrete Applied Mathematics, 266:186–190.

Schofield, C. and Østhagen, A. (2020). A Divided Arctic: Maritime Boundary Agreements and Disputes in the Arctic Ocean. In Handbook on Geopolitics and Security in the Arctic, pages 171–191. Springer.

Sen, S., Barnhart, C., Birge, J., Boyd, A., Fu, M., Hochbaum, D., Morton, D., Nemhauser, G., Nelson, B., Powell, W., et al. (2014). Operations research: A catalyst for engineering grand challenges. Technical report, National Science Foundation.

Setiawan, E., Liu, J., and French, A. (2019). Resource location for relief distribution and victim evacuation after a sudden-onset disaster. IISE Transactions, 51(8):830–846.

Shalina, E. V., Johannessen, O. M., and Sandven, S. (2020). Changes in Arctic Sea Ice Cover in the Twentieth and Twenty-First Centuries. In Sea Ice in the Arctic, pages 93–166. Springer.

Sherali, H. D., Bae, K.-H., and Haouari, M. (2010). Integrated airline schedule design and fleet assignment: Polyhedral analysis and Benders' decomposition approach. INFORMS Journal on Computing, 22(4):500–513.

Sherman, R. (2000). C-17 Globemaster III. https://fas.org/man/dod-101/sys/ac/c-17.htm. (Accessed on 12/04/2019).

Shu, J., Lv, W., and Na, Q. (2021). Humanitarian relief supply network design: Expander graph based approach and a case study of 2013 flood in Northeast China. Transportation Research Part E: Logistics and Transportation Review, 146:102178.

Statista Research Department (2020). Cruise industry statistics & facts. https://www.statista.com/topics/1004/cruise-industry/. (Accessed on 03/08/2021).

Stauffer, J. M. and Kumar, S. (2021). Impact of incorporating returns into pre-disaster deployments for rapid-onset predictable disasters. Production and Operations Management, 30(2):451–474.

SteadieSeifi, M., Dellaert, N., Nuijten, W., and Van Woensel, T. (2017). A metaheuristic for the multimodal network flow problem with product quality preservation and empty repositioning. Transportation Research Part B: Methodological, 106:321–344.

Stepanov, A. and Smith, J. M. (2009). Multi-objective evacuation routing in transportation networks. European Journal of Operational Research, 198(2):435–446.

Stepien, A., Kauppila, L., Kopra, S., Kapyla, J., Lanteigne, M., Mikkola, H., and Nojonen, M. (2020). China's economic presence in the Arctic: Realities, expectations and concerns. In Chinese Policy and Presence in the Arctic, pages 90–136. Brill Nijhoff.

Struzik, E. (2018). In the melting Arctic, a harrowing account from a stranded ship. https://e360.yale.edu/features/in-the-melting-arctic-harrowing-account-from-a-stranded-ship. (Accessed on 03/11/2020).

Sung, I. and Lee, T. (2016). Optimal allocation of emergency medical resources in a mass casualty incident: Patient prioritization by column generation. European Journal of Operational Research, 252(2):623–634.

Szklarczyk, D., Franceschini, A., Wyder, S., Forslund, K., Heller, D., Huerta-Cepas, J., Simonovic, M., Roth, A., Santos, A., Tsafou, K. P., et al. (2015). STRING v10: Protein–protein interaction networks, integrated over the tree of life. Nucleic Acids Research, 43(D1):D447–D452.

Taşkın, Z. C., Smith, J. C., and Romeijn, H. E. (2012). Mixed-integer programming techniques for decomposing IMRT fluence maps using rectangular apertures. Annals of Operations Research, 196(1):799–818.

Triantaphyllou, E. (2000). Multi-criteria decision making methods. In Multi-criteria Decision Making Methods: A Comparative Study, pages 5–21. Springer.

United Nations (2020). Passenger Vessels. London, UK: The International Maritime Organization (IMO). http://www.imo.org/en/OurWork/Safety/Regulations/Pages/PassengerShips.aspx. (Accessed on 09/05/2020).

United States Air Force (2008). HC-130P/N. https://www.106rqw.ang.af.mil/About-Us/Fact-Sheets/Display/Article/1041575/hc-130pn/. (Accessed on 12/04/2019).

U.S. Bureau of the Census (2019). The United States Census 2020. https://www.census.gov/.(Accessed on 12/29/2019).

US EPA (2020). Summary of the oil pollution act: Laws & regulations. https://www.epa.gov/laws-regulations/summary-oil-pollution-act#:~:text=33%20U.S.C.&text=The%20Oil%20Pollution%20Act%20(OPA,or%20unwilling%20to%20do%20so. (Accessed on 09/05/2020).

USCG (2016). Operational Assets. https://www.work.uscg.mil/Assets/. (Accessed on 12/04/2019).

Uster, H., Easwaran, G., Akcali, E., and Cetinkaya, S. (2007). Benders decomposition with alternative multiple cuts for a multi-product closed-loop supply chain network design model. Naval Research Logistics, 54(8):890–907.

Uster, H., Wang, X., and Yates, J. T. (2018). Strategic Evacuation Network Design (SEND) under cost and time considerations. Transportation Research Part B: Methodological, 107:124–145.

Veremyev, A., Prokopyev, O. A., and Pasiliao, E. L. (2017). Finding groups with maximum betweenness centrality. Optimization Methods and Software, 32(2):369–399.

Vogiatzis, C. and Camur, M. C. (2019). Identification of essential proteins using induced stars in protein–protein interaction networks. INFORMS Journal on Computing, 31(4):703–718.

Vogiatzis, C., Veremyev, A., Pasiliao, E. L., and Pardalos, P. M. (2015). An integer programming approach for finding the most and the least central cliques. Optimization Letters, 9(4):615–633.

Waldholz, R. (2016). On the scene with the Crystal Serenity. https://www.ktoo.org/2016/08/17/scene-crystal-serenity/. (Accessed on 03/25/2020).

Wang, J., Peng, W., and Wu, F.-X. (2013). Computational approaches to predicting essential proteins: A survey. PROTEOMICS–Clinical Applications, 7(1-2):181–192.

World Health Organization (2019). Publications on water sanitation and health. https://www.who.int/water_sanitation_health/publications/en/. (Accessed on 12/04/2019).

Wuchty, S. and Stadler, P. F. (2003). Centers of complex networks. Journal of Theoretical Biology, 223(1):45–53.

Ye, Y., Jiao, W., and Yan, H. (2020). Managing relief inventories responding to natural disasters: Gaps between practice and literature. Production and Operations Management, 29(4):807–832.

Yu, L., Yang, H., Miao, L., and Zhang, C. (2019). Rollout algorithms for resource allocation in humanitarian logistics. IISE Transactions, 51(8):887–909.

Zetina, C. A., Contreras, I., and Cordeau, J.-F. (2019). Exact algorithms based on Benders decomposition for multicommodity uncapacitated fixed-charge network design. Computers & Operations Research, 111:311–324.

Zhong, S., Cheng, R., Jiang, Y., Wang, Z., Larsen, A., and Nielsen, O. A. (2020). Risk-averse optimization of disaster relief facility location and vehicle routing under stochastic demand. Transportation Research Part E: Logistics and Transportation Review, 141:102015.
