Procuring Spontaneous Session-level Resource Guarantees for Real-time Applications: An Auction Approach

Madhumitha Harishankar, Sireesha Pilaka∗, Pragya Sharma∗, Nagarjun Srinivasan, Carlee Joe-Wong, and Patrick Tague

Abstract—Real-time multimedia applications such as interactive gaming, live video streaming, and augmented reality have strict latency and bitrate requirements. However, unpredictable network conditions like congestion and link quality can severely degrade the Quality of Experience (QoE). While buffer-based mitigations cannot be applied to real-time applications due to their immediate resource needs, recent innovations in network slicing have demonstrated the feasibility of dedicating specified amounts of network resources to individual sessions in the radio access network. Encouraged by this, we propose to reserve network resources for multimedia sessions in real time according to their declared needs, thereby providing ad hoc session-level performance guarantees. Through WiFi experiments and trace-driven LTE simulations, we show that such session-level resource provisioning is robust to real-time channel fluctuations and congestion externalities over the lifetime of a session. This approach, however, raises challenges: how can the network ensure that users are honest about their resource needs and optimally allocate its limited resources to users, under uncertainty in future sessions' resource needs? We derive a novel Multi-Unit Combinatorial Auction (MUCA) model with a unique structure that can be exploited for fast winner determination and yet incentivize truthful bidding, properties not simultaneously achieved in a generic MUCA but essential to making real-time session guarantees. Further, since dynamic bidding in real time is challenging for end-users who are budget-constrained, we develop a Reinforcement Learning based utility-maximizing strategy to distribute their budget across sessions, and show that it yields high user utility.

Index Terms—Quality of service, 5G mobile communication, Resource management, Economics, Machine Learning, Network Slicing, Auction Theory, Reinforcement Learning, Performance guarantees

∗These authors contributed equally.

Manuscript received October 19 2018; revised March 16 2019; accepted April 21 2019. This work was supported in part by NSF CNS-1751075 and DARPA contract no. HR001117C0048.

M. Harishankar, C. Joe-Wong and P. Tague are with the Department of Electrical and Computer Engineering, Carnegie Mellon University, CA - 94035, USA (e-mail: {mharisha, cjoewong, tague}@andrew.cmu.edu).

S. Pilaka is with Visa Inc., CA - 94404, USA and was affiliated with the Information Networking Institute, Carnegie Mellon University during her involvement in this work (e-mail: [email protected]).

P. Sharma is with Amazon Lab126, CA - 94089, USA and was affiliated with the Department of Electrical and Computer Engineering, Carnegie Mellon University during her involvement in this work (e-mail: [email protected]).

N. Srinivasan is with Mist Systems, CA - 95014, USA and was affiliated with the Department of Electrical and Computer Engineering, Carnegie Mellon University during his involvement in this work (e-mail: [email protected]).

I. INTRODUCTION

Cyber-Physical Systems (CPS) and the Internet of Things (IoT) are emerging paradigms for increasingly pervasive and real-time computing environments. Users are coming to rely implicitly on the availability of services like Amazon Alexa and Google Home for their day-to-day tasks, while ambitious next-gen offerings like HoloLens promise to enable new use cases for real-time augmented reality like telepresence. However, the network connectivity that these ubiquitous computing environments rely on is insufficient [1] even for current applications. Widespread mobile multimedia services such as video conferencing and interactive mobile gaming have specific resource needs to provide an acceptable Quality of Experience (QoE) for the end-user. Without any means of conveying these needs to the network, the network may not meet them [2], [3], leading, for instance, to Google deploying its network Espresso.

In this work, we propose to provide resource guarantees on a per-session basis to QoE-sensitive applications by reconciling available resources with their session needs. To see the benefits of such guarantees, consider a Skype user starting a video conferencing session for a job interview who cannot procure any guarantees for the call quality and performance. Mechanisms for QoS-aware allocation [4], [5] typically do not model resource consumption at session-level timescales, hence allowing demand spikes to potentially interrupt or degrade the call. Xu et al. [1] and others [6] investigate this problem of high variance in cellular resource availability in the context of real-time applications. With buffer-based reactive measures like HTTP DASH infeasible [7], [8], the authors propose a short-term up-ahead estimation of channel conditions to proactively adjust application behavior, thereby reducing perceived delay. Improved channel estimation, however, cannot help sessions that must be preempted altogether due to congestion or spiky traffic, thereby entirely disrupting the call. Hence, channel estimation by itself is insufficient to guarantee call quality or completion. We instead propose to proactively allocate resources to real-time sessions to guarantee high QoE over their entire duration. Proactive resource provisioning has been studied [5], [9] at the time-scale of packets or transmission time intervals (TTIs), using dynamic pricing or auction-based methods to allocate limited resources. However, the resulting allocation at one TTI is entirely independent of the next, limiting their use here. For multimedia applications, while millisecond-level network performance affects user perception, user engagement and ultimately QoE occur at the session level. Therefore, the user's QoE depends on the resource allocation throughout the duration of the session, which is on the order of minutes or hours. Our goal in this work is to proactively provide resource guarantees expressed in terms that a user agent or application can understand and negotiate for, abstracting away the lower-level intricacies of network resource allocation as details left to the network operator. Similar mechanisms have been proposed for the Internet backbone [10] and cloud environments [11], [12], and wireless applications are likely to benefit from them even more due to their best-effort nature. However, this also makes it challenging to provide such guarantees in wireless networks. In fact, offering multiple tiers of service guarantees to wireless users is the goal of the emergent network slicing paradigm in 5G. Critical components of network slicing are still in their infancy [13], including RAN slicing, i.e., designing the wireless radio access network to enforce per-flow performance guarantees, and slice admission and management. This work addresses these research challenges [14] that arise in enabling proactive, session-level resource provisioning for wireless networks.

Feasibility: To proactively provision flows for their anticipated duration, available spectrum resources must be quantified and accurately reconciled with session requirements presumably expressed in terms of bitrate, latency and duration. Is this feasible? Even then, are performance guarantees possible despite uncontrollable wireless influences like fast fading? Recent works in RAN slicing [15], [16] facilitate functional slice isolation and empirically analyze various factors to provide probabilistic performance guarantees in cellular networks. We herein employ an admission control algorithm that allows a flow into the network based on a simple resource forecast and reconciliation model that is generalizable to both scheduled and random-access wireless networks. We conduct extensive WiFi experiments and trace-driven LTE simulations with multimedia applications. In Sections II and III, we find that admitted latency-sensitive as well as bitrate-heavy flows achieve their promised performances and congestion externalities are effectively mitigated. In fact, the network accommodates even more flows by implementing incentivized admission control. Further, since this may be offered as a value-added service that some users may not require, we show reliable guarantees can be made even in the presence of background flows not controlled by our admission algorithm.

Allocation and Incentive Compatibility: Given a forecast of network resource availability and a mechanism to reconcile this with session needs, how should these limited resources be provisioned? The network will likely need to prioritize users with higher resource valuations as it cannot accommodate all session requests. Variation in such valuations could arise from usage context (medium quality for recreational video calls but high for an interview) or device preferences (lower resolution on a smartphone vs. a 4K monitor). Further, allocating resources for the duration of a session is particularly difficult as the operator must account for uncertainty in future needs, and users may strategically misrepresent their needs and valuations. We address these concerns in a novel auction model introduced in Section IV. The operator offers consecutive auctions throughout the day, and users relay their sessions' resource needs dynamically in a combinatorial bid to the current auction; session durations may span multiple auctions. In Section V, we show that the spontaneous and real-time nature of sessions can be exploited to reduce the search space of the intractable optimization problem of determining winning bids, thereby facilitating spontaneous guarantees. In Section VI, we propose multiple ways for the operator to incentivize truthful user declarations even under uncertainty of future bid arrivals and analyze trade-offs in social welfare, incentive compatibility and operator revenue.

Usability: For users to procure and benefit from performance guarantees in this system, they must engage in routine auctions by bidding. However, studies have shown [17] that dynamic pricing is challenging for end users who are budget constrained and averse to making real-time network consumption decisions. We consequently address the user-facing challenges of resource-specification overhead, price discovery and budget constraints. We envision that an automated agent will participate in these session-oriented resource auctions on each user's behalf, placing bids using a parameterized utility model and enforcing the user's daily budget. In Section VII, we formulate the distribution of this budget across bids as a dynamic program solved with model-free reinforcement learning, specifically the Monte Carlo policy iteration algorithm [18]. We show via simulation in Section VIII that these agents maximize user utility for a given budget within a billing cycle (1 month) without any loss in revenue to the network operator.

Overall, we formulate an end-to-end system for realizing session-level performance guarantees, addressing challenges in the radio access network, incentive mechanisms for resource provisioning, and usability. We defer discussion of related work to Section X and conclude in Section XI. Additional experimental results and proofs can be found in our extended report [19].

II. FEASIBILITY OF SESSION-LEVEL PERFORMANCE GUARANTEES OVER LTE

Providing session-level performance guarantees requires an Admission Control (AC) procedure that only admits flows with demands that can be fulfilled for the stated session duration. This reconciliation of available and required resources is then expected to result in admitted flows that are robust to externalities and realize their promised performances. While unpredictable real-time channel fluctuations are inevitable (e.g., fast fading), our premise is that the short timescale of these fluctuations affects session-level QoE less than user competition, which occurs at session-level timescales. We validate this via a proof-of-concept trace-driven simulation of users with different traffic types sharing the resources of an LTE eNodeB. We show session performance can be guaranteed by an AC that 1) accounts for resource competition in flow admission and 2) accommodates unpredictable wireless externalities when provisioning capacity.

(a) Without AC (b) With AC (c) AC with Background Traffic

Fig. 1: (a) Without a resource-aware Admission Control (AC) algorithm, almost half of the LTE network's resources are expended in failed sessions. (b) With the AC in place, this is reduced to ∼5% while preserving high network utilization and (c) most performance guarantees are met even in the presence of uncontrolled background traffic.

Setup: We use SimuLTE and INET¹ to simulate LTE with TCP/UDP/IP. Our simulation includes one eNodeB with 12 resource blocks and a noisy channel. Users are randomly dispersed in the coverage area, yielding variation in channel qualities. Data usage is modeled from the multimedia activity found in mobile traffic traces of 20 users collected over 10 days. Multimedia content from the traces includes video streaming, audio streaming and real-time video conferencing, parameterized by bitrates and latencies of known applications²,³,⁴. Since the AC algorithm compares these required quantities with the available capacity to determine flow feasibility, it must translate granular frequency-time network blocks into bitrate and latency capacity forecasts. While devising an accurate model for this is a challenging task in itself and out of our scope, we presume a naïve model that is sufficient to indicate the general feasibility and benefits of this approach. In essence, the AC procedure maps the 12 resource blocks to a conservative estimate of bitrate capacity (e.g., 10 Mbps), thereby allowing a buffer of radio capacity that may be consumed, for example, by one or more admitted flows with poor signal strength or temporary channel degradation. A flow is admitted after verifying that its required capacity can be accommodated, and the capacity forecast is then reduced by the requested bitrate for the specified duration.
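The admit-then-reduce step described above can be illustrated with a minimal sketch. It is not the simulation code used in the paper: the class name `AdmissionController`, the slot-based horizon, and the example bitrates are all hypothetical, and capacity is tracked as a single bitrate figure per time slot, as in the naïve model.

```python
from dataclasses import dataclass, field


@dataclass
class AdmissionController:
    """Toy session-level admission control over a per-slot bitrate forecast."""
    capacity_mbps: float   # conservative provisionable capacity (e.g., 10 Mbps)
    horizon: int           # number of future time slots tracked (assumed)
    reserved: list = field(default_factory=list)

    def __post_init__(self):
        # reserved[k] = bitrate already promised to admitted flows in slot k
        self.reserved = [0.0] * self.horizon

    def available(self, slot: int) -> float:
        return self.capacity_mbps - self.reserved[slot]

    def request(self, start: int, duration: int, bitrate_mbps: float) -> bool:
        """Admit the flow only if every slot of its session can carry it."""
        if start < 0 or start + duration > self.horizon:
            return False
        slots = range(start, start + duration)
        if any(self.available(k) < bitrate_mbps for k in slots):
            return False
        # Reduce the forecast by the requested bitrate for the whole duration.
        for k in slots:
            self.reserved[k] += bitrate_mbps
        return True


ac = AdmissionController(capacity_mbps=10.0, horizon=60)
print(ac.request(start=0, duration=30, bitrate_mbps=4.5))   # admitted
print(ac.request(start=5, duration=20, bitrate_mbps=4.5))   # admitted
print(ac.request(start=10, duration=10, bitrate_mbps=1.5))  # rejected
print(ac.available(12))                                     # 1.0 Mbps left in slot 12
```

The third request is rejected because slots 10 to 19 have only 1.0 Mbps left after the first two admissions. Choosing `capacity_mbps` below the nominal radio capacity is what provides the buffer against channel degradation.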

Network Performance with AC: We measure link-layer utilization both with and without session-level AC. Figure 1(a) shows that without AC, roughly 47% of resource blocks are allocated to sessions that fail (i.e., the stream halts before completion), mainly from excessive resource competition. However, as in Figure 1(b), the AC algorithm drastically reduces this wastage (to below 5%) while preserving nearly full utilization of available resources. Because the AC allocates a conservative 10 Mbps for flow provisioning to guard against externalities, one would presume that the network is left underutilized. In this case, however, congestion between flows and impact from other externalities were almost entirely eliminated while retaining high utilization. These promised performances were achieved with noise and channel quality variation, indicating that session-level guarantees can be provided in wireless cellular networks without the significant cost of network under-provisioning. Even the naïve model of capacity forecasting used by our AC procedure proved sufficient to serve most admitted flows of different application types. This validates our premise that wireless radio resource modeling and reconciliation with session-oriented resource requirements are feasible and can likely provide performance guarantees.

¹ http://simulte.com/, https://inet.omnetpp.org/

² https://support.skype.com/en/faq/FA1417/how-much-bandwidth-does-skype-need, http://download.skype.com/share/business/guides/skype-connect-requirements-guide.pdf

³ https://help.pandora.com/customer/portal/articles/166391-minimum-specifications-to-run-pandora

⁴ https://support.google.com/youtube/answer/2853702

Since AC-admitted sessions may co-exist with unregulated background sessions (of users that do not require performance guarantees), we reserve some network capacity for this traffic and apply AC to the remaining capacity. We employ MAX C/I scheduling at the MAC layer to prioritize flows with better channel quality [20], and modify the scheduler to further prioritize AC-admitted flows before ranking by channel quality. Figure 1(c) compares the resulting performance of sessions belonging to AC and non-AC traffic. An AC session is deemed successful if it streams for its entire duration at its guaranteed bitrate on average; real-time sessions additionally require a packet inter-arrival time below 40 ms. A non-AC session is successful if it achieves any resolution supported by its traffic type. As in Figure 1(c), the network is highly congested with non-AC traffic, so only one of eleven non-AC flows succeeds, while four of the five admitted AC flows succeed. Figure 1(c) demonstrates that session-level guarantees can be achieved in the presence of non-AC traffic with appropriate reservations in the capacity forecast and MAC prioritization. Our extended report [19] provides additional LTE-based simulation studies.
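The modified ranking rule (AC-admitted flows first, then channel quality) can be sketched as a two-level sort key. This is only an illustration under stated assumptions: the `Flow` fields, the CQI scale, and the greedy per-interval grant of resource blocks are hypothetical and are not taken from the SimuLTE scheduler.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    name: str
    ac_admitted: bool   # True if the flow was admitted by the AC procedure
    cqi: int            # channel quality indicator; higher is better (assumed scale)
    demand_rbs: int     # resource blocks requested in this scheduling interval


def schedule(flows, total_rbs=12):
    """Grant resource blocks to AC-admitted flows first, then by channel quality."""
    # False sorts before True, so AC-admitted flows (not ac_admitted == False) lead.
    order = sorted(flows, key=lambda f: (not f.ac_admitted, -f.cqi))
    grants, remaining = {}, total_rbs
    for f in order:
        grants[f.name] = min(f.demand_rbs, remaining)
        remaining -= grants[f.name]
    return grants


flows = [
    Flow("bg-web", ac_admitted=False, cqi=15, demand_rbs=6),
    Flow("video-call", ac_admitted=True, cqi=7, demand_rbs=5),
    Flow("stream", ac_admitted=True, cqi=11, demand_rbs=4),
]
print(schedule(flows))
```

Here the background flow has the best channel but is served last, receiving only the three blocks left over after both AC-admitted flows are satisfied.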

III. FEASIBILITY OF SESSION-LEVEL PERFORMANCE GUARANTEES OVER WIFI

We now assess whether performance guarantees can be made in a random access medium like WiFi. Unlike LTE, WiFi has a short range and operates in unlicensed spectrum, making the channel more susceptible to interference and externalities. A likely use case for performance guarantees, however, is where multiple users engaged with various apps contend for congested resources of a public WiFi network, for instance, in a cafe-like scenario. Measurement studies [21], [22] have shown extensive growth in public hotspot traffic and Access Point (AP) deployment, with WiFi traffic doubling every two years (∼35% video). These experiments verify the feasibility of session-level guarantees in such scenarios.

Setup: We launch 50 iPerf⁵ clients in parallel across multiple devices to induce channel quality variations. Clients connect to an 802.11g AP operating at 2.4 GHz and launch five sequential sessions over 50 minutes, each comprising a random duration of video streaming, audio streaming or video conferencing (we continue to use resolution rates and corresponding bandwidth requirements that are widely in practice). We further incorporate non-AC web browsing traffic at 50 Kbps, thereby inducing overall activity variance typical of public WiFi.

⁵ https://iperf.fr/

Network performance without AC: As a baseline, we first engage the 50 clients in their planned mobile activity over this hotspot without the AC algorithm. Clients request the highest supported resolutions for their multimedia sessions (e.g., 1.5 Mbps and 4.5 Mbps for video conferencing and streaming, respectively) since they have no incentive to request lower bitrates due to free hotspot access. An aggregate data demand of up to 80 Mbps is seen in Figure 2(a). However, although the 802.11g AP has a theoretical capacity of 54 Mbps, only about half of this is realized by the network, indicating severe performance degradation from congestion. The network also exhibits high latency and jitter, as in Figure 2(b), causing real-time video sessions (requiring latencies of ∼100 ms) to fail or be lag-ridden.

AC Procedure: We now introduce our admission control process. Browsing sessions constitute light-weight traffic and always commence upon launch, modeling the background traffic of regular-access users as in the LTE experiments. The AC algorithm simply uses an estimated bitrate capacity as an abstracted representation of available radio resources and permits a session to start only if the requisite session bandwidth is available. We herein refer to this as Non-Incentivized AC and introduce a corresponding Incentivized version. Since Non-Incentivized AC admits flows (subject to feasibility) on a first-come, first-served basis, users always request high bitrates even when they may be content with lower bitrates (e.g., when using a mobile device with low screen resolution). With Incentivized AC, we presume that an incentive mechanism induces clients to request only their value-maximizing resolutions for multimedia sessions. This incentive mechanism may, for instance, be a payment policy that charges admitted clients to persuade them to state only what they need (developed in Sections IV, V, VI). Using Incentivized AC, users are thus admitted according to their valuation for the appropriate context. We simulate Incentivized AC with clients streaming multimedia sessions at a resolution that is randomly chosen from the supported ones, simulating the distribution of utilities, preferences and budgets in a population.

Performance With AC: While the AP's theoretical capacity is 54 Mbps, this is rarely realized in practice due to the time-varying nature of the wireless channel. Since provisioning based on this capacity may result in poor performance of some flows, the AC procedure is initialized with a capacity of 50 Mbps. Figure 3 depicts the number of clients admitted into the network under AC. A few clients are consistently rejected for the first half hour due to lack of capacity, as aggregate data demand is highest then (Figure 2(a)). However, with Incentivized AC, the entire pool of 50 clients is admitted into the network at all times that they initiate multimedia flows. As clients distribute their requested resolutions in alignment with their true utilities and valuations, aggregate user demand decreases so much that the network now has sufficient capacity to admit all of them (in this case), thereby increasing the utility of the entire set of users. Even when the provisionable capacity is reduced to 40 Mbps and the number of clients is increased to 75, the network admits them all with Incentivized AC, thereby increasing the net social welfare of users.

With AC in place, network throughput increases to almost 50 Mbps, as in Figure 4(a). Due to random access in WiFi, congestion externalities have a severe impact on the network, but they are almost entirely mitigated with AC. When the Incentivized AC is deployed, as in Figure 4(a), only around 30 Mbps is typically required of the network, indicating that incentivizing users for truthfulness may allow the network to serve more users overall. With AC, all real-time sessions stream at mean latencies below 50 ms, as in Figure 4(b), while introducing incentives lowers aggregate demand, which further reduces the mean latency to a maximum of 20 ms. In fact, the CDF of per-packet jitter for real-time sessions across all experiments in Figure 4(c) indicates that while more than 80% of packets exhibit jitter above 40 ms without AC, ∼80% of packets experience jitter below 40 ms with Non-Incentivized AC and below 20 ms (recommended for real-time video conferencing and gaming) with Incentivized AC.

By controlling flow admittance based on bitrate demand, performance guarantees are delivered to latency-sensitive real-time flows as well as other video and audio streams. We thus validate our premise of session-oriented wireless resource provisioning (more results in our extended report [19]) and shift our focus to the design of the incentive mechanism employed by the AC algorithm. Note that given a process for forecasting network resource availability and reconciling it with session demands, our incentive mechanism is agnostic to the Radio Access Technology (RAT) in use, a point we discuss further in Section IX.

IV. MODELING AUCTIONS FOR SESSION-LEVEL RESOURCE GUARANTEES

(a) Network Throughput (b) Network Latency

Fig. 2: (a) Due to high data demand, the network is congested and delivers around 30 Mbps throughput despite 54 Mbps capacity and (b) experiences latency spikes between 250 and 750 ms and high jitter.

Fig. 3: The incentive mechanism induces heterogeneity in requested resolution rates that allows the network to admit all initiated sessions.

(a) Network Throughput under AC (b) Mean latency - Real-time Video (c) CDF of per-second Jitter

Fig. 4: With AC, (a) network throughput increases and real-time video sessions have (b) latencies within 25 to 50 ms and (c) jitter less than 40 ms. Incentivized AC further reduces data demand and improves both jitter and latency.

(Requested Bundle and Received Bundle; axes: Bitrate vs. Time)

Fig. 5: A bundle $B_{it}$ received by a user $i$ for request $R_{it}$ may provide a subset of the requested resources. For example, $R_{it}$ specified $\eta_{it}$ for the duration $\phi_{it}$ but received a bundle providing $\mu_{ik} = \eta' < \eta_{it}$ for $k < a$ and $\mu_{ik} = \eta_{it}$ for $k \ge a$.

We now design the incentive mechanism that determines session admission and cost. We focus on real-time applications that require immediate access, but our model is generalizable to other application types. The network operator discretizes time into a total of $T$ slots per day, and an auction $A_t$ is held in each time slot $t$. All resource requests are assumed to have a maximum duration of $\phi_{\max}$ time slots; a longer session may simply submit another bid for resources after $\phi_{\max}$ slots. For instance, the network may hold an auction each minute (i.e., $T = 1440$), allowing users to procure resource guarantees almost spontaneously as they launch sessions, and set $\phi_{\max} = 20$, so that resources are periodically freed up once every 20 minutes. Similarly, the operator supports a discrete set of resource modes $m_n$ for $n = 1, \ldots, M$, each corresponding to an operating bandwidth, bitrate, or similar. In Section V, we define a generic mechanism for the network to define these modes, which allows for a wide range of supported bitrates by common applications, while significantly reducing the computational overhead of the auction. To characterize the resources being auctioned, the operator computes a forecast $C_t^{(\tau)}$ for auction $A_t$ of the bandwidth resources that will be available in each time slot $t + \tau$, $\tau \in \{0, \ldots, \phi_{\max}\}$, accounting for resources reserved in earlier winning bids.
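As a rough illustration of how such a forecast could be derived from earlier winning bids, the sketch below subtracts reserved bitrates from a fixed total capacity. The function name, the tuple layout of `reservations`, and the numbers are hypothetical; the forecast is indexed here from $\tau = 1$, which is an assumption (shift the range if slot $t$ itself should be included), and the paper's forecast additionally adjusts for wireless channel state, which is not modeled.

```python
def capacity_forecast(total_capacity, reservations, t, phi_max):
    """Return [C_t^(1), ..., C_t^(phi_max)]: capacity left in slots t+1 .. t+phi_max.

    reservations: list of (first_slot, last_slot, bitrate) for earlier winning
    bids, with both slot bounds inclusive (hypothetical representation).
    """
    forecast = []
    for tau in range(1, phi_max + 1):
        slot = t + tau
        used = sum(rate for first, last, rate in reservations
                   if first <= slot <= last)
        forecast.append(total_capacity - used)
    return forecast


# Two earlier winners: 4 Mbps over slots 3..6 and 8 Mbps over slots 5..9.
reservations = [(3, 6, 4), (5, 9, 8)]
print(capacity_forecast(total_capacity=20, reservations=reservations, t=4, phi_max=6))
# [8, 8, 12, 12, 12, 20]
```

With no carried-over reservations the forecast is flat at the total capacity, which is the situation in which every slot is equally available to the current auction.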

The total number of users submitting bids for $A_t$ is denoted by $I_t$. For users to express a desired combination of resources that the network can actually serve, they must know the resources available to bid on. We develop a two-round interaction mechanism for this resource discovery. In the first round of each auction $A_t$, user $i$ expresses a request $R_{it}$ that includes the desired duration $\phi_{it} \le \phi_{\max}$ and the desired resource mode $\eta_{it} \in \{m_1, \ldots, m_M\}$, along with a corresponding valuation $v_{it}$. We assume the requested $\eta_{it}$ is constant over the duration $\phi_{it}$, as real-time applications typically have fairly stable resource needs over time. In response to resource requests $R_{it} = (\eta_{it}, \phi_{it})$ from users $i$, the network operator determines if granting $R_{it}$ is feasible (given projected availability). If not, it generates a set $S_{it}$ of alternate resource bundles (based on forecast capacity and adjusting for underlying wireless channel states), where each bundle $B_{it} \in S_{it}$ enumerates the offered resource $\mu_{ik} \le \eta_{it}$ for time slots $k = t + 1, \ldots, t + \phi_{it}$.

Figure 5 illustrates an example bundle offered in response to a resource request. Given the set $S_{it}$ of available bundles, user $i$ may bid on a bundle $B_{it} \in S_{it}$ by assigning a new bid value $v_{it}$ to it, yielding a bid $b_{it} = (B_{it}, v_{it})$ for the selected bundle $B_{it} \in S_{it}$. Once auction $A_t$ is executed, user $i$ learns the result $x_{it} \in \{0, 1\}$ of the bid and starts consumption if $x_{it} = 1$.

Maximizing social welfare to determine bid winners is desirable since Vickrey-Clarke-Groves (VCG) [23] payments can then be charged to incentivize truthful bidding. The resulting computation, however, is an NP-hard problem, the solution time of which is exponential in bid durations. Thus, in the next section we develop novel reductions to the problem by exploiting the spontaneous nature of winning sessions, i.e., that they begin consumption immediately. With this, the network can implement the VCG mechanism in real time, and stating true product valuations $v_{it}$ becomes the dominant strategy of users. This allows users to avoid complex estimation of other bidders' strategies to maximize their own utility and allows the network operator to discover the distribution of true valuations across bidders, indicating the perceived value of network resources and the potential revenue. However, bidders are multi-parameter agents [24] in this setting; they state not just their valuations but also the desired mode $\eta$ and duration $\phi$. Incentivizing bidders to truthfully report $\phi$ requires modifications to the mechanism that account for temporal correlations between the decisions taken in different auctions. We develop the resulting allocation and payment schemes in Section VI and analyze their auction properties. Proofs can be found in our extended report [19].
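To make the role of VCG payments concrete, here is a minimal sketch of the textbook VCG rule (Clarke pivot payments) on a tiny multi-slot instance. It is not the modified allocation and payment scheme of Section VI: winners are found by exhaustive search, which is only viable for a handful of bids, and the function names, bid layout, and numbers are hypothetical.

```python
from itertools import product


def best_allocation(bids, capacity):
    """Exhaustively maximize declared welfare subject to per-slot capacity.

    bids: list of (valuation, bundle), bundle[k] = resource demanded in slot k.
    capacity: available resource per slot (same length as every bundle).
    """
    best_value, best_x = 0.0, tuple(0 for _ in bids)
    for x in product((0, 1), repeat=len(bids)):
        load = [sum(xi * b[1][k] for xi, b in zip(x, bids))
                for k in range(len(capacity))]
        if all(l <= c for l, c in zip(load, capacity)):
            value = sum(xi * b[0] for xi, b in zip(x, bids))
            if value > best_value:
                best_value, best_x = value, x
    return best_value, best_x


def vcg(bids, capacity):
    """Return (allocation, payments) under the standard VCG rule."""
    welfare, x = best_allocation(bids, capacity)
    payments = []
    for i, (v_i, _) in enumerate(bids):
        if not x[i]:
            payments.append(0.0)
            continue
        others = bids[:i] + bids[i + 1:]
        welfare_without_i, _ = best_allocation(others, capacity)
        # Winner i pays the welfare loss its presence imposes on the others.
        payments.append(welfare_without_i - (welfare - v_i))
    return x, payments


capacity = [10, 10, 10]
bids = [(9.0, [6, 6, 6]), (7.0, [5, 5, 0]), (4.0, [4, 4, 4])]
print(vcg(bids, capacity))   # ((1, 0, 1), [7.0, 0.0, 0.0])
```

In this instance the first bidder displaces the second, so it pays the 7.0 of welfare the others would have gained without it, minus the 4.0 they still obtain; the third bidder displaces no one and pays nothing. Because each payment depends only on the other bidders' declarations, a bidder cannot lower its own price by misreporting its valuation.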

V. WINNER DETERMINATION

Since offering performance guarantees is a service in addition to users' normal mobile data plans, it will likely represent a small portion of overall operator revenue, and the operator may rather wish to maximize users' welfare. Indeed, for "public utility" goods like network resources that are competitively auctioned, Cramton [25] argues for maximizing social welfare rather than network revenue for the sake of long-term user engagement. As such, we study the winner determination problem with the intent of optimizing social welfare and evaluate the achieved revenue via simulation in Section VIII. Thus, in this Multi-Unit Combinatorial Auction (MUCA), the network maximizes the declared user valuations $v_{it}$ in auction $A_t$ subject to the resource capacity constraints over time slots, yielding the optimization problem:

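$$
\begin{aligned}
\max_{\{x_{it} \in \{0,1\}\}} \quad & \sum_{i=1}^{I_t} v_{it} x_{it} \\
\text{s.t.} \quad & \sum_{i=1}^{I_t} \mu_{i(t+\tau)} x_{it} \le C_t^{(\tau)}, \quad \tau = 1, \ldots, \phi_{\max}.
\end{aligned}
\tag{1}
$$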

We recognize this MUCA formulation as the NP-hard multi-dimensional knapsack problem (MKP) [26]. The dimensionality stems from the combinatorial nature of the bids, wherein they span multiple time slots (generalizable to multiple base stations and flows needing uplink/downlink capacity). Solving (1) is thus prohibitive for real-time network use. Many existing algorithms for fast MUCA winner determination [27]–[29] rely on assumptions such as bidder multi-mindedness, sub-modularity and a low number of dimensions in the MKP, which do not apply to our auction model. Using approximation algorithms or other heuristics [30] to find a solution could result in significant loss of network revenue when the number of users or the bid duration increases, especially with frequently repeated auctions. More importantly, an exact solution to the MKP is required to incentivize users to bid truthfully in the auction [31]. We instead exploit the nature of real-time flow demands to reduce the complexity of (1) by considering a series of conditions on bid quantities, durations and resource availability. We show the simplification of (1) for each case, gradually leading up to more realistic and less restrictive conditions. In most of these cases, the MKP reduces to a knapsack problem, which is solvable in pseudo-polynomial time [32].

A. Bundle Generation Policy

Our first task is to define the network operator's policy for generating the set $S_{it}$ of bundles in response to a resource request $R_{it}$. The operator constructs a bundle $B_{it}$ with resources at time $t + \tau$ given by $\mu_{i(t+\tau)} = \min\left(\eta_{it}, C_t^{(\tau)}\right)$ for $\tau = 1, \ldots, \phi_{it}$, corresponding to the highest possible resource level not exceeding the request $\eta_{it}$, based on projected availability. If $\mu_{i(t+\tau)} = 0$ for any $\tau$, no bundle is offered to that user due to severe lack of resource availability. Hence the bundle a user receives comes closest to what the user requested for the specified duration, given capacity constraints. Given this construction of $B_{it}$, a bundle submitted to the auction may have different resource demands at different time slots, while the original request $R_{it}$ does not. Note that if $R_{it}$ is feasible, then $B_{it}$ perfectly satisfies it by the construction above, and gets submitted immediately to the auction $A_t$. Constructed bundles may have the features defined below.

Definition 1 (Upswitch). A bundle $B_{it}$ exhibits an upswitch if $\exists\, \tau \in [1, \phi_{it} - 1]$ s.t. $\mu_{i(t+\tau)} < \mu_{i(t+\tau+1)}$. The number of upswitches in $B_{it}$ is denoted by $U_{B_{it}}$. Further, bundle $B_{it}$ has an $a/b$-upswitch if $\exists\, \tau \in [1, \phi_{it} - 1]$ s.t. $a = \mu_{i(t+\tau)} < \mu_{i(t+\tau+1)} = b$.

Definition 2 (Downswitch). A bundle Bit exhibits a down-switch if ∃ τ ∈ [1, φit − 1] s.t. µi(t+τ) > µi(t+τ+1). Thenumber of downswitches in Bit is denoted by DBit . Further,bundle Bit has an a/b-downswitch if ∃ τ ∈ [1, φit − 1] s.t.a = µi(t+τ) > µi(t+τ+1) = b.
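The bundle generation policy and Definitions 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration, not the operator's implementation: the function names `build_bundle` and `count_switches` are hypothetical, and the capacity list is assumed to hold the projection $C_t^{(\tau)}$ for $\tau = 1, 2, \dots$ in order.

```python
from typing import List, Optional, Tuple


def build_bundle(eta: float, phi: int, capacity: List[float]) -> Optional[List[float]]:
    """Return [mu_(t+1), ..., mu_(t+phi)], or None if some slot has no capacity.

    capacity[tau - 1] is assumed to hold the projected availability C_t^(tau).
    """
    if phi > len(capacity):
        raise ValueError("projection must cover the requested duration")
    # Highest level not exceeding the request, slot by slot.
    bundle = [min(eta, capacity[tau]) for tau in range(phi)]
    if any(mu <= 0 for mu in bundle):
        return None  # no bundle is offered when any slot is fully exhausted
    return bundle


def count_switches(bundle: List[float]) -> Tuple[int, int]:
    """Return (number of upswitches, number of downswitches) in a bundle."""
    pairs = list(zip(bundle, bundle[1:]))
    ups = sum(1 for a, b in pairs if a < b)
    downs = sum(1 for a, b in pairs if a > b)
    return ups, downs


# Hypothetical request: 5 Mbps for 6 slots against a fluctuating projection.
bundle = build_bundle(5, 6, [3, 3, 6, 6, 2, 8])
print(bundle, count_switches(bundle))
```

With these illustrative numbers the bundle follows the projection whenever it falls below the requested 5 Mbps, which produces two upswitches and one downswitch.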

B. Reduction to Tractable Optimization Problems

We first consider a network capacity projection that increases monotonically over time. That is, if the resource availability projection $C_t^{(\tau)}$ for future timesteps $\tau \in [1, \phi_{\max}]$ shows no decline within that time period, then the auction round $A_t$ satisfies this condition and is said to exhibit property $P_1(t)$. This would hold, for instance, in any time slot $t$ where no sessions carry over from previous auctions; the entire network's resource capacity is then equally available at $t$ for all future time slots. If $A_t$ exhibits $P_1(t)$, then there can be no downswitches in any bundles submitted to this auction given our bundle generation policy, i.e., $D_{B_{it}} = 0 \ \forall i$. We now show several simplifications possible to (1) when $P_1(t)$ holds, and also determine conditions under which $P_1(t)$ is guaranteed to hold.

Definition 3 (Uniform quantity bid). A bid $b_{it}$ is a uniform quantity bid on a bundle $B_{it}$ if the corresponding resource levels $\mu_{it}$ are all equal, i.e., if $\forall \tau \in [1, \phi_{it} - 1]$, $\mu_{i(t+\tau)} = \mu_{i(t+\tau+1)}$.

Theorem 1. If $P_1(t)$ holds and each bid to auction $A_t$ is a uniform quantity bid, then the outcome $\{x_{it}\}$ that solves (1) with only the capacity constraint at $\tau = 1$ is the solution to (1), reducing it to a knapsack problem. Further, $P_1(t+1)$ is guaranteed to hold.

While Theorem 1 simplifies the winner determination significantly by reducing the MKP to a single knapsack problem in one time slot, it only applies when all bids are of uniform quantity. Consider the following instance. The network has no active sessions at time $t$ and projects $C_t^{(\tau)} = 10$, $\forall \tau$. At $t$, it admits two flows that consume 3 Mbps for 5 minutes and 5 Mbps for 7 minutes, respectively. At $t+1$, the projected availability is $C_{t+1}^{(\tau)} = 2$ for $\tau = 1, \dots, 5$, $C_{t+1}^{(\tau)} = 5$ for $\tau = 6, 7$, $C_{t+1}^{(\tau)} = 10$ for $\tau = 8, 9$, and so on. A user requesting 4 Mbps for 10 minutes at $t+1$ would thus receive a bundle granting 2 Mbps for the first 4 minutes and 4 Mbps thereafter. This bundle exhibits an upswitch and is therefore not uniform. The network may be able to force the construction of suboptimal uniform bundles if this is a reasonable restriction for some applications or use cases. However, upswitches are likely due to varying availability constraints and data consumption patterns.
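When Theorem 1 applies, winner determination only has to respect the capacity at $\tau = 1$, which is a standard 0/1 knapsack. The sketch below uses the textbook pseudo-polynomial dynamic program; it assumes integer resource quantities, and the function name `knapsack_winners` and the example bids are hypothetical.

```python
from typing import List, Tuple


def knapsack_winners(values: List[float], quantities: List[int],
                     capacity: int) -> Tuple[float, List[int]]:
    """Maximize total reported value subject to the first-slot capacity.

    Returns (optimal welfare, sorted indices of winning bids).
    Runs in O(n * capacity) time, i.e., pseudo-polynomial.
    """
    n = len(values)
    # best[i][c] = max welfare using the first i bids with capacity c
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, q = values[i - 1], quantities[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]
            if q <= c and best[i - 1][c - q] + v > best[i][c]:
                best[i][c] = best[i - 1][c - q] + v
    # Backtrack to recover which bids were admitted.
    winners, c = [], capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            winners.append(i - 1)
            c -= quantities[i - 1]
    return best[n][capacity], sorted(winners)


# Three uniform-quantity bids competing for 6 units in the first slot.
print(knapsack_winners([6, 5, 4], [4, 3, 3], 6))
```

In this example the two smaller bids together beat the single highest-valued bid, so the exact solution differs from a greedy pick by value.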


Lemma 1. If a bundle $B_{it}$ exhibits an $a/b$-upswitch at time $\tau$, then any bundle in $A_t$ corresponding to a request $R_{i't}$ with $\eta_{i't} \ge b$ exhibits an $a/b'$-upswitch with $b' \ge b$ at time $\tau$.

Proof. Given a bundle $B_{it}$ with $a = \mu_{i(t+\tau)} < \mu_{i(t+\tau+1)} = b$, the bundle generation policy implies $C_t^{(\tau)} = a$, $\eta_{it} \ge b$, and $C_t^{(\tau+1)} \ge b$. Hence, for any other bid $b_{i't}$ with $\eta_{i't} \ge b$, we have $\mu_{i'(t+\tau+1)} = \min(\eta_{i't}, C_t^{(\tau+1)}) \ge b$ and, since $\eta_{i't} \ge b > a = C_t^{(\tau)}$, $\mu_{i'(t+\tau)} = a$.

We now derive results that guide the network operator in defining its operating bitrate modes such that the MKP can be simplified even with upswitches. First, if the quantities expressed by the supported modes are equally spaced, we call these evenly dispersed modes.

Definition 4 (Evenly dispersed). A set of modes $\{m_1, \dots, m_M\}$ is evenly dispersed if $\exists\, y, z \in \mathbb{N}$ s.t. $m_n = z(n + y)$, $\forall n \in \{1, \dots, M\}$.

Theorem 2. If the auction modes are evenly dispersed, then the outcome $\{x_{it}\}$ that solves (1) with only the capacity constraint at $\tau = 1$ is the solution to (1), reducing it to a knapsack problem. Further, $P_1(t)$ holds for all $t$ if auction modes are evenly dispersed.

Theorem 2 allows the network operator to support different operating bitrates while solving (1) with a single knapsack. Specifically, if the supported modes are evenly dispersed, e.g., 2, 4 and 6 Mbps, then it is sufficient in every auction to solve (1) in the first time slot. The operator can choose the exact operating modes by examining those required by target applications and its ability to reserve resources. However, certain real-time applications may not lend themselves to this, e.g., Skype has discrete modes with unevenly dispersed bitrate requirements (footnote 6), and the network may therefore offer arbitrary modes to serve these applications. We first note that even in this case, the upswitch count $U_{B_{it}}$ cannot exceed $M - 1$ as long as $P_1(t)$ holds.
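An operator could check whether a candidate mode set satisfies Definition 4 with a short helper such as the one below. This is a sketch under two assumptions: modes are positive integers listed in increasing order, and $\mathbb{N}$ includes 0 (needed for the 2, 4, 6 Mbps example, where $z = 2$ and $y = 0$). The function name `evenly_dispersed` is hypothetical.

```python
from typing import List, Optional, Tuple


def evenly_dispersed(modes: List[int]) -> Optional[Tuple[int, int]]:
    """Return (y, z) such that modes[n-1] == z * (n + y) for n = 1..M, else None."""
    if not modes or any(m <= 0 for m in modes):
        return None
    if len(modes) == 1:
        return (0, modes[0])  # z = m_1, y = 0 is one valid choice
    z = modes[1] - modes[0]   # the spacing must equal z
    if z <= 0 or any(b - a != z for a, b in zip(modes, modes[1:])):
        return None
    if modes[0] % z != 0:     # m_1 = z * (1 + y) requires z to divide m_1
        return None
    return (modes[0] // z - 1, z)


print(evenly_dispersed([2, 4, 6]))   # spacing 2, first mode is a multiple of 2
print(evenly_dispersed([3, 5, 7]))   # equally spaced, but 3 is not a multiple of 2
```

Note that equal spacing alone is not enough under the definition as stated: every mode must also be a multiple of the spacing $z$.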

Theorem 3. If $P_1(t)$ holds, modes are not evenly dispersed, and upswitches occur at time slots $\tau_1, \tau_2, \dots, \tau_k$ (as in Lemma 1), where $k \le M - 1$, then restricting the capacity constraint in (1) to these time slots, along with the first time slot, yields the overall optimal solution. However, $P_1(t+1)$ need not hold.

Theorem 3 shows that even with arbitrarily defined modes, the complexity of the winner determination scales only with the number of modes supported and not the number of time slots, as long as the availability projection at $t$ increases monotonically. However, there is no guarantee that capacity projections will continue to be monotonically increasing for future auctions. We can simplify the MKP without $P_1$ if bids are uniform quantity.

Theorem 4. If $P_1(t)$ does not hold, but bids are of uniform quantity, solving (1) in only the time slots given by Algorithm 1 yields the optimal solution. Further, if all bids are of equal duration, then Algorithm 1 reduces to a single knapsack problem, solved for the time slot with the largest ratio of requested to available capacity (i.e., $\sum_{i: b_{it} \neq \emptyset} \mu_{it} / C_t^{(1)}$).

Algorithm 1 Pruning time slots under uniform quantity bids ($\mu_{it} = \mu_i$) when $P_1(t)$ does not hold. $W$ is the set of bids.

procedure COMPUTECONSTRAINEDTIMESLOTS(W, C, t, φmax)
    slots[0, :] ← [0]
    consSlotInInterval ← 0
    consValue ← 0
    for τ ← 1, ..., φmax do
        sumAllAsks ← 0
        endInterval ← 0
        for i ∈ W do
            sumAllAsks ← sumAllAsks + µ_i
            if φ_i == (t + τ) then
                endInterval ← 1
        currSlotConstraint ← sumAllAsks / C_t^(τ)
        if currSlotConstraint > consValue then
            consValue ← currSlotConstraint
            consSlotInInterval ← τ
        if endInterval == 1 then
            slots = [slots, consSlotInInterval]
            consSlotInInterval ← 0
            consValue ← 0
    return slots

Footnote 6: http://download.skype.com/share/business/guides/skype-connect-requirements-guide.pdf

Algorithm 1 iterates over each time slot $\tau \le t + \phi_{\max}$. If one or more submitted bids are scheduled to finish consumption at $\tau$, it finds the time slot with the largest ratio of requested to available capacity between $\tau$ and the last time slot when a submitted bid ended. These time slots are used to solve (1). Hence, if bids are uniform quantity, the dimensionality of (1) scales with the variance in bid durations, not the number of time slots, leading to relatively fast solutions even for large $\phi_{\max}$.
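A runnable Python rendering of Algorithm 1 is sketched below. It follows the textual description, so at the end of each interval it records the most constrained slot of that interval. Two assumptions are made for illustration: bid durations are given in slots relative to $t$ (so a bid ends at $\tau = \phi_i$), and all projected capacities are positive. The function name `constrained_time_slots` is hypothetical.

```python
from typing import List


def constrained_time_slots(mu: List[float], phi: List[int],
                           capacity: List[float]) -> List[int]:
    """Return the 1-based slots tau whose capacity constraints must be kept.

    mu[i], phi[i]: uniform quantity and duration (in slots) of bid i.
    capacity[tau - 1]: projected availability C_t^(tau), tau = 1..phi_max.
    """
    total_ask = sum(mu)       # as in Algorithm 1, all asks are summed
    end_slots = set(phi)      # slots at which at least one bid finishes
    slots: List[int] = []
    best_slot, best_ratio = 0, 0.0
    for tau in range(1, len(capacity) + 1):
        ratio = total_ask / capacity[tau - 1]
        if ratio > best_ratio:            # most constrained slot so far
            best_ratio, best_slot = ratio, tau
        if tau in end_slots:              # an interval closes here
            slots.append(best_slot)
            best_slot, best_ratio = 0, 0.0
    return slots


print(constrained_time_slots([2, 3], [2, 4], [10, 6, 8, 7]))  # two intervals
print(constrained_time_slots([2, 3], [3, 3], [10, 5, 8]))     # equal durations
```

Within an interval the set of active bids does not change, so comparing the summed asks against each slot's capacity singles out the slot with the least availability. With equal durations there is a single interval and hence a single retained slot, matching the second part of Theorem 4.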

VI. INCENTIVE COMPATIBILITY

We have shown several ways for the network to simplify the winner determination in (1), making it feasible to optimize for social welfare in real time. This allocation objective, in conjunction with carefully designed payment schemes, can induce strong properties. We first consider a single auction $A_t$ in isolation, and induce a myopic notion of truthfulness using the VCG mechanism. We then frame $A_t$ in the context of repeated auctions, where we account for the impact of decisions made in $A_t$ on subsequent auctions. In both cases, we ensure dominant strategy incentive compatibility while inducing desirable properties that are often challenging to achieve simultaneously, such as revenue monotonicity and ex post individual rationality.

A. Myopic Truthfulness

The VCG mechanism has gained wide popularity for its ability to guarantee socially optimal results through dominant strategy incentive compatibility (DSIC); i.e., every bidder's best interest is to bid truthfully, regardless of the strategies of other bidders [23]. Since the network maximizes social welfare, it can implement the VCG mechanism by charging auction winners their social cost. The social cost of each bidder $i$ is computed as the difference between the maximum feasible welfare without $i$ and the welfare to others given $i$'s presence, i.e., $\max_{x_{it} \in [0,1]} \sum_{k=1, k \neq i}^{I_t} v_{kt} x_{kt} - \sum_{k=1, k \neq i}^{I_t} v_{kt} x^*_{kt}$, where $x^*_{kt}$ represents the optimal solution with $i$ present.

When applied in combinatorial auctions, however, the VCG mechanism is known to exhibit undesirable failures in bidder revenue monotonicity [33], meaning the network's revenue from VCG payments may in fact decrease when some bids enter the system. An auction is said to be robust for a set of bidders $\Delta$ under VCG payments $p_{it}$ if

$$\forall j \in \Delta: \quad \sum_{i \in \Delta} p_{it}(v_{it}, \Delta) \ge \sum_{i \in \Delta \setminus \{j\}} p_{it}(v_{it}, \Delta \setminus \{j\}). \tag{2}$$

See Rastegari et al. [33] for a more formal treatment of revenue monotonicity. Another type of VCG failure in combinatorial auctions is goods revenue monotonicity failure [34], when the operator could increase revenue by not auctioning certain goods (in our case, resource quantities and time slots), hence acquiring an incentive to hide goods from bidders. Most prior work on combinatorial auction frameworks does not address the issue of VCG-induced monotonicity failures, which are especially challenging to manage in MUCA settings such as ours. We show, however, that under certain conditions applying VCG payments is guaranteed to result in revenue monotonicity. To do this, we rely on the property of bidder submodularity [33], [35], which builds on the maximum social welfare $V(\Delta)$ of a set of bidders $\Delta$, corresponding to the objective in (1) restricted to $\Delta$. Bidder submodularity holds for bidder sets $\Delta$ and $\Delta'$ with $\Delta \subseteq \Delta'$ if and only if $\forall i$, $V(\Delta \cup \{i\}) - V(\Delta) \ge V(\Delta' \cup \{i\}) - V(\Delta')$.

Theorem 5. If (1) can be solved in a single time slot $t'$ (e.g., as in Thm. 1), and $\mu_{it'} = \mu_{i't'} \ \forall i, i' \in [1, I_t]$, then $A_t$ is guaranteed to be revenue monotonic in bidders under VCG payments.

In this scenario, winner determination is a knapsack problem where bids request the same resources but potentially carry different valuations. All bids then compete equally for capacity, so removing a bid cannot increase another's social cost, resulting in revenue monotonicity.
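The VCG payment rule for this single-slot case can be sketched as follows. Exhaustive search over bid subsets is used purely for clarity (the knapsack dynamic program would replace it in practice), and the function names `best_allocation` and `vcg_payments` as well as the bid values are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple


def best_allocation(values: List[float], quantities: List[int], capacity: int,
                    excluded: FrozenSet[int] = frozenset()) -> Tuple[float, Set[int]]:
    """Welfare-maximizing feasible set of bids, ignoring bidders in `excluded`."""
    bidders = [i for i in range(len(values)) if i not in excluded]
    best_welfare, best_set = 0.0, ()
    for r in range(1, len(bidders) + 1):
        for subset in combinations(bidders, r):
            if sum(quantities[i] for i in subset) <= capacity:
                welfare = sum(values[i] for i in subset)
                if welfare > best_welfare:
                    best_welfare, best_set = welfare, subset
    return best_welfare, set(best_set)


def vcg_payments(values: List[float], quantities: List[int],
                 capacity: int) -> Tuple[Set[int], Dict[int, float]]:
    """Charge each winner the welfare it denies to the other bidders."""
    welfare, winners = best_allocation(values, quantities, capacity)
    payments = {}
    for i in winners:
        without_i, _ = best_allocation(values, quantities, capacity, frozenset({i}))
        others_with_i = welfare - values[i]
        payments[i] = without_i - others_with_i
    return winners, payments


# Equal quantities (2 units each), capacity for two winners.
print(vcg_payments([9, 7, 5, 3], [2, 2, 2, 2], 4))
# Removing the bidder valued at 5 lowers each winner's payment.
print(vcg_payments([9, 7, 3], [2, 2, 2], 4))
```

With equal quantities each winner pays the highest losing value, so total revenue here drops from 10 to 6 when a losing bidder is removed and never rises, which is the behavior Theorem 5 guarantees.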

Lemma 2. In our auction, bidder revenue monotonicity implies goods revenue monotonicity.

In our auction, bidders desire and bid on exactly one bundle, a property referred to as single-mindedness. When single-minded bidders exhibit bidder revenue monotonicity, they are goods revenue monotonic as well [34]. Thus, by exploiting the structure of users' real-time resource requests, we have shown that under reasonable conditions, users have an incentive to bid truthfully and the network operator has no incentive to discourage bids from users or hide resources, as doing so will not increase its revenue. However, we also note the following limitation.

Lemma 3. As long as the auctioneer solves (1) for winner determination of $A_t$ and charges winners their social cost, bidders may have an incentive to submit a false session duration $\phi_{it}$.

Consider a case where $P_1(t)$ holds and all bids in $A_t$ are uniform quantity. The auctioneer then only solves the knapsack problem in the first time slot (Thm. 1), and users' choice of $\phi_{it}$ has no impact on their bid allocations or payments. Indeed, as seen in Section V, even when $P_1(t)$ does not hold and results in arbitrary up/downswitches, the winner determination and hence payments depend only on the time slots of these switches. Hence, maximizing the social welfare at $A_t$ only with respect to $A_t$ does not directly incentivize truthfulness in the declaration of $\phi_{it}$.

B. Truthfulness Amidst Temporal Correlations

In our setting, the true social cost of a bid $b_{it}$ is not only a function of other bids submitted to the current auction (as discussed earlier). Selecting a bid $b_{it}$ as a winner of auction $A_t$ directly reduces available capacity in the next $\phi_{it}$ time slots, which impacts the bids that can be accepted in subsequent auctions $A_j$, $j \in [t+1, t+\phi_{it}]$. To account for this temporal correlation between these periodic auctions, we are faced with the challenge of factoring in the uncertainty in future bids when admitting the present bids. We thus develop mechanisms to push this uncertainty either to the user or the network, inducing different properties accordingly.

The temporal trickle effect of a winning bid may in fact extend beyond its duration; for instance, allocating resources for a bid $b_{it}$ might preclude allocation for a bid $b_{j(t+\phi_{it}-1)}$, which might in turn allow for the allocation of a bid at $t + \phi_{it} + 1$ that would have been infeasible had $j$'s bid been allocated. We argue, however, that it is unreasonable to charge users their social cost beyond the duration $\phi_{it}$ (unlike the treatment by Parkes et al. [36]). First, it is extremely challenging to predict and model the trickle effects starting from allocation of a bid until the last auction in the system, leading to significant computational overhead and possible infeasibility. Second, since mobile network use is dense and diversified, the extended effects of a single bid in the system would arguably be too small to have a significant impact on the overall social welfare and hence not worth accounting for. Hence we propose to hold user $i$ accountable for their "first-order" social cost with respect to arriving bids during $[t, t+\phi_{\max}]$, capturing direct impact during $i$'s consumption and any immediate ripples until $t + \phi_{\max}$.

We now formulate strategies that induce desirable properties despite this temporal correlation amongst auctions and future uncertainty. We first provide definitions of these properties (see [23] for a thorough treatment). Individual Rationality is achieved when no bidder receives a negative utility from participating in the auction, i.e., no winning bid is charged more than its reported value $v_{it}$ and no losing bid is charged. The winner determination is Allocatively Efficient when social welfare is maximized in the allocation outcomes. If the sum of all payments charged by the auctioneer is non-negative, i.e., the auctioneer does not suffer a net loss, then the mechanism is (weakly) Budget Balanced. A property is said to hold ex ante if it holds in expectation over the private and unknown information of all bidders, ex interim if it holds when a bidder knows their private information but others only in expectation, and ex post if it is guaranteed to hold even when all bidder parameters are revealed. We now develop allocation and payment schemes that navigate trade-offs in these properties by factoring in future bid uncertainty differently.

1) Maximize Expected Social Welfare: Let us consider an allocation strategy alternative to (1) to determine winners of auction $A_t$. Let $O_t$ be the set of all feasible allocations of $A_t$. Then

$$o_t^* = \arg\max_{O_t} \sum_{i=1}^{I_t} v_{it}(o_t) + \sum_{j=t+1}^{t+\phi_{\max}} \mathbb{E}^t_{D(j)} \sum_{k=1}^{I_j} [v_{kj}(o_{tj})], \tag{3}$$

where $I_j$ is the number of bidders placing bids in the system at time $j$ and $\mathbb{E}^t_{D(j)}$ denotes the expectation at time $t$ taken over the distribution $D(j)$ of bids at $j$ (consisting of bid arrivals, requested mode in the bid, duration and valuation). With this allocation rule, the network maximizes the expected social welfare of the next $\phi_{\max}$ time steps in deciding the allocation, rather than maximizing only for the welfare of bidders at $A_t$. This is implicit in the dependence between $o_t$ and $o_{tj}$, wherein the latter captures the allocation decision taken at time $t$ for the timestep $j$ in the estimated look-ahead model. Point estimates derived from, for example, Monte Carlo sampling of outcomes starting from $t$ can be used for unknown parameters of future timesteps [37]. The approach in (3) presumes that the network has learned this distribution of bids spanning the next $\phi_{\max}$ time steps. Indeed, computing a deterministic $\phi_{\max}$-step look-ahead model is far more feasible than computing the optimal solution for the multi-stage stochastic programming problem of expected welfare maximization for all remaining auctions [37], [38]. Models requiring computation of the optimal value function at every time step [36] pose severe feasibility challenges. Further, the time period $\phi_{\max}$ intuitively lends itself as a reasonable look-ahead period since all user allocations starting at $t$ must end by then, providing a standard and relatively short time window for computing prices.
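The look-ahead idea behind (3) can be illustrated with a small sample-average sketch. It is a simplification under stated assumptions, not the mechanism's implementation: future demand is represented by a list of sampled scenarios supplied by the caller, future auctions inside a scenario are resolved myopically in time order, arrival offsets lie within the horizon, and subsets are enumerated exhaustively. All names (`feasible_subsets`, `future_welfare`, `expected_welfare_allocation`) are hypothetical.

```python
from itertools import combinations

# A current bid is (value, quantity, duration in slots).
# A sampled future bid is (arrival offset >= 1, value, quantity, duration).


def feasible_subsets(bids, residual, start):
    """Yield (subset, welfare, load) for every bid subset that fits `residual`.

    A bid admitted at offset `start` occupies slots start .. start+duration-1,
    truncated at the look-ahead horizon len(residual).
    """
    horizon = len(residual)
    for r in range(len(bids) + 1):
        for subset in combinations(range(len(bids)), r):
            load = [0] * horizon
            for i in subset:
                _, quantity, duration = bids[i]
                for s in range(start, min(start + duration, horizon)):
                    load[s] += quantity
            if all(l <= c for l, c in zip(load, residual)):
                yield subset, sum(bids[i][0] for i in subset), load


def future_welfare(scenario, residual):
    """Welfare collected from one sampled scenario, auction by auction."""
    residual, total = list(residual), 0.0
    for j in sorted({b[0] for b in scenario}):
        arrivals = [(v, q, d) for (a, v, q, d) in scenario if a == j]
        _, welfare, load = max(feasible_subsets(arrivals, residual, j),
                               key=lambda item: item[1])
        total += welfare
        residual = [c - l for c, l in zip(residual, load)]
    return total


def expected_welfare_allocation(bids, capacity, scenarios):
    """Pick current winners maximizing value now plus average sampled future welfare."""
    best_score, best_subset = float("-inf"), ()
    for subset, welfare, load in feasible_subsets(bids, capacity, 0):
        residual = [c - l for c, l in zip(capacity, load)]
        lookahead = sum(future_welfare(s, residual) for s in scenarios) / len(scenarios)
        if welfare + lookahead > best_score:
            best_score, best_subset = welfare + lookahead, subset
    return set(best_subset), best_score


bids = [(5, 4, 3), (3, 2, 1)]      # a long, large bid and a short, small bid
scenarios = [[(1, 10, 4, 2)]]      # one sample: a valuable bid arrives next slot
print(expected_welfare_allocation(bids, [4, 4, 4], scenarios))
```

In this toy instance a myopic allocation would admit the bid valued at 5 and block the future arrival, whereas the look-ahead rule admits the short bid and leaves room for the arrival valued at 10.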

2) Charge Expected Social Cost: We now design payment rules which operate in conjunction with the allocation rule in (3) to induce desirable auction properties. First, we consider a rule similar to the VCG rule above, wherein a winning bidder $i$ is charged its expected social cost $p_{it}$, the difference between the maximum welfare without $i$ and the welfare to others given $i$'s presence, i.e.,

$$p_{it} = W_t^{-i} - \left( \sum_{l=1, l \neq i}^{I_t} v_{lt}(o_t^*) + \sum_{j=t+1}^{t+\phi_{\max}} \sum_{k=1}^{I_j} \mathbb{E}^t_{D(j)}[v_{kj}(o_{tj}^*)] \right), \tag{4}$$

where $W_t^{-i}$ represents the welfare without $i$ as

$$W_t^{-i} = \max_{O_t} \left( \sum_{l=1, l \neq i}^{I_t} v_{lt}(o_t^{-i}) + \sum_{j=t+1}^{t+\phi_{\max}} \sum_{k=1}^{I_j} \mathbb{E}^t_{D(j)}[v_{kj}(o_{tj}^{-i})] \right), \tag{5}$$

$o_t^{-i}$ represents an allocation outcome at time $t$ without $i$ in the system, and $o_t^*$ is the optimal solution with $i$ present, with look-ahead model decisions $o_{tj}^*$. Note that we assume users have quasi-linear utility functions, as is ubiquitously done [23].

Theorem 6. The mechanism implementing the allocation rule in (3) and the payment rule in (4) is DSIC in all bid parameters. Further, it is ex post individually rational and budget balanced.

By simply maximizing social welfare in expectation over $[t, t+\phi_{\max}]$ and charging winning bids their expected social cost, the network can not only incentivize dominant strategy truthfulness in $\eta_{it}$, $\phi_{it}$ and $v_{it}$, but also ensure that no bidders are charged more than they bid. We now introduce a new property to evaluate this payment scheme.

Definition 5 (Payment Efficient). A mechanism is said to exhibit Payment Efficiency if it charges winners their social cost. For instance, the VCG mechanism is payment efficient, since winning bids are charged the difference in social welfare to others due to their presence.

Lemma 4. The mechanism implementing the allocation rule in (3) and the payment rule in (4) is ex ante allocatively efficient and ex ante payment efficient at $t$ for the time period $[t, t+\phi_{\max}]$.

While this mechanism maximizes expected social welfare, it may well be the case that, in retrospect, the auctioneer under-predicted demand during $[t, t+\phi_{\max}]$, which makes the allocation and payment decisions made earlier suboptimal. The auctioneer's revenue would then be higher if winning bids at $t$ were charged their social cost at $t + \phi_{\max}$ after observing the actual demand.

3) Charge Realized Social Cost: The allocation decision for $A_t$ must be made in real time for immediate session needs. However, we realize that payments need not be computed or charged in real time for users to start their sessions. Consider the payment rule given by the actual after-the-fact difference between welfare without and with $i$ after $t + \phi_{\max}$ time steps have elapsed:

$$p_{it}^{t+\phi_{\max}} = \max_{O_t} \sum_{j=t}^{t+\phi_{\max}} \sum_{k=1, k \neq i}^{I_j} \left( v_{kj}(o^{-i}) - v_{kj}(o^*) \right), \tag{6}$$

where $o^*$ is the optimal allocation of users other than $i$ given by

$$o^* = \arg\max_{O_t^i} \sum_{j=t}^{t+\phi_{\max}} \sum_{k=1, k \neq i}^{I_j} v_{kj}(o), \tag{7}$$

and $O_t^i$ is the set of all feasible allocations given $i$ is allocated. Admitted users are now charged at time $t + \phi_{\max}$, by which time all bids starting consumption at $t$ are guaranteed to end. The auctioneer may now use its retrospective knowledge of bids that came in from $t$ to $t + \phi_{\max}$ to calculate the exact first-order social cost for each bid at $t$. Before analyzing the unique properties that this rule yields, we first define a modified notion of DSIC.

Definition 6 (DSICE). Consider bidders that maximize expected utility. If truthful revelation maximizes the expected utility of bidders, regardless of the strategy of other bidders, then the mechanism is said to be Dominant Strategy Incentive Compatible in Expectation.

Theorem 7. The mechanism implementing the allocation rule in (3) and the payment rule in (6) is DSICE in all bid parameters, ex interim individually rational and ex post budget balanced.

By charging bidders their true social cost at $t + \phi_{\max}$ based on actual bids that arrived after $t$, the mechanism essentially shifts the risk of demand under-prediction to the bidder. Consequently, winning bidders now bear the risk of being charged more than their bid.

Lemma 5. The mechanism implementing the allocation rule in (3) and payment rule in (6) is ex ante allocatively efficient and ex post payment efficient with respect to bids at $t$ for $[t, t+\phi_{\max}]$.

As we allocate using (3), ex ante allocative efficiency holds. By design of (6), winning bids pay their true social cost at $t + \phi_{\max}$ and hence the network is also payment efficient. However, if the auctioneer over-predicts resource demand during $[t+1, t+\phi_{\max}]$, charging users their actual social cost at $t + \phi_{\max}$ yields less revenue than charging them their expected cost at $t$.
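The difference between the payment rules in (4) and (6) can be seen in a small worked example. All numbers below are hypothetical and chosen only for illustration: a single winner $i$ reports $v_{it} = 8$, and each line subtracts the welfare of the other bidders with $i$ admitted from the welfare achievable without $i$, over $[t, t+\phi_{\max}]$.

```latex
% Hypothetical numbers; v_{it} = 8 is the winner's reported value.
\begin{align*}
\text{Expected cost at } t \text{ (rule (4))}:\quad
  & p_{it} = \underbrace{17}_{W_t^{-i}} - \underbrace{10}_{\text{others, } i \text{ present}} = 7 \le v_{it}, \\
\text{Realized cost, demand under-predicted (rule (6))}:\quad
  & p_{it}^{t+\phi_{\max}} = 20 - 11 = 9 > v_{it}, \\
\text{Realized cost, demand over-predicted (rule (6))}:\quad
  & p_{it}^{t+\phi_{\max}} = 13 - 7 = 6 < 7.
\end{align*}
```

In the second line the winner ends up paying more than the bid, which is why only ex interim individual rationality holds under (6). In the third line the network collects less than the expected cost it would have charged at $t$.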

We have proposed three distinct payment mechanisms for our auction model: traditional VCG (Section VI-A), paying the expected social cost, and paying the realized social cost (Section VI-B). Since the latter two mechanisms require knowledge of future bids, the network would likely introduce VCG payments first. By evaluating each $A_t$ in isolation (i.e., using (1) and VCG payments), the network can learn users' true bid valuations and required bitrates. By estimating bid durations from historical usage, the network may use the distribution of future bid parameters it has learned to offer the latter mechanisms. This would likely increase social welfare in the system, since the network now accounts for the impact of allocation decisions across multiple auction rounds as in (3). In choosing between the two payment rules as in Table I, the network must make a design choice. It may either guarantee individual rationality by choosing (4) and assume the risk of under-predicting resource demand, or it may choose (6) and ensure payment efficiency while allowing winners to be charged more than their bids. However, in the latter case, the network now assumes the risk of over-predicting resource demand.

VII. USABILITY CONSTRAINTS

In Sections V and VI, we developed practical allocation and payment strategies for our auctions that achieve spontaneous resource guarantees for real-time applications. We now turn to challenges faced by end users of this system. Most data plans in the US provide known and fixed up-front pricing for the month [39], so engaging spontaneously in auctions may add uncomfortable expense uncertainty for the average mobile user. Further, explicitly conveying an app's resource needs and bid parameters every time the user desires guaranteed data access can be a significant deterrent. To address these usability issues, we propose to have automated agents on users' devices that act on their behalf, abstracting them away from resource specification and bidding overhead. We first formulate a user-parameterized utility framework with which agents can transparently discover user valuations for resource guarantees of specific sessions. We then propose a reinforcement learning strategy to enforce users' daily budgets.

A. Bundle Utility

We determine user $i$'s valuation of a resource request $R_{it} = (\eta_{it}, \phi_{it})$. Let $\psi_{it}$ denote the utility per unit of time consumption for the mode $\eta_{it}$ requested, normalized to $[0, 1]$ across applications and pre-configured by the user. We use an $\alpha$-fair model [40] to capture diminishing returns in utility over longer session durations $\phi_{it}$. The user's utility associated with request $R_{it}$ is thus given by $U_{it}(R_{it}) = \frac{\phi_{it}^{1-\alpha}}{1-\alpha} \psi_{it}$. If the network responds with an alternate bundle $B_{it}$ that returns a mode below that requested in $R_{it}$, we impose a penalty to represent the dissatisfaction in receiving a lower mode. We model this penalty for each affected time slot $t + \tau$ as a multiplicative factor $\xi_{i\tau}$, such that a higher penalty corresponds to a smaller mode. We denote $\xi_{i\tau} = 1/(1 + \psi_{it} - \psi^*_{i(t+\tau)})$, where $\psi^*_{i(t+\tau)}$ denotes user $i$'s valuation of the mode corresponding to the offered $\mu_{i(t+\tau)}$. Since users may be especially dissatisfied if their session experiences a downswitch, we further apply a downswitch penalty that grows with the magnitude $(a - b)$ of an $a/b$-downswitch. We model this penalty for each $\tau$ as a multiplicative factor $\zeta_{i\tau} = \rho_i / (\mu_{i(t+\tau)} - \mu_{i(t+\tau+1)})$ when $\mu_{i(t+\tau)} > \mu_{i(t+\tau+1)}$, where $\rho_i \in [0, 1]$ is user-specific. The utility of $B_{it}$ relative to that of $R_{it}$ is then given by

$$U_i(B_{it}|R_{it}) = \frac{\phi_{it}^{1-\alpha}}{1-\alpha} \sum_{\tau=1}^{\phi_{it}} \xi_{i\tau} \zeta_{i\tau} \psi^*_{i(t+\tau)}. \tag{8}$$
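Equation (8) can be evaluated directly once the per-mode utilities are known. The sketch below is illustrative only: the function name `bundle_utility`, the mode-to-utility table `psi`, and the example values are hypothetical, and it assumes $\zeta_{i\tau} = 1$ in slots without a downswitch (including the final slot) and $\alpha \neq 1$.

```python
from typing import Dict, List


def bundle_utility(bundle: List[int], eta: int, psi: Dict[int, float],
                   alpha: float, rho: float) -> float:
    """Utility of an offered bundle relative to the request, following (8).

    bundle: offered level per slot; eta: requested mode;
    psi: user-configured utility per unit time of each mode, in [0, 1].
    """
    phi = len(bundle)
    psi_requested = psi[eta]
    total = 0.0
    for tau, mu in enumerate(bundle):
        psi_offered = psi[mu]
        xi = 1.0 / (1.0 + psi_requested - psi_offered)   # lower-mode penalty
        zeta = 1.0                                       # assumed when no downswitch
        if tau + 1 < phi and mu > bundle[tau + 1]:
            zeta = rho / (mu - bundle[tau + 1])          # downswitch penalty
        total += xi * zeta * psi_offered
    return phi ** (1.0 - alpha) / (1.0 - alpha) * total


psi = {2: 0.4, 4: 0.8}   # hypothetical per-mode utilities
print(bundle_utility([4, 4, 2, 2], eta=4, psi=psi, alpha=0.5, rho=0.5))
print(bundle_utility([4, 4, 4, 4], eta=4, psi=psi, alpha=0.5, rho=0.5))
```

The first bundle is penalized twice: once in the slot where the 4/2-downswitch occurs and again in each slot served at the lower mode.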

B. Budget Constraints

Building on (8), we develop an algorithm for the user agent to satisfy a daily budget constraint while placing bids that are proportional to the user's true utility $U_i$. If agents distribute budgets poorly (as some of the naive algorithms demonstrated in Section VIII do), users consistently lose in their auctions of interest, hence forming the false impression that the market rate is prohibitively high and exiting the system. The budget distribution problem relies on a policy to select a valuation $v_{it}$ to declare on a given bundle $B_{it}$ (interchangeably $R_{it}$) that maximizes the user's total expected future utility, subject to the budget constraint. Without loss of generality, we collapse the distinction between $B_{it}$ and $R_{it}$ by considering a virtual round where the network offers $B_{it} = R_{it}$ if $R_{it}$ is perfectly available. To discover the optimal policy, we model the user environment as a Markov Decision Process (MDP) wherein actions correspond to placing bids, and the user receives a reward equal to the utility $U_i(B_{it}|R_{it})$ if the bid wins and zero otherwise. As seen in Figure 6, the state of user $i$ in time slot $t$ is defined as $\sigma_{it} = (t, \beta_{it}, B_{it}, R_{it})$, where $\beta_{it}$ is the remaining budget at time $t$ (with $\beta_{i1}$ as the total daily budget). The overall system state determines the probability of winning the bid $P_{win}(\sigma_{it}, v_{it})$, which also represents the state transition probabilities. The optimal budget distribution policy is then given by:

$$\max_{b_{it}} \; U_i(B_{it}|R_{it}) P_{win}(\sigma_{it}, v_{it}) + \sum_{k=t+1}^{T} U_i(B_{ik}|R_{ik}) P_{win}(\sigma_{ik}, v_{ik}) \quad \text{s.t.} \quad \sum_{k=t}^{T} p_{ik} P_{win}(\sigma_{ik}, v_{ik}) \le \beta_{it}. \tag{9}$$

TABLE I: We summarize trade-offs between charging expected social cost at $t$ and realized social cost at $t + \phi_{\max}$.

| Payment Scheme       | DSIC           | Payment Efficiency | Individual Rationality | Allocative Efficiency | Budget Balance |
|----------------------|----------------|--------------------|------------------------|-----------------------|----------------|
| Expected Social Cost | Yes            | Ex Ante            | Ex Post                | Ex Ante               | Ex Post        |
| Realized Social Cost | In Expectation | Ex Post            | Ex Interim             | Ex Ante               | Ex Post        |

[Figure 6: diagram with the labeled stages "User initiates request", "Bid $v_{it}$" (the action), "Auction Execution", "Receives bundle", "Charge $p_{it}$", "Reward = Utility", "Per-Bid State Transition", "Policy Evaluation", "Policy Improvement" and "End of Day/Episode".]

Fig. 6: The MCPI agent bids based on the current policy. Rewards from the states encountered and the actions taken during the day are used to update the policy at the end of the day.

However, users cannot solve (9) as the environment is only Partially Observable (yielding a POMDP); they only observe their own actions and rewards and therefore cannot compute the transition probabilities $P_{win}(\sigma_{it}, v_{it})$. We hence employ a model-free reinforcement learning mechanism to determine the optimal user actions under uncertainty. The offline and episodic Monte Carlo Policy Iteration (MCPI) technique [18] is particularly suitable here as users typically exhibit periodicity in daily mobile activities and resource needs, allowing us to consider a day as an episode. MCPI seeks the optimal bidding policy $\pi^*_{it} = v^*_{it}(\sigma_{it})$ by iteratively evaluating a candidate policy $\pi$ and updating the action-value function $q_\pi(\sigma_{it}, v_{it})$ from episodes sampled from the POMDP. We define $q_\pi(\sigma_{it}, v_{it})$ as the return obtained by placing bid $v_{it}$ in state $\sigma_{it}$ and then following policy $\pi$, averaged over all future states and actions. The return $G_{it} = \sum_{j=t}^{T} \lambda^{j} U_i(B_{ij}|R_{ij})$ for a given series of states is the total future discounted reward, where $\lambda$ is a discount factor representing how much present value a user assigns to future rewards. This captures a degree of uncertainty about the future that stems from the environment as well as the user's estimate of their future session desires. At the end of each episode (e.g., day), the action-value function is updated using

$$q_\pi(\sigma_{it}, v_{it}) \leftarrow q_\pi(\sigma_{it}, v_{it}) + \chi \left( G_{it} - q_\pi(\sigma_{it}, v_{it}) \right), \tag{10}$$

where $\chi$ is the learning rate. We follow the well-known $\varepsilon$-greedy approach [18] to balance the trade-off between exploring the environment further to know it better (i.e., choose $v_{it}$ randomly with probability $\varepsilon$) and exploiting current knowledge of the environment to maximize current returns (i.e., choose $v_{it}$ to maximize $q_\pi$). A bid $v_{it}$ cannot exceed $\beta_{it}$, the current available budget. We anneal $\varepsilon$ to eventually always exploit after the environment has been explored sufficiently, which yields the optimal policy if the environment is periodic [18]. We set $\varepsilon = 1/N(s)$, where $N(s)$ is the number of times state $s$ is visited, resulting in continuous reduction of exploration from a state as it is visited further and guaranteeing that $\pi_{it}$ approaches the optimal $v^*_{it}(\sigma_{it})$ as $N(s) \to \infty$. In Section VIII, we show that this exploration leads to convergence in relatively few iterations and study its performance under increasing complexity.
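The learning loop around (10) can be sketched as a small tabular agent. This is an illustrative outline rather than the evaluated implementation: the class name `MCPIAgent`, its method names, and the discrete set of candidate bids are hypothetical. The return is accumulated backwards with discounting relative to the current step ($\lambda^{j-t}$), the usual Monte Carlo convention, which is an assumption that differs from the exponent as printed above.

```python
import random
from collections import defaultdict


class MCPIAgent:
    """Tabular Monte Carlo control with epsilon = 1 / N(s) exploration."""

    def __init__(self, chi: float = 0.1, lam: float = 0.9, seed: int = 0):
        self.q = defaultdict(float)     # q[(state, bid)], action-value estimates
        self.visits = defaultdict(int)  # N(s), visit counts per state
        self.chi, self.lam = chi, lam
        self.rng = random.Random(seed)

    def choose_bid(self, state, candidate_bids, budget):
        """Pick a bid not exceeding the remaining budget, or None if impossible."""
        affordable = [b for b in candidate_bids if b <= budget]
        if not affordable:
            return None
        self.visits[state] += 1
        epsilon = 1.0 / self.visits[state]
        if self.rng.random() < epsilon:                              # explore
            return self.rng.choice(affordable)
        return max(affordable, key=lambda b: self.q[(state, b)])     # exploit

    def end_of_episode(self, episode):
        """Apply update (10) for each (state, bid, reward) of the finished day."""
        g = 0.0
        for state, bid, reward in reversed(episode):
            g = reward + self.lam * g            # discounted return from this step
            key = (state, bid)
            self.q[key] += self.chi * (g - self.q[key])


agent = MCPIAgent(chi=0.5, lam=0.5)
agent.end_of_episode([("morning", 1, 0.0), ("evening", 2, 4.0)])
print(dict(agent.q))
```

A state here would encode $\sigma_{it}$ (time, remaining budget, bundle and request) in some hashable form, and the reward is the bundle utility when the bid wins and zero otherwise.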

VIII. EVALUATION OF BUDGET DISTRIBUTION STRATEGY

Setup: We simulate a network of 100 users that participate in auctions for performance guarantees over the course of 80 days and have predetermined schedules of resource requests $R_{it}$ to place during the day. To allow all users to have a reasonable chance of acquiring at least some of their desired resources, we set uniform daily budgets as $\beta_{i1} = \beta$. Note that the MCPI strategy can only help each user achieve the maximum utilities for their given budget and is of little help if market rates are prohibitively high (see Section IX).

Baseline strategies: To assess the performance of MCPI-based budget distribution, we introduce two alternative strategies to compare against. Bidders of the greedy bidding strategy bid as much of their remaining budget as needed to achieve the desired utility in each auction as $v_{it} = \min(U_i(B_{it}|R_{it}), \beta_{it})$. Bidders of the rationed bidding strategy spread their daily budgets evenly over their (known) daily resource requests, setting $v_{it}$ to the minimum of the bid utility $U_i(B_{it}|R_{it})$ and the session budget. Any residual session budget rolls over to the next session. The maximum realizable daily utility subject to a user's daily budget is

$$U_{i,\max} = \max_{\{y_{it} \in \{0,1\}\}} \sum_{t=1}^{T} U_i(B_{it}|R_{it}) y_{it} \quad \text{s.t.} \quad \sum_{t=1}^{T} \theta_{it} y_{it} \le \beta_{i1}, \tag{11}$$

where $\theta_{it}$ is the critical price of the auction $A_t$, i.e., the social cost of admitting the bid in this allocation. Since our auctions incentivize truthfulness, $\theta_{it}$ is also the payment charged to user $i$ if their bid wins. Note that users themselves cannot compute $U_{i,\max}$ due to partial observability. We use this value as the theoretical maximum against which to evaluate the MCPI-generated utility.
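The two baselines and the benchmark in (11) can be compared on a toy day as sketched below. The sketch makes simplifying assumptions that are not part of the simulation setup: critical prices $\theta_{it}$ are given as fixed inputs, a bid wins exactly when it is at least the critical price, and a winner pays that price. The function names `run_day` and `max_daily_utility` and the numbers are hypothetical.

```python
from itertools import product
from typing import List


def run_day(utilities: List[float], prices: List[float], budget: float,
            strategy: str) -> float:
    """Total utility won in one day under the 'greedy' or 'rationed' strategy."""
    remaining, rollover, total = budget, 0.0, 0.0
    share = budget / len(utilities)          # even split used by rationed bidders
    for u, theta in zip(utilities, prices):
        if strategy == "greedy":
            limit = remaining
        else:
            limit = share + rollover         # session budget plus unspent residue
        bid = min(u, limit)
        spent = 0.0
        if bid >= theta:                     # assumed win rule: pay critical price
            spent = theta
            total += u
        remaining -= spent
        rollover = limit - spent
    return total


def max_daily_utility(utilities: List[float], prices: List[float],
                      budget: float) -> float:
    """Brute-force solution of (11) over all win/lose patterns y."""
    best = 0.0
    for y in product((0, 1), repeat=len(utilities)):
        cost = sum(p for p, take in zip(prices, y) if take)
        if cost <= budget:
            best = max(best, sum(u for u, take in zip(utilities, y) if take))
    return best


utilities, prices, budget = [5, 2, 6], [4, 1, 5], 6
u_max = max_daily_utility(utilities, prices, budget)
for name in ("greedy", "rationed"):
    print(name, run_day(utilities, prices, budget, name) / u_max)
```

Dividing each strategy's daily utility by $U_{i,\max}$ gives the fraction of the achievable maximum, which is the metric reported in the evaluation that follows.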

Performance of MCPI-based bidding: We first study a deterministic setting where we ensure resources are available for every auction, meaning every request elicits a viable bundle. Four scenarios are simulated, with the first two using the greedy and rationed strategies for all users, respectively. In these cases, we measure the fraction of $U_{i,\max}$ that bidders achieve at the end of each day and show the mean and standard deviation across bidders in Figure 7(a). In the third and fourth scenarios, we introduce one bidder using MCPI amongst the greedy and rationed users, respectively, then measure the MCPI bidder’s $U_i$. As Figure 7(a) shows, the MCPI bidder succeeds at exploring various actions and reaching 100% of $U_{i,\max}$ by day 40 (even sooner against the rationed bidders), while neither of the naïve strategies achieves more than 40%. In the presence of MCPI bidders, greedy and rationed bidders continue to have poor mean performance but marginally higher deviation until the MCPI agent converges (not shown). Hence, a bidder that previously realized no more than 40% of the maximum utility achievable with their budget (potentially believing that the market rate is prohibitively high) now wins more by bidding per the MCPI-based budget distribution algorithm.

To increase realism, we next consider a congested setting wherein some resource requests may not elicit any bundles. This is done by increasing session durations, which also increases the likelihood of a resource request being turned away due to ongoing consumption by previously admitted flows. Hence, in addition to budget constraints, MCPI learning must implicitly account for resource availability. For instance, if resources are typically unavailable at 6:00PM, then the optimal strategy might be to distribute the budget to other times of day, since the agent will likely not get a chance to express a bid for resource needs at 6:00PM. Figure 7(b) shows the resulting increase in the time for the strategy to stabilize. The MCPI bidder still outperforms the naïve bidders and converges, but only 85% of $U_{i,\max}$ is reached by day 30. In this case, exploration has a ripple effect on returned bundles in subsequent auctions, which affects budget changes and slows convergence.
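To illustrate why a learning bidder can beat a spend-early strategy, the following minimal sketch runs an epsilon-greedy Monte Carlo control loop on a toy three-auction day. It is not the MCPI formulation of Section VII: the prices, values, budget, state encoding, payment rule, and all function names (`run_day`, `train`, `spend_first`, `day_utility`) are assumptions chosen for illustration, and `spend_first` is only a hypothetical stand-in for a naïve bidder.

```python
import random
from collections import defaultdict

# Hypothetical toy day (illustrative values, not taken from the paper).
PRICES = [2, 1, 1]   # critical price a bid must meet to win each auction
VALUES = [3, 2, 2]   # utility the user gains from each won session
BUDGET = 2           # daily budget; leftover budget is worth nothing

def run_day(policy, epsilon, rng):
    """Play one day and return the visited (state, bid) pairs and rewards."""
    budget, steps, rewards = BUDGET, [], []
    for t, (price, value) in enumerate(zip(PRICES, VALUES)):
        state = (t, budget)
        bids = list(range(budget + 1))        # whole-unit bids up to the remaining budget
        explore = rng.random() < epsilon
        bid = rng.choice(bids) if explore else policy(state, bids)
        won = bid >= price
        if won:
            budget -= price                   # assumed: a winner pays the critical price
        steps.append((state, bid))
        rewards.append(value if won else 0)
    return steps, rewards

def train(episodes=20000, epsilon=0.1, seed=0):
    """Epsilon-greedy Monte Carlo control; returns the learned greedy policy."""
    rng = random.Random(seed)
    q = defaultdict(float)                    # mean return per (state, bid)
    visits = defaultdict(int)

    def best_known(state, bids):
        return max(bids, key=lambda b: q[(state, b)])

    for _ in range(episodes):
        steps, rewards = run_day(best_known, epsilon, rng)
        ret = 0
        for key, reward in zip(reversed(steps), reversed(rewards)):
            ret += reward                     # undiscounted return from this step onward
            visits[key] += 1
            q[key] += (ret - q[key]) / visits[key]
    return best_known

def spend_first(state, bids):
    """Naive stand-in: bid the whole remaining budget at every opportunity."""
    return bids[-1]

def day_utility(policy):
    """Utility of one day played without exploration."""
    _, rewards = run_day(policy, epsilon=0.0, rng=random.Random(0))
    return sum(rewards)
```

Under these assumptions, `day_utility(spend_first)` is 3 because the whole budget goes to the first auction, whereas the learned policy is expected to skip that auction and win the two cheaper ones, which is the kind of budget redistribution the paragraph above describes.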

We increase uncertainty in the environment by offsetting resource request schedules by a random time period $\omega_{it}$ for a fraction $f$ of the user population. Hence, a user’s resource request times are no longer perfectly periodic. Resource availability as well as critical prices during $[t, t + \phi_{it} + \omega_{it}]$ are affected by this variance, making the MCPI learning more challenging. However, as Figure 7(c) shows, the MCPI bidder’s utility does not decrease as a function of $\omega_{\max}$ or $f$, even as these vary from 1-3 hours and 1-100% respectively, indicating that MCPI is beneficial to deploy in realistic network scenarios. We also study the impact of MCPI bidding on network revenue. Figure 7(d) shows the network revenue increasing marginally with the number of MCPI bidders (never decreasing). This is a direct effect of combining our budget distribution strategy with an auction payment and allocation scheme that incentivizes truthfulness. Users, in their best interest, request resources only when needed and have no incentive to misrepresent their valuation (and place no value on budget left over at the end of the day). MCPI bidding then serves only to spend the budget in ways that simultaneously best represent users’ utilities and the chances of winning. More results from our simulation study are provided in our extended report [19].
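The schedule perturbation described above can be sketched as follows. This is a minimal illustration under stated assumptions: offsets are drawn uniformly from $[0, \omega_{\max}]$ and independently per request (the text does not specify the distribution), request times are expressed in hours, and `perturb_schedules` is a hypothetical helper name.

```python
import random

def perturb_schedules(schedules, f, omega_max, seed=0):
    """Offset the request times of a randomly chosen fraction f of users.

    schedules: dict mapping a user id to a list of request times (hours).
    Returns the perturbed schedules and the set of affected users.
    """
    rng = random.Random(seed)
    users = sorted(schedules)
    affected = set(rng.sample(users, round(f * len(users))))
    perturbed = {}
    for user, times in schedules.items():
        if user in affected:
            # Assumed: one independent uniform offset per request.
            perturbed[user] = [t + rng.uniform(0, omega_max) for t in times]
        else:
            perturbed[user] = list(times)     # unaffected users stay periodic
    return perturbed, affected
```

For example, `perturb_schedules({"u1": [9.0, 18.0], "u2": [12.0]}, f=0.5, omega_max=3)` delays the requests of one of the two users by at most three hours and leaves the other user's schedule unchanged.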

IX. DISCUSSION

We now discuss practical considerations around deploying our system.

Network Requirements: Our choice of WiFi and LTE for feasibility experiments is motivated by the widespread proliferation of these RATs. The auction model can, in fact, be executed on any topology where resource availability can be forecast and reconciled with session needs. Hence, emerging RATs like mmWave (5G) and other WiFi versions are candidate network topologies. The resource modeling mechanism, however, depends on the RAT in use. For instance, in a WiFi network using the MAC-layer point coordination function instead of the distributed coordination function, the access point has more centralized control of flows and therefore may better eliminate wireless externalities. This would likely be factored into the forecast model. In addition, the underlying network must support real-time flow control, e.g., with software-defined networking, to ensure flows do not consume more than their stated bandwidth.
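One common way to keep a flow within its stated bandwidth is a token-bucket policer. The sketch below is an illustration of that general technique, not the flow-control mechanism used in our experiments; the class name `TokenBucket`, its parameters, and the choice to reject non-conforming packets (rather than queue them) are assumptions.

```python
class TokenBucket:
    """Admit packets only while the flow stays within its declared rate."""

    def __init__(self, rate_bps, burst_bits):
        self.rate_bps = rate_bps      # declared (purchased) bandwidth
        self.capacity = burst_bits    # largest burst tolerated
        self.tokens = burst_bits      # bucket starts full
        self.last = 0.0               # time of the previous check, in seconds

    def allow(self, packet_bits, now):
        """Return True if a packet of packet_bits may be sent at time now."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never above the burst size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_bps)
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False                  # packet exceeds the stated bandwidth
```

A controller could keep one such bucket per admitted session, sized from the bundle the session won, and drop or delay packets for which `allow` returns `False`.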

Discovering Session Needs: The app-specific resource needs of a session are best determined by the app itself. Hence, wrapper libraries for network access protocols such as TCP/IP can be used by apps to state the required resources (e.g., bitrate or latency) as they open a new socket connection. The user agent may then be a background process that receives this information. User-specific factors such as intended session duration or daily budget can be explicitly set by the user or estimated by the agent based on historical user activity.
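A wrapper of this kind might look like the following sketch. All names here (`SessionNeeds`, `UserAgent`, `open_session_socket`) are hypothetical, and the user agent is modeled as an in-process object for brevity; a deployed agent would more likely be a separate background process reached over inter-process communication.

```python
import socket
from dataclasses import dataclass

@dataclass(frozen=True)
class SessionNeeds:
    """Resources an app declares when it opens a connection (assumed fields)."""
    bitrate_kbps: int
    max_latency_ms: int

class UserAgent:
    """Stand-in for the background process that collects declared needs."""

    def __init__(self):
        self.requests = []

    def register(self, app_name, needs):
        # A real agent would turn this into a resource request and a bid.
        self.requests.append((app_name, needs))

def open_session_socket(agent, app_name, needs):
    """Declare the session's needs to the agent, then create a TCP socket."""
    agent.register(app_name, needs)
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM)
```

An app would call `open_session_socket(agent, "video-call", SessionNeeds(1500, 100))` in place of creating the socket directly, so the declaration reaches the agent before any traffic is sent.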

Diversity in Budgets: In Section VI, we propose multiple methods for the network to maximize social welfare and incentivize truthfulness in users’ resource and valuation specifications. This ensures a certain notion of competitive fairness [25]. Since it is not in users’ best interest to lie about their needs, only users who truly have the highest value for resources win allocations. However, since this value is expressed in monetary terms, their budgets play a limiting role in their winning chances. In this case, the RL strategy formulated in Section VII will be especially valuable to help more constrained users place bids in auctions with lower critical prices and achieve the best utility possible given their affordability constraints.

X. RELATED WORK

Auctions in wireless networks have been mainly studied in three contexts: spectrum license allocation, secondary cognitive radio allocation, and QoS-aware resource allocation. Auctions for long-term spectrum licenses are held over hours or days with multiple bidding rounds before the auction ends and winners are determined, like the popular simultaneous ascending and combinatorial clock auctions [25], [41]. They do not account for the faster time scales of session-level allocations for spontaneous application sessions. Further, the combinatorial nature of our auctions presents significant challenges in the context of this prior work. Cognitive radio auctions [42] do not consider session-level app performance, instead availing opportunistic spectrum for much shorter time scales. Auctions have also been employed for QoS-aware real-time channel allocation to primary users in mobile networks. The goals of such approaches [5], [9] differ from ours in their focus on sub-carrier allocation with millisecond granularity and interference mitigation. Using auctions for short-term resource allocation does not guarantee session-level performance, which introduces new combinatorial characteristics that we address.

[Figure 7, four panels. (a) User utility from bidding: % realized utility vs. day (0-80) for an MCPI bidder with greedy bidders, an MCPI bidder with rationed bidders, greedy bidders, and rationed bidders. (b) Bundle variance impact: % realized utility vs. day (0-80) for an MCPI bidder and rationed bidders. (c) Schedule variance impact: % utility realized vs. maximum variance (1 hr, 2 hr, 3 hr) for 1%, 50%, and 100% of the population. (d) Revenue impact: network revenue ($) vs. number of MCPI bidders (0, 1, 2, 5, 10).]

Fig. 7: (a) MCPI far outperforms naïve strategies, achieving 100% of the maximum utility. (b) MCPI realizes 85% utility despite high uncertainty in resource availability. (c) Temporal variance in resource requests does not significantly degrade performance. (d) MCPI bidders increase revenue by driving up critical prices in auctions.

Our work furthers 5G’s envisioned network slicing capabilities [13], [14]. We verify the premise of RAN slicing for both LTE and WiFi networks, recently also studied by Foukas et al. [15] and others [16], [43] in the cellular context, and provide incentive-compatible mechanisms for modeling slices and admitting users to them. We also address budget optimization in the context of repeated auctions, which has been studied only in limited settings, few of them combinatorial. Gummadi et al. [44] study budget-constrained bidding for sponsored search auctions, but with strong assumptions about the system that guarantee equilibrium. Janssen et al. [45] study the combinatorial setting, but their work is limited to the combinatorial clock auction. Almost no work has considered whether reinforcement learning can inform auction bidding strategies as we do here.

XI. CONCLUSION

We design an auction model that captures the market for session-level performance guarantees, decoupling the user-facing auction from network-facing wireless resource management. Our model moves away from radio resource auctions and focuses on application-level provisioning, which is especially useful for emerging real-time multimedia applications. Through trace-driven LTE simulations and extensive WiFi experiments, we verify that not only can wireless externalities be minimized with resource-aware admission control of flows, but more flows can be accommodated by implementing incentive-compatible auction-based admission control. We further analyze the winner determination of our proposed auction model with regard to real-time multimedia applications and show that there are several realistic conditions that render the multi-dimensional knapsack problem solvable in pseudo-polynomial time. These reductions make it feasible to implement the incentive-compatible VCG mechanism and even lead to revenue monotonicity in certain cases. We also analyze the impact of temporal correlations between auctions on incentive compatibility and define novel payment and allocation schemes to handle future bid uncertainty and navigate trade-offs in desirable properties. Finally, we use the Monte Carlo Policy Iteration technique to show that even budget-constrained users can achieve high utility from these auctions.

XII. ACKNOWLEDGEMENTS

We thank Sriram V. Iyer (Head - Innovation Labs, Myntra-Jabong-Flipkart) and Phani Ramamurthy for their valuable contributions to our experimental setup and insights.

REFERENCES

[1] Q. Xu, S. Mehrotra, Z. Mao, and J. Li, “Proteus: Network Performance Forecast for Real-time, Interactive Mobile Applications,” in Proceeding of the 11th Annual International Conference on Mobile Systems, Applications, and Services, ser. MobiSys ’13. Taipei, Taiwan: ACM, 2013, pp. 347–360.

[2] M. Jarschel, D. Schlosser, S. Scheuring, and T. Hoßfeld, “Gaming in the clouds: QoE and the users perspective,” Mathematical and Computer Modelling, vol. 57, no. 11-12, pp. 2883–2894, 2013.

[3] S. Barakovic and L. Skorin-Kapov, “Survey and Challenges of QoE Management Issues in Wireless Networks,” Journal of Computer Networks and Communications, vol. 2013, 2013.

[4] W. Miao, G. Min, Y. Jiang, X. Jin, and H. Wang, “QoS-aware resource allocation for LTE-A systems with carrier aggregation,” in 2014 IEEE Wireless Communications and Networking Conference (WCNC), April 2014, pp. 1403–1408.

[5] M. S. ElBamby and K. M. F. Elsayed, “An auction approach to resource allocation with interference coordination in LTE-A systems,” in 2014 IEEE Wireless Communications and Networking Conference (WCNC), Istanbul, Turkey, April 2014, pp. 1885–1890.

[6] X. K. Zou, J. Erman, V. Gopalakrishnan, E. Halepovic, R. Jana, X. Jin, J. Rexford, and R. K. Sinha, “Can accurate predictions improve video streaming in cellular networks?” in Proceedings of the 16th International Workshop on Mobile Computing Systems and Applications. ACM, 2015, pp. 57–62.

[7] T. Stockhammer, “Dynamic Adaptive Streaming over HTTP –: Standards and Design Principles,” in Proceedings of the Second Annual ACM Conference on Multimedia Systems, ser. MMSys ’11. San Jose, CA, USA: ACM, 2011, pp. 133–144.

[8] M. Xiao, C. Zhou, V. Swaminathan, Y. Liu, and S. Chen, “Bas-360: Exploring spatial and temporal adaptability in 360-degree videos over http/2.” IEEE, 2018.

[9] K. Yang, N. Prasad, and X. Wang, “An Auction Approach to Resource Allocation in Uplink OFDMA Systems,” IEEE Transactions on Signal Processing, vol. 57, no. 11, pp. 4482–4496, Nov 2009.

[10] T. Wolf, J. Griffioen, K. L. Calvert, R. Dutta, G. N. Rouskas, I. Baldin, and A. Nagurney, “ChoiceNet: Toward an Economy Plane for the Internet,” ACM SIGCOMM Comput. Commun. Rev., vol. 44, no. 3, pp. 58–65, Jul 2014.

[11] Y. Peng, D.-K. Kang, F. Al-Hazemi, and C.-H. Youn, “Energy and QoS Aware Resource Allocation for Heterogeneous Sustainable Cloud Datacenters,” Opt. Switch. Netw., vol. 23, no. P3, pp. 225–240, Jan 2017.

[12] A. Karamoozian, A. Hafid, M. Boushaba, and M. Afzali, “QoS-aware resource allocation for mobile media services in cloud environment,” in 2016 13th IEEE Annual Consumer Communications & Networking Conference (CCNC), Jan 2016, pp. 732–737.


[13] X. Foukas, G. Patounas, A. Elmokashfi, and M. K. Marina, “Network Slicing in 5G: Survey and Challenges,” IEEE Communications Magazine, vol. 55, no. 5, pp. 94–100, May 2017.

[14] M. Harishankar, P. Tague, and C. Joe-Wong, “Network Slicing as an Ad-Hoc Service: Opportunities and Challenges in Enabling User-Driven Resource Management in 5G,” in Proceedings of 1st International Workshop on Trustworthy and Real-time Edge Computing for Cyber-Physical Systems (TREC4CPS). Nashville, TN, USA: Institute for Software Integrated Systems, Vanderbilt University, Dec 2018.

[15] X. Foukas, M. K. Marina, and K. Kontovasilis, “Orion: RAN Slicing for a Flexible and Cost-Effective Multi-Service Mobile Network Architecture,” in Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking, ser. MobiCom ’17. Snowbird, Utah, USA: ACM, 2017, pp. 127–140.

[16] C. Marquez, M. Gramaglia, M. Fiore, A. Banchs, and X. Costa-Perez, “How Should I Slice My Network?: A Multi-Service Empirical Evaluation of Resource Sharing Efficiency,” in Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, ser. MobiCom ’18. New Delhi, India: ACM, 2018, pp. 191–206.

[17] S. Ha, S. Sen, C. Joe-Wong, Y. Im, and M. Chiang, “Tube: Time-dependent pricing for mobile data,” in Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, ser. SIGCOMM ’12. Helsinki, Finland: ACM, 2012, pp. 247–258.

[18] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT press, 2018.

[19] M. Harishankar, S. Pilaka, P. Sharma, N. Srinivasan, C. Joe-Wong, and P. Tague, “Procuring Spontaneous Session-level Resource Guarantees for Real-time Applications: An Auction Approach,” tech. rep., 2018. [Online]. Available: https://www.dropbox.com/s/coib0cjvjgw9e2j/JSACExtendedPaper.pdf.

[20] F. Capozzi, G. Piro, L. A. Grieco, G. Boggia, and P. Camarda, “Downlink Packet Scheduling in LTE Cellular Networks: Key Design Issues and a Survey,” IEEE Communications Surveys & Tutorials, vol. 15, no. 2, pp. 678–700, Second Quarter 2013.

[21] M. Afanasyev, T. Chen, G. M. Voelker, and A. C. Snoeren, “Usage Patterns in an Urban WiFi Network,” IEEE/ACM Transactions on Networking, vol. 18, no. 5, pp. 1359–1372, Oct 2010.

[22] K. Fukuda, H. Asai, and K. Nagami, “Tracking the Evolution and Diversity in Network Usage of Smartphones,” in Proceedings of the 2015 Internet Measurement Conference, ser. IMC ’15. Tokyo, Japan: ACM, 2015, pp. 253–266.

[23] V. Krishna, Auction Theory. Academic press, 2009.

[24] S. Chawla, J. D. Hartline, D. L. Malec, and B. Sivan, “Multi-parameter Mechanism Design and Sequential Posted Pricing,” in Proceedings of the Forty-second ACM Symposium on Theory of Computing, ser. STOC ’10. Cambridge, Massachusetts, USA: ACM, 2010, pp. 311–320.

[25] P. Cramton, “Spectrum Auction Design,” Review of Industrial Organization, vol. 42, no. 2, pp. 161–190, 2013.

[26] A. Freville, “The multidimensional 0-1 knapsack problem: An overview,” European Journal of Operational Research, vol. 155, no. 1, pp. 1–21, 2004.

[27] P. Krysta, O. Telelis, and C. Ventre, “Mechanisms for Multi-unit Combinatorial Auctions with a Few Distinct Goods,” in Proceedings of the 2013 International Conference on Autonomous Agents and Multi-agent Systems, ser. AAMAS ’13. St. Paul, MN, USA: International Foundation for Autonomous Agents and Multiagent Systems, 2013, pp. 691–698.

[28] T. Kelly, “Generalized Knapsack Solvers for Multi-unit Combinatorial Auctions: Analysis and Application to Computational Resource Allocation,” in Proceedings of the 6th AAMAS International Conference on Agent-Mediated Electronic Commerce: Theories for and Engineering of Distributed Mechanisms and Systems, ser. AAMAS ’04. New York, NY: Springer-Verlag, 2005, pp. 73–86.

[29] A. Giovannucci, J. A. Rodriguez-Aguilar, J. Cerquides, and U. Endriss, “Winner Determination for Mixed Multi-unit Combinatorial Auctions via Petri Nets,” in Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems, ser. AAMAS ’07. Honolulu, Hawaii: ACM, 2007, pp. 104:1–104:8.

[30] M. J. Varnamkhasti, “Overview of the algorithms for solving the multidimensional knapsack problems,” Advanced Studies in Biology, vol. 4, no. 1, pp. 37–47, 2012.

[31] V. M. Coppinger, V. Smith, and J. A. Titus, “Incentives and Behavior in English, Dutch and Sealed-Bid Auctions,” Economic Inquiry, vol. 18, no. 1, pp. 1–22, 1980.

[32] D. Pisinger, “Where Are the Hard Knapsack Problems?” Comput. Oper. Res., vol. 32, no. 9, pp. 2271–2284, Sep 2005.

[33] B. Rastegari, A. Condon, and K. Leyton-Brown, “Revenue Monotonicity in Combinatorial Auctions,” SIGecom Exch., vol. 7, no. 1, pp. 45–47, Dec 2007.

[34] N. Muto and S. Yasuhiro, “Goods Revenue Monotonicity in Combinatorial Auctions,” Graduate School of Economics, Hitotsubashi University, Discussion Papers 2013-13, Oct 2013.

[35] L. M. Ausubel and P. R. Milgrom, “Ascending auctions with package bidding,” Advances in Theoretical Economics, vol. 1, no. 1, 2002.

[36] D. C. Parkes and S. Singh, “An MDP-based Approach to Online Mechanism Design,” in Proceedings of the 16th International Conference on Neural Information Processing Systems, ser. NIPS ’03. Whistler, British Columbia, Canada: MIT Press, 2003, pp. 791–798.

[37] W. B. Powell, “Clearing the Jungle of Stochastic Optimization,” in Bridging data and decisions. Informs, 2014, pp. 109–137.

[38] M. Romer and T. Mellouli, “Future Demand Uncertainty In Personnel Scheduling: Investigating Deterministic Lookahead Policies Using Optimization And Simulation,” in ECMS, Regensburg, Germany, 2016, pp. 502–507.

[39] L. Zheng, C. Joe-Wong, M. Andrews, and M. Chiang, “Optimizing data plans: Usage dynamics in mobile data networks,” in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. Atlanta, GA, USA: IEEE, 2018, pp. 2474–2482.

[40] L. Zheng, C. Joe-Wong, J. Chen, C. G. Brinton, C. W. Tan, and M. Chiang, “Economic viability of a virtual ISP,” in IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, Atlanta, GA, USA, May 2017, pp. 1–9.

[41] P. Cramton, “Simultaneous Ascending Auctions,” Wiley Encyclopedia of Operations Research and Management Science, 2010.

[42] Y. Zhang, D. Niyato, P. Wang, and E. Hossain, “Auction-based Resource Allocation in Cognitive Radio Systems,” IEEE Communications Magazine, vol. 50, no. 11, pp. 108–120, November 2012.

[43] V. Petrov, M. A. Lema, M. Gapeyenko, K. Antonakoglou, D. Moltchanov, F. Sardis, A. Samuylov, S. Andreev, Y. Koucheryavy, and M. Dohler, “Achieving End-to-End Reliability of Mission-Critical Traffic in Softwarized 5G Networks,” IEEE Journal on Selected Areas in Communications, vol. 36, no. 3, pp. 485–501, March 2018.

[44] R. Gummadi, P. Key, and A. Proutiere, “Repeated Auctions under Budget Constraints: Optimal Bidding Strategies and Equilibria,” in the Eighth Ad Auction Workshop, 2012.

[45] M. Janssen, V. Karamychev, and B. Kasberger, “Budget Constraints in Combinatorial Clock Auctions.”

Madhumitha Harishankar is a PhD Candidate at the Department of Electrical and Computer Engineering, Carnegie Mellon University, Silicon Valley. She is advised by Prof. Patrick Tague as a member of the MEWS research group. She received her B.S. in Electrical and Computer Engineering at Rutgers University (summa cum laude, James Slade Scholars, Honors program) and was a software engineer at Barclays Investment Bank and then at Amazon Web Services, before joining CMU. Her research interests include wireless networking and network economics, specifically in applying pricing and learning techniques for better resource allocation in data networks. She received an NSF GRFP Honorable Mention in 2017, as well as a CIT Dean’s 1-year fellowship in 2016. Her more recent research interests are in blockchain economics.


Sireesha Pilaka received the MS degree from the Information Networking Institute, Carnegie Mellon University in 2017 and the B.Tech. degree from the Indian Institute of Technology, Kharagpur in 2014. She was a research assistant at Carnegie Mellon University in 2017. She is currently a Sr. Network Engineer on Visa’s network architecture team. Her research interests include wide area networks, data-center networks, SDN/NFV, network automation and machine learning.

Pragya Sharma received her Bachelors in Technology (B.Tech) from the Department of Electronics and Communication, Vellore Institute of Technology, India in 2017 and her M.S. in Electrical and Computer Engineering from Carnegie Mellon University in 2018. She is currently an Embedded Software Engineer at Amazon Lab126. Her research interests include wireless communications, low-power sensor networks and embedded systems.

Nagarjun Srinivasan received his Bachelor in Technology (B.Tech) in Electrical Engineering from the Indian Institute of Technology (IIT), Hyderabad in 2016 and Master of Science (MS) in Electrical and Computer Engineering from Carnegie Mellon University in 2017. He currently works as a Developer at Mist Systems, a Juniper Company, on building real-time, reliable and scalable Indoor Location Based Services. His research interests include Signal Processing and Filtering, Indoor Location Technologies, Optimization and Reinforcement Learning.

Carlee Joe-Wong is an Assistant Professor of Electrical and Computer Engineering at Carnegie Mellon University. She received her A.B. degree (magna cum laude) in Mathematics, and M.A. and Ph.D. degrees in Applied and Computational Mathematics, from Princeton University in 2011, 2013, and 2016, respectively. Dr. Joe-Wong's research is in optimizing networked systems, particularly on applying machine learning and pricing to resource allocation in data and computing networks. From 2013 to 2014, she was the Director of Advanced Research at DataMi, a startup she co-founded from her Ph.D. research on mobile data pricing. Her research has received several awards, including the ARO Young Investigator Award in 2019, the NSF CAREER Award in 2018, the INFORMS ISS Design Science Award in 2014 and the Best Paper Award at IEEE INFOCOM 2012.

Patrick Tague is an Associate Research Professor at Carnegie Mellon University with appointments in the Electrical and Computer Engineering Department and the Information Networking Institute, and he is also the Associate Director of the INI. Patrick leads the Mobile, Embedded, and Wireless Security group at the Silicon Valley Campus of CMU. His research interests include wireless communications and networking; wireless/mobile security and privacy; robust and resilient networked systems; and analysis and sense-making of sensor network data. He received PhD and MS degrees in Electrical Engineering from the University of Washington as a member of the Network Security Lab and BS degrees in Mathematics and Computer Engineering from the University of Minnesota. Patrick received the Yang Research Award for outstanding graduate research in the UW Electrical Engineering Department, the Outstanding Graduate Research Award from the UW Center for Information Assurance and Cybersecurity, and the NSF CAREER award. He is a member of IEEE, including the Communication, Computer, and Signal Processing Societies, and ACM, including SIGCOMM, SIGMobile, and SIGSAC.
