Virtual reality (VR) provides many exciting new application opportunities, but also presents new challenges. In contrast to 360° videos, which only allow a user to select their viewing direction, fully immersive VR also lets users move around and interact with objects in the virtual world. To deliver such services most effectively, it is therefore important to understand how users move around in relation to such objects. In this paper, we present a methodology and software tool for generating run-time datasets capturing a user’s interactions with such 3D environments, evaluate and compare different object identification methods that we implement within the tool, and use datasets collected with the tool to demonstrate example uses. The tool was developed in Unity and easily integrates with existing Unity applications through periodic calls that extract information about the environment using different ray-casting methods. The software tool and example datasets are made available with this paper.
There is a continuous struggle for control of resources at every organization that is connected to the Internet. The local organization wishes to use its resources to achieve strategic goals. Some external entities seek direct control of these resources, for purposes such as spamming or launching denial-of-service attacks. Other external entities seek indirect control of assets (e.g., users, finances), but provide services in exchange for them. Using a year-long trace from an edge network, we examine what various external organizations know about one organization. We compare the types of information exposed by or to external organizations using either active (reconnaissance) or passive (surveillance) techniques. We also explore the direct and indirect control external entities have on local IT resources.
The World Wide Web and the services it provides are continually evolving. Even for a single time instant, it is a complex task to methodically determine the infrastructure over which these services are provided and the corresponding effect on user-perceived performance. For such tasks, researchers typically rely on active measurements or large numbers of volunteer users. In this paper, we consider an alternative approach, which we refer to as passive crowd-based monitoring. More specifically, we use passively collected proxy logs from a global enterprise to observe differences in the quality of service (QoS) experienced by users on different continents. We also show how this technique can measure properties of the underlying infrastructures of different Web content providers. While some of these properties have been observed using active measurements, we are the first to show that many of these properties (such as location of servers) can be obtained using passive measurements of actual user activity. Passive crowd-based monitoring has the advantages that it does not add any overhead on the Web infrastructure and does not require any specific software on the clients, while still capturing the performance and infrastructure observed by actual Web usage.
Congestion-aware scheduling in the case of traditional downlink cellular communication has neglected the heterogeneity in terms of secrecy among different clients. In this paper, we study a two-user congestion-aware broadcast channel with heterogeneous traffic and different security requirements. The traffic with security requirements is intended for a legitimate user and has a bursty nature. The incoming packets are stored in a queue at the source. Furthermore, there is a second traffic flow intended for another user; it is delay tolerant and has no secrecy constraints. The receiver that needs to be served with confidential data has full-duplex capabilities and can send a jamming signal to hinder eavesdropping of its data at the other user. We consider two randomized policies for selecting which packets to transmit: one is congestion-aware, taking into consideration the queue size, whereas the other is non-congestion-aware. We analyse the throughput and the delay performance under two decoding schemes at the receivers and provide insights into their relative security performance and into how congestion control at the queue holding confidential information can help decrease the average delay per packet. We show that the two policies have the same secrecy performance for large random access probabilities. The derived results also take into account the self-interference caused at the receiver for whom the confidential data is intended, due to its full-duplex operation while jamming the communication at the other user.
In this paper, we consider the two-user broadcast channel with security constraints. We assume that a source broadcasts packets to two receivers, and that one of them has secrecy constraints, i.e., its packets need to be kept secret from the other receiver. The receiver with secrecy constraints has full-duplex capability, allowing it to transmit a jamming signal to increase its secrecy. We derive the average delay per packet and provide simulations and numerical results, where we compare different performance metrics for the cases when both receivers treat interference as noise, when the legitimate receiver performs successive decoding, and when the eavesdropper performs successive decoding. The results show that successive decoding provides better average packet delay for the legitimate user. Furthermore, we define a new metric that characterizes the reduction in the success probability of the legitimate user caused by the secrecy constraint. The results show that secrecy imposes a significant packet delay on the legitimate receiver when either receiver performs successive decoding. We also formulate an optimization problem, wherein the throughput of the eavesdropper is maximized under delay and secrecy rate constraints at the legitimate receiver. We provide numerical results for the optimization problem, where we show the trade-off between the transmission power for the jamming and the throughput of the non-legitimate receiver. The results provide insights into how channel ordering and encoding differences can be exploited to improve performance under different interference conditions.
There is limited prior work studying how the ad personalization experienced by different users is impacted by the use of adblockers, geographic location, the user's persona, or what browser they use. To address this void, this paper presents a novel profile-based evaluation of the personalization experienced by carefully crafted user profiles. Our evaluation framework impersonates different users and captures how the personalization changes over time, how it changes when adding or removing an extension, and perhaps most importantly how the results differ depending on the profile's persona (e.g., interest, occupation, age, gender), geographic location (US East, US West, UK), what browser extension they use (none, AdBlock, AdBlock Plus, Ghostery, CatBlock), what browser they use (Chrome, Firefox), and whether they are logged in to their Google account. By comparing and contrasting observed differences we provide insights that help explain why some user groups may feel more targeted than others and why some people may feel even more targeted after having turned on their adblocker.
Video dissemination through sites such as YouTube can have widespread impacts on opinions, thoughts, and cultures. Not all videos will reach the same popularity and have the same impact. Popularity differences arise not only because of differences in video content, but also because of other "content-agnostic" factors. The latter factors are of considerable interest but it has been difficult to accurately study them. For example, videos uploaded by users with large social networks may tend to be more popular because they tend to have more interesting content, not because social network size has a substantial direct impact on popularity.
In this paper, we develop and apply a methodology that is able to accurately assess, both qualitatively and quantitatively, the impacts of various content-agnostic factors on video popularity. When controlling for video content, we observe a strong linear "rich-get-richer" behavior, with the total number of previous views as the most important factor except for very young videos. The second most important factor is found to be video age. We analyze a number of phenomena that may contribute to rich-get-richer, including the first-mover advantage, and search bias towards popular videos. For young videos we find that factors other than the total number of previous views, such as uploader characteristics and number of keywords, become relatively more important. Our findings also confirm that inaccurate conclusions can be reached when not controlling for content.
This paper considers the problem of adapting the BitTorrent protocol for on-demand streaming. BitTorrent is a popular peer-to-peer file sharing protocol that efficiently accommodates a large number of requests for file downloads. Two components of the protocol, namely the Rarest-First piece selection policy and the Tit-for-Tat algorithm for peer selection, are acknowledged to contribute toward the protocol's efficiency with respect to time to download files and its resilience to freeriders. Rarest-First piece selection, however, does not augur well for on-demand streaming. In this paper, we present a new adaptive Window-based piece selection policy that achieves a balance between the system scalability provided by the Rarest-First algorithm and the necessity of In-Order pieces for seamless media playback. We also show that this simple modification to the piece selection policy allows the system to be efficient with respect to utilization of available upload capacity of participating peers, and does not break the Tit-for-Tat incentive scheme which provides resilience to freeriders.
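To make the idea concrete, below is a minimal Python sketch of a window-based piece selection policy of the kind described above: pieces are chosen rarest-first, but only from a sliding window just ahead of the playback position, with a global rarest-first fallback. The window size, data structures, and function names are illustrative assumptions, not the paper's implementation.

import random

def select_piece(playback_pos, have, availability, num_pieces, window=20):
    # playback_pos -- index of the piece currently being played back
    # have         -- set of piece indices already downloaded
    # availability -- list mapping piece index -> number of peers holding it
    def rarest(candidates):
        # Prefer the piece held by the fewest peers; break ties randomly
        # to preserve piece diversity across peers.
        min_avail = min(availability[p] for p in candidates)
        return random.choice([p for p in candidates if availability[p] == min_avail])

    # Candidates: missing pieces inside the window just ahead of playback.
    window_pieces = [p for p in range(playback_pos, min(playback_pos + window, num_pieces))
                     if p not in have and availability[p] > 0]
    if window_pieces:
        return rarest(window_pieces)

    # Fallback: any missing piece, selected rarest-first (keeps the swarm healthy).
    missing = [p for p in range(num_pieces) if p not in have and availability[p] > 0]
    return rarest(missing) if missing else None

# Example: 100-piece file, playback at piece 10, a few pieces already held.
avail = [random.randint(1, 8) for _ in range(100)]
print(select_piece(playback_pos=10, have={0, 1, 2, 10, 11}, availability=avail, num_pieces=100))

The window parameter is the knob that trades piece diversity (a large window behaves like plain rarest-first) against in-order availability for playback (a small window behaves like in-order selection).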
This paper develops a framework for studying the popularity dynamics of user-generated videos, presents a characterization of the popularity dynamics, and proposes a model that captures the key properties of these dynamics. We illustrate the biases that may be introduced in the analysis for some choices of the sampling technique used for collecting data; however, sampling from recently-uploaded videos provides a dataset that is seemingly unbiased. Using a dataset that tracks the views to a sample of recently-uploaded YouTube videos over the first eight months of their lifetime, we study the popularity dynamics. We find that the relative popularities of the videos within our dataset are highly non-stationary, owing primarily to large differences in the required time since upload until peak popularity is finally achieved, and secondly to popularity oscillation. We propose a model that can accurately capture the popularity dynamics of collections of recently-uploaded videos as they age, including key measures such as hot set churn statistics, and the evolution of the viewing rate and total views distributions over time.
In the age of the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), privacy and consent control have become even more apparent for everyday web users. Privacy banners in all shapes and sizes ask for permission through more or less challenging designs, and often make privacy control more of a struggle than a help to users’ privacy. In this paper, we present a novel solution expanding the Advanced Data Protection Control (ADPC) mechanism to bridge current gaps in user data and privacy control. Our solution moves the consent control to the browser interface to give users a seamless and hassle-free experience, while at the same time offering content providers a way to be legally compliant with legislation. Through an extensive review, we evaluate previous works and identify current gaps in user data control. We then present a blueprint for future implementation and suggest features to support privacy control online for users globally. Given browser support, the solution provides a tangible path to effectively achieve legally compliant privacy and consent control in a user-oriented manner that could allow users to again browse the web seamlessly.
The Internet is playing an increasingly important role in today’s society and people are beginning to expect instantaneous access to information and content wherever they are. As content delivery is consuming a majority of the Internet bandwidth and its share of bandwidth is increasing by the hour, we need scalable and efficient techniques that can support these user demands and efficiently deliver the content to the users. When designing such techniques it is important to note that not all content is the same or will reach the same popularity. Scalable techniques must handle an increasingly diverse catalogue of contents, both with regards to the diversity of the content itself (as services become increasingly personalized, for example) and with regards to their individual popularity. The importance of understanding content popularity dynamics is further motivated by popular content’s widespread impact on opinions, thoughts, and cultures. This article briefly discusses some of our recent work on capturing content popularity dynamics and designing scalable content delivery techniques.
The recent Energy Efficient Ethernet (EEE) standard and the eBond protocol provide two orthogonal approaches that allow significant energy savings on routers. In this paper we present the modeling and performance evaluation of these two protocols and a hybrid protocol. We first present eeeBond, pronounced “triple-e bond”, which combines the eBond capability to switch between multiple redundant interfaces with EEE's active/idle toggling capability implemented in each interface. Second, we present an analytic model of the protocol performance, and derive closed-form expressions for the optimized parameter settings of both eBond and eeeBond. Third, we present a performance evaluation that characterizes the relative performance gains possible with the optimized protocols, as well as a trace-based evaluation that validates the insights from the analytic model. Our results show that there are significant advantages to combining eBond and EEE. The eBond capability provides good savings when interfaces offer only small energy savings in short-term sleep states, and the EEE capability becomes increasingly important as short-term sleep savings improve.
With the proliferation of cloud services, cloud-based systems can become a cost-effective means of on-line content delivery. In order to make best use of the available cloud bandwidth and storage resources, content distributors need to have a good understanding of the tradeoffs between various system design choices. In this work we consider a peer-assisted content delivery system that aims to provide guaranteed average download rate to its customers. We show that bandwidth demand peaks for contents with moderate popularity, and identify these contents as candidates for cloud-based service. We then consider dynamic content bundling and cross-swarm seeding, which were recently proposed to improve download performance, and evaluate their impact on the optimal choice of cloud service use. We find that much of the benefits from peer seeding can be achieved with careful torrent inflation, and that hybrid policies that combine bundling and peer seeding often reduce the delivery costs by 20% relative to only using seeding. Furthermore, all these peer-assisted policies reduce the number of files that would need to be pushed to the cloud. Finally, we show that careful system design is needed if locality is an important criterion when choosing cloud-based service provisioning.
This book constitutes the refereed proceedings of the 13th International Conference on Passive and Active Measurement, PAM 2012, held in Vienna, Austria, in March 2012. The 25 revised full papers presented were carefully reviewed and selected from 83 submissions. The papers were arranged into eight sessions: traffic evolution and analysis, large scale monitoring, evaluation methodology, malicious behavior, new measurement initiatives, reassessing tools and methods, perspectives on internet structure and services, and application protocols.
The demand and usage of 360° video services are expected to increase. However, despite these services being highly bandwidth intensive, not much is known about the potential value that basic bandwidth saving techniques such as server or edge-network on-demand caching (e.g., in a CDN) could have when used for delivery of such services. This problem is both important and complicated as client-side solutions have been developed that split the full 360° view into multiple tiles, and adapt the quality of the downloaded tiles based on the user’s expected viewing direction and bandwidth conditions. This paper presents new trace-based analysis methods that incorporate users’ viewports (the area of the full 360° view the user actually sees), a first characterization of the cross-user similarities of the users’ viewports, and a trace-based analysis of the potential bandwidth savings that caching-based techniques may offer under different conditions. Our analysis takes into account differences in the time granularity over which viewport overlaps can be beneficial for resource saving techniques, compares and contrasts differences between video categories, and accounts for uncertainties in the network conditions and the prediction of the future viewing direction when prefetching. The results provide substantial insight into the conditions under which overlap can be considerable and caching effective, and inform the design of new caching system policies tailored for 360° video.
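As a rough illustration of the kind of viewport-overlap computation such an analysis relies on, the hypothetical Python sketch below represents each user's viewport in a time slot as the set of equirectangular tiles it covers and measures pairwise overlap between users. The tiling, field-of-view values, and function names are assumptions made for illustration, not the paper's actual method.

from itertools import combinations

def viewport_tiles(yaw_deg, pitch_deg, hfov=100, vfov=90, tiles_x=12, tiles_y=6):
    # Return the set of (row, col) tiles whose centers fall inside a viewport
    # centered at (yaw, pitch), assuming an equirectangular tiles_x x tiles_y layout.
    tile_w, tile_h = 360.0 / tiles_x, 180.0 / tiles_y
    covered = set()
    for row in range(tiles_y):
        for col in range(tiles_x):
            cy = -180 + (col + 0.5) * tile_w   # tile center yaw, in [-180, 180)
            cp = -90 + (row + 0.5) * tile_h    # tile center pitch, in [-90, 90)
            dyaw = min(abs(cy - yaw_deg), 360 - abs(cy - yaw_deg))  # yaw wraps around
            if dyaw <= hfov / 2 and abs(cp - pitch_deg) <= vfov / 2:
                covered.add((row, col))
    return covered

def pairwise_overlap(viewports):
    # Average Jaccard overlap of tile sets across all user pairs in one time slot.
    scores = [len(a & b) / len(a | b) for a, b in combinations(viewports, 2)]
    return sum(scores) / len(scores) if scores else 0.0

# Example: two users looking in similar directions and one looking behind.
users = [viewport_tiles(0, 0), viewport_tiles(20, -5), viewport_tiles(170, 0)]
print(round(pairwise_overlap(users), 2))

Aggregating such per-slot scores across users, videos, and time granularities is one way to quantify the cross-user similarity on which caching and prefetching gains depend.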
Geographically distributed cloud platforms enable an attractive approach to large-scale content delivery. Storage at various sites can be dynamically acquired from (and released back to) the cloud provider so as to support content caching, according to the current demands for the content from the different geographic regions. When storage is sufficiently expensive that not all content should be cached at all sites, two issues must be addressed: how should requests for content be routed to the cloud provider sites, and what policy should be used for caching content using the elastic storage resources obtained from the cloud provider. Existing approaches are typically designed for non-elastic storage and little is known about the optimal policies when minimizing the delivery costs for distributed elastic storage.
In this paper, we propose an approach in which elastic storage resources are exploited using a simple dynamic caching policy, while request routing is updated periodically according to the solution of an optimization model. Use of pull-based dynamic caching, rather than push-based placement, provides robustness to unpredicted changes in request rates. We show that this robustness is provided at low cost: even with fixed request rates, use of the dynamic caching policy typically yields content delivery cost within 10% of that with the optimal static placement. We compare request routing according to our optimization model to simpler baseline routing policies, and find that the baseline policies can yield greatly increased delivery cost relative to optimized routing. Finally, we present a lower-cost approximate solution algorithm for our routing optimization problem that yields content delivery cost within 2.5% of the optimal solution.
Content delivery providers can improve their service scalability and offload their servers by making use of content transfers among their clients. To provide peers with incentive to transfer data to other peers, protocols such as BitTorrent typically employ a tit-for-tat policy in which peers give upload preference to peers that provide the highest upload rate to them. However, the tit-for-tat policy does not provide any incentive for a peer to stay in the system beyond completion of its download.
This paper presents a simple fixed-point analytic model of a priority-based incentive mechanism which provides peers with strong incentive to contribute upload bandwidth beyond their own download completion. Priority is obtained based on a peer's prior contribution to the system. Using a two-class model, we show that priority-based policies can significantly improve average download times, and that there exists a significant region of the parameter space in which both high-priority and low-priority peers experience improved performance compared to the pure tit-for-tat approach. Our results are supported using event-based simulations.
Greedy geographic routing is attractive for large multi-hop wireless networks because of its simple and distributed operation. However, it may easily result in dead ends or hotspots when routing in a network with obstacles (regions without sufficient connectivity to forward messages). In this paper we propose a distributed routing algorithm that combines greedy geographic routing with two non-Euclidean distance metrics, chosen so as to provide load balanced routing around obstacles and hotspots. The first metric, Local Shortest Path, is used to achieve high probability of progress, while the second metric, Weighted Distance Gain, is used to select a desirable node among those that provide progress. The proposed Load Balanced Local Shortest Path (LBLSP) routing algorithm provides loop freedom, guarantees delivery when a path exists, is able to efficiently route around obstacles, and provides good load balancing.
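The following is a minimal Python sketch of the greedy next-hop structure described above: among a node's neighbors, those that provide progress toward the destination are identified first, and a second, load-aware weighting then selects among them. The concrete distance and weighting functions here are simple placeholders, not the paper's Local Shortest Path and Weighted Distance Gain metrics.

import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def next_hop(current, neighbors, dest, load, alpha=0.5):
    # current, dest -- (x, y) positions
    # neighbors     -- list of (node_id, (x, y)) pairs
    # load          -- dict node_id -> queue length (a proxy for hotspot load)
    # alpha         -- weight between geographic progress and load avoidance

    # Step 1: keep only neighbors that make progress toward the destination
    # (a stand-in for the paper's progress criterion).
    progressing = [(nid, pos) for nid, pos in neighbors
                   if dist(pos, dest) < dist(current, dest)]
    if not progressing:
        return None  # dead end; a recovery mode would take over here

    # Step 2: among progressing neighbors, score by a weighted combination of
    # remaining distance and load (a stand-in for the desirability weighting).
    def score(nid, pos):
        return alpha * dist(pos, dest) + (1 - alpha) * load.get(nid, 0)

    best_id, _ = min(progressing, key=lambda n: score(n[0], n[1]))
    return best_id

# Example: routing from (0, 0) toward (10, 0) with one congested neighbor.
nbrs = [("a", (3, 1)), ("b", (3, -1)), ("c", (-2, 0))]
print(next_hop((0, 0), nbrs, (10, 0), load={"a": 9, "b": 1}))  # -> "b"

The weight alpha trades pure geographic progress against load balancing; the paper's metrics additionally provide the loop-freedom and delivery guarantees that this toy rule does not.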
Content-delivery applications can achieve scalability and reduce wide-area network traffic using geographically distributed caches. However, each deployed cache has an associated cost, and under time-varying request rates (e.g., a daily cycle) there may be long periods when the request rate from the local region is not high enough to justify this cost. Cloud computing offers a solution to problems of this kind, by supporting dynamic allocation and release of resources. In this paper, we analyze the potential benefits from dynamically instantiating caches using resources from cloud service providers. We develop novel analytic caching models that accommodate time-varying request rates, transient behavior as a cache fills following instantiation, and selective cache insertion policies. Within the context of a simple cost model, we then develop bounds and compare policies with optimized parameter selections to obtain insights into key cost/performance tradeoffs. We find that dynamic cache instantiation can provide substantial cost reductions, that the potential reductions depend strongly on the object popularity skew, and that selective cache insertion can be even more beneficial in this context than with conventional edge caches. Finally, our contributions also include accurate and easy-to-compute approximations that are shown to be applicable to LRU caches under time-varying workloads.
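For context on the kind of easy-to-compute LRU approximation alluded to above, the sketch below uses the classical Che approximation as an assumed stand-in (not necessarily the paper's formulation) to estimate the hit rate of an LRU cache of a given size under independent Poisson requests with stationary rates.

import math

def che_characteristic_time(rates, cache_size, tol=1e-9):
    # Solve sum_i (1 - exp(-rate_i * T)) = cache_size for T by bisection.
    def filled(T):
        return sum(1.0 - math.exp(-r * T) for r in rates)
    lo, hi = 0.0, 1.0
    while filled(hi) < cache_size:   # expand the upper bracket until it covers the root
        hi *= 2.0
    while hi - lo > tol * max(hi, 1.0):
        mid = (lo + hi) / 2.0
        if filled(mid) < cache_size:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def lru_hit_rate(rates, cache_size):
    # Aggregate (request-weighted) hit rate under the Che approximation.
    T = che_characteristic_time(rates, cache_size)
    hit = [1.0 - math.exp(-r * T) for r in rates]   # per-object hit probability
    return sum(r * h for r, h in zip(rates, hit)) / sum(rates)

# Example: Zipf(0.8)-like popularity over 10,000 objects, cache holding 500 of them.
N, s = 10_000, 0.8
weights = [1.0 / (i ** s) for i in range(1, N + 1)]
total_w = sum(weights)
rates = [w / total_w for w in weights]   # normalized request rates
print(round(lru_hit_rate(rates, cache_size=500), 3))

A stationary computation like this one is only the building block; handling time-varying request rates, the transient fill period after instantiation, and selective insertion is where the paper's models go beyond this sketch.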
With BitTorrent-like protocols a client may download a file from a large and changing set of peers, using connections of heterogeneous and time-varying bandwidths. This flexibility is achieved by breaking the file into many small pieces, each of which may be downloaded from different peers. This paper considers an approach to peer-assisted on-demand delivery of stored media that is based on the relatively simple and flexible BitTorrent-like approach, but which is able to achieve a form of “streaming” delivery, in the sense that playback can begin well before the entire media file is received. Achieving this goal requires: (1) a piece selection strategy that effectively mediates the conflict between the goals of high piece diversity, and the in-order requirements of media file playback, and (2) an on-line rule for deciding when playback can safely commence. We present and evaluate using simulation candidate protocols including both of these components.
Video on demand, particularly with user-generated content, is emerging as one of the most bandwidth-intensive applications on the Internet. Owing to content control and other issues, some video-on-demand systems attempt to prevent downloading and peer-to-peer content delivery. Instead, such systems rely on server replication, such as via third-party content distribution networks, to support video streaming (or pseudostreaming) to their clients. A major issue with such systems is the cost of the required server resources.
By synchronizing the video streams for clients that make closely spaced requests for the same video from the same server, server costs (such as for retrieval of the video data from disk) can be amortized over multiple requests. A fundamental trade-off then arises, however, with respect to server selection. Network delivery cost is minimized by selecting the nearest server, while server cost is minimized by directing closely spaced requests for the same video to a common server.
This article compares classes of server selection policies within the context of a simple system model. We conclude that: (i) server selection using dynamic system state information (rather than only proximities and average loads) can yield large improvements in performance, (ii) deferring server selection for a request as late as possible (i.e., until just before streaming is to begin) can yield additional large improvements, and (iii) within the class of policies using dynamic state information and deferred selection, policies using only “local” (rather than global) request information are able to achieve most of the potential performance gains.
Systems delivering stored video content using a peer-assisted approach are able to serve large numbers of concurrent requests by utilizing upload bandwidth from their clients to assist in delivery. In systems providing download service, BitTorrent-like protocols may be used in which “tit-for-tat” policies provide incentive for clients to contribute upload bandwidth. For on-demand streaming delivery, however, in which clients begin playback well before download is complete, all prior proposed protocols rely on peers at later video play points uploading data to peers at earlier play points that do not have data to share in return. This paper considers the problem of devising peer-assisted protocols for streaming systems that, similar to download systems, provide effective “tit-for-tat” incentives for clients to contribute upload bandwidth. We propose policies that provide such incentives, while also providing short start-up delays, and delivery of (almost) all video frames by their respective playback deadlines.
Previous scalable protocols for downloading large, popular files from a single server include batching and cyclic multicast. With batching, clients wait to begin receiving a requested file until the beginning of its next multicast transmission, which collectively serves all of the waiting clients that have accumulated up to that point. With cyclic multicast, the file data is cyclically transmitted on a multicast channel. Clients can begin listening to the channel at an arbitrary point in time, and continue listening until all of the file data has been received. This paper first develops lower bounds on the average and maximum client delay for completely downloading a file, as functions of the average server bandwidth used to serve requests for that file, for systems with homogeneous clients. The results show that neither cyclic multicast nor batching consistently yields performance close to optimal. New hybrid download protocols are proposed that achieve within 15% of the optimal maximum delay and 20% of the optimal average delay in homogeneous systems. For heterogeneous systems in which clients have widely varying achievable reception rates, an additional design question concerns the use of high rate transmissions, which can decrease delay for clients that can receive at such rates, in addition to low rate transmissions that can be received by all clients. A new scalable download protocol for such systems is proposed, and its performance is compared to that of alternative protocols as well as to new lower bounds on maximum client delay. The new protocol achieves within 25% of the optimal maximum client delay in all scenarios considered.
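As a back-of-the-envelope illustration of the delay versus server-bandwidth trade-off that these bounds make precise, consider simple batching under the assumption that at least one request arrives in every multicast interval (this elementary relation is illustrative only and is not one of the paper's bounds). If a file of size S is multicast in full once every \Delta time units, then

    B = \frac{S}{\Delta}, \qquad D_{\max} = \Delta = \frac{S}{B}, \qquad \bar{D} \approx \frac{\Delta}{2} = \frac{S}{2B},

where B is the average server bandwidth used for the file, D_{\max} the maximum client delay, and \bar{D} the mean delay for uniformly arriving requests. The paper's lower bounds and hybrid protocols quantify how much better than this simple batching curve a scalable download protocol can do.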
A peer-assisted content delivery system uses the upload bandwidth of its clients to assist in delivery of popular content. In peer-assisted systems using a BitTorrent-like protocol, a content delivery server seeds the offered files, and active torrents form when multiple clients make closely-spaced requests for the same content. Scalability is achieved in the sense of being able to accommodate arbitrarily high request rates for individual files. Scalability with respect to the number of files, however, may be much more difficult to achieve, owing to a “long tail” of lukewarm or cold files for which the server may need to assume most or all of the delivery cost. This paper first addresses the question of how best to allocate server resources among multiple active torrents. We then propose new content delivery policies that use some of the available upload bandwidth from currently downloading clients to “inflate” torrents for files that would otherwise require substantial server bandwidth. Our performance results show that use of torrent inflation can substantially reduce download times, by more than 50% in some cases.
In the current digital age, massive amounts of data are generated in many different ways and forms. The data may be collected from everything from personal web logs to purposefully placed sensors. Today, companies and researchers use this data for everything from targeted personalized ads based on social data to solving important scientific problems that may help future generations of world citizens. Whether measured in monetary profit or by other measures, this data has proven valuable for many purposes and has led us into the Big Data era. Due to the large volume of data, Big Data requires significant storage, processing, and bandwidth resources. To date, the Cloud provides the largest collection of disk storage, CPU power, and network bandwidth, which makes it a natural choice for housing the Big Data.
Conventional video consists of a single sequence of video frames. During a client's playback period, frames are viewed sequentially from some specified starting point. The fixed frame ordering of conventional video enables efficient scheduled broadcast delivery, as well as efficient near on-demand delivery to large numbers of concurrent clients through use of periodic broadcast protocols in which the video file is segmented and transmitted on multiple channels. This paper considers the problem of devising scalable protocols for near on-demand delivery of “nonlinear” media files whose content may have a tree or graph, rather than linear, structure. Such media allows personalization of the media playback according to individual client preferences. We formulate a mathematical model for determination of the optimal periodic broadcast protocol for nonlinear media with piecewise-linear structures. Our objective function allows differing weights to be placed on the startup delays required for differing paths through the media. Studying a number of simple nonlinear structures we provide insight into the characteristics of the optimal solution. For cases in which the cost of solving the optimization model is prohibitive, we propose and evaluate an efficient approximation algorithm.
Anonymous network communication protocols provide privacy for Internet-based communication. In this paper, we focus on the performance and scalability of anonymity protocols. In particular, we develop performance models for two anonymity protocols from the prior literature (Buses and Taxis), as well as our own newly proposed protocol (Motorcycles). Using a combination of experimental implementation, simulation, and analysis, we show that: (1) the message latency of the Buses protocol is O(N²), scaling quadratically with the number of participants; (2) the message latency of the Taxis protocol is O(N), scaling linearly with the number of participants; and (3) the message latency of the Motorcycles protocol is O(log² N), scaling logarithmically with the number of participants. Motorcycles can provide scalable anonymous network communication, without compromising the strength of anonymity provided by Buses or Taxis.
With BitTorrent, efficient peer upload utilization is achieved by splitting contents into many small pieces, each of which may be downloaded from different peers within the same swarm. Unfortunately, piece and bandwidth availability may cause the file-sharing efficiency to degrade in small swarms with few participating peers. Using extensive measurements, we identified hundreds of thousands of torrents with several small swarms for which reallocating peers among swarms and/or modifying the peer behavior could significantly improve the system performance. Motivated by this observation, we propose a centralized and a distributed protocol for dynamic swarm management. The centralized protocol (CSM) manages the swarms of peers at minimal tracker overhead. The distributed protocol (DSM) manages the swarms of peers while ensuring load fairness among the trackers. Both protocols achieve their performance improvements by identifying and merging small swarms and by allowing load sharing for large torrents. Our evaluations are based on measurement data collected during eight days from over 700 trackers worldwide, which collectively maintain state information about 2.8 million unique torrents. We find that CSM and DSM can achieve most of the performance gains of dynamic swarm management. These gains are estimated to be up to 40% on average for small torrents.
Tracker-based peer-discovery is used in most commercial peer-to-peer content distribution systems, as it provides performance benefits compared to distributed solutions, and facilitates the control and monitoring of the overlay. But a tracker is a central point of failure, and its deployment and maintenance incur costs; hence an important question is how high tracker availability can be achieved at low cost. We investigate highly available, low overhead peer discovery, using independent trackers and a simple gossip protocol. This work is a step towards understanding the trade-off between the overhead and the achievable peer connectivity in highly available distributed overlay-management systems for peer-to-peer content distribution. We propose two protocols that connect peers in different swarms efficiently with a constant, but tunable, overhead. The two protocols, Random Peer Migration (RPM) and Random Multi-Tracking (RMT), employ a small fraction of peers in a torrent to virtually increase the size of swarms. We develop analytical models of the protocols based on renewal theory, and validate the models using both extensive simulations and controlled experiments. We illustrate the potential value of the protocols using large-scale measurement data that contains hundreds of thousands of public torrents with several small swarms, with limited peer connectivity. We estimate the achievable gains to be up to 40% on average for small torrents.
Motivated by improved models for content workload prediction, in this paper we consider the problem of dynamic content allocation for a hybrid content delivery system that combines cloud-based storage with low cost dedicated servers that have limited storage and unmetered upload bandwidth. We formulate the problem of allocating contents to the dedicated storage as a finite horizon dynamic decision problem, and show that a discrete time decision problem is a good approximation for piecewise stationary workloads. We provide an exact solution to the discrete time decision problem in the form of a mixed integer linear programming problem, propose computationally feasible approximations, and give bounds on their approximation ratios. Finally, we evaluate the algorithms using synthetic and measured traces from a commercial music on-demand service and give insight into their performance as a function of the workload characteristics.
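To make the structure of such a formulation concrete, below is an illustrative and deliberately generic mixed integer linear program for placing contents on the dedicated servers over a finite horizon; the symbols, the switching-cost term, and its linearization are assumptions made for illustration and are not the paper's exact model:

    \min_{x,y} \; \sum_{t=1}^{T} \sum_{i} c\, b_i(t)\, (1 - x_{i,t}) \;+\; m \sum_{t=1}^{T} \sum_{i} y_{i,t}
    \text{s.t.} \;\; \sum_{i} s_i\, x_{i,t} \le S \;\; \forall t, \qquad y_{i,t} \ge x_{i,t} - x_{i,t-1} \;\; \forall i, t, \qquad x_{i,t}, y_{i,t} \in \{0, 1\},

where x_{i,t} indicates whether content i is held on the dedicated servers in period t (with x_{i,0} fixed by the initial placement), b_i(t) is its predicted demand, c the per-unit cloud delivery cost, s_i the content size, S the dedicated storage capacity, m the cost of fetching a content onto dedicated storage, and y_{i,t} linearizes those fetch events.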
With Twitter and other microblogging services, users can easily express their opinion and ideas in short text messages. A recent trend is that users use the real-time property of these services to share their opinions and thoughts as events unfold on TV or in the real world. In the context of TV broadcasts, Twitter (over a mobile device, for example) is referred to as a second screen. This paper presents the first characterization of the second screen usage over the playoffs of a major sports league. We present both temporal and spatial analysis of the Twitter usage during the end of the National Hockey League (NHL) regular season and the 2015 Stanley Cup playoffs. Our analysis provides insights into the usage patterns over the full 72-day period and with regards to in-game events such as goals, but also with regards to geographic biases. Quantifying these biases and the significance of specific events, we then discuss and provide insights into how the playoff dynamics may impact advertisers and third-party developers that try to provide increased personalization.
This paper presents the design of a simple emulation framework for performance evaluation and testing of mobile applications. Our testbed combines production hardware and software to allow emulation of realistic and repeatable mobility scenarios, in which the mobile user can travel long distances, while being served by an application server. The framework allows (i) geo-location information, (ii) client network conditions such as bandwidth and loss rate, as well as (iii) the application workload to be emulated synchronously. To illustrate the power of the framework we also present the design, proof-of-concept implementation, and evaluation of a geo-smart scheduler for application updates in smartphones. This geo-smart scheduler reduces the average download time by using a network performance map to schedule the downloads when at places with relatively good conditions. Our trace-driven evaluation of the geo-smart scheduler illustrates the workings of the emulation framework and the potential of the geo-smart scheduler.
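To illustrate the decision such a scheduler faces, the hypothetical Python sketch below picks a start time for a non-urgent download given a predicted route of (time, expected bandwidth) points from a network performance map. The route format, names, and decision rule are assumptions for illustration, not the implemented scheduler.

def pick_download_slot(route, size_mb, deadline):
    # route    -- list of (start_time_s, expected_bandwidth_mbps) points, ordered by time
    # size_mb  -- size of the update to download (megabytes)
    # deadline -- latest acceptable completion time (seconds)
    # Returns (start_time, expected_duration) for the best feasible slot, or None.
    best = None
    for start_time, bw_mbps in route:
        if bw_mbps <= 0:
            continue                          # no usable connectivity predicted here
        duration = size_mb * 8.0 / bw_mbps    # seconds, assuming stable bandwidth
        if start_time + duration > deadline:
            continue                          # would miss the deadline
        if best is None or duration < best[1]:
            best = (start_time, duration)
    return best

# Example: a commute with a poor stretch in the middle and good coverage later.
route = [(0, 2.0), (300, 0.2), (600, 12.0), (900, 8.0)]
print(pick_download_slot(route, size_mb=50, deadline=1200))  # -> slot starting at t=600

Deferring the 50 MB download to the well-connected point at t=600 cuts the expected transfer time from 200 s to about 33 s in this toy example, which is the kind of saving a map-based scheduler targets.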
Today's Web provides many different functionalities, including communication, entertainment, social networking, and information retrieval. In this article, we analyze traces of HTTP activity from a large enterprise and from a large university to identify and characterize Web-based service usage. Our work provides an initial methodology for the analysis of Web-based services. While it is nontrivial to identify the classes, instances, and providers for each transaction, our results show that most of the traffic comes from a small subset of providers, which can be classified manually. Furthermore, we assess both qualitatively and quantitatively how the Web has evolved over the past decade, and discuss the implications of these changes.
Predicting game or season outcomes is important for clubs as well as for the betting industry. Understanding the critical factors of winning games and championships gives clubs a competitive advantage when selecting players for the team and implementing winning strategies. In this paper, we work with NBA data from 10 seasons and propose an approach for predicting game outcomes that is then used for predicting which team will be champion and which stages a team will reach in the playoffs. We show that our approach performs on par with the odds from betting companies and does better than Elo.
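For reference, the Elo baseline mentioned above updates team ratings after each game roughly as follows; the K-factor and the absence of a home-court adjustment here are illustrative choices, not necessarily those used in the paper.

def elo_expected(rating_a, rating_b):
    # Probability that team A beats team B under the Elo model.
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def elo_update(rating_a, rating_b, a_won, k=20):
    # Return updated ratings after one game (a_won is 1 if team A won, else 0).
    delta = k * (a_won - elo_expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta

# Example: a 1600-rated team beats a 1500-rated team.
print(round(elo_expected(1600, 1500), 3))   # ~0.64 win probability for the stronger team
print(elo_update(1600, 1500, a_won=1))      # winner gains ~7 points, loser drops ~7

The elo_expected probability is what an Elo-based baseline would use as its game-outcome prediction.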
Secondary spectrum auctions have been suggested as a strategically robust mechanism for distributing idle spectrum to competing secondary users. However, previous work on such auction design has assumed a static auction setting, thus failing to fully exploit the inherently time-varying nature of spectrum demand and utilization. In this paper, we address this issue from the perspective of the primary user who wishes to maximize the auction revenue. We present an online auction framework that dynamically accepts bids and allocates spectrum. We prove rigorously that our online auction framework is truthful in the multiple dimensions of bid values, as well as bid timing parameters. To protect against unbounded loss of revenue due to later bids, we introduce controlled preemption into our mechanism. We prove that preemption, coupled with the technique of inflating bids artificially, leads to an online auction that guarantees a 1/5-fraction of the optimal revenue as obtained by an offline adversary. Since the previous guarantee holds only for the optimal channel allocation, we further provide a greedy channel allocation scheme that provides scalability. We prove that the greedy scheme also obtains a constant competitive revenue guarantee, where the constant depends on a parameter of the conflict graph.
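To give a flavor of greedy channel allocation on a conflict graph, the hypothetical Python sketch below considers bidders in decreasing bid order and grants each one a channel only if none of its already-accepted neighbors holds that channel. The data structures and tie-breaking are illustrative; the paper's mechanism additionally handles truthfulness, bid timing, and preemption, which are omitted here.

def greedy_allocate(bids, conflicts, channels):
    # bids      -- dict bidder -> bid value
    # conflicts -- dict bidder -> set of conflicting bidders (interference graph)
    # channels  -- list of available channel ids
    # Returns dict bidder -> channel for the accepted bidders.
    allocation = {}
    for bidder in sorted(bids, key=bids.get, reverse=True):
        # Channels already taken by this bidder's neighbors in the conflict graph.
        taken = {allocation[n] for n in conflicts.get(bidder, set()) if n in allocation}
        free = [ch for ch in channels if ch not in taken]
        if free:
            allocation[bidder] = free[0]   # assign the first conflict-free channel
    return allocation

# Example: three bidders, two channels; A conflicts with both B and C.
bids = {"A": 10, "B": 7, "C": 5}
conflicts = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}}
print(greedy_allocate(bids, conflicts, channels=[1, 2]))
# -> {'A': 1, 'B': 2, 'C': 2}  (B and C do not conflict, so they can share channel 2)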
The advent of multi-core technology motivates new studies to understand how efficiently Web servers utilize such hardware. This paper presents a detailed performance study of a Web server application deployed on a modern eight-core server. Our study shows that default Web server configurations result in poor scalability with increasing core counts. We study two different types of workloads, namely, a workload with intense TCP/IP related OS activity and the SPECweb2009 Support workload with more application-level processing. We observe that the scaling behaviour is markedly different for these workloads, mainly because of the difference in the performance of static and dynamic requests. While static requests perform poorly when moving from using one socket to both sockets in the system, the converse is true for dynamic requests. We show that, contrary to what was suggested by previous work, Web server scalability improvement policies need to be adapted based on the type of workload experienced by the server. The results of our experiments reveal that with workload-specific Web server configuration strategies, a multi-core server can be utilized up to 80% while still serving requests without significant queuing delays; utilizations beyond 90% are also possible, while still serving requests with ‘acceptable’ response times.