Seamless offloading of web app computations from mobile device to edge clouds via HTML5 web worker migration, Jeong et al., SoCC’19 [^1]
This paper caught my eye for its combination of an intriguing idea (opportunistic offload of computation from mobile devices to the edge) and the elegance of the way the web worker interface supports this use case. It’s live migration – but for web workers instead of the more usual VMs or containers.
Why would we want to live migrate web workers?
Emerging mobile applications, such as mobile cloud gaming or augmented reality, require strict latency constraints as well as high computing power… A survey on the latency of games has reported that less than ~50ms of network latency is preferred for time-critical games, which is hard to achieve with a traditional cloud system where computing servers are located in datacenters far from clients…
So you’ve got mobile devices without the computing power needed to deliver a great experience, and cloud computing that has all the needed power but is too far away. Edge servers are the middle ground – more compute power than a mobile device, but with latency of just a few ms. The kind of edge server envisaged here might, for example, be integrated with your WiFi access point.
The design of the HTML5 Web Worker interface turns out to be a great match for migration – Workers are already designed to work on a separate thread communicating via message passing, and are typically used to offload expensive computation that you don’t want to do in the main thread with the user waiting.
Since interactivity is important for many mobile apps, computations are commonly run in web workers. As such, web workers are a natural target to offload to a more powerful server.
Challenges in web worker live migration
Beyond snapshotting, we also need to figure out when and where to migrate workers to. Since we’re talking about mobile applications, we have to assume a changing environment over time, including the possibility of losing internet connectivity altogether.
The Mobile Web Worker (MWW) System
The Mobile Web Worker (MWW) System introduces a new client-side Mobile Web Worker Manager component which is responsible for managing web workers, including their migration when this is estimated to be beneficial. On the server side (in the edge or cloud) another Mobile Web Worker Manager component is responsible for managing pools of mobile workers.
The current system assumes an application-specific regression model is available on the servers which can predict processing time given the current parameters of the job (e.g. the number of cubes to be rendered in the physics simulation). The MWW manager on the client sends a query, including the input size of the worker, to all nearby edge servers. These use their regression models to estimate processing time (which will depend on the hardware available, current load, etc.). The client MWW manager combines these estimates with an estimate of the input/output transmission time (latency) to find the location with the minimum overall execution latency. This could of course be a local worker on the mobile device. The location selection algorithm is run periodically to see if better options are now available.
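To make the selection step concrete, here is a minimal sketch of how a client-side manager might combine the two estimates. It is not the paper's implementation: the `Candidate` fields, the simple transfer-time model (round trip plus bytes over bandwidth), and the numbers in the usage note are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A place the worker could run (hypothetical structure, not from the paper)."""
    name: str
    est_processing_s: float      # processing-time estimate, e.g. from a regression model
    rtt_s: float = 0.0           # network round-trip time to this candidate
    upload_bps: float = 0.0      # client -> candidate bandwidth, bits per second
    download_bps: float = 0.0    # candidate -> client bandwidth, bits per second
    is_local: bool = False       # True for the mobile device itself


def transfer_time_s(c: Candidate, input_bytes: int, output_bytes: int) -> float:
    """Assumed transfer model: one round trip plus input/output bytes over bandwidth."""
    if c.is_local:
        return 0.0  # nothing crosses the network for a local worker
    return (c.rtt_s
            + input_bytes * 8 / c.upload_bps
            + output_bytes * 8 / c.download_bps)


def total_latency_s(c: Candidate, input_bytes: int, output_bytes: int) -> float:
    """Overall execution latency = estimated processing time + transfer time."""
    return c.est_processing_s + transfer_time_s(c, input_bytes, output_bytes)


def select_location(candidates: list[Candidate],
                    input_bytes: int, output_bytes: int) -> Candidate:
    """Pick the candidate with the minimum estimated overall latency."""
    return min(candidates,
               key=lambda c: total_latency_s(c, input_bytes, output_bytes))
```

With a slow local candidate (say 0.5 s of processing) and a fast edge server behind a 10 Mbps uplink, a 100 KB input favours the edge, while a 10 MB input makes the upload dominate and the local worker wins. Re-running `select_location` periodically with fresh estimates corresponds to the periodic re-evaluation described above.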
If the client is disconnected from an edge server to which one of its workers has migrated, then that worker is further migrated to a pre-determined fallback server with stable connectivity (e.g., in the cloud). The idea is that the client should be able to reconnect to its worker from there wherever the client has moved to in the meantime. If the client can’t even access the fallback server (e.g., due to total loss of internet connectivity) then the worker needs to be restarted. If loss of connectivity can be predicted ahead of time, then the worker can be migrated back to the client in advance of connectivity loss.
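The connectivity rules in this paragraph amount to a small decision table. The sketch below encodes them from the client's point of view; the `Action` names and the three boolean inputs are hypothetical, and how the real system detects or predicts disconnection is not covered here.

```python
from enum import Enum


class Action(Enum):
    STAY_ON_EDGE = "stay on edge"
    MIGRATE_BACK_TO_CLIENT = "migrate back to client"
    RECONNECT_VIA_FALLBACK = "reconnect via fallback server"
    RESTART_WORKER = "restart worker"


def next_action(edge_reachable: bool,
                fallback_reachable: bool,
                disconnect_predicted: bool) -> Action:
    """Decide what happens to a worker that currently lives on an edge server."""
    if edge_reachable:
        # Still connected: pull the worker home early if we expect to lose the link.
        if disconnect_predicted:
            return Action.MIGRATE_BACK_TO_CLIENT
        return Action.STAY_ON_EDGE
    if fallback_reachable:
        # The edge server hands the worker to the fallback server; the client
        # reconnects to it there.
        return Action.RECONNECT_VIA_FALLBACK
    # No route to the worker at all: its state is unreachable, so start over.
    return Action.RESTART_WORKER
```

Only the last branch loses work, which is why predicting a disconnect and migrating back in advance is worthwhile.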
The end-to-end migration process looks like this:
In the edge server (“Destination”) in the figure above, pre-built workers are on standby, waiting to receive a snapshot and a copy of any wasm-backing linear memory in order to take over from a given client worker. To speed up migration and quickly restore wasm functions at the destination, the wasm instantiate function is initially called with a dummy linear memory, which is later replaced once the real memory has arrived over the network.
Move it! – MWWF in action
The figure below shows the migration time for these three applications, broken down into snapshot capture, snapshot restoration, and ‘other’ (including the network transfer time). The different applications have quite different profiles in terms of their snapshot size, linear memory size, and wasm source file size. The opencv app has the largest state (4.6 MB snapshot, 128 MB memory, and 5.9 MB of wasm), and filter has the smallest (77 KB snapshot, 16 MB memory, and 34 KB of wasm). For these results, upload and download speeds from mobile client to edge were set at 10 Mbps and 36 Mbps respectively, and for edge-to-edge and edge-to-cloud at 42 Mbps and 118 Mbps respectively.
Given the large sizes involved, opencv takes the longest to migrate (3.8-11.9 seconds), while cubes can migrate in 1-3 seconds, and the blur filter is on the order of a few hundred ms.
[^1]: Accessing this paper from this blog post should grant you access to this paper in the ACM Digital Library – thank you ACM!
Future work includes delving into more realistic use cases and addressing other challenges related to mobile computing such as energy efficiency. Integration with other edge computing techniques is also of strong interest. For example, we can adapt mobile workers to serverless edge computing to support offloading of stateful computations without external storage…