Wagner James Au

7 October 2020

The decades-long crawl to the metaverse

The concept of the metaverse has mesmerized Silicon Valley for nearly 30 years, ever since it was first described in Neal Stephenson’s Snow Crash in 1992. Technology is now emerging that could make the metaverse a reality and usher in a massively multi-user virtual world that mirrors our own, where imagination and reality are instantly accessible in a shared and dynamic 3D space.

In the early- to mid-2000s, pioneering virtual worlds like Second Life showed promise in advancing that vision but struggled to grow beyond a niche of early adopters. In more recent years, a wave of new virtual reality (VR) devices promised a revival of that long-dormant dream. Stephenson himself even joined the effort, briefly serving as “Chief Futurist” at AR startup Magic Leap, but mass adoption remains elusive.

Recently, however, several converging trends have revived hopes that market interest and technical capacity may finally put the metaverse -- or many metaverses -- within reach.

In the eyes of many industry insiders, edge computing will be an essential pillar of the metaverse’s creation, enabling developers to reduce latency in virtual worlds. This technology could power better devices and new virtual experiences that support many thousands -- and eventually, millions -- of concurrent users in the same shared 3D space.

Combined with new technical capabilities that edge computing makes possible, a number of trends are converging to make the metaverse more desirable:

Renewed interest in virtual worlds during the pandemic

Extended reality, or “XR” -- a term broadly encompassing both virtual and augmented reality platforms -- is gaining adoption in online content such as multiplayer games and virtual worlds as real-life interactions are limited by the COVID-19 pandemic. The spread of the disease has forced a system-wide reevaluation of the importance of socially crowded public spaces, and XR offers viable alternatives to them.

In recent months, we’ve seen a kind of virtualization of the public space, with a number of industries shifting from offline and in-person activities to experimenting with virtual conferences and event spaces, virtual corporate and college campuses, and even avatar-driven music concerts. Perhaps the best example of the latter came in April, when a giant replica of rap star Travis Scott appeared in Fortnite’s virtual world -- a performance that drew 12 million attendees.

Rise of 3D engines

Thanks to the massive growth of the mobile gaming market and buttressed by PC and console games, 3D engines are now conservatively estimated to be used by more than 1 billion consumers. The dominant market leaders, Unity and Epic Games’ Unreal Engine, have extended their use cases far beyond games into 3D prototyping, simulation, and hyper-realistic animation. For example, most of the live-action Disney series The Mandalorian was created in Unreal.

Mass adoption of virtual worlds

Once relegated to a subset of niche gamers, virtual worlds have evolved to become as relevant to Gen Y and Gen Z as social media. Just three leading virtual worlds -- Roblox, Minecraft, and Fortnite -- account for roughly half a billion active users total, most of whom are in their early 20s or younger. For many of these users, virtual worlds are their preferred social space to interact with friends, challenging both social media and legacy content. As Netflix said in a letter to investors last year, “We compete with (and lose to) Fortnite more than HBO.”

Together, these trends signal a future for the metaverse that extends beyond the pandemic era, long after vaccines and other public health measures have returned the real world to a sense of normalcy. As the Minecraft generation enters young adulthood, its love of virtual spaces with infinite possibility will likely remain.

But this convergence would not be impactful without achieving a technical leap that’s close to being realized.

The edge of enlightenment

Edge computing -- in which compute, storage, security, and networking occur physically closer to end users and their devices -- will enable us to think about the metaverse and related applications in a new light.

“The simplest way to use the edge is as a Content Distribution Network caching static images, sounds, geometry, shaders and scripts as deployed by [a virtual world],” Oculus/Facebook veteran Jim Purbrick said. “Users can then download these assets from their nearest edge server.”

From a user’s perspective, the world and the other users in a scene can be displayed much more quickly, while also reducing bandwidth requirements for both the data center and the end user.
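To make Purbrick’s description concrete, here is a minimal Python sketch of the cache-aside pattern an edge node might use to serve static virtual-world assets, fetching from the origin data center only on a miss. The asset names and the fetch_from_origin helper are hypothetical, not any real platform’s API:

```python
# Minimal cache-aside sketch for an edge node serving static
# virtual-world assets (textures, geometry, shaders, scripts).
# All names here are illustrative, not a real platform's API.

origin_store = {
    "castle.mesh": b"<geometry bytes>",
    "sky.shader": b"<shader bytes>",
}

edge_cache: dict[str, bytes] = {}

def fetch_from_origin(asset_id: str) -> bytes:
    """Simulate the slow round trip to the central data center."""
    return origin_store[asset_id]

def serve_asset(asset_id: str) -> bytes:
    """Serve from the edge cache when possible; fall back to origin."""
    if asset_id in edge_cache:           # cache hit: low-latency local read
        return edge_cache[asset_id]
    data = fetch_from_origin(asset_id)   # cache miss: one slow origin fetch...
    edge_cache[asset_id] = data          # ...then stored for later nearby users
    return data

# The first request pays the origin round trip; subsequent nearby
# users get the asset straight from the edge.
serve_asset("castle.mesh")
assert "castle.mesh" in edge_cache
```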

“The low latency promise of edge computing and streaming will enable whole new categories of applications,” Theta Labs’ Mitch Liu explains. “For example, the emerging category of cloud gaming services like Google Stadia and others will evolve to a whole new level. They stream game data in real-time from servers, and more importantly the computational and application logic remains on the edge server, effectively making the gamer's device a very thin client that only renders input/output.”
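As a rough illustration of the thin-client model Liu describes -- simulation and rendering on the edge server, the device reduced to input/output -- here is a minimal Python sketch of the loop such a client might run. Every function is a hypothetical stub, not a real service’s API:

```python
# Hypothetical thin-client loop for a cloud-gaming-style service:
# game logic and rendering live on the edge server; the device only
# ships input upstream and displays the frames that come back.

def read_local_input() -> dict:
    """Poll the controller/touchscreen (stubbed out here)."""
    return {"buttons": [], "stick": (0.0, 0.0)}

def send_to_edge(event: dict) -> bytes:
    """Stand-in for the network round trip: the edge server advances
    the simulation, renders, and returns an encoded frame."""
    return b"<encoded video frame>"

def display(frame: bytes) -> None:
    """Decode and present the frame (stubbed out here)."""

for _ in range(3):                             # a few ticks of the client loop
    frame = send_to_edge(read_local_input())   # compute happens remotely
    display(frame)                             # the device is just an I/O surface
```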

Edge computing could also improve virtual worlds as experienced by people who live relatively close to one another. It’s increasingly typical, for example, for teens and college students who attend the same school or live in the same neighborhood to socialize with each other in Fortnite or other virtual worlds -- yet many still endure lag because the game server they’re connected to is on the other side of the country or even the world.

According to Purbrick, applications on the edge can “run server processes simulating regions of the virtual world close to the users in that region. Where the real world and virtual world distribution of people are [geographically close], this can significantly reduce the roundtrip latency between clients and the server simulating their part of the world and so reduce perceived lag.”
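Here is a minimal sketch of that placement logic, assuming the operator can choose among a handful of edge sites: host each region’s simulation at the site nearest its users. The site names and coordinates are invented for illustration:

```python
# Illustrative placement logic: host each simulated region on the
# edge site closest to its users. Site names and coordinates are
# invented for the example.
from math import radians, sin, cos, asin, sqrt

def haversine_km(a: tuple, b: tuple) -> float:
    """Great-circle distance between two (lat, lon) points, in km."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

edge_sites = {
    "seattle": (47.6, -122.3),
    "new-york": (40.7, -74.0),
    "frankfurt": (50.1, 8.7),
}

def place_region(user_locations: list) -> str:
    """Pick the edge site nearest the centroid of a region's users,
    minimizing client-server round-trip latency for most of them."""
    lat = sum(p[0] for p in user_locations) / len(user_locations)
    lon = sum(p[1] for p in user_locations) / len(user_locations)
    return min(edge_sites, key=lambda s: haversine_km((lat, lon), edge_sites[s]))

# A squad of Seattle-area players gets a Seattle-area server.
print(place_region([(47.6, -122.3), (47.5, -122.2), (47.7, -122.4)]))  # -> seattle
```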

Edge computing and new ways of seeing

Many XR experts believe edge computing could finally enable a long sought-after form factor that’s an integral part of the metaverse vision -- that is, a powerful headset as lightweight and easy to wear as a pair of sunglasses.

In theory, edge computing could mitigate the latency issues associated with moving large amounts of data back and forth from the cloud, particularly the high-quality 3D graphics and other assets that low-profile form factors like glasses would depend on.

“Since HMD [head-mounted display] and other XR devices essentially become simple data input/output and rendering devices,” Theta’s Mitch Liu says, “their form factor can be vastly reduced.”

Infinite Retina’s Irena Cronin sees edge computing as a necessary condition for next-gen XR headsets. “When we’re talking about visuals especially,” she tells me, “without having these kinds of features, and the massive amounts of data that are going to be used having to do with spatial computing, it’s a non-starter.”

For Philip Rosedale -- founder of virtual world startups Linden Lab and High Fidelity -- achieving low latency is just as important as the display. “If 5G and appropriate edge devices can drop the one-way latency to less than 50 milliseconds,” he said, “I feel confident that compelling XR experiences can and will be streamed.”

Fifty milliseconds is Rosedale’s estimate of how much latency a person will tolerate for a 3D image delivered to each eye from an edge server before the stream needs to be corrected locally to mask the lag.
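A back-of-the-envelope budget shows why physical proximity matters so much for that 50 millisecond figure. The component numbers below are rough assumptions for illustration, not measurements from any real deployment:

```python
# Back-of-the-envelope latency budget for Rosedale's 50 ms one-way
# target. The component figures below are rough assumptions for
# illustration, not measurements.

LIGHT_IN_FIBER_KM_PER_MS = 200  # ~2/3 of c; a common rule of thumb

def one_way_latency_ms(distance_km: float,
                       radio_ms: float = 10.0,    # assumed 5G air link
                       render_ms: float = 15.0,   # assumed server render/encode
                       decode_ms: float = 5.0) -> float:
    """Propagation + radio + render + decode, one way."""
    return distance_km / LIGHT_IN_FIBER_KM_PER_MS + radio_ms + render_ms + decode_ms

# A distant cloud region ~4,000 km away eats the whole budget...
print(one_way_latency_ms(4000))   # 50.0 ms -- right at the limit
# ...while an edge site ~50 km away leaves ample headroom.
print(one_way_latency_ms(50))     # 30.25 ms
```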

Beyond the sunglasses form factor, many point to an even thinner HMD -- smart contact lenses.

“Imagine,” as Liu says, “contact lenses that can fully immerse users in 3D digital virtual worlds or implantable devices that can interact directly with human sensory systems.”

Mojo Vision is doing exactly that. The startup’s initial applications for this technology include navigation/travel, shopping, and productivity-related applications.

“We are building a new category, Invisible Computing, which emphasizes giving people access to the information they want, yet allowing them to stay engaged in the real world around them all while letting them continue to look like themselves,” Mojo Vision’s Steve Sinclair tells me. “The combination of low latency, reliable wireless connectivity with edge computing will allow us to build solutions that understand a person’s context and react in near real-time with assistive information at just the right moment.”

Not everyone is bullish on the idea that edge computing will be a panacea for bulky HMDs. Jeri Ellsworth, CEO of AR startup Tilt Five, says edge computing is a necessary but insufficient condition for lighter headsets.

“There seems to be the misconception that edge computing will mostly eliminate the need for complex compute on the XR headsets and will be the quick gateway to thin and stylish immersive devices in just a few years,” she says. “Headsets will always need an in-device CPU for high-speed tracking systems that can understand the world around the user to perform local latency correction and fill in for variable transmission latency or temporary loss of connection.”
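The local correction Ellsworth describes can be as simple as the headset extrapolating the last authoritative state while it waits on the network. Here is a deliberately simplified dead-reckoning sketch; real tracking systems are far more sophisticated, and everything here is illustrative:

```python
# Simplified "local latency correction": while waiting on the next
# authoritative update from the network, the headset extrapolates
# motion from its own tracking data (basic dead reckoning -- real
# systems are far more sophisticated).

def predict_position(last_pos, velocity, dt_s):
    """Extrapolate a tracked position dt_s seconds past the last
    authoritative server update."""
    return tuple(p + v * dt_s for p, v in zip(last_pos, velocity))

# The last update arrived 80 ms ago; the tracked object was moving
# at 1.5 m/s along x. Fill the gap locally instead of freezing.
print(predict_position((0.0, 1.7, 0.0), (1.5, 0.0, 0.0), 0.08))
# -> roughly (0.12, 1.7, 0.0)
```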

While she believes edge computing won’t necessarily lead us straight to the sunglasses form factor, she says it can still provide other advantages to XR devices. “I do see a big benefit for improving the fidelity of the experience by moving high compute aspects like rendering and game/application logic into the cloud,” Ellsworth said.

Remaking the Game: Resource Sharing & Live Voice

Edge computing opens up new avenues of opportunity that may help us reconsider what a metaverse can be. Consider the following use cases.

Live Voice

The pandemic has accelerated our desire for personal forms of social interaction online, but delivering live voice in a 3D setting remains a challenge.

“In large virtual crowds, sending individual audio streams from every user to every other user quickly exhausts the available bandwidth to clients and the processing requirements to mix the streams on mobile devices becomes prohibitive,” Purbrick explains.

It’s for this reason that users of massive game worlds often speak with each other on third-party VoIP services. Edge computing, however, could create a greater sense of audio immersion and presence among users.

“Audio servers on the edge could be used to combine ambisonic streams from the server with individual streams from nearby users to provide higher quality spatialized audio, or participate in a Distributed Partial Mixing network to optimally trade off quality, network and processor bandwidth,” says Purbrick.

The practical upshot would be that virtual worlds could truly convey the vocal presence of hundreds of people all around you from across the globe, with enough spatialized audio quality to still pick out the voice of a personal friend.
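In the spirit of Purbrick’s description, here is a toy Python version of that trade-off: each listener receives the few nearest voices as distinct, distance-attenuated streams, while everyone farther away is folded into one quiet ambient bed. The gains, the cutoff k, and the one-dimensional positions are all arbitrary simplifications:

```python
# Toy edge audio mix: each listener hears the k nearest voices as
# distinct, distance-attenuated streams; everyone farther away is
# folded into one quiet ambient bed.

def mix_for_listener(listener_pos, speakers, k=2):
    """speakers maps name -> (position, list of audio samples).
    Returns one mixed sample buffer for this listener."""
    ranked = sorted(speakers.items(), key=lambda s: abs(s[1][0] - listener_pos))
    near, far = ranked[:k], ranked[k:]
    out = [0.0] * len(next(iter(speakers.values()))[1])
    for _, (pos, samples) in near:   # nearby voices: individually attenuated
        gain = 1.0 / (1.0 + abs(pos - listener_pos))
        out = [o + gain * s for o, s in zip(out, samples)]
    for _, (_, samples) in far:      # distant crowd: flat ambient gain
        out = [o + 0.05 * s for o, s in zip(out, samples)]
    return out

voices = {"ann": (1.0, [0.2, 0.2]), "bo": (2.0, [0.1, 0.1]), "cy": (50.0, [0.5, 0.5])}
print(mix_for_listener(0.0, voices))  # ann and bo come through; cy is faint
```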

Indeed, Philip Rosedale is doing just this with High Fidelity, a kind of mixed-reality virtual world where the voices of its many users are simulated within a shared 3D environment. To create the sense of live, immersive audio -- the sensation that you are in the same location as hundreds or even thousands of people around the world -- Rosedale turned to edge computing:

“[W]e deploy audio servers that are like cell towers into the cloud,” he explains. “Each one of them handles the audio for 100 or so end users that are near each other in the virtual world.” High Fidelity users already employ this technology to host virtual cocktail parties, conference sessions, and beyond.
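Rosedale’s quote gives a capacity figure but not a partitioning scheme; the sketch below is one invented way users who are near each other in the virtual world might be grouped onto ~100-user audio servers:

```python
# Toy partitioning: users who are near each other in the virtual
# world share an audio server, capped at ~100 users apiece. The
# capacity figure comes from the quote; the bucketing is invented.

SERVER_CAPACITY = 100

def assign_audio_servers(user_positions):
    """Greedy bucketing: sort users so virtual-world neighbors are
    adjacent, then cut the ordering into capacity-sized groups."""
    order = sorted(range(len(user_positions)), key=lambda i: user_positions[i])
    return [order[i:i + SERVER_CAPACITY]
            for i in range(0, len(order), SERVER_CAPACITY)]

# 250 users along a virtual street -> 3 audio servers.
positions = [(x * 0.5, 0.0) for x in range(250)]
print([len(g) for g in assign_audio_servers(positions)])  # [100, 100, 50]
```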

Resource sharing

Edge computing opens up the opportunity -- first modeled by SETI@home and more recently hinted at by the fictional Pied Piper network of HBO’s Silicon Valley -- of building a network of personal devices that share available resources with each other and reward users for that computational altruism.

“Our vision at Theta is to provide the infrastructure and reward mechanism for millions of users to form a vast edge network, peer-to-peer, enabling next generation streaming and computing applications,” Theta Labs’ Liu tells me. “Theta network users that share 2-5 sec video segments with each other from their PC, mobile, TV, or IOT devices, receive $0.0001 micropayment powered by blockchain technology. We believe this benefits all stakeholders and builds the foundation for a vast network that can power edge computing applications.”
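Taking only the figures Liu quotes -- $0.0001 per shared segment, segments of 2-5 seconds -- a quick back-of-the-envelope suggests what continuous sharing might earn. The midpoint segment length is our own assumption:

```python
# Back-of-the-envelope using only the figures in Liu's quote:
# $0.0001 per shared video segment, segments of 2-5 seconds.
# The midpoint segment length is our own assumption.

PAYOUT_PER_SEGMENT_USD = 0.0001
SEGMENT_SECONDS = 3.5               # midpoint of the quoted 2-5 s range

def hourly_payout_usd(hours=1.0):
    """Earnings from relaying segments back-to-back for `hours`."""
    segments = hours * 3600 / SEGMENT_SECONDS
    return segments * PAYOUT_PER_SEGMENT_USD

print(f"${hourly_payout_usd():.3f}/hour")   # ~ $0.103/hour
```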

The Theta network already has nearly 2,000 nodes worldwide and plans to increase that distribution by 10-100x in the next few years.

Barriers to the edge

Notwithstanding its exciting potential, the industry should remain cautious about the speed and direction of mass adoption in edge computing, and the downstream effects it might have in enabling XR applications. The technology still faces a number of challenges.

The stickiness of end-user streaming

While Google Stadia and other services have finally made cloud-based streaming of AAA games available, core gamers have been slow to adopt these offerings. Slow adoption is partly due to platform lock-in, since most gamers already own consoles or high-end PCs that obviate the need for streaming, but also due to technical limitations.

Streaming services usually contend with the “stickiness” of input delay -- i.e., the split second between a user’s input and when it’s registered by a game -- which can often be a matter of virtual life and death in a real-time twitch game.

Edge computing could become the last-mile solution to this stickiness, but the improvement may not be sufficiently widespread.

“[T]he main problem is really all about the cost of scale,” Tadhg Kelly, VP of Partnerships at VR spectator streaming service LIV, says. “5G has a range issue (basically more throughput needs higher frequencies, which have less range). That implies a need for many more towers. That means it’s more likely an urban-only solution.”

Not only does this translate into high installation costs, it also demands, as Kelly puts it, “the social will to allow the appropriation of that much space to satisfy range needs.”

The content conundrum

Beyond the social dimension and implementation, edge computing remains a technology whose uses are not immediately apparent to most consumers. As with countless platforms before it, edge computing needs a killer app to drive adoption.

“[T]here’s the chicken-egg issue where the metaverse is not useful until everyone’s using it (because of Metcalfe’s Law) but getting everyone into it in the first place is tough unless they can see the value,” Kelly observes. “Arguably simple adoption (i.e. phones) will just sort of happen because people naturally upgrade their phones. But that doesn’t mean they suddenly become active 5G users, in the same vein as headset-wearing 5G. A slightly faster load time for their Pokemon game is not a step change.”

Philip Rosedale shares some of these concerns, but adds, “Edge deployments -- especially around 5G -- are going to be really spotty at first. So they need to be transparent acceleration layers to the developer. The edge, for the developer, needs to look exactly like using a normal Linux-based cloud computer, and at a competitive price to today's cloud providers.”

Edging toward a metaverse we haven’t yet imagined

Beyond these concerns, the industry should be humble about how its vision of a metaverse will fare once consumers -- few of whom will have read Snow Crash -- begin arriving en masse.

After many years on retail shelves, VR headset sales have yet to achieve anything like mass growth, and evidence that edge computing will help make headsets less bulky is scant. And the Minecraft generation may not want its simple enjoyment of multiplayer gaming subsumed by a platform that makes everything part of some 3D mirror world.

Indeed, if we accept that Minecraft, Fortnite, and Roblox are early steps to the metaverse, their usage patterns may come to define those experiences in the future. That could mean very little usage of XR headsets, as the vast majority of consumers continue to connect via mobile, PC, or game consoles. In this scenario, edge computing’s primary role would not be around improving HMDs, but enabling vastly improved cross-platform experiences so users can better enjoy virtual spaces with greater fidelity and far less latency.

None of this is to suggest the industry must discard its goals -- just that we may want to embrace a broader definition of the metaverse that’s amenable to how consumers may come to use their technology.

“Anytime you take reality in front of you and you portray it in a different way and you can share it with others,” Infinite Retina’s Irena Cronin says, “that, to me, is the metaverse.”