Alex McRoberts – Blog

Follow me on Mastodon: @alexmcroberts@mastodon.social

This is the first post in a series on Apple and Augmented Reality:

The reality of Apple and AR

Apple's work on Augmented Reality and Virtual Reality has been staring us in the face for quite a while. With the rumour mill working overtime on Apple's much-anticipated headset, I thought it would be worthwhile to look at what's already in Apple's ecosystem, and how each piece might come together in an Apple-designed headset.

For context, I worked for a startup, Recon Instruments, in the early-to-mid 2010s on a form of Augmented Reality: ski goggles and sunglasses with a built-in display. Recon Instruments was then acquired by Intel, where we started working on some other interesting projects.

I'll cover a few components as they exist today in Apple's ecosystem, and how each technology might be transformed and applied in an Augmented Reality or Virtual Reality headset.


ARKit

Starting with the obvious. Apple has not only written ARKit, they've also provided an entire section in their famous Human Interface Guidelines. I'm going to cherry-pick some of the highlights here: guidelines that work on iPhone today, and that can easily be transformed to work on a headset.

One thing strikes me when reading the guidelines: the Human Interface Guidelines never suggest that ARKit is limited to iPhone.

This section will only cover the highlights of the Human Interface Guidelines. If this post seems interesting enough to folks, I'll turn it into a short series of posts covering the Human Interface Guidelines in more detail, as well as a deeper dive into the ARKit documentation itself.

Below, I've copied some of the key guidelines from the Human Interface Guidelines. Later in the series, I'll dig into these a bit deeper.

Looking at those guidelines, it's easy to see where Augmented Reality is heading.

Create a realistic experience, including handling reflective surfaces. This is going to be crucial to making sure the experience feels polished, and not gimmicky, when people first try on the headset. Immersive audio (more on this later in the post) and encouraging developers to reduce clutter also add to the level of polish.
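As a hint of how far along this already is, ARKit ships an environment texturing option today that gives virtual objects realistic reflections. A minimal sketch:

```swift
import ARKit

// Minimal sketch: ARKit's automatic environment texturing captures the
// surroundings so virtual objects can pick up realistic reflections –
// the "handle reflective surfaces" guideline in practice.
let configuration = ARWorldTrackingConfiguration()
configuration.environmentTexturing = .automatic

ARSession().run(configuration)
```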

Consider where people will use the app – and consider their comfort and safety. Interactive experiences on iPhone are typically short-lived. A fun game catching virtual creatures, or an app to test what new furniture would look like in a room. Not much to worry about there for comfort. Safety concerns are real, though – imagine if Pikachu were in the middle of a deep puddle, or worse, the middle of a road; a person might ignore the real-world environment to catch 'em all. Comfort and safety while wearing a headset will be even more important!

Coaching is a vital component of making people familiar with a headset experience. Typically, there are a lot of hidden controls in Augmented Reality. It's easy to equate this experience to using Siri: it's often difficult to remember the exact phrase needed when asking Siri a question. There's no menu of prompts. A person often has to remember the magic incantation to make it work.
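ARKit already tackles this on iPhone with a standard coaching overlay, which is exactly the kind of guidance a headset would need. A minimal sketch:

```swift
import ARKit

// Sketch: ARCoachingOverlayView is ARKit's built-in coaching UI. It walks
// people through the hidden setup steps (move the device, find a surface)
// so they don't have to remember a magic incantation.
let arView = ARSCNView(frame: .zero)   // the app's AR host view

let coachingOverlay = ARCoachingOverlayView()
coachingOverlay.session = arView.session          // reuse the view's ARSession
coachingOverlay.goal = .horizontalPlane           // what the app needs before it can start
coachingOverlay.activatesAutomatically = true     // appears whenever tracking is limited
arView.addSubview(coachingOverlay)
```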

The last two points are quite noteworthy – designing multiuser experiences, and handling people occlusion. When combined with the reference to people using the app in a variety of environments, it's clear that Apple sees small groups of people wearing their headsets and collaborating together. In Apple's own ARKit documentation, they state:

Collaborative sessions work best with up to four participants.

https://developer.apple.com/documentation/arkit/arworldtrackingconfiguration/3152987-iscollaborationenabled
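That flag is a real, shipping API. Here's a minimal sketch of turning it on and handling the data ARKit asks the app to share – note that the networking transport (Multipeer Connectivity or otherwise) is left entirely to the app:

```swift
import ARKit
import UIKit

// Minimal sketch: enabling ARKit's collaborative sessions. ARKit hands the
// app blobs of collaboration data; shipping them to the other participants
// is the app's job.
class CollaborativeARController: NSObject, ARSessionDelegate {
    let session = ARSession()

    func start() {
        session.delegate = self
        let configuration = ARWorldTrackingConfiguration()
        configuration.isCollaborationEnabled = true   // the flag from Apple's docs
        session.run(configuration)
    }

    // ARKit calls this whenever there's data the other participants need.
    func session(_ session: ARSession, didOutputCollaborationData data: ARSession.CollaborationData) {
        let encoded = try? NSKeyedArchiver.archivedData(withRootObject: data,
                                                        requiringSecureCoding: true)
        // send `encoded` to up to three peers over the app's own transport
        _ = encoded
    }
}
```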

Wi-Fi display powered by iPhone

AirPlay lets you share videos, photos, music and more from Apple devices to your Apple TV, favourite speakers and popular smart TVs.

– apple.com, AirPlay

Let's change that sentence just a bit.

We know that displays require a lot of battery power. iPhone could determine what to display on the headset and transmit it using AirPlay. This is tricky, though. iPhone would likely need to be aware of the environment, which would require two-way communication to send a user's orientation – where they're looking and how their head is tilted. It would also mean an iPhone is required to operate the headset. It's unlikely that Apple would want to tether a headset to iPhone, albeit possible for a 1st-generation product.

This lines up well with the idea that the headset will include a waist-mounted battery pack.

AirPlay lets you share videos, photos, music and more from Apple devices to your Apple TV, Apple Headset, favourite speakers and popular smart TVs.

– me
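Interestingly, the orientation half of that two-way channel already exists in Apple's ecosystem: CoreMotion reports head pose from supported AirPods today. A sketch of reading it – a headset streaming pose back to iPhone could produce the same shape of data:

```swift
import CoreMotion

// Sketch: CMHeadphoneMotionManager already reports head pose from supported
// AirPods. A headset feeding orientation back to iPhone could look similar.
let headTracker = CMHeadphoneMotionManager()

if headTracker.isDeviceMotionAvailable {
    headTracker.startDeviceMotionUpdates(to: .main) { motion, _ in
        guard let attitude = motion?.attitude else { return }
        // Pitch, yaw, and roll describe where the wearer is looking and how
        // their head is tilted – exactly what a phone-driven display needs.
        print("pitch \(attitude.pitch), yaw \(attitude.yaw), roll \(attitude.roll)")
    }
}
```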

Face ID sensors for eye tracking

Face ID on iOS already has settings for Attention built in. It's possible to set iOS to Require Attention for Face ID - to verify that you're looking at iPhone before it authenticates your face.

Attention Aware Features is even more interesting to me. iPhone will check for attention before dimming the display, expanding a notification when locked, or lowering the volume of alerts.

A quick test here showed me that Face ID works at a distance of about 15 centimetres (6 inches) from my face. Assuming Face ID can be tuned to function closer to the face, it's possible that it could run inside a headset to determine not only whether you're paying attention, but where your eyes are focused. This would allow the headset to track your eyes, opening up new possibilities for inputs and events to drive interactions with the headset and the apps running on it.
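In fact, ARKit's face tracking, which runs on the same TrueDepth hardware as Face ID, already exposes a gaze estimate. A minimal sketch:

```swift
import ARKit

// Sketch: ARKit face tracking uses the TrueDepth (Face ID) camera and
// already estimates gaze via ARFaceAnchor.lookAtPoint.
class GazeTracker: NSObject, ARSessionDelegate {
    let session = ARSession()

    func start() {
        guard ARFaceTrackingConfiguration.isSupported else { return }
        session.delegate = self
        session.run(ARFaceTrackingConfiguration())
    }

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let face as ARFaceAnchor in anchors {
            // lookAtPoint is the estimated point the eyes are focused on,
            // in face coordinate space – the raw input for gaze-driven UI.
            print("gaze target: \(face.lookAtPoint)")
        }
    }
}
```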


LiDAR & Camera for inside-out tracking

In March 2020, Apple launched iPad Pro with a rear-facing LiDAR sensor. With a range of up to 5 metres (~16 feet), this provides enough distance for someone wearing a Virtual Reality headset to stay aware of their surroundings – "Watch out for the chair!"

Results showed that with Modelar's laser scanner application, absolute accuracies of ± 3 cm horizontally and ± 7mm vertically is achieved, while also achieving a relative accuracy of ± 3 cm

– Payton Chase et al, Department of Geodesy and Geomatics Engineering, University of New Brunswick

According to independent tests, Apple's LiDAR sensor has fairly high resolution and accuracy. That lets the wearer move with confidence in their space, for comfort and safety. It also provides the opportunity to address the guideline about surface reflection and occlusion. Of course, the camera could also be used in parallel with the LiDAR sensor to present the real world inside a Virtual Reality headset. No surprises there.
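Both capabilities are exposed in ARKit today on LiDAR-equipped devices. A minimal sketch enabling scene reconstruction and people occlusion:

```swift
import ARKit

// Sketch: on LiDAR-equipped devices, ARKit can reconstruct a mesh of the
// surroundings ("watch out for the chair!") and occlude virtual content
// behind real people using the same depth data.
let configuration = ARWorldTrackingConfiguration()

if ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) {
    configuration.sceneReconstruction = .mesh
}
if ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) {
    configuration.frameSemantics.insert(.personSegmentationWithDepth)
}

ARSession().run(configuration)
```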


Spatial Audio

September 2020 saw Apple launch Spatial Audio. Also known as virtual surround sound, it provides the ability to hear music all around you. Rather than splitting the audio into left and right, as we're used to with stereo sound, Spatial Audio presents the audio in a format that makes the person wearing headphones feel as though they're in the middle of the recording. By turning around on the spot, the audio is perceived to move around the listener.

Virtual Reality headsets do the same with displays today. As the person wearing the headset turns left or right, the display changes to show what is in front of the person, in the direction they are looking.

Spatial Audio achieves the same effect with audio. Now imagine wearing a Virtual Reality headset and perceiving the audio changing to reflect the direction you're looking in.
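Apple's existing positional-audio API, AVAudioEnvironmentNode, already does this on a small scale. A sketch, where the source position and the 90° of head yaw are illustrative values:

```swift
import AVFoundation

// Sketch: AVAudioEnvironmentNode is Apple's shipping positional-audio API.
// Pin a (mono) source in space, then rotate the listener as the head turns.
let engine = AVAudioEngine()
let environment = AVAudioEnvironmentNode()
let player = AVAudioPlayerNode()

engine.attach(environment)
engine.attach(player)

// Mono sources are the ones the environment node can place in 3D space.
let mono = AVAudioFormat(standardFormatWithSampleRate: 44_100, channels: 1)
engine.connect(player, to: environment, format: mono)
engine.connect(environment, to: engine.mainMixerNode, format: nil)

player.position = AVAudio3DPoint(x: 0, y: 0, z: -2)  // 2 m in front of the listener

// As the headset reports head yaw, update the listener (90° is illustrative).
environment.listenerAngularOrientation =
    AVAudio3DAngularOrientation(yaw: 90, pitch: 0, roll: 0)
```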

Immersive audio is so important to Apple that they called it out in their Human Interface Guidelines. I'm willing to bet the headset will have built-in speakers, and will also work with AirPods that support Spatial Audio.


HomePod Mic Arrays

The 1st-generation HomePod contained a 6-mic array, the HomePod mini contains a 3-mic array, and the recently released 2nd-generation HomePod (version 1.5, according to the kind folks at ATP.fm) contains a 4-mic array.

Was the number of mics reduced due to cost, or due to improved software technology meaning fewer mics are required? If we assume the latter, that's a big win for a headset – a gadget where every gram of weight truly matters.

Mic array technology will be extremely useful for showing a user where audio is coming from. I can imagine an experience where the mic array localizes sounds in the real world, and, with that information exposed through an SDK, an app could visually point out where audio is happening in a person's current environment.
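To be clear, no such SDK exists today – this is pure speculation on my part. A hypothetical sketch of what that surface might look like:

```swift
import simd

// Hypothetical sketch – Apple ships no such SDK today. If mic-array
// direction data were ever exposed, an AR app might consume it like this:
struct DetectedSound {
    let direction: simd_float3   // unit vector toward the sound source
    let confidence: Float        // how sure the array is about the bearing
}

protocol MicArrayListening {
    // Called whenever the (hypothetical) mic array localizes a sound.
    func micArray(didDetect sound: DetectedSound)
}

// A conforming app could drop a marker into the AR scene along
// `sound.direction` – "the doorbell rang over there".
```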


HomePod Room Sensing Technology

If you're wearing a Virtual Reality headset, there's a good chance you'll be standing in a room. Likewise, with an Augmented Reality headset, how audio is presented to a user will depend on where they're located in the real world. Are they standing against a wall, in the middle of a mall, or in a showroom? Audio treatment can use this technology to deliver an immersive experience suited to the space.

With room sensing technology, HomePod recognizes sound reflections from nearby surfaces to determine if it is against a wall or freestanding, and then adapts sound in real time. Precise directional control of its beamforming array of five tweeters separates and beams direct and ambient audio, immersing listeners in crystal-clear vocals and rich instrumentation.

– apple.com, Press Release, January 18, 2023

Room sensing technology that can recognize sound reflections and adapt sound in real time will lead to higher-quality audio experiences in a headset. Suddenly you're not just wearing a headset talking to somebody else wearing a headset; the sound quality just went way up.


AirTags and Ultra-wideband

Ultra-wideband, UWB, ultra wideband, ultra-wide band, ultraband. However you spell it, this technology is ideal for precision location – operating with accuracy as good as 5-10 centimetres (2-4 inches).

A stretch, but this could pair well with LiDAR to detect handheld controllers for a headset. I doubt the AirTag form factor would remain, and it's unlikely to morph into a VR Power Glove. Perhaps Hermès will manufacture some high-end gloves instead 😬.

In reality, it's possible that Ultra-wideband could be used to highlight objects to find or interact with. If you've ever used the Find My app, you can see how that experience could translate onto an Augmented Reality headset.

In practice, UWB signals are able to effectively measure distance between two devices with 5- to 10-cm accuracy, compared to roughly 5-m accuracy for Wi-Fi and Bluetooth. When implemented in a system of fixed beacons tracking tag locations, the locations can be calculated to within 10-cm accuracy.

– electronicdesign.com, What’s The Difference Between Measuring Location By UWB, Wi-Fi, and Bluetooth?
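The public framework behind that UWB precision is NearbyInteraction. A minimal sketch of ranging against a peer device – exchanging the discovery token beforehand is left to the app's own transport:

```swift
import NearbyInteraction
import simd

// Sketch: NearbyInteraction exposes UWB ranging between U1-equipped
// devices. The peer's NIDiscoveryToken must arrive over the app's own
// channel first (that plumbing is omitted here).
class UWBRanger: NSObject, NISessionDelegate {
    let session = NISession()

    func start(with peerToken: NIDiscoveryToken) {
        session.delegate = self
        session.run(NINearbyPeerConfiguration(peerToken: peerToken))
    }

    func session(_ session: NISession, didUpdate nearbyObjects: [NINearbyObject]) {
        for object in nearbyObjects {
            // Distance in metres plus a direction vector – the raw
            // ingredients of a Find My-style AR overlay.
            print("distance:", object.distance ?? .nan,
                  "direction:", object.direction ?? simd_float3())
        }
    }
}
```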

Apple TV as a Hub

File this under a thought based on a rumour. MacRumors shared an interesting take from Jon Prosser in October 2020: that the HomePod mini and Apple TV would act as Ultra-wideband base stations.

If the headset has inside-out tracking, perhaps there are some accuracy gains to be had by adding outside-in tracking, using an Apple TV as a point of reference.

I think this is unlikely. While Apple Watch without cellular requires an iPhone to operate, I don't see Apple applying the same level of demand here.

HomePod mini & the new Apple TV will both act as UWB base-stations 🧐

  • Will precisely track your location as you walk inside house with other U1 devices.
  • Use info for media controls, brightness/volume control, & door locks.

Turns regular hardware into HomeKit hardware.

– "not jon prosser"

Summary

I think it's clear that Apple's journey into Augmented Reality and/or Virtual Reality has been going on for quite some time. I'm excited to see what the final product looks like when it's unveiled. Some of my ideas above might be far-fetched, but I still think it's likely that some of them will happen.

Let me know if you think this post was interesting. I'm @alexmcroberts@mastodon.social