Sunday, November 17, 2013

What is Glass?

Note: The last few paragraphs were scratched and replaced to discuss the new GDK on 4 December 2013

When Steve Jobs introduced the iPhone, he called it a telephone, an iPod, and an internet communicator. One of the most common questions I get when I wear Google Glass is "What is it?" I wish I had an answer as concise as Steve's.

Glass isn't a phone, or an iPod. I think it qualifies as an internet communicator. But more than that, I claim that it is a tool for simplifying and speeding up the common interactions you might have with a smartphone or computer. Perhaps it will inspire computer interactions that don't yet exist. Like most devices with Apps, what it does depends a lot on the software you use.

The Pieces

If you tear open a Glass device like these folks, you will find a camera, a display, a bone conduction speaker, a touchpad, a shutter button, an accelerometer / gyroscope, Wi-Fi and Bluetooth transceivers, and a CPU. It comes with a snap-in polarized sun shield too.

The display technology works by projecting an image into the prism which sits above the right eye. The images it creates are translucent; you can see right through them. The positioning of the display above the eye -- not in front of it -- means that you aren't trying to peer through it to see the world. The prism appears to have a photosensitive film on the side away from your eye to create a darker background for the display in bright light.

The bone conduction speaker is a tiny pill-shaped apparatus that touches the head near the right ear. It looks temptingly like a button. One of the curious properties of the bone conduction speaker is that in loud environments you can hear it better by plugging your ears. It also tickles just a little bit when it makes sound.

Using Glass

To begin interacting with Glass, you need to either tap the touchpad near your temple or perform the Glass head flip. The head flip, a configurable behavior, involves tilting your head up until the screen activates. With both the tap and the flip, the display shows the home screen and begins listening for the magic words: “OK Glass”.

If you say “OK Glass”, you can verbally select from a menu of commands: send a message, take a picture, record a video, get directions, make a call, start a video hangout, or take a note. Speak and Glass obeys. Some of the commands appear depending on what services you have connected, or how your phone is configured. The “Take a Note” command, for instance, can be handled by the Evernote Glass app, and it lets you dictate a note.

Using the touchpad, the same options are available, and you can also navigate the card-based interface of Glass. To the left of the home screen there is a collection of cards mostly related to the Google Now service. You might find a card with upcoming events on your Google Calendar, a card for the local weather, a card for stock prices, or a card offering driving times and directions to recently searched destinations or to calendar events. These cards and their contents are contextually sensitive, just like the Google Now cards on an Android device or in the Google Search app on iOS. They appear and vanish depending on what Google thinks is most useful. The card furthest to the left is the settings card, which shows the battery charge and allows the user to configure Glass.

To the right of the home screen, there is a row of cards in reverse-chronological order, starting with the most recent. These are cards which come from your own interactions with the device, from communications, or from third-party services. If you take a photo, you’ll find a card for it in the timeline. Did you search Google? You'll find a card for that. Text messages and emails too. The interface feels like a long strip of film that you can click through one frame at a time, like an old-fashioned slide show.

Typically, each card responds to a tap on the touchpad. Depending on the card, it will either show a menu or another collection of cards that were metaphorically stacked. Text messages are a good example of stacked cards. In the timeline, only the most recent text is visible. Tapping on that message reveals a card for each message in the conversation. Tapping on any one of those cards offers a menu: reply, read aloud, call, delete.

You can see in the image above a diagram of the Glass interface stitched together from actual Glass screenshots. The home screen is where you usually start an interaction, and is one of the indications that Glass is ready for a verbal command. If you swipe forward on the touchpad, the next thing you would see is a text message, followed by a photo taken with Glass, and finally a search for "will it rain today."

The triangular dog-ear in the top right corner of the message card is a clue that this is a stack of cards. If you tap on the touchpad while the text message is visible, it will dive into that stack of cards: a list of messages in that conversation. Follow the yellow arrows above. If you tap on any of those message cards, you are offered a menu.

The vertical stacking of the interface is a useful way to think about the UI. Swiping down on the touchpad will return the user to the next level up. From any of the menu cards, you can swipe back to the messages list, and from there you can swipe back to the top level timeline. From there, an additional swipe down will turn off the screen.
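The vertical stacking described above can be modeled as a simple stack of levels. The following is a minimal sketch in Python, not actual Glass code: the class name GlassNavigator and the level names are hypothetical, chosen only to illustrate how a tap dives down a level and a swipe down climbs back up.

```python
from dataclasses import dataclass, field


@dataclass
class GlassNavigator:
    """Toy model of the vertical card stacking (hypothetical, not Glass code)."""

    # Each entry names a level the user has tapped into; the timeline is the root.
    levels: list = field(default_factory=lambda: ["timeline"])
    screen_on: bool = True

    def tap(self, target: str) -> None:
        """Tapping dives one level down, e.g. into a stack of cards or a menu."""
        if self.screen_on:
            self.levels.append(target)

    def swipe_down(self) -> None:
        """Swiping down returns to the next level up.

        At the top-level timeline there is nowhere left to go,
        so the swipe turns the screen off instead.
        """
        if len(self.levels) > 1:
            self.levels.pop()
        else:
            self.screen_on = False


nav = GlassNavigator()
nav.tap("message stack")  # tap the text message card in the timeline
nav.tap("message menu")   # tap one message to get reply / read aloud / ...
nav.swipe_down()          # back to the list of messages
nav.swipe_down()          # back to the top-level timeline
nav.swipe_down()          # one more swipe turns off the screen
```

Modeling the levels as a list keeps the "swipe down means go up one level" rule in a single place, which is roughly how the interface feels in use.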

Incoming Communications

If you get a new message or interaction from an App, Glass will chime. If you respond immediately by tapping or performing a head flip, you will be shown the card associated with the notification. Depending on the notification, there will often be an “OK Glass” cue on the card indicating that you can address the notification verbally. When I get a new email or text message, I can say “OK Glass, read aloud”. Glass will then read the contents of the message to me. When it’s finished, I can say “OK Glass, reply,” and then dictate a response.

One useful UX trick in Glass is that it displays the text you dictate for a few seconds before performing an action. This gives you an opportunity to cancel a message in case there was a transcription error.

Photo and Video

In addition to the audio commands, you can take photos or a video by using a physical button on top of Glass. Photos and videos can be explicitly shared or pulled off Glass using USB. In addition, once Glass is on a Wi-Fi network and has a decent battery charge, it will automatically upload the media to a private Google+ album. You can choose to share or download the images from there.

The camera Glass has (at least in the first-generation Explorer Edition I have) is a wide-angle fixed-focus camera. I don’t really think that the camera compares to what you would find in an iPhone 5. However, Glass uses some computational photography techniques to create better photos than what the hardware normally would produce.


I've only used Glass for walking navigation, but it works really well. Walking navigation seems like a killer app to me. The map is continuously projected in the display, and is oriented in real time with your head motion. Since the map spins so that it orients where your head is aimed, there is no need to look at street signs. Just line up the arrow with the path and walk.
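To make the "map spins with your head" idea concrete, here is a small sketch of a heading-up rotation in Python. This is only an illustration of the geometry, not how Glass actually implements it; the function names, the east/north offsets in meters, and the compass heading input are all assumptions.

```python
import math


def to_heads_up(east: float, north: float, heading_deg: float) -> tuple:
    """Rotate a map offset so that the wearer's heading points 'up' on screen.

    east, north: offset of a map point from the wearer (assumed meters).
    heading_deg: direction the head is aimed, in degrees clockwise from north.
    Returns (x, y), where +y is straight ahead and +x is to the right.
    """
    h = math.radians(heading_deg)
    x = east * math.cos(h) - north * math.sin(h)
    y = east * math.sin(h) + north * math.cos(h)
    return x, y


def relative_bearing(target_bearing_deg: float, heading_deg: float) -> float:
    """Angle of the path relative to where the head is aimed, in [-180, 180).

    0 means the path is dead ahead; negative is to the left, positive to the right.
    """
    return (target_bearing_deg - heading_deg + 180.0) % 360.0 - 180.0


# Facing east (90 degrees), a point 10 m to the east should appear straight ahead.
ahead_x, ahead_y = to_heads_up(10.0, 0.0, 90.0)

# Facing 10 degrees with the path at 350 degrees: turn 20 degrees to the left.
turn = relative_bearing(350.0, 10.0)
```

"Lining up the arrow with the path" then amounts to turning your head until the relative bearing is close to zero.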

You look like a normal, purposeful human being using walking navigation on Glass. Compare that to folks trying to navigate with their smartphone. They walk with their heads either down, or looking for street signs. They walk ten feet in a direction before making a U-turn to go the correct direction. Glass is a nice improvement.


Glass offers two main paths for developing apps. The first is the Mirror API. To use the Mirror API, you don't write code for Glass. Instead, you write server code. Your server interacts with Google's servers, which then act as a proxy for a user’s Glass device. The server and the Mirror API exchange JSON and HTML representations of timeline items through RESTful endpoints.
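As a rough illustration of what "JSON representations of timeline items" means, here is a minimal Python sketch that builds a timeline item and prepares, but does not send, the HTTP request. The endpoint URL, the field names (text, html, notification, menuItems) and the access token handling reflect my understanding of the Mirror API and should be treated as assumptions to check against Google's documentation; the helper names are hypothetical.

```python
import json
import urllib.request

# Assumed Mirror API endpoint for inserting timeline items.
TIMELINE_URL = "https://www.googleapis.com/mirror/v1/timeline"


def build_timeline_item(text, html=None):
    """Build the JSON-serializable body for one timeline card (assumed fields)."""
    item = {
        "text": text,
        # Ask Glass to chime when the card arrives.
        "notification": {"level": "DEFAULT"},
        # Built-in menu actions offered when the wearer taps the card.
        "menuItems": [{"action": "READ_ALOUD"}, {"action": "DELETE"}],
    }
    if html is not None:
        item["html"] = html  # optional HTML rendering of the card
    return item


def build_insert_request(item, access_token):
    """Prepare a POST request; the caller decides whether to send it."""
    body = json.dumps(item).encode("utf-8")
    return urllib.request.Request(
        TIMELINE_URL,
        data=body,
        method="POST",
        headers={
            # OAuth 2.0 token obtained through the user's Google authorization.
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
    )


# PLACEHOLDER_TOKEN is not a real credential; nothing is sent over the network here.
request = build_insert_request(
    build_timeline_item("Hello from my server"), "PLACEHOLDER_TOKEN"
)
```

Sending the request would be a matter of passing it to urllib.request.urlopen with a real token. The point of the sketch is that the "app" is nothing more than server code producing small JSON documents.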

Since the app runs on your server, you can use whatever technology you want to implement your side. Google has example code written in a variety of different technologies.

Users enable an app that uses the Mirror API by authorizing it with a familiar Google authentication flow. If Google has approved an app, it can be switched on through the MyGlass Android app, or the Glass dashboard.

The second path for creating Glassware is writing an Android app. You can build apps using the traditional Android development tools and load them with a USB cable. Developers will want to heavily customize their app for Glass since there is no touch screen and most apps aren't quite ready for a 640 x 360 display. At the moment there is no simple distribution method for apps created this way. Glass doesn't yet come with the equivalent of the Play Store.

Update: Google has released a preview of what they're calling the Glass Development Kit (GDK). The GDK is an Android library that gives developers direct access to the Glass-specific elements of the device: the timeline, the cards, menus, and so on. It also enables Glass apps with real-time interaction.

Apps developed using the GDK are installed the same way Mirror API apps are: through the MyGlass web interface, or the MyGlass Android app. Flip the switch, and the package is pushed onto the device and installed. Glass requires an internet connection to retrieve the APK.

Along with the GDK, Google and several third-party companies released GDK-based apps. Google released a compass app. Word Lens released an impressive translation app that replaces text in a live video feed from the camera. And there are several other apps in categories like sports.

The GDK opens up a lot of new possibilities for Glass development. It offers more challenges too, since it takes careful work to get smooth performance and low energy usage from code run on the device itself. That's just the sort of thing I enjoy. I've already started having fun with the GDK.
