Lqd's Journal


It's actually pronounced liquid!

Uncovering an invisible button on the iPhone, with sound

I’d like to explain the idea I had last summer on how to “add a new button” to the iPhone, thanks to the microphone, as an interaction design mockup. I initially wanted to do that in an article dedicated to interaction design strategies for iPhone games, but it doesn’t look like I’ll be able to do that soon, so here’s the first of them. It’ll be clearer than 140 characters at a time on Twitter.

I thought about this a week after Sonar Ruler (the site was down when I tried to find a link, so here’s a demo on Vimeo) was released. This app made me think of other uncommon uses of the microphone. While using your voice has become a first-class citizen in the interaction landscape (or close to it), probably starting with science fiction a while back, I was trying to focus on sound in an indirect way, as a side effect or by-product of the interaction.

The “Sonic Button”

I never really thought about a name for this, but “SonicButton” and “sonic tapping” are probably descriptive enough (suggestions are welcome and appreciated).

The concept, as the name and the title of this post suggest, is to use the microphone to listen to the sound of interactions with the iPhone, effectively turning it into a button: listen for taps, especially taps made with a nail. The great thing is that this works anywhere on the phone: front, back or sides, the whole surface is available for the interaction. In turn, this gives users great flexibility: any finger can be used, in any orientation; you hold the phone as usual and tap with the finger you want, wherever you want. You can even tap with a nail on the screen itself without registering a finger touch. As I said, the *whole* surface is available for you to use.

Of course, this is applicable to any phone and not just the iPhone, provided it has a microphone (pretty much all of them except the ones used by mimes) usable from an API (far less common). Android comes to mind as an additional platform, and surely you could name others. However, I only tested the microphone behavior on my iPhone 3G.


While I’m not an iPhone or Android developer (yet) I did the best I could to test this theory by using existing applications. I used the SoundMeter app to see how the microphone reacted to this interaction, under different conditions: portrait and landscape orientations, holding the phone with one and two hands, for regular use and games (where the grip is usually different), and in calm and noisy environments.

The typical sonic tap will obviously manifest itself as a short spike in the audio stream. Depending on the noise conditions, where and how you tap (with the nail or a fingertip), the intensity will of course be different, but in a calm room, hitting with a nail, I usually get between 25 and 35 dB for a soft-to-regular-strength sonic tap.
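To make the idea concrete, here’s a minimal sketch (hypothetical Python, not real iPhone code) of detecting such a spike in a stream of level readings. The 25 dB threshold simply mirrors the soft-tap levels I measured; a real implementation would calibrate it.

```python
# Hypothetical sketch: detect a "sonic tap" as a short spike in
# microphone level readings (dB above an ambient baseline).
# The 25 dB threshold mirrors the soft-tap levels measured above;
# a real app would calibrate this per user and environment.

TAP_THRESHOLD_DB = 25.0

def detect_taps(levels, baseline=0.0, threshold=TAP_THRESHOLD_DB):
    """Return the indices of readings that spike above the baseline."""
    taps = []
    for i, level in enumerate(levels):
        if level - baseline >= threshold:
            taps.append(i)
    return taps

# A calm room hovering near the baseline, with one tap at index 3:
readings = [2.1, 3.0, 1.8, 31.5, 2.4]
print(detect_taps(readings))  # → [3]
```

This is the whole trick, really: everything else (calibration, filtering, debouncing) is refinement on top of a threshold comparison.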

Something interesting happens with a “game grip” (your hands, thumbs and index fingers pretty much covering every side of the phone). The microphone is obstructed and picks up sound at a high level (80–90 dB, out of what I think is a maximum of around 105 dB). Even here, with the microphone completely blocked, a sonic tap still registers as a spike, I believe in the same way people hear their own voice even when they block their ear canals with their fingers: the sound travels through the skull rather than from the outside in (note: my biology knowledge is pretty limited, so this might be wrong). Here, I think the sonic tap travels inside the phone and the mic picks it up.

This can actually be taken advantage of: in a noisy room, where a sound spike coming from the environment could be mistaken for a sonic tap, you can block the microphone deliberately and still use this interaction.


To my eyes, the most interesting part of this is that it allows you to interact with something other than the screen. Even though it’s only one button, a button that can be used without obscuring the view is really nice.

It’s also a discrete and simple event, and in that sense would be far easier to use than, say, the accelerometer (which, depending on the use case, can be rather imprecise, and breaks down when used for two dimensions at once). It’s not that it’s hard to tilt your phone, it’s that it’s hard to tilt it just the amount you need to do what you want (an amount that is also app-dependent), whereas a tap is a tap, in every app. Sure, the variety of environments and of implementation thresholds could turn this into the same non-deterministic behavior, but a strong sonic tap should generate a high-enough spike to be detected by most implementations. We’ll see.

Just like with the accelerometer, a problem is that it would probably require calibration to match the user’s behavior and environment, even though sensible defaults could be chosen for strength, and an app could detect a noisy environment and take appropriate action, be it changing thresholds or notifying the user to switch grips, for instance.
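To sketch what such calibration could look like (hypothetical Python again, with made-up numbers), an app could track a running estimate of the ambient noise floor and require taps to stand a fixed margin above it:

```python
# Hypothetical sketch of environment-aware calibration: track the
# ambient noise floor with an exponential moving average and treat
# a reading as a tap only when it stands a fixed margin above it.

class TapCalibrator:
    def __init__(self, margin_db=25.0, smoothing=0.1):
        self.margin_db = margin_db    # how far above ambient a tap must be
        self.smoothing = smoothing    # EMA factor for the noise floor
        self.ambient_db = 0.0

    def update(self, level_db):
        """Feed one level reading; return True if it looks like a tap."""
        is_tap = level_db >= self.ambient_db + self.margin_db
        if not is_tap:
            # Only non-tap readings update the ambient estimate,
            # so taps don't inflate the noise floor.
            self.ambient_db += self.smoothing * (level_db - self.ambient_db)
        return is_tap

cal = TapCalibrator()
results = [cal.update(db) for db in [2.0, 3.0, 2.5, 30.0, 2.0]]
print(results)  # only the 30 dB reading stands out as a tap
```

In a noisier room, the ambient estimate rises on its own and the effective threshold follows, which is exactly the kind of “appropriate action” mentioned above.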

It’s also about as hard to discover as it gets, otherwise I wouldn’t be talking about it. However, I don’t feel discoverability is that much of an issue: the usual in-app “tutorial” solves those kinds of problems with ease most of the time, and even if tutorials are rarer in utility apps than in games, any app with a different enough UI offers one.

Once past the initial (really short) learning curve, I feel an interaction like this would be fun and useful, and should offer a great experience to users, which is what the iPhone spirit is all about.

I would see this being used in immersive apps (like games and such) for local consumption only, i.e. I don’t think it would be useful to broadcast the sonic tap event over the network, except maybe if you wanted to build a mini-drums simulator that works remotely from the back of your phone, or a human metronome, who knows.

The possible interactions

The most common way to hold a phone is (in my own experience, and limited testing with real people) in portrait mode, with one hand. This can be seen as a vertical version of the position called “the dealer’s grip” in the card-playing world. In this position, the index finger is barely used for holding, resting most of the time close to the lock button (when using the left hand; this button is not located there by chance) or on the back (and probably not lower than the Apple logo), while the other fingers touch the sides. This is the simplest case for hitting the SonicButton, on the back of the phone. It’s also possible to use the middle finger, but it’s not as comfortable, so it didn’t look like as good a choice in my tests (I only tested with a handful of people, though). As I said before, letting users choose will naturally lead them to the most comfortable position; in practice I found this to be pretty powerful.

Using both hands in portrait mode can happen when the user is typing, on the web or writing an email or SMS, and only if the user is skilled at typing on the virtual keyboard. (The small number of beginners I know all type in approximately the same way: holding the phone with one hand and hitting the keyboard either with the holding hand’s thumb or with the index finger of the other hand. The latter is more common, and this also held true for those with a physical portrait keyboard, a slider, or a clamshell phone.) In this position, a very interesting situation comes up where you can still use your thumbs but hit the glass *under* the screen, to the left or right of the Home button. I did say you could sonic tap anywhere.

People rarely use landscape mode with a single hand, but it can happen. If the user is holding the phone without interacting with it, reading or watching a movie, his fingers aren’t in front of the screen and he can sonic tap on the back. If he’s interacting with the phone, it’s usually with the thumb, and here a sonic tap to the sides of the screen, around the top speaker or the Home button, or once again on the screen with the nail only, is doable. In practice I didn’t see this behavior in my limited testing, but it still works if you do use the phone that way.

When the phone is used with both hands in landscape mode, the grip matters (in blocking the microphone), but basically you can sonic tap with your thumbs on the sides of the screen on the front, or with your index fingers on the back of the phone. Using the “game grip” you can also use your index fingers on the top edge or, depending on the user’s dexterity, the middle fingers on the lower part of the back, close to the bottom edge.

As with regular taps, you can have multiple sonic taps, even though I suspect filtering will change how the subsequent taps are detected.
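As a rough illustration (hypothetical Python, with made-up timing values), distinguishing single from double sonic taps could come down to a debounce window on the spike timestamps, so that the ringing tail of one tap isn’t counted as a second one:

```python
# Hypothetical sketch: classify spike timestamps (in seconds) into
# single and double taps. Spikes closer together than `min_gap` are
# treated as residual ringing from one tap and dropped; debounced
# taps within `double_window` of each other pair up as a double tap.

def group_taps(times, min_gap=0.05, double_window=0.4):
    # First, drop spikes that are just the tail end of a previous one.
    debounced = []
    for t in times:
        if not debounced or t - debounced[-1] >= min_gap:
            debounced.append(t)
    # Then pair up taps that fall within the double-tap window.
    events = []
    i = 0
    while i < len(debounced):
        if i + 1 < len(debounced) and debounced[i + 1] - debounced[i] <= double_window:
            events.append(("double", debounced[i]))
            i += 2
        else:
            events.append(("single", debounced[i]))
            i += 1
    return events

# A tap with ringing at 0.01s, a second tap at 0.3s, a lone tap at 2s:
print(group_taps([0.0, 0.01, 0.3, 2.0]))
```

Both window values here are guesses; they’re exactly the kind of thing the calibration discussed earlier would need to tune.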

What’s also interesting is that this allows eyes-free interactions, like a real physical button; since environmental noise could be an issue there, a double tap might be the gesture to use. It’s not that interesting in practice, though, because it’d mean an app would have to run while the phone is in a pocket, unlocked, and with the microphone and sound processing running, the battery probably wouldn’t last long.


As this is an interaction concept, the only pointers I can give are the Audio Queue Services inside Core Audio on the iPhone, and Stephen Celis’ sc_listener, which seems to be the perfect candidate for the job.
I don’t know if you can get a live stream on Android, but AudioRecord, MediaRecorder, and this tutorial could certainly be a start.

In conclusion

This sonic button is closely related to Chris Harrison’s ScratchInput: I actually read his project page when it came out, remembered it, and looked it up again after coming up with this. The mechanics are roughly the same and rely on the exact same principles. But in ScratchInput, listening to scratching on surfaces is done via custom hardware (because scratching is a lot softer than tapping), which could be adapted into mobile phones, turning them into passive listening devices (that can broadcast data over the local WiFi network) used to turn a regular surface into a scratch-enabled one. In what I presented here, the interaction focuses on taps happening on the phone itself (not a wall or desk), and the input data is actively used by the iPhone and its apps to enable new features. So in my mind, they’re really close and part of the same interaction family, but I wouldn’t say they’re exactly the same.

I’d love to get feedback on the concept; you can find me at twitter.com/lqd.

Mockup for short links on Twitter

And now for something a little different: most if not all of my posts here have been code-related, even though a big part of my work and interests is design-related, something you might be familiar with if we’re friends on Twitter.

Recently Chris Messina offered a suggestion on how Twitter could better integrate short links.

Chris' proposal

This prompted me to summarize my thoughts on the problems short links introduce in the UX. The Twitter Fan wiki page on short links explains those technical and design problems, and lists current practices and alternative solutions, including the one I’ll present here today. Even though I’ll offer several different ideas, they’re sequential in my mind: I see them as different versions of a solution, as part of the design process, rather than as different solutions.

For reference, this is how Twitter displays links at the moment.
short links on Twitter right now

The most obvious thing to do is to integrate on-demand expansion, as exists on the Twitter search (ex-Summize) page. Version A is exactly that.
expand Button version A

Version B is a modification I made to make the expansion process secondary (it’s the one I use at the moment in our own unreleased client, but this will change).
expand Button version B

The expand buttons (possibly underlined) would be slightly transparent in order to blend in better but would have full opacity on mouse hover.
However those 2 versions only show the information you’re looking for when you interact with the expand buttons. This interaction is pure excise.

So, the next version, “With Host”, adds feedforward on the short link by showing the host/domain it points to. It’s only slightly better, but offers reassurance if the site is one you trust and visit frequently.

Showing the domain

However, in my mind short links are a hindrance to the experience and shouldn’t be a focal point in the tweet — the real links are actually more important in the message. The next version, “Inverting the polarity”, keeps the feedforward of the previous version but shows more info and puts the expanded link in the tweet itself, relegating the short link to the sidelines.
The expanded link inline

I strongly think the expanded URL shouldn’t be shown in full here; the links people shorten are often huge. They would offer little added value (what can you tell from a 150-character link that you couldn’t from a 100-character one?), they’d mess up the layout, etc. So only the first X characters would be shown, say 20 or so as I used here, with an ellipsis if the URL was cut off.
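A trivial sketch of that truncation (in Python; the 20-character cutoff is just the value I used in the mockups, and stripping the scheme is my own assumption, since it adds nothing in a tweet):

```python
# Sketch: show only the first N characters of an expanded URL,
# dropping the scheme and appending an ellipsis when cut off.

def display_url(url, max_len=20):
    # Drop the scheme; it carries no information in a tweet.
    for prefix in ("https://", "http://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
            break
    if len(url) <= max_len:
        return url
    return url[:max_len] + "…"

print(display_url("http://example.com/some/very/long/path?with=params"))
# → example.com/some/ver…
```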

Continuing in that direction, the last version “Bye-bye short link” removes the obscure short link from the main content altogether.

Bye-bye ugly short link

In those last 2 versions, you could add an expand button (I originally designed the ellipsis to be this expand button, but I feel it’d be a hard-to-hit target, even with more space between the link and the ellipsis, which is not shown here). But I don’t think we’d need one, for a simple reason: there already exists an interaction for knowing where a link points to: put the mouse cursor over it and look in the browser’s status bar. Of course, just adding this to the other versions (including the one Twitter uses) would shed some light on the short link black hole, even though hijacking the status text like this has to take accessibility into consideration.

Of course, a tooltip window similar to Chris’ could be another direction; however, I feel that just showing “link” could be seen as giving even less information than the already obscure short link.
It would go really well with the “Bye-bye short link” design, where the tooltip window would show the original short link URL (and, who knows, maybe also some stats, even though I suppose those are probably only used by a tiny fraction of people).

I don’t feel the URL expansion being slow is a problem, provided it’s done *before* the tweet is shown in the timeline. Be it on the twitter.com website or in a client, I believe users have no easy way to tell they received a tweet 30 seconds later than they were “supposed to” (this could be the maximum timeout allowed for the expansion to succeed). If the link isn’t already expanded in the tweet (last 2 versions), the Summize-style expansion would here be immediate.
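To make the timeout idea concrete, a client could race the expansion against a deadline and fall back to the short link when it loses (hypothetical Python; the expander here is a stub, where a real client would follow the short link’s HTTP redirect):

```python
# Hypothetical sketch: try to expand a short link before showing the
# tweet, falling back to the short URL if expansion fails or times
# out. `expand_stub` is a stand-in for following the HTTP redirect;
# the URLs are made up for the example.

from concurrent.futures import ThreadPoolExecutor, TimeoutError

def expand_stub(short_url):
    # Stand-in for resolving the short link's redirect target.
    return {"http://bit.ly/abc123": "http://example.com/article"}.get(short_url)

def resolve(short_url, timeout_s=30.0):
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(expand_stub, short_url)
        try:
            expanded = future.result(timeout=timeout_s)
        except TimeoutError:
            expanded = None
    # Fall back to the short link if expansion failed or timed out.
    return expanded or short_url

print(resolve("http://bit.ly/abc123"))   # the expanded URL
print(resolve("http://bit.ly/unknown"))  # falls back to the short link
```

The 30-second default matches the maximum delay mentioned above; the tweet would only be displayed once `resolve` returns, whichever way it resolved.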

Even though my personal opinion has no value compared to user testing, removing the short link altogether is my favorite :) These different versions all evolved into removing it, and this last version represents my current thoughts on the problem.

Shortly after I tweeted this, Chris came up with another redesign blending several ideas. I’m happy with the direction and feel we’re getting somewhere interesting.

Chris' redesign

I’d love to get some feedback; tweet about this using the #shortlink hashtag. You can find me at twitter.com/lqd.