Twitter’s Smart Search Correcting Spelling Errors For You

Want to search for Justin Bieber on Twitter but can’t remember if the “i” comes before the “e” in Bieber? Or are you searching for the ever-popular (and always entertaining) Twitter account of Alec Baldwin and hoping to find some of his costars from 30 Rock as well?

Now that info is ready and waiting – all you need to do is watch for and click the handy links!

Twitter now offers spelling corrections and related queries next to search results – because they know you can’t spell:

At the core of our related queries and spelling correction service is a simple mechanism: if we see query A in some context, and then see query B in the same context, we think they’re related. If A and B are similar, B may be a spell-corrected version of A; if they’re not, it may be interesting to searchers who find A interesting. We use both query sessions and tweets for context; if we observe a user typing [justin beiber] and then, within the same session, typing [justin bieber], we’ll consider the second query as a possible spelling correction to the first — and if the same session will also contain [selena gomez], we may consider this as a related query to the previous queries.
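To make that mechanism a little more concrete, here is a minimal sketch in Python. It is not Twitter's code: the function names, the session format (a plain list of queries), and the edit-distance threshold of 2 are all assumptions made for illustration. The idea is just what the quote describes: pair up queries that show up in the same session, call the similar-looking ones possible spelling corrections, and call the rest related queries.

```python
from collections import defaultdict


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


def classify_pairs(sessions, max_typo_distance=2):
    """Split same-session query pairs into corrections and related queries.

    `sessions` is a list of query lists (hypothetical format); the
    threshold is an illustrative guess, not a value from Twitter.
    """
    corrections = defaultdict(set)
    related = defaultdict(set)
    for session in sessions:
        for i, first in enumerate(session):
            for later in session[i + 1:]:
                if first == later:
                    continue
                if edit_distance(first, later) <= max_typo_distance:
                    corrections[first].add(later)   # similar: maybe a typo fix
                else:
                    related[first].add(later)       # different: maybe related
    return corrections, related


sessions = [["justin beiber", "justin bieber", "selena gomez"]]
corrections, related = classify_pairs(sessions)
print(dict(corrections))
print(dict(related))
```

A real system would count how often each pair appears across many anonymized sessions before trusting it; this sketch treats a single sighting as enough, purely to keep the example short.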

For example, a search for Ashton ‘Krutcher’ would suggest this:


And a search for his correctly spelled name would offer suggestions, as such:

Does it sound a bit intrusive? A little Google-ish, as in “we track your world”? Well, don’t worry your pretty little head over it, because “the data [they] process is anonymized — [they] don’t track which queries are issued by a given user, only that the same (unknown) user has issued several queries in a row, or continuously tweeted.” That’s plausible, right?

And regardless of whether or not you’re feeling this “anonymized” vibe, Twitter is doing us a solid on this one. I mean, outside of handing over this little piece of your privacy, what option is left – learning to spell?! And this spellcheck thingy ain’t easy for Twitter:

Twitter’s spelling correction has a number of unique challenges: searchers frequently type in usernames or hashtags that are not well-formed English words; there is a real-time constancy of new lingo and terms supplied by our own users; and we want to help people find those in order to join in the conversation. To address all of these issues, on top of our context-based mechanism, we also index dictionaries of trending queries and popular users that are likely to be misspelled, and use Lucene’s built-in spelling correction library (tweaked to better serve our needs) to identify misspelling and retrieve corrections for queries.
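The dictionary half of that is easy to picture with a small sketch. Big caveat: this does not use Lucene or anything of Twitter's. It swaps in Python's standard `difflib.get_close_matches` as a stand-in for a spelling correction library, and the dictionary entries, the `suggest` helper, and the 0.8 similarity cutoff are invented for illustration.

```python
import difflib

# Hypothetical dictionary of popular usernames and trending hashtags.
# None of these terms are ordinary English words, which is the point.
KNOWN_TERMS = ["justinbieber", "alecbaldwin", "aplusk", "#30rock"]


def suggest(query, known_terms=KNOWN_TERMS, cutoff=0.8):
    """Return known terms that look like `query`, best match first.

    Uses difflib's similarity ratio as a stand-in for a real
    spelling correction library; the cutoff is an illustrative guess.
    """
    normalized = query.strip().lower()
    if normalized in known_terms:
        return []  # already spelled the way the dictionary has it
    return difflib.get_close_matches(normalized, known_terms, n=3, cutoff=cutoff)


print(suggest("justinbeiber"))
print(suggest("#30rok"))
print(suggest("alecbaldwin"))
```

Because the lookup list is built from usernames and hashtags rather than a standard English word list, a query like "#30rok" can still be matched even though no ordinary dictionary would know what to do with it.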

So start searching in text-speak on Twitter if you want, I guess. It will likely dominate their spelling correction library eventually anyway . . . or check out when a word escapes you, if it isn’t too much trouble.


(Misspelled list image from Shutterstock)
