If you have been near Twitter in the last couple of hours, you have probably noticed the huge number of strange tweets. And if you moved your mouse over one of them, all kinds of weird things started to happen - everything from innocent popups, to suddenly retweeting the message, to being redirected to external "adult-oriented" sites.
You can read more about it over at TechCrunch, Mashable, or Sophos (and many other places).
Twitter has now fixed their site, but the problem wasn't specific to Twitter's website - any web app using Twitter is potentially affected, from websites listing tweets to widgets that you add to your blog.
Most of these work just fine and were never affected, but every single web developer using Twitter needs to check their code.
Also, none of the Twitter apps (desktop, iPhone, iPad, Android) were affected, because they are not "web" based. Instead, they just output gibberish into your stream.
It's simple. When you use the Twitter API, you get XML or JSON output with the tweet as clear text. Here is one of the many examples.
Note: This one was the one responsible for most of the retweeting going on (not harmful, but really annoying). There were many others, some much worse than this.
Every web app will then try to find any link in the text, and convert that into something you can actually click on. It is a very simple operation, done all the time, pretty much everywhere.
In the above case, the problem was with the quotation mark, but it could be other things too.
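To make the quotation mark problem concrete, here is a minimal sketch of a naive link converter. It assumes a web app that treats everything up to the next whitespace as the link and drops it straight into an href attribute. The function name and the payload are made up for illustration; this is not the actual tweet or any particular app's code.

```python
import re

# Naive pattern: a URL is "everything up to the next whitespace".
NAIVE_URL = re.compile(r"(?:https?|ftp|file)://\S+")


def naive_linkify(text: str) -> str:
    """Wrap each matched URL in an anchor tag without restricting or escaping it."""
    return NAIVE_URL.sub(
        lambda m: f'<a href="{m.group(0)}">{m.group(0)}</a>', text
    )


# Hypothetical payload, for illustration only.
tweet = 'http://example.com/"onmouseover="alert(1)"/'

# The quotation mark in the tweet closes the href attribute early, so the
# browser sees onmouseover="alert(1)" as a real attribute on the link.
print(naive_linkify(tweet))
```

The quotation mark ends the href value, and whatever follows it becomes part of the tag itself instead of part of the link.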
The "safe" characters are generally (when converting raw text to links): a-zA-Z0-9;/?:@&=+$,-_.!~*() Anything else isn't part of the link.
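Note: For developers, a regex like this (https?|ftp|file)://[a-zA-Z0-9;/?:@&=+$,_.!~*()-]+ works. The hyphen goes last in the character class so that it is read as a literal hyphen; written as ,-_ it would be a range that also lets < and > through. This is essentially the one I use for all my Twitter apps.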
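As a rough sketch of how a whitelist pattern like this could be used, the snippet below only links the run of "safe" characters and treats everything else as plain text. The hyphen is placed last in the character class so that it is read as a literal character rather than a range. The HTML escaping of the output is my own addition as a second line of defense, and the function name is hypothetical.

```python
import html
import re

# Only the "safe" characters may be part of a link; the match stops at
# anything else, including quotation marks.
SAFE_URL = re.compile(r"(?:https?|ftp|file)://[a-zA-Z0-9;/?:@&=+$,_.!~*()-]+")


def linkify(text: str) -> str:
    """Turn whitelisted URLs into anchors and escape all remaining text."""
    parts = []
    pos = 0
    for match in SAFE_URL.finditer(text):
        # Text between links is escaped and left as plain text.
        parts.append(html.escape(text[pos:match.start()]))
        # Escaping the URL turns & into &amp; so the attribute stays valid.
        url = html.escape(match.group(0))
        parts.append(f'<a href="{url}">{url}</a>')
        pos = match.end()
    parts.append(html.escape(text[pos:]))
    return "".join(parts)


# Hypothetical payload, for illustration only.
print(linkify('http://example.com/"onmouseover="alert(1)"/'))
```

With this pattern the link ends at the first quotation mark, and the rest of the tweet is shown as ordinary text next to it.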
Most web apps actually do this the right way. Seesmic Web, for example, didn't have the problem, because they had done it the right way to begin with.
It is really up to each individual developer. Every web app is vulnerable by default. It's your job as a developer to make sure you are not affected.
Twitter has solved the problem with their site, but in a rather curious way. Instead of fixing the matching algorithm, they are now simply converting " into &quot;. It works, but it is not really the right way to do it.
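Here is a small sketch of what escaping the quotation mark without fixing the matching looks like. It is my guess at the general idea, not Twitter's actual code, and the function name and payload are made up for illustration.

```python
import html
import re

# Same naive pattern as before: a URL is everything up to the next whitespace.
NAIVE_URL = re.compile(r"(?:https?|ftp|file)://\S+")


def escape_then_linkify(text: str) -> str:
    """Escape quotation marks first, then link whatever the naive pattern finds."""
    escaped = html.escape(text, quote=True)  # " becomes &quot;
    return NAIVE_URL.sub(
        lambda m: f'<a href="{m.group(0)}">{m.group(0)}</a>', escaped
    )


# Hypothetical payload, for illustration only.
tweet = 'http://example.com/"onmouseover="alert(1)"/'

# The attribute can no longer be closed early, but the junk after the real
# URL is still treated as part of the link target.
print(escape_then_linkify(tweet))
```

The exploit no longer fires, but the href still contains the whole payload as if it were part of the address, which is why restricting what counts as a link is the cleaner fix.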
Update: Video of the exploit in action (via Sophos)
Founder, media analyst, author, and publisher.
"Thomas Baekdal is one of Scandinavia's most sought-after experts in the digitization of media companies. He has made himself known for his analysis of how digitization has changed the way we consume media."
Swedish business magazine, Resumé