By Thomas Baekdal - July 2006

Introducing: WEB2RSS

RSS feeds are one of the things that I really like. They allow me to get the information I want, the way I want it. There is just one problem: not every site offers an RSS feed, and those that do are not always offering the right content.

I wanted to do something about that.

Say hello to WEB2RSS

WEB2RSS is a simple web application that can convert almost any normal website/page into an RSS feed. If you want to receive updates in your feed reader when your favorite site changes, this is the tool for you.

Technically speaking, WEB2RSS requests the page you want, strips it of all non-text elements, and outputs the rest into an RSS file. What you get is the content of any page, without form elements, script blocks, iframes, etc.
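To make the idea concrete, here is a minimal sketch of that kind of pipeline in Python. It is not the actual WEB2RSS code: the names `SKIPPED`, `strip_non_text` and `page_to_rss` are made up for illustration, the list of dropped elements is an assumption, and the page is passed in as a string instead of being requested over the network.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements whose contents are dropped entirely (an assumed list, not WEB2RSS's).
SKIPPED = {"script", "style", "form", "iframe"}


class TextExtractor(HTMLParser):
    """Collects text that is not nested inside a skipped element."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many skipped elements we are currently inside
        self.chunks = []  # text fragments that survive the filtering

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.depth == 0:
            self.chunks.append(text)


def strip_non_text(html):
    """Return the visible text of a page, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def page_to_rss(url, title, html):
    """Wrap the stripped page text in a single-item RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = url
    ET.SubElement(channel, "description").text = "Feed generated from " + url
    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = url
    ET.SubElement(item, "description").text = strip_non_text(html)
    return ET.tostring(rss, encoding="unicode")
```

Calling `page_to_rss("http://example.com/", "Example", html)` with the HTML of a page returns the feed as a string. A real converter would also have to fetch the page and decide when its content has changed, which this sketch leaves out.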

Some examples: If your friend's website does not offer an RSS feed, now it does. If you are really into the art of "Camera tossing" you can now search for it on Flickr and subscribe to the search results. ...or what about being kept up-to-date with the forum you visit frequently, or maybe just a specific topic? What about getting the latest updates to a seminar you are interested in - like the HCI class at Stanford University? Maybe you just want to get the daily comic strip from your favorite cartoonist, or the latest news from your favorite hockey team.

Or what about getting Google search results as an RSS feed, or finding out when Google publishes something new on Google Labs? ...or maybe you just want to know when your competitors publish new content.

You can do all of these things with WEB2RSS.

How to use it

Using WEB2RSS is pretty simple. Type in the URL you want to use, check the link and... Boom (as Steve Jobs would say), you get an RSS link you can subscribe to.

There are a number of more advanced features that you can use. The easiest one is "Include Images". By default, all images are removed from the feed to prevent layout images from cluttering up the results, a common problem on table-based sites. But you can force it to include images if you want.
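As a rough illustration of what such a switch might do, here is a small Python sketch. The function name `filter_images` and its `include_images` parameter are hypothetical, and removing `<img>` tags with a regular expression is an assumption about the approach, not a description of how WEB2RSS does it.

```python
import re

# Matches any <img ...> tag, in upper or lower case, self-closing or not.
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def filter_images(html, include_images=False):
    """Drop all image tags unless the caller asks to keep them."""
    if include_images:
        return html
    return IMG_TAG.sub("", html)
```

With the default setting, a spacer image inside a layout table simply disappears from the output, while `filter_images(html, include_images=True)` returns the markup unchanged.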

The two other advanced settings are much harder to deal with. They are "Match" and "Exclude". With "Match" you can keep only a specific part of a page. If you want to get only a specific image, DIV or table, this is what you use. "Exclude" works the same way, except it removes the element you define instead of keeping it.

The hard part of these two settings is that they use something called "Regular Expressions". If you do not know what these are, my best advice is to not use them. Regular expressions are a very complex beast, but they make WEB2RSS very flexible and powerful.
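For readers who do know regular expressions, the following Python sketch shows how "Match" and "Exclude" style filters could behave. The function `apply_filters` is hypothetical, and both the flags and the order (match first, then exclude) are assumptions. WEB2RSS may use a different regular expression dialect, so patterns might need adjusting.

```python
import re


def apply_filters(html, match=None, exclude=None):
    """Keep only the first region matching `match`, then delete `exclude` hits."""
    # DOTALL lets "." span line breaks; IGNORECASE treats <DIV> and <div> alike.
    flags = re.IGNORECASE | re.DOTALL
    if match:
        found = re.search(match, html, flags)
        html = found.group(0) if found else ""
    if exclude:
        html = re.sub(exclude, "", html, flags=flags)
    return html


page = '<div id="nav">Menu</div><div id="content"><p>Story</p><p class="ad">Buy now</p></div>'
kept = apply_filters(
    page,
    match=r'<div id="content">.*?</div>',  # keep only the content DIV
    exclude=r'<p class="ad">.*?</p>',      # then drop the ad paragraph
)
```

Note the non-greedy `.*?` in both patterns. A greedy `.*` would run to the last closing tag on the page and keep or remove far more than intended, and nested elements of the same type will still trip up simple patterns like these.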

Are there any limitations?

Yes, of course. There are a number of things that can prevent a successful conversion, such as:

  1. Sites made entirely using Flash cannot be converted, for the simple reason that the underlying code does not contain any text. No text, nothing to convert.
  2. Sites with a lot of DHTML, AJAX, etc. do not work either. In this case the content is often not a part of the main page.
  3. There is a bug in XMLHTTPRequest that prevents it from handling redirects. So if you are trying to convert a page that is redirected to another page, it fails. At the moment this is a problem outside of my control.
  4. The last major problem is sites that are so badly coded that WEB2RSS cannot make heads or tails of them.

It does work very well with simple table-based sites, and - of course - all the sites that are designed using XHTML and CSS.

Why Beta?

WEB2RSS is released as a beta primarily because of worries about scalability. I am not sure how the server will react to heavy use of it. I need some real-life usage to know more.

Try it

... and let me know what you think. General comments can be posted here, on this page. Use this page for bugs and feature requests.

 
 
 



 
