RSS feeds are one of the things that I really like. They allow me to get the information I want, the way I want it. There is just one problem. Not every site offers an RSS feed, and those that do often do not offer the right content.
I wanted to do something about that.
WEB2RSS is a simple web application that can convert almost any normal website/page into an RSS feed. If you want to receive updates in your feed reader when your favorite site changes, this is the tool for you.
Technically speaking, WEB2RSS requests the page you want, strips all non-text elements from it, and outputs the rest as an RSS file. What you get is the content of any page, without form elements, script blocks, iframes, etc.
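To make the idea concrete, here is a minimal sketch in Python of that request, strip, and output sequence. It is not the actual WEB2RSS code: the names `TextOnly`, `fetch`, `page_to_rss` and the `SKIPPED` tag list are all assumptions made for illustration, and the feed it builds contains just one item holding the page text.

```python
from html.parser import HTMLParser
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Assumed list of elements to drop, based on the examples named above
# (form elements, script blocks, iframes). The real tool may use another list.
SKIPPED = {"script", "style", "form", "iframe", "noscript"}


class TextOnly(HTMLParser):
    """Collects the text of a page, ignoring everything inside SKIPPED tags."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many SKIPPED elements we are currently inside
        self.chunks = []    # text fragments that survive the stripping

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0 and data.strip():
            self.chunks.append(data.strip())


def fetch(url):
    """Request the page and return its HTML as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def page_to_rss(url, html):
    """Strip non-text elements from html and wrap the rest in an RSS 2.0 feed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = url
    ET.SubElement(channel, "link").text = url
    ET.SubElement(channel, "description").text = "Text content of " + url

    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = url
    ET.SubElement(item, "link").text = url
    ET.SubElement(item, "description").text = "\n".join(parser.chunks)
    return ET.tostring(rss, encoding="unicode")
```

Calling `page_to_rss(url, fetch(url))` would produce the feed for a live page. The sketch assumes reasonably well-formed HTML; an unclosed `<form>` or `<iframe>` would cause the rest of the page to be skipped.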
Some examples: If one of your friends' websites does not offer an RSS feed, now it does. If you are really into the art of "Camera tossing" you can now search for it on Flickr and subscribe to the search results. ...or what about being kept up-to-date with the forum you visit frequently, or maybe just a specific topic. What about getting the latest updates to a seminar you are interested in, like the HCI class at Stanford University. Maybe you just want to get the daily comic strip from your favorite cartoonist, or the latest news from your favorite hockey team.
Or, what about getting Google search results as an RSS feed, or finding out when Google publishes something new on Google Labs. ...or maybe you just want to know when your competitors publish new content.
You can do all of these things with WEB2RSS.
Using WEB2RSS is pretty simple. Type in the URL you want to use, check the link and... Boom (as Steve Jobs would say), you get an RSS link you can subscribe to.
There are a number of more advanced features that you can use. The easiest one is "Include Images". By default, all images are removed from the feed to prevent layout images from cluttering up the results, a common problem on table-based sites. But you can force it to include images if you want.
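As a rough illustration of what such a switch could look like, the Python sketch below removes every `<img>` tag unless told otherwise. The function name `filter_images` and its `include_images` parameter are made up for this example and say nothing about how WEB2RSS does it internally.

```python
import re

# Matches a whole <img ...> tag, in any letter case.
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def filter_images(html, include_images=False):
    """Return html with all <img> tags removed, unless include_images is set."""
    if include_images:
        return html
    return IMG_TAG.sub("", html)
```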
The two other advanced settings, "Match" and "Exclude", are much harder to deal with. With "Match" you keep only a specific part of a page. If you want to get only a specific image, DIV or table, this is what you use. "Exclude" works the same way, except that it removes the element you define instead of keeping it.
The hard part of these two settings is that they work using something called "Regular Expressions". If you do not know what these are, my best advice is to not use them. Regular expressions are a very complex beast, but they make WEB2RSS very flexible and powerful.
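For readers who do know regular expressions, here is a small Python sketch of how a "Match" and an "Exclude" pattern might be applied to a page. The function `apply_filters` and its parameters are hypothetical, and the choice to keep only the first match is an assumption; WEB2RSS may behave differently.

```python
import re


def apply_filters(html, match=None, exclude=None):
    """Keep only the part of html that fits `match`, then drop what fits `exclude`."""
    # DOTALL lets "." span line breaks, since HTML elements often cover several lines.
    flags = re.IGNORECASE | re.DOTALL
    if match:
        found = re.search(match, html, flags)
        html = found.group(0) if found else ""
    if exclude:
        html = re.sub(exclude, "", html, flags=flags)
    return html


page = (
    '<div id="nav">Menu</div>'
    '<div id="main"><p>Story</p><p class="ad">Buy</p></div>'
)

# Keep only the "main" DIV, then remove the advertising paragraph inside it.
result = apply_filters(
    page,
    match=r'<div id="main">.*?</div>',
    exclude=r'<p class="ad">.*?</p>',
)
```

Note the non-greedy `.*?` in both patterns. A greedy `.*` would run on to the last closing tag on the page, and nested DIVs will still trip up a pattern like this, which is a good part of why these settings are hard to use.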
Yes, of course. There are a number of things that can prevent a successful conversion. Like:
It does work very well with simple table-based sites and, of course, with all sites that are designed using XHTML and CSS.
WEB2RSS is released as a beta primarily because of worries about scalability. I am not sure how the server will react to heavy use of it. I need some real-life usage to know more.
... and let me know what you think. General comments can be posted here, on this page. Use this page for bugs and feature requests.
Founder, media analyst, author, and publisher.
"Thomas Baekdal is one of Scandinavia's most sought-after experts in the digitization of media companies. He has made himself known for his analysis of how digitization has changed the way we consume media."
Swedish business magazine, Resumé