Publish And Subscribe
In the BlogWeb context, this is about handling a reader's interest in multiple blogs. Specifically, having a mechanism so that the reader gets alerted to changes (new postings).
Why is anything resembling real-time necessary? Would it be such a bad thing if everyone just grabbed stuff once a day?
That would probably suck the energy out of the "HotLink-s"/Flash Crowd-s process.
- On the other hand, will we get to a point when we feel like that process is more like watching TV than thinking?
An RssAggregator typically has a single global setting (sometimes not even user-editable) controlling how frequently it polls every subscribed blog to see whether it's changed.
Some aggregators do an HTTP GET, which can involve grabbing a fair amount of data if the blog includes its full content.
Some aggregators (I believe) just do an HTTP HEAD first, so the blog server responds with the date-time of the last change - then the aggregator compares that to the last time it got data, and if the last change is more recent it then does a full HTTP GET.
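A minimal sketch of that HEAD-then-GET check, assuming the blog server sends a standard Last-Modified header. The function names (changed_since, poll) are made up for illustration and are not taken from any particular aggregator. If the header is missing or unparseable, the sketch just falls back to the full GET.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional
import urllib.request


def changed_since(last_modified_header: Optional[str], last_fetch: datetime) -> bool:
    """Return True if the server's Last-Modified is newer than our last fetch.

    A missing or unparseable header is treated as "changed", so the caller
    falls back to a full GET rather than silently missing updates.
    """
    if not last_modified_header:
        return True
    try:
        modified = parsedate_to_datetime(last_modified_header)
    except (TypeError, ValueError):
        return True
    if modified.tzinfo is None:
        # Assumption: a date without zone information is taken as UTC.
        modified = modified.replace(tzinfo=timezone.utc)
    return modified > last_fetch


def poll(url: str, last_fetch: datetime) -> Optional[bytes]:
    """HEAD first; only do the full GET when the blog looks changed."""
    head = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(head, timeout=10) as resp:
        header = resp.headers.get("Last-Modified")
    if not changed_since(header, last_fetch):
        return None  # nothing new, no body transferred
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()
```

Note that last_fetch needs to be a timezone-aware datetime. This costs two round trips whenever the blog has changed.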
If an aggregator passes, as part of its request, the date-time it most recently got data, the blog server can in theory decide to send no data if there's been no change. That reduces or avoids the bandwidth concern, but adds some computational overhead to each such request. Still, it's probably the best trade-off. See HttpConditionalGet.
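The server side of that decision might look roughly like this. It is a sketch only: conditional_response is a hypothetical helper, not part of any blog engine, and it assumes the aggregator sends its date-time in the standard If-Modified-Since header and that last_changed is a UTC-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def conditional_response(
    if_modified_since: Optional[str],
    last_changed: datetime,
    body: bytes,
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a feed request.

    Sends 304 with an empty body when nothing has changed since the
    date-time the aggregator passed in; otherwise sends the full feed.
    """
    headers = {"Last-Modified": format_datetime(last_changed, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: ignore it and send everything
        if since is not None and since.tzinfo is not None:
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_changed.replace(microsecond=0) <= since:
                return 304, headers, b""
    return 200, headers, body
```

The per-request overhead is one date parse and one comparison, which is small next to sending the whole feed.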
RSS readers should probably check here before asking individual sites for updated RSS. This would improve scalability (vs current practice of grabbing every chosen site's RSS every 30-60 min regardless of whether it's been updated).
I wonder what % of sites with RSS ping weblogs.com? (or any other site) I wonder how many do separate pings for main content and feeds?
How do we handle a world of non-centralized, weblogs.com-like ping sites? Assuming that each blog picks a single ping-site to ping, it could store that site's URL in (a) a 'link' tag (just like many sites point to their RSS feed) and (b) an RSS channel property.
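Option (a) could be sketched like this from the reader's side. The rel value "ping-site" is invented here purely for illustration; no such convention exists as far as I know, and a real one would have to be agreed on.

```python
from html.parser import HTMLParser
from typing import Optional


class PingSiteFinder(HTMLParser):
    """Finds the first <link rel="ping-site" href="..."> in a page."""

    REL = "ping-site"  # hypothetical rel value, not an existing standard

    def __init__(self) -> None:
        super().__init__()
        self.ping_site: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.ping_site is not None:
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if self.REL in rels and attr.get("href"):
            self.ping_site = attr["href"]


def find_ping_site(html: str) -> Optional[str]:
    """Return the ping-site URL a blog advertises, or None if it has none."""
    finder = PingSiteFinder()
    finder.feed(html)
    return finder.ping_site
```

An aggregator would run this once per subscription, then group its blogs by ping-site so it knows which sites to check.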
An issue is what to do when you've been offline for more than 3 hours (or haven't been running your ping-checker). I guess:
Is the "real" solution to have Ping Service-s allow querying, so you could submit a list of blogs and get back a list of last-ping-times? I suppose that's more "expensive" for the ping-site to process.
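A rough in-memory model of such a queryable ping-site, just to make the idea concrete. PingSite, last_pings and blogs_to_fetch are all hypothetical names; this is not the weblogs.com interface.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable, List, Optional


class PingSite:
    """Remembers the most recent ping time for each blog URL."""

    def __init__(self) -> None:
        self._last_ping: Dict[str, datetime] = {}

    def ping(self, blog_url: str, when: datetime) -> None:
        previous = self._last_ping.get(blog_url)
        if previous is None or when > previous:
            self._last_ping[blog_url] = when

    def last_pings(self, blog_urls: Iterable[str]) -> Dict[str, Optional[datetime]]:
        """The query: a list of blogs in, last-ping-times out (None if never pinged)."""
        return {url: self._last_ping.get(url) for url in blog_urls}


def blogs_to_fetch(
    last_pings: Dict[str, Optional[datetime]],
    last_fetched: Dict[str, datetime],
) -> List[str]:
    """Reader side: which subscribed blogs have pinged since we last fetched them."""
    stale = []
    for url, pinged in last_pings.items():
        fetched = last_fetched.get(url)
        # Blogs that never pinged are skipped here; they still need plain polling.
        if pinged is not None and (fetched is None or pinged > fetched):
            stale.append(url)
    return sorted(stale)
```

The cost to the ping-site is one lookup per blog in the submitted list, so a reader returning after a long time offline pays in proportion to its subscription count rather than downloading the whole recent-changes list.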
Radio Userland, Userland Manila - using XmlRpc or SOAP (over HTTP only?) - http://www.thetwowayweb.com/soapmeetsrss and http://backend.userland.com/publishSubscribeWalkthrough |walkthrough directions for Userland Manila
DJAdams did something similar with Jabber. There's a Jabber PubSub http://www.jabber.org/jeps/jep-0024.html |spec
I could see this making sense for use while the reader's machine is online, then with a batch catch-up process after a period offline. You'd want the update messages dumped into your Universal Inbox for prioritizing.
While it's not an issue for engines that render upstream to the server separately from saving content, in simpler systems there are lots of incremental changes to items (especially on a wiki, vs a weblog). Does it really make sense to shove all those "saves" out to the network? No, it makes more sense to use some sort of periodicity/batching model. One option would be to have the author trigger an updated-ping to his subscribers when he thinks it's appropriate (I manually ping WeblogsCom). Another would be to have an agent check for changes on a time period (half an hour?), then generate updated-ping messages.
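The agent variant could be sketched as a small batcher that collapses any number of saves into at most one updated-ping per interval. PingBatcher is a hypothetical name, and the 1800-second default is just the half-hour guess from above. Time is passed in explicitly so the logic can be exercised without waiting.

```python
from typing import Optional


class PingBatcher:
    """Coalesces many saves into at most one updated-ping per interval."""

    def __init__(self, interval_seconds: float = 1800.0) -> None:
        self.interval = interval_seconds
        self.dirty = False  # True when there are saves not yet announced
        self.last_ping: Optional[float] = None

    def record_save(self) -> None:
        """Call on every save; cheap, and sends nothing to the network."""
        self.dirty = True

    def tick(self, now: float) -> bool:
        """Call periodically; returns True when an updated-ping should go out."""
        if not self.dirty:
            return False
        if self.last_ping is not None and now - self.last_ping < self.interval:
            return False  # too soon since the last ping; keep batching
        self.dirty = False
        self.last_ping = now
        return True
```

The author-triggered variant is the same thing with the interval check removed: the author decides when tick should fire.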
Will we hit a point where there are 10million+ blogs? And 10million+ blog readers? If everyone's sucking in lots more stuff than they can or want to read (so their Universal Inbox can rate/filter it), what does that imply for bandwidth scalability? Will ISP-s run Caching Proxy Server-s? Or will writers start to look for ways to limit the number of semi-readers they get? "If I subscribe to your feed then you get real-time PubSub pings, otherwise you can only read/poll once a day."
Apache http://mod-pubsub.sourceforge.net/ |module
WebSphere Java Messaging System http://advisorevents.com/CIW0306p.nsf/4e89a750092af55b88256b66006b2eef/4b451ac5cc3583ae88256c8e005bbe0f?OpenDocument |support
Traditional enterprise messaging like TIBCO.