Noise in the Bloggosphere

As I read through my RSS feeds in Google Reader today, I was once again struck by the increasing number of familiar headlines. By this I don’t mean that similar themes continue to be explored (although true – Hillary is clearly a bad, bad, bad woman and John McCain throws kittens into wells), but rather that I had already read the articles that were popping up as new posts. My immediate thought was that Reader wasn’t catching my ‘mark as read’ flags, or that I had inadvertently created duplicate feeds. Alas, neither was the case. These are the same posts…simply with different authorship claimed. Note that I am not even getting into the automated blog post piracy that is designed only to attract search engine attention.

When you try to stay on top of all your news feeds with a reader and attempt to strategically manage the multitude of feeds, the collapsing of feeds into headlines makes this phenomenon rather obvious. As I considered this, I realized that there is a certain tiering in the bloggosphere. Digg, Reddit and other aggregators are at the lowest level and explicitly point to others’ posts. At the ‘highest’ level you have blogs that create absolutely original, thoughtful and unique posts. Between these there are all manner of variants. Review sites are somewhere in this milieu and they account for a substantial amount of this overlap. Some new gadget is released and the sites all tend to either hear about it or get their hands on it around the same time. Yet, it is interesting to note (when you have far too many RSS feeds coming in) post gravity and proliferation.

If you are subscribing to original source feeds, you might pick up on the press release. This follows the endless rumour posts from the industry watchers. The press release sparks a multitude of posts on how correct an industry site was, and then comments on these. All of these posts get picked up at some point by the aggregators and you have substantial data ballooning.

An Historical Example

All of this is to say that things haven’t changed that much. In my studies of 19th Century hotels in a small rural entrepôt, I came across a charge in the newspaper in 1900 that this centre had more drinking spots per capita than any other place in the Province. A stiff charge, and one that I found echoed every decade or so in books, newspapers and in academic papers. Suddenly one could find many references to the rather intemperate characteristic of the city in question. As artefact, the charge went unexamined and reappeared with little additional embellishment, but was accepted simply for its proliferation. For whatever reason, I questioned this charge in 2003, and decided to challenge the evidence. The simple facts were that the original author used a business directory to compile a list of establishments. He was interested in unique statistics, but the business directory was designed to be very user friendly and direct readers to the address or proprietor of the business. As a result, they attempted to list as many permutations as people would know an establishment by. For example, The British Hotel was listed under that name as well as Jones’ Hotel (the proprietor), Smith’s Hotel (the guy that owned it two years ago), The Anglo-American (as it was known when Smith owned it) and often even others. As a result when the journalist sought to count hotels, he found four references to what was in fact the same establishment, but each seemed unique to him. What he didn’t seem to consider was that the business directory was not designed to provide the information he wanted – and this went unnoticed by every person revisiting his article for the next 100 years. Learned academics ended up overestimating the number of bars by 4x or more. This isn’t to point out that I am just a better sceptic, but simply to re-emphasize that a critical approach to journalism is required in a blog as much as any other media.
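For the programmers reading along, the directory miscount can be sketched in a few lines of Python. This is only an illustration: the four names are the ones from the example above, but the street address and the function names are invented for the sketch, and a real directory would need far messier matching than a shared address.

```python
# Hypothetical directory rows: (listed name, street address).
# The names come from the example in the post; the address is invented.
directory = [
    ("The British Hotel", "12 King St"),
    ("Jones' Hotel", "12 King St"),
    ("Smith's Hotel", "12 King St"),
    ("The Anglo-American", "12 King St"),
]


def count_listings(rows):
    """The journalist's count: every directory line is a separate hotel."""
    return len(rows)


def count_establishments(rows):
    """Collapse lines that share an address into a single establishment."""
    return len({address for _name, address in rows})


print(count_listings(directory), count_establishments(directory))
```

Counting lines gives four hotels, while counting distinct addresses gives one, which is the same fourfold inflation described above.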

Post Proliferation = Truth

So what does this have to do with post proliferation? Much, I think. Just as that original charge gained credibility simply through proliferation, the same thing is happening today, amplified by the bloggosphere. And I sense all the more so as quantity seems to be the measure. Most of our metrics revolve around readership, hits, and references, not originality, veracity or other such qualitative measures.

Getting back to our original point, I have a sense that one might be able to construct a tiered model to show the evolution of a post. There are originators who focus on actual creation, then there are the clear regurgitators, and then the aggregators. In between there are a variety of others, adding their two cents to others’ posts, or using an idea as a foundation and riffing off it. This is what it’s all about. These are not all bad things. The study of the evolution is a glorious discipline itself.

Yet, in the back of my mind when I note the great repetition there’s a sense that people are participating, but not getting it. Maybe it’s all part of the flow, but I sense there is a flaw too, in the measures. If it’s about the collateral advertising with Google AdSense, then it’s about click-through. Get someone to the post and, whether they read it or not, connect the headline to the adverts carefully selected by Google.

On Microblogging

Last year, I added the sideblog plug-in to my own blog, basically a spot where I could collect links and things of note without adding anything to them. While I could have left this in my mainstream of posts, somehow this degraded my sense of my own place in the bloggosphere. Now I am by no means the original contributor on the level of most of those whom I track, but I generally make a point of trying to indicate when I have seen a cool post and do have something to add and therefore include it in my stream.

Originally I did not publish my asides. They were posted in the top corner of my blog, but not included in my RSS feed. I flicked the switch on them yesterday morning – mainly as an exercise in microblogging. JoeResort recently found that Twitter-like microblogging is something that really fits his mode. I have had a Twitter account (there are actually two – another story) for quite a while. I have installed widgets, blog and browser plugins, but it’s not quite something that has clicked for me. The asides have really been self-focussed, much like a parallel del.icio.us (I recently added my own collection as a tag cloud on my blog). Not that I was concerned at all with filtering and creating a public persona, but simply that I found delicious and asides the most convenient way to track things. The third (and parallel) means by which I have been tracking ‘things’ of note is of course Zotero in Firefox. Although Zotero is a smashing tool and offers great data interchange, syncing is not automatic, and I use a variety of machines, each of which has its own SQLite Zotero database. I am sure that I could create a syncing mechanism (probably overweening confidence on my part) but the local nature of the database is both a positive (speed, functionality, standalone operation) and a roadblock (syncing). Additionally, I don’t use Firefox all the time. I like Safari quite often for its blazing speed, and more importantly, on the machine I am writing this on, Firefox has problems with the old G4 and is a CPU hog. But I am going off topic here.

Lessons

So where are we? Attribute stuff when you steal it. Much like the students to whom I try to explain that referencing is a good thing – it just demonstrates that you have been a good researcher – sharing information is a good thing. I may not have seen the post you did and it is through you I find it.

Try to add your own stamp of varying length or erudition. If you can. This is where we are able to demonstrate the real value of the media. Collaborative consideration and construction. The way you see something is different from my perspective, if only in some minute way. Share that perspective and watch the synergy happen.

Find the tool that works for you, and maybe it’s something as simple as microblogging. You have to find a means of information sharing that suits you. Not everyone is going to sit in their basement and create video podcasts, nor is blogging something that appeals to all. But there are a wide variety of means in the online space to participate and the critical eye is often not blinded by the technology.

If you are watching this stuff, reflect on the way in which material is repurposed/mashed up/stolen or simply blindly passed along the chain. Maybe what we need is a qualification for noting the originality of a post, thereby allowing you to set your own thresholds for viewing. The challenge here of course is measuring relative originality and freshness. Maybe if you put a text through a tagging semanticizer and then compared the metadata generated, you could start to compile a genetic descriptor that could allow for determining overall originality. The downside of this is that information is original for me the first time I see it, not the first time someone else, human or machine, sees it.
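To make the idea a little more concrete, here is a rough Python sketch of such an originality score. It rests on assumptions of my own: a crude word extractor stands in for a real tagging semanticizer, Jaccard overlap between tag sets stands in for the ‘genetic descriptor’, and the names extract_tags, jaccard and originality are invented for the illustration.

```python
import re

# Tiny stop-word list, purely illustrative; a real tagger would do far more.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "on", "for", "with"}


def extract_tags(text):
    """Stand-in for a semantic tagger: lowercase words minus stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return {word for word in words if word not in STOP_WORDS}


def jaccard(a, b):
    """Share of tags two sets have in common (1.0 means identical sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def originality(post, earlier_posts):
    """1.0 for nothing in common with any earlier post, 0.0 for a duplicate."""
    if not earlier_posts:
        return 1.0
    tags = extract_tags(post)
    closest = max(jaccard(tags, extract_tags(p)) for p in earlier_posts)
    return 1.0 - closest


seen = ["New gadget released", "Press release confirms the rumour"]
print(originality("New gadget reviewed", seen))
```

If earlier_posts is the list of posts that you personally have already read, rather than everything ever published, the score measures originality for you, which sidesteps the downside noted above. A reader could then hide anything scoring below a threshold of their own choosing.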

Actually, this whole post is an apology for the drought of posts on my own blog of late. It’s just that I am studiously attempting to keep all that I write scrupulously original, but also veracious 😉

2 Comments

  1. Great post! Always curious, why did you not create a more public persona on twitter? Does twitter serve up a chance to gather subscribers into Randomosity? Or is it actually a different type of blog medium, using an “evolving shorthand” to communicate between intimates that is at times only meaningful between followers? The pensees like nature of twitters would make old Blaise proud, more immediate than a group email and certainly less cloistered than an IM. Kudos
