David Weinberger’s “Everything is Miscellaneous” is a well-written exploration of the various “geographies of knowledge” and how our maps of that knowledge are changing as our tools and computational processing improve. If you are a data wonk, an organization freak, or just somebody intrigued by how massive amounts of data get classified, this is a book for you. It’s a lively mix of historical classification schemes and modern use cases of companies finding ways to make their data more useful.

The inside flap of the book lists three “profound consequences” it considers important (the following bullets are quotes from the flap):
- Information is most valuable when it is thrown into a big digital “pile” to be filtered and organized by users themselves.
- Instead of relying on experts, groups of passionate users are inventing their own ways of discovering what they know and want.
- Smart companies do not treat information as an asset to be guarded, but let it loose to be “mashed up,” gaining market awareness and customer loyalty.
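The first two bullets describe a shared pile that each user filters in their own way. A minimal sketch of that idea, using free-form user tags (a folksonomy) rather than a fixed expert taxonomy; the items, tags, and function names here are hypothetical, invented purely for illustration:

```python
# The "big digital pile": every item carries whatever free-form tags
# users have assigned, and each user slices the shared pile with
# their own query rather than a librarian's single shelf order.

pile = [
    {"title": "Moby-Dick", "tags": {"fiction", "whales", "classics"}},
    {"title": "On the Origin of Species", "tags": {"science", "biology", "classics"}},
    {"title": "The Cluetrain Manifesto", "tags": {"business", "internet"}},
]

def filter_pile(pile, wanted_tags):
    """Return items whose tags include every requested tag."""
    return [item for item in pile if wanted_tags <= item["tags"]]

# Two users carving up the same pile differently:
classics_shelf = [i["title"] for i in filter_pile(pile, {"classics"})]
net_shelf = [i["title"] for i in filter_pile(pile, {"internet"})]
print(classics_shelf)
print(net_shelf)
```

The point of the sketch is that no single ordering is privileged: the "shelf" is computed per query, so the same item can live on many shelves at once.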
While I believe the three claims above are well documented throughout the book, I think there is a core component of the dialog that should have been explored in more detail: precisely how this miscellaneous pile becomes relevant and navigable to each individual user. While there are pockets of companies and researchers adding meta-data to digital archives in ways that enrich them for targeted audiences, it’s still a very small group and a very small percentage of the overall material on the web. Indeed, while I know of no metrics on this, I would wager that the index of enriched content is falling further behind the actual pace of content creation. I believe that some rich form of intent publication needs to be added to the equation, in an automatic manner, for this problem to be solved.
In our current world, the internet is an extension of faulty (albeit useful) rules. “Thou must be cited (or linked to) to be useful” seems to be an underlying rule of the internet. I cannot prove it, but I’m willing to bet there is an amazing amount of useful data, specific to given user search queries, that is not returned as a prominent result by the current algorithms. Specifying intent in a search query is a difficult matter, often requiring programming-like query syntax, and that is not something the average internet user knows how to do.
While there is infinitely more data available to the layperson than ever before, and our search engines are enormously useful generic indexes, they still only brush the surface of user intent, something promised by what everyone is calling the “semantic web.” Theoretically, this is the framework that will allow computers to infer intent on behalf of a user and request from other computers precisely what that user probably wants. For this pile of randomness to become useful, we need a generic way of publishing meta-data about the sites being visited in a manner that does not get in the way of the average user. Ideally, this universal schema of intent would be rich enough to apply to individual users’ repositories of data. In fact, this might be where the movement for this sort of disclosure would need to occur. Anybody interested in defining a universal classification scheme for intent? Yikes!
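To make the “publishing meta-data” idea concrete, here is one hypothetical shape such a machine-readable record might take: a small structured description a site could expose alongside its human-readable pages, so that a crawler or agent could read it without parsing prose. The field names and URL below are invented for illustration and do not follow any existing standard:

```python
import json

# A hypothetical per-page meta-data record. Nothing about this schema
# is standardized; it just shows what "intent publication" could look
# like if it happened automatically, outside the user's way.
site_metadata = {
    "url": "http://example.com/articles/home-canning",   # illustrative URL
    "topics": ["cooking", "food preservation"],
    "audience": "hobbyist",
    "content_type": "how-to",
}

# Serialized, the record travels alongside the page for machines to read.
record = json.dumps(site_metadata, indent=2)
print(record)
```

The hard part, of course, is not emitting a record like this but agreeing on the universal schema behind it, which is exactly the “Yikes!” above.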
I’d highly recommend Weinberger’s book as a thought-provoking and fun read. It had me questioning the organization schema of my own home library (and patting myself on the back for a couple of ways I’d apparently internalized hundreds of years of classification methods already). It’s a quick read that doesn’t take long.
I’d love to chat with folks about it so feel free to contact me via the comments if you are interested!