Sunday, July 20, 2008

The Academic World in One Site



OK – perhaps not the world. Let me explain.

One of the features of Brainify that I am most excited about is the idea of a community-generated taxonomy of academic sites. I’d like to explain what it is and how it works.

For those of you unaware of the term “taxonomy” – it simply means a “categorization”. We are all familiar with the biological classification taxonomy (Kingdom, Phylum, Class, Order, Family …). All of life fits somewhere into this taxonomy.

An Academic Taxonomy

Similarly, all academic topics fit into an academic taxonomy. For example, Kinetics might be a subtopic of Physics, and Physics a subtopic of Science. One of the primary goals of Brainify is to produce a populated taxonomy of all the collected academic web sites. The intent is to make subject-based browsing of the academic content collected in Brainify particularly effective. Instead of creating a taxonomy, Brainify could have simply used tagging for categorization (resulting in what some call a “folksonomy”). In fact, Brainify does support tagging because it is an outstanding tool for community classification and effective searching. Tagging, however, is inherently flat and does not support hierarchical browsing. Thus we have chosen to support both tagging and the creation of a community-built taxonomy.

The problem with academic taxonomies is that although they are useful, there is no agreement on a single one. For example, some universities list Computer Science under Science, some under Math, and some under Applied Science. This makes the derivation of an academic taxonomy difficult. And even if there were a standardized taxonomy available (there are, in fact, several, which means none of them is really a standard), deciding where a particular web site fits into that taxonomy could result in a never-ending debate.

How Does Brainify Build the Taxonomy?

First of all – Brainify does not build the taxonomy – the community does. When a member finds a useful academic web site and adds it to their Brainify collection, the collector is asked where he or she believes the web site should sit in the taxonomy. He or she traverses the hierarchy and chooses a location. If the collector does not feel that any of the existing sub-categories are appropriate for the site being collected, he or she is even able to create new sub-categories in the taxonomy. Since it is possible that a web site may be collected in Brainify by any number of different collectors, there may be many different locations in the taxonomy where it is placed *.
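To make the collection step concrete, here is a minimal sketch in Python of how such a structure might be modeled. It is an illustration only, not Brainify's actual code: the class name, method names, collector names and example URL are all assumptions made for this example. It shows categories stored as paths, collectors creating new sub-categories as they go, and a single site ending up in more than one location.

```python
from collections import defaultdict


class Taxonomy:
    """Hypothetical community-built category tree (not Brainify's real code)."""

    def __init__(self):
        # Known category paths, stored as tuples such as ("Science", "Physics").
        self.categories = set()
        # site -> category path -> set of collectors who chose that path.
        self.placements = defaultdict(lambda: defaultdict(set))

    def add_category(self, path):
        """Create a category along with any missing ancestors."""
        for depth in range(1, len(path) + 1):
            self.categories.add(tuple(path[:depth]))

    def subcategories(self, path):
        """List the direct children of a category, for traversing the hierarchy."""
        path = tuple(path)
        return sorted(
            c for c in self.categories
            if len(c) == len(path) + 1 and c[:len(path)] == path
        )

    def collect(self, collector, site, path):
        """Record that a collector placed a site at the given path."""
        self.add_category(path)  # collectors may create new sub-categories
        self.placements[site][tuple(path)].add(collector)

    def locations(self, site):
        """Return every location of a site with the number of collectors per location."""
        return {p: len(who) for p, who in self.placements.get(site, {}).items()}


# Example data is invented for illustration.
site = "http://example.edu/kinetics"
taxonomy = Taxonomy()
taxonomy.collect("alice", site, ["Science", "Physics", "Kinetics"])
taxonomy.collect("bob", site, ["Science", "Physics", "Kinetics"])
taxonomy.collect("carol", site, ["Science", "Chemistry"])
print(taxonomy.locations(site))
```

Here the same site sits in two places at once, which is exactly the situation described above: each collector's choice is recorded rather than overwritten.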

It is my hope, however, that as sites are collected and categorized, patterns will emerge in terms of their location in the taxonomy. I suspect we will find that even though one site is placed in, say, 10 different locations, a very high percentage of the collectors have chosen to place it in the same single location, and the remainder have placed it uniformly over the remaining 9 locations.

For other collected sites, we might find that some smaller proportion of collectors (say 50%) have placed it in one location, but that another significant portion (say 40%) have placed it in a second location. This, in fact, may be perfectly reasonable because some sites will be applicable to different disciplines. Having the community define the taxonomical location(s) of the web site ensures, I believe, that the community will find content in the place they would expect to look for it. So although we are not defining a single taxonomy where each site is located in exactly one spot, the community is instead creating a browsable hierarchy where academic items of interest are likely to be found exactly where other community members would think to look for them. This is the beauty, I believe, of a community-generated taxonomy. What it may give up in correctness, it gains in utility.

Practical Implications

Since Brainify is just launching, we have not yet encountered the practical implications of our community-generated taxonomy (or “multi-onomy” now). However, we can guess at some.

First – I suspect we will find a long collection tail where there are a significant number of meaningless categorizations. For example, although we hope to avoid this through education, it could be that some people will collect a site as “science/assignments/assignment 1”. This might be meaningful to the collector, but not meaningful at all to the remainder of the community. My suspicion is that those kinds of categorizations will be distinguished by lack of agreement. That is, even though there may be a significant percentage of users who categorize meaninglessly, they will each choose their own different meaningless category. So our expectation is that for each collected web site we will find a small number of categories (1-3 or so) chosen by a large majority of collectors, and some larger number of categories each chosen by a very small number of collectors. The latter group can (and should) safely be ignored. Therefore, when browsing a category, Brainify will preferentially display sites where a significant number of users have agreed that the site belongs in that category.
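The idea of ignoring the long tail can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the function names, the 20% cutoff and the sample data are invented for the example, and nothing here describes how Brainify actually ranks sites.

```python
from collections import Counter


def agreed_categories(choices, min_share=0.2):
    """Return categories chosen by at least min_share of a site's collectors.

    choices: one category path (a string) per collector.
    min_share: hypothetical cutoff; the real threshold is not specified.
    """
    counts = Counter(choices)
    total = len(choices)
    return {cat: n / total for cat, n in counts.items() if n / total >= min_share}


def sites_in_category(category, choices_by_site, min_share=0.2):
    """List sites agreed to belong in a category, strongest agreement first."""
    ranked = []
    for site, choices in choices_by_site.items():
        share = agreed_categories(choices, min_share).get(category)
        if share is not None:
            ranked.append((share, site))
    return [site for share, site in sorted(ranked, reverse=True)]


# Invented sample: 12 collectors, three of whom chose one-off categories.
choices = (
    ["science/physics"] * 6
    + ["science/math"] * 3
    + ["science/assignments/assignment 1", "my stuff", "homework/week 2"]
)
kept = agreed_categories(choices)

choices_by_site = {
    "site-a": choices,
    "site-b": ["science/physics"] * 3 + ["science/math"] * 7,
}
ordered = sites_in_category("science/physics", choices_by_site)
print(kept)
print(ordered)
```

In this sample the three one-off categories each get a single vote and fall below the cutoff, so only the two widely chosen categories survive. A fixed share is the simplest possible rule; a minimum number of collectors could be required as well.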

Another potential issue is that the people building the taxonomy will largely be non-experts in their field. As such, it could be that some of the categorizations might not be considered to be the best choices according to discipline experts. I am not overly concerned about this prospect (though that may change with experience). Even though the collectors are non-experts, they are students studying the field and therefore do have a basic, and increasing, knowledge. In addition, the effective place for a site in the hierarchy requires some level of consensus and is not adversely affected by a few inappropriate categorizations. I am a big believer in the power of community and the likelihood that all will come out well in the end – as long as there are enough eyes on it. We are seeing that now in other sites – such as Wikipedia. Although it is not without its problems, the site has turned into an amazing resource – and was built through a community effort.

I am sure there will be other unexpected issues that we have to deal with, as well as other unexpected benefits of this community-generated multi-onomy. But if Brainify ever develops the size and kind of community I wish for it, it is possible that one day we will see Brainify provide a categorization of nearly every academic web site that exists. As new ones are created, the community will almost instantly collect them and place them in the taxonomy. What a wonderful resource this could be for students. I have my fingers crossed.

Take care - Murray

--

* Note that, for this reason, I don’t really consider Brainify’s taxonomy to be a strict taxonomy. Perhaps, therefore, we should call it something other than a taxonomy – like a “Multi-onomy”.
