Semantic UMW

October 4th, 2008:

Links: They’re Not Just For Breakfast and Google Anymore!

I mentioned in my previous post (now also here as a page) that working with links in posts is a big interest of mine. I’d like to give a quick update on the Link Friends Exhibit and expand on why links are so important and useful.

I’ve tweaked the pager for the link friends so that URLs with the most posts linking to them show up first. Unfortunately, the home page of UMWBlogs is excluded from the list because the “Hello World” post of new blogs links there. That makes the size of the result set simply too large to deal with, at least right now. Thanks to Eighteenth Century Audio, Librivox is at the top of the list with 70 — er, now 71 — posts that link to it.
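For the curious, the ordering described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the Exhibit: the `LINKS` data, the `EXCLUDED` set, the example URLs, and the function name `rank_link_targets` are all made up for the example, and `http://example.org/home` merely stands in for the UMWBlogs home page.

```python
from collections import Counter

# Hypothetical (post URL, linked URL) pairs, as might be scraped from feeds.
LINKS = [
    ("http://example.org/blog-a/post-1", "https://librivox.org/"),
    ("http://example.org/blog-b/post-2", "https://librivox.org/"),
    ("http://example.org/blog-b/post-2", "http://example.org/home"),
    ("http://example.org/blog-c/post-3", "http://example.org/home"),
    ("http://example.org/blog-c/post-4", "http://example.org/home"),
    ("http://example.org/blog-c/post-3", "https://www.zotero.org/"),
]

# Assumption: targets that nearly every new blog links to are simply skipped.
EXCLUDED = {"http://example.org/home"}


def rank_link_targets(links, excluded=EXCLUDED):
    """Return (url, post_count) pairs, most-linked URL first."""
    # A set drops repeated links to the same URL from within one post.
    unique_pairs = {(post, url) for post, url in links if url not in excluded}
    counts = Counter(url for _post, url in unique_pairs)
    # Sort by count descending, then by URL so ties come out in a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for url, count in rank_link_targets(LINKS):
        print(count, url)
```

Counting distinct posts rather than raw links keeps a post that links to the same place three times from inflating the total.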

In the Exhibits for individual blogs and posts, the list of links will now also direct you to an Exhibit of all the posts, blogs, and bloggers that share that link. Visit this blog’s Exhibit and click on something in the ‘All Links’ column to see it in action.

The majority of the posts that link to the same place are pairs created by auto-aggregation. Many course blogs are aggregating posts from the various members of the course, and so the same content — and links — appears in two different places. That makes a great number of pairs that link to the same place, which turns out to be a bit misleading since it’s really two instances of the same text, just from different contexts.
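One way to keep aggregated copies from masquerading as separate posts would be to group posts whose text is identical. The sketch below is only an assumption about how that might be done, not something the Exhibit does today; the function names and the idea of fingerprinting the text are mine, and an aggregator that alters markup would defeat this simple exact-match approach.

```python
import hashlib
import re
from collections import defaultdict


def content_fingerprint(text):
    """Hash post text after collapsing whitespace and ignoring case."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_copies(posts):
    """Group post URLs whose text is identical after normalization.

    `posts` maps a post URL to its text. Each returned list holds the
    URLs of one piece of content, wherever it has been republished.
    """
    groups = defaultdict(list)
    for url, text in posts.items():
        groups[content_fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values()]
```

With groups like these, a pair of posts would only count as a genuine match when the two posts fall into different groups.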

I’d like someday for things to get more interesting, with the Exhibit revealing completely different posts that happen to link to the same place. The technical mechanism will handle that; the rest comes down to encouraging people to get into the habit of taking the time to include links to relevant sites when they can. What’s a relevant site? Blog home pages and particular posts are good candidates when your post is responding to someone else. Admittedly, this seems like a bit of overkill, since trackbacks are meant to handle similar cross-referencing. Alas, because I’m scraping all the data from feeds, the trackbacks don’t show up.

Another good candidate is any site that is being discussed in class, serves as a reference, or is a useful tool for the class. Jeff McClurken’s post noting delicious (wikipedia link) as a useful tool is a good example of this. Mentioning Amazon.com (wikipedia) in your post? Make it a link. Using Zotero (wikipedia)? Make it a link. Omeka? Make it a link. Etc., etc., etc. . . .

That’s more than good practice for the readers of your blog, making it easy for them to check out a site that they might not yet be familiar with. It’s much, much more. Many people are familiar with how Google (wikipedia) uses links. Through a mysterious algorithm that only they, and possibly Gandalf, know, the search results are ranked in part by the links pointing to each site. This is the “Google-juice” that useless sites chase to get more traffic. They create a bunch of links to their site, hoping that’ll boost it to the top of the Google results page. It’s also why Wikipedia articles show up so frequently in Google searches: lots of people have linked to a relevant article.

But what I’m talking about is more, and more useful in some ways, than that. I’m talking about exactly the reverse of what Google does. Google uses the link to get information about the target of the link. I’m using the link to get information about the source of the link: your post. That’s a huge difference. That way, your link to stuff relevant to your post becomes data about your post. (That’s part of the idea of a Web of Data along with a Web of Documents that I mention in the About page.)
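The difference between the two directions is easy to see in code. Here is a minimal sketch, assuming the scraped data boils down to (post, link target) pairs; the names `index_links` and `link_friends` are made up for the illustration and are not taken from the project’s actual code.

```python
from collections import defaultdict


def index_links(pairs):
    """Build both directions of lookup from (post, target) pairs."""
    links_by_post = defaultdict(set)    # source -> targets: data about the post
    posts_by_target = defaultdict(set)  # target -> sources: the Google-style view
    for post, target in pairs:
        links_by_post[post].add(target)
        posts_by_target[target].add(post)
    return links_by_post, posts_by_target


def link_friends(post, links_by_post, posts_by_target):
    """Return the other posts that share a link with `post`.

    The result maps each such post to the set of links it has in common.
    """
    friends = defaultdict(set)
    for target in links_by_post.get(post, ()):
        for other in posts_by_target[target]:
            if other != post:
                friends[other].add(target)
    return dict(friends)
```

The first dictionary is the one that matters here: it treats the set of links as a description of the post itself, and the second dictionary is only a convenient way to find other posts with overlapping descriptions.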

What difference does it make? Teaching and learning is all about discovering unexpected, maybe even serendipitous, connections. Two completely different people, studying completely different things, might very well be writing about the same site, tool, or topic. Including a link makes it easier to discover the unexpected commonalities across very different contexts.

But wait, as they say: there’s more. One especially useful variety of link is a link to a relevant Wikipedia article. My previous post mentioned that linking to a Wikipedia article serves as a tool for disambiguation, distinguishing between Paris, France; Paris, Texas; Paris Hilton; and Paris the Trojan hero. There’s more to it. A LOT more.

Through the extraordinary service DBpedia (wikipedia article here), I will eventually be able to offer guides to finding similar posts even if they do not link to precisely the same Wikipedia article. DBpedia has been doing basically the same thing with Wikipedia that I’m doing with online content from UMW. Indeed, they are very much the inspiration for this project. They’re scraping data out of Wikipedia pages and making it available on the Semantic Web as Linked Data.

As I always used to encourage my students to ask, “So what?” Almost all Wikipedia articles are in several different categories. DBpedia easily exposes those categories, which means it will be possible to find posts that link to Wikipedia articles in the same categories. DBpedia also plays nicely with Geonames.org (wikipedia link), a huge body of geographical data. That will make it possible to find posts and/or blogs discussing things in the same geographical region. So if one post links to the Wikipedia article for Paris, France, another links to the article for France, and another links to the article for the Eiffel Tower, it should be possible to pull all those together into a list of posts about stuff in France.
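To make the France example concrete, here is a rough sketch of how the grouping might work once the category data is in hand. It relies on the convention that an English Wikipedia article at `/wiki/Title` corresponds to the DBpedia resource `http://dbpedia.org/resource/Title`. Everything else is an assumption for illustration: the `CATEGORIES` table is hand-written rather than fetched from DBpedia, its category name is invented, and the function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def dbpedia_uri(wikipedia_url):
    """Map an English Wikipedia article URL to a DBpedia resource URI.

    Returns None for anything that is not an en.wikipedia.org article URL.
    """
    parsed = urlparse(wikipedia_url)
    if parsed.netloc != "en.wikipedia.org" or not parsed.path.startswith("/wiki/"):
        return None
    return DBPEDIA_RESOURCE + parsed.path[len("/wiki/"):]


# Hand-written stand-in for category data that would really come from DBpedia.
# The category name is illustrative and has not been checked against DBpedia.
CATEGORIES = {
    DBPEDIA_RESOURCE + "Paris": {"Stuff_in_France"},
    DBPEDIA_RESOURCE + "France": {"Stuff_in_France"},
    DBPEDIA_RESOURCE + "Eiffel_Tower": {"Stuff_in_France"},
}


def posts_by_category(post_links, categories=CATEGORIES):
    """Group posts by the categories of the Wikipedia articles they link to.

    `post_links` is an iterable of (post, wikipedia_url) pairs. Only
    categories shared by at least two posts are returned.
    """
    grouped = defaultdict(set)
    for post, wikipedia_url in post_links:
        uri = dbpedia_uri(wikipedia_url)
        for category in categories.get(uri, ()):
            grouped[category].add(post)
    return {cat: posts for cat, posts in grouped.items() if len(posts) > 1}
```

In a real version the lookup table would be filled from DBpedia’s published data rather than typed in by hand, but the grouping step would look much the same.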

Did I mention that there’s more? DBpedia plays nicely with many other data sets. There’s YAGO, which offers standardized terms and relationships between them. There’s also a newer initiative from Zitgist called UMBEL. These projects are aiming toward subtle and precise categorization of material, which will make it easier to discover people with shared interests and thoughts.

We’ve moved into the future directions and possibilities for these technologies, and there’s still plenty of work to do to stitch it all together. That’s fine and as it should be. But the important thing we can all do now is to get into the habit of linking heavily. It’s a simple, easy-to-do technique that will pay bigger and bigger dividends as time and technology progress.
