OpenCalais

Posted by Minnaar Pieters 05 Mar 2009
While not everyone quite believes in the semantic web, you have to admit that the potential of the idea is amazing. If search engines can recognize relationships between different elements within a document, search results will become much more accurate. Searching will also let a user ask very direct questions, and a semantic search engine (Yahoo is apparently developing one; I didn't know they still had any innovation left!) would know the answer from analysing documents.


The tough part is, of course, tagging documents with relevant metadata so that relationships can easily be scanned. I recently came across OpenCalais.com, which identifies relationships within a document with surprising accuracy. It automatically embeds these tags within the document, and from there you can easily integrate them into your site or document. OpenCalais offers a Document Viewer tool for trying this out.

I ran my thesis through the Document Viewer and the results are quite incredible. A user can still go in and tweak the results, but I cannot help thinking how useful this will be in fields like academia. Suddenly my thesis was a garbled mess of links, but on closer inspection they make perfect sense. If someone can develop this into a visualization tool, the results will be amazing. (That's a tip for you, OpenCalais!)

"Calais is a web service that uses natural language processing (NLP) technology to semantically tag text that is input to the service. The tags are delivered to the user who can then incorporate them into other applications - for search, news aggregation, blogs, catalogs, you name it."

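To make that a bit more concrete, here is a minimal sketch of how tags from a service like this could be folded back into a post. The `Tag` record, the `annotate` function and the sample tags are all hypothetical and made up for illustration; this is not the actual Calais API or its response format.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Tag:
    """Hypothetical tag record: an entity type plus the exact text it covers."""
    entity_type: str
    text: str


def annotate(document: str, tags: list[Tag]) -> str:
    """Wrap the first occurrence of each tagged phrase in a <span> carrying its type."""
    # Locate each phrase in the original text so earlier edits cannot shift offsets.
    spans = []
    for tag in tags:
        if not tag.text:
            continue  # nothing to match
        start = document.find(tag.text)
        if start != -1:
            spans.append((start, start + len(tag.text), tag))

    # Earliest span first; on a tie, prefer the longer phrase.
    spans.sort(key=lambda s: (s[0], -s[1]))

    pieces = []
    cursor = 0
    for start, end, tag in spans:
        if start < cursor:
            continue  # overlaps a phrase that was already wrapped
        pieces.append(escape(document[cursor:start]))
        pieces.append(
            f'<span class="entity" data-type="{escape(tag.entity_type)}">'
            f"{escape(tag.text)}</span>"
        )
        cursor = end
    pieces.append(escape(document[cursor:]))
    return "".join(pieces)


if __name__ == "__main__":
    # Made-up tags, standing in for whatever a tagging service would return.
    sample_tags = [Tag("City", "Stellenbosch"), Tag("Person", "Minnaar")]
    print(annotate("Minnaar wrote a thesis in Stellenbosch.", sample_tags))
```

Working from positions in the original text, rather than doing repeated search-and-replace, keeps one tag from accidentally matching inside markup that was added for another.
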
Users can also download a plugin for WordPress (no Blogger support?) which automatically creates these relationships for every post you publish.

If you are unfamiliar with the idea of the Semantic Web, Wikipedia describes it as follows:

"Humans are capable of using the Web to carry out tasks such as finding the Finnish word for "monkey", reserving a library book, and searching for a low price for a DVD. However, a computer cannot accomplish the same tasks without human direction because web pages are designed to be read by people, not machines. The semantic web is a vision of information that is understandable by computers, so that they can perform more of the tedious work involved in finding, sharing, and combining information on the web." (Wikipedia)

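As a rough illustration of what "understandable by computers" could mean, the sketch below stores a few facts as subject/predicate/object triples and answers a direct question against them. The `FACTS` list and the `query` helper are hypothetical names invented for this example, and the DVD shops and prices are made-up sample data; real semantic web systems use standards such as RDF rather than a plain Python list.

```python
from typing import Optional

# A fact is a (subject, predicate, object) triple.
Triple = tuple[str, str, str]

# Sample facts. The Finnish word is real; the shops and prices are made up.
FACTS: list[Triple] = [
    ("monkey", "finnish_word", "apina"),
    ("example_dvd", "price_at_shop_a", "14.99"),
    ("example_dvd", "price_at_shop_b", "12.50"),
]


def query(
    facts: list[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> list[Triple]:
    """Return every triple matching the given fields; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in facts
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


if __name__ == "__main__":
    # "What is the Finnish word for monkey?" as a machine-readable question.
    print(query(FACTS, subject="monkey", predicate="finnish_word"))
```
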
Thanks for the tip, ZaTechShow podcast 51.

I am an R&D Analyst in Stellenbosch, South Africa, with an immense passion for all things tech related. I embrace technology, open source and web standards, and I participate in and contribute to the social web.