Saturday, August 15, 2009

Pain and Research

Today, I got out of bed very late. I had a terrible toothache during the night. After breakfast, I wrote something about ontology matching techniques for my thesis. I want to share it with whoever reads my blog. I hope they will pray for me for my thesis presentation.

Ontology matching techniques can be classified as follows:
  • Element Level Matching
  • Corpus-based Matching
  • Knowledge-base Matching
  • Semantic Matching
In an element-level matching system, we start the process by comparing two strings. Several methods exist for comparing strings:
  1. Prefix
  2. Suffix
  3. Edit distance
  4. N-gram
For interested readers, I suggest the book "Ontology Matching" by Jérôme Euzenat and Pavel Shvaiko.
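
To make the four string comparisons listed above concrete, here is a minimal Python sketch. The function names and the choice of Jaccard overlap for the n-gram score are my own assumptions for illustration; real matchers usually normalize these scores in different ways.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def common_suffix_len(a: str, b: str) -> int:
    """Length of the longest common suffix (prefix of the reversed strings)."""
    return common_prefix_len(a[::-1], b[::-1])


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def ngrams(s: str, n: int = 3) -> set:
    """Set of character n-grams of s."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard overlap of the two n-gram sets (an assumed scoring choice)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        # Both strings are shorter than n: fall back to exact comparison.
        return 1.0 if a == b else 0.0
    return len(ga & gb) / len(ga | gb)
```

For example, edit_distance("kitten", "sitting") is 3, and common_prefix_len("author", "authors") is 6.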

In corpus-based matching, a large corpus is used. Tokens are the most important element in this kind of matching. We can find matches using:

  • LSI (Latent Semantic Indexing)
  • Cluster Code Difference
  • Formal Concept Analysis
  • Common Instance Comparison
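
As a rough illustration of the last item, common instance comparison can be sketched as the overlap between the instance sets of two concepts. The Jaccard measure and the example data below are my own assumptions, not a fixed definition of the technique.

```python
def instance_overlap(instances_a, instances_b) -> float:
    """Jaccard overlap of two instance collections, between 0.0 and 1.0."""
    a, b = set(instances_a), set(instances_b)
    if not a and not b:
        return 0.0  # no instances at all, so no evidence for a match
    return len(a & b) / len(a | b)


# Hypothetical instances of "Car" in one ontology and "Automobile" in another.
car = {"Fiat Punto", "VW Golf", "Toyota Yaris"}
automobile = {"VW Golf", "Toyota Yaris", "Ford Focus"}
score = instance_overlap(car, automobile)  # 2 shared out of 4 distinct = 0.5
```

The higher the score, the more likely the two concepts describe the same thing.
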
In knowledge-base matching, external resources are used, e.g. WordNet, thesauri, taxonomies, etc.

In semantic matching, match acts as an operator that takes two graphs and produces a mapping, but it depends on knowledge-based techniques as well. I think semantic matching cannot be accomplished without the help of a knowledge base. However, I can suggest a paper on semantic matching:
"Semantic Matching" by Pavel Shvaiko and Fausto Giunchiglia

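The idea of match as an operator over two graphs can be sketched in Python as follows. This is a heavily simplified, label-only sketch: the SYNONYMS and BROADER tables are a hypothetical toy knowledge base standing in for a resource like WordNet, the relation names are my own, and a real semantic matcher would also use the position of each node in its graph.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy knowledge base (stand-in for WordNet or a thesaurus).
SYNONYMS = {frozenset({"car", "automobile"})}
BROADER = {"car": "vehicle", "automobile": "vehicle", "bike": "vehicle"}


def relation(a: str, b: str) -> Optional[str]:
    """Semantic relation between two labels, or None if the knowledge base has none."""
    a, b = a.lower(), b.lower()
    if a == b or frozenset({a, b}) in SYNONYMS:
        return "equivalent"
    if BROADER.get(a) == b:
        return "less_general"   # a is narrower than b
    if BROADER.get(b) == a:
        return "more_general"   # a is broader than b
    return None


def match(graph1: Dict[str, List[str]],
          graph2: Dict[str, List[str]]) -> List[Tuple[str, str, str]]:
    """Take two graphs (node -> children) and produce a mapping between their nodes."""
    mapping = []
    for n1 in graph1:
        for n2 in graph2:
            rel = relation(n1, n2)
            if rel is not None:
                mapping.append((n1, rel, n2))
    return mapping


g1 = {"Vehicle": ["Car"], "Car": []}
g2 = {"Automobile": [], "Bike": []}
result = match(g1, g2)
```

Without the two knowledge tables, only identical labels would match, which is why I say the semantic step depends on a knowledge base.
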
I think matching is one of the hardest tasks. You cannot achieve 100% correct matching results with an automatic matcher. I am not a pessimistic person; I am optimistic. I am sure that we can overcome all these problems. It will be a major breakthrough for the heterogeneity problem in data integration.




Tuesday, August 11, 2009

Different Kinds of Controlled Vocabularies

I think there are several kinds of controlled vocabularies, e.g. thesauri, ontologies, subject schemas, and catalogs. These vocabularies play an important role in information and communication systems and in library systems.

Thesaurus: used in library science and knowledge organization systems.

Subject Schema: mainly used in text categorization and document annotation.

Catalogs: used in library science and the entertainment industry.

All of these are used for searching, information extraction, etc.

In the past, controlled vocabularies were used mainly by library science people. With the arrival of ontologies, people have understood the need for controlled vocabularies for commercial and research purposes. There is a major problem, though: there is no universal controlled vocabulary for any specific domain. I must say that we need one very soon, especially in medical science, agricultural science, and other specific fields. I keep my fingers crossed for a universal controlled vocabulary.

As I am a researcher in this field, I am looking forward to concrete research or volunteer work on it.

Monday, August 10, 2009

Controlled Vocabulary

Several things come to mind when we hear about controlled vocabulary. To simplify my thoughts about controlled vocabularies, I planned to capture them in my blog:

  • What is a Controlled Vocabulary?
  • Why do we need them in real life?
  • How can you build a Controlled Vocabulary?
To begin with, a controlled vocabulary can be a kind of database, an ontology, a thesaurus, Yellow Pages, a classification schema, etc. The simple definition of a controlled vocabulary is a set of concepts and their relationships.

CV = Concepts + Relationships
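
The formula above can be sketched as a small Python data structure. The class name, the method names, and the "part_of" relation are my own hypothetical choices for illustration, not a standard API.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class ControlledVocabulary:
    """A controlled vocabulary as a set of concepts plus relationships between them."""
    concepts: Set[str] = field(default_factory=set)
    # Each relationship is a (source, relation, target) triple.
    relationships: Set[Tuple[str, str, str]] = field(default_factory=set)

    def add_concept(self, name: str) -> None:
        self.concepts.add(name)

    def relate(self, source: str, relation: str, target: str) -> None:
        # Only known concepts may be related: this is what makes it "controlled".
        if source not in self.concepts or target not in self.concepts:
            raise ValueError("both ends of a relationship must be known concepts")
        self.relationships.add((source, relation, target))

    def related(self, concept: str, relation: str) -> List[str]:
        """Concepts reachable from `concept` through `relation`, sorted by name."""
        return sorted(t for s, r, t in self.relationships
                      if s == concept and r == relation)


cv = ControlledVocabulary()
cv.add_concept("Trento")
cv.add_concept("Italy")
cv.relate("Trento", "part_of", "Italy")
```

Here cv.related("Trento", "part_of") returns ["Italy"], and relating an unknown concept raises ValueError.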

For example, on Flickr, if people tag their photos according to predefined keywords, then it is easy to find the information. One person gives the keyword "Trento" and puts all the pictures under Trento. Here "Trento" works as a controlled word.
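
The tagging example can be sketched in Python like this. The PREFERRED table, the function names, and the file names are hypothetical; this is not how Flickr itself works, only an illustration of why a controlled word makes retrieval easy.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical table mapping free-text tag variants to the controlled word.
PREFERRED = {"trento": "Trento", "trient": "Trento"}


def normalize_tag(tag: str) -> Optional[str]:
    """Return the controlled word for a free-text tag, or None if it is not in the vocabulary."""
    return PREFERRED.get(tag.strip().casefold())


def index_photos(tagged_photos: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group photos under their controlled word; tags outside the vocabulary are skipped."""
    index: Dict[str, List[str]] = {}
    for photo, tag in tagged_photos:
        term = normalize_tag(tag)
        if term is None:
            continue
        index.setdefault(term, []).append(photo)
    return index


photos = [("a.jpg", "Trento"), ("b.jpg", " trento "),
          ("c.jpg", "Trient"), ("d.jpg", "holiday")]
index = index_photos(photos)
```

All three spellings end up under the single keyword "Trento", so one search finds every picture.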

Controlled vocabularies have many applications:

  • Information Extraction
  • Information browsing
  • Searching information
There are several approaches to building controlled vocabularies. I will describe them next time. Going to run now... :(

Sunday, August 9, 2009

Ontology Construction and Evaluation

I am writing on this topic because I am very much interested in it. At first, we need to clarify some questions:
  1. What is an ontology?
  2. What is an ontology tool?
  3. Why do we need to evaluate it?
From my point of view, an ontology is a conceptual representation that captures one scenario of the real world.
But the famous definition is "An explicit specification of a conceptualization".


There are lots of ontology construction tools around the world. The most famous tool is Protégé from Stanford University.

Ontology construction is very costly and time-consuming. For example, if we want to capture all the knowledge of one organization, the solution is an ontology. Now, we need to examine ontology construction tools so that we can design an ontology at minimum cost.