Thursday, December 3, 2009
Agrovoc as Linked Data
I have wanted to write about linked data for a long time. Since I work on the Agrovoc thesaurus, I keep asking how we can publish it in a linked data format. Before starting, we need to clarify some questions:
- What is linked data?
- What are the uses of linked data?
- How will Agrovoc work as linked data?
Saturday, October 10, 2009
Ecoterm Meeting
The Ecoterm workshop on 5 and 6 October at FAO, Rome, was a great experience. I had the chance to represent the AIMS registry and mapping projects. Besides this, I talked with several wonderful people. Gail is a wonderful lady with vast knowledge. From the workshop I learned that we should take the initiative now to build environmental terminology for the future. Climate change is a big issue for the earth, and we need a universal thesaurus for earth science and geoscience. I know it is difficult to build and difficult to maintain. We can keep our fingers crossed.
I was not satisfied with any of the mapping projects. Nobody explained their thinking to me clearly. Isaac asked me about prefLabel from SKOS. I have not found any ontology matching tool that works on SKOS files. In our case, we can use prefLabel by parsing the SKOS file.
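To give an idea of what parsing a SKOS file for prefLabel could look like, here is a minimal sketch using only the Python standard library. The sample data, the example.org URIs and the function name pref_labels are my own assumptions for illustration; they are not taken from Agrovoc or from any matching tool.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical two-concept SKOS sample in RDF/XML; the URIs are made up.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/c_1">
    <skos:prefLabel xml:lang="en">Maize</skos:prefLabel>
    <skos:prefLabel xml:lang="fr">Maïs</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/c_2">
    <skos:prefLabel xml:lang="en">Rice</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>
"""


def pref_labels(rdf_xml, lang="en"):
    """Map each concept URI to its preferred label in the given language."""
    root = ET.fromstring(rdf_xml)
    labels = {}
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        uri = concept.get(f"{{{RDF}}}about")
        for label in concept.findall(f"{{{SKOS}}}prefLabel"):
            if label.get(XML_LANG) == lang:
                labels[uri] = label.text
    return labels


if __name__ == "__main__":
    for uri, text in pref_labels(SAMPLE).items():
        print(uri, text)
```

The resulting dictionary of labels could then be handed to any string-based matcher. A real SKOS file may also use other serializations (Turtle, for example), which this sketch does not handle.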
My idea was concept facet-based matching, but I was not smart enough to explain it well. Still, I feel that we can use it for matching. A facet is a distinct feature of a concept that contains hidden knowledge about the concept. I am writing a paper on it now, hoping that it will be published and people will get to know more of my thoughts.
Tuesday, September 1, 2009
Meta data in KOS
I have been thinking of writing about Knowledge Organization Systems (KOS), since I am working in this area.
To start with KOS, we have to know about metadata: simply, data about data.
Metadata for a KOS:
* Name /title
* Acronym
* Owner/Creator
* Language
* Type
* Format
* Note
* Usage
* E-Mail
* Date()
* Sources
* Version (we should keep version control so that we can walk through different versions when we need to)
* Usage / subject coverage / purpose / rating
* Signature (sometimes used for security reasons)
Here are more details in the KOS registry draft:
Ref: http://staff.oclc.org/~vizine/NKOS/Thesaurus_Registry_version3_rev.htm
Different forms of Knowledge Organization Systems (KOS) and their standards:
Dictionaries, glossaries
ISO 12200:1999, Computer applications in terminology--Machine Readable Terminology
Interchange Format (MARTIF)--Negotiated Interchange
ISO 12620:1999, Computer applications in terminology--Data Categories.
Thesauri
ISO 2788-1986(E) / ANSI/NISO Z39.19-1993(R1998) (www.niso.org)
ZThes (using Z39.50, strictly ANSI Z39.19)
(http://www.loc.gov/z3950/agency/profiles/zthes-04.html)
Browser at http://muffin.indexdata.dk/zthes/tbrowse.zap
Vocabulary Markup Language (VocML) (under discussion at NKOS)
See also http://ceres.ca.gov/KOS/
ISO 5964-1985(E) (multilingual)
USMARC format for authority data
(http://lcweb.loc.gov/marc/authority/ecadhome.html)
Topic maps (reference works, encyclopedias) (http://www.topicmaps.org/about.html)
ISO/IEC 13250:2000 Topic Maps
XML Topic Maps (XTM) 1.0 (http://www.topicmaps.org/xtm/1.0/)
Concept maps
Classification schemes
USMARC format for classification data
http://lcweb.loc.gov/marc/classification/eccdhome.html
Ontologies
Knowledge Interchange Format (KIF) NCITS.T2/98-004
(http://meta2.stanford.edu/kif/dpans.html)
Ontology Markup Language (OML) /
Conceptual Knowledge Markup Language (CKML)
(http://www.ontologos.org/OML/CKML-Grammar.html)
Ontology Interface Layer (OIL) (http://www.ontoknowledge.org/oil/)
Generic standards for knowledge structures, entity-relationship models
Resource Description Framework (RDF) (http://www.w3.org/RDF/)
Metadata Coalition. Open Information Model (OIM). Knowledge Management Model
(http://www.mdcinfo.com/OIM/)
XTM might also fit here
Ref: Dagobert Soergel
If we take the example of a Thesaurus Registry for KOS, we can define it in the following way:
* termId
* termName
* term Qualifier
* term Language
* term Created Date
* term Modified Date
* term Modified by
* Source DB
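The term fields listed above can be sketched as a small record type. This is only my illustration under assumptions: the class name TermRecord, the Python field names, the touch helper and the sample values are hypothetical and do not come from the registry draft.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class TermRecord:
    """One thesaurus term, mirroring the fields listed above (names are assumed)."""
    term_id: str
    term_name: str
    language: str
    source_db: str
    qualifier: Optional[str] = None
    created: date = field(default_factory=date.today)
    modified: Optional[date] = None
    modified_by: Optional[str] = None

    def touch(self, editor, when=None):
        """Record who changed the term and when."""
        self.modified = when or date.today()
        self.modified_by = editor


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    term = TermRecord("t1", "maize", "en", "ExampleDB", created=date(2009, 9, 1))
    term.touch("editor1", date(2009, 9, 2))
    print(term)
```

Keeping the created and modified dates separate makes it possible to see when a term last changed without losing when it entered the registry.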
Every group has its own KOS and its own presentation system. The structure or standard depends entirely on how, and for what purpose, you will use the system.
A vocabulary system or faceted system brings a lot of benefits.
But it cannot currently be utilised to its full potential because the semantic structure is not explicitly represented.
From my point of view, faceted analysis is a big research issue nowadays. I do believe that if we can present and adapt our ontology in a faceted way, we can browse it easily.
I want to stop writing here; I am now thinking a lot about digital preservation of digital documents.
Monday, August 24, 2009
Preserving digital Memories
Today I was talking with my friend Imma, who is a library management specialist. I asked for her views on digital document preservation. She told me that text documents are preserved as PDF and that viewing is difficult if the file size is large.
How to preserve information is a hot issue in the semantic web community. For example, dynamic information (satellite pictures or audio) is very difficult to preserve.
This problem is not limited to the semantic web; it covers all domains, for example geographic information, nature pictures, videos, etc.
Previously, people took pictures with analog cameras, developed them and preserved them in albums. Now we take lots of pictures with digital cameras, but how many of them do we preserve? From my own experience, I have lost a lot of pictures to a hard disk crash. I will never get those pictures again, since I cannot go back in time.
Also, people feel good when they remember childhood memories. But some people cannot remember everything.
Once I asked a researcher from the UK about this. Her vision was an electromagnetic chip that would help people remember things. Do we think that is enough?
In ancient times, people wrote information on stone. Professor Kurodo says, "Archiving the mountains of digitalised culture heritage we have amassed for the future is paramount."
There are lots of initiatives nowadays:
Recently Yahoo announced that they will help digitise 18,000 works of American literature plus material from national and European archives. That will include books, speeches, audio, video and music.
The news channel BBC also maintains an archive.
Ref: http://www.bbc.co.uk/archive/
The University of Trento and the city of Trento launched a new project, "LiveMemory". The main theme of the project is to preserve the city's information.
The main problem of preservation is format. There is no single format for pictures, video, audio or text. Another problem is file size. Some people offer to store your information or files online, but I think it is not secure.
In my view, we can do the following things:
- Everybody should agree on a single format, in the way W3C did for RDF, OWL and XML.
- We should start a campaign about "Losing Information in Every Second" through the popular search engines.
- Build an international forum for digital information (pictures forum, documents forum, etc.).
- Make a digital repository for every city.
Sunday, August 23, 2009
My master thesis
Recently I received compliments from two people about my master thesis. I was extremely happy to get them.
I did my master thesis at KTH, Sweden, with my friend Ramanjit Singh. He is a very nice guy with a good sense of humor. We started our thesis under Prof. Paul Johannesson and Gudrun Jeppesen Neve. Our teachers were extremely nice and helpful to us. Our thesis was about "Evaluation Ontology Construction tools and Ranking techniques". Initially we had planned to evaluate at least 10 ontology construction tools, but we did not have time, especially me, since I had got a PhD position at the University of Trento, Italy. I told Prof. Paul about it. He encouraged me about the PhD and told me that we could evaluate 3 tools. We worked hard and presented our thesis.
I was always fascinated by my thesis. But I was doing completely different things in my PhD studies. After 2 years, I had forgotten about my thesis and the earlier work.
Then I got a letter from a researcher in the UK and an e-mail from a semantic columnist in the USA. I was amazed and felt extremely good. I wish I could do more research on it. I remember my beautiful days at KTH.
Saturday, August 22, 2009
Library Catalog System
A library catalog system keeps records for all bibliographic items.
Bibliographic items:
- books
- computer files
- graphics
- realia
- cartographic materials
History:
As far as I know from history, library catalogues were introduced in the House of Wisdom. At that time there was a big collection of books from the 7th and 8th centuries in Iraq, during the Islamic Renaissance. They used a totally different catalogue system in their library.
Later on, Hulagu Khan attacked Iraq and destroyed and burnt all the books.
Ref: http://liswiki.org/wiki/History_of_the_card_catalog
Here are some wonderful tools for cataloging:
Ref: http://www.lib.berkeley.edu/Catalogs/
Holy Ramadan
Allah has given us lots of things, but we always forget to show our respect and make sacrifices for him. Today is our holy Ramadan. I am praying to God that I can keep my fast. People generally think you cannot work while you are fasting. I myself believe that I do my best work in Ramadan. Anyway, back to my thesis writing.
Friday, August 21, 2009
Semantic heterogeneity and factors
I have been thinking of writing about semantic heterogeneity for a long time. It is a big problem for semantic matching, and my thesis is on matching between two controlled vocabularies.
In short, I found some factors behind the heterogeneity problem:
- Time (vocabulary changes from time to time)
- Place (after 50 miles a new language starts; for example, the Italian of Trentino people is different from that of Bolzano people, and every state in India has a distinct language)
- Cultural diversity (English people say "centre", American people say "center")
- Structure of vocabulary (there is no single presentation of a vocabulary; for example, some people use RDF files, some use XML files)
- Syntactic heterogeneity
- Terminological heterogeneity (Paper vs Article)
- Conceptual heterogeneity, also called semantic heterogeneity
- Semiotic heterogeneity
Sunday, August 16, 2009
Is social networking a problem?
Last night this question came to my mind: is social networking good, bad, or just time consuming? I thought I would write something about it.
As we know, social networking sites (Facebook, MySpace, friends, hi5) are growing in popularity every day. Are they popular only with people aged 15-20, or also 21-30, 31-48 and so on? I think that people aged 17-19 are mostly the ones who are crazy about Facebook. I did a small survey on this. I asked a couple of people and found out that they basically use Facebook for making new friends. Guys are poking girls, and girls are also looking at guys' faces and physiques. I asked myself, "Is it useful, not useful, or just time consuming?" In one sense it is useful: you can make good new friends. On the other hand, it is simply time consuming, and you could use this time for other purposes. So it is having some impact on young people at an early age.
If you consider the 21-35 group, you find that most people use it to communicate with their school, college or university friends or with office colleagues. I do not support people who browse Facebook at the office. If you spend 15 or 20 minutes on Facebook every day, you can easily waste 1 h of office time. There was news about an office worker who told his boss he was sick, but the boss saw him on Facebook, got angry and fired him. Cases like this are coming up nowadays. I am not saying you should not use it. I am saying that you can use it, but you should also respect your office hours and work.
I think we should use social networking sites after office hours, or use them for learning or for group meetings.
The 36-48 age group, however, uses social networking sites to find old school buddies. They are not on them so frequently. They are happy to see people and communicate with them.
However, we can make it useful:
1. A teacher can open a page for his course and hold open discussions through it.
2. We can schedule meetings within a group.
3. We can publish useful information so that people can gain some knowledge.
Saturday, August 15, 2009
Pain and Research
Today I got out of bed very late. I had a terrible toothache in the night. After breakfast, I was writing about ontology matching techniques for my thesis. I want to share it with whoever reads my blog. I hope they will pray for me for my thesis presentation.
Ontology matching techniques can be classified as follows:
- Element-level matching
- Corpus-based matching
- Knowledge-based matching
- Semantic matching
Element-level matching compares the names of terms using string techniques such as:
- Prefix
- Suffix
- Edit distance
- N-gram
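The four string techniques (prefix, suffix, edit distance, n-gram) can be sketched in a few lines of Python. This is only a minimal illustration with function names I chose myself. The edit distance is the standard Levenshtein distance, and for n-grams I assume a Dice coefficient over character trigrams, which is one common choice among several.

```python
def prefix_match(a, b):
    """True if one string is a prefix of the other (case-insensitive)."""
    a, b = a.lower(), b.lower()
    return a.startswith(b) or b.startswith(a)


def suffix_match(a, b):
    """True if one string is a suffix of the other (case-insensitive)."""
    a, b = a.lower(), b.lower()
    return a.endswith(b) or b.endswith(a)


def edit_distance(a, b):
    """Levenshtein distance: minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def ngrams(s, n=3):
    """Set of character n-grams of a string (case-insensitive)."""
    s = s.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Dice coefficient over shared n-grams, between 0.0 and 1.0."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


if __name__ == "__main__":
    print(prefix_match("net", "network"))
    print(suffix_match("phone", "telephone"))
    print(edit_distance("centre", "center"))
    print(ngram_similarity("centre", "center"))
```

In practice a matcher would combine these scores with a threshold. None of them looks at meaning, which is why the other technique families in the list are needed.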
In corpus-based matching, a large corpus is used. Tokens are the most important thing for this kind of matching. We can find matches using:
- LSI (Latent Semantic Indexing)
- Cluster Code Difference
- Formal Concept Analysis
- Common Instance Comparison
In semantic matching, match acts as an operator that takes two graphs and produces a mapping, but it depends on knowledge-based techniques as well. I think semantic matching cannot be accomplished without the help of a knowledge base. I can suggest a paper to read on semantic matching:
"Semantic Matching" by Pavel Shvaiko and Fausto Giunchiglia
I think that matching is one of the hardest tasks. You cannot achieve 100% correct matching results with an automatic matcher. I am not a pessimistic person; I am optimistic. I am sure that we can overcome all these problems. It will be a major breakthrough for the heterogeneity problem in data integration.
Tuesday, August 11, 2009
Different Kinds of Controlled Vocabularies
I think that several kinds of controlled vocabulary exist, e.g. thesauri, ontologies, subject schemas and catalogs. These vocabularies play an important role in information and communication systems and in library systems.
Thesaurus: used in library science and knowledge organization systems
Subject schema: mainly used in text categorization or document annotation
Catalogs: library science + entertainment industry
All of these are used for searching, information extraction, etc.
In the past, it was mainly library science people who used controlled vocabularies. Since the arrival of ontologies, people understand the need for controlled vocabularies for commercial and research purposes. The major problem is universality: there is no universal controlled vocabulary for any specific domain. I must say that we need one very soon, especially in medical science, agricultural science and other specific fields. I keep my fingers crossed for a universal controlled vocabulary.
As I am a researcher in this field, I look forward to concrete research or volunteer work on it.
Monday, August 10, 2009
Controlled Vocabulary
Several things come to mind when we hear about controlled vocabulary. In order to simplify my thoughts about controlled vocabularies, I planned to capture them in my blog:
- What is a controlled vocabulary?
- Why do we need them in real life?
- How can you build a controlled vocabulary?
CV = Concepts + Relationships
For example, on Flickr, if people tag their photos according to predefined keywords, then it is easy to get the information. One person gives the keyword "Trento" and puts all the pictures under Trento. Here "Trento" is working as a controlled word.
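The photo tagging example can be sketched as a tiny controlled vocabulary in Python: concepts with their accepted variants, plus one "broader" relationship. Everything here is my own assumption for illustration (the dictionaries, the function names and the photo file names); it does not use any Flickr API.

```python
# Hypothetical mini vocabulary: free-text variants mapped to a preferred concept.
VARIANTS = {
    "trento": "Trento",
    "trient": "Trento",      # German name of the same city
    "bolzano": "Bolzano",
    "bozen": "Bolzano",      # German name of the same city
}

# One relationship type: each concept points to its broader concept.
BROADER = {
    "Trento": "Trentino-Alto Adige",
    "Bolzano": "Trentino-Alto Adige",
}


def normalize_tag(tag):
    """Return the controlled concept for a free-text tag, or None if unknown."""
    return VARIANTS.get(tag.strip().lower())


def photos_for(term, photos):
    """Names of photos whose tags resolve to `term` or to a concept directly under it."""
    hits = []
    for name, tags in photos.items():
        concepts = {normalize_tag(t) for t in tags}
        concepts.discard(None)
        if any(c == term or BROADER.get(c) == term for c in concepts):
            hits.append(name)
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical photo collection with uncontrolled tags.
    photos = {
        "a.jpg": ["Trento ", "duomo"],
        "b.jpg": ["Trient"],
        "c.jpg": ["Bozen"],
    }
    print(photos_for("Trento", photos))
    print(photos_for("Trentino-Alto Adige", photos))
```

The concepts make different spellings land on the same word, and the relationship lets a search on the region also find photos tagged with a city.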
There are many applications of controlled vocabularies:
- Information Extraction
- Information browsing
- Searching information
Sunday, August 9, 2009
Ontology Construction and Evaluation
I am writing on this topic because I am very much interested in it. At first, we need to clarify some questions:
- What is an ontology?
- What is an ontology tool?
- Why do we need to evaluate it?
A famous definition is: "An explicit specification of a conceptualization".
There are lots of ontology construction tools around the world. The most famous tool is Protege from Stanford University.
Ontology construction is very costly and time consuming. For example, if we want to capture all the knowledge of one organization, the solution is an ontology. So we need to evaluate ontology construction tools so that we can design an ontology at minimum cost.