How to avoid ambiguity on large vocabularies

How to avoid ambiguity on large vocabularies

Laura Morales
When designing a vocabulary with several classes and properties that evolve over time, I'm not sure what is a good strategy to avoid ambiguity. Let me explain better. On traditional paper vocabularies/dictionaries such as the dictionary of the English language, words usually have more than one meaning. The way of removing ambiguity is the context. In RDF I could use classes as the context to disambiguate a property with more than one meaning. For example

    # This is clear
    :Alice
        a :Singer;
        :hasProduced :AlbumFoo . # :hasProduced refers to a music album

and

    # This is clear
    :Bob
        a :Director ;
        :hasProduced :MovieBar . # :hasProduced refers to a movie

but a person could be both

    # This is ambiguous. What is the meaning of the property :hasProduced?
    :Charlie
        a :Singer;
        :hasProduced :AlbumFoo ;

        a :Director ;
        :hasProduced :MovieBar .

My particular problem is that the vocabulary grows over time, and it's not easy to foresee which types future individuals will have. For example, all my current individuals could be singers, and then a future individual could be both a singer and a director. The few strategies that come to mind are:

1) the vocabulary starts simple, but all future classes/properties become more and more wordy to avoid collisions. This doesn't look like a good plan in the long run; I can see it getting very messy over time
2) use the simplest properties first, and replace them later if collisions appear. For example, start with :hasProduced and change it to :hasProducedMusicAlbum and :hasProducedMovie when new individuals like :Charlie are inserted. This means changing all the existing individuals, which is not ideal, especially if other people are reusing the vocabulary
3) split the :Charlie node into 3: :Charlie, :CharlieTheSinger and :CharlieTheDirector

I wonder how other people cope with this problem.
_______________________________________________
protege-user mailing list
[hidden email]
https://mailman.stanford.edu/mailman/listinfo/protege-user
Re: How to avoid ambiguity on large vocabularies

Laura Morales
Another thing that I did not consider is that the "context" for disambiguating the meaning of a property could also be found by looking at the current node together with its neighboring nodes, for example

    :Charlie
        a :Singer ;
        :hasProduced :AlbumFoo ; # The neighboring node has type :MusicAlbum

        a :Director ;
        :hasProduced :MovieBar . # The neighboring node has type :Movie

so, 4) write more complex queries



Re: How to avoid ambiguity on large vocabularies

Lorenz Buehmann

On 07.10.19 10:07, Laura Morales wrote:
> Another thing that I did not consider is that the "context" for disambiguating the meaning of a property could be found by looking at the current node and the neighborhood nodes too, for example
>
>     :Charlie
>         a :Singer;
>         :hasProduced :AlbumFoo ; # The neighborhood has type :MusicAlbum
>
>         a :Director ;
>           :hasProduced :MovieBar . # The neighborhood has type :Movie


That won't work. You would lose the relationship between the context and
the facts, wouldn't you? I mean, this is just a set of triples:

:Charlie a :Singer.
:Charlie :hasProduced :AlbumFoo .
:Charlie a :Director .
:Charlie :hasProduced :MovieBar .

which is equivalent to

:Charlie a :Singer.
:Charlie a :Director .
:Charlie :hasProduced :MovieBar .
:Charlie :hasProduced :AlbumFoo .


about an entity :Charlie. Neither the :Singer nor the :Director context is
directly related to the corresponding :hasProduced triple, and the order
of triples doesn't matter. You'd need some intermediate structure, such as
blank nodes, here.
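
To make the point about ordering concrete, here is a minimal sketch in plain Python. It is not an RDF library: triples are simply modelled as (subject, predicate, object) tuples, which is an assumption made only for illustration.

```python
# The two listings from above, in the order they were written.
first_order = [
    (":Charlie", "a", ":Singer"),
    (":Charlie", ":hasProduced", ":AlbumFoo"),
    (":Charlie", "a", ":Director"),
    (":Charlie", ":hasProduced", ":MovieBar"),
]
second_order = [
    (":Charlie", "a", ":Singer"),
    (":Charlie", "a", ":Director"),
    (":Charlie", ":hasProduced", ":MovieBar"),
    (":Charlie", ":hasProduced", ":AlbumFoo"),
]

# An RDF graph is a set of triples, so the position of a triple in the file
# carries no information. Both listings describe the same graph.
same_graph = set(first_order) == set(second_order)
print(same_graph)
```

Since both listings collapse to the same set, nothing in the graph says that ":Singer" goes with ":AlbumFoo" rather than with ":MovieBar".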



Re: How to avoid ambiguity on large vocabularies

Laura Morales
What I was trying to say with the example is that I could have two classes that define a property with the same name but different meanings, that is:

    :Singer   :hasProperty :hasProduced
    :Director :hasProperty :hasProduced

so in the example of :Charlie I could disambiguate the meaning of :hasProduced like this:

- if :Charlie is a :Singer, then :hasProduced means a music album
- if :Charlie is a :Director, then :hasProduced means a movie
- if :Charlie is both, then look at the type of the object linked by :hasProduced

In this particular case I think it can work. Not so much for data properties though, e.g. strings.
Either way, I'd love to know how other people deal with this issue.
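
The three rules above could be sketched roughly as follows. This is plain Python over (subject, predicate, object) tuples rather than a real query language, and the classes :MusicAlbum and :Movie for the objects, as well as the function name, are assumptions for illustration.

```python
def meaning_of_has_produced(graph, subject, obj):
    """Guess what :hasProduced means for one (subject, obj) pair.

    Returns ":MusicAlbum", ":Movie", or None when the rules cannot decide.
    """
    subject_types = {o for s, p, o in graph if s == subject and p == "a"}
    is_singer = ":Singer" in subject_types
    is_director = ":Director" in subject_types

    if is_singer and not is_director:
        return ":MusicAlbum"
    if is_director and not is_singer:
        return ":Movie"
    if is_singer and is_director:
        # Both types apply, so fall back to the type of the linked object.
        object_types = {o for s, p, o in graph if s == obj and p == "a"}
        candidates = object_types & {":MusicAlbum", ":Movie"}
        if len(candidates) == 1:
            return next(iter(candidates))
    return None


graph = {
    (":Charlie", "a", ":Singer"),
    (":Charlie", "a", ":Director"),
    (":Charlie", ":hasProduced", ":AlbumFoo"),
    (":Charlie", ":hasProduced", ":MovieBar"),
    (":AlbumFoo", "a", ":MusicAlbum"),
    (":MovieBar", "a", ":Movie"),
}

album_meaning = meaning_of_has_produced(graph, ":Charlie", ":AlbumFoo")
movie_meaning = meaning_of_has_produced(graph, ":Charlie", ":MovieBar")
# A plain string value has no type triple of its own, so the fallback fails.
string_meaning = meaning_of_has_produced(graph, ":Charlie", "Some title")
```

The last call shows the data-property problem: with a string as the value there is no neighboring node whose type could be inspected, so the function returns None.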
Re: How to avoid ambiguity on large vocabularies

Michael DeBellis-2
I think there are a few ways to address your question. If hasProduced means something completely different for movies and for music, then you could just put the two properties in different namespaces (e.g., film and music). So if Charlie is a movie producer you would say Charlie film:hasProduced film:Breathless, and if he is a music producer you would say Charlie music:hasProduced music:AbbeyRoad.
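
As a small illustration of the namespace idea, here is a sketch in plain Python with triples as tuples. The namespace IRIs are hypothetical; only the prefixes film and music come from the suggestion above.

```python
# Hypothetical namespace IRIs, chosen only for this example.
FILM = "http://example.org/film#"
MUSIC = "http://example.org/music#"
CHARLIE = "http://example.org/people#Charlie"

triples = {
    (CHARLIE, FILM + "hasProduced", FILM + "Breathless"),
    (CHARLIE, MUSIC + "hasProduced", MUSIC + "AbbeyRoad"),
}


def objects_of(graph, predicate):
    """Return every object linked by exactly this predicate IRI."""
    return {o for _, p, o in graph if p == predicate}


# Same local name, different full IRIs, so the two properties never collide.
films = objects_of(triples, FILM + "hasProduced")
albums = objects_of(triples, MUSIC + "hasProduced")
```

Because a property is identified by its full IRI, the shared local name hasProduced causes no ambiguity here.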

However, you might want to say that producing a movie and producing music are related but different. Then you could use sub-properties: hasProduced would be the super-property, with hasProducedFilm and hasProducedMusic as its sub-properties.

In this latter case, if I said Charlie hasProducedFilm Breathless then the reasoner would also conclude that Charlie hasProduced Breathless. Similarly, if Charlie hasProducedMusic AbbeyRoad, it would infer that Charlie hasProduced AbbeyRoad.

You could also define the ontology with domains and ranges (or with DL statements) so that if we know that AbbeyRoad is a work of music and that Charlie hasProduced AbbeyRoad, then the reasoner infers that Charlie hasProducedMusic AbbeyRoad, and similarly for movies. An advantage of this approach is that it's open-ended: if you know Charlie hasProduced Breathless but you don't yet know whether Breathless is a film or music, you can just leave it at Charlie hasProduced Breathless, and if/when you determine that Breathless is a movie the reasoner will make further inferences for you automatically.
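
The two kinds of inference described here can be imitated with a tiny hand-written rule loop. This is only a sketch in plain Python, not an OWL reasoner: the class names :Film and :Music and the refinement table are assumptions standing in for whatever axioms the ontology would actually use.

```python
# Sub-property axioms: every sub-property statement implies the super-property.
SUB_PROPERTY_OF = {
    ":hasProducedFilm": ":hasProduced",
    ":hasProducedMusic": ":hasProduced",
}
# Hypothetical refinement rule: type of the object -> more specific property.
REFINE_BY_OBJECT_TYPE = {
    ":Film": ":hasProducedFilm",
    ":Music": ":hasProducedMusic",
}


def infer(graph):
    """Apply both rules repeatedly until no new triple appears."""
    inferred = set(graph)
    while True:
        new = set()
        for s, p, o in inferred:
            if p in SUB_PROPERTY_OF:
                new.add((s, SUB_PROPERTY_OF[p], o))
            if p == ":hasProduced":
                for s2, p2, o2 in inferred:
                    if s2 == o and p2 == "a" and o2 in REFINE_BY_OBJECT_TYPE:
                        new.add((s, REFINE_BY_OBJECT_TYPE[o2], o))
        if new <= inferred:
            return inferred
        inferred |= new


facts = {
    (":Charlie", ":hasProducedFilm", ":Breathless"),
    (":Charlie", ":hasProduced", ":AbbeyRoad"),  # kind of work not known yet
}
before = infer(facts)
# Later we learn that AbbeyRoad is a work of music.
after = infer(facts | {(":AbbeyRoad", "a", ":Music")})
```

In `before`, the AbbeyRoad statement stays at the general hasProduced level; in `after`, the extra type triple is enough for the more specific hasProducedMusic statement to appear.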



Re: How to avoid ambiguity on large vocabularies

Bill Dyla
In reply to this post by Laura Morales

I think your goal is to unambiguously establish the role (director, singer, actor, producer, etc.) a person played in the creation of some work of art (movie, music album, etc.), and also to be able to start simply with a few roles and extend as required.

If so, I would:


1- Create a class :Role and add individuals for the roles (:actor, :director, :singer, :producer, etc.)

2- Create a class :WorkOfArt with subclasses (:MusicAlbum, :Movie). The subclasses are not used in the explanation below.

3- Create a class that qualifies the relationship between a person and a work of art, e.g. :Involvement (more on this later)

4- Create the object property that relates people to works of art, e.g. :wasInvolvedIn with range :WorkOfArt

5- Create the object property that relates people to roles played, e.g. :hadRole with range :Role

6- Create the qualified relation between a person and their involvement, e.g. :qualifiedInvolvement with range :Involvement. This allows us to assert other information about the relationship between a person and a work of art, such as the role they played in its creation.
Note that I did not set a domain. This allows us to add other involved parties, such as the publishing company, e.g. Sony Music.


And back to your original question: has a person directed, acted, played banjo, etc.?
Add another class hierarchy:
:HadRole
     :Actor
     :Director
     ...

Each of these classes has equivalent class assertions. For example, the equivalent classes for :Actor are:
  :qualifiedInvolvement some (:Involvement and :hadRole value :actor)
and, separately,
  :Person and :hadRole value :actor

You’ve now got a way to *incrementally* add the roles that people and organizations played with respect to a work of art, and a way to classify them as actors, musicians, etc.


And finally the assertions about individuals:

:dirtyHarry a :Movie .
:millionDollarBaby a :Movie .
:clintEastwood a :Person ;
   :qualifiedInvolvement [
        :hadRole :actor ;
        :wasInvolvedIn :dirtyHarry ] ;
   :qualifiedInvolvement [
        :hadRole :producer ;
        :hadRole :director ;
        :wasInvolvedIn :millionDollarBaby ] .


Run the reasoner and you’ll also see
:clintEastwood a :Actor
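
For readers who want to see the classification step spelled out, here is a rough sketch in plain Python with triples as tuples. It only imitates what a reasoner would derive from the equivalent class axioms; the blank node labels `_:inv1` and `_:inv2`, the class :Producer, and the function names are assumptions for illustration.

```python
triples = {
    (":dirtyHarry", "a", ":Movie"),
    (":millionDollarBaby", "a", ":Movie"),
    (":clintEastwood", "a", ":Person"),
    (":clintEastwood", ":qualifiedInvolvement", "_:inv1"),
    ("_:inv1", ":hadRole", ":actor"),
    ("_:inv1", ":wasInvolvedIn", ":dirtyHarry"),
    (":clintEastwood", ":qualifiedInvolvement", "_:inv2"),
    ("_:inv2", ":hadRole", ":producer"),
    ("_:inv2", ":hadRole", ":director"),
    ("_:inv2", ":wasInvolvedIn", ":millionDollarBaby"),
}

# Role individual -> class under :HadRole (":Producer" is assumed here).
ROLE_TO_CLASS = {":actor": ":Actor", ":director": ":Director", ":producer": ":Producer"}


def classify(graph):
    """Derive 'x a :Actor' style triples from qualified involvements."""
    inferred = set()
    for person, p, involvement in graph:
        if p != ":qualifiedInvolvement":
            continue
        for s, p2, role in graph:
            if s == involvement and p2 == ":hadRole" and role in ROLE_TO_CLASS:
                inferred.add((person, "a", ROLE_TO_CLASS[role]))
    return inferred


def works_in_role(graph, person, role):
    """Return the works that justify a classification (the 'why')."""
    involvements = {o for s, p, o in graph if s == person and p == ":qualifiedInvolvement"}
    with_role = {s for s, p, o in graph if s in involvements and p == ":hadRole" and o == role}
    return {o for s, p, o in graph if s in with_role and p == ":wasInvolvedIn"}


inferred_types = classify(triples)
acted_in = works_in_role(triples, ":clintEastwood", ":actor")
```

Because each role hangs off its own involvement node, the role and the work stay tied together, which is exactly what the flat :hasProduced triples could not express.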

This is a long way around to determine that Clint Eastwood is an actor, producer and director, but it also explains *why* he’s an actor (he acted in specific movies). You can easily extend this to his production company Malpaso and to the movies related to it. It is also easily extended to capture the awards for which he was nominated and which he won, and the class :AwardWinningActor.

This is a pattern I use a lot in real-world applications.
See W3C PROV for published examples.


