rule to link two individuals having a similar value for the same dataProperty. Using JaccardIndex

Javingka

Hi everyone!

I'm trying to create a rule that links two individuals having similar values for the same data property, using the Jaccard index as the similarity measure.

In N3, the situation would look like this:

{
  ?indA a :person .
  ?indB a :person .
  ?indA :hasName ?nameA .
  ?indB :hasName ?nameB .
  (?nameA ?nameB) :hasJaccardIndex ?jaccardValue .
  ?jaccardValue math:lessThan 0.2 .
} => {
  ?indA :hasSimilarName ?indB .
}

The algorithm is:

(cA ∩ cB) / (cA + cB - (cA ∩ cB)), or equivalently (cA ∩ cB) / (cA ∪ cB)

where

  • cA -> the number of characters in the string A
  • cB -> the number of characters in the string B
  • cA ∩ cB -> the number of characters that strings A and B have in common (intersection)
  • cA ∪ cB -> the total number of characters across both strings, without repetitions (union)
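
As a reference point for checking rule output, the formula above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name jaccard_index is hypothetical, characters are counted as distinct characters (so that the two forms of the formula agree), case is ignored, and two empty strings are treated as identical.

```python
def jaccard_index(name_a: str, name_b: str) -> float:
    """Jaccard index over the sets of distinct characters of two strings."""
    # Assumption: compare case-insensitively and count each character once.
    chars_a = set(name_a.lower())
    chars_b = set(name_b.lower())
    union = chars_a | chars_b
    if not union:
        # Assumption: two empty strings count as identical.
        return 1.0
    return len(chars_a & chars_b) / len(union)


# "martha" has 5 distinct characters, "marta" has 4; they share 4 of 5 overall.
print(jaccard_index("Martha", "Marta"))  # 0.8
```

The index is 1.0 for identical character sets and 0.0 for disjoint ones, so similar names score high. A test such as "less than 0.2" would fit the Jaccard distance (1 minus the index) rather than the index itself.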
Any ideas? Have you seen anything of the sort? The math: and log: ontologies make me think this kind of function can be applied.

Thanks in advance!



Sent from the Protege User mailing list archive at Nabble.com.

_______________________________________________
protege-user mailing list
[hidden email]
https://mailman.stanford.edu/mailman/listinfo/protege-user
Re: rule to link two individuals having a similar value for the same dataProperty. Using JaccardIndex

Michael DeBellis-2
I couldn't quite follow the algorithm or exactly what you want the result of the rule to be, but I think what you want can be done pretty straightforwardly in SWRL. Have you looked here: https://www.w3.org/Submission/SWRL/#8.2 Those are the math built-ins. Essentially, you first bind a set of variables in the rule to values such as ?cA, ?cB, etc., and then use the math built-ins to compare them. Besides the math built-ins you would also need the string built-ins: https://www.w3.org/Submission/SWRL/#8.4 With the string built-ins you can do things like find substrings, find the length of strings, etc.

So I'll use an example that I know isn't as complex as what you are trying to do but I think should give you the info you need to implement that algorithm. Suppose you have a property called hasLongerName that is a relation between two Persons where the first Person has a longer name than the second. You could find that in SWRL like this:

Person(?p1) ^ Person(?p2) ^ hasName(?p1, ?ns1)  ^ hasName(?p2, ?ns2) ^ swrlb:stringLength(?ns1l, ?ns1)  ^ swrlb:stringLength(?ns2l, ?ns2) ^ swrlb:greaterThan(?ns1l,  ?ns2l) -> hasLongerName(?p1, ?p2)

Here hasName is a functional data property whose value is a string, ns1 stands for name string 1, and ns1l stands for name string 1 length. Note that you don't really need the Person(?p1) and Person(?p2) atoms at the beginning unless other things also have names and you only want to compare the names of Persons. But I like to include those kinds of expressions because I think they make the rules more intuitive for others who might maintain them. Also, as far as I know swrlb:greaterThan and the other comparison built-ins work on strings as well as numbers, but on strings they compare by string ordering rather than by length, so the stringLength step is needed here. It also gives you an idea of how you can work back and forth between the math built-ins and the string built-ins. Hope that helps.
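
To sanity-check what the hasLongerName rule above should derive, its logic can be emulated outside the reasoner. The following is a rough Python sketch, not SWRL and not a Protege API: the function has_longer_name and the sample data are hypothetical, and each person is assumed to have exactly one name, matching the functional hasName property.

```python
from itertools import permutations


def has_longer_name(names: dict[str, str]) -> set[tuple[str, str]]:
    """Return every ordered pair (p1, p2) where p1's name is longer than p2's."""
    return {
        (p1, p2)
        # Ordered pairs of distinct persons, like binding ?p1 and ?p2 in the rule.
        for p1, p2 in permutations(names, 2)
        # Mirrors the two swrlb:stringLength atoms plus swrlb:greaterThan.
        if len(names[p1]) > len(names[p2])
    }


# Hypothetical individuals and their hasName values.
people = {"p1": "Alexandra", "p2": "Alex", "p3": "Sam"}
print(sorted(has_longer_name(people)))
# [('p1', 'p2'), ('p1', 'p3'), ('p2', 'p3')]
```

The same loop shape would apply to a similarity rule: only the condition inside the comprehension changes.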

Michael

In reply to Javingka's message of Wed, Oct 9, 2019, 5:48 AM.