Property path parsing bug in Protégé?

Joshua TAYLOR
Hello all,

I just answered a question on StackOverflow [1] in which I ran into a
small ontology (posted in full there) for which three SPARQL queries
that I thought should return the same results return different
results.  My answer [2] to the question includes the ontology and the
queries, as well as screenshots.  The long and short of it is that
selecting some subclasses of :C (those on a direct rdfs:subClassOf
path, modulo owl:equivalentClass links) with this query

    ?subclass (^owl:equivalentClass|(owl:equivalentClass|rdfs:subClassOf))*  :C

works just fine.  However, a seemingly equivalent query

    (rdfs:subClassOf|owl:equivalentClass|^owl:equivalentClass)*

returns different results, and another

    (owl:equivalentClass|^owl:equivalentClass|rdfs:subClassOf)*

returns different results from the first two.

If this isn't a known bug already, it's probably not too hard to
pinpoint exactly what's happening (using a different example and three
distinct properties), but I haven't done that yet.
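
A minimal sketch of that kind of test, in plain Python rather than SPARQL. The toy graph and the property names p, q and r are hypothetical and do not come from the ontology in question. It evaluates a starred alternation such as (p|q|^r)* by breadth of reachability and checks that the order of the alternatives makes no difference, which is what a conforming engine should also show.

```python
from itertools import permutations

# Hypothetical toy graph with three distinct properties p, q and r.
TRIPLES = {
    ("a", "p", "b"),
    ("b", "q", "c"),
    ("d", "r", "c"),  # reachable from c only through the inverse step ^r
    ("d", "p", "e"),
}

def step(node, alt):
    """Nodes one step away from `node` via a single alternative.

    An alternative is a property name, or '^' + name for an inverse step.
    """
    if alt.startswith("^"):
        return {s for (s, p, o) in TRIPLES if p == alt[1:] and o == node}
    return {o for (s, p, o) in TRIPLES if p == alt and s == node}

def star(start, alternatives):
    """Evaluate (alt1|alt2|...)* from `start`: zero or more steps."""
    seen, frontier = {start}, [start]
    while frontier:
        node = frontier.pop()
        for alt in alternatives:
            for nxt in step(node, alt) - seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

# One result set per ordering of the three alternatives.
results = {order: star("a", order) for order in permutations(["p", "q", "^r"])}
```

If an engine returns different answers for different orderings of the same alternatives, the problem is in how the path is parsed or evaluated, not in the data.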

Does this look like a bug, or have I misread the property path syntax?

//JT

[1] http://stackoverflow.com/q/21092246/1281433
[2] http://stackoverflow.com/a/21093154/1281433

--
Joshua Taylor, http://www.cs.rpi.edu/~tayloj/
_______________________________________________
protege-owl mailing list
[hidden email]
https://mailman.stanford.edu/mailman/listinfo/protege-owl

Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03
Re: Property path parsing bug in Protégé?

Joshua TAYLOR
Sorry, I ended up editing that answer.  The version that I originally
mentioned was

http://stackoverflow.com/revisions/21093154/2

However, the updated version,

http://stackoverflow.com/revisions/21093154/3

shows some similar issues and compares the output against Jena's.
In case the answer gets edited any more, I'll include the query and
sample data here.  Here's a simple ontology with an axiom

A subClassOf (B and C and (p some D))

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns="http://stackoverflow.com/q/21092246/1281433/data.owl#">
  <owl:Ontology rdf:about="http://stackoverflow.com/q/21092246/1281433/data.owl"/>
  <owl:Class rdf:about="http://stackoverflow.com/q/21092246/1281433/data.owl#C"/>
  <owl:Class rdf:about="http://stackoverflow.com/q/21092246/1281433/data.owl#B"/>
  <owl:Class rdf:about="http://stackoverflow.com/q/21092246/1281433/data.owl#A">
    <rdfs:subClassOf>
      <owl:Class>
        <owl:intersectionOf rdf:parseType="Collection">
          <owl:Class rdf:about="http://stackoverflow.com/q/21092246/1281433/data.owl#B"/>
          <owl:Class rdf:about="http://stackoverflow.com/q/21092246/1281433/data.owl#C"/>
          <owl:Restriction>
            <owl:onProperty>
              <owl:ObjectProperty rdf:about="http://stackoverflow.com/q/21092246/1281433/data.owl#p"/>
            </owl:onProperty>
            <owl:someValuesFrom>
              <owl:Class rdf:about="http://stackoverflow.com/q/21092246/1281433/data.owl#D"/>
            </owl:someValuesFrom>
          </owl:Restriction>
        </owl:intersectionOf>
      </owl:Class>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>

Using Jena, this query

prefix :      <http://stackoverflow.com/q/21092246/1281433/data.owl#>
prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
prefix owl:   <http://www.w3.org/2002/07/owl#>
prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

select ?superclass where {
  :A (rdfs:subClassOf|(owl:intersectionOf/rdf:rest*/rdf:first))* ?superclass .
}

produces this output:

--------------
| superclass |
==============
| :A         |
| _:b0       |
| :B         |
| :C         |
| _:b1       |
--------------

Now, I realize that Protege will typically do something prettier with
the blank nodes (in this case, it ought to show (B and C and (p some
D)) and (p some D) ).  However, I actually only get the results A and
B, as shown in http://i.stack.imgur.com/bv7xm.png .
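
To make the expected answer concrete, here is a rough sketch in plain Python that walks the same path by hand. It is not Jena's or Protege's implementation; the blank node labels (_:intersection, _:restriction, _:cell0 and so on) are made up, and the triples are what I assume the RDF/XML above expands to, with the collection written out as an rdf:first/rdf:rest list.

```python
RDF_NIL = "rdf:nil"

def rdf_list(items, label):
    """Encode items as an RDF collection; return (head node, triples)."""
    cells = [f"_:{label}{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        rest = cells[i + 1] if i + 1 < len(cells) else RDF_NIL
        triples += [(cells[i], "rdf:first", item), (cells[i], "rdf:rest", rest)]
    return (cells[0] if cells else RDF_NIL), triples

# A subClassOf (B and C and (p some D)); blank node labels are hypothetical.
head, TRIPLES = rdf_list([":B", ":C", "_:restriction"], "cell")
TRIPLES += [
    (":A", "rdfs:subClassOf", "_:intersection"),
    ("_:intersection", "owl:intersectionOf", head),
    ("_:restriction", "owl:onProperty", ":p"),
    ("_:restriction", "owl:someValuesFrom", ":D"),
]

def objects(node, predicate):
    """All o such that (node, predicate, o) is in the graph."""
    return {o for (s, p, o) in TRIPLES if s == node and p == predicate}

def reachable(starts, step):
    """Zero-or-more closure: every node reachable from `starts` via `step`."""
    seen, frontier = set(starts), list(starts)
    while frontier:
        for nxt in step(frontier.pop()) - seen:
            seen.add(nxt)
            frontier.append(nxt)
    return seen

def one_step(node):
    """rdfs:subClassOf | (owl:intersectionOf / rdf:rest* / rdf:first)"""
    found = objects(node, "rdfs:subClassOf")
    cells = reachable(objects(node, "owl:intersectionOf"),
                      lambda cell: objects(cell, "rdf:rest"))
    for cell in cells:
        found |= objects(cell, "rdf:first")
    return found

# The whole path is starred, so :A itself is included (zero steps).
superclasses = reachable({":A"}, one_step)
```

The five nodes this reaches correspond to the five rows in the Jena output: :A, the intersection class, :B, :C and the restriction. The list cells never appear because they are only passed through inside the sequence path, and :D is not reached because nothing in the path follows owl:someValuesFrom.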


--
Joshua Taylor, http://www.cs.rpi.edu/~tayloj/
Re: Property path parsing bug in Protégé?

Timothy Redmond

So I think that the query in question is

prefix :      <http://stackoverflow.com/q/21092246/1281433/data.owl#> 
prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> 
prefix owl:   <http://www.w3.org/2002/07/owl#> 
prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 

select ?superclass where {
  :A (rdfs:subClassOf|(owl:intersectionOf/rdf:rest*/rdf:first))* ?superclass .
  filter(!isBlank(?superclass))
}
run against the attached owl file. Interestingly when I tried this I got a result including A, B and C. 

But I wonder whether this issue comes up because the query is sensitive to the way that the OWL file is encoded as RDF.  The current SPARQL query plugin works by
  1. taking the OWL ontology in memory and writing it as RDF in the form of an in-memory Sesame model.
  2. using the Sesame query tools to run the query.

In particular the original OWL as RDF file is not used for the query.  I can imagine that the class expression

B
 and C
 and (p some D)

might be expressible in RDF with two owl:intersectionOf expressions.  Perhaps this is what happened in your in-memory sesame model.  But I am not sure why this would happen exactly.

-Timothy.



joshua-sparql.owl (1K) Download Attachment
Re: Property path parsing bug in Protégé?

Joshua TAYLOR
On Tue, Jan 14, 2014 at 2:13 AM, Timothy Redmond <[hidden email]> wrote:

> select ?superclass where {
>   :A (rdfs:subClassOf|(owl:intersectionOf/rdf:rest*/rdf:first))* ?superclass .
>   filter(!isBlank(?superclass))
> }
>
> run against the attached owl file. Interestingly when I tried this I got a
> result including A, B and C.
>
> But I am wondering if this issue comes up because of the fact that the query
> is sensitive to the way that the OWL file is encoded as RDF.  The current
> SPARQL query plugin works by
>
>   1. taking the owl ontology in memory and writing it as RDF in the form of an
>      in-memory sesame model.
>   2. using the sesame query tools to make the query.
>
> In particular the original OWL as RDF file is not used for the query.  I can
> imagine that the class expression
>
> B
>  and C
>  and (p some D)
>
> might be expressible in RDF with two owl:intersectionOf expressions.
> Perhaps this is what happened in your in-memory sesame model.  But I am not
> sure why this would happen exactly.

I hadn't thought about exactly what RDF the SPARQL query would be run
against, but I don't think that explains the missing results.  A
pattern like

    :A (rdfs:subClassOf|(owl:intersectionOf/rdf:rest*/rdf:first))* ?superclass

should still find superclasses through nested intersection classes.
(That was sort of the point of this exercise:  encoding some OWL
reasoning (in a brittle way, I realize) in a SPARQL query). We can
test the SPARQL query and the RDF directly with Jena's command line
tools and then compare with Protege.  Here's data (also attached) that
has nested intersection classes corresponding to "A subClassOf B and
(C and D)":

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://stackoverflow.com/q/19924861/1281433/sample.owl#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <owl:Ontology rdf:about="http://stackoverflow.com/q/19924861/1281433/sample.owl"/>
  <owl:Class rdf:about="http://stackoverflow.com/q/19924861/1281433/sample.owl#C"/>
  <owl:Class rdf:about="http://stackoverflow.com/q/19924861/1281433/sample.owl#D"/>
  <owl:Class rdf:about="http://stackoverflow.com/q/19924861/1281433/sample.owl#B"/>
  <owl:Class rdf:about="http://stackoverflow.com/q/19924861/1281433/sample.owl#A">
    <rdfs:subClassOf>
      <owl:Class>
        <owl:intersectionOf rdf:parseType="Collection">
          <owl:Class rdf:about="http://stackoverflow.com/q/19924861/1281433/sample.owl#B"/>
          <owl:Class>
            <owl:intersectionOf rdf:parseType="Collection">
              <owl:Class rdf:about="http://stackoverflow.com/q/19924861/1281433/sample.owl#C"/>
              <owl:Class rdf:about="http://stackoverflow.com/q/19924861/1281433/sample.owl#D"/>
            </owl:intersectionOf>
          </owl:Class>
        </owl:intersectionOf>
      </owl:Class>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>

Here's the corresponding SPARQL query:

prefix :      <http://stackoverflow.com/q/19924861/1281433/sample.owl#>
prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
prefix owl:   <http://www.w3.org/2002/07/owl#>
prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

select ?superclass where {
  :A (rdfs:subClassOf|(owl:intersectionOf/rdf:rest*/rdf:first))* ?superclass .
  filter(!isBlank(?superclass))
}

The results from Jena's command line tools are A, B, C, and D:
--------------
| superclass |
==============
| :A         |
| :B         |
| :C         |
| :D         |
--------------

The same query in Protege (4.2.0 build 256) returns just A and B, as
shown in the screen capture.

//JT
--
Joshua Taylor, http://www.cs.rpi.edu/~tayloj/

query.rq (516 bytes) Download Attachment
sample.owl (1K) Download Attachment
protege-screencap.png (45K) Download Attachment
Re: Property path parsing bug in Protégé?

Timothy Redmond

A curious thing about this is that I cannot replicate it with either the latest Protege from git or the latest download from Stanford (build 304).  I don't have the version of Protege that you are using.  Could you try with the latest and see if you still see the same issue?

It occurs to me that it might be useful for the SPARQL plugin to have a capability to write out the internal RDF model so that when weird things happen people would have a method of debugging the cause.
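
As a rough illustration of what such a dump could look like, here is a sketch in plain Python that writes a set of triples as N-Triples-style lines. It is not the plugin's code and does not use the Sesame API; the function names, the blank node labels and the two sample triples are hypothetical. In the plugin itself, Sesame's own serializers would presumably be the natural choice.

```python
def nt_term(term):
    """Render one term: blank nodes and literals pass through, IRIs get <>.

    Assumes literals are already quoted and escaped by the caller.
    """
    if term.startswith("_:") or term.startswith('"'):
        return term
    return f"<{term}>"

def to_ntriples(triples):
    """Serialize (s, p, o) string triples, one sorted statement per line."""
    return "\n".join(
        f"{nt_term(s)} {nt_term(p)} {nt_term(o)} ." for s, p, o in sorted(triples)
    )

NS = "http://stackoverflow.com/q/19924861/1281433/sample.owl#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Two hypothetical statements standing in for the internal model.
model = {
    (NS + "A", RDFS + "subClassOf", "_:c0"),
    ("_:c0", OWL + "intersectionOf", "_:l0"),
}
dump = to_ntriples(model)
```

A line-per-triple dump like this can be diffed against the output of another tool to see whether the in-memory encoding or the query evaluation is responsible for a discrepancy.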

-Timothy





Re: Property path parsing bug in Protégé?

Joshua TAYLOR
On Thu, Jan 16, 2014 at 12:22 PM, Timothy Redmond <[hidden email]> wrote:
> A curious thing about this is that I am not replicating this either on the
> latest Protege from git or from the latest download from Stanford (build
> 304).  I don't have the version of Protege that you are using.  Could you
> try with the latest and see if you still see the same issue?
>
> It occurs to me that it might be useful for the SPARQL plugin to have a
> capability to write out the internal RDF model so that when weird things
> happen people would have a method of debugging the cause.

I can probably test this a bit later by downloading the latest version.

As an aside, I think I'm still on an older version because of the
problem I ran into with build 276 [1] which you ran into with some
other software [2].  I guess I really ought to try out some later
versions;  I haven't seen that bug in a while.

[1] https://mailman.stanford.edu/pipermail/protege-owl/2012-June/018792.html
[2] https://mailman.stanford.edu/pipermail/protege-owl/2012-June/018797.html




--
Joshua Taylor, http://www.cs.rpi.edu/~tayloj/