version control of OWL-DL?


version control of OWL-DL?

Bud P. Bruegger
Hello,

coming from software engineering, and working in a distributed team, I
would like to support OWL-DL development using an off-the-shelf
distributed version control system (my favorite being Mercurial).  It
seems to me that the Manchester Syntax is very well suited for this and
that normal textual diffs give a very good sense of what has changed
between versions.

In a first modest experiment, I have found the following problems that
I hope are resolvable:

I started with a hand-written (normal text editor) version of a test
ontology that looks like the following:  (an excerpt)

    DataProperty: US_SSN
        Characteristics: Functional
        Range: xsd:string
        SubPropertyOf: uniqueID

    Class: USCitizen
        EquivalentTo: Citizen and (nationality value "US")

This compact layout, with one section per line, is well suited for a textual diff.

Loading this ontology into Protege-OWL (4.1_rc1) and saving it again, I
get something like the following:

    DataProperty: US_SSN

        Characteristics:
            Functional

        Range:
            xsd:string

        SubPropertyOf:
            uniqueID


    Class: USCitizen

        EquivalentTo:
            Citizen
             and (nationality value "US")

Clearly, with so many extra line breaks and blank lines, this output is
much less suited to a textual diff algorithm.
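
One way to get back to the compact layout before committing is to post-process the saved file. The following is a minimal Python sketch under stated assumptions, not a Protege or OWL API feature: the function name `compact`, the four-space indent, and the list of frame keywords are illustrative choices, and the script only understands simple entity frames like the ones above (no `Prefix:` or `Ontology:` headers, no comments).

```python
import re

# Keywords that open a new entity frame (an illustrative subset, assumed here).
FRAME = re.compile(
    r"^(Class|ObjectProperty|DataProperty|AnnotationProperty|Individual|Datatype):\s*\S"
)
# A section keyword such as "Range:" with or without a value on the same line.
# "xsd:string" does not match because no whitespace follows its colon.
SECTION = re.compile(r"^[A-Za-z]+:(\s|$)")


def compact(text: str, indent: str = "    ") -> str:
    """Collapse Protege-style layout to one section per line."""
    out = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines are dropped and re-created between frames
        if FRAME.match(line):
            if out:
                out.append("")  # exactly one blank line before each later frame
            out.append(line)
        elif SECTION.match(line):
            out.append(indent + line)
        elif out and out[-1].startswith(indent):
            out[-1] += " " + line  # continuation of the open section
        else:
            out.append(line)  # unknown line: keep it as-is
    return "\n".join(out) + "\n"
```

Applied to both the hand-written and the Protege-saved excerpt above, this produces the same compact text, so the layout no longer shows up in a diff. It does not change the order of frames or sections.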

I also have the impression (I didn't verify it) that the order of axioms
has changed.  This, too, is a problem when using a textual diff in a
version control system.

So I would like to ask some questions:

* Is it possible to configure Protege to avoid using excessive
line breaks in the Manchester Syntax?  Or alternatively, is there any
"pretty-print" utility for Manchester Syntax that would do this for
me?

* Is it possible to guarantee a consistent ordering of axioms such
that a textual diff algorithm shows changes only if they actually
happen (and not due to reordering of axioms)?

* Is there a good practice for splitting up an ontology into
  multiple files (using imports) to achieve some modularization?

* How do the big ontology projects manage version control and
  distributed collaborative development?  It seems for example, that
  the Monetary Ontology is version controlled using SVN  (see
  http://xp-dev.com/wiki/115819/Homepage).

Many thanks for any help.

-b
_______________________________________________
protege-owl mailing list
[hidden email]
https://mailman.stanford.edu/mailman/listinfo/protege-owl

Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03

Re: version control of OWL-DL?

Melanie Courtot-2
Hi Bud,

As far as I know, there is no option to constrain serialization in Protege or when using the OWL API, though I haven't looked into it for a long time, so one may have been added recently. I am, however, not sure how reliable this would be, as ordering doesn't matter in OWL.

My experience comes from working on the OBI project (http://purl.obolibrary.org/obo/obi) in which we used to have multiple files, each representing a specific domain (e.g., instruments, data transformation, biomaterial etc) and each being maintained under SVN independently. Some old and scarce documentation is still available at http://obi-ontology.org/page/Branch_development
While this worked for editing, we decided to produce a single merged file for public release - this allowed us to run extra checks (metadata, IDs etc) and perform extra operations (such as distributing an inferred version of the ontology), while making it way easier for our users to download and browse the file.
If you are interested in this I could try and dig up more documentation; my feedback is that you have to make sure it is worth it; merging the files often creates some sort of issues that you then have to debug manually.

Since then, OBI has switched to a single main file and keeps external files only for our term imports (http://obi-ontology.org/page/MIREOT). We still merge all files before releases; keeping some external information separate allows us to update it automatically on a regular basis.

A possibly relevant thread: https://mailman.stanford.edu/pipermail/p4-feedback/2011-February/003541.html where some OWL diff tools were discussed.

Hope that helps,
Melanie




---
Mélanie Courtot
MSFHR/PCIRN trainee, TFL- BCCRC
675 West 10th Avenue
Vancouver, BC
V5Z 1L3, Canada

Re: version control of OWL-DL?

Thomas Russ
In reply to this post by Bud P. Bruegger

On Jun 14, 2011, at 11:27 AM, Bud P. Bruegger wrote:

> Hello,
>
> coming from software engineering, and working in a distributed team, I
> would like to support OWL-DL development using an off-the-shelf
> distributed version control system (my favorite being Mercurial).  It
> seems to me that the Manchester Syntax is very well suited for this and
> that normal textual diffs give a very good sense of what has changed
> between versions.
> ...
> So I would like to ask some questions:
>
> * Is it possible to guarantee a consistent ordering of axioms such
> that a textual diff algorithm shows changes only if they actually
> happen (and not due to reordering of axioms)?

In Protege 3.x, if you use the Native Writer, you can get it to sort the axioms when writing, by specifying appropriate options.  This helps a lot for using svn or other source control systems with ontologies, even when they are in xml format, since SCS works reasonably well with XML.

I don't know if it will work with other syntaxes.  And it isn't part of Protege 4.x, which uses a completely different OWL infrastructure.

This was one of the items that I requested and that Stanford was kind enough to implement.  (Thanks, Protege Team!)

> * Is there some good practice of how to split up an ontology in
>  multiple files (using imports) to achieve some modularization?

I don't have a good answer for this one, but it seems that you can get some modularity if you can design things so that either your ontologies differ in amount of detail or if you can have sets of general-purpose ontologies that do not directly interact with each other.

OBI seems to do a bit of both, with a general and fairly small top-level ontology together with a number of specialized ontologies that do not interact too much.  There are some cross connections, however, since it seems in the nature of ontologies that they like to be heavily interconnected.


Re: version control of OWL-DL?

Tania Tudorache
In reply to this post by Bud P. Bruegger
I would not recommend using textual diffs on OWL files, because there is
no guarantee on the serialization or ordering of axioms. Even more, if
some of your users use different ontology editing tools (that would
probably output RDF/XML), then the textual diff really does not have any
chance.
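
To illustrate why order and layout alone should not count as changes, a comparison can be made on sets of statements rather than on lines. The sketch below is a deliberately naive Python illustration, assuming Manchester Syntax input with simple entity frames; `statements` and `diff` are hypothetical helpers, the frame keyword list is an assumed subset, and the code compares text fragments, not real OWL axioms. It does not handle RDF/XML and is no substitute for a tool built on an OWL parser.

```python
import re

FRAME = re.compile(
    r"^(Class|ObjectProperty|DataProperty|AnnotationProperty|Individual|Datatype):\s*(\S+)$"
)
# Section keyword, optionally followed by a value on the same line.
SECTION = re.compile(r"^([A-Za-z]+):(?:\s+(.*))?$")


def statements(text: str) -> set:
    """Return a set of (frame kind, entity, section, value) tuples."""
    parts = {}  # (kind, entity, section) -> list of value fragments
    frame = key = None
    for raw in text.splitlines():
        line = " ".join(raw.split())  # collapse all runs of whitespace
        if not line:
            continue
        f = FRAME.match(line)
        s = SECTION.match(line)
        if f:
            frame, key = f.groups(), None
            parts.setdefault(frame + ("",), [])  # the entity itself is declared
        elif s and frame:
            key = frame + (s.group(1),)
            parts.setdefault(key, [])
            if s.group(2):
                parts[key].append(s.group(2))
        elif key:
            parts[key].append(line)  # continuation of the current section
    return {k + (" ".join(v),) for k, v in parts.items()}


def diff(old: str, new: str):
    """Return (removed, added) statements, ignoring order and layout."""
    a, b = statements(old), statements(new)
    return a - b, b - a
```

With this, two files that differ only in line breaks or in the order of sections compare as equal, while a changed range shows up as one removed and one added tuple.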

There are other approaches that collaborative projects have taken. One of
them is described by Melanie. Another one is to use an ontology server
to which all clients connect, so that they simultaneously edit the same copy
of the ontology. It is, of course, possible to make regular snapshots of
your ontology on the server, in case you also want to store them in SVN
or some other version control system. One advantage of the client-server
approach is that there is virtually no effort in merging and resolving
conflicts between different versions of an ontology, which you would have
in the SVN case.

Protege 3.x has an extension, called Collaborative Protege that works in
client-server mode and it is used by several projects in a production
setting. This is the link:
http://protegewiki.stanford.edu/wiki/Collaborative_Protege
Collaborative Protege also comes as a web application, WebProtege:
http://protegewiki.stanford.edu/wiki/WebProtege
and it is possible to edit simultaneously in the desktop and web client.

We have also recently added client-server support to Protege 4
that works like SVN. There is a paper that describes the ontology
versioning approach used by the ontology server here [1].

In terms of modularization, there has been a lot of work done. You may
look at the papers in this LNCS series [2] and this workshop [3].
Probably you'll find some of the papers (or versions of them) also online.
Here [4] is a link describing the Univ. of Manchester work on
modularization, as well as an online modularization tool.

Good luck with your project!
Tania

[1]: http://bmir.stanford.edu/file_asset/index.php/1435/BMIR-2008-1366.pdf
[2]: http://www.informatik.uni-trier.de/~ley/db/series/lncs/lncs5445.html#Seidenberg09
[3]: http://www.informatik.uni-bremen.de/~okutz/womo4/page9/page9.html
[4]: http://owl.cs.manchester.ac.uk/research/topics/modularity/




Re: version control of OWL-DL?

Bud P. Bruegger
In reply to this post by Melanie Courtot-2
Many thanks to Melanie, Tania and Thomas for the input!   It will take
me some time to digest it all and I may be back with followup
questions later.

a great responsive list!

-b

Re: version control of OWL-DL?

Bud P. Bruegger
In reply to this post by Thomas Russ
On Wed, Jun 15, 2011 at 1:41 AM, Thomas Russ <[hidden email]> wrote:
> In Protege 3.x, if you use the Native Writer, you can get it to sort the axioms when writing, by specifying appropriate options.  This helps a lot for using svn or other source control systems with ontologies, even when they are in xml format, since SCS works reasonably well with XML.

This sounds indeed interesting.  It may be the inertia to stick to
what you already know, but the idea to use one of the many version
control systems available for software sounds kind of nice.  Also, to
be able to read an ontology as plain text with no extra tools attracts
me.  (Maybe I'm of a dying species--also in software my "Integrated
Development Environment" is the VIM text editor...)

One of the options that popped up in my mind was to write a simple
utility that reorders the axioms to match the order in which they
occurred in the old version.  If everything fits in memory, and if my
current idea of a really dumb parser is applicable, that could be done
with little effort...
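
The reordering idea could look roughly like the following Python sketch. It is only an illustration under assumptions: it treats everything from one frame header to the next as an opaque block, recognizes an assumed subset of frame keywords, and the names `split_frames` and `reorder_like` are made up here. Frames present in the old version keep their old relative order; frames that are new are appended in the order the tool wrote them.

```python
import re

# Frame headers recognized by this sketch (an assumed subset of the syntax).
FRAME = re.compile(
    r"^\s*(Class|ObjectProperty|DataProperty|AnnotationProperty|Individual|Datatype):\s*(\S+)"
)


def split_frames(text: str):
    """Return (preamble lines, {(kind, entity): frame lines}) in file order."""
    preamble, frames, current = [], {}, None
    for line in text.splitlines():
        m = FRAME.match(line)
        if m:
            # A repeated header for the same entity extends the same block.
            current = frames.setdefault((m.group(1), m.group(2)), [])
        (preamble if current is None else current).append(line)
    return preamble, frames


def reorder_like(old_text: str, new_text: str) -> str:
    """Rewrite new_text so that its frames follow the order used in old_text."""
    _, old_frames = split_frames(old_text)
    preamble, new_frames = split_frames(new_text)
    # Dicts keep insertion order, so iterating old_frames replays the old order.
    keys = [k for k in old_frames if k in new_frames]
    keys += [k for k in new_frames if k not in old_frames]
    head = "\n".join(preamble).strip()
    blocks = [head] if head else []
    blocks += ["\n".join(new_frames[k]).rstrip() for k in keys]
    return "\n\n".join(blocks) + "\n"
```

The order of sections inside a frame, and of comma-separated values inside a section, is left untouched, so a tool that reorders those as well would still produce diff noise.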

Do you think that re-ordering would be enough to then apply a textual
diff and version control, or are there other hurdles that I'm not
aware of?

Would such a utility also be of interest to others?  (I
intend to do that for Protege 4, where the Native Writer seems to be
missing).

I searched on the web but couldn't find what ordering criteria Native
Writer offers.  Can anyone give me some input?

many thanks!
-b