[protege-owl] Protege API, Jena, Sesame


[protege-owl] Protege API, Jena, Sesame

fabiandev
Hello,

When developing a Semantic Web application, is it correct that we may have to choose between the Protege API, Jena and Sesame (to take three popular tools)?

To be more precise, my question is: is Protege only an application used by end users, or can we see Protege as an API on which a totally new project could be developed?

If so, is there an easy way to choose between those three APIs? Or what are the main differences between them?

Thanks for any help,
Fabian

_______________________________________________
protege-owl mailing list
[hidden email]
https://mailman.stanford.edu/mailman/listinfo/protege-owl

Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03 

[protege-owl] how to handle some Mega Triples

Schentz Herbert
We will have to handle a large number of classes and instances. Altogether we expect about 100 million triples ("100 mega triples").
Does anyone know how we can handle such a large volume? For example:
* Oracle triple store?
* Postgres RDF?
* Jena with a specific database?
* Sesame with a specific database?

Regards,
 
 
Herbert Schentz
IT-Entwicklung
IT-Development
T: +43-(0)1-313 04/5308
F: +43-(0)1-313 04/3555
[hidden email]

Umweltbundesamt
Spittelauer Lände 5
1090 Wien
Österreich/Austria
http://www.umweltbundesamt.at



Re: [protege-owl] Protege API, Jena, Sesame

adam.saltiel
In reply to this post by fabiandev
I can only answer this partially.

> When it comes to developing a Semantic Web application, is it correct that we may have to choose between the Protege API, Jena and Sesame (to take 3 popular tools)?

As you phrase it, this is correct. But Semantic Web development now has many strands, which often branch off and, conversely, converge into one another.
For instance, there is the WSMO effort for semantic web services. It uses WSML (Web Service Modeling Language), which is a superset of OWL, but you could use OWL, and therefore Protege, for the modelling. This is a branch-off at the language level.
Another example is the use of the Manchester OWL API and conformance to Manchester OWL semantics. This can be found in various tools; I believe the code for the API is reused in every case. I would think there must be a commercial offshoot in the TopQuadrant product, due in large part to Holger Knublauch's involvement. This is reuse at the language-specification level and a branch-off in tooling.
So, thinking about the second question:

> To be more precise, my question is: is Protege only an application that is used by end users, or can we see Protege as an API on which a totally new project could be developed?

No, Protege is not a tool used only by end users; AFAIK it is primarily used by those developing ontologies, whether in OWL or not. The mailing list is over 25,000 strong (I don't have the latest figures, but it is huge anyway).
It can be used for a totally new project, and its extensive API offers many hooks that would let you use it with Sesame etc. if you wished.
Finally, are Sesame, Jena and Protege mutually exclusive? No, certainly not. Protege relies on Jena for backend storage, and Sesame can use Jena for backend storage through a SAIL interface (I think).
Many projects put their emphasis on ontology development; once storage etc. is set up, that part is done. Protege is used extensively in medical research, I believe especially for modelling molecules and genetic data. There is a lot of complexity to model here.
But Protege may very well be used to model an ontology that is consumed by Sesame and backed by Jena. The actual business-logic programming would be done in Sesame; the modelling would be done with Protege. But it could also be possible to use Protege instead of just Sesame, or both Sesame and Jena.
When developing a semantic application there are other considerations, the most important of which is how it will be integrated with a logic engine. Ideally this is where the choices should be made, as this determines everything else. But choosing between different types and levels of logical expressivity is not easy: even if the theoretical differences are grasped (I would make no claim here), the requirements of the application are not easy to determine in those terms.
In this regard I have a link that makes fascinating reading [1].

Adam

[1] Discusses the suitability of reasoners for WSML (a superset of OWL) and goes into considerable detail on the relationship between the different logics and their support in different language sets:

http://dip.semanticweb.org/documents/D1.6ReasonerTechnologyScan.pdf





Re: [protege-owl] how to handle some Mega Triples

Leonard Levering
In reply to this post by Schentz Herbert
In a research project I use the WordNet RDF/OWL representation [1], which
contains over 1.8 million triples. To query this ontology I use Jena
2.5.1 and ARQ 1.5.1 together with Postgres 8.2, and this gives very
good performance. Some specific queries have to be rewritten now and then,
but so far all performance issues have been easy to overcome. Andy
Seaborne, the developer of the ARQ query engine for SPARQL, is now busy
with enhancements to Jena/ARQ that address this, and they look even more
impressive.

I am afraid you will have to test the various solutions yourself, as I do
not think anyone has yet tried more than one of them with such a
big dataset.

Leonard Levering

[1] http://www.w3.org/TR/wordnet-rdf/


Re: [protege-owl] how to handle some Mega Triples

Doug Holmes
In reply to this post by Schentz Herbert
Herbert,
Franz, Inc. [ http://www.franz.com ] also offers a triple store called AllegroGraph. They have a free version that you can experiment with.
Doug


Re: [protege-owl] how to handle some Mega Triples

Schentz Herbert
Thanks, Doug.

Re: [protege-owl] how to handle some Mega Triples

Schentz Herbert
In reply to this post by Leonard Levering
Thank you Leonard! This helps a lot!


Re: [protege-owl] Protege API, Jena, Sesame

fabiandev
In reply to this post by adam.saltiel
Thanks, Adam.

I will go through what you say and maybe come back with some other concerns.

For our projects, reasoning and rules will be important.

Fabian


Re: [protege-owl] Protege API, Jena, Sesame

adam.saltiel
I would be interested to learn, if possible (and maybe just in general terms), how you then decide on project requirements in these terms.
Adam


Re: [protege-owl] how to handle some Mega Triples

fabiandev
In reply to this post by Leonard Levering
Leonard Levering <Leonard <at> levering.eu> writes:

>
> In a research project I use the WordNet RDF/OWL representation[1] which  
> contains over 1.8 million triples. To query this ontology, I use Jena  
> 2.5.1 and ARQ 1.5.1 together with Postgres 8.2, and this is giving very  
> good performance. Some specific queries sometimes have to be re-written  
> but until now all performance issues were easy to overcome. Andy
> Seaborne, the developer of the ARQ query engine for SPARQL is now busy  
> with enhancements in Jena/ARQ to solve this issue and that seems even more  
> impressive.
>

Leonard, did you choose Postgres (rather than MySQL, for instance) for
performance reasons, or just because you know it better?

I will be working on a new project. I think we'll use Jena, and now we have
to choose the back-end storage.

Is it right that with Jena the whole model (the inferred model) must be in
memory? Is that always the case for any triple store in the Semantic Web
world, or is it only a Jena 'constraint'?

Thanks for any help,
Fabian



Re: [protege-owl] how to handle some Mega Triples

Leonard Levering
> Leonard, did you choose Postgres (and not MySQL for instance), for
> performance reasons? Or just because you know it better?
>
Partly for performance reasons, partly because my research is used by a
company that prefers PostgreSQL. PostgreSQL scales much better on
multi-CPU systems and (usually) optimises better, but that doesn't
mean that MySQL is bad. If you have a lot of experience with MySQL you can
probably tune it to perform almost as fast as PostgreSQL.

A quote from Andy Seaborne, the developer of ARQ (the query component of
Jena):
"PG is nicer than MySQL because it does not take as much tuning to get the
first 80% of effect.  We use both MySQL and PostgreSQL; they are just
different. MySQL is a bit dumb but it does mean it does not mis-optimize
queries."

I also had a conversation with him about the future of DB support in
Jena/ARQ: they will focus more on PostgreSQL than on MySQL.

So I would advise going for PostgreSQL, but choosing MySQL would not be a
mistake.

> I will work on a new project, I think we'll use Jena, and now we have to
> make a choice about the back-end storage.
>
> Is it right that with Jena, the whole model (inferenced model) must be in
> memory? Is it always like that for any triple store when it comes to
> the Semantic Web, or is it only a Jena 'constraint'?
>

Well, I do not know too much about inference, but as far as I know a
model is always inferred on the fly in memory and not in the database. I
think this is a Jena limitation, as there is no reason why an inferred
model can't be stored in a DB. However, I doubt you would want that for
big models, as these would take huge amounts of disk space. For small models
in-memory inference is a much faster solution. To give you an idea of
memory usage: I tried to load the files of the WordNet RDF/OWL
representation [1] into memory. The program ran out of Java heap space
after a few minutes, and I had set the Java heap to 1.5 GB! Imagine if you
used inference on top of that... As I understand it, the ratio of DB size
to in-memory size in Jena is around 1-to-1 (personally I think the
in-memory model is a bit larger).
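
To make the memory concern concrete, here is a minimal Python sketch of why an inferred model is larger than the asserted one. It is not Jena's reasoner: it applies only two RDFS-style rules (subclass transitivity and type propagation) to a made-up set of triples, and the `ex:` names and the `materialize` function are hypothetical.

```python
# Toy forward-chaining inference over an in-memory set of triples.
# Assumption: only two RDFS-style rules are applied; real reasoners do more.

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def materialize(triples):
    """Return the asserted triples plus everything entailed by
    subClassOf transitivity and type propagation along subClassOf."""
    inferred = set(triples)
    while True:
        sub = [(s, o) for (s, p, o) in inferred if p == SUBCLASS]
        new = set()
        # Rule 1: A subClassOf B, B subClassOf C  =>  A subClassOf C
        for (a, b) in sub:
            for (c, d) in sub:
                if b == c:
                    new.add((a, SUBCLASS, d))
        # Rule 2: x type A, A subClassOf B  =>  x type B
        for (x, p, cls) in inferred:
            if p == TYPE:
                for (a, b) in sub:
                    if a == cls:
                        new.add((x, TYPE, b))
        if new <= inferred:  # fixed point reached, nothing new was derived
            return inferred
        inferred |= new


# Made-up example data: a short class chain and one instance.
asserted = {
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:Animal", SUBCLASS, "ex:LivingThing"),
    ("ex:rex", TYPE, "ex:Dog"),
}
closure = materialize(asserted)
print(len(asserted), "asserted ->", len(closure), "after inference")
# prints: 4 asserted -> 10 after inference
```

Even on this tiny example the materialized set is 2.5 times the asserted one; how much a real ontology grows depends on its class hierarchy and on the rule set of the reasoner.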

I think you would do better to ask this question on the Jena mailing list
[2]. Maybe Sesame or another framework can handle inference on a
database-backed model, but I know the other frameworks only by name.

Leonard

[1] http://www.w3.org/TR/wordnet-rdf/
[2] http://jena.sourceforge.net/support.html


Re: [protege-owl] how to handle some Mega Triples

Padmapriya Ashokkumar
In reply to this post by Leonard Levering
Leonard Levering <Leonard <at> levering.eu> writes:

>
> In a research project I use the WordNet RDF/OWL representation[1] which  
> contains over 1.8 million triples. To query this ontology, I use Jena  
> 2.5.1 and ARQ 1.5.1 together with Postgres 8.2, and this is giving very  
> good performance. Some specific queries sometimes have to be re-written  
> but until now all performance issues were easy to overcome. Andy
> Seaborne, the developer of the ARQ query engine for SPARQL is now busy  
> with enhancements in Jena/ARQ to solve this issue and that seems even more  
> impressive.
>

Could you give an idea of the performance you got with this solution? How long
did it take for say a simple triple listing query and a more complicated one
with multiple triple matches in the where clause? This can give an idea while
evaluating alternative solutions.

Thanks!
Padma


Re: [protege-owl] how to handle some Mega Triples

fabiandev
In reply to this post by Leonard Levering
Leonard, thanks a lot for all your time and advice!





Re: [protege-owl] how to handle some Mega Triples

Leonard Levering
In reply to this post by Padmapriya Ashokkumar
> Could you give an idea of the performance you got with this solution?  
> How long did it take for say a simple triple listing query and a more  
> complicated one with multiple triple matches in the where clause? This  
> can give an idea while evaluating alternative solutions.

Simple queries I use a lot are:
PREFIX  wn20schema: <http://www.w3.org/2006/03/wn/wn20/schema/>
SELECT  DISTINCT ?theWordLexical
WHERE   {
        <http://www.w3.org/2006/03/wn/wn20/instances/synset-part-noun-4> wn20schema:containsWordSense ?theWordSense .
        ?theWordSense wn20schema:word ?theWord .
        ?theWord wn20schema:lexicalForm ?theWordLexical .
        }

PREFIX  wn20schema: <http://www.w3.org/2006/03/wn/wn20/schema/>
SELECT  ?theHypSyn
WHERE   {
        <http://www.w3.org/2006/03/wn/wn20/instances/synset-power_module-noun-1> wn20schema:hyponymOf ?theHypSyn
        }

In one program I run 1955 queries of this kind (with small variations).
Some URIs are queried twice, but this happens in fewer than 10% of the
cases. The first query takes around 0.14 seconds; the average time over all
the queries is around 0.002 seconds. This is the time for executing
the query and storing all the results in an ArrayList, and these are very
simple queries. (To arrive at these results I did several runs so that the
caches were filled, but in a production environment the DB caches will
also be warm, which speeds things up.)
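
For anyone who wants to take comparable measurements, here is a minimal Python sketch of the first-query versus average timing, using the standard-library `sqlite3` module as a stand-in relational back-end. The single `triples` table, the index and the generated data are assumptions for illustration; this is not the table layout Jena uses, and the numbers it prints say nothing about Jena or PostgreSQL.

```python
import sqlite3
import time

# Assumption: one (s, p, o) table as a stand-in for a relational triple store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.execute("CREATE INDEX idx_sp ON triples (s, p)")

# Made-up data, loosely shaped like the WordNet chain queried above.
rows = []
for i in range(1000):
    synset, sense, word = f"synset-{i}", f"sense-{i}", f"word-{i}"
    rows.append((synset, "containsWordSense", sense))
    rows.append((sense, "word", word))
    rows.append((word, "lexicalForm", f"form {i}"))
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", rows)

# The three chained triple patterns become a three-way self-join.
CHAIN_SQL = """
SELECT DISTINCT t3.o
FROM triples t1
JOIN triples t2 ON t2.s = t1.o AND t2.p = 'word'
JOIN triples t3 ON t3.s = t2.o AND t3.p = 'lexicalForm'
WHERE t1.s = ? AND t1.p = 'containsWordSense'
"""


def lexical_forms(synset):
    """Execute the query and store all results in a list."""
    return [row[0] for row in conn.execute(CHAIN_SQL, (synset,))]


# Time every query individually so the first one can be compared
# with the average over the whole run.
timings = []
for i in range(1000):
    start = time.perf_counter()
    lexical_forms(f"synset-{i}")
    timings.append(time.perf_counter() - start)

print(f"first query: {timings[0]:.6f} s")
print(f"average:     {sum(timings) / len(timings):.6f} s")
```

Timing each query separately, rather than the loop as a whole, is what makes the cold first query visible next to the warm average.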

Queries that are difficult purely because of their graph patterns don't
really exist, but rewriting queries is often needed. Often this kind of
optimisation is needed:
?a predicate ?b
?b predicate ?c
?c predicate ?d

should be rewritten to:
?a predicate ?b
?c predicate ?d
?b predicate ?c

as this limits the join sizes on the relational back-end of Jena/ARQ. The
most difficult queries are those with filters that contain REGEX
expressions. Unoptimised queries and difficult queries (with regexes) can
in general take more than a minute, but so far I've been able to rewrite
every query to finish within a second, and the help on the Jena-dev list is
great when it comes to optimising queries. However, I do not have much
experience with regex filters in SPARQL, so I can't tell you more about
that specific type of query.
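
The general effect of pattern order on join sizes can be seen with a minimal Python sketch. It assumes a naive evaluator that processes triple patterns strictly left to right over an in-memory list; it does not model how Jena/ARQ translates patterns to SQL and does not reproduce the specific rewrite above, whose benefit depends on the data and the engine. The `match` and `evaluate` functions and the data are hypothetical.

```python
def match(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`.
    Returns the extended binding, or None if they are incompatible."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check it against the existing binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            # Constant: must be equal.
            return None
    return new


def evaluate(patterns, triples):
    """Evaluate patterns left to right. Returns the final bindings and
    the number of intermediate bindings after each pattern."""
    bindings, sizes = [{}], []
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                result = match(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
        sizes.append(len(bindings))
    return bindings, sizes


# Made-up data: a chain of 100 "knows" edges and a single "name" triple.
data = [(f"n{i}", "knows", f"n{i + 1}") for i in range(100)]
data.append(("n50", "name", "Eve"))

unselective_first = [("?a", "knows", "?b"), ("?b", "knows", "?c"), ("?c", "name", "Eve")]
selective_first = [("?c", "name", "Eve"), ("?b", "knows", "?c"), ("?a", "knows", "?b")]

result_1, sizes_1 = evaluate(unselective_first, data)
result_2, sizes_2 = evaluate(selective_first, data)
print(sizes_1)  # prints: [100, 99, 1]
print(sizes_2)  # prints: [1, 1, 1]
```

Both orders return the same single answer, but the second never carries more than one intermediate binding, which is the kind of saving that reordering patterns by hand is aiming for.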

Leonard

P.S. All these performance figures are from a Dell D520 laptop with a T7200
CPU, 2 GB of RAM (the JVM runs with its default heap size) and a
5400 rpm HDD; the other components have little influence on query
performance.