large ontologies

Edwin A Pell

I have tried to run openCYC with an 80 GB Java heap. It loaded for 17 hours and then I killed it. Can you point me to a large (2+ million axioms) online ontology I can use to test with?

Is there a way to turn on diagnostics so I can tell what Protege is doing during the 17+ hour load?

Thanks,
Ed

_______________________________________________
protege-user mailing list
[hidden email]
https://mailman.stanford.edu/mailman/listinfo/protege-user

Re: large ontologies

Matthew Horridge-2
Administrator
Hi Ed,

I’m sorry you’re having these problems with Protege and openCYC. 

I tried to get a copy of openCYC to test with, but as far as I can tell it’s no longer publicly available. I therefore loaded the NCBITaxon ontology, which is one of the largest I know of. It has 9.2 million axioms (including annotation assertions). On my Mac laptop it loads in about 190 seconds, needs about 5.5 GB of heap space while loading, and then fits into around 3.2 GB of memory for browsing (loading from certain formats, triple-based formats in particular, requires more memory than is finally needed to hold the ontology). Browsing on the Entities tab works pretty well, but for some reason that I need to investigate (I think it’s the Metrics view that’s slow), switching to and from the “Active Ontology” tab is really sluggish.

Does the location where you’re trying to load openCYC from have a lot of sub-directories containing lots of files? If so, this could be the cause of the problem. To check this, move the openCYC file to a directory containing nothing else and load it from there.

In terms of diagnostics, Protege writes to a log file located in ~/.Protege/logs.  You could change the log level to debug - there’s a logback.xml file in the distribution conf directory for doing this.  
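
A minimal sketch of what that change might look like, assuming the distribution's logback.xml has a standard logback <root> element and appenders named stdout and files (those names are taken from the logger snippet elsewhere in this thread; the actual file may differ):

```xml
<!-- Assumed fragment of conf/logback.xml: only the level attribute is changed -->
<root level="debug">
    <!-- Appender names are assumptions; keep whatever the file already references -->
    <appender-ref ref="stdout" />
    <appender-ref ref="files" />
</root>
```

Debug output is verbose, so it is probably worth restoring the original level once the slow step has been identified.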

Cheers,

Matthew



(In reply to Edwin A Pell's message of 19 Apr 2017, 10:35.)

Re: large ontologies

Matthew Horridge-2
Administrator
Hi Ed,

I’ve investigated this further, and the poor performance when loading openCYC comes from extremely verbose logging output from the OWLAPI. To get better performance you can turn off the logger in question by adding the following to the logback.xml file in the conf directory of your Protege installation:

<!-- Silence the very verbose OWLAnnotationPropertyTransformer logger -->
<logger name="org.semanticweb.owlapi.util.OWLAnnotationPropertyTransformer" level="OFF">
    <appender-ref ref="stdout" />
    <appender-ref ref="files" />
</logger>

(Add this as a child of the document element, which in a logback.xml file is <configuration>.)

After setting logging as above, I can load openCYC in about 3.8 GB of memory in about 4 minutes.
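
For orientation, here is a hedged sketch of where the element sits in the file. It assumes the usual logback layout with <configuration> as the document element; the placeholder comments stand in for whatever the shipped Protege file already contains and are not a copy of it:

```xml
<configuration>
    <!-- Existing appender definitions (assumed here to include "stdout" and "files") stay unchanged -->

    <!-- New element: a direct child of <configuration> -->
    <logger name="org.semanticweb.owlapi.util.OWLAnnotationPropertyTransformer" level="OFF">
        <appender-ref ref="stdout" />
        <appender-ref ref="files" />
    </logger>

    <!-- Existing root element and any other loggers stay unchanged -->
</configuration>
```

Protege most likely needs to be restarted for the edited file to take effect.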

Cheers,

Matthew



(In reply to Matthew Horridge's message of 19 Apr 2017, 11:22.)

Re: large ontologies

Matthew Horridge-2
Administrator
... and once loaded, openCYC fits into about 1.7 GB of memory when browsing in Protege.


(In reply to Matthew Horridge's message of 23 Apr 2017, 16:04.)