Why is getOwnSlotValues() so slow?

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Why is getOwnSlotValues() so slow?

Daniel Elenius
This is a question on optimizing/efficiency:

I have a program that translates everything in the KB to a different
format. It goes through every class, then gets all the instances of that
class, then all properties for that instances, then all property values
for that instance-property pair. Then it does something with each such
property value assertion.

My problem is, this is VERY slow. The bottleneck seems to be the call to
get all property values for an instance-property pair. In my program,
about 80% of the time is spent in getOwnSlotValues(). I profiled this in
netbeans, and came up with the bizarre call chain shown in the attached
file.

Apparently, when you ask for a property value, Protege calls
FrameStoreAdapter.getOwnSlotValues() 9 times recursively, THEN, there is
another 15 or so levels of indirection before it gets to some hash table
that contains the values.

So my question is, is there a better way for me to do this? I suppose it
won't be fixed in Protege 3.4 itself since it is not being developed any
further?

Daniel


_______________________________________________
protege-owl mailing list
[hidden email]
https://mailman.stanford.edu/mailman/listinfo/protege-owl

Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03 

crazy protege.png (68K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Why is getOwnSlotValues() so slow?

Thomas Russ

On Jun 11, 2009, at 9:56 AM, Daniel Elenius wrote:

> This is a question on optimizing/efficiency:
>
> I have a program that translates everything in the KB to a different  
> format. It goes through every class, then gets all the instances of  
> that class, then all properties for that instances, then all  
> property values for that instance-property pair. Then it does  
> something with each such property value assertion.
>
> My problem is, this is VERY slow. The bottleneck seems to be the  
> call to get all property values for an instance-property pair. In my  
> program, about 80% of the time is spent in getOwnSlotValues(). I  
> profiled this in netbeans, and came up with the bizarre call chain  
> shown in the attached file.
>
> Apparently, when you ask for a property value, Protege calls  
> FrameStoreAdapter.getOwnSlotValues() 9 times recursively, THEN,  
> there is another 15 or so levels of indirection before it gets to  
> some hash table that contains the values.
>
> So my question is, is there a better way for me to do this? I  
> suppose it won't be fixed in Protege 3.4 itself since it is not  
> being developed any further?

Do you get the same performance problems if you try getting the values  
using

   OWLIndividual.getPropertyValues

?
_______________________________________________
protege-owl mailing list
[hidden email]
https://mailman.stanford.edu/mailman/listinfo/protege-owl

Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03 
Reply | Threaded
Open this post in threaded view
|

Re: Why is getOwnSlotValues() so slow?

Daniel Elenius
Thomas Russ wrote:

> On Jun 11, 2009, at 9:56 AM, Daniel Elenius wrote:
>
>  
>> This is a question on optimizing/efficiency:
>>
>> I have a program that translates everything in the KB to a different  
>> format. It goes through every class, then gets all the instances of  
>> that class, then all properties for that instances, then all  
>> property values for that instance-property pair. Then it does  
>> something with each such property value assertion.
>>
>> My problem is, this is VERY slow. The bottleneck seems to be the  
>> call to get all property values for an instance-property pair. In my  
>> program, about 80% of the time is spent in getOwnSlotValues(). I  
>> profiled this in netbeans, and came up with the bizarre call chain  
>> shown in the attached file.
>>
>> Apparently, when you ask for a property value, Protege calls  
>> FrameStoreAdapter.getOwnSlotValues() 9 times recursively, THEN,  
>> there is another 15 or so levels of indirection before it gets to  
>> some hash table that contains the values.
>>
>> So my question is, is there a better way for me to do this? I  
>> suppose it won't be fixed in Protege 3.4 itself since it is not  
>> being developed any further?
>>    
>
> Do you get the same performance problems if you try getting the values  
> using
>
>    OWLIndividual.getPropertyValues
>
> ?
>  

That's what I'm actually doing. The implementation for that is
DefaultRDFIndividual.getPropertyValues, which calls
OWLUtil.getPropertyValues, which calls
OWLModel.getDirectOwnSlotValues(), and so on...
_______________________________________________
protege-owl mailing list
[hidden email]
https://mailman.stanford.edu/mailman/listinfo/protege-owl

Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03 
Reply | Threaded
Open this post in threaded view
|

Re: Why is getOwnSlotValues() so slow?

Tania Tudorache
The nested calls are normal. This is part of the aspect-oriented design
that Protege is using internally. Each nested call is an aspect that can
be added/removed.

Unfortunately, I cannot see in your image where the times are spent. Can
you send a better screenshot that shows the times/percentages, or even
better, if you have an xml dump of the dump.

The aspects add only minimal overhead, and probably you can see that no
time is spent in an actual aspect.

So, it would be interesting to see where the time is actually spent. One
thing that would make things really slow, is if the events dispatch is
enabled while you do the translation. This means that each time you
create a new triple, several events are dispatched.

For imports/conversions, you should disable the event generator by calling:

owlModel.setGenerateEventsEnabled(false);

Make sure to turn it back on after the conversion. If you have a UI, you
will have to refresh it at the end of the conversion, since the UI does
not get any refresh events. If it is a tab  in Protege, you can call:

ProjectManager.getProjectManager().reloadUI(true)

Tania



Daniel Elenius wrote:

> Thomas Russ wrote:
>  
>> On Jun 11, 2009, at 9:56 AM, Daniel Elenius wrote:
>>
>>  
>>    
>>> This is a question on optimizing/efficiency:
>>>
>>> I have a program that translates everything in the KB to a different  
>>> format. It goes through every class, then gets all the instances of  
>>> that class, then all properties for that instances, then all  
>>> property values for that instance-property pair. Then it does  
>>> something with each such property value assertion.
>>>
>>> My problem is, this is VERY slow. The bottleneck seems to be the  
>>> call to get all property values for an instance-property pair. In my  
>>> program, about 80% of the time is spent in getOwnSlotValues(). I  
>>> profiled this in netbeans, and came up with the bizarre call chain  
>>> shown in the attached file.
>>>
>>> Apparently, when you ask for a property value, Protege calls  
>>> FrameStoreAdapter.getOwnSlotValues() 9 times recursively, THEN,  
>>> there is another 15 or so levels of indirection before it gets to  
>>> some hash table that contains the values.
>>>
>>> So my question is, is there a better way for me to do this? I  
>>> suppose it won't be fixed in Protege 3.4 itself since it is not  
>>> being developed any further?
>>>    
>>>      
>> Do you get the same performance problems if you try getting the values  
>> using
>>
>>    OWLIndividual.getPropertyValues
>>
>> ?
>>  
>>    
>
> That's what I'm actually doing. The implementation for that is
> DefaultRDFIndividual.getPropertyValues, which calls
> OWLUtil.getPropertyValues, which calls
> OWLModel.getDirectOwnSlotValues(), and so on...
> _______________________________________________
> protege-owl mailing list
> [hidden email]
> https://mailman.stanford.edu/mailman/listinfo/protege-owl
>
> Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03 
>
>  

_______________________________________________
protege-owl mailing list
[hidden email]
https://mailman.stanford.edu/mailman/listinfo/protege-owl

Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03 
Reply | Threaded
Open this post in threaded view
|

Re: Why is getOwnSlotValues() so slow?

Daniel Elenius
This attachment shows the number of calls and time percentages.

There is something very fishy going on with the profiling here. My
method translateInstances() actually only calls getPropertyValues() 126
times (see translateInstances.png). It does not directly call
getOwnSlotValues(). For some reason, the profiler shows
getOwnSlotValues() as being called by my method 6443 times! Still, I
believe that this method is where the time goes, and I need to
understand why...

The other attachment shows the call chain for getOwnSlotValues, with
percentages. You see that a significant amount of time is lost in the
call chain if you compare the times at the top and the bottom. Still,
you're correct that the main problem is not the depth of the call chain.
The main problem appears to be why the top-level method is called so
many times in the first place.

As for your suggestion to disable events: I am not actually putting
anything into the KB. I am generating objects of my own custom classes.
So I guess this suggestion would not help me.


Tania Tudorache wrote:

> The nested calls are normal. This is part of the aspect-oriented design
> that Protege is using internally. Each nested call is an aspect that can
> be added/removed.
>
> Unfortunately, I cannot see in your image where the times are spent. Can
> you send a better screenshot that shows the times/percentages, or even
> better, if you have an xml dump of the dump.
>
> The aspects add only minimal overhead, and probably you can see that no
> time is spent in an actual aspect.
>
> So, it would be interesting to see where the time is actually spent. One
> thing that would make things really slow, is if the events dispatch is
> enabled while you do the translation. This means that each time you
> create a new triple, several events are dispatched.
>
> For imports/conversions, you should disable the event generator by calling:
>
> owlModel.setGenerateEventsEnabled(false);
>
> Make sure to turn it back on after the conversion. If you have a UI, you
> will have to refresh it at the end of the conversion, since the UI does
> not get any refresh events. If it is a tab  in Protege, you can call:
>
> ProjectManager.getProjectManager().reloadUI(true)
>
> Tania
>
>
>
> Daniel Elenius wrote:
>  
>> Thomas Russ wrote:
>>  
>>    
>>> On Jun 11, 2009, at 9:56 AM, Daniel Elenius wrote:
>>>
>>>  
>>>    
>>>      
>>>> This is a question on optimizing/efficiency:
>>>>
>>>> I have a program that translates everything in the KB to a different  
>>>> format. It goes through every class, then gets all the instances of  
>>>> that class, then all properties for that instances, then all  
>>>> property values for that instance-property pair. Then it does  
>>>> something with each such property value assertion.
>>>>
>>>> My problem is, this is VERY slow. The bottleneck seems to be the  
>>>> call to get all property values for an instance-property pair. In my  
>>>> program, about 80% of the time is spent in getOwnSlotValues(). I  
>>>> profiled this in netbeans, and came up with the bizarre call chain  
>>>> shown in the attached file.
>>>>
>>>> Apparently, when you ask for a property value, Protege calls  
>>>> FrameStoreAdapter.getOwnSlotValues() 9 times recursively, THEN,  
>>>> there is another 15 or so levels of indirection before it gets to  
>>>> some hash table that contains the values.
>>>>
>>>> So my question is, is there a better way for me to do this? I  
>>>> suppose it won't be fixed in Protege 3.4 itself since it is not  
>>>> being developed any further?
>>>>    
>>>>      
>>>>        
>>> Do you get the same performance problems if you try getting the values  
>>> using
>>>
>>>    OWLIndividual.getPropertyValues
>>>
>>> ?
>>>  
>>>    
>>>      
>> That's what I'm actually doing. The implementation for that is
>> DefaultRDFIndividual.getPropertyValues, which calls
>> OWLUtil.getPropertyValues, which calls
>> OWLModel.getDirectOwnSlotValues(), and so on...
>> _______________________________________________
>> protege-owl mailing list
>> [hidden email]
>> https://mailman.stanford.edu/mailman/listinfo/protege-owl
>>
>> Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03 
>>
>>  
>>    
>
> _______________________________________________
> protege-owl mailing list
> [hidden email]
> https://mailman.stanford.edu/mailman/listinfo/protege-owl
>
> Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03 
>  

_______________________________________________
protege-owl mailing list
[hidden email]
https://mailman.stanford.edu/mailman/listinfo/protege-owl

Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03 

getOwnSlotValues chain.png (71K) Download Attachment
translateInstances.png (70K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Why is getOwnSlotValues() so slow?

Daniel Elenius
I figured out how to get better output from the profile. The attached
screen shot shows much better what is going on.

The key is this: I call getRDFProperties() for every individual. That
method calls getPropertyValueCount() for *every* property. This means
that there will be N*M calls to getPropertyValueCount(), where N is the
number of individuals, and M is the number of properties. In the example
I ran here, I had 53 individuals and 115 properties, which results in
6095 calls to getPropertyValueCount(). In other words, this does not
scale very well... I have a larger KB which gets over 400,000 calls to
this method, and this is still a "toy example" compared to the scale of
KBs we're aiming at.

To make matters worse, getPropertyValueCount(), the critical method, is
implemented through all these 20+ levels of indirection.




Daniel Elenius wrote:

> This attachment shows the number of calls and time percentages.
>
> There is something very fishy going on with the profiling here. My
> method translateInstances() actually only calls getPropertyValues()
> 126 times (see translateInstances.png). It does not directly call
> getOwnSlotValues(). For some reason, the profiler shows
> getOwnSlotValues() as being called by my method 6443 times! Still, I
> believe that this method is where the time goes, and I need to
> understand why...
>
> The other attachment shows the call chain for getOwnSlotValues, with
> percentages. You see that a significant amount of time is lost in the
> call chain if you compare the times at the top and the bottom. Still,
> you're correct that the main problem is not the depth of the call
> chain. The main problem appears to be why the top-level method is
> called so many times in the first place.
>
> As for your suggestion to disable events: I am not actually putting
> anything into the KB. I am generating objects of my own custom
> classes. So I guess this suggestion would not help me.
>
>
> Tania Tudorache wrote:
>> The nested calls are normal. This is part of the aspect-oriented
>> design that Protege is using internally. Each nested call is an
>> aspect that can be added/removed.
>>
>> Unfortunately, I cannot see in your image where the times are spent.
>> Can you send a better screenshot that shows the times/percentages, or
>> even better, if you have an xml dump of the dump.
>>
>> The aspects add only minimal overhead, and probably you can see that
>> no time is spent in an actual aspect.
>>
>> So, it would be interesting to see where the time is actually spent.
>> One thing that would make things really slow, is if the events
>> dispatch is enabled while you do the translation. This means that
>> each time you create a new triple, several events are dispatched.
>>
>> For imports/conversions, you should disable the event generator by
>> calling:
>>
>> owlModel.setGenerateEventsEnabled(false);
>>
>> Make sure to turn it back on after the conversion. If you have a UI,
>> you will have to refresh it at the end of the conversion, since the
>> UI does not get any refresh events. If it is a tab  in Protege, you
>> can call:
>>
>> ProjectManager.getProjectManager().reloadUI(true)
>>
>> Tania
>>
>>
>>
>> Daniel Elenius wrote:
>>  
>>> Thomas Russ wrote:
>>>      
>>>> On Jun 11, 2009, at 9:56 AM, Daniel Elenius wrote:
>>>>
>>>>            
>>>>> This is a question on optimizing/efficiency:
>>>>>
>>>>> I have a program that translates everything in the KB to a
>>>>> different  format. It goes through every class, then gets all the
>>>>> instances of  that class, then all properties for that instances,
>>>>> then all  property values for that instance-property pair. Then it
>>>>> does  something with each such property value assertion.
>>>>>
>>>>> My problem is, this is VERY slow. The bottleneck seems to be the  
>>>>> call to get all property values for an instance-property pair. In
>>>>> my  program, about 80% of the time is spent in getOwnSlotValues().
>>>>> I  profiled this in netbeans, and came up with the bizarre call
>>>>> chain  shown in the attached file.
>>>>>
>>>>> Apparently, when you ask for a property value, Protege calls  
>>>>> FrameStoreAdapter.getOwnSlotValues() 9 times recursively, THEN,  
>>>>> there is another 15 or so levels of indirection before it gets to  
>>>>> some hash table that contains the values.
>>>>>
>>>>> So my question is, is there a better way for me to do this? I  
>>>>> suppose it won't be fixed in Protege 3.4 itself since it is not  
>>>>> being developed any further?
>>>>>                  
>>>> Do you get the same performance problems if you try getting the
>>>> values  using
>>>>
>>>>    OWLIndividual.getPropertyValues
>>>>
>>>> ?
>>>>            
>>> That's what I'm actually doing. The implementation for that is
>>> DefaultRDFIndividual.getPropertyValues, which calls
>>> OWLUtil.getPropertyValues, which calls
>>> OWLModel.getDirectOwnSlotValues(), and so on...
>>> _______________________________________________
>>> protege-owl mailing list
>>> [hidden email]
>>> https://mailman.stanford.edu/mailman/listinfo/protege-owl
>>>
>>> Instructions for unsubscribing:
>>> http://protege.stanford.edu/doc/faq.html#01a.03
>>>      
>>
>> _______________________________________________
>> protege-owl mailing list
>> [hidden email]
>> https://mailman.stanford.edu/mailman/listinfo/protege-owl
>>
>> Instructions for unsubscribing:
>> http://protege.stanford.edu/doc/faq.html#01a.03   
>
>
> ------------------------------------------------------------------------
>
>
> ------------------------------------------------------------------------
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> protege-owl mailing list
> [hidden email]
> https://mailman.stanford.edu/mailman/listinfo/protege-owl
>
> Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03 
>  

_______________________________________________
protege-owl mailing list
[hidden email]
https://mailman.stanford.edu/mailman/listinfo/protege-owl

Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03 

protege.png (67K) Download Attachment