Class Name URI bug and questions

Class Name URI bug and questions

Mark Feblowitz
Since it is difficult to find definitive answers to the question
"just what characters can I use in an OWL Class or Property name?", I
often resort to using Protege as a test vehicle.

As I expected, just about every non-alphanumeric character is
excluded, with the exception of dashes and underscores.

What surprised me is that a "$" is allowed by Protege. A class named
ABC$DEF retains its name within the tool. And it is saved with that
name when serialized to an OWL file. There is one unusual thing in
the file, though - an unexpected xmlns appears:

        xmlns:p1="http://example.com/test#ABC$"

Even more odd is what happens when the file is parsed back in. The $ is
treated as the end of the namespace - everything up to and including
the $ is treated as the namespace, with everything after the $ being
treated as the class name.

So, a class named ABC$DEF in namespace http://example.com/test shows
up in Protege as a class named DEF in the namespace
"http://example.com/test#ABC$"

The serialized file contains an xmlns declaration

        xmlns:p1="http://example.com/test#ABC$"

and a class named ABC$DEF. The class name shows up in Protege as "p1:DEF"

This seems like the combination of (at least) 2 bugs: one in Protege,
for letting the "$" into the name, and one in how that $ is
interpreted (something to do with Jena?)

_______________________________________________
protege-owl mailing list
[hidden email]
https://mailman.stanford.edu/mailman/listinfo/protege-owl

Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03 

Re: Class Name URI bug and questions

Mark Feblowitz
So, I explained the bug, but didn't really ask any questions (as
promised in the subject).

Questions:

Where is the definitive answer of what can and cannot appear in the name?
Is the "$" a legal character or not?
Has there been any discussion of using generated IDs, e.g., LSIDs,
for names and using something like rdfs:label to capture names with
odd characters?

Thanks,

Mark

At 12:37 PM 3/15/2007, Mark Feblowitz wrote:

>[...]

Re: Class Name URI bug and questions

Fragoso, Gilberto (NIH/NCI) [E]
Mark,

Somebody may have answered this already, but see
http://www.w3.org/TR/rdf-syntax-grammar/#rdf-id and then follow the link
for NCName.
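The NCName production excludes "$", which may explain the split reported
at the start of this thread. Below is a minimal Java sketch, assuming the
serializer cuts a URI at the start of its longest trailing NCName. The
class and method names (NcNameSplit, splitIndex) are hypothetical, the
character test is an ASCII-only approximation of NCName, and none of this
is taken from the Protege or Jena sources.

```java
public class NcNameSplit {
    // ASCII-only approximation of the NCName start characters; the real
    // XML production also admits many non-ASCII letters.
    static boolean isNcNameStart(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
    }

    // ASCII-only approximation of the characters allowed after the first one.
    static boolean isNcNameChar(char c) {
        return isNcNameStart(c) || (c >= '0' && c <= '9') || c == '-' || c == '.';
    }

    // True if the whole string is an NCName under the approximation above.
    static boolean isNcName(String s) {
        if (s.isEmpty() || !isNcNameStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!isNcNameChar(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Index at which the longest trailing NCName of the URI begins.
    static int splitIndex(String uri) {
        int i = uri.length();
        // Walk backwards over characters that may appear inside an NCName.
        while (i > 0 && isNcNameChar(uri.charAt(i - 1))) {
            i--;
        }
        // Walk forwards past characters that may not start an NCName.
        while (i < uri.length() && !isNcNameStart(uri.charAt(i))) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        String uri = "http://example.com/test#ABC$DEF";
        int i = splitIndex(uri);
        System.out.println("namespace:  " + uri.substring(0, i));
        System.out.println("local name: " + uri.substring(i));
        System.out.println("ABC$DEF is an NCName: " + isNcName("ABC$DEF"));
    }
}
```

Under these assumptions the URI splits into the namespace
http://example.com/test#ABC$ and the local name DEF, which matches the
xmlns:p1 declaration seen in the saved file.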

cheers,
Gilberto

-----Original Message-----
From: Mark Feblowitz [mailto:[hidden email]]
Sent: Thursday, March 15, 2007 12:43 PM
To: [hidden email]
Subject: Re: [protege-owl] Class Name URI bug and questions


[...]


Re: Class Name URI bug and questions

Tania Tudorache
Hi Mark,

The naming mechanism in Protege is definitely not ideal, and we intend
to improve it in the near future. Your observations are correct.
The naming rules in Protege are usually more restrictive than necessary,
but obviously more permissive than they should be in the "$" case.

If you are interested in the exact algorithm which decides whether a
name is a valid OWL resource name, you can look at the method
isValidOWLFrameName() from AbstractOWLModel. You can find the
implementation here:

http://smi-protege.stanford.edu/svn/owl/trunk/src/edu/stanford/smi/protegex/owl/model/impl/AbstractOWLModel.java?rev=5639&view=markup

What is basically happening is that only characters that can appear in a
Java identifier are allowed. "$" is allowed in a Java identifier, so you
can also use it in a Protege OWL class name. The namespace manager, which
interprets the URIs at load time, is another part of Protege OWL that we
intend to improve soon.
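The Java identifier rule can be checked directly with the standard
java.lang.Character methods. The sketch below is only an illustration:
isJavaIdentifierLike is a hypothetical helper, not the actual
isValidOWLFrameName() implementation. Since it rejects dashes, which
Protege reportedly accepts, the real method must differ in at least that
respect.

```java
public class JavaIdentifierNameCheck {
    // Hypothetical helper, NOT Protege's isValidOWLFrameName(): accepts a
    // name only if it is non-empty and every character is legal in a Java
    // identifier.
    static boolean isJavaIdentifierLike(String name) {
        if (name.isEmpty() || !Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] names = {"ABC_DEF", "ABC$DEF", "ABC-DEF", "ABC DEF"};
        for (String name : names) {
            System.out.println(name + " -> " + isJavaIdentifierLike(name));
        }
    }
}
```

With this check, ABC_DEF and ABC$DEF are accepted, while ABC-DEF and
"ABC DEF" are rejected.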

Tania





Re: Class Name URI bug and questions

Tania Tudorache
Hi Mark,

One clarification to my email.

What I described before is how Protege OWL decides whether a name is a
valid OWL identifier. That check is more restrictive than the RDF
specification, and we intend to fix this. The definition of RDF URI
references is given here:

http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-URI-reference
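As a rough check that "$" is acceptable at the URI level, the standard
java.net.URI parser can be used as a stand-in. This is an approximation
only: java.net.URI follows generic URI syntax, which is close to but not
identical to the RDF URI reference definition linked above, and
fragmentOf is a hypothetical helper name.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriFragmentCheck {
    // Returns the fragment if the string parses as a URI, or null if
    // java.net.URI rejects it.
    static String fragmentOf(String s) {
        try {
            return new URI(s).getFragment();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // The "$" is accepted in the fragment part of the URI.
        System.out.println(fragmentOf("http://example.com/test#ABC$DEF"));
    }
}
```

So a name like ABC$DEF is fine as part of a URI; the trouble only starts
when that fragment has to be written as an XML name.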

Tania


