RELAX NG Issues List

Version $Id: issues2.xml,v 1.7 2002/11/14 14:38:54 Bear Exp $

Table of Contents

Compact syntax

Remaining Issues

20. Use of "attribute grammar" for formal description
26. Two different annotations in section 5

All Issues

1. C-like (or XML-like?) character escape.
2. Syntax of the "except" clause
3. Encoding declaration inside the compact syntax
4. Mapping comments and <a:documentation>
5. Equivalent of <div>
6. local namespace declaration
7. long string literal
8. newlines in string
9. Mixing the compact syntax and the XML syntax
10. Declaring the prefix "xsd" for XML Schema datatypes
11. Keyword Change for "externalRef" and "notAllowed"
12. empty attribute body
13. Make parameter quoting optional
14. Another notation for writing characters inside annotation element
15. Another annotation syntax
16. Pre-declaring the xsd prefix to XML Schema datatypes
17. Keyword Change for "empty"
18. remove annotation syntax
19. Allow redundant ',' '&' '|'
20. Use of "attribute grammar" for formal description
21. Definition of "Non-structure preserving translator"
22. Keywords "element" and "attribute" are verbose
23. Transformation can produce not-well-formed XML documents
24. blocks like ##### produces unexpected documentation element
25. Annotation element with both attributes and text content
26. Two different annotations in section 5

Introduction

This issue list is for the second phase of the RELAX NG development.

Issues

1. charEscape:C-like (or XML-like?) character escape.

Originator: John Cowan (?) Status: closed Category: compactSyntax

Description

To allow writing a schema that describes non-ASCII element/attribute names with an editor that doesn't support them, we might want to have character escapes in the compact syntax, just like Java or XML.

Such escapes are also useful for parameter names and their values.

Proposed Resolution

A post from James Clark proposes \u{XXXX}. He also proposes \x{XXXX}, which is the syntax used by Perl. The TC seems to prefer \x over \u.

Actual Resolution

(2002/04/25) TC has adopted \x{XXXX} as the character escape. It works like Java Unicode escapes: the escapes are processed after newline normalization is done, but before the compact schema is parsed.
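
The ordering described above (newline normalization first, escape expansion second, parsing last) can be sketched as a small preprocessing step. This is a minimal illustration, not the TC's implementation: the function names are hypothetical, and the assumption that an escape holds one or more hexadecimal digits is ours rather than something stated in this list.

```python
import re

# Assumption: an escape is \x{...} with one or more hex digits inside.
_ESCAPE = re.compile(r"\\x\{([0-9A-Fa-f]+)\}")


def normalize_newlines(text: str) -> str:
    """Turn CR LF and lone CR into LF (step 1)."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def expand_escapes(text: str) -> str:
    """Replace each \\x{XXXX} with the character it names (step 2)."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def preprocess(source: str) -> str:
    """Run both steps in the order given in the resolution, before parsing."""
    return expand_escapes(normalize_newlines(source))
```

Because expansion runs after normalization, a carriage return written as \x{D} survives into the result, whereas a literal CR in the file would have been normalized away.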

2. exceptSyntax:Syntax of the "except" clause

Originator: James Clark Status: closed Category: compactSyntax

Description

In the current draft, nc1 '-' nc2 is used to represent the "except" clause. However, to increase the readability, it was suggested that we should use the keyword "except" instead of a dash.

Proposed Resolution

The proposed syntax for the except clause is nc1 "except" nc2

Actual Resolution

The editor believes this issue was closed with no action.

3. encodingDeclaration:Encoding declaration inside the compact syntax

Originator: James Clark (?) Status: closed Category: compactSyntax

Description

The current draft doesn't have any mechanism to specify the encoding of a RELAX NG grammar written in the compact syntax. Therefore, it must be specified as out-of-band information (such as a command-line parameter).

To avoid this inconvenience, it was suggested to add something like the XML declaration to the header.

Proposed Resolution

John Cowan proposed encoding 'encodingname'.

James Clark also proposed a bunch of solutions (see his post). In particular, one of the proposals was !rnc encoding="iso-8859-1".

Actual Resolution

(2002/04/25) TC has decided NOT to have an encoding declaration inside the compact syntax. The processor will always assume that schemas are written in UTF-8/16 (BOM will be used to distinguish them.)
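
A processor following this resolution might pick the decoder from the byte order mark, roughly as sketched below. The function names are hypothetical, and treating input without a BOM as UTF-8 is an assumption made for this sketch; the resolution only says that the BOM distinguishes the two encodings.

```python
import codecs


def detect_encoding(data: bytes) -> str:
    """Choose a Python codec name from the leading BOM, if any."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # decodes UTF-8 and drops the BOM
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"     # reads the BOM to pick the byte order
    return "utf-8"          # assumption: no BOM means UTF-8


def read_schema(data: bytes) -> str:
    """Decode the raw bytes of a compact-syntax schema."""
    return data.decode(detect_encoding(data))
```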

4. comment and a:documentation:Mapping comments and <a:documentation>

Originator: John Cowan (?) Status: closed Category: compactSyntax

Description

It was suggested that the compact syntax should specify a mapping between comments in the compact syntax and <a:documentation> in the XML syntax. In this way, comments in the compact syntax would be automatically copied into the XML syntax as <a:documentation> elements.

Proposed Resolution

The proposal was to map only comments starting with "##", whereas any line that starts with a single "#" is treated as a plain comment (this behavior is somewhat like Javadoc).

No mapping has been proposed yet.

Actual Resolution

(2002/04/25) "##" to <a:documentation> is adopted. See the original post for the semantics. For writing comments to the implicit grammar, we require the author to make it explicit.

5. div:Equivalent of <div>

Originator: James Clark Status: closed Category: compactSyntax

Description

The compact syntax doesn't have the equivalent of <div> in the XML syntax.

Proposed Resolution

The current proposal is:

[ ... annotation ... ]
{
  ... pattern ...
}

The annotation can be empty.

Actual Resolution

(2002/04/25) The following syntax was adopted. The annotation is optional.

[ ... annotation ... ]
div {
  ... pattern ...
}

6. nsDecl:local namespace declaration

Originator: James Clark Status: closed Category: compactSyntax

Description

The current syntax doesn't allow namespace<->prefix bindings to be declared locally. This prevents some grammars written in the XML syntax from being converted into the compact syntax (see below).

<element name="root">
  <element name="x:child" xmlns:x="urn:x1"/>
  <element name="x:child" xmlns:x="urn:x2"/>
</element>

Proposed Resolution

  1. Live with it. Schemas in the XML syntax that use prefixes inconsistently may not be translatable into the non-XML syntax. (jjc)
  2. A proposal from James Clark:
    local {
      namespace prefix = "namespace URI"
      ...
      
      body
    }
  3. A proposal from John Cowan:
    local prefix = "namespace URI" ... {
      body
    }
  4. XML-like syntax that makes use of annotation (from James Clark):
    [xmlns:prefix="namespace URI"]
    pattern
    			

One of the difficulties of this issue is to allow namespace declarations to affect both annotations and RELAX NG constructs in an intuitive way.

Actual Resolution

(2002/06/06) This issue is closed with no action, based on the input that none of the proposed syntaxes works very well.

7. longStrLiteral:long string literal

Originator: James Clark Status: closed Category: compactSyntax

Description

It might be nice if the compact syntax allows users to write a long string literal (such as a complex regular expression) in multiple lines, just like in C.

Proposed Resolution

  1. Use an infix operator to indicate string concatenation. '+' and '^' are proposed. '^' seems to enjoy wider support because '+' overlaps with oneOrMore.
  2. Mere concatenation, just like C, possibly combined with prohibiting the use of doubling (such as "" and '' to represent " and ').

Actual Resolution

(2002/04/25) Adjacent string literals will be automatically concatenated. Thus "AA" 'BB' will be treated as "AABB". This automatic concatenation applies wherever a string literal can be written. String concatenation always takes precedence over reading two adjacent string literals as independent literals. Thus the following schema

<value xmlns:p="abc">def</value>

might be casually transformed to

ns p = "abc"
"def"
		

But this is processed as:

ns p = "abcdef"
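
The pitfall above can be reproduced with a toy tokenizer that merges adjacent string literals. This is only a sketch of the adopted rule: the token model (whitespace-separated words plus single- or double-quoted strings) and the function name are assumptions, and escapes and triple-quoted strings are ignored.

```python
import re

# Assumption: tokens are either quoted strings or runs of other characters.
_TOKEN = re.compile(r'''"([^"]*)"|'([^']*)'|([^\s"']+)''')


def tokenize(source: str) -> list[tuple[str, str]]:
    """Split source into tokens, concatenating adjacent string literals."""
    tokens: list[tuple[str, str]] = []
    for match in _TOKEN.finditer(source):
        if match.group(3) is not None:
            tokens.append(("word", match.group(3)))
            continue
        value = match.group(1) if match.group(1) is not None else match.group(2)
        if tokens and tokens[-1][0] == "string":
            # Adjacent literal: fold it into the previous string token.
            tokens[-1] = ("string", tokens[-1][1] + value)
        else:
            tokens.append(("string", value))
    return tokens
```

Fed the two-line schema above, the tokenizer yields a single string token "abcdef" after the "=", so no separate value pattern is left.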
		

8. newlineInStr:newlines in string

Originator: James Clark Status: closed Category: compactSyntax

Description

The syntax currently allows newlines inside strings. Do we want to prohibit them?

Proposed Resolution

See a post from James Clark. Most programming languages (such as C and Java) prohibit them. If we are to prohibit them, users will need to use the character escape \x{A}.

Actual Resolution

(2002/04/25) TC has decided to allow newlines inside string literals, mainly because of the way we handle character escapes.

(2002/06/20) TC has re-opened this issue, based on feedback (from John Cowan, I believe) that we could use Python-like """string""" and '''string''' syntax for string literals.

TC has voted and agreed to allow newlines in these triple-quoted strings. At the same time, TC has decided to disallow literal newlines in normal strings ("string" and 'string') but to allow character escapes that represent newlines (such as \x{A}).

To summarize, these are allowed:

"""abc
def
ghi"""

'''abc def ghi \x{D}\x{A}'''

"hello world\x{D}\x{A}"

'hello world\x{D}\x{A}'

But these are not allowed:

"Can
you hear me?"

'How about
now?'
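
The allowed and disallowed forms above can be checked mechanically. The sketch below is an illustration under assumptions: the function name is hypothetical, and the rule that the body may not contain its own delimiter is ours, since this list does not spell it out. The check runs on raw source text, before \x{...} escapes are expanded.

```python
def is_valid_literal(literal: str) -> bool:
    """Return True if literal is a well-formed string under the 2002/06/20 rules."""
    # Triple-quoted strings may span lines.
    for quote in ('"""', "'''"):
        if len(literal) >= 6 and literal.startswith(quote) and literal.endswith(quote):
            return quote not in literal[3:-3]
    # Normal strings must not contain a literal newline.
    for quote in ('"', "'"):
        if len(literal) >= 2 and literal.startswith(quote) and literal.endswith(quote):
            body = literal[1:-1]
            return quote not in body and "\n" not in body and "\r" not in body
    return False
```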

9. mixAndMatch:Mixing the compact syntax and the XML syntax

Originator: Makoto Murata Status: closed Category: compactSyntax

Description

Do we allow a schema written in the XML syntax to include or externalRef another schema written in the compact syntax (or vice versa)?

Actual Resolution

(2002/04/25) TC has decided not to allow this. One of the concerns was the added complexity to the implementation. Of course, we can allow this later if we want.

10. XSDdefaultBinding:Declaring the prefix "xsd" for XML Schema datatypes

Originator: James Clark Status: closed Category: compactSyntax

Description

W3C XML Schema Datatypes is practically the only interoperable datatype library, and many people are using it with RELAX NG.

To make life more relaxing for that majority, do we want to pre-declare the prefix "xsd" for "http://www.w3.org/2001/XMLSchema-datatypes"?

Actual Resolution

(2002/05/09) TC has decided to declare this binding by default. This would free users from remembering a long namespace URI. A user who wishes to use this prefix for other namespace URIs can always override this default by declaring the prefix mapping explicitly.
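
One way to picture this default is a prefix table that starts with the "xsd" binding and lets explicit declarations replace it. The function name and the dictionary representation are assumptions made for this sketch.

```python
XSD_DATATYPES = "http://www.w3.org/2001/XMLSchema-datatypes"


def datatype_bindings(declared: dict[str, str]) -> dict[str, str]:
    """Merge explicitly declared datatype prefixes over the default binding."""
    bindings = {"xsd": XSD_DATATYPES}  # pre-declared by default
    bindings.update(declared)          # an explicit declaration wins
    return bindings
```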

11. externalRef:Keyword Change for "externalRef" and "notAllowed"

Originator: John Cowan Status: closed Category: compactSyntax

Description

John Cowan suggested that the keyword "externalRef" should be changed to "external", just as the keyword for <parentRef/> is "parent".

This leaves "notAllowed" as the only camel notation keyword, which leads to another issue of whether we should also change this keyword to completely get rid of camel notation keywords.

Proposed Resolution

A number of alternative keywords were suggested for "notAllowed", including but not limited to "!", "forbidden", "disallowed", "invalid", and "illegal".

Actual Resolution

(2002/06/25) "externalRef" is changed to "external" on the basis that every other "xxxxRef" is written as "xxxx" in the compact syntax. "notAllowed" will remain as-is simply because no one strongly wants to change it.

12. emptyAtt:empty attribute body

Originator: James Clark Status: closed Category: compactSyntax

Description

Shall we allow attribute name {} just like we allow <attribute name="name"/>?

Actual Resolution

TC agreed that saving four keystrokes is not a big win. Based on the argument that the compact syntax is meant to preserve the semantics of RELAX NG, not the surface of the XML syntax, we decided not to allow this shorthand.

13. quotingParameter:Make parameter quoting optional

Originator: James Clark Status: closed Category: compactSyntax

Description

Should we make quotes on parameter values optional when the value is an NMTOKEN, as in xs:string { minLength = 1 }?

Currently, we need to write minLength="1".

Actual Resolution

(2002/06/20) closed with no action.

14. charactersInsideAnnElem:Another notation for writing characters inside annotation element

Originator: James Clark Status: closed Category: compactSyntax

Description

Feedback from RelaxNGCC claims that it becomes awkward to write lengthy character data inside foreign elements in the compact syntax.

Currently, we need to quote all character data inside elements. Shall we provide another syntax to make this less painful? If so, how?

Proposed Resolution

James Clark proposed sh-like here documents:

<<KEYWORD
characters
characters
KEYWORD

KEYWORD can be an arbitrary keyword. John Cowan proposed Python-like syntax:

Python has multi-line strings delimited by """ at each end,
as another possible syntax; unlike Perl/sh documents, these
strings are embedded directly.

Actual Resolution

TC adopted Python-like """string""" literals. As a result, this issue was closed with no action. See the re-opened issue 8 "newlineInStr" for more on this syntax.

15. altAnnotation:Another annotation syntax

Originator: James Clark Status: closed Category: compactSyntax

Description

Feedback from RelaxNGCC also reveals a case where the current annotation syntax doesn't work well.

In a nutshell, the compact syntax makes an assumption about how annotations are used: it assumes that an annotation is always attached to the parent element. But RelaxNGCC uses annotations in a different way.

Proposed Resolution

James Clark proposed:

element foo {
	cc:java [ "program()" ] >> pattern

But he wrote he was not sure if the above syntax is unambiguous.

No other syntax is proposed yet.

Actual Resolution

TC has decided not to introduce this new annotation syntax.

16. prefixXSD:Pre-declaring the xsd prefix to XML Schema datatypes

Originator: James Clark Status: closed Category: compactSyntax

Description

Because of the popularity of XML Schema Datatypes as THE datatype library, it might be worthwhile to pre-declare the "xsd" prefix for "http://www.w3.org/2001/XMLSchema-datatypes". Users would no longer need to type a datatype library declaration if they are going to use this library, which is highly likely.

Of course, the "xsd" prefix can be overridden manually.

Actual Resolution

(2002/05/09) TC has decided to adopt this proposal.

17. emptyKeyword:Keyword Change for "empty"

Originator: James Clark Status: closed Category: compactSyntax

Description

James proposed that () could be used instead of empty. In fact, this is what XQuery does.

Actual Resolution

(2002/06/25) TC decided to close this issue with no action.

18. noAnnotation:remove annotation syntax

Originator: John Cowan Status: closed Category: compactSyntax

Description

John Cowan listed arguments against the annotation syntax.

Proposed Resolution

Various people suggested many ideas about possible annotation syntax, including but not limited to:

  • embedding XML fragments.
  • using attribute annotation syntax only.
  • C-like /* */ style.

Certainly this has been one of the most controversial features in the compact syntax.

Actual Resolution

(2002/07/18) TC has decided to keep the current annotation syntax.

19. trailingCombinor:Allow redundant ',' '&' '|'

Originator: John Cowan Status: closed Category: compactSyntax

Description

John Cowan proposed to allow these combinators right before a right parenthesis so that a list of items can be written beautifully (and maintained easily).

bar = element bar {
  element bar1 {text},
  element bar2 {text},
  element bar3 {text},
}

Proposed Resolution

Java and C allow this in certain places (e.g., array initializers), but it also adds complexity to implementations.

Allowing this redundancy in arbitrary places would require implementations to use look-ahead of arbitrary length to parse the compact syntax.

One proposed workaround within the current syntax is to write this like:

bar = element bar {
  element bar1 {text},
  element bar2 {text},
  element bar3 {text},
  empty
}

Actual Resolution

(2002/07/18) TC has decided to close this issue with no action.

20. attributeGrammar:Use of "attribute grammar" for formal description

Originator: MURATA Makoto Status: open Category: compactSyntax

Description

Murata-san pointed out the lack of type information in the current notation of the formal description of the compact syntax.

Proposed Resolution

His proposal is to use the notation of attribute grammars, which gives a rigorous definition at the expense of terseness. The author believes that attribute grammars are well known in various fields of science, such as linguistics and computer science.

A comment from David Rosenborg says he likes the current spec because of its compactness.

James proposed to extend the current notation by adding type information (like return type from production rules, etc.)

(2002/11/14) TC seems to be very happy with the latest change to the spec, which makes type information explicit. The only issue discussed was capitalizing the type names.

21. isomorphism:Definition of "Non-structure preserving translator"

Originator: David Rosenborg Status: closed Category: compactSyntax

Description

In section 6.3, the wording currently reads "For this purpose, two instances of data model ... are considered loosely equivalent if they are identical after applying all the simplification ..."

But they may not be identical, because processors can freely choose the names of <define>s and put them in any order.

Circumstance

TC has agreed that this is a bug and needs to be fixed. That is, we need a clearer definition of what it means for two grammars to be equal.

Actual Resolution

On 2002/11/14, TC voted to publish the working draft as the committee specification, so this issue is automatically closed.

22. KeywordElemAndAtt:Keywords "element" and "attribute" are verbose

Originator: Paul T Status: closed Category: compactSyntax

Description

Paul and David suggested using "@foo {...}" instead of "attribute foo {...}" and "<foo> {...}" instead of "element foo {...}" to make the compact syntax even more compact.

Circumstance

One concern raised was that the proposed syntax breaks the symmetry between elements and attributes. See the comments list for detailed discussion.

Actual Resolution

TC has decided to keep the current keywords "element" and "attribute".

23. nonwellformedTranslation:Transformation can produce not-well-formed XML documents

Originator: David Rosenborg Status: closed Category: compactSyntax

Description

The following schema produces two document elements, so the result is not well-formed.

## Oops
"Just a value pattern"

Circumstance

James proposed to add a new constraint to the topLevelBody production and limit the result to have a single top-level element.

Actual Resolution

As proposed, a restriction is placed on the topLevelBody production.

24. multiplePounds:blocks like ##### produces unexpected documentation element

Originator: David Rosenborg Status: closed Category: compactSyntax

Description

People often use #################### as a formatting aid (for example, to separate sections). According to the current draft, this will also be transformed into a documentation element (except for the first two ##).

Circumstance

John Cowan proposed to change the doc comment marker from "##" to "##+" (in regexp notation), thereby treating the entire block as the marker.

Another option is to treat it as a doc comment only when ## is not followed by another #. Or yet another option is just to leave things as is.

TC seems to prefer "##+".

Actual Resolution

##+ was chosen.

25. stringConcatanation:Annotation element with both attributes and text content

Originator: David Rosenborg Status: closed Category: compactSyntax

Description

The current syntax cannot create an annotation element with both attributes and text content.

x:foo [ bar="baz" "some content" ]

The above will not work because of literal segment concatenation.

Actual Resolution

Following James' proposal, we will introduce an explicit concatenation operator '~'. So

x:foo [ bar="baz" "some content" ]

and

x:foo [ bar="baz" ~ "some content" ]

are different.

26. twoAnnotations:Two different annotations in section 5

Originator: Mike Status: open Category: compactSyntax

Description

Annotation elements (foreign namespace elements) and RELAX NG DTD compatibility annotations are both referred to by the word "annotation".

Proposed Resolution

Add an introduction to the beginning of section 5 to clarify this.