37 changes: 22 additions & 15 deletions spec.html
@@ -36800,7 +36800,10 @@ <h1>Static Semantics: Early Errors</h1>
It is a Syntax Error if the source text matched by |UnicodePropertyName| is not a Unicode property name or property alias listed in the “Property name and aliases” column of <emu-xref href="#table-nonbinary-unicode-properties"></emu-xref>.
</li>
<li>
It is a Syntax Error if the source text matched by |UnicodePropertyValue| is not a property value or property value alias for the Unicode property or property alias given by the source text matched by |UnicodePropertyName| listed in <a href="https://unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt"><code>PropertyValueAliases.txt</code></a>.
It is a Syntax Error if the source text matched by |UnicodePropertyName| is neither `Script_Extensions` nor `scx` and the source text matched by |UnicodePropertyValue| is not a property value or property value alias for the Unicode property or property alias given by the source text matched by |UnicodePropertyName| listed in <a href="https://unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt"><code>PropertyValueAliases.txt</code></a>.
</li>
<li>
It is a Syntax Error if the source text matched by |UnicodePropertyName| is either `Script_Extensions` or `scx` and the source text matched by |UnicodePropertyValue| is not a property value or property value alias for the Unicode property Script (sc) listed in <a href="https://unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt"><code>PropertyValueAliases.txt</code></a>.
</li>
</ul>
<emu-grammar>UnicodePropertyValueExpression :: LoneUnicodePropertyNameOrValue</emu-grammar>
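A minimal sketch of how the revised early errors surface to script authors, assuming an engine that already validates `Script_Extensions` values against the Script (sc) value list. The helper name `compiles` is hypothetical and only wraps the `RegExp` constructor.

```javascript
// Hypothetical helper: true when `source` compiles as a Unicode-mode pattern,
// false when the engine rejects it with a SyntaxError (an early error).
function compiles(source) {
  try {
    new RegExp(source, "u");
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Script values and their aliases are accepted for Script_Extensions / scx.
console.log(compiles("\\p{Script_Extensions=Greek}")); // true
console.log(compiles("\\p{scx=Grek}"));                // true

// Anything that is not a Script value is rejected, including values
// that are valid for a different property such as General_Category.
console.log(compiles("\\p{scx=NotAScript}"));          // false
console.log(compiles("\\p{scx=Lu}"));                  // false
```

The observable behavior is unchanged by the diff; the second rule only states explicitly that `scx` values are looked up under Script.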
@@ -38282,19 +38285,25 @@ <h1>
<emu-alg>
1. Let _ps_ be the source text matched by |UnicodePropertyName|.
1. Let _p_ be UnicodeMatchProperty(_rer_, _ps_).
1. Assert: _p_ is a Unicode property name or property alias listed in the “Property name and aliases” column of <emu-xref href="#table-nonbinary-unicode-properties"></emu-xref>.
1. Assert: _p_ is a Unicode property listed in the “Canonical <emu-not-ref>property name</emu-not-ref>” column of <emu-xref href="#table-nonbinary-unicode-properties"></emu-xref>.
1. Let _vs_ be the source text matched by |UnicodePropertyValue|.
1. Let _v_ be UnicodeMatchPropertyValue(_p_, _vs_).
1. Let _A_ be the CharSet containing all Unicode code points whose character database definition includes the property _p_ with value _v_.
1. If _p_ is `Script_Extensions`, then
1. Assert: _vs_ is a property value or property value alias for property “Script” listed in <a href="https://unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt"><code>PropertyValueAliases.txt</code></a>.
1. Let _v_ be the Set containing the “short name”, “long name”, and any other aliases corresponding with value _vs_ for property “Script” in <a href="https://unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt"><code>PropertyValueAliases.txt</code></a>.
1. Let _A_ be the CharSet containing all Unicode code points whose character database definition includes the property “Script_Extensions” with value having a non-empty intersection with _v_.
Member

Do you need to call MaybeSimpleCaseFolding here?

Member Author

Hmm, that would affect any code point that case-folds across scripts (or to/from Common). I don't know if there are any, but it's easy enough to accommodate. Done.

1. Else,
1. Let _v_ be UnicodeMatchPropertyValue(_p_, _vs_).
1. Let _A_ be the CharSet containing all Unicode code points whose character database definition includes the property _p_ with value _v_.
1. Return MaybeSimpleCaseFolding(_rer_, _A_).
</emu-alg>
<emu-grammar>UnicodePropertyValueExpression :: LoneUnicodePropertyNameOrValue</emu-grammar>
<emu-alg>
1. Let _s_ be the source text matched by |LoneUnicodePropertyNameOrValue|.
1. If UnicodeMatchPropertyValue(`General_Category`, _s_) is a Unicode property value or property value alias for the General_Category (gc) property listed in <a href="https://unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt"><code>PropertyValueAliases.txt</code></a>, then
1. Return the CharSet containing all Unicode code points whose character database definition includes the property “General_Category” with value _s_.
1. If _s_ is a Unicode property value or property value alias for the General_Category (gc) property listed in <a href="https://unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt"><code>PropertyValueAliases.txt</code></a>, then
1. Let _v_ be UnicodeMatchPropertyValue(StringToCodePoints(*"General_Category"*), _s_).
1. Return the CharSet containing all Unicode code points whose character database definition includes the property “General_Category” with value _v_.
1. Let _p_ be UnicodeMatchProperty(_rer_, _s_).
1. Assert: _p_ is a binary Unicode property or binary property alias listed in the “<emu-not-ref>Property name</emu-not-ref> and aliases” column of <emu-xref href="#table-binary-unicode-properties"></emu-xref>, or a binary Unicode property of strings listed in the “<emu-not-ref>Property name</emu-not-ref>” column of <emu-xref href="#table-binary-unicode-properties-of-strings"></emu-xref>.
1. Assert: _p_ is a binary Unicode property listed in the “<emu-not-ref>Canonical property name</emu-not-ref>” column of <emu-xref href="#table-binary-unicode-properties"></emu-xref>, or a binary Unicode property of strings listed in the “<emu-not-ref>Property name</emu-not-ref>” column of <emu-xref href="#table-binary-unicode-properties-of-strings"></emu-xref>.
1. Let _A_ be the CharSet containing all CharSetElements whose character database definition includes the property _p_ with value “True”.
1. Return MaybeSimpleCaseFolding(_rer_, _A_).
</emu-alg>
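The two algorithms in this hunk can be illustrated from the script author's side. This is a sketch rather than an implementation of the spec steps: the helper `matches` is hypothetical, and the sample code point U+30FC (KATAKANA-HIRAGANA PROLONGED SOUND MARK) is chosen because the Unicode Character Database gives it Script=Common while its Script_Extensions set contains Hiragana and Katakana.

```javascript
// Hypothetical helper: does the whole of `text` match \p{property}?
function matches(property, text) {
  return new RegExp("^\\p{" + property + "}$", "u").test(text);
}

const mark = "\u30FC"; // KATAKANA-HIRAGANA PROLONGED SOUND MARK

// Script is single-valued: the mark belongs to Common only.
console.log(matches("sc=Common", mark));    // true
console.log(matches("sc=Hiragana", mark));  // false

// Script_Extensions is set-valued: a code point matches when its set
// includes the requested script, whichever alias is used to name it.
console.log(matches("scx=Hiragana", mark)); // true
console.log(matches("scx=Hira", mark));     // true
console.log(matches("scx=Kana", mark));     // true

// Lone names: a General_Category value first, otherwise a binary property.
console.log(matches("Lu", "A"));            // true
console.log(matches("Alphabetic", "a"));    // true
```

Collecting the long name, short name, and other aliases into the Set _v_ is what lets `scx=Hiragana` and `scx=Hira` select the same code points.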
@@ -38545,11 +38554,10 @@ <h1>
<dl class="header">
</dl>
<emu-alg>
1. If _rer_.[[UnicodeSets]] is *true* and _p_ is a Unicode <emu-not-ref>property name</emu-not-ref> listed in the “<emu-not-ref>Property name</emu-not-ref>” column of <emu-xref href="#table-binary-unicode-properties-of-strings"></emu-xref>, then
1. Return the List of Unicode code points _p_.
1. Assert: _p_ is a Unicode <emu-not-ref>property name</emu-not-ref> or property alias listed in the “<emu-not-ref>Property name</emu-not-ref> and aliases” column of <emu-xref href="#table-nonbinary-unicode-properties"></emu-xref> or <emu-xref href="#table-binary-unicode-properties"></emu-xref>.
1. Let _c_ be the canonical <emu-not-ref>property name</emu-not-ref> of _p_ as given in the “Canonical <emu-not-ref>property name</emu-not-ref>” column of the corresponding row.
1. Return the List of Unicode code points _c_.
1. If _rer_.[[UnicodeSets]] is *true* and _p_ is listed in the “<emu-not-ref>Property name</emu-not-ref>” column of <emu-xref href="#table-binary-unicode-properties-of-strings"></emu-xref>, then
1. Return _p_.
1. Assert: _p_ is listed in the “<emu-not-ref>Property name</emu-not-ref> and aliases” column of <emu-xref href="#table-nonbinary-unicode-properties"></emu-xref> or <emu-xref href="#table-binary-unicode-properties"></emu-xref>.
1. Return the “canonical <emu-not-ref>property name</emu-not-ref>” corresponding to the <emu-not-ref>property name</emu-not-ref> or property alias _p_ in <emu-xref href="#table-nonbinary-unicode-properties"></emu-xref> or <emu-xref href="#table-binary-unicode-properties"></emu-xref>.
</emu-alg>
<p>Implementations must support the Unicode property names and aliases listed in <emu-xref href="#table-nonbinary-unicode-properties"></emu-xref>, <emu-xref href="#table-binary-unicode-properties"></emu-xref>, and <emu-xref href="#table-binary-unicode-properties-of-strings"></emu-xref>. To ensure interoperability, implementations must not support any other property names or aliases.</p>
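As a rough illustration of the closed property list, the sketch below probes a few names through the `RegExp` constructor. The helper `compilesProperty` is hypothetical, and `Block` is used as an example of a property that exists in Unicode but is not listed in the spec's tables.

```javascript
// Hypothetical helper: true when \p{expr} compiles in Unicode mode,
// false when it is rejected with a SyntaxError.
function compilesProperty(expr) {
  try {
    new RegExp("\\p{" + expr + "}", "u");
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Canonical names and their listed aliases are both accepted.
console.log(compilesProperty("General_Category=Lu")); // true
console.log(compilesProperty("gc=Lu"));               // true
console.log(compilesProperty("Script=Greek"));        // true
console.log(compilesProperty("sc=Greek"));            // true
console.log(compilesProperty("Alphabetic"));          // true
console.log(compilesProperty("Alpha"));               // true

// Properties outside the listed tables must not be supported.
console.log(compilesProperty("Block=Basic_Latin"));   // false
```

Because UnicodeMatchProperty maps every alias to its canonical name, `gc=Lu` and `General_Category=Lu` denote the same CharSet.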
<emu-note>
@@ -38578,12 +38586,11 @@ <h1>
<emu-alg>
1. Assert: _p_ is a canonical, unaliased Unicode property name listed in the “Canonical property name” column of <emu-xref href="#table-nonbinary-unicode-properties"></emu-xref>.
1. Assert: _v_ is a property value or property value alias for the Unicode property _p_ listed in <a href="https://unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt"><code>PropertyValueAliases.txt</code></a>.
1. Let _value_ be the canonical property value of _v_ as given in the “Canonical property value” column of the corresponding row.
1. Return the List of Unicode code points _value_.
1. [declared="l"] If _v_ is a “short name” or other alias associated with some “long name” _l_ for <emu-not-ref>property name</emu-not-ref> _p_ in <a href="https://unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt"><code>PropertyValueAliases.txt</code></a>, return _l_; otherwise, return _v_.
</emu-alg>
<p>Implementations must support the Unicode property values and property value aliases listed in <a href="https://unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt"><code>PropertyValueAliases.txt</code></a> for the properties listed in <emu-xref href="#table-nonbinary-unicode-properties"></emu-xref>. To ensure interoperability, implementations must not support any other property values or property value aliases.</p>
<emu-note>
<p>For example, `Xpeo` and `Old_Persian` are valid `Script_Extensions` values, but `xpeo` and `Old Persian` aren't.</p>
<p>For example, `Xpeo` and `Old_Persian` are valid `Script` values, but `xpeo` and `Old Persian` aren't.</p>
</emu-note>
<emu-note>
<p>This algorithm differs from <a href="https://unicode.org/reports/tr44/#Matching_Symbolic">the matching rules for symbolic values listed in UAX44</a>: case, <emu-xref href="#sec-white-space">white space</emu-xref>, U+002D (HYPHEN-MINUS), and U+005F (LOW LINE) are not ignored, and the `Is` prefix is not supported.</p>
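The strict matching described in these notes can be checked directly. This is a small sketch with a hypothetical helper, `isValidPattern`; the inputs mirror the `Xpeo` / `Old_Persian` example and the UAX44 loose-matching features the note rules out.

```javascript
// Hypothetical helper: true when `source` compiles in Unicode mode,
// false when it is rejected with a SyntaxError.
function isValidPattern(source) {
  try {
    new RegExp(source, "u");
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Exact spellings from PropertyValueAliases.txt are accepted.
console.log(isValidPattern("\\p{Script=Xpeo}"));        // true
console.log(isValidPattern("\\p{Script=Old_Persian}")); // true

// Case differences, white space, and the `Is` prefix are not tolerated.
console.log(isValidPattern("\\p{Script=xpeo}"));        // false
console.log(isValidPattern("\\p{Script=Old Persian}")); // false
console.log(isValidPattern("\\p{Script=IsGreek}"));     // false
console.log(isValidPattern("\\p{script=Greek}"));       // false
```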