Specify one character:
Grammar
    Chars
        <SmallA> a
        <EuroSign> €
        <GClef> 𝄞
    End
End
The name of a character set must be written in angle brackets.
Special characters:
Grammar
    Chars
        <Space> " "
        <Minus> "-"
        <Hash> "#"
        <DoubleQuote> """"
        <LineFeed> ""/""
    End
End
Unicode code points with four digits:
Grammar
    Chars
        <Space> 0020
        <Minus> 002D
        <Hash> 0023
        <DoubleQuote> 0022
        <LineFeed> 000A
    End
End
Unicode code points with six digits (supplementary characters):
Grammar
    Chars
        <SmallA> 0061
        <EuroSign> 20AC
        <GClef> 01D11E
    End
End
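To check which character a hexadecimal code point stands for, a short Python sketch can help. Python is used here only for illustration and is not part of the grammar notation; `char_from_hex` is a hypothetical helper name.

```python
def char_from_hex(code_point: str) -> str:
    """Return the character for a hexadecimal code point such as '20AC'."""
    return chr(int(code_point, 16))


# The three code points used in the example above.
for value in ("0061", "20AC", "01D11E"):
    print(value, char_from_hex(value))
```

Leading zeros do not change the value, so `0061` and `000061` name the same character.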
Multiple characters:
Grammar
    Chars
        <Digit> 0 1 2 3 4 5 6 7 8 9
    End
End
Reuse sets:
Grammar
    Chars
        <Digit> 0 1 2 3 4 5 6 7 8 9
        <Hex> <Digit> a b c d e f A B C D E F
    End
End
Define ranges of characters (from, to):
Grammar
    Chars
        <Digit> Range 0 9
        <Hex> <Digit> Range a f Range A F
    End
End
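The reuse and range examples above can be mirrored with ordinary sets. The following Python sketch is only an illustration and not part of the grammar notation. It assumes that a character set behaves like a mathematical set of characters and that both range bounds are inclusive; `char_range` is a hypothetical helper.

```python
def char_range(first: str, last: str) -> set[str]:
    """Return all characters from first to last, both bounds included."""
    return {chr(code) for code in range(ord(first), ord(last) + 1)}


# <Digit> Range 0 9
digit = char_range("0", "9")

# <Hex> <Digit> Range a f Range A F
# Reusing <Digit> is modeled as a set union (an assumption).
hex_chars = digit | char_range("a", "f") | char_range("A", "F")

print(len(digit), len(hex_chars))
```

Under these assumptions the enumerated and the range-based definition of `<Hex>` describe the same 22 characters.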
All characters belonging to the specified Unicode category:
Grammar
    Chars
        <UpperCaseLetter> Category Lu
        <NumberDecimalDigit> Category Nd
    End
End
Categories:
Cc = Other, control
Cf = Other, format
Cn = Other, not assigned
Co = Other, private use
Cs = Other, surrogate
Lo = Letter, other
Ll = Letter, lowercase
Lm = Letter, modifier
Lt = Letter, titlecase
Lu = Letter, uppercase
Mc = Mark, spacing combining
Me = Mark, enclosing
Mn = Mark, nonspacing
Nd = Number, decimal digit
Nl = Number, letter
No = Number, other
Pc = Punctuation, connector
Pd = Punctuation, dash
Pe = Punctuation, close
Pf = Punctuation, final quote
Pi = Punctuation, initial quote
Po = Punctuation, other
Ps = Punctuation, open
Sc = Symbol, currency
Sk = Symbol, modifier
Sm = Symbol, math
So = Symbol, other
Zl = Separator, line
Zp = Separator, paragraph
Zs = Separator, space
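To find out which category a given character belongs to, the two-letter codes listed above can be looked up with Python's standard `unicodedata` module. This is only a lookup aid and not part of the grammar notation; `category_of` is a hypothetical helper name.

```python
import unicodedata


def category_of(char: str) -> str:
    """Return the two-letter Unicode general category of one character."""
    return unicodedata.category(char)


# A few characters used elsewhere in this document.
for char in ("A", "a", "5", " ", "-", "#", "€", "\n"):
    print(repr(char), category_of(char))
```

The result depends on the Unicode version shipped with the Python interpreter, which may differ from the version used by the grammar tool.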
Predefined sets:
Grammar
    Chars
        <Letter> Letter
        <Digit> Digit
        <LetterOrDigit> Letter Digit
    End
End
Letter = Unicode Categories Ll, Lm, Lo, Lt, Lu
Digit = Unicode Categories Nd
Number = Unicode Categories Nd, Nl, No
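The three predefined sets can be approximated by membership tests on the Unicode general category. The sketch below restates the definitions above in Python for illustration only; the names `LETTER`, `DIGIT`, `NUMBER`, and `in_set` are hypothetical.

```python
import unicodedata

# Category lists copied from the definitions above.
LETTER = {"Ll", "Lm", "Lo", "Lt", "Lu"}
DIGIT = {"Nd"}
NUMBER = {"Nd", "Nl", "No"}


def in_set(char: str, categories: set[str]) -> bool:
    """Return True if the character's general category is in the given set."""
    return unicodedata.category(char) in categories


print(in_set("ä", LETTER))   # lowercase letter
print(in_set("½", DIGIT))    # vulgar fraction, category No
print(in_set("½", NUMBER))
```

Note that `Digit` is narrower than `Number`: a character such as `½` is a number but not a decimal digit.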
Exclude characters:
Grammar
    Chars
        <Digit> 0 1 2 3 4 5 6 7 8 9
        <EvenDigit> <Digit> Except 1 3 5 7 9
    End
End
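Read as a set operation, `Except` corresponds to a set difference. The Python lines below illustrate this reading; they assume that `Except` removes exactly the listed characters and are not part of the grammar notation.

```python
# <Digit> 0 1 2 3 4 5 6 7 8 9
digit = set("0123456789")

# <EvenDigit> <Digit> Except 1 3 5 7 9
even_digit = digit - set("13579")

print(sorted(even_digit))  # ['0', '2', '4', '6', '8']
```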
Any character (Unicode scalar value):
Grammar
    Chars
        <Any> Range 0000 D7FF Range E000 10FFFF
        <NotLineFeed> <Any> Except 000A
    End
End
Note that values in the surrogate range U+D800 to U+DFFF are not allowed.
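The two ranges in the `<Any>` definition are exactly the Unicode scalar values, that is, all code points except the surrogates. A small Python check makes the boundaries explicit; `is_unicode_scalar` is a hypothetical helper and not part of the grammar notation.

```python
def is_unicode_scalar(code_point: int) -> bool:
    """Return True for U+0000..U+D7FF and U+E000..U+10FFFF."""
    return 0x0000 <= code_point <= 0xD7FF or 0xE000 <= code_point <= 0x10FFFF


print(is_unicode_scalar(0x000A))    # True: line feed
print(is_unicode_scalar(0xD800))    # False: first surrogate
print(is_unicode_scalar(0xDFFF))    # False: last surrogate
print(is_unicode_scalar(0x10FFFF))  # True: highest code point
```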
Shortcut:
Grammar
    Chars
        <NotLineFeed> Any Except 000A
    End
End
Token consisting of a single character:
Grammar
    Chars
        <DigitChar> Range 0 9
    End
    Tokens
        <Digit> <DigitChar>
    End
End
Sequence of characters:
Grammar
    Chars
        <Digit> Range 0 9
    End
    Tokens
        <ThreeDigitNumber> <Digit> <Digit> <Digit>
    End
End
Repeat 0..N:
Grammar
    Chars
        <Digit> Range 0 9
        <Digit1To9> Range 1 9
    End
    Tokens
        <Number> <Digit1To9> Repeat* <Digit>
    End
End
Repeat 1..N:
Grammar
    Chars
        <Letter> Range A Z Range a z
    End
    Tokens
        <Name> Repeat+ <Letter>
    End
End
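For readers who know regular expressions, the two repetition forms can be compared to `*` and `+`. The Python sketch below is a rough analogy, assuming that `Repeat*` means zero or more and `Repeat+` means one or more occurrences of the following element, as the labels 0..N and 1..N suggest. The function names are hypothetical.

```python
import re

# <Number> <Digit1To9> Repeat* <Digit>
NUMBER = re.compile(r"[1-9][0-9]*")

# <Name> Repeat+ <Letter>
NAME = re.compile(r"[A-Za-z]+")


def is_number(text: str) -> bool:
    """Return True if the whole text matches the <Number> analogy."""
    return NUMBER.fullmatch(text) is not None


def is_name(text: str) -> bool:
    """Return True if the whole text matches the <Name> analogy."""
    return NAME.fullmatch(text) is not None


print(is_number("7"), is_number("120"), is_number("012"))
print(is_name("Abc"), is_name(""))
```

With this reading, `<Number>` accepts a single digit from 1 to 9 on its own but never a leading zero, and `<Name>` needs at least one letter.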
Occurrence 0..1:
Grammar
    Chars
        <Letter> Range A Z Range a z
    End
    Tokens
        <OneOrTwoLetterName> <Letter> Optional <Letter>
    End
End
Grammar
    Chars
        <Letter> Range A Z Range a z
    End
    Tokens
        <OneOrThreeLetterName> <Letter> Optional ( <Letter> <Letter> )
    End
End
Use brackets to specify alternatives:
Grammar
    Chars
        <LcLetter> Range a z
        <UcLetter> Range A Z
    End
    Tokens
        <TwoLetterName> [ ( <UcLetter> <UcLetter> ) ( <LcLetter> <LcLetter> ) ]
    End
End
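Grouping, optional parts, and alternatives also have close regular-expression counterparts. The Python sketch below assumes that `( ... )` groups a sequence, `Optional` applies to the element or group that follows it, and `[ ... ]` selects exactly one of the listed alternatives. The dictionary `PATTERNS` and the function `matches` are hypothetical illustration names.

```python
import re

PATTERNS = {
    # <Letter> Optional <Letter>
    "OneOrTwoLetterName": r"[A-Za-z][A-Za-z]?",
    # <Letter> Optional ( <Letter> <Letter> )
    "OneOrThreeLetterName": r"[A-Za-z](?:[A-Za-z][A-Za-z])?",
    # [ ( <UcLetter> <UcLetter> ) ( <LcLetter> <LcLetter> ) ]
    "TwoLetterName": r"(?:[A-Z][A-Z]|[a-z][a-z])",
}


def matches(token: str, text: str) -> bool:
    """Return True if the whole text matches the pattern for the token."""
    return re.fullmatch(PATTERNS[token], text) is not None


print(matches("OneOrThreeLetterName", "ab"))  # two letters are not allowed
print(matches("TwoLetterName", "Ab"))         # mixed case is not allowed
```

The group after `Optional` is taken or skipped as a whole, which is why `<OneOrThreeLetterName>` has no two-letter form.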
Case-sensitive string:
Grammar
    Tokens
        <BeginKeyword> String Begin
        <EndKeyword> String End
    End
End
Case-insensitive string:
Grammar
    Tokens
        <BeginKeyword> CiString Begin
        <EndKeyword> CiString End
    End
End
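The difference between the two string forms can be illustrated with plain string comparison. The Python sketch below assumes that `String` requires an exact match and that `CiString` ignores letter case; how the grammar tool folds case for non-ASCII letters is not stated in the source, so `casefold` is only one possible choice. Both function names are hypothetical.

```python
def matches_string(keyword: str, text: str) -> bool:
    """Case-sensitive comparison, as assumed for String."""
    return text == keyword


def matches_ci_string(keyword: str, text: str) -> bool:
    """Case-insensitive comparison, as assumed for CiString."""
    return text.casefold() == keyword.casefold()


print(matches_string("Begin", "begin"))
print(matches_ci_string("Begin", "BEGIN"))
```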
Syntax rules can use tokens or other syntax rules. Define the root rule with RootRule.
Grammar
    Chars
        <StringChar> Any Except <DoubleQuote>
        <DoubleQuote> 0022
        <SpaceChar> 0020
    End
    Tokens
        <StringStart> <DoubleQuote>
        <StringEnd> <DoubleQuote>
        <StringContent> Repeat+ <StringChar>
        <Space> Repeat+ <SpaceChar>
    End
    Syntax
        <String> <StringStart> Optional <StringContent> <StringEnd>
        <Document> <String> Repeat* ( <Space> <String> )
    End
    RootRule <Document>
End
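To see which inputs this grammar is meant to accept, the `<Document>` rule can be approximated by a single regular expression. The Python sketch below is an analogy under the same assumptions about `Repeat*`, `Repeat+`, and `Optional` as before and does not show how the grammar tool itself parses input. `DOCUMENT` and `is_document` are hypothetical names.

```python
import re

# <String>   : a double quote, optional non-quote content, a double quote
# <Space>    : one or more space characters (U+0020)
# <Document> : a string followed by zero or more (space, string) pairs
STRING = r'"[^"]*"'
DOCUMENT = re.compile(rf"{STRING}(?: +{STRING})*")


def is_document(text: str) -> bool:
    """Return True if the whole text matches the <Document> analogy."""
    return DOCUMENT.fullmatch(text) is not None


print(is_document('"a" "b c"'))  # two strings separated by a space
print(is_document('""'))         # a single empty string
print(is_document('"a""b"'))     # the separating space is missing
```

Because `<StringContent>` is optional, the empty string `""` is a valid `<String>`, while two strings without a `<Space>` between them are rejected.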