My proposal was to introduce the widely used 'c' syntax to represent a character in Stratego. The Stratego compiler can desugar this to the integer ASCII code of the character, which is what we use to work with characters right now. The implementation is trivial: it requires no changes to the backend and no special ATerm types.
Advantages:
Disadvantages:
I'm currently writing strategies to rewrite XML entities and character references, and I'm using overlays to name the character codes. This is already an improvement over bare integers, but it is quite verbose.
------------------------
rules

  unescape-amp :
    [c_amp(), c_a(), c_m(), c_p(), c_semicolon() | cs] -> [c_amp() | cs]

  unescape-lt :
    [c_amp(), c_l(), c_t(), c_semicolon() | cs] -> [c_lt() | cs]

  unescape-gt :
    [c_amp(), c_g(), c_t(), c_semicolon() | cs] -> [c_gt() | cs]

overlays

  c_space()      = 32
  c_quote()      = 34
  c_amp()        = 38
  c_apos()       = 39
  c_0()          = 48
  c_9()          = 57
  c_semicolon()  = 59
  c_numbersign() = 35
------------------------
With character literals in Stratego, the same rules could be written like this:
-----------------------------------
unescape-amp :
  ['&', 'a', 'm', 'p', ';' | cs] -> ['&' | cs]

unescape-lt :
  ['&', 'l', 't', ';' | cs] -> ['<' | cs]

unescape-gt :
  ['&', 'g', 't', ';' | cs] -> ['>' | cs]
-----------------------------------
Of course, (un)escaping is a case where character literals are especially useful; in general you won't use them a lot in Stratego. Because the implementation is so simple, I think it is still worth the effort.
I would like to hear your opinion :-) .
-- Martin Bravenboer - 07 Dec 2002
Ok. I have added the following to the SDF definition of Stratego:
----------------------------------------------------------------------
lexical syntax
  "\'" CharChar "\'" -> Char
  ~[\']              -> CharChar
  [\\] [\'ntr\ ]     -> CharChar
  Char               -> Id {reject}

context-free syntax
  Char -> Term {cons("Char")}
----------------------------------------------------------------------
and the following desugaring rules to stratego-desugar:
----------------------------------------------------------------------
Desugar :
  Char(c) -> Int(i)
  where <DesugarChar <+ explode-string; DesugarCharGeneric> c => i

DesugarCharGeneric :
  [39, i, 39] -> i

DesugarChar :
  "'\\''" -> 39

DesugarChar :
  "'\\n'" -> 10

DesugarChar :
  "'\\t'" -> 9

DesugarChar : // carriage return
  "'\\r'" -> 13

DesugarChar : // space
  "'\\ '" -> 32
----------------------------------------------------------------------
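As an illustration of how these rules apply, the trace below follows two literals through Desugar. It assumes, as the [39, i, 39] pattern suggests, that the string under Char(c) still includes its surrounding quotes. The intermediate terms are a hand-written sketch, not compiler output.

----------------------------------------------------------------------
// 'a' : no DesugarChar rule matches, so the generic path is taken
Char("'a'")
  explode-string      =>  [39, 97, 39]
  DesugarCharGeneric  =>  97
  result              :   Int(97)

// '\n' : matched directly by a DesugarChar rule
Char("'\\n'")
  DesugarChar         =>  10
  result              :   Int(10)
----------------------------------------------------------------------

Trying DesugarChar first matters: an escaped literal such as '\n' explodes to four character codes, so the three-element pattern of DesugarCharGeneric would not match it.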
Note that the desugaring is done at the syntactic level, as part of parsing. This means that characters are pretty-printed as integers. This can be improved later by deferring the desugaring to a later stage of the process, but that requires embedding the notion of characters more deeply in Stratego.
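For example, assuming the desugaring rules above, the unescape rules from the proposal would leave the parser with every literal replaced by its ASCII code, and would be pretty-printed roughly as follows (the exact layout the pretty-printer produces is an assumption):

----------------------------------------------------------------------
unescape-amp :
  [38, 97, 109, 112, 59 | cs] -> [38 | cs]

unescape-lt :
  [38, 108, 116, 59 | cs] -> [60 | cs]

unescape-gt :
  [38, 103, 116, 59 | cs] -> [62 | cs]
----------------------------------------------------------------------

Here 38 is '&', 59 is ';', 60 is '<' and 62 is '>'; the remaining numbers are the codes of the lowercase letters in each entity name.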
Are any other escapes needed? Note that this will break existing specifications with identifiers of the form 'c' (which I have never seen).
These changes are available in StrategoRelease09 (beta7).
-- EelcoVisser - 21 Dec 2002