Issue 8594 - Enum string validator in Phobos?
Status: NEW
Alias: None
Product: D
Classification: Unclassified
Component: phobos
Version: D2
Hardware: All All
Importance: P4 enhancement
Assignee: No Owner
 
Reported: 2012-08-27 12:00 UTC by bearophile_hugs
Modified: 2024-12-01 16:15 UTC

Description bearophile_hugs 2012-08-27 12:00:47 UTC
Ada has a handy feature: you can define an enumeration of chars and then write arrays of that enumeration as string literals, and the compiler enforces that only the allowed chars are used:


procedure Test is
   type Hexa is ('A', 'B', 'C', 'D', 'E', 'F');
   type Hex_Array is array (0 .. 5) of Hexa;
   data : Hex_Array;
begin
  data := "BACEDC";
end;


Such literals are very useful: many kinds of problems use data defined on a subset of the chars, and chars are a compact representation. Such strings can represent sequences of commands, starting configurations of problems, game boards, and many other kinds of discrete data.

(Note: in Ada stack-allocated arrays like Hex_Array are used quite often, more than heap-allocated arrays.)


If you try to define a literal that contains a wrong char:

procedure Test is
   type Hexa is ('A', 'B', 'C', 'D', 'E', 'F');
   type Hex_Array is array (0 .. 5) of Hexa;
   data : Hex_Array;
begin
  data := "BACgDC";
end;


The Ada compiler gives you a compile-time error:

prog.adb:6:15: character not defined for type "Hexa" defined at line 2


Such compile-time validation is very useful for avoiding bugs, and in D using enum values is useful because it allows you to process the data with a safer "final switch", which must cover every enum member, instead of a regular "switch" on string chars.
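
As a minimal sketch of that point: D's exhaustive switch over an enum is spelled `final switch`, and it refuses to compile unless every member has a case. The function name `hexValue` and the numeric mapping are assumptions made up for this illustration; they are not part of the proposal.

```d
enum Hexa : char { A = 'A', B = 'B', C = 'C', D = 'D', E = 'E', F = 'F' }

// Hypothetical helper (illustration only): maps each Hexa member to
// the numeric value of the corresponding hexadecimal digit.
int hexValue(Hexa h) pure nothrow @safe
{
    // final switch requires a case for every member of Hexa;
    // removing any case below is a compile-time error.
    final switch (h)
    {
        case Hexa.A: return 10;
        case Hexa.B: return 11;
        case Hexa.C: return 12;
        case Hexa.D: return 13;
        case Hexa.E: return 14;
        case Hexa.F: return 15;
    }
}

// The function is also usable at compile time.
static assert(hexValue(Hexa.C) == 12);
```

A regular switch on a plain char would need a default branch and would silently accept chars outside the intended set.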


This is one possible D translation, but even using with() the array literal requires commas (and strings are often more convenient as literals):

enum Hexa : char { A='A', B='B', C='C', D='D', E='E', F='F' }
alias HexArray = Hexa[6]; // this is not a true type as in Ada
void main() {
    HexArray data;
    with (Hexa)
        data = [B,A,C,E,D,C];
}



So I have created a small compile-time function + template that validates a string at compile time:

// - - - - - - - - - - - - - - - -
import std.traits: isSomeChar, EnumMembers;

private E[] _validateEnumString(E)(in string txt)
pure nothrow if (is(E TC == enum) && isSomeChar!TC) {
    auto result = new typeof(return)(txt.length);

    OUTER:
    foreach (i, c; txt) {
        /*static*/ foreach (e; EnumMembers!E)
            if (c == e) {
                result[i] = e;
                continue OUTER;
            }
        assert(false, "Not valid enum char: " ~ c);
    }

    return result;
}

enum Hexa : char { A='A', B='B', C='C', D='D', E='E', F='F' }

template Hexas(string path) {
    enum Hexas = _validateEnumString!Hexa(path);
}

alias HexArray = Hexa[6];
void main() {
    HexArray data = Hexas!"BACEDC";
}
// - - - - - - - - - - - - - - - -




This alternative design uses a cast to avoid duplicating the input and may reduce compilation time, but it produces only arrays of immutable enums:

// - - - - - - - - - - - - - - - -
import std.traits: isSomeChar, EnumMembers;

private immutable(E)[] _validateEnumString(E)(in string txt)
pure nothrow if (is(E TC == enum) && isSomeChar!TC) {

    OUTER:
    foreach (i, c; txt) {
        /*static*/ foreach (e; EnumMembers!E)
            if (c == e)
                continue OUTER;
        assert(false, "Not valid enum char: " ~ c);
    }

    return cast(typeof(return))txt;
}

enum Hexa : char { A='A', B='B', C='C', D='D', E='E', F='F' }

template Hexas(string path) {
    enum Hexas = _validateEnumString!Hexa(path);
}

alias HexArray = Hexa[6];
void main() {
    HexArray data = Hexas!"BACEDC";
    auto data2 = Hexas!"BACEDC";
    static assert(is(typeof(data2) == immutable(Hexa)[]));
}
// - - - - - - - - - - - - - - - -
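
To show that an invalid literal is rejected at compile time, much like the Ada error above, here is a hedged sketch of a variant that takes the enum type as a template parameter. The names `validateEnumString` and `enumString` are hypothetical and not an existing Phobos API; the sketch assumes a compiler that gags the failed CTFE assert inside `__traits(compiles, ...)`.

```d
import std.traits : EnumMembers, OriginalType, isSomeChar;

// Hypothetical helper, not a Phobos API: converts txt to E[] and
// asserts that every char matches some member of E.
private E[] validateEnumString(E)(string txt) pure
if (is(E == enum) && isSomeChar!(OriginalType!E))
{
    auto result = new E[](txt.length);
    foreach (i, c; txt)
    {
        bool found = false;
        // EnumMembers!E is a compile-time sequence, so this loop is unrolled.
        foreach (e; EnumMembers!E)
        {
            if (c == e)
            {
                result[i] = e;
                found = true;
            }
        }
        assert(found, "Not a valid enum char: " ~ c);
    }
    return result;
}

// Storing the result in an enum forces the validation to run in CTFE.
template enumString(E, string txt)
{
    enum enumString = validateEnumString!E(txt);
}

enum Hexa : char { A = 'A', B = 'B', C = 'C', D = 'D', E = 'E', F = 'F' }

// A valid literal is accepted.
static assert(enumString!(Hexa, "BACEDC").length == 6);
// A literal containing 'g' fails to instantiate.
static assert(!__traits(compiles, enumString!(Hexa, "BACgDC")));
```

Usage would look like `Hexa[6] data = enumString!(Hexa, "BACEDC");`. Passing the enum type explicitly avoids writing one wrapper template per enum, at the cost of a slightly longer call.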


Defining enum array literals this way is a built-in feature of Ada because it's commonly useful. This is quoted from Wikipedia:
http://en.wikipedia.org/wiki/Enumerated_type#Ada

Like Modula-3 Ada treats Boolean and Character as special pre-defined (in package "Standard") enumerated types. Unlike Modula-3 one can also define own character types:

type Cards is ('7', '8', '9', 'J', 'Q', 'K', 'A');


So maybe a template similar to the ones I have shown here is useful enough to be added to Phobos.
Comment 1 dlangBugzillaToGithub 2024-12-01 16:15:35 UTC
THIS ISSUE HAS BEEN MOVED TO GITHUB

https://github.com/dlang/phobos/issues/9937

DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB