String
Strings are unicode encode text. Specifically, string are immutable UTF-8 encoded sequences of Unicode code points.
Dark v1 problems
Concatenation
Problem: Users currently have to do concatenation like so:
"I am "
|> ++ user.name
|> ++ " and I am "
|> ++ (toString user.age)
|> ++ " years old"
Solution: Instead, we'd like to support string interpolation
"I am ${user.name} and I am ${user.age} years old"
Status: language definition spec'ed. Interaction model not spec'ed
Special characters
Problem: To enter a newline, carriage return, tab, or other special character, you have to paste them directly. You can't type any of them. Related to this, the display of these tokens in the editor is broken.
Solution: support using escape characters (\
) to support them (\n, \r, \t, \\, \", etc
). Describe the complex UX for adding them, deleting, displaying, and editing them, in the spec below.
Status: language definition spec'ed, interaction model not spec'ed
Emoji
Problem: I think the editor does not support proper unicode - I'm not sure.
Solution: the editor should support entering all LTR Unicode text (RTL can wait until Dark v3) - if you can type it into the browser, we should support it in the editor.
Status: problem not understood, not spec'ed
String length
Problem: String length is determined in O(n)
time.
Solution: String length should be cached as part of the string. Using a better string implementation would help solve this.
Status: spec'ed
Shortened display
Problem: We wrap strings at 40 characters to make lines not run on forever. This has a number of annoying problems:
- sometimes the string is only 41 character and it looks bad
- sometimes the line has more room than 40 characters and it looks dumb
- sometimes the line has builtin line breaks, but it breaks off length instead
- We should do a better job of wrapping that takes into account the entire length of the line, and make 40 configurable.
Solution: TODO
Status: not spec'ed
Cursor affinity
Problem: the cursor can be in two different places which logically mean the same thing (the end of a line, and the start of the subsequent line). This leads to "cursor affinity" problems.
Solution: TODO: this was written down somewhere.
Status: Not spec'ed
v2 spec
Strings are unicode, and character are unicode “characters” (if it appears as one character on the screen, that’s a “character” in Dark).
Specifically, string are immutable UTF-8 encoded sequences of Unicode code points. Chars are “Extended Grapheme Clusters”. (A codepoint is some bytes that implement unicode characters, a grapheme is some codepoints forming a unicode entity, such as an emoji; an EGC is some graphemes, used to handle things like emojis which combine to form a single emoji).
v2 Language definition
type string = # unicode supporting type, should include length
type stringSegment =
| Text of string
| InterpolatedExpr of expr
type Expr =
| EString of stringSegment list
| ...
type Pattern =
| PString of string list
| ...
type Dval =
| DString of string
| ...
type DType =
| TString
| ...
Escaped characters can be stored as their actual values in the string, and displayed/entered differently in the editor.
v2 Standard library
type StringError =
| FloatConversionError
| IntegerConversionError
// same as v1
String::append_v1(String: s1, String: s2) -> String
String::base64Decode(String: s) -> String
String::base64Encode(String: s) -> String
String::contains(String: lookingIn, String: searchingFor) -> Bool
String::digest(String: s) -> String
String::dropFirst(String: string, Int: characterCount) -> String
String::dropLast(String: string, Int: characterCount) -> String
String::endsWith(String: subject, String: suffix) -> Bool
String::first(String: string, Int: characterCount) -> String
String::fromChar_v1(Character: c) -> String
String::isEmpty(String: s) -> Bool
String::join(List l, String separator) -> String
String::last(String: string, Int: characterCount) -> String
String::length_v1(String: s) -> Int
String::padEnd(String: string, String: padWith, Int: goalLength) -> String
String::padStart(String: string, String: padWith, Int: goalLength) -> String
String::prepend(String: s1, String: s2) -> String
String::replaceAll(String: s, String: searchFor, String: replaceWith) -> String
String::reverse(String: string) -> String
String::slice(String: string, Int: from, Int: to) -> String
String::slugify_v2(String string) -> String
String::split(String s, String separator) -> List
String::startsWith(String: subject, String: prefix) -> Bool
String::toBytes(String: str) -> Bytes
String::toFloat_v1(String: s) -> Result (Float, StringError)
String::toInt_v1(String: s) -> Result (Float, StringError)
String::toList_v1(String: s) -> List Character
String::toLowercase_v1(String: s) -> String
String::toUppercase_v1(String: s) -> String
String::trim(String: str) -> String
String::trimEnd(String: str) -> String
String::trimStart(String: str) -> String
// Maybe could be better
String::htmlEscape(String html) -> String
String::newline() -> String
// Move to UUID module
String::toUUID_v1(String: uuid) -> Result (UUID, StringError)
// Different in v2
String::foreach_v1(String: s, Block f) -> String
String::fromList_v1(List l) -> String
String::random_v2(Int: length) -> String // length < 0 means empty string
v2 Interaction model
String escaping
TODO
Interpolation
TODO