-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Fuzz.string generate unicode strings. #92
Conversation
N.B. master already has a major change on it, renaming the exposed Shrink module to Simplify. |
f68dd5c
to
4c4e25d
Compare
Ready for another round of reviews. @mgold |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the fuzzer generate a sting containing two combining marks immediately after each other?
Would it be possible to unit test against it?
I think it can. Do we want to limit ourselves to just valid unicode strings? I bet for 99% of cases it would shrink down to a single one. |
https://en.m.wikipedia.org/wiki/Combining_character
So multiple combination diacritics are valid! |
If there's a bug that requires more than one diacritic mark in a row to trigger, it doesn't hurt if we find it, even if the diff is hard to read (hex output in diff helps). If it's triggered by multiple combining characters in a row but doesn't require those exact characters, it'll shrink to something readable. I think we should keep it as-is, mainly on the argument that it's easier to code/test/understand. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is okay other than adding an unrelated method to the public API.
@mgold Are you thinking of this multi-line import exposing? Github folded it a bit unhelpfully. https://github.com/elm-explorations/test/pull/92/files#diff-adac3ec51c96ebc8ace1960bee9f3d35R51 There's an addition to the public interface of Fuzz.Internal, but that module isn't exposed, and I need that function in Fuzz.elm. |
src/Fuzz.elm
Outdated
, ( 1, Random.int 11 50 ) | ||
, ( 1, Random.int 50 1000 ) | ||
] | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this function slightly hard to grok and, whilst having a play with it to get my head round what the code does, rewrote it slightly. I believe the function might be slightly easier to read is this form:
string : Fuzzer String
string =
let
( firstFreq, restFreqs ) =
unicodeCharGeneratorFrequencies
lengthGenerator =
(Random.frequency
( 3, Random.int 1 10 )
[ ( 0.2, Random.constant 0 )
, ( 1, Random.int 11 50 )
, ( 1, Random.int 50 1000 )
]
)
unicodeGenerator =
frequencyList lengthGenerator firstFreq restFreqs
|> Random.map String.fromList
in
custom unicodeGenerator Simplify.string
Otherwise this patch looks good to me :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me :) Took it as you wrote it and ran it though elm-format.
@drathier Ah, that's an import, not an exposing. Looks good! |
Copy of elm-community/elm-test#204. This PR was originally made for elm 0.18 back in August 2017. We delayed merging it since there was a bug with
String.reverse
where it didn't handle emojis or composite characters correctly. The bug has since been both fixed and published, so all 0.19 elm/core versions now reverse the strings we use in this fuzzer properly. There are still issues with the string reversal algorithm used in elm, but this new fuzzer doesn't generate strings that hit those cases.