
analysis #25

Open
1 of 10 tasks
KiaraGrouwstra opened this issue Feb 11, 2020 · 0 comments


KiaraGrouwstra commented Feb 11, 2020

analysis on synthesis:

  • for each task fn, the stats I likely want are: total samples, number correct/wrong, and ratio correct/wrong; then the same numbers / ratio aggregated over all task fns.
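
A rough sketch of how these per-task-fn stats could be tallied on the Python/Jupyter side. The input shape (pairs of task fn name and a correct flag) and the names `summarize` and `_row` are assumptions for illustration, not anything from the synthesizer itself. The "ratio" is reported here as the fraction correct out of the total, which avoids dividing by zero when nothing is wrong.

```python
from collections import defaultdict


def _row(total, correct):
    """Build one stats row; 'accuracy' is the fraction correct (0.0 if empty)."""
    return {
        "total": total,
        "correct": correct,
        "wrong": total - correct,
        "accuracy": correct / total if total else 0.0,
    }


def summarize(results):
    """Tally stats per task fn and over all task fns.

    `results` is an iterable of (task_fn_name, is_correct) pairs
    (a hypothetical format, assumed for this sketch).
    Returns (per_task_rows, overall_row).
    """
    counts = defaultdict(lambda: [0, 0])  # task fn -> [total, correct]
    for task_fn, is_correct in results:
        counts[task_fn][0] += 1
        counts[task_fn][1] += int(bool(is_correct))

    per_task = {name: _row(t, c) for name, (t, c) in counts.items()}
    overall = _row(
        sum(t for t, _ in counts.values()),
        sum(c for _, c in counts.values()),
    )
    return per_task, overall
```

Usage would be something like `per_task, overall = summarize([("map", True), ("map", False), ("id", True)])`, after which `per_task["map"]` holds the counts for that task fn and `overall` the aggregate.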

see the Jupyter PoC; usage issues with it are reported here.

out of scope:

visualization:

analysis on generated datasets:

  • further analysis: I imagine many factors may influence how much my synthesizer can learn from types, perhaps including the number of types used, generics used, typeclasses / constraints used, and the arities of all of these. If I can find the patterns here, it will likely help find a dataset demonstrating the added value of this additional info.
  • per dataset, visualize the distribution of sample i/o string lengths, so I can visually pick a sweet-spot cut-off string length (esp. round numbers, e.g. powers of 2) and statically fix a limited maximum string length (and corresponding dataset length) that will still allow most of the samples...
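
As a numeric companion to eyeballing the histogram, the cut-off choice could be sketched as below: for each power of 2, compute what fraction of samples would still fit. The function names, the default `max_exp`, and the 95% threshold are all assumptions picked for illustration.

```python
def coverage_at_powers_of_two(lengths, max_exp=12):
    """Return [(cutoff, fraction_of_samples_kept)] for cutoffs 1, 2, 4, ..., 2**max_exp.

    `lengths` holds the i/o string length of each sample.
    """
    lengths = list(lengths)
    n = len(lengths)
    table = []
    for exp in range(max_exp + 1):
        cutoff = 2 ** exp
        kept = sum(1 for length in lengths if length <= cutoff)
        table.append((cutoff, kept / n if n else 0.0))
    return table


def smallest_cutoff(lengths, min_coverage=0.95, max_exp=12):
    """Smallest power-of-2 cutoff keeping at least `min_coverage` of the samples.

    Returns None if no cutoff up to 2**max_exp reaches that coverage.
    """
    for cutoff, coverage in coverage_at_powers_of_two(lengths, max_exp):
        if coverage >= min_coverage:
            return cutoff
    return None
```

The coverage table can also be dumped next to the plot, so the trade-off between maximum string length and retained samples is visible in numbers as well.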

plot results for analysis

  • hasktorch tensorboard issue
  • regular plotting libraries in Haskell seem inferior to seaborn, so maybe just dump stats to CSV and plot from there with seaborn (old code: 1, 2, 3, 4)
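
A minimal sketch of the CSV route, assuming the Haskell side writes one row per task fn with columns `task_fn,total,correct` (a hypothetical layout, not the actual format of the old code). Loading uses only the standard library; the plotting helper needs seaborn and matplotlib installed and imports them lazily.

```python
import csv


def load_stats(csv_file):
    """Read per-task-fn rows from an open CSV file.

    Expects the (assumed) header: task_fn,total,correct
    """
    rows = []
    for rec in csv.DictReader(csv_file):
        total = int(rec["total"])
        correct = int(rec["correct"])
        rows.append({
            "task_fn": rec["task_fn"],
            "total": total,
            "correct": correct,
            "accuracy": correct / total if total else 0.0,
        })
    return rows


def plot_accuracy(rows, out_path):
    """Save a bar plot of fraction correct per task fn to `out_path`."""
    import matplotlib
    matplotlib.use("Agg")  # render to file, no display needed
    import seaborn as sns

    ax = sns.barplot(
        x=[r["task_fn"] for r in rows],
        y=[r["accuracy"] for r in rows],
    )
    ax.set_xlabel("task fn")
    ax.set_ylabel("fraction correct")
    ax.figure.savefig(out_path)
```

Typical use would be `with open("stats.csv", newline="") as f: rows = load_stats(f)` followed by `plot_accuracy(rows, "accuracy.png")`, where both file names are placeholders.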
@KiaraGrouwstra KiaraGrouwstra changed the title analyze dataset analysis May 2, 2020