Releases · utahplt/gtp-benchmarks

03 May 19:20

bennn

9b451b6

v9.3

In morsecode, change an import type in main.rkt to match the export type (from Index to Integer). No change to performance.
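As a rough illustration of what an import type is, here is a minimal sketch. It assumes a made-up untyped submodule `lib` and a made-up function `length-gap`; neither appears in morsecode. The annotation in `require/typed` is the import type, and here it states Integer to match what the exporter promises.

```racket
#lang typed/racket

;; Hypothetical untyped module standing in for the exporter.
;; The real morsecode modules and function names are not shown in these notes.
(module lib racket/base
  (provide length-gap)
  ;; Returns an exact integer; nothing here promises the narrower Index type.
  (define (length-gap a b)
    (abs (- (string-length a) (string-length b)))))

;; The import annotation says Integer, the same type the export is assumed to have.
(require/typed 'lib
  [length-gap (-> String String Integer)])

(: gap Integer)
(define gap (length-gap "sos" "help"))
```

With an Index annotation instead, the boundary would check for the narrower type, which is the mismatch this release removes.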

0 = old, 1 = new

morsecode-pr51.tar.gz

22 May 21:01

bennn

v9.2

44cabf5

v9.2 minor take5 changes

Replace the module+ main with a plain expression. Having the submodule is a problem for tools like the contract profiler (a minor problem, but it's easier to drop the submodule).
Add an assert around the call to random because its type no longer guarantees nonnegative numbers. (The old type was unsound but fine to use here.)
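The two changes above might look roughly like the sketch below. The helper name `pick-index` and its argument type are assumptions made for illustration; the actual take5 call site is not shown in these notes.

```racket
#lang typed/racket

;; Hypothetical helper; not taken from the take5 sources.
(: pick-index (-> Positive-Byte Natural))
(define (pick-index n)
  ;; assert runs the predicate at run time and narrows the static type
  ;; of the result to Natural, whatever integer type random reports.
  (assert (random n) exact-nonnegative-integer?))

;; A plain top-level expression, in place of a (module+ main ...) submodule.
(pick-index 10)
```

The assert never fails here, since `(random n)` with one argument returns a value from 0 up to but not including n; it exists only to satisfy the type checker.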

Performance is the same afterward.

01 Dec 16:06

bennn

v9.0

3cbcdae

v9.0

Substantially revise acquire and take5. Before, acquire ran a game with AI players that all raised exceptions and take5 ignored an input list of AI players. After, the acquire players make valid moves and take5 uses its input. These changes do not affect the typed/untyped overhead.

Thank you @LLazarek . #38 #39

acquire_8.1.tar.gz
take5_8.2.tar.gz

22 Oct 01:03

bennn

v8.0

3d70679

v8.0

Remove racket/sandbox dependency from acquire and remove the player AI that times out.

Performance is similar before and after in a first test.

But in general, this change should make acquire measurements more stable. We care about the cost of types, not of system calls.

Data:
acquire-sandbox.tar.gz

10 Jul 21:32

bennn

v7.0

68cdfe2

v7.0

Fix a return value in lnm. Before, it was a port; after, it is void.

Affects the function ensure-tikz in the module benchmarks/lnm/untyped/modulegraph.rkt.

There is no change in performance.

Original issue report: https://github.com/bennn/gtp-benchmarks/issues/25

20 Jul 17:32

bennn

v6.0

d1a8594

v6.0

Major Changes

Edited all benchmarks so that typed and untyped code are very similar.

If you compare any two typed/A.rkt and untyped/A.rkt files, the only differences should be the requires and the type annotations.

Example: gregor

In at least one place, the untyped gregor code had an extra assert. It's gone now.

diff --git a/benchmarks/gregor/untyped/date.rkt b/benchmarks/gregor/untyped/date.rkt
index a3102a9..6ceccb7 100644
--- a/benchmarks/gregor/untyped/date.rkt
+++ b/benchmarks/gregor/untyped/date.rkt
@@ -63,7 +64,6 @@
 (define date->ymd Date-ymd)
 ;(: date->jdn (-> Any Integer))
 (define (date->jdn d)
-  (unless (Date? d) (error "date->jdn type error"))
   (Date-jdn d))

Example: lnm

Typed lnm now uses asserts instead of casts to validate input data. Untyped lnm uses the same asserts.

diff --git a/benchmarks/lnm/typed/spreadsheet.rkt b/benchmarks/lnm/typed/spreadsheet.rkt
index dd2dcbc..b869929 100644
--- a/benchmarks/lnm/typed/spreadsheet.rkt
+++ b/benchmarks/lnm/typed/spreadsheet.rkt
@@ -62,7 +62,7 @@
   (void)
   ;; For each row, print the config ID and all the values
   (for ([(row n) (in-indexed vec)])
-    (void (natural->bitstring (cast n Index) #:pad (log2 num-configs)))
+    (void (natural->bitstring (assert n index?) #:pad (log2 num-configs)))
     (for ([v row]) (void "~a~a" sep v))
     (void)))
 
@@ -71,8 +71,18 @@
 (define (rktd->spreadsheet input-filename
                              #:output [output #f]
                              #:format [format 'tab])
-  (define vec (cast (file->value input-filename) (Vectorof (Listof Index))))
+  (define vec
+    (for/vector : (Vectorof (Listof Index)) ((x (in-vector (assert (file->value input-filename) vector?))))
+      (listof-index x)))
   (define suffix (symbol->extension format))
   (define out (or output (path-replace-suffix input-filename suffix)))
   (define sep (symbol->separator format))
   (vector->spreadsheet vec out sep))
+
+(: listof-index (-> Any (Listof Index)))
+(define (listof-index x)
+  (if (and (list? x)
+           (andmap index? x))
+    x
+    (error 'listof-index)))

diff --git a/benchmarks/lnm/untyped/spreadsheet.rkt b/benchmarks/lnm/untyped/spreadsheet.rkt
index 18be330..6466fb0 100644
--- a/benchmarks/lnm/untyped/spreadsheet.rkt
+++ b/benchmarks/lnm/untyped/spreadsheet.rkt
@@ -14,6 +14,7 @@
 ;; ----------------------------------------------------------------------------
 
 (require
+  "../base/untyped.rkt"
   (only-in racket/file file->value)
   (only-in "bitstring.rkt" log2 natural->bitstring)
 )
@@ -55,7 +56,7 @@
   (void)
   ;; For each row, print the config ID and all the values
   (for ([(row n) (in-indexed vec)])
-    (void (natural->bitstring n #:pad (log2 num-configs)))
+    (void (natural->bitstring (assert n index?) #:pad (log2 num-configs)))
     (for ([v row]) (void "~a~a" sep v))
     (void)))
 
@@ -64,8 +65,16 @@
 (define (rktd->spreadsheet input-filename
                              #:output [output #f]
                              #:format [format 'tab])
-  (define vec (file->value input-filename))
+  (define vec
+    (for/vector ((x (in-vector (assert (file->value input-filename) vector?))))
+      (listof-index x)))
   (define suffix (symbol->extension format))
   (define out (or output (path-replace-suffix input-filename suffix)))
   (define sep (symbol->separator format))
   (vector->spreadsheet vec out sep))
+
+(define (listof-index x)
+  (if (and (list? x)
+           (andmap index? x))
+    x
+    (error 'listof-index)))

results (on Racket 7.7 BC release)

For most benchmarks, performance is the same before & after. But:

lnm has lower overhead
quadT has higher overhead
quadU has higher overhead

lnm typed code is much faster now (down from ~4.5s to 0.7s) because it uses assert instead of cast. The vector casts in spreadsheet.rkt and summary.rkt cost a little --- putting them back adds 1.5s and 0.5s, respectively. But the big saving comes from replacing (cast .... Index) with (assert .... index?) in bitstring.rkt --- reverting adds almost 2.5s.

Both the untyped and fully-typed quad configurations run faster now, which likely makes the mixed configurations look worse. One reason for the change is that quad? is a simple function instead of a define-predicate ... but things are harder to tease apart. (There are few changes to the main files, so things must be happening related to the base/ context, and that's hard to swap out & test.)

Full data & plots here:
gtp-benchmarks-v5-vs-v6.tar.gz

Raw gtp-measure output:
manifest-v6.tar.gz

20 Nov 00:01

bennn

v5.0

6b2bc0d

v5.0

Fix one bug in lnm and one bug in zordoz.

lnm

The typed lnm code performs an extra cast to satisfy the type checker, BUT the code doing the cast had a use-before-definition bug. That bug is fixed, and now the typed & untyped code compute the same plots.

Pull request, with more details on the issue:
https://github.com/bennn/gtp-benchmarks/pull/19

This change improves performance a little. I guess plot throwing & handling an exception is more expensive than computing the next point to draw.

zordoz

The typed zordoz contained an unused call to format. This call is gone now, so (hopefully) the typed & untyped benchmarks are now running the same code.

Pull request:
https://github.com/bennn/gtp-benchmarks/pull/20

Unfortunately this change has BIG implications for performance. That format call must have been executed often and suffered from runtime checks / wrappers.

old typed/untyped ratio = 10.91x
new typed/untyped ratio = 1.36x

The new zordoz now has worst-case <4x overhead. Before, things went up to 14x. Many thanks to @camoy for finding this small-looking error that introduced large overhead in typed code.

data for plots

zordoz-lnm-v5.tar.gz

Thank you Cameron Moy

03 Nov 01:33

bennn

v4.0

ccdf564

v4.0

Replace a cast in the typed version of zombie with a predicate test.

The untyped code now uses the same predicate.

zombie is now a better gradual typing benchmark because less of the performance difference between its typed and untyped code is explained by a call to cast.
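To make the distinction concrete, here is a minimal sketch contrasting a cast with a predicate test. The helper names and the Real type are assumptions chosen for illustration; they do not come from the zombie sources.

```racket
#lang typed/racket

;; Hypothetical helpers; these do not appear in zombie.

;; cast: Typed Racket protects the value with a contract for the
;; target type, and that contract is applied at run time.
(: as-real/cast (-> Any Real))
(define (as-real/cast v)
  (cast v Real))

;; Predicate test: an ordinary function call. Occurrence typing
;; narrows v to Real in the then-branch, so no contract is needed.
(: as-real/test (-> Any Real))
(define (as-real/test v)
  (if (real? v)
      v
      (error 'as-real/test "expected a real number, got ~e" v)))
```

The second form can be written identically in untyped code, apart from the type annotation, which is what lets both versions of the benchmark share the same check.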

EDIT: here's some data collected with Racket 7.4

old typed/untyped ratio = 4.37x
new typed/untyped ratio = 1.83x

plot of old (zombie-3) vs new (zombie-4) showing that the new version has MORE configurations that suffer LESS overhead

full data behind the plot:
zombie-v4.tar.gz

Thank you Sam Tobin-Hochstadt and Cameron Moy

17 Oct 15:52

bennn

v3.0

1bcb114

v3.0

Fix an issue with the untyped zordoz code.

Before, two untyped modules imported from a typed library. After, the untyped code imports the untyped library.

This change removes an unnecessary boundary, making the untyped code a more realistic baseline for measuring Typed Racket's overhead.

The following plot compares the overhead in zordoz for version 2 (zordoz-v2) and version 3 (zordoz-v3) of the GTP benchmarks. Version 3 is significantly worse:

Full results:
zordoz-gtp-2-vs-3.tar.gz

Thank you Cameron Moy

21 May 01:07

bennn

v2.0

6bf5102

v2.0

Fix a difference between the typed and untyped mbta code. Both are the same now.

The fix does not appear to affect performance.

Attached data:

mbta2-vs-orig.tar.gz: output from a gtp-measure run comparing 0-mbta (after the change) to 1-mbtaorig (before), plus a tab-separated file with 95% confidence intervals for each configuration.

mbta2-vs-orig.tar.gz

picture of overhead after (0-mbta) and before (1-mbtaorig)

Thank you Robby Findler and Sam Sundar
