Graph Tests #1464

Marsella8 · 2024-08-11T06:01:55Z

Description of changes:

Additional testing for graph utilities

Related Issues:

Linked Issues:

Issue #

Issues closed by this PR:

Closes #

This change is

Marsella8

Reviewable status: 0 of 16 files reviewed, 2 unresolved discussions (waiting on @lockshaw)

lib/utils/test/src/utils/graph/views/views.cc line 48 at r1 (raw file):

      std::unordered_set<UndirectedEdge> result = get_edges(view);

      // CHECK(result == expected);

possible bug? currently the get_edges also returns the edges that "cross the cut" (e.g. (1, 2)), though this does not happen for example with DiSubgraphView.

lib/utils/test/src/utils/graph/views/views.cc line 88 at r1 (raw file):

    }
  }

commented because not yet implemented

codecov · 2024-08-11T06:24:46Z

Codecov Report

Attention: Patch coverage is 98.42141% with 23 lines in your changes missing coverage. Please review.

Project coverage is 73.12%. Comparing base (2b4106f) to head (a371c02).

Files with missing lines	Patch %	Lines
lib/utils/src/utils/graph/views/views.cc	38.46%	8 Missing ⚠️
.../utils/graph/instances/hashmap_undirected_graph.cc	61.53%	5 Missing ⚠️
lib/utils/src/utils/graph/algorithms.cc	0.00%	3 Missing ⚠️
lib/utils/include/utils/fmt/optional.h	0.00%	2 Missing ⚠️
lib/utils/test/src/utils/disjoint_set.cc	95.00%	2 Missing ⚠️
...graph/digraph/algorithms/get_imm_post_dominator.cc	0.00%	1 Missing ⚠️
...tils/src/utils/graph/undirected/undirected_edge.cc	0.00%	1 Missing ⚠️
lib/utils/test/src/utils/containers/flatmap.cc	92.85%	1 Missing ⚠️

Additional details and impacted files

@@                Coverage Diff                @@
##           repo-refactor    #1464      +/-   ##
=================================================
+ Coverage          68.49%   73.12%   +4.63%     
=================================================
  Files                612      690      +78     
  Lines              18120    20145    +2025     
  Branches             519      566      +47     
=================================================
+ Hits               12412    14732    +2320     
+ Misses              5708     5413     -295

Flag	Coverage Δ
unittests	`73.12% <98.42%> (+4.63%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
lib/compiler/src/machine_mapping.cc	`0.00% <ø> (ø)`
...ls/include/utils/bidict/algorithms/merge_bidicts.h	`100.00% <100.00%> (ø)`
lib/utils/include/utils/containers/are_all_same.h	`100.00% <100.00%> (ø)`
lib/utils/include/utils/containers/compare_by.h	`100.00% <100.00%> (ø)`
lib/utils/include/utils/containers/get_one_of.h	`100.00% <100.00%> (ø)`
lib/utils/include/utils/containers/index_of.h	`100.00% <100.00%> (ø)`
lib/utils/include/utils/containers/is_submap.h	`100.00% <100.00%> (ø)`
.../utils/include/utils/containers/is_superseteq_of.h	`100.00% <100.00%> (ø)`
lib/utils/include/utils/containers/map_keys.h	`100.00% <ø> (ø)`
lib/utils/include/utils/containers/maximum.h	`100.00% <100.00%> (ø)`
... and 88 more

... and 33 files with indirect coverage changes

Marsella8

Reviewable status: 0 of 57 files reviewed, 3 unresolved discussions (waiting on @lockshaw)

lib/utils/test/src/utils/containers/get_first.cc line 10 at r2 (raw file):

  TEST_CASE("get_first") {
    std::unordered_set<int> s = {1, 2, 3};
    CHECK(contains(s, get_first(s)));

maybe get_any? get first is a bit confusing.

lockshaw

Reviewed 14 of 16 files at r1, 41 of 41 files at r2, 12 of 12 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 100 unresolved discussions (waiting on @Marsella8)

lib/utils/include/utils/graph/README.md line 0 at r4 (raw file):
Thanks for going out of your way to update the documentation! This is very much appreciated 🙂

lib/utils/include/utils/graph/README.md line 21 at r4 (raw file):

- `DiGraph`: at most one edge allowed between every ordered pair of nodes, edges are directed (i.e., have a source node and a destination node)
- `MultiDiGraph`: arbitrary numbers of directed edges allowed between every pair of nodes.
- `DataFlowGraph`: similar to `MultiDiGraph`, but the edges entering, exiting a given nodes now have a well-defined order.

DataflowGraph (note capitalization) also has some additional restrictions: first of all all DataflowGraphs are DAGs, and they have an explicit concept of inputs and outputs from a node, with the restriction that exactly one edge (no more and no less) can be going into an individual input. Conceptually DataflowGraph is intended to represent computation-style graphs, where edges represent value uses and nodes represent functions from tuples of inputs to tuples of outputs.

lib/utils/include/utils/graph/README.md line 126 at r4 (raw file):

This graph class is particularly useful for processing a sub-graph of a given graph while still maintaining information regarding the edges that cross the cut.

### Labelled Dataflow Variant

The main (and possibly only now?) labelled variant is LabelledDataflowGraph, which has labels for nodes and for node outputs, similar to the old OutputLabelledMultiDiGraph, and LabelledOpenDataflowGraph has labels for nodes and for OpenDataflowValues, which are a a variant of node outputs and open graph inputs. It might be worth addressing these a bit more individually, and making it clear what in them is labelled (the concept and consequences of outputs being labelled rather than edges being labelled is a bit unintuitive, at least if you're approaching it from a more pure graph theory perspective)

lib/utils/include/utils/graph/undirected/undirected_edge.h line 7 at r4 (raw file):

namespace FlexFlow {

struct UndirectedEdge {

FYI this can be moved over to use dtgen and commutative_pair

lib/utils/test/src/utils/containers.cc line 13 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("sum") {

Ideally pull these out into their own files in utils/containers/. The goal is to eventually remove containers.h be either moving functions into their own files in utils/containers.h, or by deleting functions that aren't particularly useful

lib/utils/test/src/utils/containers.cc line 15 at r4 (raw file):

  TEST_CASE("sum") {
    std::vector<int> v = {1, 2, 3, 4, 5};
    CHECK(sum(v) == 15);

Check case of an empty list

lib/utils/test/src/utils/containers.cc line 18 at r4 (raw file):

  }

  TEST_CASE("sum_where") {

Check case of an empty list and/or a predicate that fails on every element

lib/utils/test/src/utils/containers.cc line 24 at r4 (raw file):

  }

  TEST_CASE("product_where") {

Check case of an empty list and/or a predicate that fails on every element

lib/utils/test/src/utils/containers.cc line 30 at r4 (raw file):

  }

  TEST_CASE("contains_l and contains_r") {

These should be moved into their own files in utils/bidict/algorithms

lib/utils/test/src/utils/containers.cc line 38 at r4 (raw file):

    CHECK(contains_l(bd, 3) == false);
    CHECK(contains_r(bd, std::string("one")) == true);
    CHECK(contains_r(bd, std::string("three")) == false);

Add a testcase where L and R are the same type to ensure that nothing gets confused (i.e., check that for bd = {{1, 2}, {3, 4}} that contains_l(bd, 2) is true and contains_r(bd, 2) is false

lib/utils/test/src/utils/containers.cc line 48 at r4 (raw file):

    CHECK(is_submap(m1, m2) == true);
    CHECK(is_submap(m1, m3) == false);

Check additional edge cases here--what if the keys are a subset but not the values? What if the values are a subset but not the keys?

lib/utils/test/src/utils/containers.cc line 51 at r4 (raw file):

  }

  TEST_CASE("index_of") {

Check edge case where there are duplicate values (i.e., index_of({1, 2, 3, 4, 3, 5}, 3)

lib/utils/test/src/utils/containers.cc line 57 at r4 (raw file):

  }

  TEST_CASE("merge_maps") {

Split into separate merge_maps and merge_bidicts functions

lib/utils/test/src/utils/containers.cc line 66 at r4 (raw file):

                                                       {"four", 4}};
      bidict<int, std::string> lhs{fwd_map1, bwd_map1};
      bidict<int, std::string> rhs{fwd_map2, bwd_map2};

Why construct like this, why not just use equate or take in a list of pairs or something? Constructing fwd_map and bwd_map individually seems really error prone (and honestly the API probably shouldn't even allow this)

Code quote:

      std::unordered_map<int, std::string> fwd_map1 = {{1, "one"}, {2, "two"}};
      std::unordered_map<std::string, int> bwd_map1 = {{"one", 1}, {"two", 2}};
      std::unordered_map<int, std::string> fwd_map2 = {{3, "three"},
                                                       {4, "four"}};
      std::unordered_map<std::string, int> bwd_map2 = {{"three", 3},
                                                       {"four", 4}};
      bidict<int, std::string> lhs{fwd_map1, bwd_map1};
      bidict<int, std::string> rhs{fwd_map2, bwd_map2};

lib/utils/test/src/utils/containers.cc line 78 at r4 (raw file):

      std::unordered_map<int, std::string> rhs = {{3, "three"}, {4, "four"}};
      std::unordered_map<int, std::string> result = merge_maps(lhs, rhs);
      std::unordered_map<int, std::string> expected = {

Add a check that merge_maps throws an exception if the maps aren't disjoint in the key set

lib/utils/test/src/utils/containers.cc line 78 at r4 (raw file):

      std::unordered_map<int, std::string> rhs = {{3, "three"}, {4, "four"}};
      std::unordered_map<int, std::string> result = merge_maps(lhs, rhs);
      std::unordered_map<int, std::string> expected = {

Add a testcase where the value sets aren't disjoint

lib/utils/test/src/utils/containers.cc line 87 at r4 (raw file):

    auto f = lookup_in(m);
    CHECK(f(1) == "one");
    CHECK(f(2) == "two");

Check that the correct exception type is thrown if an invalid key is used

lib/utils/test/src/utils/containers.cc line 90 at r4 (raw file):

  }

  TEST_CASE("lookup_in_l") {

Move to its own file in utils/bidict/algorithms

lib/utils/test/src/utils/containers.cc line 99 at r4 (raw file):

  }

  TEST_CASE("lookup_in_r") {

Move to its own file in utils/bidict/algorithms

lib/utils/test/src/utils/containers.cc line 114 at r4 (raw file):

    CHECK(is_superseteq_of(s1, s2) == true);
    CHECK(is_superseteq_of(s1, s3) == false);

Check that ever set is a superseteq of itself (i.e., check that this function is reflexive)

lib/utils/test/src/utils/containers.cc line 117 at r4 (raw file):

  }

  TEST_CASE("restrict_keys") {

Isn't there already a test file for this (i.e., restrict_keys.cc)?

lib/utils/test/src/utils/containers.cc line 127 at r4 (raw file):

  TEST_CASE("optional_all_of") {
    std::vector<int> v = {2, 4, 6, 8};

What if the list is empty?

lib/utils/test/src/utils/containers.cc line 128 at r4 (raw file):

  TEST_CASE("optional_all_of") {
    std::vector<int> v = {2, 4, 6, 8};
    auto f = [](int x) -> std::optional<bool> { return x % 2 == 0; };

Separate subcases

lib/utils/test/src/utils/containers.cc line 140 at r4 (raw file):

  }

  TEST_CASE("are_all_same") {

What if the list is empty?

lib/utils/test/src/utils/containers.cc line 161 at r4 (raw file):

    std::vector<int> v = {2, 3, 4, 5};
    auto result = flatmap<int, decltype(get_factors), int>(v, get_factors);

Fix the type/template signature of flatmap so that the template arguments can be deduced (i.e., deduce Out from In and F using std::invoke_result_t)

lib/utils/test/src/utils/containers.cc line 173 at r4 (raw file):

  }

  TEST_CASE("maximum") {

What's the behavior if the list is empty?

lib/utils/test/src/utils/hash-utils.cc line 93 at r4 (raw file):

      CHECK(hash1 != hash2);

      tuple1 = tuple2;

For consistency with the other testcases

Suggestion:

      gte<0>(tuple1) = 2;

lib/utils/test/src/utils/join_strings.cc line 12 at r4 (raw file):

    std::vector<std::string> const v = {"Hello", "world", "!"};

    SUBCASE("iterator") {

Ideally also add a testcase for the join_strings(InputIt, InputIt, std::string, F) overload

lib/utils/test/src/utils/containers/are_disjoint.cc line 13 at r4 (raw file):

    CHECK(are_disjoint(l, r));
    r.insert(3);
    CHECK_FALSE(are_disjoint(l, r));

Add a test case where one or both sets are empty

lib/utils/test/src/utils/containers/as_vector.cc line 12 at r4 (raw file):

    std::unordered_set<int> s = {1, 2, 3};
    std::vector<int> result = as_vector(s);
    CHECK(result == std::vector<int>({3, 2, 1}));

Ideally don't rely on the ordering of std::unordered_set

lib/utils/test/src/utils/containers/enumerate.cc line 12 at r4 (raw file):

  TEST_CASE("enumerate") {
    std::unordered_set<int> input_set = {1, 2, 3, 4, 5};
    std::unordered_map<size_t, int> result = enumerate(input_set);

Doesn't this return a bidict?

lib/utils/test/src/utils/containers/enumerate.cc line 13 at r4 (raw file):

    std::unordered_set<int> input_set = {1, 2, 3, 4, 5};
    std::unordered_map<size_t, int> result = enumerate(input_set);
    std::unordered_map<size_t, int> expected = {

Suggestion:

    std::unordered_map<size_t, int> correct = {

lib/utils/test/src/utils/containers/enumerate.cc line 14 at r4 (raw file):

    std::unordered_map<size_t, int> result = enumerate(input_set);
    std::unordered_map<size_t, int> expected = {
        {1, 4}, {2, 3}, {3, 2}, {4, 1}, {0, 5}};

Ideally don't rely on the ordering of an unordered_set--better to enumerate a vector if you want to check precise ordering, or do different checks that aren't order dependent to test the overload for unordered_set

Code quote:

        {1, 4}, {2, 3}, {3, 2}, {4, 1}, {0, 5}};

lib/utils/test/src/utils/containers/filter_keys.cc line 14 at r4 (raw file):

    auto f = [](int x) { return x % 2 == 1; }; // Filtering function
    std::unordered_map<int, std::string> result = filter_keys(m, f);
    std::unordered_map<int, std::string> expected = {{1, "one"}, {3, "three"}};

Suggestion:

    std::unordered_map<int, std::string> correct = {{1, "one"}, {3, "three"}};

lib/utils/test/src/utils/containers/find.cc line 7 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("find") {

Since find is intended to work over a bunch of data structures (vector, unordered_set, set at the very least) add tests of additional datatypes--just because the code of the function is the same doesn't necessarily mean that the APIs used have the same meaning for each of these datatu[es

Code quote:

  TEST_CASE("find") {

lib/utils/test/src/utils/containers/find.cc line 9 at r4 (raw file):

  TEST_CASE("find") {
    std::vector<int> v = {1, 2, 3, 4, 5};
    CHECK(find(v, 3) != v.cend());

Suggestion:

    CHECK(find(v, 3) == v.find(3));

lib/utils/test/src/utils/containers/find.cc line 10 at r4 (raw file):

    std::vector<int> v = {1, 2, 3, 4, 5};
    CHECK(find(v, 3) != v.cend());
    CHECK(find(v, 6) == v.cend());

Suggestion:

    CHECK(find(v, 6) == v.find(6));

lib/utils/test/src/utils/containers/get_first.cc line 10 at r2 (raw file):

Previously, Marsella8 wrote…

maybe get_any? get first is a bit confusing.

Good point, but I'd probably go with get_one_of or choose_one_of

lib/utils/test/src/utils/containers/keys.cc line 14 at r4 (raw file):

        {1, "one"}, {2, "two"}, {3, "three"}};
    std::unordered_set<int> result = keys(m);
    std::unordered_set<int> expected = {3, 2, 1};

Why flip this unnecessarily?

Suggestion:

    std::unordered_set<int> expected = {1, 2, 3};

lib/utils/test/src/utils/containers/map_keys.cc line 11 at r4 (raw file):

  TEST_CASE("map_keys") {
    std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}};
    auto f = [](int x) { return x * x; }; // Mapping function

Suggestion:

    auto f = [](int x) { return x * x; };

lib/utils/test/src/utils/containers/map_keys.cc line 12 at r4 (raw file):

    std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}};
    auto f = [](int x) { return x * x; }; // Mapping function
    auto result = map_keys(m, f);

Suggestion:

    std::unordered_map<int, std::string> result = map_keys(m, f);

lib/utils/test/src/utils/containers/map_keys.cc line 15 at r4 (raw file):

    CHECK(result.size() == 2);
    CHECK(result[1] == "one");
    CHECK(result[4] == "two");

Suggestion:

    std::unordered_map<int, std::string> correct = {{1, "one"}, {4, "two"}};
    CHECK(result == correct);

lib/utils/test/src/utils/containers/map_values.cc line 10 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("map_values") {
    std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}};

It would be nice to have not all the resulting sizes be the same

Suggestion:

    std::unordered_map<int, std::string> m = {{1, "one"}, {3, "three"}};

lib/utils/test/src/utils/containers/map_values.cc line 11 at r4 (raw file):

  TEST_CASE("map_values") {
    std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}};
    auto f = [](std::string const &s) { return s.size(); }; // Mapping function

Suggestion:

    auto f = [](std::string const &s) { return s.size(); };

lib/utils/test/src/utils/containers/map_values.cc line 13 at r4 (raw file):

    auto f = [](std::string const &s) { return s.size(); }; // Mapping function
    std::unordered_map<int, size_t> result = map_values(m, f);
    std::unordered_map<int, size_t> expected = {{1, 3}, {2, 3}};

Use correct instead of expected for consistency

Suggestion:

    std::unordered_map<int, size_t> correct = {{1, 3}, {2, 3}};

lib/utils/test/src/utils/containers/restrict_keys.cc line 15 at r4 (raw file):

    std::unordered_set<int> mask = {2, 3, 4};
    std::unordered_map<int, std::string> result = restrict_keys(m, mask);
    std::unordered_map<int, std::string> expected = {{2, "two"}, {3, "three"}};

Suggestion:

    std::unordered_map<int, std::string> correct = {{2, "two"}, {3, "three"}};

lib/utils/test/src/utils/containers/reversed.cc line 14 at r4 (raw file):

    // Checking the reversed sequence is as expected
    CHECK(result == expected);

Suggestion:

    std::vector<int> result = reversed(input_vec);
    std::vector<int> correct = {5, 4, 3, 2, 1};
    CHECK(result == correct);

lib/utils/test/src/utils/containers/set_union.cc line 12 at r4 (raw file):

    std::unordered_set<int> s2 = {2, 3, 4};
    std::unordered_set<int> result = set_union(s1, s2);
    std::unordered_set<int> expected = {1, 2, 3, 4};

Suggestion:

    std::unordered_set<int> correct = {1, 2, 3, 4};

lib/utils/test/src/utils/containers/sorted_by.cc line 9 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("Testing sorted_by function") {

Suggestion:

  TEST_CASE("sorted_by") {

lib/utils/test/src/utils/containers/sorted_by.cc line 12 at r4 (raw file):

    std::unordered_set<int> s = {5, 2, 3, 4, 1};
    auto sorted_s = sorted_by(s, [](int a, int b) { return a < b; });
    CHECK(sorted_s == std::vector<int>({1, 2, 3, 4, 5}));

Suggestion:

    SUBCASE("sort increasing") {
      std::unordered_set<int> s = {5, 2, 3, 4, 1};
      std::vector<int> result = sorted_by(s, [](int a, int b) { return a < b; });
      std::vector<int> correct = {1, 2, 3, 4, 5};
      CHECK(result == correct);
    }

lib/utils/test/src/utils/containers/sorted_by.cc line 16 at r4 (raw file):

    std::unordered_set<int> s2 = {-5, -1, -3, -2, -4};
    auto sorted_s2 = sorted_by(s2, [](int a, int b) { return a > b; });
    CHECK(sorted_s2 == std::vector<int>({-1, -2, -3, -4, -5}));

Suggestion:

    SUBCASE("sort decreasing") {
      std::unordered_set<int> input = {-5, -1, -3, -2, -4};
      std::vector<int> result = sorted_by(input, [](int a, int b) { return a > b; });
      std::vector<int> correct = {-1, -2, -3, -4, -5};
      CHECK(result == correct);
    }

lib/utils/test/src/utils/containers/subvec.cc line 8 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("Testing subvec function") {

Suggestion:

  TEST_CASE("subvec") {

lib/utils/test/src/utils/containers/subvec.cc line 15 at r4 (raw file):

    auto subvec_v2 = subvec(v, std::nullopt, std::optional<int>(3));
    CHECK(subvec_v2 == std::vector<int>({1, 2, 3}));

Break up into subcases, use result and correct names, test more edge cases (what if both values are std::nullopt? what if they are both integers, but the first one is less than the other? What if the integers are negative?)

Code quote:

    std::vector<int> v = {1, 2, 3, 4, 5};
    auto subvec_v = subvec(v, std::optional<int>(1), std::optional<int>(4));

    CHECK(subvec_v == std::vector<int>({2, 3, 4}));

    auto subvec_v2 = subvec(v, std::nullopt, std::optional<int>(3));
    CHECK(subvec_v2 == std::vector<int>({1, 2, 3}));

lib/utils/test/src/utils/containers/values.cc line 10 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("values") {

Add a testcase where there are duplicate values

lib/utils/test/src/utils/containers/values.cc line 14 at r4 (raw file):

        {1, "one"}, {2, "two"}, {3, "three"}};
    std::vector<std::string> result = values(m);
    std::vector<std::string> expected = {"three", "two", "one"};

Don't rely on the order of unordered_set

Code quote:

    std::vector<std::string> expected = {"three", "two", "one"};

lib/utils/test/src/utils/containers/values.cc line 14 at r4 (raw file):

        {1, "one"}, {2, "two"}, {3, "three"}};
    std::vector<std::string> result = values(m);
    std::vector<std::string> expected = {"three", "two", "one"};

Use correct instead of expected for consistency

lib/utils/test/src/utils/containers/vector_split.cc line 8 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("Testing vector_split function") {

Check boundary cases (what if idx is 0? what if idx is 4?)

lib/utils/test/src/utils/containers/vector_split.cc line 8 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("Testing vector_split function") {

Suggestion:

  TEST_CASE("vector_split") {

lib/utils/test/src/utils/graph/traversal.cc line 8 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("traversal") {

Split into separate TEST_CASEs for separate functions (i.e., one TEST_CASE for get_unchecked_dfs_ordering, one TEST_CASE for get_bfs_ordering, one TEST_CASE for get_dfs_ordering, each with potentially multiple SUBCASEs)

Code quote:

  TEST_CASE("traversal") {

lib/utils/test/src/utils/graph/traversal.cc line 34 at r4 (raw file):

      CHECK(get_dfs_ordering(g, {n[0]}) ==
            std::vector<Node>{n[0], n[1], n[2], n[3]});

Use result and correct to improve readability

Code quote:

      CHECK(get_dfs_ordering(g, {n[0]}) ==
            std::vector<Node>{n[0], n[1], n[2], n[3]});

lib/utils/test/src/utils/graph/traversal.cc line 37 at r4 (raw file):

    }
    SUBCASE("nonlinear") {
      g.add_edge(DirectedEdge{n[1], n[3]});

No check in this SUBCASE?

lib/utils/test/src/utils/graph/digraph/algorithms.cc line 11 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("DiGraph - algorithms.cc") {

Break into separate TEST_CASEs (one for each function) and add tests for some more error cases (graphs with multiple components, cycles, etc. Try to keep them small, you won't need huge graphs to test these properties as long as you're not trying to test them all at once)

lib/utils/test/src/utils/graph/digraph/algorithms.cc line 19 at r4 (raw file):

        DirectedEdge{n[0], n[1]},
        DirectedEdge{n[0], n[2]},
        DirectedEdge{n[1], n[2]},

Sort for readability

Suggestion:

        DirectedEdge{n[0], n[1]},
        DirectedEdge{n[0], n[2]},
        DirectedEdge{n[0], n[3]},
        DirectedEdge{n[1], n[2]},

lib/utils/test/src/utils/graph/digraph/algorithms.cc line 24 at r4 (raw file):

    SUBCASE("get_edges") {
      std::unordered_set<DirectedEdge> expected_edges = unordered_set_of(e);

Prefer correct over expected for consistency

Code quote:

expected_edges

lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc line 17 at r4 (raw file):

    add_edges(g,
              {DirectedEdge{n[0], n[1]},

Prefer .at for bounds checking

lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc line 35 at r4 (raw file):

    SUBCASE("matches_edge") {
      DirectedEdgeQuery q{{n[0]}, {n[1]}};

Suggestion:

      DirectedEdgeQuery q = DirectedEdgeQuery{
        query_set{n[0]}, 
        query_set{n[1]}
      };

lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc line 41 at r4 (raw file):

    }

    SUBCASE("query_intersection") {

Should also test query_intersection when some of the query_sets are std::nullopt (i.e., they match everything). Then again this function isn't hugely important, so up to you

lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc line 42 at r4 (raw file):

    SUBCASE("query_intersection") {
      DirectedEdgeQuery q1{{n[0], n[1]}, {n[1], n[2], n[4]}};

More verbose, but probably more readable than trying to parse the nested curly braces

Suggestion:

      DirectedEdgeQuery q1 = DirectedQuery{
        query_set{n[0], n[1]}, 
        query_set{n[1], n[2], n[4]},
      };

lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc line 51 at r4 (raw file):

      CHECK(result.srcs == expected_srcs);
      CHECK(result.dsts == expected_dsts);

Suggestion:

      DirectedEdgeQuery result = query_intersection(q1, q2);
      DirectedEdgeQuery correct = DirectedEdgeQuery{
        query_set{n.at(1)},
        query_set{n.at(2)},
      };

      CHECK(result == correct);

lib/utils/test/src/utils/graph/digraph/get_connected_components.cc line 20 at r4 (raw file):

    };

    CHECK(get_connected_components(g) == expected_components);

Suggestion:

    UndirectedGraph g = UndirectedGraph::create<HashmapUndirectedGraph>();
    std::vector<Node> n = add_nodes(g, 4);

    std::vector<UndirectedEdge> edges = {{n[0], n[1]}, {n[2], n[1]}};
    add_edges(g, edges);

    std::unordered_set<std::unordered_set<Node>> result = get_connected_components(g);
    std::unordered_set<std::unordered_set<Node>> correct = {
        {n[0], n[1], n[2]},
        {n[3]},
    };

    CHECK(result == correct););

lib/utils/test/src/utils/graph/digraph/get_topological_ordering.cc line 13 at r4 (raw file):

    DiGraph g = DiGraph::create<AdjacencyDiGraph>();
    std::vector<Node> n = add_nodes(g, 6);
    std::vector<DirectedEdge> edges = {DirectedEdge{n[0], n[1]},

Prefer .at for bounds checking

lib/utils/test/src/utils/graph/digraph/get_topological_ordering.cc line 24 at r4 (raw file):

      CHECK(index_of(ordering, n[l]).has_value());
      CHECK(index_of(ordering, n[r]).has_value());
      CHECK(index_of(ordering, n[l]) < index_of(ordering, n[r]));

Comparing the std::optionals directly feels a bit weird. Also you probably don't need the contains checks if you dereference the values, as dereferencing a std::optional without a value already throws an error

Suggestion:

     CHECK(index_of(ordering, n[l]).value() < index_of(ordering, n[r]).value());

lib/utils/test/src/utils/graph/digraph/get_weakly_connected_components.cc line 9 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {

  TEST_CASE(FF_TEST_SUITE) {

Test case name?

lib/utils/test/src/utils/graph/digraph/get_weakly_connected_components.cc line 9 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {

  TEST_CASE(FF_TEST_SUITE) {

Suggestion:

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE(FF_TEST_SUITE) {

lib/utils/test/src/utils/graph/digraph/get_weakly_connected_components.cc line 22 at r4 (raw file):

    };

    CHECK(get_weakly_connected_components(g) == expected_components);

Suggestion:

    std::vector<DirectedEdge> edges = {DirectedEdge{n[0], n[1]},
                                       DirectedEdge{n[2], n[1]}};
    add_edges(g, edges);
    
    std::unordered_set<std::unordered_set<Node>> result = get_weakly_connected_components(g);
    std::unordered_set<std::unordered_set<Node>> correct = {
        {n[0], n[1], n[2]},
        {n[3]},
    };

    CHECK(result == correct);

lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 10 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("get_dominators") {

Another good test graph (contains cycles):

FlexFlow/lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators_map.cc

Lines 10 to 26 in 5e1f349

    
           // example from 
        
           // https://en.wikipedia.org/w/index.php?title=Dominator_(graph_theory)&oldid=1189814332 
        
           DiGraph g = DiGraph::create<AdjacencyDiGraph>(); 
        
           std::vector<Node> n = add_nodes(g, 6); 
        
           add_edges(g, 
        
                     { 
        
                         DirectedEdge{n.at(0), n.at(1)}, 
        
                         DirectedEdge{n.at(1), n.at(2)}, 
        
                         DirectedEdge{n.at(1), n.at(3)}, 
        
                         DirectedEdge{n.at(1), n.at(5)}, 
        
                         DirectedEdge{n.at(2), n.at(4)}, 
        
                         DirectedEdge{n.at(3), n.at(4)}, 
        
                         DirectedEdge{n.at(4), n.at(1)}, 
        
                     });

lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 24 at r4 (raw file):

    SUBCASE("single node") {
      std::unordered_set<Node> expected_dominators = {n[0], n[2]};
      CHECK(get_dominators(g, n[2]) == expected_dominators);

Prefer consistency where possible, even if it's a bit verbose

Suggestion:

      std::unordered_set<Node> result = get_dominators(g, n[2]);
      std::unordered_set<Node> correct = {n[0], n[2]};
      CHECK(result == correct);

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 23 at r4 (raw file):

    MultiDiEdge e1 = g.add_edge(n2, n0);
    MultiDiEdge e2 = g.add_edge(n0, n2);
    MultiDiEdge e3 = g.add_edge(n1, n2);

Why no duplicate edges? The whole point of MultiDiGraph is that you can do g.add_edge(n0, n1); g.add_edge(n0, n1);, and then you end up with two MultiDiEdges having been added. add_edge should be tested for this.

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 25 at r4 (raw file):

    MultiDiEdge e3 = g.add_edge(n1, n2);

    SUBCASE("add_node") {

Could consider using get_graph_data as introduced in #1471 for some of these? Up to you, this way of doing things is also fine, just checking all of the graph data might be more complete in some ways

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 28 at r4 (raw file):

      Node n3 = g.add_node();
      std::unordered_set<Node> nodes = g.query_nodes(NodeQuery{{n3}});
      CHECK(contains(nodes, n3));

Suggestion:

      std::unordered_set<Node> query_result = g.query_nodes(NodeQuery{{n3}});
      std::unordered_set<Node> correct = {n3};
      CHECK(query_result == correct);

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 35 at r4 (raw file):

      std::unordered_set<MultiDiEdge> edges =
          g.query_edges(MultiDiEdgeQuery({n2}, {n1}));
      CHECK(contains(edges, e4));

Suggestion:

      MultiDiEdge e4 = g.add_edge(n2, n1);
      std::unordered_set<MultiDiEdge> query_result =
          g.query_edges(MultiDiEdgeQuery({n2}, {n1}));
      std::unordered_set<MultiDiEdge> correct = {e4};
      CHECK(query_result == correct);

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 41 at r4 (raw file):

      g.remove_node(n0);
      std::unordered_set<Node> nodes = g.query_nodes(NodeQuery{{n0}});
      CHECK_FALSE(contains(nodes, n0));

Suggestion:

      std::unordered_set<Node> node_query_result = g.query_nodes(NodeQuery{{n0}});
      std::unordered_set<Node> correct_node_query_result = {}; 
      CHECK(node_query_result == correct_node_query_result);

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 47 at r4 (raw file):

      CHECK_FALSE(contains(edges_after_removal, e0));
      CHECK_FALSE(contains(edges_after_removal, e1));
      CHECK_FALSE(contains(edges_after_removal, e2));

This test isn't correct--this query will only return edges from n0 -> n0, so even if n0 wasn't deleted, edges_after_removal would still be empty

Code quote:

      std::unordered_set<MultiDiEdge> edges_after_removal =
          g.query_edges(MultiDiEdgeQuery({n0}, {n0}));
      CHECK_FALSE(contains(edges_after_removal, e0));
      CHECK_FALSE(contains(edges_after_removal, e1));
      CHECK_FALSE(contains(edges_after_removal, e2));

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 54 at r4 (raw file):

      std::unordered_set<MultiDiEdge> edges =
          g.query_edges(MultiDiEdgeQuery({n1}, {n2}));
      CHECK_FALSE(contains(edges, e3));

Suggestion:

      std::unordered_set<MultiDiEdge> query_result =
          g.query_edges(MultiDiEdgeQuery({n1}, {n2}));
      std::unordered_set<MultiDiEdge> correct = {};
      CHECK_FALSE(query_result == correct);

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 59 at r4 (raw file):

    SUBCASE("query_nodes") {
      std::unordered_set<Node> all_nodes =
          g.query_nodes(NodeQuery{{n0, n1, n2}});

Also test matchall, and queries for nodes that don't exist?

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 69 at r4 (raw file):

    SUBCASE("query_edges") {
      std::unordered_set<MultiDiEdge> all_edges =
          g.query_edges(MultiDiEdgeQuery({n0, n1, n2}, {n0, n1, n2}));

Also test matchall, and for nodes that don't exist?

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc line 14 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("MultiDiGraph - get_incoming_edges") {

Suggestion:

  TEST_CASE("get_incoming_edges(MultiDiGraphView, Node)") {

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc line 23 at r4 (raw file):

        {n.at(1), n.at(1)},
        {n.at(0), n.at(0)},
    };

Why not the usual std::vector<DirectedEdge>?

Code quote:

    std::vector<std::pair<Node, Node>> input = {
        {n.at(0), n.at(1)},
        {n.at(0), n.at(1)},
        {n.at(1), n.at(1)},
        {n.at(0), n.at(0)},
    };

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc line 23 at r4 (raw file):

        {n.at(1), n.at(1)},
        {n.at(0), n.at(0)},
    };

Sort for readability

Suggestion:

    std::vector<std::pair<Node, Node>> input = {
        {n.at(0), n.at(0)},
        {n.at(0), n.at(1)},
        {n.at(0), n.at(1)},
        {n.at(1), n.at(1)},
    };

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc line 29 at r4 (raw file):

    CHECK(get_incoming_edges(g, n[1]) ==
          std::unordered_set<MultiDiEdge>{edges[0], edges[1], edges[2]});
    CHECK(get_incoming_edges(g, n[2]) == std::unordered_set<MultiDiEdge>{});

Suggestion:

    SUBCASE("node has incoming edges") {
      std::unordered_set<MultiDiEdge> result = get_incoming_edges(g, n.at(1));
      std::unordered_set<MultiDiEdge> correct = {edges.at(0), edges.at(1), edges.at(2)};
      CHECK(result == correct);
    }
  
    SUBCASE("node has no incoming edges") {
      std::unordered_set<MultiDiEdge> result = get_incoming_edges(g, n.at(2));
      std::unordered_set<MultiDiEdge> correct = {};
      CHECK(result == correct);
    }

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc line 14 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("MultiDiGraph - get_outgoing_edges") {

Suggestion:

  TEST_CASE("get_outgoing_edges(MultiDiGraphView, Node)") {

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc line 22 at r4 (raw file):

        {n.at(0), n.at(1)},
        {n.at(1), n.at(1)},
        {n.at(0), n.at(0)},

Sort for readability

Suggestion:

        {n.at(0), n.at(0)},
        {n.at(0), n.at(1)},
        {n.at(0), n.at(1)},
        {n.at(1), n.at(1)},

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc line 29 at r4 (raw file):

    CHECK(get_outgoing_edges(g, n[0]) ==
          std::unordered_set<MultiDiEdge>{edges[0], edges[1], edges[3]});
    CHECK(get_outgoing_edges(g, n[2]) == std::unordered_set<MultiDiEdge>{});

SUBCASEs as in `get_incoming_edges

Code quote:

    CHECK(get_outgoing_edges(g, n[0]) ==
          std::unordered_set<MultiDiEdge>{edges[0], edges[1], edges[3]});
    CHECK(get_outgoing_edges(g, n[2]) == std::unordered_set<MultiDiEdge>{});

lib/utils/test/src/utils/graph/views/views.cc line 48 at r1 (raw file):

Previously, Marsella8 wrote…

possible bug? currently the get_edges also returns the edges that "cross the cut" (e.g. (1, 2)), though this does not happen for example with DiSubgraphView.

Good catch, yeah that seems like a bug

lib/utils/test/src/utils/graph/views/views.cc line 88 at r1 (raw file):

Previously, Marsella8 wrote…

commented because not yet implemented

As in the class isn't implemented or the testcase isn't implemented?

lib/utils/test/src/utils/graph/views/views.cc line 52 at r4 (raw file):

  }

  TEST_CASE("DiSubgraphView") {

Probably better to just test the corresponding get_subgraph method instead rather than testing the views directly (and same for the other views tested here)--the view classes themselves are probably better thought of as internal implementation details. Also, get_graph_data as introduced in #1471 should simply these tests.

It's not quite perfect because it doesn't test the full power of the query interface (both this and get_graph_data just query all, so concrete queries don't get tested), but I don't have a good solution for this yet so what's here is totally fine for now

lib/utils/test/src/utils/random_utils.cc line 5 at r4 (raw file):

#include <algorithm>

void checkProbabilities(std::vector<int> const &counts,

Might be better as a lambda since it's only used in one testcase, but not a big deal either way

lib/utils/test/src/utils/random_utils.cc line 6 at r4 (raw file):

void checkProbabilities(std::vector<int> const &counts,
                        int numIterations,

Why pass numIterations instead of just computing sum(values(counts))?

lib/utils/test/src/utils/random_utils.cc line 8 at r4 (raw file):

                        int numIterations,
                        std::vector<float> const &weights,
                        float totalWeight) {

Why not just use sum(weights) instead of passing totalWeight explicitly?

Code quote:

float totalWeight

lib/utils/test/src/utils/random_utils.cc line 12 at r4 (raw file):

    float expectedProbability = weights[i] / totalWeight;
    float observedProbability = static_cast<float>(counts[i]) / numIterations;
    CHECK(observedProbability ==

Prefer a single equality check of the whole map instead of a for loop containing an equality check of each key/value

lib/utils/test/src/utils/random_utils.cc line 18 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("select_random") {

Suggestion:

TEST_CASE("select_random(std::vector<T>)") {

lib/utils/test/src/utils/random_utils.cc line 24 at r4 (raw file):

      int result = select_random(values);

      CHECK(std::find(values.begin(), values.end(), result) != values.end());

Feel free to use contains here

lib/utils/test/src/utils/random_utils.cc line 29 at r4 (raw file):

    SUBCASE("Invalid arguments") {
      std::vector<float> weights = {0.1f, 0.3f, 0.2f};
      CHECK(select_random(values, weights) == 2);

What's going on here? Why should we return 2 if the arguments are invalid? Also, why is this testcase not under the weighted section below?

lib/utils/test/src/utils/random_utils.cc line 38 at r4 (raw file):

      std::vector<float> weights = {1.0f, 1.0f, 1.0f, 1.0f, 1.0f};

      std::vector<int> counts(values.size(), 0);

Not sure if this code is exactly right, but something like it

Suggestion:

      std::unordered_map<int, int> counts = unordered_map{zip(values, repeat(0)};

lib/utils/test/src/utils/random_utils.cc line 42 at r4 (raw file):

      for (int i = 0; i < numIterations; i++) {
        int selected = select_random(values, weights);
        counts[selected - 1]++;

Suggestion:

        counts.at(selected)++;

lib/utils/test/src/utils/random_utils.cc line 57 at r4 (raw file):

        int selected = select_random(values, weights);
        counts[selected - 1]++;
      }

Pull out into a separate sample lambda?

Code quote:

      std::vector<int> counts(values.size(), 0);
      int const numIterations = 10000;
      for (int i = 0; i < numIterations; i++) {
        int selected = select_random(values, weights);
        counts[selected - 1]++;
      }

lib/utils/test/src/utils/stack_string.cc line 85 at r4 (raw file):

  }

  TEST_CASE("Arbitrary<stack_string>") {

Why was this removed?

Marsella8

Reviewable status: all files reviewed, 100 unresolved discussions (waiting on @lockshaw)

lib/utils/include/utils/graph/README.md line 21 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

DataflowGraph (note capitalization) also has some additional restrictions: first of all all DataflowGraphs are DAGs, and they have an explicit concept of inputs and outputs from a node, with the restriction that exactly one edge (no more and no less) can be going into an individual input. Conceptually DataflowGraph is intended to represent computation-style graphs, where edges represent value uses and nodes represent functions from tuples of inputs to tuples of outputs.

Done.

lib/utils/include/utils/graph/README.md line 126 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

The main (and possibly only now?) labelled variant is LabelledDataflowGraph, which has labels for nodes and for node outputs, similar to the old OutputLabelledMultiDiGraph, and LabelledOpenDataflowGraph has labels for nodes and for OpenDataflowValues, which are a a variant of node outputs and open graph inputs. It might be worth addressing these a bit more individually, and making it clear what in them is labelled (the concept and consequences of outputs being labelled rather than edges being labelled is a bit unintuitive, at least if you're approaching it from a more pure graph theory perspective)

Done.

lib/utils/include/utils/graph/README.md line at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Thanks for going out of your way to update the documentation! This is very much appreciated 🙂

np!

lib/utils/include/utils/graph/undirected/undirected_edge.h line 7 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

FYI this can be moved over to use dtgen and commutative_pair

done (if you see rc adjacent changes it's because of this)

lib/utils/test/src/utils/containers.cc line 13 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Ideally pull these out into their own files in utils/containers/. The goal is to eventually remove containers.h be either moving functions into their own files in utils/containers.h, or by deleting functions that aren't particularly useful

containers.cc is no more (deleted a few functions which were very specific and not being used anywhere within the codebase, everything else has been moved into their own modules).

lib/utils/test/src/utils/containers.cc line 15 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Check case of an empty list

Done.

lib/utils/test/src/utils/containers.cc line 18 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Check case of an empty list and/or a predicate that fails on every element

Done.

lib/utils/test/src/utils/containers.cc line 24 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Check case of an empty list and/or a predicate that fails on every element

Done.

lib/utils/test/src/utils/containers.cc line 30 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

These should be moved into their own files in utils/bidict/algorithms

already in bidict as a member function

lib/utils/test/src/utils/containers.cc line 38 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Add a testcase where L and R are the same type to ensure that nothing gets confused (i.e., check that for bd = {{1, 2}, {3, 4}} that contains_l(bd, 2) is true and contains_r(bd, 2) is false

Done.

lib/utils/test/src/utils/containers.cc line 48 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Check additional edge cases here--what if the keys are a subset but not the values? What if the values are a subset but not the keys?

Done.

lib/utils/test/src/utils/containers.cc line 51 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Check edge case where there are duplicate values (i.e., index_of({1, 2, 3, 4, 3, 5}, 3)

Done.

lib/utils/test/src/utils/containers.cc line 57 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Split into separate merge_maps and merge_bidicts functions

Done.

lib/utils/test/src/utils/containers.cc line 66 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why construct like this, why not just use equate or take in a list of pairs or something? Constructing fwd_map and bwd_map individually seems really error prone (and honestly the API probably shouldn't even allow this)

Done.

lib/utils/test/src/utils/containers.cc line 78 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Add a check that merge_maps throws an exception if the maps aren't disjoint in the key set

Done.

lib/utils/test/src/utils/containers.cc line 78 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Add a testcase where the value sets aren't disjoint

Done.

lib/utils/test/src/utils/containers.cc line 87 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Check that the correct exception type is thrown if an invalid key is used

removed the function: seems overly specific (can be done with other functions), and is not used anywhere in the codebase

lib/utils/test/src/utils/containers.cc line 90 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Move to its own file in utils/bidict/algorithms

removed the functions (not used anywhere in the codebase, seemed very specific).

lib/utils/test/src/utils/containers.cc line 99 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Move to its own file in utils/bidict/algorithms

see above.

lib/utils/test/src/utils/containers.cc line 114 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Check that ever set is a superseteq of itself (i.e., check that this function is reflexive)

Done.

lib/utils/test/src/utils/containers.cc line 117 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Isn't there already a test file for this (i.e., restrict_keys.cc)?

Done.

lib/utils/test/src/utils/containers.cc line 127 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What if the list is empty?

Done.

lib/utils/test/src/utils/containers.cc line 128 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Separate subcases

Done.

lib/utils/test/src/utils/containers.cc line 140 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What if the list is empty?

ended up putting a throw for an empty container (i think for other functions it makes sense to vacuously define how it behaves for an empty container (e.g. sum, optional_all_of), but for are_all_same I can't really think of a situation where somebody would want to pass an empty container. If you want to be pedantic, you could argue that you can define is_all_same recursively and the empty container would have to evaluate to true, but I feel like in practice in doesn't make much sense.

lib/utils/test/src/utils/containers.cc line 161 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Fix the type/template signature of flatmap so that the template arguments can be deduced (i.e., deduce Out from In and F using std::invoke_result_t)

Done.

lib/utils/test/src/utils/containers.cc line 173 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What's the behavior if the list is empty?

now throws an exception

lib/utils/test/src/utils/hash-utils.cc line 93 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For consistency with the other testcases

Done.

lib/utils/test/src/utils/join_strings.cc line 12 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Ideally also add a testcase for the join_strings(InputIt, InputIt, std::string, F) overload

Done.

lib/utils/test/src/utils/containers/are_disjoint.cc line 13 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Add a test case where one or both sets are empty

Done.

lib/utils/test/src/utils/containers/as_vector.cc line 12 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Ideally don't rely on the ordering of std::unordered_set

Done.

lib/utils/test/src/utils/containers/enumerate.cc line 12 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Doesn't this return a bidict?

File has been overwritten by recent repo-refactor change

lib/utils/test/src/utils/containers/enumerate.cc line 13 at r4 (raw file):

    std::unordered_set<int> input_set = {1, 2, 3, 4, 5};
    std::unordered_map<size_t, int> result = enumerate(input_set);
    std::unordered_map<size_t, int> expected = {

Done.

lib/utils/test/src/utils/containers/enumerate.cc line 14 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Ideally don't rely on the ordering of an unordered_set--better to enumerate a vector if you want to check precise ordering, or do different checks that aren't order dependent to test the overload for unordered_set

Done.

lib/utils/test/src/utils/containers/filter_keys.cc line 14 at r4 (raw file):

    auto f = [](int x) { return x % 2 == 1; }; // Filtering function
    std::unordered_map<int, std::string> result = filter_keys(m, f);
    std::unordered_map<int, std::string> expected = {{1, "one"}, {3, "three"}};

Done.

lib/utils/test/src/utils/containers/find.cc line 7 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Since find is intended to work over a bunch of data structures (vector, unordered_set, set at the very least) add tests of additional datatypes--just because the code of the function is the same doesn't necessarily mean that the APIs used have the same meaning for each of these datatu[es

Done.

lib/utils/test/src/utils/containers/find.cc line 9 at r4 (raw file):

  TEST_CASE("find") {
    std::vector<int> v = {1, 2, 3, 4, 5};
    CHECK(find(v, 3) != v.cend());

Done.

lib/utils/test/src/utils/containers/find.cc line 10 at r4 (raw file):

    std::vector<int> v = {1, 2, 3, 4, 5};
    CHECK(find(v, 3) != v.cend());
    CHECK(find(v, 6) == v.cend());

Done.

lib/utils/test/src/utils/containers/get_first.cc line 10 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Good point, but I'd probably go with get_one_of or choose_one_of

changed to get_one_of

lib/utils/test/src/utils/containers/keys.cc line 14 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why flip this unnecessarily?

Done.

lib/utils/test/src/utils/containers/map_keys.cc line 11 at r4 (raw file):

  TEST_CASE("map_keys") {
    std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}};
    auto f = [](int x) { return x * x; }; // Mapping function

Done.

lib/utils/test/src/utils/containers/map_keys.cc line 12 at r4 (raw file):

    std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}};
    auto f = [](int x) { return x * x; }; // Mapping function
    auto result = map_keys(m, f);

Done.

lib/utils/test/src/utils/containers/map_keys.cc line 15 at r4 (raw file):

    CHECK(result.size() == 2);
    CHECK(result[1] == "one");
    CHECK(result[4] == "two");

Done.

lib/utils/test/src/utils/containers/map_values.cc line 10 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

It would be nice to have not all the resulting sizes be the same

Done.

lib/utils/test/src/utils/containers/map_values.cc line 11 at r4 (raw file):

  TEST_CASE("map_values") {
    std::unordered_map<int, std::string> m = {{1, "one"}, {2, "two"}};
    auto f = [](std::string const &s) { return s.size(); }; // Mapping function

Done.

lib/utils/test/src/utils/containers/map_values.cc line 13 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Use correct instead of expected for consistency

Done.

lib/utils/test/src/utils/containers/restrict_keys.cc line 15 at r4 (raw file):

    std::unordered_set<int> mask = {2, 3, 4};
    std::unordered_map<int, std::string> result = restrict_keys(m, mask);
    std::unordered_map<int, std::string> expected = {{2, "two"}, {3, "three"}};

Done.

lib/utils/test/src/utils/containers/reversed.cc line 14 at r4 (raw file):

    // Checking the reversed sequence is as expected
    CHECK(result == expected);

Done.

lib/utils/test/src/utils/containers/set_union.cc line 12 at r4 (raw file):

    std::unordered_set<int> s2 = {2, 3, 4};
    std::unordered_set<int> result = set_union(s1, s2);
    std::unordered_set<int> expected = {1, 2, 3, 4};

Done.

lib/utils/test/src/utils/containers/sorted_by.cc line 9 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("Testing sorted_by function") {

Done.

lib/utils/test/src/utils/containers/sorted_by.cc line 12 at r4 (raw file):

    std::unordered_set<int> s = {5, 2, 3, 4, 1};
    auto sorted_s = sorted_by(s, [](int a, int b) { return a < b; });
    CHECK(sorted_s == std::vector<int>({1, 2, 3, 4, 5}));

Done.

lib/utils/test/src/utils/containers/sorted_by.cc line 16 at r4 (raw file):

    std::unordered_set<int> s2 = {-5, -1, -3, -2, -4};
    auto sorted_s2 = sorted_by(s2, [](int a, int b) { return a > b; });
    CHECK(sorted_s2 == std::vector<int>({-1, -2, -3, -4, -5}));

Done.

lib/utils/test/src/utils/containers/subvec.cc line 8 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("Testing subvec function") {

Done.

lib/utils/test/src/utils/containers/subvec.cc line 15 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Break up into subcases, use result and correct names, test more edge cases (what if both values are std::nullopt? what if they are both integers, but the first one is less than the other? What if the integers are negative?)

Done.

lib/utils/test/src/utils/containers/values.cc line 10 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Add a testcase where there are duplicate values

Done.

lib/utils/test/src/utils/containers/values.cc line 14 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't rely on the order of unordered_set

Done.

lib/utils/test/src/utils/containers/values.cc line 14 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Use correct instead of expected for consistency

Done.

lib/utils/test/src/utils/containers/vector_split.cc line 8 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Check boundary cases (what if idx is 0? what if idx is 4?)

Done.

lib/utils/test/src/utils/containers/vector_split.cc line 8 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("Testing vector_split function") {

Done.

lib/utils/test/src/utils/graph/traversal.cc line 8 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Split into separate TEST_CASEs for separate functions (i.e., one TEST_CASE for get_unchecked_dfs_ordering, one TEST_CASE for get_bfs_ordering, one TEST_CASE for get_dfs_ordering, each with potentially multiple SUBCASEs)

Done.

lib/utils/test/src/utils/graph/traversal.cc line 34 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Use result and correct to improve readability

Done.

lib/utils/test/src/utils/graph/traversal.cc line 37 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

No check in this SUBCASE?

Done.

lib/utils/test/src/utils/graph/digraph/algorithms.cc line 11 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Break into separate TEST_CASEs (one for each function) and add tests for some more error cases (graphs with multiple components, cycles, etc. Try to keep them small, you won't need huge graphs to test these properties as long as you're not trying to test them all at once)

Done.

lib/utils/test/src/utils/graph/digraph/algorithms.cc line 19 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Sort for readability

Done.

lib/utils/test/src/utils/graph/digraph/algorithms.cc line 24 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer correct over expected for consistency

Done.

lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc line 17 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer .at for bounds checking

Done.

lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc line 35 at r4 (raw file):

    SUBCASE("matches_edge") {
      DirectedEdgeQuery q{{n[0]}, {n[1]}};

Done.

lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc line 42 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

More verbose, but probably more readable than trying to parse the nested curly braces

Done.

lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc line 51 at r4 (raw file):

      CHECK(result.srcs == expected_srcs);
      CHECK(result.dsts == expected_dsts);

Done.

lib/utils/test/src/utils/graph/digraph/get_connected_components.cc line 20 at r4 (raw file):

    };

    CHECK(get_connected_components(g) == expected_components);

Done.

lib/utils/test/src/utils/graph/digraph/get_topological_ordering.cc line 13 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer .at for bounds checking

Done.

lib/utils/test/src/utils/graph/digraph/get_topological_ordering.cc line 24 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Comparing the std::optionals directly feels a bit weird. Also you probably don't need the contains checks if you dereference the values, as dereferencing a std::optional without a value already throws an error

Done.

lib/utils/test/src/utils/graph/digraph/get_weakly_connected_components.cc line 9 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Test case name?

Done.

lib/utils/test/src/utils/graph/digraph/get_weakly_connected_components.cc line 9 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {

  TEST_CASE(FF_TEST_SUITE) {

Done.

lib/utils/test/src/utils/graph/digraph/get_weakly_connected_components.cc line 22 at r4 (raw file):

    };

    CHECK(get_weakly_connected_components(g) == expected_components);

Done.

lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 24 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer consistency where possible, even if it's a bit verbose

Done.

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 23 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why no duplicate edges? The whole point of MultiDiGraph is that you can do g.add_edge(n0, n1); g.add_edge(n0, n1);, and then you end up with two MultiDiEdges having been added. add_edge should be tested for this.

Done.

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 28 at r4 (raw file):

      Node n3 = g.add_node();
      std::unordered_set<Node> nodes = g.query_nodes(NodeQuery{{n3}});
      CHECK(contains(nodes, n3));

Done.

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 35 at r4 (raw file):

      std::unordered_set<MultiDiEdge> edges =
          g.query_edges(MultiDiEdgeQuery({n2}, {n1}));
      CHECK(contains(edges, e4));

Done.

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 41 at r4 (raw file):

      g.remove_node(n0);
      std::unordered_set<Node> nodes = g.query_nodes(NodeQuery{{n0}});
      CHECK_FALSE(contains(nodes, n0));

Done.

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 47 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

This test isn't correct--this query will only return edges from n0 -> n0, so even if n0 wasn't deleted, edges_after_removal would still be empty

Done.

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 54 at r4 (raw file):

      std::unordered_set<MultiDiEdge> edges =
          g.query_edges(MultiDiEdgeQuery({n1}, {n2}));
      CHECK_FALSE(contains(edges, e3));

Done.

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 59 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Also test matchall, and queries for nodes that don't exist?

Done.

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 69 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Also test matchall, and for nodes that don't exist?

Done.

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc line 14 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("MultiDiGraph - get_incoming_edges") {

Done.

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc line 23 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why not the usual std::vector<DirectedEdge>?

MultiDiEdges cannot be constructed directly, since they are essentially wrappers for an uid (and the graph itself does the bookkeping about which edge has which source and sink).

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc line 23 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Sort for readability

Done.

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc line 29 at r4 (raw file):

    CHECK(get_incoming_edges(g, n[1]) ==
          std::unordered_set<MultiDiEdge>{edges[0], edges[1], edges[2]});
    CHECK(get_incoming_edges(g, n[2]) == std::unordered_set<MultiDiEdge>{});

Done.

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc line 14 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("MultiDiGraph - get_outgoing_edges") {

Done.

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc line 22 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Sort for readability

Done.

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc line 29 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

SUBCASEs as in `get_incoming_edges

Done.

lib/utils/test/src/utils/graph/views/views.cc line 48 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Good catch, yeah that seems like a bug

Fixed it, issue had to do with the query_edges function which was including all open edges coming out of the subgraph

lib/utils/test/src/utils/graph/views/views.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

As in the class isn't implemented or the testcase isn't implemented?

std::unordered_set<Node> JoinedNodeView::query_nodes(NodeQuery const &query) in views.cc has not yet been implemented.

lib/utils/test/src/utils/graph/views/views.cc line 52 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Probably better to just test the corresponding get_subgraph method instead rather than testing the views directly (and same for the other views tested here)--the view classes themselves are probably better thought of as internal implementation details. Also, get_graph_data as introduced in #1471 should simply these tests.

It's not quite perfect because it doesn't test the full power of the query interface (both this and get_graph_data just query all, so concrete queries don't get tested), but I don't have a good solution for this yet so what's here is totally fine for now

get_graph_data I think is only implemented for DataflowGraphs (I can add it for the other classes as well if needed).

lib/utils/test/src/utils/random_utils.cc line 6 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why pass numIterations instead of just computing sum(values(counts))?

Done.

lib/utils/test/src/utils/random_utils.cc line 8 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why not just use sum(weights) instead of passing totalWeight explicitly?

Done.

lib/utils/test/src/utils/random_utils.cc line 12 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer a single equality check of the whole map instead of a for loop containing an equality check of each key/value

i think in this case it ends up being a bit clunky to compare the 2 vectors (since we are testing for the values being approximately equal), it would be something like in the code snippet.
In general, why is it preferrable to have a single check rather than multiple ones? To have a more defined point of where the check is being done / more awareness of the entire container rather than just a single value?

Code snippet:

CHECK(all_of(zip(observed_probabilities, expected_probabilities), ...)

lib/utils/test/src/utils/random_utils.cc line 18 at r4 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("select_random") {

Done.

lib/utils/test/src/utils/random_utils.cc line 24 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Feel free to use contains here

Done.

lib/utils/test/src/utils/random_utils.cc line 29 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What's going on here? Why should we return 2 if the arguments are invalid? Also, why is this testcase not under the weighted section below?

Test case was very odd and outdated, remodernized it, lmk what you think.

lib/utils/test/src/utils/random_utils.cc line 38 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Not sure if this code is exactly right, but something like it

Done.

lib/utils/test/src/utils/random_utils.cc line 42 at r4 (raw file):

      for (int i = 0; i < numIterations; i++) {
        int selected = select_random(values, weights);
        counts[selected - 1]++;

Done.

lib/utils/test/src/utils/random_utils.cc line 57 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Pull out into a separate sample lambda?

See above.

lib/utils/test/src/utils/stack_string.cc line 85 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why was this removed?

re-added

Marsella8

Reviewable status: 20 of 131 files reviewed, 105 unresolved discussions (waiting on @lockshaw)

lib/utils/include/utils/graph/undirected/undirected_edge.struct.toml line 1 at r6 (raw file):

namespace = "FlexFlow"

Changed undirected edge to use commutative pair and dtgen, if you see some rapidcheck adjacent changes it's because of that.

lib/utils/src/utils/graph/instances/hashmap_undirected_graph.cc line 41 at r5 (raw file):

}

void HashmapUndirectedGraph::remove_edge(UndirectedEdge const &e) {

BUG? Intuitively this should be symmetric, something like:

Code snippet:

  std::unordered_set<Node> &max_map = this->adjacency.at(e.endpoints.max());
  max_map.erase(e.endpoints.min());
  std::unordered_set<Node> &min_map = this->adjacency.at(e.endpoints.max());
  min_map.erase(e.endpoints.max());

lib/utils/src/utils/graph/instances/hashmap_undirected_graph.cc line 53 at r6 (raw file):

      keys(query_keys(query.nodes, this->adjacency));
  for (auto const &src_kv : query_keys(query.nodes, this->adjacency)) {
    for (auto const &dst : src_kv.second) {

Fix for the bug in where getting the subgraph view of an undirected graph would result in the edges crossing the cut from inside subgraph to outside subgraph being included in the subgraph.

lib/utils/src/utils/graph/views/views.cc line 263 at r6 (raw file):

    DirectedEdgeQuery const &q) const {
  std::unordered_set<UndirectedEdge> undirected_edges =
      set_union(g.query_edges(UndirectedEdgeQuery{q.srcs}),

I think this should be set_union instead of intersection right? After changing the query_edges in HashmapUndirectedGraph another test using as_digraph was failing, I'd assume it's due to this.

lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc line 67 at r5 (raw file):

        CHECK(matches_edge(result, DirectedEdge{n.at(1), n.at(3)}));
        CHECK(matches_edge(result, DirectedEdge{n.at(2), n.at(4)}));
        // CHECK_FALSE(matches_edge(result, DirectedEdge{n.at(2), n.at(3)}));

Do they match all edges, even if

Marsella8

Reviewable status: 20 of 131 files reviewed, 105 unresolved discussions (waiting on @lockshaw)

lib/utils/src/utils/graph/instances/hashmap_undirected_graph.cc line 53 at r6 (raw file):

Previously, Marsella8 wrote…

Fix for the bug in where getting the subgraph view of an undirected graph would result in the edges crossing the cut from inside subgraph to outside subgraph being included in the subgraph.

(unsure of correctness, but should make sense)

lib/utils/src/utils/graph/views/views.cc line 263 at r6 (raw file):

Previously, Marsella8 wrote…

I think this should be set_union instead of intersection right? After changing the query_edges in HashmapUndirectedGraph another test using as_digraph was failing, I'd assume it's due to this.

(very unsure about this)

lockshaw

Reviewed 107 of 111 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: 128 of 131 files reviewed, 87 unresolved discussions (waiting on @Marsella8)

lib/substitutions/src/substitutions/unlabelled/multidigraph_pattern_match.cc line 3 at r6 (raw file):

#include "substitutions/unlabelled/multidigraph_pattern_match.h"
// #include "substitutions/unlabelled/edge_splits.h"
// #include "substitutions/unlabelled/pattern_edge.h"

Delete (yes I know you didn't add these, but may as well remove them now)

Code quote:

// #include "substitutions/unlabelled/edge_splits.h"
// #include "substitutions/unlabelled/pattern_edge.h"

lib/utils/include/utils/join_strings.h line 27 at r6 (raw file):

  }
  return oss.str();
}

Suggestion:

template <typename InputIt, typename F>
std::string join_strings(InputIt first,
                         InputIt last,
                         std::string const &delimiter,
                         F const &f) {
  std::ostringstream oss;
  bool first_iter = true;

  for (; first != last; first++) {
    if (!first_iter) {
      oss << delimiter;
    }
    oss << f(*first);
    first_iter = false;
  }
  return oss.str();
}

lib/utils/include/utils/bidict/algorithms/merge_bidicts.h line 16 at r6 (raw file):

  if (!are_disjoint(left_entries(lhs), left_entries(rhs))) {
    throw mk_runtime_error(
        "Left entries of merge_bidicts parameters are non-disjoint");

Including debugging info in exception messages where possible is nice

Suggestion:

  if (!are_disjoint(left_entries(lhs), left_entries(rhs))) {
    throw mk_runtime_error(
        fmt::format("Left entries of merge_bidicts parameters are non-disjoint: lhs = {}, rhs = {}", lhs, rhs));

lib/utils/include/utils/bidict/algorithms/merge_bidicts.h line 20 at r6 (raw file):

  if (!are_disjoint(right_entries(lhs), right_entries(rhs))) {
    throw mk_runtime_error(
        "Right entries of merge_bidicts parameters are non-disjoint");

Suggestion:

    throw mk_runtime_error(
        fmt::format("Right entries of merge_bidicts parameters are non-disjoint: lhs = {}, rhs = {}", lhs, rhs));

lib/utils/include/utils/containers/are_all_same.h line 10 at r6 (raw file):

template <typename C>
bool are_all_same(C const &c) {
  if (c.size() == 0) {

Suggestion:

  if (c.empty()) {

lib/utils/include/utils/containers/are_all_same.h line 11 at r6 (raw file):

bool are_all_same(C const &c) {
  if (c.size() == 0) {
    throw mk_runtime_error("input to are_all_same must be non-empty container");

Suggestion:

   throw mk_runtime_error(fmt::format("are_all_same expected non-empty container but receieved {}", c));

lib/utils/include/utils/containers/index_of.h line 6 at r6 (raw file):

#include <algorithm>
#include <optional>
namespace FlexFlow {

Suggestion:

#include <algorithm>
#include <optional>

namespace FlexFlow {

lib/utils/include/utils/containers/index_of.h line 10 at r6 (raw file):

/**
 * @details If multiple `e` are present within the container, the function
 *returns the index of the first appearance

Suggestion:

 * @details If multiple `e` are present within the container, the function
 * returns the index of the first appearance

lib/utils/include/utils/containers/is_submap.h line 14 at r6 (raw file):

               std::unordered_map<K, V> const &sub) {
  return restrict_keys(m, keys(sub)) == sub;
}

For consistency with is_subseteq_of

Suggestion:

bool is_submapeq_of(
               std::unordered_map<K, V> const &sub,
               std::unordered_map<K, V> const &m) {
  return restrict_keys(m, keys(sub)) == sub;
}

lib/utils/include/utils/containers/map_keys.h line 13 at r6 (raw file):

 * returns the updated map.
 *
 * @details Note that if multiple original keys are transformed to the same new

Is this a good behavior or should this throw an error? I'm not sure this is a great property to have, and I don't think we're relying on it anywhere?

lib/utils/include/utils/containers/map_keys.h line 14 at r6 (raw file):

 *
 * @details Note that if multiple original keys are transformed to the same new
 * key, which value will be associated to the new key is undefined.

Suggestion:

 * @note If multiple original keys are transformed to the same new
 * key, which value will be associated to the new key is undefined.

lib/utils/include/utils/containers/maximum.h line 14 at r6 (raw file):

    throw mk_runtime_error(
        "input to function maximum must be non-empty container");
  }

Suggestion:

  if (c.empty()) {
    throw mk_runtime_error(
        fmt::format("maximum expected non-empty container but received {}", c));
  }

lib/utils/include/utils/containers/merge_maps.h line 18 at r6 (raw file):

    throw mk_runtime_error(
        "Key sets of merge_maps parameters are non-disjoint");
  }

Suggestion:

  if (!are_disjoint(keys(lhs), keys(rhs))) {
    throw mk_runtime_error(
        fmt::format("Key sets of merge_maps parameters are non-disjoint: lhs = {}, rhs = {}", lhs, rhs));
  }

lib/utils/include/utils/containers/optional_all_of.h line 9 at r6 (raw file):

template <typename Container, typename Function>
std::optional<bool> optional_all_of(Container const &container,

Are we using this anywhere? It's kinda a weird function and I wouldn't object to removing it if we're not using it

lib/utils/include/utils/containers/product.h line 9 at r6 (raw file):

/**
 * @details An empty container vacuously has product 1

We don't really have a convention set for doxygen yet, but might be good to use @note for documenting edge cases like this? Or idk, maybe we should only use @note for really weird/bug-prone edge cases? Thoughts?

Suggestion:

 * @note An empty container vacuously has product 1

lib/utils/include/utils/containers/product.h line 32 at r6 (raw file):

          typename ConditionF,
          typename Element = typename Container::value_type>
Element product_where(Container const &container, ConditionF const &condition) {

Move to separate utils/containers/product_where.h

lib/utils/include/utils/containers/subvec.h line 19 at r6 (raw file):

      typename std::vector<T>::iterator::difference_type {
        if (idx < 0) {
          return std::max(0, (int)v.size() + idx);

Avoid c-style casts

Suggestion:

static_cast<int>(v.size())

lib/utils/include/utils/containers/subvec.h line 19 at r6 (raw file):

      typename std::vector<T>::iterator::difference_type {
        if (idx < 0) {
          return std::max(0, (int)v.size() + idx);

Why std::max(0, ...)?

lib/utils/include/utils/containers/sum.h line 21 at r6 (raw file):

          typename ConditionF,
          typename Element = typename Container::value_type>
Element sum_where(Container const &container, ConditionF const &condition) {

Move to separate utils/containers/sum_where.h

lib/utils/include/utils/containers/value_all.h line 15 at r6 (raw file):

    return unwrap(element, [] {
      throw mk_runtime_error(
          "Encountered element without value in call to value_all");

Suggestion:

          fmt::format("value_all expected all elements to have values, but received {}", v));

lib/utils/include/utils/containers/value_all.h line 25 at r6 (raw file):

    return unwrap(element, [] {
      throw mk_runtime_error(
          "Encountered element without value in call to value_all");

Suggestion:

          fmt::format("value_all expected all elements to have values, but received {}", v));

lib/utils/include/utils/containers/vector_split.h line 15 at r6 (raw file):

                                                       int idx) {
  if (idx < 0 || idx > static_cast<int>(v.size())) {
    throw std::out_of_range("Index out of range in vector_split");

Suggestion:

    throw std::out_of_range(fmt::format("Index out of range in vector_split: index = {}, vector = {}", idx, v));

lib/utils/include/utils/fmt/optional.h line 67 at r6 (raw file):

};

template <>

FYI once 1490 merges this will have to be moved to test/utils/doctest/fmt/optional.h (not changes needed now)

lib/utils/include/utils/graph/README.md line 21 at r6 (raw file):

- `DiGraph`: at most one edge allowed between every ordered pair of nodes, edges are directed (i.e., have a source node and a destination node)
- `MultiDiGraph`: arbitrary numbers of directed edges allowed between every pair of nodes.
- `DataflowGraph`: similar to `MultiDiGraph`, but the edges entering, exiting a given nodes now have a well-defined order. Due to the interface used to construct them (where essentially a node can only be added to the graph after all of its predecessor nodes have been added) `DataflowGraph`s are directed acyclic graphs. Each node has an associated ordered sequence of inputs and outputs, with the restriction that one and only one edge can enter an individual input.

Suggestion:

similar to `MultiDiGraph` but with the following differences: ...

lib/utils/include/utils/graph/README.md line 135 at r6 (raw file):

- `OpenLabelledDataflowGraph` allows for labelling of `Node`s and `OpenDataflowValue`s, which is a variant describing both `DataflowOutput`s and `DataflowGraphInput`s, which represent the open inputs to the graph (i.e. the inputs for which their corresponding output is not present in the graph).

While the interfaces of these graphs differ slightly from the core graph variants, they still have the corresponding `add_node`/`add_edge` methods, and `query_nodes`/`query_edges` methods.

DataflowGraphs don't have add_edge

Code quote:

`add_edge` m

lib/utils/include/utils/graph/digraph/algorithms/get_dominators.h line 10 at r6 (raw file):

/**
 * @brief A node `d` is said to dominate a node `n` if every path from the root
 * node to `n` must go through `d`.

Just link to https://en.wikipedia.org/wiki/Dominator_(graph_theory) as it gives a better explanation than any docstring we write will

Code quote:

 * @brief A node `d` is said to dominate a node `n` if every path from the root
 * node to `n` must go through `d`.

lib/utils/include/utils/graph/digraph/algorithms/get_dominators.h line 19 at r6 (raw file):

/**
 * @brief Returns the intersection of the dominators of the input nodes.

Conceptually I think this is equivalent to merging this set of nodes and then finding the set of dominators of the new merged node? Confirm this is correct and, if so, add to the explanation here

Suggestion:

of the given set of nodes.

lib/utils/src/utils/graph/instances/hashmap_undirected_graph.cc line 41 at r5 (raw file):

Previously, Marsella8 wrote…

BUG? Intuitively this should be symmetric, something like:

Yeah I think you're right, though FYI your suggested code has a typo (both use .max()) and in your code assigning references of max_map and min_map probably isn't necessary

lib/utils/src/utils/graph/instances/hashmap_undirected_graph.cc line 53 at r6 (raw file):

Previously, Marsella8 wrote…

(unsure of correctness, but should make sense)

I believe the indented semantics of UndirectedGraph::query_edges is all edges connected to the a node in the input set, not all edges only connected to the nodes in the input set--maybe this is a fix that should be added in the subgraph code?

lib/utils/src/utils/graph/views/views.cc line 263 at r6 (raw file):

Previously, Marsella8 wrote…

(very unsure about this)

I don't think so, see https://reviewable.io/reviews/flexflow/FlexFlow/1464#-O6aRRAsCc8-SUiQ2l9V

lib/utils/test/src/utils/join_strings.cc line 10 at r6 (raw file):

TEST_SUITE(FF_TEST_SUITE) {

  TEST_CASE("join_strings") {

Probably would be good to check the edge case that v is empty

lib/utils/test/src/utils/join_strings.cc line 24 at r6 (raw file):

      auto add_exclamation = [](std::string const &str) { return str + "!"; };
      CHECK(join_strings(v, " ", add_exclamation) == "Hello! world! !!");
    }

Suggestion:

    SUBCASE("join_strings with transforming function") {
      auto add_exclamation = [](std::string const &str) { return str + "!"; };
      
      std::string result = join_strings(v, " ", add_exclamation);
      std::string correct = "Hello! world! !!";
      
      CHECK(result == correct);
    }

lib/utils/test/src/utils/bidict/bidict.cc line 19 at r6 (raw file):

    SUBCASE("bidict::contains_r") {
      CHECK(dict.contains_r(std::string("one")));

Why is explicit std::string(...) needed here?

lib/utils/test/src/utils/bidict/bidict.cc line 31 at r6 (raw file):

      CHECK(bd.contains_r(3));
      CHECK_FALSE(bd.contains_r(1));
    }

Suggestion:

    SUBCASE("bidict::contains_l") {
      SUBCASE("L type is not the same as R type") {
        CHECK(dict.contains_l(1));
        CHECK_FALSE(dict.contains_l(3));
      }
    
      SUBCASE("L type is the same as R type") {
        bidict<int, int> bd;
        bd.equate(1, 3);

        CHECK(bd.contains_l(1));
        CHECK_FALSE(bd.contains_l(3));
      }
    }

    SUBCASE("bidict::contains_r") {
      SUBCASE("L type is not the same as R type") {
        CHECK(dict.contains_r(std::string("one")));
        CHECK_FALSE(dict.contains_r(std::string("three")));
      }
      
      SUBCASE("L type is the same as R type") {
        bidict<int, int> bd;
        bd.equate(1, 3);

        CHECK(bd.contains_r(3));
        CHECK_FALSE(bd.contains_r(1));
      }
    }

lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc line 8 at r6 (raw file):

TEST_SUITE(FF_TEST_SUITE) {

  TEST_CASE("merge_bidicts") {

What happens if they have overlapping keys but all of the kv pairs are the same, i.e., merging {{1, "one"}, {2, "two"}} and {{2, "two"}, {3, "three"}}?

lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc line 25 at r6 (raw file):

      bidict<int, std::string> bd2 = {{2, "three"}, {3, "four"}};

      CHECK_THROWS_AS(merge_bidicts(bd1, bd2), std::runtime_error);

Suggestion:

      CHECK_THROWS(merge_bidicts(bd1, bd2));

lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc line 32 at r6 (raw file):

      bidict<int, std::string> bd2 = {{3, "two"}, {4, "four"}};

      CHECK_THROWS_AS(merge_bidicts(bd1, bd2), std::runtime_error);

Suggestion:

    CHECK_THROWS(merge_bidicts(bd1, bd2));

lib/utils/test/src/utils/containers/are_all_same.cc line 21 at r6 (raw file):

    SUBCASE("Empty Container") {
      std::vector<int> input = {};
      CHECK_THROWS_AS(are_all_same(input), std::runtime_error);

Suggestion:

      CHECK_THROWS(are_all_same(input));

lib/utils/test/src/utils/containers/as_vector.cc line 15 at r6 (raw file):

    std::vector<int> result = as_vector(input);
    std::vector<int> correct_sorted = {1, 2, 3};
    CHECK(sorted(result) == correct_sorted);

Suggestion:

    std::vector<int> correct = {1, 2, 3};
    CHECK(sorted(result) == sorted(correct));

lib/utils/test/src/utils/containers/enumerate.cc line 55 at r6 (raw file):

    CHECK(keys(result) == correct_keys);
    CHECK(unordered_multiset_of(values(result)) == correct_values);

If you're already doing unordered comparison of the keys and values, why not just compare std::unordered_multiset<std::pair<...>> rather than two separate comparisons?

Code quote:

    std::unordered_set<int> correct_keys = {0, 1, 2, 3};
    std::unordered_multiset<std::string> correct_values = {
        "zero", "one", "two", "three"};
    std::map<int, std::string> result = enumerate(input);

    CHECK(keys(result) == correct_keys);
    CHECK(unordered_multiset_of(values(result)) == correct_values);

lib/utils/test/src/utils/containers/find.cc line 13 at r6 (raw file):

  TEST_CASE("find") {

    SUBCASE("vector") {

Ideally also add a test where the desired element occurs multiple times in the list

lib/utils/test/src/utils/containers/find.cc line 15 at r6 (raw file):

    SUBCASE("vector") {
      std::vector<int> v = {1, 2, 3, 4, 5};
      CHECK_WITHOUT_STRINGIFY(find(v, 3) == std::find(v.begin(), v.end(), 3));

Add subcase names for these, i.e., SUBCASE("element is in vector"), SUBCASE("element is not vector")

lib/utils/test/src/utils/containers/flatmap.cc line 9 at r6 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("Test for flatmap function on vectors") {

Add a check for type changing (e.g., instead of get_factors having type int -> std::vector<int>, you'd have int -> std::vector<string>)

lib/utils/test/src/utils/containers/index_of.cc line 13 at r6 (raw file):

    SUBCASE("unique elements") {
      std::vector<int> v = {1, 2, 3, 4, 5};
      CHECK(index_of(v, 3).value() == 2);

SUBCASE("element is present in vector"), SUBCASE("element is not present in vector")

lib/utils/test/src/utils/containers/map_keys.cc line 10 at r6 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("map_keys") {

Add check for duplicate keys (even if the value is undefined, check whether or not it throws an error, or check that it's one of the allowed return values, etc.)

lib/utils/test/src/utils/containers/maximum.cc line 22 at r6 (raw file):

      std::vector<int> input = {};

      CHECK_THROWS_AS(maximum(input), std::runtime_error);

Suggestion:

      CHECK_THROWS(maximum(input));

lib/utils/test/src/utils/containers/merge_maps.cc line 27 at r6 (raw file):

      std::unordered_map<int, std::string> rhs = {{2, "three"}, {3, "four"}};

      CHECK_THROWS_AS(merge_maps(lhs, rhs), std::runtime_error);

Suggestion:

      CHECK_THROWS(merge_maps(lhs, rhs));

lib/utils/test/src/utils/containers/product.cc line 8 at r6 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("product") {

Probably a good idea where not too inconvenient to test the various overloads of these functions, either using TEST_CASE_TEMPLATE or using multiple TEST_CASEs (e.g., TEST_CASE("product(std::vector<int>)") { ... }, TEST_CASE("product(std::unordered_set<int>)") { ... }, etc.).

Obviously weigh implementation difficulty, how many other tasks you have going on, etc. This is not super-high priority, so if you have more important tasks you could also be working on feel free to not worry about this--my general heuristic with testing code is that if it's an improvement over the current situation and doesn't actively violate a bunch of code style, etc. I'll merge it. I'll probably still leave comments pointing it out on PRs, but in those cases feel free to just mention in the response that you'd prefer not to do those fixes as part of the current PR--this is totally fine, and often a good way to keep PRs from getting too large

lib/utils/test/src/utils/containers/product.cc line 24 at r6 (raw file):

  }

  TEST_CASE("product_where") {

Move to separate product_where.cc

lib/utils/test/src/utils/containers/reversed.cc line 9 at r6 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("Testing the 'reversed' function") {

Suggestion:

TEST_CASE("reversed(std::vector<int>)") {

lib/utils/test/src/utils/containers/sorted_by.cc line 10 at r6 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("sorted_by") {

Might be good to test what happens if there are duplicates in the list?

lib/utils/test/src/utils/containers/sum.cc line 24 at r6 (raw file):

  }

  TEST_CASE("sum_where") {

Move to separate sum_where.cc

lib/utils/test/src/utils/containers/value_all.cc line 14 at r6 (raw file):

    SUBCASE("With nullopt") {
      std::vector<std::optional<int>> input = {1, 2, std::nullopt, 4, 5};
      CHECK_THROWS_AS(value_all(input), std::runtime_error);

I haven't really put much though into what error types things throw, so for things that throw mk_runtime_error it's probably better to just check that they throw for now

Suggestion:

    CHECK_THROWS(value_all(input));

lib/utils/test/src/utils/containers/vector_split.cc line 25 at r6 (raw file):

    }

    SUBCASE("Boundary case: idx = 5") {

Suggestion:

    SUBCASE("Boundary case: idx is last index in list") {

lib/utils/test/src/utils/containers/vector_split.cc line 35 at r6 (raw file):

    }

    SUBCASE("Out of bounds case: idx = 6") {

Suggestion:

    SUBCASE("Out of bounds case: idx == list size") {

lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc line 35 at r4 (raw file):

Previously, Marsella8 wrote…

Done.

Just FYI if you leave a trailing comma the formatter will insert a linebreak. Not saying you need to do this here, just making sure you're aware.

For example,

std::pair<int, int> x = {
  3,
  5
};

->

std::pair<int, int> x = {3, 5};

while

std::pair<int, int> x = { 3, 5, };

->

std::pair<int, int> x = {
  3, 
  5,
};

lib/utils/test/src/utils/graph/digraph/directed_edge_query.cc line 67 at r5 (raw file):

Previously, Marsella8 wrote…

Do they match all edges, even if

Probably simpler to just check resultequals DirectedEdgeQuery correct = DirectedEdgeQuery{query_set{n.at(1), n.at(2)}, query_set{n.at(3), n.at(4)}};

lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 23 at r6 (raw file):

    SUBCASE("single node") {
      Node node = n[2];

Suggestion:

Node node = n.at(2);

lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 24 at r6 (raw file):

    SUBCASE("single node") {
      Node node = n[2];
      std::unordered_set<Node> correct = {n[0], n[2]};

Suggestion:

      std::unordered_set<Node> correct = {n.at(0), n.at(2)};

lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 30 at r6 (raw file):

    SUBCASE("multiple nodes") {
      std::unordered_set<Node> nodes = {n[1], n[3]};

Suggestion:

      std::unordered_set<Node> nodes = {n.at(1), n.at(3)};

lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 32 at r6 (raw file):

      std::unordered_set<Node> nodes = {n[1], n[3]};
      std::unordered_set<Node> result = get_dominators(g, nodes);
      std::unordered_set<Node> correct = {n[0]};

Suggestion:

      std::unordered_set<Node> correct = {n.at(0)};

lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 61 at r6 (raw file):

      correct = {n.at(0), n.at(1), n.at(3)};

      CHECK(result == correct);

You're missing a check (also separate subcases please, at the very least so you have to justify why you want to do two checks here instead of one)

Code quote:

      std::unordered_set<Node> result = get_dominators(g, n.at(1));
      std::unordered_set<Node> correct = {n.at(0), n.at(1)};

      result = get_dominators(g, n.at(3));
      correct = {n.at(0), n.at(1), n.at(3)};

      CHECK(result == correct);

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 34 at r6 (raw file):

    }

    SUBCASE("add_edge") {

Add a subcase in here to test adding a duplicate edge

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 42 at r6 (raw file):

    }

    SUBCASE("add_edges") {

Why does subcase this exist?

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 60 at r6 (raw file):

      std::unordered_set<MultiDiEdge> edge_result =
          g.query_edges(MultiDiEdgeQuery({n0}, {n0}));

This is only edges from n0 to n0, which I don't think is what you want?

Code quote:

      std::unordered_set<MultiDiEdge> edge_result =
          g.query_edges(MultiDiEdgeQuery({n0}, {n0}));

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 73 at r6 (raw file):

    }

    SUBCASE("remove_edges") {

What is the point of this SUBCASE?

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc line 23 at r4 (raw file):

Previously, Marsella8 wrote…

MultiDiEdges cannot be constructed directly, since they are essentially wrappers for an uid (and the graph itself does the bookkeping about which edge has which source and sink).

Yes, but can't you add DirectedEdge and then get back a list of MultiDiEdge? DirectedEdge is just more readable and specific than std::pair<Node, Node>

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc line 27 at r6 (raw file):

    std::vector<MultiDiEdge> edges = add_edges(g, input);

    SUBCASE("node has incoming edges") {

Suggestion:

    SUBCASE("node has outgoing edges") {

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc line 34 at r6 (raw file):

    }

    SUBCASE("node has no incoming edges") {

Suggestion:

SUBCASE("node has no outgoing edges") {

lib/utils/test/src/utils/graph/views/views.cc line 48 at r1 (raw file):

Previously, Marsella8 wrote…

Fixed it, issue had to do with the query_edges function which was including all open edges coming out of the subgraph

I don't think your current fix is correct, see my comments on your modifications to HashmapUndirectedGraph

lib/utils/test/src/utils/graph/views/views.cc line 88 at r1 (raw file):

Previously, Marsella8 wrote…

std::unordered_set<Node> JoinedNodeView::query_nodes(NodeQuery const &query) in views.cc has not yet been implemented.

Interesting, that should be trivial to implement, can you add it?

lib/utils/test/src/utils/graph/views/views.cc line 52 at r4 (raw file):

Previously, Marsella8 wrote…

get_graph_data I think is only implemented for DataflowGraphs (I can add it for the other classes as well if needed).

Adding it for other classes would be fine

lib/utils/test/src/utils/containers.cc line 140 at r4 (raw file):

Previously, Marsella8 wrote…

ended up putting a throw for an empty container (i think for other functions it makes sense to vacuously define how it behaves for an empty container (e.g. sum, optional_all_of), but for are_all_same I can't really think of a situation where somebody would want to pass an empty container. If you want to be pedantic, you could argue that you can define is_all_same recursively and the empty container would have to evaluate to true, but I feel like in practice in doesn't make much sense.

I actually think are_all_same should be return true if passed an empty list--like, as a user of this function, based on the name I would expect it to behave like (\forall x, y \in L . x == y). If you want to add an are_all_same1 function that throws on empty list I don't think I would object (though I'm not convinced it's necessary), but I think having are_all_same fail on an empty input is rather unintuitive.

Note that I think in one of the PRs (I think it was associated with concat, so probably either inceptionv3 or candle-uno) I added a function that is similar but returns the value if they are all the same, which throws an error if the list is empty (as makes sense), so there is support for this case as well

lib/utils/include/utils/containers/get_one_of.h line 9 at r6 (raw file):

template <typename T>
T get_one_of(std::unordered_set<T> const &s) {

Add check that s is not empty

lib/utils/test/src/utils/disjoint_set.cc line 27 at r6 (raw file):

      element = generate_element<T>(2);
      CHECK_WITHOUT_STRINGIFY(ds.find(element) == element);

Why no stringify? optional can be fmted, and so can std::string and int

lib/utils/test/src/utils/random_utils.cc line 12 at r4 (raw file):

Previously, Marsella8 wrote…

i think in this case it ends up being a bit clunky to compare the 2 vectors (since we are testing for the values being approximately equal), it would be something like in the code snippet.
In general, why is it preferrable to have a single check rather than multiple ones? To have a more defined point of where the check is being done / more awareness of the entire container rather than just a single value?

The second case--CHECK(a == b) prints a and b in the error message if it fails, so if you have for (int i = 0; ...) { CHECK(a[i] == b[i]); } then you only get the diverging elements rather than getting to see all of a and b which is often helpful

Also, simplicity--just having one check is usually more readable.

That said, all of these are not hard rules--if due to doctest's interface making it a single check would be too inconvenient, then we can do the looped version, I just prefer to avoid it where possible.

lib/utils/test/src/utils/stack_map.cc line 49 at r6 (raw file):

      std::vector<std::pair<int, int>> expected = {{1, 10}, {2, 20}, {3, 30}};
      std::vector<std::pair<int, int>> actual = map;
      CHECK_WITHOUT_STRINGIFY(actual == expected);

Why? std::vector<std::pair<int, int>> should 100% be fmtable

lib/utils/test/src/utils/stack_string.cc line 52 at r6 (raw file):

    StackString str3{"abc"};

    CHECK_WITHOUT_STRINGIFY(str1 == str1);

stack_string should be fmtable, why no stringify?

lib/utils/test/src/utils/stack_vector.cc line 10 at r6 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE_TEMPLATE("PushBack", T, int, double, char) {

And same for other test cases in this file

Suggestion:

  TEST_CASE_TEMPLATE("stack_vector<T, MAXSIZE>::push_back", T, int, double, char) {

lib/utils/test/src/utils/stack_vector.cc line 67 at r6 (raw file):

    vector2.push_back(20);

    CHECK_WITHOUT_STRINGIFY(vector1 == vector2);

Why no stringify?

lib/utils/test/src/utils/tuple.cc line 46 at r6 (raw file):

    auto result = tuple_prepend(value, t1);
    std::tuple<int, float, double> expected(42, 3.14f, 2.71828);
    CHECK(tuple_compare(result, expected));

Suggestion:

    std::tuple<int, float, double> result = tuple_prepend(value, t1);
    std::tuple<int, float, double> correct = {42, 3.14f, 2.71828};
    CHECK(result == correct);

lib/utils/test/src/utils/type_index.cc line 22 at r6 (raw file):

  }

  TEST_CASE("matches function") {

Suggestion:

  TEST_CASE("matches<T>(std::type_index)") {

lib/utils/test/src/utils/containers/get_one_of.cc line 11 at r6 (raw file):

    std::unordered_set<int> s = {1, 2, 3};
    CHECK(contains(s, get_one_of(s)));
  }

What happens if s is empty?

lib/utils/test/src/utils/containers/is_subset_eq_of.cc line 0 at r6 (raw file):
Rename to is_subseteq_of.cc to match the corresponding .h file (i.e., utils/containers/is_subseteq_of.h)

Marsella8

Reviewable status: 72 of 148 files reviewed, 83 unresolved discussions (waiting on @lockshaw)

lib/substitutions/src/substitutions/unlabelled/multidigraph_pattern_match.cc line 3 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Delete (yes I know you didn't add these, but may as well remove them now)

Done.

lib/utils/include/utils/join_strings.h line 27 at r6 (raw file):

  }
  return oss.str();
}

Done.

lib/utils/include/utils/bidict/algorithms/merge_bidicts.h line 16 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Including debugging info in exception messages where possible is nice

see are_all_same.h comment

lib/utils/include/utils/bidict/algorithms/merge_bidicts.h line 20 at r6 (raw file):

  if (!are_disjoint(right_entries(lhs), right_entries(rhs))) {
    throw mk_runtime_error(
        "Right entries of merge_bidicts parameters are non-disjoint");

see are_all_same.h comment

lib/utils/include/utils/containers/are_all_same.h line 10 at r6 (raw file):

template <typename C>
bool are_all_same(C const &c) {
  if (c.size() == 0) {

Done.

lib/utils/include/utils/containers/are_all_same.h line 11 at r6 (raw file):

bool are_all_same(C const &c) {
  if (c.size() == 0) {
    throw mk_runtime_error("input to are_all_same must be non-empty container");

This requires to have all formatting for all possible containers that might be passed, probably not worth it.

lib/utils/include/utils/containers/index_of.h line 6 at r6 (raw file):

#include <algorithm>
#include <optional>
namespace FlexFlow {

Done.

lib/utils/include/utils/containers/index_of.h line 10 at r6 (raw file):

/**
 * @details If multiple `e` are present within the container, the function
 *returns the index of the first appearance

Done.

lib/utils/include/utils/containers/map_keys.h line 13 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Is this a good behavior or should this throw an error? I'm not sure this is a great property to have, and I don't think we're relying on it anywhere?

Updated

lib/utils/include/utils/containers/map_keys.h line 14 at r6 (raw file):

 *
 * @details Note that if multiple original keys are transformed to the same new
 * key, which value will be associated to the new key is undefined.

Done.

lib/utils/include/utils/containers/maximum.h line 14 at r6 (raw file):

    throw mk_runtime_error(
        "input to function maximum must be non-empty container");
  }

See are_all_same comment

lib/utils/include/utils/containers/merge_maps.h line 18 at r6 (raw file):

    throw mk_runtime_error(
        "Key sets of merge_maps parameters are non-disjoint");
  }

Done.

lib/utils/include/utils/containers/product.h line 9 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

We don't really have a convention set for doxygen yet, but might be good to use @note for documenting edge cases like this? Or idk, maybe we should only use @note for really weird/bug-prone edge cases? Thoughts?

I'd say as a loose standard:

brief for one line descriptions
details for more comprehensive descriptions and specific behavior (e.g. how get_distance_from_root behaves if there are multiple roots).
note for unlikely edge cases / more unimportant properties which are nice to know (e.g. product is vacuously one)
example for small examples if the function is complex.

Obviously 2 and 3 are not conceptually well separated.

lib/utils/include/utils/containers/product.h line 32 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Move to separate utils/containers/product_where.h

Done.

lib/utils/include/utils/containers/subvec.h line 19 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Avoid c-style casts

Done.

lib/utils/include/utils/containers/subvec.h line 19 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why std::max(0, ...)?

modelling it after python's splits/slices, so if you have {1,2,3,4} and do [0, 40] you get {1,2,3,4} (before it was just reading junk, I can also change it to throw errors if it's out of bounds. Ditto for std::min(idx, (int)v.size());

lib/utils/include/utils/containers/sum.h line 21 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Move to separate utils/containers/sum_where.h

Done.

lib/utils/include/utils/containers/value_all.h line 15 at r6 (raw file):

    return unwrap(element, [] {
      throw mk_runtime_error(
          "Encountered element without value in call to value_all");

see are_all_same.h comment

lib/utils/include/utils/containers/value_all.h line 25 at r6 (raw file):

    return unwrap(element, [] {
      throw mk_runtime_error(
          "Encountered element without value in call to value_all");

Done.

lib/utils/include/utils/containers/vector_split.h line 15 at r6 (raw file):

                                                       int idx) {
  if (idx < 0 || idx > static_cast<int>(v.size())) {
    throw std::out_of_range("Index out of range in vector_split");

Done.

lib/utils/include/utils/graph/README.md line 21 at r6 (raw file):

- `DiGraph`: at most one edge allowed between every ordered pair of nodes, edges are directed (i.e., have a source node and a destination node)
- `MultiDiGraph`: arbitrary numbers of directed edges allowed between every pair of nodes.
- `DataflowGraph`: similar to `MultiDiGraph`, but the edges entering, exiting a given nodes now have a well-defined order. Due to the interface used to construct them (where essentially a node can only be added to the graph after all of its predecessor nodes have been added) `DataflowGraph`s are directed acyclic graphs. Each node has an associated ordered sequence of inputs and outputs, with the restriction that one and only one edge can enter an individual input.

Done.

lib/utils/include/utils/graph/README.md line 135 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

DataflowGraphs don't have add_edge

Done.

lib/utils/include/utils/graph/digraph/algorithms/get_dominators.h line 10 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Just link to https://en.wikipedia.org/wiki/Dominator_(graph_theory) as it gives a better explanation than any docstring we write will

Done.

lib/utils/include/utils/graph/digraph/algorithms/get_dominators.h line 19 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Conceptually I think this is equivalent to merging this set of nodes and then finding the set of dominators of the new merged node? Confirm this is correct and, if so, add to the explanation here

yes correct

lib/utils/src/utils/graph/instances/hashmap_undirected_graph.cc line 53 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I believe the indented semantics of UndirectedGraph::query_edges is all edges connected to the a node in the input set, not all edges only connected to the nodes in the input set--maybe this is a fix that should be added in the subgraph code?

got it (see the associated fix in std::unordered_set<UndirectedEdge> UndirectedSubgraphView::query_edges()

lib/utils/test/src/utils/join_strings.cc line 10 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Probably would be good to check the edge case that v is empty

Done.

lib/utils/test/src/utils/join_strings.cc line 24 at r6 (raw file):

      auto add_exclamation = [](std::string const &str) { return str + "!"; };
      CHECK(join_strings(v, " ", add_exclamation) == "Hello! world! !!");
    }

Done.

lib/utils/test/src/utils/bidict/bidict.cc line 19 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why is explicit std::string(...) needed here?

Done.

lib/utils/test/src/utils/bidict/bidict.cc line 31 at r6 (raw file):

      CHECK(bd.contains_r(3));
      CHECK_FALSE(bd.contains_r(1));
    }

Done.

lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc line 8 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What happens if they have overlapping keys but all of the kv pairs are the same, i.e., merging {{1, "one"}, {2, "two"}} and {{2, "two"}, {3, "three"}}?

Done.

lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc line 25 at r6 (raw file):

      bidict<int, std::string> bd2 = {{2, "three"}, {3, "four"}};

      CHECK_THROWS_AS(merge_bidicts(bd1, bd2), std::runtime_error);

Done.

lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc line 32 at r6 (raw file):

      bidict<int, std::string> bd2 = {{3, "two"}, {4, "four"}};

      CHECK_THROWS_AS(merge_bidicts(bd1, bd2), std::runtime_error);

Done.

lib/utils/test/src/utils/containers/are_all_same.cc line 21 at r6 (raw file):

    SUBCASE("Empty Container") {
      std::vector<int> input = {};
      CHECK_THROWS_AS(are_all_same(input), std::runtime_error);

Done.

lib/utils/test/src/utils/containers/as_vector.cc line 15 at r6 (raw file):

    std::vector<int> result = as_vector(input);
    std::vector<int> correct_sorted = {1, 2, 3};
    CHECK(sorted(result) == correct_sorted);

Done.

lib/utils/test/src/utils/containers/enumerate.cc line 55 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

If you're already doing unordered comparison of the keys and values, why not just compare std::unordered_multiset<std::pair<...>> rather than two separate comparisons?

set is unordered, we are not guaranteed to have 0 map to "zero", 1 to "one", ... (changed the string labels, before it was pretty confusing).

lib/utils/test/src/utils/containers/find.cc line 13 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Ideally also add a test where the desired element occurs multiple times in the list

Done.

lib/utils/test/src/utils/containers/find.cc line 15 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Add subcase names for these, i.e., SUBCASE("element is in vector"), SUBCASE("element is not vector")

Done.

lib/utils/test/src/utils/containers/flatmap.cc line 9 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Add a check for type changing (e.g., instead of get_factors having type int -> std::vector<int>, you'd have int -> std::vector<string>)

Done.

lib/utils/test/src/utils/containers/index_of.cc line 13 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

SUBCASE("element is present in vector"), SUBCASE("element is not present in vector")

Done.

lib/utils/test/src/utils/containers/map_keys.cc line 10 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Add check for duplicate keys (even if the value is undefined, check whether or not it throws an error, or check that it's one of the allowed return values, etc.)

Done.

lib/utils/test/src/utils/containers/maximum.cc line 22 at r6 (raw file):

      std::vector<int> input = {};

      CHECK_THROWS_AS(maximum(input), std::runtime_error);

Done.

lib/utils/test/src/utils/containers/merge_maps.cc line 27 at r6 (raw file):

      std::unordered_map<int, std::string> rhs = {{2, "three"}, {3, "four"}};

      CHECK_THROWS_AS(merge_maps(lhs, rhs), std::runtime_error);

Done.

lib/utils/test/src/utils/containers/product.cc line 8 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Probably a good idea where not too inconvenient to test the various overloads of these functions, either using TEST_CASE_TEMPLATE or using multiple TEST_CASEs (e.g., TEST_CASE("product(std::vector<int>)") { ... }, TEST_CASE("product(std::unordered_set<int>)") { ... }, etc.).

Obviously weigh implementation difficulty, how many other tasks you have going on, etc. This is not super-high priority, so if you have more important tasks you could also be working on feel free to not worry about this--my general heuristic with testing code is that if it's an improvement over the current situation and doesn't actively violate a bunch of code style, etc. I'll merge it. I'll probably still leave comments pointing it out on PRs, but in those cases feel free to just mention in the response that you'd prefer not to do those fixes as part of the current PR--this is totally fine, and often a good way to keep PRs from getting too large

got it thanks! Added it since its relatively easy to add, will keep it in mind for future containers functions.

lib/utils/test/src/utils/containers/product.cc line 24 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Move to separate product_where.cc

Done.

lib/utils/test/src/utils/containers/reversed.cc line 9 at r6 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("Testing the 'reversed' function") {

Done.

lib/utils/test/src/utils/containers/sorted_by.cc line 10 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Might be good to test what happens if there are duplicates in the list?

Done.

lib/utils/test/src/utils/containers/sum.cc line 24 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Move to separate sum_where.cc

Done.

lib/utils/test/src/utils/containers/value_all.cc line 14 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I haven't really put much though into what error types things throw, so for things that throw mk_runtime_error it's probably better to just check that they throw for now

Done.

lib/utils/test/src/utils/containers/vector_split.cc line 25 at r6 (raw file):

    }

    SUBCASE("Boundary case: idx = 5") {

Done.

lib/utils/test/src/utils/containers/vector_split.cc line 35 at r6 (raw file):

    }

    SUBCASE("Out of bounds case: idx = 6") {

Done.

lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 23 at r6 (raw file):

    SUBCASE("single node") {
      Node node = n[2];

Done.

lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 24 at r6 (raw file):

    SUBCASE("single node") {
      Node node = n[2];
      std::unordered_set<Node> correct = {n[0], n[2]};

Done.

lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 30 at r6 (raw file):

    SUBCASE("multiple nodes") {
      std::unordered_set<Node> nodes = {n[1], n[3]};

Done.

lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 32 at r6 (raw file):

      std::unordered_set<Node> nodes = {n[1], n[3]};
      std::unordered_set<Node> result = get_dominators(g, nodes);
      std::unordered_set<Node> correct = {n[0]};

Done.

lib/utils/test/src/utils/graph/digraph/algorithms/get_dominators.cc line 61 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

You're missing a check (also separate subcases please, at the very least so you have to justify why you want to do two checks here instead of one)

Done.

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 34 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Add a subcase in here to test adding a duplicate edge

Done.

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 42 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why does subcase this exist?

removed

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 60 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

This is only edges from n0 to n0, which I don't think is what you want?

fixed

lib/utils/test/src/utils/graph/multidigraph/multidigraph.cc line 73 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What is the point of this SUBCASE?

removed

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_incoming_edges.cc line 23 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Yes, but can't you add DirectedEdge and then get back a list of MultiDiEdge? DirectedEdge is just more readable and specific than std::pair<Node, Node>

as yes sorry, done

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc line 27 at r6 (raw file):

    std::vector<MultiDiEdge> edges = add_edges(g, input);

    SUBCASE("node has incoming edges") {

Done.

lib/utils/test/src/utils/graph/multidigraph/algorithms/get_outgoing_edges.cc line 34 at r6 (raw file):

    }

    SUBCASE("node has no incoming edges") {

Done.

lib/utils/test/src/utils/graph/views/views.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Interesting, that should be trivial to implement, can you add it?

Added a tentative implementation, crashes for get_edges, not sure what I'm doing wrong, mind looking at it and seeing if anything pops out?

lib/utils/test/src/utils/graph/views/views.cc line 52 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Adding it for other classes would be fine

I think for now we don't need it for these classes for now, i think it makes tests less atomic.

lib/utils/include/utils/containers/optional_all_of.h line 9 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Are we using this anywhere? It's kinda a weird function and I wouldn't object to removing it if we're not using it

no it's not being used, agree that it should be removed

lib/utils/test/src/utils/containers.cc line 140 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I actually think are_all_same should be return true if passed an empty list--like, as a user of this function, based on the name I would expect it to behave like (\forall x, y \in L . x == y). If you want to add an are_all_same1 function that throws on empty list I don't think I would object (though I'm not convinced it's necessary), but I think having are_all_same fail on an empty input is rather unintuitive.

Note that I think in one of the PRs (I think it was associated with concat, so probably either inceptionv3 or candle-uno) I added a function that is similar but returns the value if they are all the same, which throws an error if the list is empty (as makes sense), so there is support for this case as well

Yeah makes sense, changed it accordingly.

lib/utils/include/utils/containers/get_one_of.h line 9 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Add check that s is not empty

Done.

lib/utils/include/utils/containers/is_submap.h line 14 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For consistency with is_subseteq_of

Done.

lib/utils/test/src/utils/disjoint_set.cc line 27 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why no stringify? optional can be fmted, and so can std::string and int

Done.

lib/utils/test/src/utils/random_utils.cc line 12 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

The second case--CHECK(a == b) prints a and b in the error message if it fails, so if you have for (int i = 0; ...) { CHECK(a[i] == b[i]); } then you only get the diverging elements rather than getting to see all of a and b which is often helpful

Also, simplicity--just having one check is usually more readable.

That said, all of these are not hard rules--if due to doctest's interface making it a single check would be too inconvenient, then we can do the looped version, I just prefer to avoid it where possible.

got it thanks, I think in this case the looped version is more comprehensible

lib/utils/test/src/utils/stack_map.cc line 49 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why? std::vector<std::pair<int, int>> should 100% be fmtable

Done.

lib/utils/test/src/utils/stack_string.cc line 52 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

stack_string should be fmtable, why no stringify?

added docstring support to the stack_ types

lib/utils/test/src/utils/stack_vector.cc line 10 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

And same for other test cases in this file

Done.

lib/utils/test/src/utils/stack_vector.cc line 67 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why no stringify?

Done.

lib/utils/test/src/utils/tuple.cc line 46 at r6 (raw file):

    auto result = tuple_prepend(value, t1);
    std::tuple<int, float, double> expected(42, 3.14f, 2.71828);
    CHECK(tuple_compare(result, expected));

Done.

lib/utils/test/src/utils/type_index.cc line 22 at r6 (raw file):

  }

  TEST_CASE("matches function") {

Done.

lib/utils/test/src/utils/containers/get_one_of.cc line 11 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What happens if s is empty?

now throws

lib/utils/test/src/utils/containers/is_subset_eq_of.cc line at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Rename to is_subseteq_of.cc to match the corresponding .h file (i.e., utils/containers/is_subseteq_of.h)

Done.

lockshaw

Reviewed 1 of 111 files at r5, 61 of 73 files at r7, all commit messages.
Reviewable status: 134 of 148 files reviewed, 51 unresolved discussions (waiting on @Marsella8)

lib/utils/include/utils/stack_map.h line 142 at r7 (raw file):

};

} // namespace doctest

Shouldn't be necessary as long as stack_map has a std::ostream &operator<<(std::ostream &, stack_map<K, V, MAXSIZE> const &) defined, as doctest will use that if it's defined.

Also, doctest things shouldn't be in utils (they were for a bit, but if you look at the current repo-refactor they've all been moved to test/utils

Code quote:

namespace doctest {

template <typename K, typename V, std::size_t MAXSIZE>
struct StringMaker<FlexFlow::stack_map<K, V, MAXSIZE>> {
  static String convert(FlexFlow::stack_map<K, V, MAXSIZE> const &map) {
    return toString(fmt::to_string(map));
  }
};

} // namespace doctest

lib/utils/include/utils/stack_string.h line 140 at r7 (raw file):

template <size_t MAXSIZE>
struct StringMaker<FlexFlow::stack_string<MAXSIZE>> {

See https://reviewable.io/reviews/flexflow/FlexFlow/1464#-O7kp-4g9JgF07PcV2Y1

lib/utils/include/utils/stack_vector.h line 374 at r7 (raw file):

namespace doctest {

template <typename T, std::size_t MAXSIZE>

See https://reviewable.io/reviews/flexflow/FlexFlow/1464#-O7kp-4g9JgF07PcV2Y1

lib/utils/include/utils/containers/are_all_distinct.h line 6 at r7 (raw file):

#include "utils/containers/unordered_multiset_of.h"
#include "utils/containers/unordered_set_of.h"
#include "utils/exception.h"

Remove? Doesn't seem to be used.

Code quote:

#include "utils/exception.h"

lib/utils/include/utils/containers/are_all_same.h line 11 at r6 (raw file):

Previously, Marsella8 wrote…

This requires to have all formatting for all possible containers that might be passed, probably not worth it.

It's a good concern, though peronally I'd lean toward that being fine as I think it's reasonable to expect that most things should be printable (or that if they're not, that they should be made printable) unless there's some cases we know of where this is impossible. What happens if you change it to require printability--are there any nontrivial cases, or is it just a bunch of #include "utils/fmt/....h (which I don't have much of an objection to)?

lib/utils/include/utils/containers/map_keys.h line 24 at r7 (raw file):

                                   F const &f) {
  std::unordered_multiset<K2> transformed_keys =
      transform(unordered_multiset_of(keys(m)), f);

Probably better not to evaluate f more times than necessary if possible, could probably get moved into the loop and use the f call there?

lib/utils/include/utils/containers/product.h line 9 at r6 (raw file):

Previously, Marsella8 wrote…

I'd say as a loose standard:

brief for one line descriptions

details for more comprehensive descriptions and specific behavior (e.g. how get_distance_from_root behaves if there are multiple roots).

note for unlikely edge cases / more unimportant properties which are nice to know (e.g. product is vacuously one)

example for small examples if the function is complex.

Obviously 2 and 3 are not conceptually well separated.

My only objection is that I think @note gets highlighted (correct me if I'm wrong), so using it for "more unimportant properties" is a bit odd

lib/utils/include/utils/containers/subvec.h line 19 at r6 (raw file):

Previously, Marsella8 wrote…

modelling it after python's splits/slices, so if you have {1,2,3,4} and do [0, 40] you get {1,2,3,4} (before it was just reading junk, I can also change it to throw errors if it's out of bounds. Ditto for std::min(idx, (int)v.size());

Probably better to throw an error if it's still out-of-bounds after the negative-to-positive-index conversion

lib/utils/include/utils/containers/transform.h line 38 at r7 (raw file):

template <typename F,
          typename In,
          typename Out = decltype(std::declval<F>()(std::declval<In>()))>

Suggestion:

          typename Out = std::invoke_result_t<F, In>>

lib/utils/include/utils/containers/transform.h line 50 at r7 (raw file):

template <typename F,
          typename In,
          typename Out = decltype(std::declval<F>()(std::declval<In>()))>

Suggestion:

          typename Out = std::invoke_result_t<F, In>>

lib/utils/include/utils/fmt/unordered_map.h line 22 at r7 (raw file):

    : formatter<::std::string> {
  template <typename FormatContext>
  auto format(::std::unordered_map<K, V> const &m, FormatContext &ctx) const

Can you make sure this is also applied to the other files in utils/fmt?

lib/utils/test/src/utils/bidict/bidict.cc line 19 at r6 (raw file):

Previously, Marsella8 wrote…

Done.

I still see the explicit std::string(...)?

lib/utils/test/src/utils/bidict/bidict.cc line 41 at r7 (raw file):

        CHECK_FALSE(bd.contains_r(1));
      }
    }

Nest these as

SUBCASE("L type is the same as R type") {
  SUBCASE("bidict::contains_l") {
    ...
  }

  SUBCASE("bidict::contains_r") {

  }
}

SUBCASE("L type is not the same as R type") {
  SUBCASE("bidict::contains_l") {
    ...
  }

  SUBCASE("bidict::contains_r") {

  }
}

Code quote:

    SUBCASE("bidict::contains_l") {
      SUBCASE("L type is not the same as R type") {
        CHECK(dict.contains_l(1));
        CHECK_FALSE(dict.contains_l(3));
      }

      SUBCASE("L type is the same as R type") {
        bidict<int, int> bd;
        bd.equate(1, 3);

        CHECK(bd.contains_l(1));
        CHECK_FALSE(bd.contains_l(3));
      }
    }

    SUBCASE("bidict::contains_r") {
      SUBCASE("L type is not the same as R type") {
        CHECK(dict.contains_r(std::string("one")));
        CHECK_FALSE(dict.contains_r(std::string("three")));
      }

      SUBCASE("L type is the same as R type") {
        bidict<int, int> bd;
        bd.equate(1, 3);

        CHECK(bd.contains_r(3));
        CHECK_FALSE(bd.contains_r(1));
      }
    }

lib/utils/test/src/utils/bidict/bidict.cc line 52 at r7 (raw file):

      CHECK(bd.contains_r(3));
      CHECK_FALSE(bd.contains_r(1));
    }

Delete?

Code quote:

    SUBCASE("bidict::contains_r, bidict::contains_r - same type") {
      bidict<int, int> bd;
      bd.equate(1, 3);
      bd.equate(2, 4);

      CHECK(bd.contains_l(1));
      CHECK_FALSE(bd.contains_l(3));
      CHECK(bd.contains_r(3));
      CHECK_FALSE(bd.contains_r(1));
    }

lib/utils/test/src/utils/bidict/algorithms/merge_bidicts.cc line 32 at r7 (raw file):

      bidict<int, std::string> bd2 = {{2, "two"}, {3, "three"}};

      CHECK_THROWS(merge_bidicts(bd1, bd2));

If this is the case, then it should probably be renamed merge_disjoint_bidicts

lib/utils/test/src/utils/containers/enumerate.cc line 55 at r6 (raw file):

Previously, Marsella8 wrote…

set is unordered, we are not guaranteed to have 0 map to "zero", 1 to "one", ... (changed the string labels, before it was pretty confusing).

std::unordered_multiset<std::pair<int, std::string>> correct = {{0, "A"}, {1, "B"}, {2, "C"}, {3, "D"}}; would check that, right?

lib/utils/test/src/utils/containers/find.cc line 32 at r7 (raw file):

      std::unordered_set<int> s = {1, 2, 3, 4, 5};

      SUBCASE("element found") {

Would probably be a better name, but not a big deal

Suggestion:

      SUBCASE("element in container") {

lib/utils/test/src/utils/containers/index_of.cc line 14 at r7 (raw file):

    std::vector<int> v = {1, 2, 3, 4, 3, 5};

    SUBCASE("unique element") {

Suggestion:

    SUBCASE("element occurs once in container") {

lib/utils/test/src/utils/containers/index_of.cc line 19 at r7 (raw file):

    SUBCASE("duplicate elements") {
      CHECK(index_of(v, 3).value() == 2); // Returns first occurrence
    }

Suggestion:

    SUBCASE("if element appears multiple times, returns the first occurence") {
      CHECK(index_of(v, 3).value() == 2);
    }

lib/utils/test/src/utils/containers/product_where.cc line 17 at r7 (raw file):

      CHECK(correct == result);
    }
    SUBCASE("empty filtered container") {

Add subcase for if the container started empty

lib/utils/test/src/utils/containers/sorted_by.cc line 27 at r7 (raw file):

    }

    SUBCASE("duplicates") {

Suggestion:

    SUBCASE("container contains duplicate elements") {

lib/utils/test/src/utils/containers/sum_where.cc line 17 at r7 (raw file):

      int result = sum_where(input, condition);
      CHECK(correct == result);
    }

Add subcase for if the container starts empty

lib/utils/test/src/utils/containers/vector_split.cc line 35 at r7 (raw file):

    }

    SUBCASE("Out of bounds case: idx == list_size + 1") {

Isn't the first out of bounds index list_size and not list_size+1, e.g., if I have a list l of 1 element, l.at(1) would throw an error

lib/utils/test/src/utils/random_utils.cc line 28 at r7 (raw file):

    SUBCASE("correct distribution") {
      auto check_probabilities = [](std::vector<int> values,

Suggestion:

      auto check_probabilities = [](std::vector<int> const &values,

lib/utils/test/src/utils/containers/get_one_of.cc line 16 at r7 (raw file):

    SUBCASE("empty set") {
      std::unordered_set<int> s = {};
      CHECK_THROWS_AS(get_one_of(s), std::runtime_error);

Suggestion:

      CHECK_THROWS(get_one_of(s));

lockshaw

Reviewed 3 of 73 files at r7.
Reviewable status: 137 of 148 files reviewed, 47 unresolved discussions (waiting on @Marsella8)

Pietro Max Marsella added 3 commits August 9, 2024 22:42

removed GraphInternal

932979d

minor changes

cbc3de4

Added views.cc testing

018bb40

Marsella8 requested a review from lockshaw August 11, 2024 06:01

Marsella8 commented Aug 11, 2024

View reviewed changes

Fixed containers testing

72d61c7

Marsella8 commented Aug 11, 2024

View reviewed changes

Pietro Max Marsella added 7 commits August 11, 2024 16:39

views.cc fix

b47b801

test reorganizing

3cabb66

fmt

f3ae8a9

test rearranging

2d148db

rearranging

86e7dd0

graph fixes

c14a34b

undirected graph test fix

b870a44

Marsella8 marked this pull request as ready for review August 12, 2024 01:41

Updated docs

14c930b

jiazhihao added the repo-refactor Topics related to the repo and search refactors label Aug 22, 2024

Merge branch 'repo-refactor' into graph-tests

d6b287e

lockshaw requested changes Aug 27, 2024

View reviewed changes

Pietro Max Marsella added 2 commits September 10, 2024 00:53

merge with repo-refactor

007ffe9

Test updates + bug fixes

5f1bd6d

Marsella8 commented Sep 12, 2024

View reviewed changes

minor fix

a371c02

Marsella8 commented Sep 12, 2024

View reviewed changes

lockshaw requested changes Sep 13, 2024

View reviewed changes

lockshaw mentioned this pull request Sep 17, 2024

Add unit tests for utils/graph library #1224

Closed

2 tasks

PR fixes

475ace5

Marsella8 commented Sep 24, 2024

View reviewed changes

lockshaw requested changes Sep 27, 2024

View reviewed changes

lockshaw reviewed Sep 27, 2024

View reviewed changes

This was referenced Oct 9, 2024

Reenable utils tests #1336

Open

Add unit tests for labelled graphs in utils #986

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Graph Tests #1464

Graph Tests #1464

Marsella8 commented Aug 11, 2024 •

edited by wmdi

Loading

Marsella8 left a comment

codecov bot commented Aug 11, 2024 •

edited

Loading

Marsella8 left a comment

lockshaw left a comment

Marsella8 left a comment

Marsella8 left a comment

Marsella8 left a comment

lockshaw left a comment

Marsella8 left a comment

lockshaw left a comment

lockshaw left a comment

	// example from
	// https://en.wikipedia.org/w/index.php?title=Dominator_(graph_theory)&oldid=1189814332

	DiGraph g = DiGraph::create<AdjacencyDiGraph>();

	std::vector<Node> n = add_nodes(g, 6);

	add_edges(g,
	{
	DirectedEdge{n.at(0), n.at(1)},
	DirectedEdge{n.at(1), n.at(2)},
	DirectedEdge{n.at(1), n.at(3)},
	DirectedEdge{n.at(1), n.at(5)},
	DirectedEdge{n.at(2), n.at(4)},
	DirectedEdge{n.at(3), n.at(4)},
	DirectedEdge{n.at(4), n.at(1)},
	});

Graph Tests #1464

Are you sure you want to change the base?

Graph Tests #1464

Conversation

Marsella8 commented Aug 11, 2024 • edited by wmdi Loading

Marsella8 left a comment

Choose a reason for hiding this comment

codecov bot commented Aug 11, 2024 • edited Loading

Codecov Report

Marsella8 left a comment

Choose a reason for hiding this comment

lockshaw left a comment

Choose a reason for hiding this comment

Marsella8 left a comment

Choose a reason for hiding this comment

Marsella8 left a comment

Choose a reason for hiding this comment

Marsella8 left a comment

Choose a reason for hiding this comment

lockshaw left a comment

Choose a reason for hiding this comment

Marsella8 left a comment

Choose a reason for hiding this comment

lockshaw left a comment

Choose a reason for hiding this comment

lockshaw left a comment

Choose a reason for hiding this comment

Marsella8 commented Aug 11, 2024 •

edited by wmdi

Loading

codecov bot commented Aug 11, 2024 •

edited

Loading