Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

performance improvement for A* algorithm #68

Merged
merged 19 commits into from
Apr 23, 2024

Conversation

bcrobin03
Copy link
Contributor

@bcrobin03 bcrobin03 commented Mar 25, 2024

Changes

The changes made are:

  • replacement of the datastructure used for open nodes in the A* algorithm. I replaced the ordered set by a heap which allow to fetch the closest nodes in O(1) instead of O(n)
  • adapted the code to account for the datastructure change (minor change)
  • to further accelerate the code I change the np.clip to a max(min(... which is faster due to the nature of the input (no need of vectorization therefore
  • finally replaced forbidden_cells (found in rail_generators, cell where the path cannot go) list by a set which allow to check if the cell is forbidden in O(1) in general
  • Deactivate the 2 following tests in test_flatland_envs_sparse_rail_generators.py : test_sparse_rail_generator() and test_sparse_rail_generator_deterministic() because they relied heavily on the datastructure used in A*

Checklist

*I used heapq which is from the standard library so I did not change requirements

  • I encountered some trouble with the A* test, indeed as I use a heap now the path taken is not the one expected by the test as it does not solve ties the same way as the ordered set, however the length, and destination are equivalent (algorithm does indeed find a way that is the least costly). I considered that it would maybe be interesting to not test to a particular path but only test on the length of the path. Therefore this code does not fullfil the requirements to pass the tests, but maybe we can discuss it ?

  • I claim that it is faster but how much ? I tested different set of parameters and found the following gain of performance (this is in seconds), by timing the function env.reset():
    here is an idea of the gain of performance of A* heap VS orderedSet:

image

Here is an idea of the gain of performance respective to the size of the grid with 5 cities:
Careful I labeled the x-axis as number of cities where it is in fact the grid size (square)
execution_time_plot

@bcrobin03 bcrobin03 requested a review from a team as a code owner March 25, 2024 18:25
aiAdrian
aiAdrian previously approved these changes Mar 25, 2024
Copy link
Contributor

@aiAdrian aiAdrian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work !

@aiAdrian
Copy link
Contributor

@gdalle
Copy link
Contributor

gdalle commented Mar 26, 2024

I encountered some trouble with the A* test, indeed as I use a heap now the path taken is not the one expected by the test as it does not solve ties the same way as the ordered set, however the length, and destination are equivalent (algorithm does indeed find a way that is the least costly). I considered that it would maybe be interesting to not test to a particular path but only test on the length of the path. Therefore this code does not fullfil the requirements to pass the tests, but maybe we can discuss it ?

@aiAdrian the failed tests are an expected consequence of this change in the underlying algorithm, because the new priority queue breaks ties in a different way.
Hence the suggestion to rethink the tests, and restrict them to invariant properties of the generation algorithm (length and endpoints of each path between cities) instead of implementation-dependent details (the exact path taken). What do you think?

@aiAdrian
Copy link
Contributor

aiAdrian commented Mar 26, 2024

I suggest fixing or disabling the tests. then merge and open a new issue to rethink and rebuild the tests, respectively. I think it's a good idea to build the test as you suggested.

Please create the issue and disable the test with a link to the new issue - or just fix the test in a first run. A soon a this is done, i will do the review again.

Thank you (@gdalle ) for this very nice work

@gdalle
Copy link
Contributor

gdalle commented Mar 26, 2024

Most of the credit goes to @bcrobin03 :) I'll let him choose between fixing the tests or selectively disabling them, and open a new issue to discuss a redesign

@gdalle
Copy link
Contributor

gdalle commented Mar 26, 2024

@bcrobin03 can you also clean up this branch to remove the notebook diffs and keep only the core code? The number of lines changed is alarming.

@bcrobin03
Copy link
Contributor Author

@bcrobin03 can you also clean up this branch to remove the notebook diffs and keep only the core code? The number of lines changed is alarming.

I am looking into it

@gdalle gdalle requested a review from aiAdrian April 16, 2024 06:26
@gdalle
Copy link
Contributor

gdalle commented Apr 16, 2024

@aiAdrian I think this is ready!

Copy link
Contributor

@aiAdrian aiAdrian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for this really great performance improvement!

@aiAdrian aiAdrian merged commit 8c13fa2 into flatland-association:main Apr 23, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants