Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace YulString with YulName typedef #15083

Merged
merged 1 commit into from
Jul 19, 2024
Merged

Conversation

clonker
Copy link
Member

@clonker clonker commented May 8, 2024

in preparation of type change of YulString s.t. these changes are more localized

@clonker clonker force-pushed the rename_yulstring_to_yulname branch 2 times, most recently from fdb537b to 1e91d0e Compare May 8, 2024 08:55
@nikola-matic
Copy link
Collaborator

What's this about? Is this regarding the Yul AST rework to get rid of the AST ID dependencies (or to at least strengthen them) in order then get rid of codegen non-determinism? Or is it something else?

@clonker
Copy link
Member Author

clonker commented May 8, 2024

The idea is to get rid of YulString as a kind of a leightweight string proxy in favor of a simple incremental uint64 index for handles (with a backing string vector of names) and do the string generation of derived names post-hoc

@nikola-matic
Copy link
Collaborator

The idea is to get rid of YulString as a kind of a leightweight string proxy in favor of a simple incremental uint64 index for handles (with a backing string vector of names) and do the string generation of derived names post-hoc

Alright, so killing the non-determinism in its root.

@clonker
Copy link
Member Author

clonker commented May 8, 2024

Depends on the level at which you're looking - but sure, restarting the optimization with an intermediate will kill determinism with that approach unless we do reordering after each step (which may not be so bad, let's see)

@clonker
Copy link
Member Author

clonker commented May 8, 2024

Hi @bshastry, as this changes some stuff in the fuzzing department as well, does the PR mess things up on your end or is it fine? :)

libyul/AST.h Outdated
using TypedNameList = std::vector<TypedName>;

/// Literal number or string (up to 32 bytes)
enum class LiteralKind { Number, Boolean, String };
struct Literal { langutil::DebugData::ConstPtr debugData; LiteralKind kind; YulString value; Type type; };
struct Literal { langutil::DebugData::ConstPtr debugData; LiteralKind kind; YulName value; Type type; };
Copy link
Member

@cameel cameel May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR seems to be just replacing all instances of YulString with YulName. That's not very useful. The idea was to rename only in those cases where YulString is used for identifiers. Those are the cases which we'd want to replace with numeric IDs and we'll do this by switching YulName to be some other type.

Literal is a good example of a situation where this must not happen. If you have a string "xyz", in the code, it must remain as "xyz" in the AST. It can't be replaced with some number that we'll only later remap back to the original value.

I think we don't want this for TypedName either. Types are defined in the dialect so renaming then just in the AST will make us unable to find the right type definition.

BTW, TypeName is probably a legacy thing at this point and maybe we'll remove it eventually. It is used in the typed Yul dialect, which was an intermediate step in the process of translating the code to Ewasm. Ewasm was dropped and I'm not sure if we'll have any other use for it. In the untyped dialect Type will always be just the default type.

As for FunctionDefinition, for that one using YulName seems correct, though I'm surprised it doesn't use a whole Identifier instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression you'd wanted to get rid of YulString altogether, my bad then.

As for Literal and TypedName: How would replacing a string "xyz" with some identifier and later remapping it be conceptually different from using YulString, which replaces the string "xyz" with a handle (id/hash) to later remap it?

@@ -59,7 +60,7 @@ struct VariableDeclaration { langutil::DebugData::ConstPtr debugData; TypedNameL
/// Block that creates a scope (frees declared stack variables)
struct Block { langutil::DebugData::ConstPtr debugData; std::vector<Statement> statements; };
/// Function definition ("function f(a, b) -> (d, e) { ... }")
struct FunctionDefinition { langutil::DebugData::ConstPtr debugData; YulString name; TypedNameList parameters; TypedNameList returnVariables; Block body; };
struct FunctionDefinition { langutil::DebugData::ConstPtr debugData; YulName name; TypedNameList parameters; TypedNameList returnVariables; Block body; };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may have a small problem here. With this change, user-defined functions will use YulName, while the built-in functions in the dialect will still be identified with YulString. Since the latter by design have fixed names, I'm not sure it's even feasible to use YulName for them. But then FunctionCall node will not be able to refer to them.

{

class YulString;
using YulName = YulString;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you're planning to add something more here, like utilities (or maybe make YulName into a class in the future), it would be enough to put it in AST.h. We'll need it pretty much only in Yul AST. In fact, if there is some situation where it does not seem appropriate to include the whole AST.h, it could be a sign that YulString there is not really used for identifiers.

@bshastry
Copy link
Contributor

Hi @bshastry, as this changes some stuff in the fuzzing department as well, does the PR mess things up on your end or is it fine? :)

Sorry for the late reply :-(

Since the ossfuzz build passed, I would assume it is fine with the fuzzer as well. Dynamic breakages (at runtime) need to be seen though, let me know if you want me to run the fuzzer on this PR. That should throw up issues if any!

@clonker
Copy link
Member Author

clonker commented May 27, 2024

Hi @bshastry, as this changes some stuff in the fuzzing department as well, does the PR mess things up on your end or is it fine? :)

Sorry for the late reply :-(

Since the ossfuzz build passed, I would assume it is fine with the fuzzer as well. Dynamic breakages (at runtime) need to be seen though, let me know if you want me to run the fuzzer on this PR. That should throw up issues if any!

Cheers, no worries. I expect this PR to be closed anyways and there's going to be a new one with a different set of changes. That will also impact the fuzzer, most likely. I'll let you know :)

Copy link

This pull request is stale because it has been open for 14 days with no activity.
It will be closed in 7 days unless the stale label is removed.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label Jun 10, 2024
@cameel cameel removed the stale The issue/PR was marked as stale because it has been open for too long. label Jun 11, 2024
Copy link

This pull request is stale because it has been open for 14 days with no activity.
It will be closed in 7 days unless the stale label is removed.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label Jun 25, 2024
@cameel cameel removed the stale The issue/PR was marked as stale because it has been open for too long. label Jun 26, 2024
Copy link

This pull request is stale because it has been open for 14 days with no activity.
It will be closed in 7 days unless the stale label is removed.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label Jul 10, 2024
Copy link
Member

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through all the substitutions here and, surprisingly, there's actually very little use of YulString for things that are not identifiers. The main case were the values of literals, but that has already been eliminated by #15112.

One case where I'm not 100% sure is the tracking of memory, storage and keccak values in DataflowAnalyser. I think YulString is only used to refer to variables there but without taking a deeper look, I'm not confident enough to say that we never put actual values in it.

There are a few cases, where it's kinda an identifier, but cannot just be freely replaced with a different one:

Other than these, I see YulString used only to name functions, variables, types and Yul objects.

As such, this change would actually be mostly fine, though IIRC last time we talked about it, we said that we won't be merging it.

If we do decide to proceed with it, the _yulstring suffix should be renamed to _yulname as well and the whole YulString class should be removed since it looks like we won't need it for anything other than names.

@github-actions github-actions bot removed the stale The issue/PR was marked as stale because it has been open for too long. label Jul 11, 2024
@clonker
Copy link
Member Author

clonker commented Jul 11, 2024

In #15215 I have done the following to address the points you brought up:

  • qualified data names are reduced to std::strings (see, f.ex., here)
  • "m"_yulstring and "s"_yulstring can be replaced by 0 and 1
  • There is a special empty YulName same as there is a special empty YulString

I am in the process of breaking down that way too big PR into smaller (or if not small, at least hopefully trivial) ones and we can see about each of the points in more detail then, so as far as I am concerned, we don't have to merge this.

@clonker clonker force-pushed the rename_yulstring_to_yulname branch from 1e91d0e to ead0291 Compare July 18, 2024 12:15
@clonker clonker force-pushed the rename_yulstring_to_yulname branch 3 times, most recently from 5ae46a2 to 46cf485 Compare July 18, 2024 12:28
@clonker clonker mentioned this pull request Jul 18, 2024
10 tasks
cameel
cameel previously approved these changes Jul 18, 2024
Copy link
Member

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine overall, just has a few places that mangle line wrapping for some reason.

Also, I think we may have been better off just renaming YulString in place, but since follow-up PRs are starting to pile up and I'm not sure if it wouldn't cascade into some more changes, I'm fine with merging this regardless.

libyul/AsmAnalysis.cpp Outdated Show resolved Hide resolved
libyul/YulString.h Show resolved Hide resolved
libyul/backends/evm/ControlFlowGraphBuilder.cpp Outdated Show resolved Hide resolved
test/libyul/KnowledgeBaseTest.cpp Outdated Show resolved Hide resolved
test/libyul/YulOptimizerTestCommon.cpp Outdated Show resolved Hide resolved
libyul/Object.h Outdated Show resolved Hide resolved
@clonker clonker force-pushed the rename_yulstring_to_yulname branch 2 times, most recently from 589ee29 to ba42473 Compare July 19, 2024 06:21
@clonker
Copy link
Member Author

clonker commented Jul 19, 2024

Thanks for the feedback! I think my editor was playing some tricks on me with the formatting:)

@clonker clonker force-pushed the rename_yulstring_to_yulname branch from ba42473 to 96fdcc3 Compare July 19, 2024 08:54
@clonker clonker merged commit 81a05f3 into develop Jul 19, 2024
72 checks passed
@clonker clonker deleted the rename_yulstring_to_yulname branch July 19, 2024 09:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants