Performance enhancements for Model.agents #2251

quaquel · 2024-08-27T07:26:24Z

This PR is a performance enhancement for Model.agents. It emerged out of a discussion (#2224) on the weird scaling performance of the Boltzmann wealth model.

Key changes

model.agents now returns the agentset as maintained by the model, rather than a new copy based on the hard references
agent registration and deregistration have been moved from the Agent into the model. The agent now calls model.register and model.deregister. This encapsulates everything cleanly inside the model class and makes Agent less dependent on the inner details of how Model manages the hard references to agents
the setup of the relevant datastructures is moved into its own helper method, again this cleans up code.
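
To make the first key change concrete, here is a minimal sketch of the difference between rebuilding a collection on every access and returning one that is maintained during registration. It is not Mesa's code: the class names are hypothetical, and a plain set stands in for AgentSet.

```python
class CopyingModel:
    """Hypothetical stand-in: rebuilds a fresh collection on every access."""

    def __init__(self):
        self._agents = {}  # hard references, agent -> None

    def register_agent(self, agent):
        self._agents[agent] = None

    @property
    def agents(self):
        # O(n) work every time the property is read
        return set(self._agents)


class MaintainingModel:
    """Hypothetical stand-in: keeps one collection up to date and returns it."""

    def __init__(self):
        self._agents = {}          # hard references, agent -> None
        self._all_agents = set()   # a plain set stands in for AgentSet

    def register_agent(self, agent):
        # slightly more work per registration ...
        self._agents[agent] = None
        self._all_agents.add(agent)

    @property
    def agents(self):
        # ... but reading the property is O(1)
        return self._all_agents


copying, maintaining = CopyingModel(), MaintainingModel()
for i in range(5):
    copying.register_agent(f"agent-{i}")
    maintaining.register_agent(f"agent-{i}")

copy_is_fresh = copying.agents is not copying.agents    # new object per access
same_object = maintaining.agents is maintaining.agents  # one maintained object
```

In this sketch the extra bookkeeping happens once per registration, while the saving applies to every read of the property.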

github-actions · 2024-08-27T07:31:13Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
Schelling	small	🔴 +77.0% [+76.3%, +77.6%]	🔵 -0.4% [-0.7%, -0.1%]
Schelling	large	🔴 +94.2% [+92.9%, +95.3%]	🔵 -0.8% [-2.9%, +0.7%]
WolfSheep	small	🔴 +130.1% [+128.1%, +131.9%]	🟢 -97.5% [-97.5%, -97.4%]
WolfSheep	large	🔴 +130.8% [+129.1%, +132.4%]	🟢 -99.9% [-99.9%, -99.9%]
BoidFlockers	small	🔴 +83.6% [+82.2%, +84.7%]	🔵 -0.3% [-1.0%, +0.6%]
BoidFlockers	large	🔴 +83.1% [+82.0%, +84.1%]	🔵 -0.6% [-1.2%, +0.0%]

EwoutH

Thanks, looks interesting!

EwoutH · 2024-08-27T07:45:40Z

@quaquel can you reproduce the speedup of WolfSheep locally?

Edit: one of the benchmark models also gives a warning now:

FutureWarning: The Mesa Model class was not initialized. In the future, you need to explicitly initialize the Model by calling super().__init__() on initialization.
self.model.register_agent(self)

quaquel · 2024-08-27T07:59:02Z

I'll try to figure out what is happening with the init times here. It's probably overhead from setting up the additional data structures, which we get back when running the model (at least for wolf sheep?).

Corvince · 2024-08-27T08:09:17Z

Thanks for this PR, I like the way it encapsulates the logic much better.

I think it is a good default to return the actual agentset and not just a copy. I think this is actually more intuitive.

Regarding the copy function, doesn't agentset.select() already do that? I'm not arguing against adding an additional method but I just want to mention that it's already possible and it should rightfully be discussed separately from this PR

quaquel · 2024-08-27T09:02:32Z

Regarding the copy function, doesn't agentset.select() already do that? I'm not arguing against adding an additional method but I just want to mention that it's already possible and it should rightfully be discussed separately from this PR

Yes, select does this as a corner case. But fair enough to separate copy into a separate PR.

quaquel · 2024-08-27T13:31:07Z

@quaquel can you reproduce the speedup of WolfSheep locally?

Reran the benchmarks locally multiple times. Behavior is rather varied. In general, init times are always up, but not by the same percentage as shown above. Typically, init is between 5% and 15% slower. I cannot see the massive speedups on wolf sheep. Rather, I sometimes see modest increases and sometimes modest decreases. I am going to dig a bit more to see what is going on.

github-actions · 2024-08-28T10:02:50Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
BoltzmannWealth	small	🔴 +16.5% [+14.0%, +19.0%]	🟢 -12.1% [-12.2%, -11.9%]
BoltzmannWealth	large	🔴 +47.5% [+21.2%, +84.9%]	🟢 -17.8% [-20.0%, -15.5%]
Schelling	small	🔴 +10.5% [+10.1%, +10.9%]	🔵 -0.7% [-1.0%, -0.4%]
Schelling	large	🔴 +13.7% [+12.8%, +14.7%]	🔵 +0.6% [-1.5%, +2.6%]
WolfSheep	small	🔴 +21.8% [+20.3%, +23.3%]	🔵 -0.2% [-3.6%, +3.3%]
WolfSheep	large	🔴 +20.0% [+19.1%, +20.9%]	🔵 +4.3% [+2.1%, +6.7%]
BoidFlockers	small	🔴 +12.9% [+12.0%, +13.8%]	🔵 +0.6% [-0.2%, +1.2%]
BoidFlockers	large	🔴 +12.4% [+11.7%, +13.2%]	🔵 -0.1% [-0.3%, +0.1%]

rht · 2024-08-28T10:17:25Z

mesa/model.py

    @property
    def agent_types(self) -> list[type]:
        """Return a list of different agent types."""
-        return list(self.agents_.keys())
+        return list(self._agents_by_type.keys())


The previous and the new version have different meanings altogether (a list of the type of each agent vs. list(set(agent types))). I suppose the docstring is ambiguous, even though it is closer to the former meaning.

That's not correct. In the old version self.agents_ was the exact same datastructure as what is now _agents_by_type. In the old version: self.agents_: defaultdict[type, dict] = defaultdict(dict). In the new version: self._agents_by_type: dict[type, AgentSet] = {}. So, in either case, keys is a list of types.

That means it still stands that the docstring is not clear enough.

This docstring has not been changed by me in this PR, so it is already in the existing code. If you suggest changing it, that would be fine, but why would that hold up this PR?

Also, what do you suggest changing? The return is a list, and this list contains the types of the agents in the model. The only, minor, ambiguity is that it is a list of the unique types of agents, not the type for each agent in the model.
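
The point about the keys can be checked with a small sketch. It only mirrors the two type annotations quoted above; the agent classes are hypothetical and a plain set stands in for AgentSet.

```python
from collections import defaultdict


class Wolf:
    pass


class Sheep:
    pass


agents = [Wolf(), Sheep(), Sheep()]

# Old layout as quoted above: defaultdict[type, dict]
old_style = defaultdict(dict)
for agent in agents:
    old_style[type(agent)][agent] = None

# New layout as quoted above: dict[type, AgentSet] (a set stands in here)
new_style = {}
for agent in agents:
    new_style.setdefault(type(agent), set()).add(agent)

old_types = list(old_style.keys())
new_types = list(new_style.keys())

# The other reading of the docstring: one entry per agent
per_agent_types = [type(agent) for agent in agents]
```

Both layouts are keyed by type, so both key lists hold each type once, whereas the per-agent reading would repeat Sheep.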

EwoutH · 2024-08-28T10:33:47Z

Let me review it in depth tonight (I'm trying to stay in the model-building flow)

EwoutH · 2024-08-28T17:35:11Z

Do we need to modernize BoltzmannWealth first to see the performance changes? Because 20% speedup in the large case with 10.000 agents doesn't seem that much.

quaquel · 2024-08-28T17:56:15Z

This version of the model is on a grid. #2224 is with all agents in the model. So, I am not sure what to expect in the first place.

EwoutH · 2024-08-28T18:41:00Z

Yeah so the Grid operations might be the bottleneck. Sounds logical.

I will try to update the benchmarks later tonight or tomorrow, and then we can bring this one in.

I'm also curious if we can speed up shuffle() at some point. With a model with 60.000 agents it's getting slow (also with inplace=True), but most notably, it's about the only AgentSet operation that's getting slow.

quaquel · 2024-08-28T18:50:33Z

I'm also curious if we can speed up shuffle() at some point. With a model with 60.000 agents it's getting slow (also with inplace=True), but most notably, it's about the only AgentSet operation that's getting slow.

In the case of inplace=True, the slowdown happens either when creating the list from the keys, when applying random.shuffle, or when rebuilding the dict from the shuffled keys. None of these is easy to speed up any further.
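
The three steps named above can be sketched as follows. This is an illustration under assumptions, not AgentSet's implementation: shuffle_inplace is a hypothetical helper, and a plain dict stands in for whatever mapping the AgentSet uses internally.

```python
import random


def shuffle_inplace(agents: dict, rng: random.Random) -> None:
    """Hypothetical helper: reorder a dict's entries in place."""
    keys = list(agents)                          # 1. create a list from the keys
    rng.shuffle(keys)                            # 2. shuffle that list
    shuffled = {key: agents[key] for key in keys}  # 3. rebuild the dict
    agents.clear()
    agents.update(shuffled)


rng = random.Random(42)
agents = {f"agent-{i}": None for i in range(10)}
before = list(agents)
shuffle_inplace(agents, rng)
after = list(agents)
```

Each step touches every entry once, so the total cost grows linearly with the number of agents.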

EwoutH

I really like the model.register / deregister. It's very clear and cleanly separated, and the right way around.
model.agents now is the ground truth right?
I would really appreciate an overview of all the variables and private variables we now keep around agents, agent types and AgentSets.

EwoutH · 2024-08-28T20:02:34Z

mesa/model.py

+        self._agents = {}
+        self._agents_by_type: dict[type, AgentSet] = {}
+        self._all_agents = AgentSet([], self)


Okay, so now we're keeping 4 records of all our agents (if we count self.agents)? Could you explain a bit why each is needed?

I think this can be a place where some performance is lost, in models that frequently kill and create agents (like wolf-sheep)?

self._agents contains the hard refs, so this is essential.

self._all_agents is the agentset with all agents. The main performance motivation for this PR. So again essential.

The only one you could debate is _agents_by_type. You could also compute this via self._all_agents.groupby(types).groups and expose it as a property. Which is better from a performance standpoint is hard to say. Adding and removing entries from dicts is relatively cheap.

Thanks, appreciated. Could you add this in a comment or docstring behind/above each?

self.agents is the only thing we formally expose; all others are private and use-at-your-own-risk.

For now I see the use case of a separate _agents_by_type. We can dive into the performance later.

And at least we got rid of self.agents_!

self.agents is the only thing we formally expose; all others are private and use-at-your-own-risk.

Which is exactly why I moved everything else into private. It keeps the public side the same as before.

I'll add a few comments on the datastructures later today
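
As a rough illustration of the bookkeeping discussed in this thread, the sketch below keeps the three records in sync on registration and deregistration, and also shows the compute-on-demand alternative for the per-type grouping. It is a simplified stand-in, not the code in mesa/model.py: SketchModel and agents_by_type_on_demand are hypothetical names, and plain sets replace AgentSet.

```python
class Wolf:
    pass


class Sheep:
    pass


class SketchModel:
    """Hypothetical model keeping the three records in sync."""

    def __init__(self):
        self._agents = {}          # hard references to every agent
        self._agents_by_type = {}  # type -> agents of that type
        self._all_agents = set()   # what the public `agents` property returns

    def register_agent(self, agent):
        self._agents[agent] = None
        self._agents_by_type.setdefault(type(agent), set()).add(agent)
        self._all_agents.add(agent)

    def deregister_agent(self, agent):
        del self._agents[agent]
        self._agents_by_type[type(agent)].remove(agent)
        self._all_agents.remove(agent)

    @property
    def agents(self):
        return self._all_agents

    def agents_by_type_on_demand(self):
        # Alternative: derive the grouping when asked instead of storing it
        groups = {}
        for agent in self._all_agents:
            groups.setdefault(type(agent), set()).add(agent)
        return groups


model = SketchModel()
wolf, sheep_a, sheep_b = Wolf(), Sheep(), Sheep()
for agent in (wolf, sheep_a, sheep_b):
    model.register_agent(agent)
model.deregister_agent(sheep_b)

stored = model._agents_by_type
derived = model.agents_by_type_on_demand()
```

Storing the grouping costs two extra dict or set operations per agent created or removed; deriving it costs a full pass over all agents per request. Which is cheaper depends on how often a model creates and removes agents versus how often it asks for agents by type.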

quaquel · 2024-08-28T20:12:03Z

model.agents now is the ground truth right?

no, model._agents is the hardref dict and the ground truth.

I would really appreciate an overview of all the variables and private variables we now keep around agents, agent types and AgentSets.

can we wait with that until we remove all the old 2.x stuff from the model?

github-actions · 2024-08-28T20:31:59Z

Performance benchmarks:

Model	Size	Init time [95% CI]	Run time [95% CI]
BoltzmannWealth	small	🔴 +31.7% [+29.3%, +34.0%]	🟢 -12.2% [-12.3%, -12.0%]
BoltzmannWealth	large	🔴 +53.7% [+25.6%, +93.0%]	🟢 -16.4% [-18.6%, -14.1%]
Schelling	small	🔴 +10.3% [+9.9%, +10.7%]	🔵 +0.4% [+0.2%, +0.6%]
Schelling	large	🔴 +14.4% [+13.8%, +15.1%]	🔵 +1.7% [+0.9%, +2.4%]
WolfSheep	small	🔴 +23.8% [+22.3%, +25.3%]	🔵 +0.6% [-2.8%, +4.1%]
WolfSheep	large	🔴 +23.6% [+22.6%, +24.6%]	🔴 +10.1% [+9.2%, +10.8%]
BoidFlockers	small	🔴 +11.0% [+10.4%, +11.5%]	🔵 -0.0% [-0.8%, +0.7%]
BoidFlockers	large	🔴 +9.2% [+8.4%, +10.1%]	🔵 +0.5% [-0.1%, +1.0%]

EwoutH · 2024-09-24T11:51:00Z

Should register_agent() and deregister_agent() be private methods?

quaquel · 2024-09-24T12:36:34Z

If you want to get technical, I guess in a Java sense they would be protected rather than private. They are not private because that would mean that Agent could not call model.register_agent. They are not public in the sense of user-facing.

Corvince · 2024-09-24T13:07:04Z

In Python speak, however, it is either a public or an internal method. Public is everything user-facing, documented, and guaranteed to be stable. So if there isn't any user-facing use case for these functions, I think they should indeed be made non-public with a leading underscore. This means it's meant to be used internally, which is exactly what's happening here.
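
A minimal sketch of the leading-underscore convention described here, with hypothetical stand-in classes. The name _register_agent is an assumption for illustration; this thread does not state whether or how the methods were renamed.

```python
class Model:
    """Hypothetical stand-in for a model with an internal registration hook."""

    def __init__(self):
        self._agents = {}

    def _register_agent(self, agent):
        # Leading underscore: internal, not part of the documented public API.
        # Python does not enforce this, so Agent can still call it.
        self._agents[agent] = None

    @property
    def agents(self):
        # Public, user-facing access point
        return list(self._agents)


class Agent:
    def __init__(self, model):
        self.model = model
        model._register_agent(self)  # internal use by the framework itself


model = Model()
agent = Agent(model)
registered = agent in model.agents
```

Unlike Java's protected, the underscore is only a signal to users; nothing prevents other code from calling the method.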

EwoutH · 2024-09-24T13:15:46Z

Thanks for the explanation Jan, and I agree with Corvince.

quaquel added 15 commits January 31, 2024 18:59

further updates

d90b0f7

Update benchmarks/WolfSheep/__init__.py

9586490

Merge remote-tracking branch 'upstream/main'

4aaa35d

Merge remote-tracking branch 'upstream/main'

d31478c

Merge remote-tracking branch 'upstream/main'

6e4c72e

Merge remote-tracking branch 'upstream/main'

70fbaf5

Merge remote-tracking branch 'upstream/main'

724c8db

Merge remote-tracking branch 'upstream/main'

45184a4

Merge remote-tracking branch 'upstream/main'

3d75d30

test

2bbdcab

more plots

7f29095

ongoing

cee56ad

ongoing changes

e31170d

ongoing work

931967b

complete moving registration and deregistration into Model

af30fe4

quaquel added the 2 - WIP, Performance, and breaking (release notes label) labels Aug 27, 2024

pre-commit-ci bot and others added 2 commits August 27, 2024 07:26

[pre-commit.ci] auto fixes from pre-commit.com hooks

f6d85fe

for more information, see https://pre-commit.ci

cleanup

c783316

quaquel requested a review from rht August 27, 2024 07:28

EwoutH reviewed Aug 27, 2024

quaquel added 2 commits August 27, 2024 13:52

fix when user warning is raised

8eb93c2

cleanup

4a0e36a

rht reviewed Aug 28, 2024

rht reviewed Aug 28, 2024

EwoutH reviewed Aug 28, 2024

EwoutH approved these changes Aug 28, 2024

EwoutH added and removed the trigger-benchmarks label (special label that triggers the benchmarking CI) Aug 28, 2024

quaquel and others added 2 commits August 29, 2024 08:49

minor docstring modifications

fbce5e3

[pre-commit.ci] auto fixes from pre-commit.com hooks

2c73f34

for more information, see https://pre-commit.ci

quaquel removed the 2 - WIP label Aug 29, 2024

quaquel merged commit de01c3f into projectmesa:main Aug 29, 2024
10 of 12 checks passed

quaquel deleted the boltzman branch August 29, 2024 07:01

EwoutH added the deprecation label (when a new deprecation is introduced) Aug 30, 2024

This was referenced Aug 30, 2024

Write Mesa 3.0 migration guide #2233

Closed

Clean-up private variables (_agents, _step) #2212

Closed

EwoutH added the enhancement label and removed the breaking label (both release notes labels) Aug 30, 2024

Corvince mentioned this pull request Sep 25, 2024

Migration guide: Add note about removing Agents from the model #2326

Merged
