
Stat events#36

Open
Magentah wants to merge 29 commits into master from stat-events

Contributor

@Magentah Magentah commented May 12, 2025

W3C Events

Implements a W3CEvents Lua library that can be used to add custom events to WC3 replays.

The purpose of this is to allow map makers to add custom events that can then be used for enhanced replay analysis. This also allows W3C to add enhanced replay analysis to all melee games on the W3C ladder.

Below is an example of a CreepKill event that can be used to identify where the creep was killed, which unit killed it, which player killed it, and what time in game it was killed.

{
  "event": "CreepKill",
  "index": 92,
  "name": "Apprentice Wizard",
  "player": 1,
  "time": 215,
  "sequence": 92,
  "killingUnit": "Blademaster",
  "dyingUnitX": 616.25,
  "dyingUnitY": 3527
}

Implementation

The implementation is split across a few files, each with its own responsibility.

Utilities

  • base64.lua is a Lua base64 implementation used for encoding/decoding payloads
  • debugUtils.lua prints error messages and logs during games to help with debugging
  • json.lua is used for testing and debugging by printing Lua tables as JSON for easier readability
  • libDeflate.lua is a Lua implementation of the Deflate compression algorithm, used for compressing and decompressing payloads
  • syncedTable.lua is a multiplayer synced table library that is not currently used

w3cbitbuffer.lua

This file is used to handle all reading and writing of data to payloads that will be sent using BlzSendSyncData. The payloads go through a series of compression steps.

  • Signed integers use zigzag encoding
  • Non-string values are bit-packed into a byte array
  • Float values are not compressed. They're byte-aligned, then written directly
  • String values are byte-aligned, then compressed with Deflate, then stored as:
    [ 2-byte big-endian length ] [ compressed bytes ]

The purpose of this is to keep payloads as small as possible, as the WC3 client has a hard limit of 255 bytes per BlzSendSyncData call and a hard limit of 4kb/s of data transfer. Exceeding these limits will cause high latency and can potentially crash the game to desktop.
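
To illustrate the first two steps, here is a rough TypeScript sketch of zigzag encoding and bit-packing. It is not the code in w3cbitbuffer.lua: `zigzagEncode`, `zigzagDecode` and `BitWriter` are made-up names, and the MSB-first bit order and 32-bit integer range are assumptions.

```typescript
// Zigzag maps signed integers to unsigned ones so that small magnitudes
// stay small: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
// Assumes values fit in a signed 32-bit integer.
function zigzagEncode(n: number): number {
  return ((n << 1) ^ (n >> 31)) >>> 0;
}

function zigzagDecode(z: number): number {
  return (z >>> 1) ^ -(z & 1);
}

// Hypothetical bit writer: packs values of arbitrary bit width back to back.
class BitWriter {
  private bytes: number[] = [];
  private bitCount = 0;

  // Append the lowest `width` bits of `value` (width <= 32), MSB first.
  writeBits(value: number, width: number): void {
    for (let i = width - 1; i >= 0; i--) {
      const bit = (value >>> i) & 1;
      const byteIndex = this.bitCount >> 3;
      if (byteIndex === this.bytes.length) this.bytes.push(0);
      this.bytes[byteIndex] |= bit << (7 - (this.bitCount & 7));
      this.bitCount++;
    }
  }

  // Skip to the next byte boundary (e.g. before writing a float or string).
  alignToByte(): void {
    this.bitCount = (this.bitCount + 7) & ~7;
  }

  toBytes(): Uint8Array {
    return new Uint8Array(this.bytes);
  }
}
```

With this sketch, a 4-bit signed value, a 1-bit flag and a 12-bit time fit in 3 bytes instead of one byte or more per field.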

w3cChecksum.lua

CRC32 implementation in Lua. Used to generate checksum values that are sent periodically to ensure all players in a game have the same events.
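
For reference, a minimal bitwise CRC-32 in TypeScript could look like the following. It uses the common reflected polynomial 0xEDB88320 (the variant used by zlib and PNG); whether w3cChecksum.lua uses exactly this variant is an assumption, and `crc32` is just an illustrative name.

```typescript
// Bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and final
// XOR of 0xFFFFFFFF). Slower than a table-driven version but very short.
function crc32(bytes: Uint8Array): number {
  let crc = 0xffffffff;
  bytes.forEach((b) => {
    crc ^= b;
    for (let k = 0; k < 8; k++) {
      // If the low bit is set, shift and XOR with the polynomial.
      crc = (crc >>> 1) ^ (0xedb88320 & -(crc & 1));
    }
  });
  return (crc ^ 0xffffffff) >>> 0;
}
```

Each player would compute the checksum over the same event bytes, and any mismatch between the periodically sent values would indicate diverging event streams.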

w3cdata.lua

This file handles the transport layer: it implements the schema registry, compresses payloads using w3cbitbuffer.lua, splits large payloads into smaller chunks, generates checksums, and decodes payloads back into schemas and values.
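
The chunking step could be sketched roughly as below. The two-byte header (1-based chunk index, chunk count) is a made-up layout for illustration and not the format used by w3cdata.lua; `splitIntoChunks` and `joinChunks` are hypothetical names. Null-byte removal is a separate step and is ignored here, and `joinChunks` assumes the chunks arrive in order.

```typescript
const MAX_PACKET_BYTES = 255; // BlzSendSyncData limit mentioned above
const HEADER_BYTES = 2;       // hypothetical header: [chunk index, chunk count]

function splitIntoChunks(payload: Uint8Array, maxPacket: number = MAX_PACKET_BYTES): Uint8Array[] {
  const bodySize = maxPacket - HEADER_BYTES;
  const count = Math.max(1, Math.ceil(payload.length / bodySize));
  if (count > 255) throw new Error("payload too large for a one-byte chunk count");
  const chunks: Uint8Array[] = [];
  for (let i = 0; i < count; i++) {
    const body = payload.subarray(i * bodySize, (i + 1) * bodySize);
    const chunk = new Uint8Array(HEADER_BYTES + body.length);
    chunk[0] = i + 1; // 1-based so the header itself never contains a null byte
    chunk[1] = count;
    chunk.set(body, HEADER_BYTES);
    chunks.push(chunk);
  }
  return chunks;
}

// Reassemble a payload from chunks that are already in order.
function joinChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, c) => sum + c.length - HEADER_BYTES, 0);
  const payload = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    payload.set(chunk.subarray(HEADER_BYTES), offset);
    offset += chunk.length - HEADER_BYTES;
  }
  return payload;
}
```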

Schemas are used to define the fields for a singular event. Each event must correspond to a schema before it is sent.

Schemas are sent in their own payloads, allowing events themselves to only require specifying the schema id for the event. This reduces the size of each event payload.
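
A simplified TypeScript sketch of the idea: the schema (event name plus ordered field names and types) is registered once, and each event then only carries the schema id and its values in field order. The types and the `SchemaRegistry` class here are hypothetical and much simpler than what w3cdata.lua and w3cschema.lua do.

```typescript
type FieldType = "int" | "float" | "string";
type FieldValue = number | string;

interface Field { name: string; type: FieldType; }
interface Schema { id: number; event: string; fields: Field[]; }
interface EncodedEvent { schemaId: number; values: FieldValue[]; }

class SchemaRegistry {
  private schemas: Schema[] = [];

  // The schema id is simply the registration order in this sketch.
  register(event: string, fields: Field[]): Schema {
    const schema: Schema = { id: this.schemas.length, event, fields };
    this.schemas.push(schema);
    return schema;
  }

  // Field names are dropped; only the id and the ordered values remain.
  encode(schema: Schema, record: Record<string, FieldValue>): EncodedEvent {
    const values = schema.fields.map((field) => {
      const value = record[field.name];
      if (value === undefined) throw new Error(`missing field ${field.name}`);
      return value;
    });
    return { schemaId: schema.id, values };
  }

  // The receiver looks the schema up by id to restore names.
  // Note: a field literally named "event" would clash with the event key.
  decode(encoded: EncodedEvent): Record<string, FieldValue> {
    const schema = this.schemas[encoded.schemaId];
    if (schema === undefined) throw new Error(`unknown schema id ${encoded.schemaId}`);
    const record: Record<string, FieldValue> = { event: schema.event };
    schema.fields.forEach((field, i) => {
      const value = encoded.values[i];
      if (value === undefined) throw new Error(`missing value for ${field.name}`);
      record[field.name] = value;
    });
    return record;
  }
}
```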

COBS encoding is used to remove null bytes, as BlzSendSyncData will truncate on null bytes.
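
For readers unfamiliar with COBS (Consistent Overhead Byte Stuffing), a minimal TypeScript sketch follows. It omits the trailing zero delimiter that framed COBS normally appends, since the goal here is only a zero-free byte string; `cobsEncode` and `cobsDecode` are illustrative names, not the functions in w3cdata.lua.

```typescript
// Each output block starts with a code byte: the distance to the next zero
// in the original data (or 0xFF for a full run of 254 non-zero bytes).
function cobsEncode(input: Uint8Array): Uint8Array {
  const out: number[] = [0]; // placeholder for the first code byte
  let codeIndex = 0;
  let code = 1;
  input.forEach((byte) => {
    if (byte !== 0) {
      out.push(byte);
      code++;
    }
    if (byte === 0 || code === 0xff) {
      out[codeIndex] = code;   // close the current block
      codeIndex = out.length;  // and open a new one
      out.push(0);
      code = 1;
    }
  });
  out[codeIndex] = code;
  return new Uint8Array(out);
}

function cobsDecode(encoded: Uint8Array): Uint8Array {
  const out: number[] = [];
  let i = 0;
  while (i < encoded.length) {
    const code = encoded[i];
    if (code === 0) throw new Error("unexpected zero byte in COBS data");
    i++;
    for (let k = 1; k < code; k++) {
      if (i >= encoded.length) throw new Error("truncated COBS block");
      out.push(encoded[i]);
      i++;
    }
    // A block shorter than 0xFF stands for a zero, unless it is the last block.
    if (code !== 0xff && i < encoded.length) out.push(0);
  }
  return new Uint8Array(out);
}
```

The overhead is at most one extra byte per 254 bytes of input plus one leading code byte, which matters given the 255-byte packet limit.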

w3cEvents.lua

The main library file that users will use. This exports a minimal set of functions required to use the library. Users will primarily use the initialize, event and track functions. Internally it also buffers events to reduce the number of BlzSendSyncData calls, and periodically sends checksum events to ensure all players have consistent events.
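
The buffering could work roughly like this sketch: encoded events accumulate until adding another one would exceed a size budget, at which point everything pending is sent as one packet. `EventBuffer` and its `send` callback are hypothetical; in the real library the callback would end up calling BlzSendSyncData, and rate limiting between flushes is left out here.

```typescript
class EventBuffer {
  private pending: Uint8Array[] = [];
  private pendingBytes = 0;
  private readonly maxBytes: number;
  private readonly send: (packet: Uint8Array) => void;

  constructor(maxBytes: number, send: (packet: Uint8Array) => void) {
    this.maxBytes = maxBytes;
    this.send = send;
  }

  // Queue one encoded event, flushing first if it would not fit.
  add(encoded: Uint8Array): void {
    if (encoded.length > this.maxBytes) throw new Error("event larger than the packet budget");
    if (this.pendingBytes + encoded.length > this.maxBytes) this.flush();
    this.pending.push(encoded);
    this.pendingBytes += encoded.length;
  }

  // Concatenate everything pending into one packet and send it.
  flush(): void {
    if (this.pendingBytes === 0) return;
    const packet = new Uint8Array(this.pendingBytes);
    let offset = 0;
    for (const part of this.pending) {
      packet.set(part, offset);
      offset += part.length;
    }
    this.pending = [];
    this.pendingBytes = 0;
    this.send(packet);
  }
}
```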

w3cschema.lua

This file is used to manage the schemas that define events and their fields.

TypeScript

There is also a TypeScript definition file for w3cEvents.lua, allowing use of the library with TypeScript.

Events

The PR also introduces a number of events that can be added to maps. These are all defined in the TypeScript files under src/metrics.

Comment thread src/lua/w3cMetrics.lua Outdated
Comment thread src/metrics/heros.ts Outdated
Comment thread src/metrics/playerHeroDamage.ts Outdated
TriggerAddCondition(heroDamageTaken, Condition(checkIsPlayerHeroTarget));
TriggerAddAction(heroDamageTaken, onHeroDamaged);

W3CMetrics.track("HeroDamage", () => heroDamageMap, 5.0);
Member

That feels like it will EXPLODE in size. You're retrieving the same map every 5 seconds and you're never cleaning it during a game. And even if for a specific hero it has not changed since the last time, you're still emitting it.

It seems like the library (not this code here) should be able to detect diffs between maps/tables and remove data that has not changed.

Or rather, I'd say that we should reset this map on emit and submit the difference since the last time (it hence becomes a rate, rather than the absolute values).

Contributor Author

It probably can. Only sending deltas / changes is something I want to add to the base Lua library, which should handle this.
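
One possible shape for the delta approach discussed in this thread, as a hedged TypeScript sketch: keep a copy of what was last sent and emit only the keys whose value changed. `takeDelta` and `Totals` are hypothetical names, and this is not something the PR currently implements.

```typescript
type Totals = Record<string, number>;

// Returns only the changes since the previous call and updates `lastSent`
// in place so the next call diffs against the new baseline.
function takeDelta(current: Totals, lastSent: Totals): Totals {
  const delta: Totals = {};
  for (const key of Object.keys(current)) {
    const change = current[key] - (lastSent[key] ?? 0);
    if (change !== 0) delta[key] = change;
    lastSent[key] = current[key];
  }
  return delta;
}
```

An unchanged hero then produces no entry at all, and an empty delta means nothing needs to be emitted for that interval.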

Comment thread src/metrics/playerUnits.ts
Comment thread src/metrics/research.ts Outdated
Comment thread src/metrics/unitDeaths.ts
import * as W3CMetrics from "../lua/w3cMetrics"
import { Players } from "w3ts/globals";

export function trackUnitDeaths() {
Member

@marcoabreu marcoabreu May 14, 2025

Be aware that custom maps handle unit deaths VASTLY differently. They basically never ever really kill a unit, because it leaks memory.

Not relevant for melee, but just for you to keep in mind in terms of allowing to specify custom events for "UNIT_DEATH" which are not actually EVENT_PLAYER_UNIT_DEATH. Basically to register more than 1 event that corresponds to the same thing you want to track - in melee, the size of that list would be 1 (EVENT_PLAYER_UNIT_DEATH). In customs, it might be more.

The same applies to dmg calculation and unit creation btw.

Contributor Author

Anything in TS is only going to be used for W3C maps and not custom maps. Only the lua library would actually be used for custom maps, so I don't think it'll be a problem, but good to know!

Member

Is that something we would want to change?

Of course not everything would make sense, but I think it would be good if we could have a library of default collectors that custom maps could use, rather than them having to define them themselves.

Comment thread src/metrics/unitDeaths.ts Outdated
const killingPlayer = GetOwningPlayer(killingUnit);
const killingPlayerId = GetPlayerId(killingPlayer);

if (IsUnitType(killingUnit, UNIT_TYPE_HERO)) {
Member

This feels .. odd. Is that the right way to do this?

Contributor Author

No idea, I don't know anything about WC3 scripting. It works though, and at least makes sense to me that it's probably a right way to tell if a unit is a hero or not

Member

Sorry, I initially had my comment at a different line. Should've been more specific.

I mean, this whole approach of starting a timer and then reading the XP; is that the way to read the XP gained?

Contributor Author

The timer is used here because there's a slight delay before a unit receives xp after the unit dies. If you read it immediately, the xp hasn't updated yet. I dunno if there's a better way to do this, there's no event for xp gain from what I saw when searching.

Member

diewolke9 commented May 14, 2025

I am generally very impressed and satisfied with how quickly @Magentah made something quite comprehensive and mature. Obviously you're a far better and focused coder than I am, kudos to you for that.

Two things that I am concerned about here:

  • encoding, as I said above, this is probably not the most efficient encoding. I would think that there's a lot of space for optimization - for example, storing time as strings is a huge waste of bandwidth, same applies to event names (as I mentioned above). I think it's important to really focus on optimizing the encoding to the maximum, because that would mean a big improvement in performance under higher load (although I have not analysed what HIGH load means in this context)
  • the batching of events - I would make it opt-in. One reason for that is, again, that you can save bandwidth, because if you send each event whenever it fires, you don't need to store the time, as you can get that from the replay then, and the time will be more reliable than a timer (although I cannot attest to how reliable or unreliable timers are).

@marcoabreu
Member

We 100% need to keep batching on the communication level. We can't allow the map to just decide when it wants to emit data to our server. Remember, these are statistical events that have zero time pressure to be emitted.

@Magentah Magentah force-pushed the stat-events branch 10 times, most recently from 05f5946 to 21ae1f8 May 16, 2025 16:35
@Magentah
Contributor Author

> I am generally very impressed and satisfied with how quickly @Magentah made something quite comprehensive and mature. Obviously you're a far better and focused coder than I am, kudos to you for that.
>
> Two things that I am concerned about here:
>
>   • encoding, as I said above, this is probably not the most efficient encoding. I would think that there's a lot of space for optimization - for example, storing time as strings is a huge waste of bandwidth, same applies to event names (as I mentioned above). I think it's important to really focus on optimizing the encoding to the maximum, because that would mean a big improvement in performance under higher load (although I have not analysed what HIGH load means in this context)

The w3cdata.lua should hopefully be sufficient for handling this now.

>   • the batching of events - I would make it opt-in. One reason for that is, again, that you can save bandwidth, because if you send each event whenever it fires, you don't need to store the time, as you can get that from the replay then, and the time will be more reliable than a timer (although I cannot attest to how reliable or unreliable timers are).

I think this is less of a concern with w3cdata.lua. The time can be 2 bytes (for up to 65k, or 18 hours if the time is by second), or set to a specific bit length (e.g. 12 bits for up to 4k, or a bit over 1 hour). I think my approach at the moment will likely be to buffer all events until it hits a max size (180 probably, vaguely remember seeing that number somewhere for being a soft limit) then flush, with some limit to how quickly we flush to prevent being able to send dozens of SyncData packets all at once.
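
The bit-width arithmetic above can be checked with a small TypeScript helper (`bitsFor` and `maxValueFor` are names made up for this sketch): an unsigned field of n bits holds values up to 2^n - 1, so 16 bits cover 65,535 seconds (about 18.2 hours) and 12 bits cover 4,095 seconds (about 68 minutes).

```typescript
// Largest unsigned value that fits in `bits` bits.
function maxValueFor(bits: number): number {
  return 2 ** bits - 1;
}

// Smallest bit width able to store every integer in 0..maxValue.
function bitsFor(maxValue: number): number {
  let bits = 1;
  while (maxValueFor(bits) < maxValue) bits++;
  return bits;
}
```

For example, `bitsFor(3600)` gives 12, so a one-hour game timer in seconds fits in 12 bits.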

@Magentah Magentah force-pushed the stat-events branch 3 times, most recently from 59feaa7 to 0602caa May 17, 2025 15:11
W3CData:
- requires type=int to set bit sizes
- added 'number' type that allows setting minimum and maximum values instead
- Updated checksum payloads and added test
- Added schema registry payload and added test
- Made signed default instead of unsigned
- Added function to register multiple schemas at once
- Added assertions that numbers are integers when expected

W3CEvents:
- Init now only inits once
- Added end_game that sends all remaining events and prevents any more events. Also adds players results as the final events.
- Schemas are now required to be registered at once

json.lua added so we can easily compress and send schemas as we don't support nested objects
@Magentah Magentah marked this pull request as ready for review April 30, 2026 11:38
@Magentah
Contributor Author

Think this is finally no longer a draft @marcoabreu @diewolke9

Comment thread src/metrics/schemas.ts
Comment on lines +98 to +99
f("dyingUnitX", "float"),
f("dyingUnitY", "float"),
Contributor

@francislavoie francislavoie May 3, 2026

This should probably be x/y for consistency?

Contributor Author

It was to make clear it's the position of the dying unit not the killing unit, but I don't mind changing it

Contributor

I think it's pretty clear given the event name. But we can just document that kind of thing I think.


6 participants