Age | Commit message (Collapse) | Author |
|
|
|
It is way cheaper to retrieve the correct agent first (especially with
the faster agent_by_addr) and then check if it is a boss, rather than
iterating over all bosses (which iterates through all characters), and
then iterate through the found bosses to check if any address matches.
The new code is a bit longer (and doesn't combine the functions as
nicely), but it is still readable and more performant - which is more
important.
|
|
We know that the way we construct the Log, the agents are always sorted
by their address. This invariant cannot be broken, as the only way to
construct a Log is in evtclib itself, and there is no way to obtain a
mutable view on the agent vector or change the address of an agent.
Since Log::agent_by_addr is used by other functions, this speedup (even
if it is small) can show in a lot of different places.
Note that if we change the interface of Log in the future to allow
creating logs from different sources that processing::process, we need
to make sure we adjust this function.
|
|
Comparing the int is a very cheap operation, and it is also a very good
indicator already that we've found the right event. Due to the short
circuiting behaviour of &&, it is better to check that first before
doing the Log::is_boss check - which is relatively costly. Remember that
we do this check for every buff application event!
This brings a speedup of around 50x:
new: 654.1±25.18µs
old: 34.9±0.69ms
|
|
|
|
This may be useful for downstream applications and it fits into the
pattern of implementing it for Boss/Encounter.
|
|
If we already have an Encounter, it might be nice to determine the game
mode from it as well - without needing to go through the whole log
first.
This is especially useful for raidgrep, where we can use the early
filters - which don't have access to the whole Log item.
|
|
For a lot of applications, it can be useful to distinguish between logs
made in raids, fractals, ...
Note that we probably don't want further categorization (as for example
done in ezau).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Overall, evtclib is doing quite well on the .unwrap()/.expect()/panic!()
calls, except for some doctests (which can be changed at some point) and
the actual tests.
One case where we do panic (and should document it!) is Log::boss. The
documentation has been added there.
Another (rare if not impossible for proper evtc files) case was the
conversion of the language event, which assumed that we will definitely
be able to convert the u64 to the right language. In all normal cases
this should be true, but if evtclib deals with untrusted input, we might
not want to panic a whole program because someone smuggled in a
malicious file.
|
|
The wording on the evtc README is
> an npcid of 1 indicates log is generic - triggers by squadmember
> entering combat and not as a result of a boss species id.
It is not quite clear whether "generic" implies WvW or if there are PvE
generic logs as well if you manage to set up arcdps in the right way.
Therefore, the wording in evtclib is rather unspecific as well.
|
|
|
|
|
|
The old code seemed to choke on WvW players, as the subgroup-calculating
code did 0 - b'0', which underflowed. The proper way to parse a subgroup
is not to take a single character anyway, because subgroups can be
bigger than 10.
The new code fixes that by properly extracting the "subgroup str
literal" and then parsing it as an integer, with some special logic to
detect an "empty" subgroup as it is in WvW.
|
|
|
|
|
|
It turns out that arcDPS gives each boon stack an ID which is also
re-used in the BuffRemoval or StackReset events.
|
|
As it turns out, the padding bytes are not just padding, but for some
events they contain useful information. Therefore, we've adjusted the
parser to save those bytes (if available).
|
|
|
|
|
|
|
|
When this function was written, it was done under the assumption that
a) There are not a lot of agents, so linear search is fast enough
and
b) We just want it to work for now
However, it turns out that there can be a lot of agents, close to 1000
for the Qadim log for example. This means that there is quite some time
saving that we can do here, as get_agent_by_addr is used a lot in
set_agent_awares and set_agent_masters, so speeding this part up is
good!
We could build a HashMap, mapping the address to the agent (index), but
that would mean that we have to carry the hash map around. This patch
provides a simpler yet already good improvement: We invest a bit of time
after converting all agents to sort them by their address (as the agent
order is implementation defined anyway), so we can later use a binary
search to get the right agent. It's not O(1), as a hash map would be,
but it works in logarithmic time and already provides a big benefit:
Before
process Qadim time: [39.444 ms 39.501 ms 39.561 ms]
After
process Qadim time: [18.699 ms 18.744 ms 18.788 ms]
change: [-52.672% -52.546% -52.413%] (p = 0.00 < 0.05)
That is half of the processing time saved by a 3 line patch!
|
|
No particular reason other than it's more idiomatic and shorter.
inline has been added to get_agent_by_addr since it "feels" like a short
function that can and should be inlined, but it doesn't matter too much.
|
|
|
|
This adds some simple benchmarks to test the speed of
evtclib::process_{file,stream}, just so we can quickly assess on a
high-level if changes make a big impact.
I'd like to add some more benchmarks in the future as well, mostly
- on the higher level when we go from RawEvtc to a log, benchmarking the
process function itself (thus not benchmarking all the byte-twiddling
deserialization code).
- on the Analyzer level
|
|
|
|
It makes sense to expose this logic as a function, as other programs
like raidgrep might want to use the same logic when dealing with partial
evtc files.
|
|
|
|
|
|
|
|
|
|
|
|
Just like with Event, we now have Agent defined in its own submodule.
The amount of code that it entailed was a lot, so it made sense to split
it off, especially with the deserialization being another big chunk of
Agent related code in lib.rs
The main issue was that the processing submodule accessed private fields
of the Agent struct, which is now no longer possible (since processing
is no longer a submodule of the module in which Agent is defined).
Therefore, some simple crate-public setters for those fields have been
added. Those setters are not public because we do not want outside
crates to mess with the innards of Agent (yet).
Although with (de)serialization being a thing, we need to ensure that we
can handle nonsensical values anyway, since we can no longer guarantee
that we have full control over all of the values, even without setters.
|
|
The rationale is included in the comment below. The gist is that we want
to avoid deserializing Agent<Player> (and others) directly, as that
would circumvent the checks.
As a small bonus, we now skip the phantom_data field in serialization,
as that's just an implementation detail that other consumers shouldn't
worry about.
|
|
If the reward has been given out, we can be 99.9% sure that the fight
succeeded, in which case we don't need to do any other convuluted
checking. This has the benefit of catching some false-negatives (edge
cases in success detection), at the cost of making the detection a bit
... weirder, in the sense that a log's success might now depend on
whether it was the first kill in the week or not.
However, given that our sucess detection works pretty well overall, I'd
say it's worth to catch a few more false-negatives and try to classify
as many logs correctly as possible. At least, this does not introduce
any false-positives.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|