Commit Graph

43 Commits

Author SHA1 Message Date
disinvite
5a4c9234a9 Allow prepending space for exact marker match 2023-12-01 15:12:33 -05:00
disinvite
75802101ac Merge from parser2 branch 2023-12-01 15:10:32 -05:00
Christian Semmler
4920ea9a9e Adjustments based on new suggestions 2023-11-30 09:25:32 -05:00
Christian Semmler
78173990c7 Fix order 2023-11-28 09:30:35 -05:00
Christian Semmler
1ba3b7f0a7 Adjustments to "decomp" language 2023-11-28 09:00:57 -05:00
Christian Semmler
2fa70d233f Update README.md [skip ci] 2023-11-26 14:57:19 +01:00
Thomas Phillips
b14116cc93 Python Linting and Code Formatting (#298)
* Create common print_diff function

* Add pylint and black

* Fix linting, move classes to utils

* Add black/pylint to github actions

* Fix linting

* Move Bin and SymInfo into their own files

* Split out format

* Tidy up workflows and pip, add readme

* Lint tests, add tests to readme
2023-11-25 13:27:42 -05:00
MS
abcc3afb31 Fix reccmp html output for template functions (#296) 2023-11-22 02:52:57 -05:00
MS
1ae3b07dc2 Checkorder tool to keep functions in original binary order (#228)
* First commit of order tool

* More flexible match on module name. Bugfix on blank_or_comment

* Report inexact offset comments in verbose mode. Bugfix for exact regex

* Refactor checkorder into reusable isledecomp module

* Find bad comments in one pass, add awareness of TEMPLATE

* Refactor of state machine to prepare for reccmp integration

* Use isledecomp lib in reccmp

* Build isledecomp in GH actions, fix mypy complaint

* Ensure unit test cpp files will be ignored by reccmp

* Allow multiple offset markers, pep8 cleanup

* Remove unused variable

* Code style, remove unneeded module and TODO

* Final renaming and type hints

* Fix checkorder issues, add GH action and enforce (#2)

* Fix checkorder issues

* Add GH action

* Test error case

* Works

* Fixes

---------

Co-authored-by: Christian Semmler <mail@csemmler.com>
2023-11-21 09:44:45 +01:00
Thomas Phillips
dff410d87a Use templates instead of replacing (#292)
* Use templates instead of replacing

* Use Renderer to avoid loading templates ourselves

---------

Co-authored-by: Thomas Phillips <thomas@teknique.com>
2023-11-19 13:55:01 +01:00
Thomas Phillips
bd85abaf2a Improve python tools (#273)
* Use python3 features

* Use `with` statement for file access
* Use f-strings instead of modulo string formatting
* Single quotes in most places

Fix typo in 'with' statement

* Add files into missing messages

* Fix can_resolve_register_differences and round percentages

* Return modified value instead of relying on in-place modification
2023-11-08 10:47:11 +01:00
MS
8a528e4146 Big performance gain to reccmp (#271) 2023-11-06 10:07:02 +01:00
Nathan M Gilbert
d232c82e70 Update reccmp.py (#236)
Support indented comments for 'TEMPLATE'd functions.
2023-10-23 13:17:28 +02:00
Angel
5ac6cf55a9 Corrected typo in reccmp.py (#169) 2023-10-05 22:26:48 -07:00
pewpew
b77cd067d3 reccmp: template compare annotations (#88)
* reccmp: Add ability to compare template instantiations

* Add example of template instantiation comparison.

* merge

* Add template compare annotations for MxList instances

---------

Co-authored-by: Christian Semmler <mail@csemmler.com>
2023-09-29 11:40:46 -07:00
Christian Semmler
b1a2aeaed6 Print recompiled address when using --verbose 2023-09-13 10:39:35 -04:00
Mark Langen
694045abd8 Implement MxVector2/3/4 and MxMatrix (#100)
* All of the MxVectors share an inheritance chain. MxVector4 inherits
  from MxVector3 which inherits from MxVector2.

* They all operate on a shared `float*` data member which points to the
  underlying storage.

* There are also MxVector3/4Data classes, which inherit from Vector3/4,
  but add concrete storage for the Vector data rather than just an
  abstract data pointer.

* The same is true for MxMatrix, with there being an abstract and a
  concrete variant of it.

* Also improve reccmp.py register matching algorithm. It previously
  could not recognize an effective match when a swap had to take place
  between two registers used on the same line. It turns out this happens
  a lot in floating point math code so I adjusted the implementation to
  break the disassembly lines on spaces rather than just linebreaks
  allowing the existing effective match code to handle that case too.
2023-08-03 11:25:29 -07:00
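A minimal sketch of the register-matching idea described in the commit above, assuming a simplified rule: two listings count as an effective match when they are token-for-token identical after a consistent one-to-one renaming of registers. Splitting on spaces (not only on line breaks) is what lets a swap between two registers on the same line be recognised. The function names and the register list are hypothetical and are not taken from reccmp.py.

```python
# Hypothetical register set; the real tool may recognise more names.
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi"}


def tokenize(listing):
    """Split every disassembly line into tokens on whitespace and commas."""
    tokens = []
    for line in listing:
        tokens.extend(line.replace(",", " ").split())
    return tokens


def matches_up_to_register_renaming(original, recompiled):
    """Return True if the listings differ only by a consistent register renaming."""
    a_tokens, b_tokens = tokenize(original), tokenize(recompiled)
    if len(a_tokens) != len(b_tokens):
        return False
    forward, backward = {}, {}  # original -> recompiled and the reverse
    for a, b in zip(a_tokens, b_tokens):
        if a in REGISTERS and b in REGISTERS:
            # Each register must always map to the same partner, both ways.
            if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
                return False
        elif a != b:
            return False
    return True


# Two registers swapped on the same line still count as an effective match.
swapped = matches_up_to_register_renaming(["mov eax, ecx"], ["mov ecx, eax"])
# An inconsistent renaming does not.
inconsistent = matches_up_to_register_renaming(["mov eax, eax"], ["mov eax, ecx"])
```

Working on tokens instead of whole lines keeps the check linear in the size of the listing.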
Mark Langen
f247e10b7e reccmp.py improvements (#82)
* Rather than using <OFFSET> as a replacement for all offsets in a
  function, label the offsets as <OFFSET1>, <OFFSET2>, etc. Doing this
  will avoid false-positive 100% matches resulting from the same
  function being called twice where a different one should have
  been called or vice versa. And the same for globals. I already
  encountered one case of this in the wild.

* When a 100% match initially fails, try to make the functions match by
  swapping register allocations. This makes it possible to get a 100%
  match where the generated machine code differs only in register
  allocation.

* Only apply the above when it is possible to reach a 100% match in that
  way. Otherwise show the developer the unadulterated diff to avoid
  complicating decompilation.

* In the result listing, show the functions which are "effective
  matches" in this way as "100%*" instead of "100%".
2023-07-15 23:13:34 -07:00
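A rough illustration of the numbered-placeholder idea from the commit above, assuming addresses appear as `0x`-prefixed hex literals in the disassembly text. The regular expression and the helper name `label_offsets` are hypothetical; the actual script may extract offsets differently.

```python
import re

# Assumption: any 0x-prefixed hex literal is treated as an address.
HEX_ADDRESS = re.compile(r"0x[0-9a-fA-F]+")


def label_offsets(lines):
    """Replace each distinct address with <OFFSET1>, <OFFSET2>, ... by first use."""
    seen = {}

    def substitute(match):
        address = match.group(0).lower()
        if address not in seen:
            seen[address] = f"<OFFSET{len(seen) + 1}>"
        return seen[address]

    return [HEX_ADDRESS.sub(substitute, line) for line in lines]


original = ["call 0x1000", "call 0x2000"]    # two different callees
recompiled = ["call 0x5000", "call 0x5000"]  # the same callee twice

labelled_original = label_offsets(original)
labelled_recompiled = label_offsets(recompiled)
```

With a single shared `<OFFSET>` placeholder both listings would collapse to the same text. With numbered placeholders the original becomes `call <OFFSET1>` / `call <OFFSET2>` while the recompiled code becomes `call <OFFSET1>` twice, so the difference stays visible.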
Anonymous Maarten
40dd0a93d4 Faster reccmp.py on linux (#62)
* reccmp: avoid repeated execution of winepath

Executing winepath many times is slow,
so we try to avoid it as much as possible.

When the path starts with a known prefix, replace it with
a cached prefix and do some string manipulation.

This change reduces execution time of reccmp.py from 90s to 2s.

Which is nice.

* reccmp: continue looking when source cannot be found

Most often, the reason is mismatched sources.

* reccmp: add basic logging + optional debug

* Read the addresses in the exe headers as little endian
2023-07-01 23:52:47 -07:00
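One way the prefix caching described in the commit above could look, as a sketch only. It assumes that source files cluster in a few directories, so `winepath` needs to run once per directory and later paths can be rebuilt with string manipulation. The class name, the per-directory cache key, and the injectable `resolve` callable are all hypothetical choices made for this example.

```python
import subprocess


def run_winepath(windows_path):
    """Translate a Windows path via Wine (slow: spawns a process each time)."""
    output = subprocess.check_output(["winepath", "-u", windows_path], text=True)
    return output.strip()


class CachedPathConverter:
    """Converts Windows paths to Unix paths, resolving each directory once."""

    def __init__(self, resolve=run_winepath):
        self.resolve = resolve
        self.prefixes = {}  # Windows directory -> Unix directory

    def to_unix(self, windows_path):
        directory, separator, filename = windows_path.rpartition("\\")
        if not separator:
            # No directory part to cache, so fall back to a direct lookup.
            return self.resolve(windows_path)
        if directory not in self.prefixes:
            # Only this call is expensive; later hits are dictionary lookups.
            self.prefixes[directory] = self.resolve(directory + "\\").rstrip("/")
        return self.prefixes[directory] + "/" + filename
```

Passing the resolver in as a parameter keeps the class usable on machines without Wine, for example in unit tests with a fake resolver.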
itsmattkc
e929d76f3c reccmp: use "monospace" in svg font
For some reason Inkscape made this "mono", but it seems like "monospace" is the right attribute here
2023-06-30 16:12:22 -07:00
itsmattkc
4c9e138cbf implement all DLL exports (as TODOs)
Now we can use our own compiled LEGO1.LIB rather than one generated from the original. Also implements a script that tests them to help ensure future commits don't break them.
2023-06-30 11:34:39 -07:00
itsmattkc
566e107290 reccmp: only show recompiled address on request
Improves comparisons between diffs because the addresses shifting around leads to false positives
2023-06-29 09:02:52 -07:00
Cydra
07912eb05a Class layout for LEGO1 classes (#43)
* Stubbed a bunch of classes and annotated them for later use. Heavily wip and more of pseudocode right now.

* Converted pseudocode into real code!

* Created a bunch more classes and added more information to existing ones
Did not error check, this was pushed just for reference

* More classes and implementation details. Still not checked for any errors

* Fixed code and decided on a way to handle virtual table stubs

* Some additional fixes

* More smaller fixes

* Added classes to project and made it compile

* Fixed function addresses that caused the python script to fail

* More classes and virtual function resolves. Builds and compares fine.

* Again more classes and virtual function resolves. Builds and compares fine.

* No clue, I guess forced update for line endings

* Finished up some work, compiles fine. All functions are STUB annotated to not pollute reccmp.py output.

* line ending change

* rename GetClassName/IsClass

Mirroring recent changes from master

* further conform to current master

* update project

* cleanup

* project only updates when you close msdev

---------

Co-authored-by: Cydra <cydra95@gmail.com>
Co-authored-by: itsmattkc <34096995+itsmattkc@users.noreply.github.com>
2023-06-29 01:10:08 -07:00
itsmattkc
8e6e2a3962 reccmp: fix SVGs on light backgrounds 2023-06-27 19:46:04 -07:00
itsmattkc
f7c84d719b reccmp: use bold font for easier readability 2023-06-27 18:25:38 -07:00
itsmattkc
b393851ebd reccmp: change svg canvas size 2023-06-27 18:10:36 -07:00
itsmattkc
1ea15e6478 reccmp: use entire canvas for progress images 2023-06-27 18:04:30 -07:00
itsmattkc
f03cee6b6e reccmp: improve progress bar text rendering 2023-06-27 18:00:53 -07:00
itsmattkc
f9e9723a67 reccmp: give svg template background color 2023-06-27 16:12:04 -07:00
itsmattkc
4a1e3a5b7e reccmp: fixed typo 2023-06-27 16:01:49 -07:00
itsmattkc
b080766321 generate progress SVGs 2023-06-27 15:59:44 -07:00
Mark Langen
0b47f3fff3 Improve reccmp.py (#49)
* Improve reccmp.py

* Now only shows the info for a single function when a specific function
  is specified via -v

* Now colors the output by default

* Percentages are shown as green/yellow/red depending on the percentage
  completed.

* Diff +/- lines are shown as green/red.

* Includes standard --no-color argument in case we need no color for
  some tooling which consumes the output.

* Feedback
2023-06-25 19:01:40 -07:00
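A small sketch of the colored percentage output described in the commit above. The commit does not state the cut-off values, so the thresholds here (80% and 30%) are assumptions, as is the helper name `format_percentage`; the escape sequences are the standard ANSI foreground colors.

```python
GREEN = "\033[32m"
YELLOW = "\033[33m"
RED = "\033[31m"
RESET = "\033[0m"


def format_percentage(ratio, use_color=True):
    """Format a 0..1 match ratio as a percentage, optionally ANSI-colored."""
    text = f"{ratio * 100:.2f}%"
    if not use_color:
        # Plain text for tooling that consumes the output (--no-color).
        return text
    if ratio >= 0.8:    # assumed threshold for "good"
        color = GREEN
    elif ratio >= 0.3:  # assumed threshold for "in progress"
        color = YELLOW
    else:
        color = RED
    return f"{color}{text}{RESET}"


plain = format_percentage(0.5, use_color=False)
colored = format_percentage(1.0)
```

Keeping the color decision in one function means a `--no-color` switch only has to pass `use_color=False` in one place.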
itsmattkc
749a1f419b reccmp: support inlined functions that may have been compiled into both files 2023-06-22 01:05:00 -07:00
itsmattkc
12395ac41a reccmp: further improve accuracy 2023-06-22 00:44:28 -07:00
itsmattkc
c4b4555b80 reccmp: revert using debug offsets 2023-06-21 17:31:54 -07:00
MS
4d531d1de5 reccomp: add option to hide 100% matching functions (#35)
* add option to hide 100% matching functions

* slight formatting improvement

---------

Co-authored-by: itsmattkc <34096995+itsmattkc@users.noreply.github.com>
2023-06-21 14:43:01 -07:00
itsmattkc
fa63d7e341 rename reccomp to reccmp
Sorry to everyone's muscle memory, but I think this is better. The idea for the name was "recomp compare", but it's too easy to read it as "recomp with a typo". This should fix that, as well as be slightly easier to write since it's shorter.
2023-06-21 14:36:09 -07:00
Anonymous Maarten
da3ad91b20 reccomp.py: use argparse to parse arguments (#30)
* reccomp.py: use argparse to parse arguments

* Address code review comments

* reccomp.py: -h/--help for help -H/--html for html

* update CI to use new arg

* slight string updates

---------

Co-authored-by: itsmattkc <34096995+itsmattkc@users.noreply.github.com>
2023-06-21 14:33:08 -07:00
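A sketch of how the flag split in the commit above can be expressed with `argparse`, assuming the long option is spelled `--html` and takes an output file name. `argparse` adds `-h/--help` automatically, which is why the HTML option needs a different short flag such as `-H`. The positional argument names and the help strings are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser where -h is help (added by argparse) and -H is HTML output."""
    parser = argparse.ArgumentParser(
        description="Compare an original binary with a recompiled one."
    )
    # Hypothetical positional arguments, for illustration only.
    parser.add_argument("original", help="path to the original binary")
    parser.add_argument("recompiled", help="path to the recompiled binary")
    parser.add_argument(
        "-H", "--html", metavar="FILE", help="write an HTML summary to FILE"
    )
    return parser


args = build_parser().parse_args(["ORIG.DLL", "NEW.DLL", "-H", "out.html"])
```

Short options are case-sensitive, so `-h` and `-H` can coexist without conflict.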
itsmattkc
290c006d14 use offsets from PDB to only diff instructions
Also ensure empty functions aren't falsely identified as matching due to no comparison occurring
2023-06-20 13:09:48 -07:00
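The empty-function pitfall mentioned in the commit above can be reproduced with Python's `difflib`, assuming the comparison is a sequence-similarity ratio over instruction lists (whether the script used `difflib` at this commit is not stated). `SequenceMatcher.ratio()` returns 1.0 for two empty sequences, so a function where nothing was disassembled would look like a perfect match unless it is guarded. The helper `match_ratio` is hypothetical.

```python
import difflib


def match_ratio(original_asm, recompiled_asm):
    """Similarity of two instruction lists; nothing compared means no match."""
    if not original_asm or not recompiled_asm:
        # Without this guard, two empty listings would score 1.0.
        return 0.0
    matcher = difflib.SequenceMatcher(None, original_asm, recompiled_asm)
    return matcher.ratio()


unguarded = difflib.SequenceMatcher(None, [], []).ratio()
guarded = match_ratio([], [])
```

Here `unguarded` is 1.0 while `guarded` is 0.0, which keeps functions with no extracted instructions out of the list of matches.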
itsmattkc
66dd2cdeb9 improved reccomp reliability even further, added html summary generator
Will probably host the summary somewhere for easy access
2023-06-19 12:52:21 -07:00
itsmattkc
ec12b8f30f improved compare script performance and reliability 2023-06-19 10:57:13 -07:00
itsmattkc
319b52f248 added more definitions
Also clarify .exe on script because Wine cares about that
2023-06-18 20:50:32 -07:00
MattKC
5aa7921e90 Add CI script to compare recompiled assembly with original code (#24)
* add test to compare assembly between functions

* ci: use abs path of wget

* ci: fix shell ambiguity

* ci: ensure capstone is installed

* ci: ensure correct filenames

* use better source for lego island files

* give me an idea of what the dir structure looks like

* make wine path function

* improved script and project

* fixed script on windows

* print debug info because now it literally only doesn't work on fucking github actions

* better source path resolving

For some reason, nmake compiles produce different symbols. I wonder if this affects the accuracy of the decomp.
2023-06-18 20:28:18 -07:00