Skip to content

[WIP] Export tool starter, trial and error code#154

Draft
Joejiong wants to merge 8 commits intop4lang:mainfrom
Joejiong:export-dev
Draft

[WIP] Export tool starter, trial and error code#154
Joejiong wants to merge 8 commits intop4lang:mainfrom
Joejiong:export-dev

Conversation

@Joejiong
Copy link
Copy Markdown
Contributor

@Joejiong Joejiong commented Apr 7, 2025

[https://github.com//issues/151]
I have another experimental code in progress not yet uploaded, and it uses another approach.

There are a couple of paths to eventually make BMV2 JSON.

This code only served as the first part of the two-step JSON paths.
p4hir-> p4hir-iso-json (debug+correctness+extensibililty)->bmv2-json (deal with the p4c-bmv2 json generator logic, with ID and others)

@Joejiong Joejiong force-pushed the export-dev branch 4 times, most recently from ff60155 to 6e88c08 Compare April 7, 2025 19:21
@@ -0,0 +1,69 @@
// Simplest p4 program to be processed, which can go through
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the mlir code for this file empty? Does it not compile? What is the expected json output for it? Maybe you should add it as a reference.

Copy link
Copy Markdown
Contributor Author

@Joejiong Joejiong Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh thx for the reminder, forgot to upload. Let me upload it

Copy link
Copy Markdown
Contributor Author

@Joejiong Joejiong Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem I'm facing with this method is looped reference and typedefs, which can form a dead loop, for complex real p4hir code with includes v1 arch and core. I'm trying to build a set to store the KV mapping. However, there is another completely different methodology. For this skeleton, I'm experimenting with this starter code, building a set for typedefs, and traversing nodes could probably prevent the dead loop. But fundamentally, a simple static DAG cannot represent a loop in the current frontend. This will be left for future discussions.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try this code as example: https://github.com/p4lang/p4c/blob/main/backends/p4tools/modules/testgen/targets/bmv2/test/p4-programs/bmv2_extract_1.p4

I know this already compiles to mlir and it is a very simple example of a BMV2 program. Then compile this program and generate the json file using p4c-bm2-ss. You should be able to generate basic elements already.

I am not sure what looped references or typedefs are.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to produce text output from this particular MLIR dialect you can check https://github.com/p4lang/p4mlir-incubator/blob/fruffy/top4/lib/Utilities/mlir_to_p4.h#L20. It writes MLIR to P4 again. This code is WiP, by no means complete, and will change. But it is a start and shows that it can be done.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't worry about for loops for now, there is very little Bmv2 code written that way. Just throw an error that it is not implemented,

got you, thx

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Joejiong Could you also check in the expected BMv2 .json file? I still don't see it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, let me include it

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Joejiong Could you also check in the expected BMv2 .json file? I still don't see it.

The output also has a p4i file, I probably only upload the .json

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, including .json file is enough.

class JsonEmitter {
public:

// Emit a P4HIR-ISO json, given MLIR module.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As explained above, let's not invent this P4HIR-ISO json format here. Instead, try to emit the BMv2 JSON format directly.

I'd suggest that you only keep the emitModule function, and start with returning an empty or constant JSON value. Once you have that working (in the sense that the added test passes, see another comment), we can discuss the next steps.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test in the test/ directory that invokes the binary on this file as input.

@@ -0,0 +1,69 @@
// Simplest p4 program to be processed, which can go through
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Joejiong Could you also check in the expected BMv2 .json file? I still don't see it.

@qobilidop
Copy link
Copy Markdown
Member

qobilidop commented Apr 8, 2025

This code only served as the first part of the two-step JSON paths.
p4hir-> p4hir-iso-json (debug+correctness+extensibililty)->bmv2-json (deal with the p4c-bmv2 json generator logic, with ID and others)

@Joejiong I think it makes sense to have multiple steps in converting from P4HIR to BMv2 JSON, but:

  1. For this PR, we are not solving that problem. I'd keep the scope of this PR to be just setting up some starter code without assuming how to do the conversion.
  2. I don't think it's necessary to introduce an intermediate textual format for solving the conversion problem. We are planning to create a BMv2 dialect for this instead.
  3. Without introducing any intermediate steps, in the starter code, we are going to directly convert some easy parts of P4HIR to BMv2 JSON directly. To see which parts are easy, we need to see the expected BMv2 JSON file first. If there is no easy parts, outputing a skeleton BMv2 JSON file is also fine.

@Joejiong
Copy link
Copy Markdown
Contributor Author

Joejiong commented Apr 8, 2025

  1. I don't think it's necessary to introduce an intermediate textual format for solving the conversion problem. We are planning to create a BMv2 dialect for this instead.

I also feel it is too difficult to transfer; there are so many corner cases, which bmv2 dialect would better capture.

Conversion p4hirdialect->bmv2dialect would be more accurate than my JSON converter. So much manual work.

@Joejiong Joejiong force-pushed the export-dev branch 4 times, most recently from 1991bd8 to 495de5d Compare April 8, 2025 02:27
Copy link
Copy Markdown
Member

@qobilidop qobilidop Apr 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have this expected BMv2 JSON file, we can discuss the next step. The exercise I'd like to see you implement in this PR is to generate the "header_types" section automatically.

The implementation shall be very straight-forward. You just need to traverse the input P4HIR program in some way and collect all header type related definitions. Make your best guesses at what is counted as header types. Reference the existing P4C BMv2 backend code for inspirations.

You don't have to generate code that look exactly the same (e.g. assigned IDs could be different). If they look roughly the same, that's good enough for this PR.

@Joejiong Any questions?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

header_types is like a struct declaration? Got to look at it later, EST time is late. Thx for the elaboration.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Joejiong Joejiong force-pushed the export-dev branch 7 times, most recently from 5974b22 to a2bece2 Compare April 8, 2025 23:44
@Joejiong Joejiong force-pushed the export-dev branch 4 times, most recently from 90e5e20 to 41160e6 Compare April 9, 2025 05:22
@Joejiong
Copy link
Copy Markdown
Contributor Author

I finally found the error, but it stopped at my newly added test. Why could I run it on my local machine?

/mlirTojson/p4mlir-incubator/third_party/llvm-project/build/bin/llvm-lit -v /mlirTojson/p4mlir-incubator/third_party/p4c/build/extensions/p4mlir/test/Export/P4HIR/bool_test.mlir
-- Testing: 1 tests, 1 workers --
PASS: P4MLIR :: Export/P4HIR/bool_test.mlir (1 of 1)

Testing Time: 0.14s

Total Discovered Tests: 1
  Passed: 1 (100.00%)
+ PASSED_TESTS=35
+ echo PASSED
PASSED

@qobilidop
Copy link
Copy Markdown
Member

qobilidop commented Apr 11, 2025

In general your local environment and CI environment are not exactly the same. So things could be different. I usually use Docker to create a reproducible local environment that's very close to the CI environment to rule out such possibilities.

Other than that, here are my suggestions on what you could try to debug:

  1. Rebase to or merge from HEAD to rule out the possibility that something already fixed in HEAD has been causing your issue.
  2. Replace the test file by an existing passing test.
    1. If this works in CI, it means your added test has some issue.
    2. If this also doesn't work in CI, then it must be something else.

@asl
Copy link
Copy Markdown
Collaborator

asl commented Apr 11, 2025

Other than that, here are my suggestions on what you could try to debug:

GC is likely culprit. It is not compatible with MLIR as the latter uses TLS and GC is not looking over TLS causing premature frees (and double frees). GC should be explicitly turned off if P4C libraries are linked in.

This is why it is extremely important to ensure everything is "pure" MLIR w/o P4C libraries.

@Joejiong Joejiong force-pushed the export-dev branch 3 times, most recently from 185cb25 to ad568d6 Compare April 11, 2025 04:12
@fruffy
Copy link
Copy Markdown
Collaborator

fruffy commented Apr 11, 2025

GC is likely culprit. It is not compatible with MLIR as the latter uses TLS and GC is not looking over TLS causing premature frees (and double frees). GC should be explicitly turned off if P4C libraries are linked in.

If you are referring to thread-local storage with TLS BDWGC should support that. Probably just an option to tweak.

@asl
Copy link
Copy Markdown
Collaborator

asl commented Apr 11, 2025

If you are referring to thread-local storage with TLS BDWGC should support that. Probably just an option to tweak.

No, it does not. See https://github.com/ivmai/bdwgc/blob/master/README.md:

WARNING: the collector does not guarantee to scan thread-local storage (e.g. of the kind accessed with pthread_getspecific). The collector does scan thread stacks, though, so generally the best solution is to ensure that any pointers stored in thread-local storage are also stored on the thread's stack for the duration of their lifetime. (This is arguably a longstanding bug, but it hasn't been fixed yet.)

@fruffy
Copy link
Copy Markdown
Collaborator

fruffy commented Apr 11, 2025

If you are referring to thread-local storage with TLS BDWGC should support that. Probably just an option to tweak.

No, it does not. See https://github.com/ivmai/bdwgc/blob/master/README.md:

WARNING: the collector does not guarantee to scan thread-local storage (e.g. of the kind accessed with pthread_getspecific). The collector does scan thread stacks, though, so generally the best solution is to ensure that any pointers stored in thread-local storage are also stored on the thread's stack for the duration of their lifetime. (This is arguably a longstanding bug, but it hasn't been fixed yet.)

It works in practice, but the guarantees aren't good, see https://github.com/ivmai/bdwgc/blob/ef01762da61fc803f430751506f6ee2176fe9221/docs/platforms/README.linux#L59.

@asl
Copy link
Copy Markdown
Collaborator

asl commented Apr 11, 2025

It works in practice, but the guarantees aren't good, see https://github.com/ivmai/bdwgc/blob/ef01762da61fc803f430751506f6ee2176fe9221/docs/platforms/README.linux#L59.

Well, I definitely saw crashes and double frees with MLIR with GC enabled :) So, I disabled it by purpose.

@fruffy
Copy link
Copy Markdown
Collaborator

fruffy commented Apr 11, 2025

Well, I definitely saw crashes and double frees with MLIR with GC enabled :) So, I disabled it by purpose.

I guess the question is, does GC_disable() work properly? Or did you have to disable GC entirely? The MLIR code doesn't need it anyway and we only really need GC for the front-end for large programs.

@asl
Copy link
Copy Markdown
Collaborator

asl commented Apr 11, 2025

I guess the question is, does GC_disable() work properly?

It stops all mark&sweep threads, so it means no automatic memory frees anymore. Manual allocation / deallocation work (albeit slow).

need GC for the front-end for large programs.

Sadly, "large" is just couple of thousands of LOCs, if entire frontend is run :)

@fruffy
Copy link
Copy Markdown
Collaborator

fruffy commented Apr 11, 2025

It stops all mark&sweep threads, so it means no automatic memory frees anymore. Manual allocation / deallocation work (albeit slow).

So, in the case of the translate utility, if we call GC_disable() after the front end passes, the only leakage we should have is the P4 Program plus whatever leaks from the call to toMLIR, right?

@asl
Copy link
Copy Markdown
Collaborator

asl commented Apr 11, 2025

So, in the case of the translate utility, if we call GC_disable() after the front end passes, the only leakage we should have is the P4 Program plus whatever leaks from the call to toMLIR, right?

plus whatever is in the static variables of the frontend, some some data structures (we still rely on type map, sadly in few places), plus whatever is not cleared after the frontend, yes.

Signed-off-by: joejiong <1004691415qq@gmail.com>
Signed-off-by: joejiong <1004691415qq@gmail.com>
Signed-off-by: joejiong <1004691415qq@gmail.com>
Signed-off-by: joejiong <1004691415qq@gmail.com>
Signed-off-by: joejiong <1004691415qq@gmail.com>
Signed-off-by: joejiong <1004691415qq@gmail.com>
Signed-off-by: joejiong <1004691415qq@gmail.com>

ignore warnings

Signed-off-by: joejiong <1004691415qq@gmail.com>

ignore warnings

Signed-off-by: joejiong <1004691415qq@gmail.com>

separate test and build

Signed-off-by: joejiong <1004691415qq@gmail.com>

modify rights

Signed-off-by: joejiong <1004691415qq@gmail.com>

Add cmakelist export-bmv2 test target

Signed-off-by: joejiong <1004691415qq@gmail.com>

Folder structure

Signed-off-by: joejiong <1004691415qq@gmail.com>

align test

Signed-off-by: joejiong <1004691415qq@gmail.com>

thread 1

Signed-off-by: joejiong <1004691415qq@gmail.com>

clean cache

Signed-off-by: joejiong <1004691415qq@gmail.com>

more

Signed-off-by: joejiong <1004691415qq@gmail.com>

more

Signed-off-by: joejiong <1004691415qq@gmail.com>

more

Signed-off-by: joejiong <1004691415qq@gmail.com>

modify lit config

Signed-off-by: joejiong <1004691415qq@gmail.com>

fine tests

Signed-off-by: joejiong <1004691415qq@gmail.com>

recover cache

Signed-off-by: joejiong <1004691415qq@gmail.com>

git action cache changed script path, so I change the base address to workspace address

Signed-off-by: joejiong <1004691415qq@gmail.com>

clean

Signed-off-by: joejiong <1004691415qq@gmail.com>

new test

Signed-off-by: joejiong <1004691415qq@gmail.com>

clean

Signed-off-by: joejiong <1004691415qq@gmail.com>

disable gc, cause in future it is still need to merge to translate, somehow preserve the flow

Signed-off-by: joejiong <1004691415qq@gmail.com>
Signed-off-by: joejiong <1004691415qq@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants