Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should I2E cloned packets in simple_switch avoid re-parsing? #795

Open
jafingerhut opened this issue Jun 24, 2019 · 12 comments
Open

Should I2E cloned packets in simple_switch avoid re-parsing? #795

jafingerhut opened this issue Jun 24, 2019 · 12 comments

Comments

@jafingerhut
Copy link
Contributor

These lines of code: https://github.com/p4lang/behavioral-model/blob/master/targets/simple_switch/simple_switch.cpp#L551-L554

copied below for convenience:

        // we need to parse again
        // the alternative would be to pay the (huge) price of PHV copy for
        // every ingress packet
        parser->parse(packet_copy.get());

are part of the code in method ingress_thread, inside of an if statement for handling ingress packets that should be I2E cloned.

One minor thing I noticed is that standard_metadata.ingress_port of the cloned packet is 0. If a packet arrived on some non-0 port, and the P4 parser code parsed packets differently for packets arriving on that port vs packets arriving on port 0, then the I2E-cloned packet would have different parsed results than the original packet.

Some possibilities would be:
(1) document this as the expected behavior
(2) change the implementation so that standard_metadata.ingress_port of the cloned packet is copied from the original packet, before the cloned packet is re-parsed.
(3) change the code above so that the packet is not re-parsed, but the PHV of the original packet is copied.

Now the current comment implies to me that approach (3) was avoided intentionally, by design. However, the comment text confuses me. Wouldn't it be possible to pay the (huge) price of PHV copy only for packets that had an I2E clone operation performed, and not for all ingress packets?

@antoninbas
Copy link
Member

For the question regarding (3): the ingress pipeline modifies the PHV object (in particular packet header fields are very likely be modified). We don't know whether the packet will be cloned until the pipeline terminates, but by that time it is too late to copy the PHV object, since it has already been modified. We would need to copy it before applying the ingress pipeline, pessimistically, and would only use that copy if I2E cloning was actually requested.

Personally, I'm fine with (2). I think it makes the most sense, unless someone thinks the cloned packet should not carry a valid ingress port value.

@jafingerhut
Copy link
Contributor Author

OK, adding another quick thought on approach (2), in case that ends up being a change made to behavioral-model. It would be good to scan through all of the v1model standard metadata to see which of them are most likely to be used in P4 parser code, and consider making all of them equal to the values of the packet that caused this I2E clone to be created, so its parsing step ends up with the same results.

Here are fields where it seems fairly reasonable that someone might think to write P4 parser code that reads them, at least in some cases:

  • ingress_port
  • instance_type - You may want to parse resubmitted packets differently than when they first arrived, e.g. parse more headers than before, because the previous ingress processing pass determined that this is a good idea.
  • packet_length - seems maybe useful in some cases
  • ingress_global_timestamp - making parsing results dependent on the precise arrival time seems pretty odd to me, but maybe someone would want to assign it to some other user-defined metadata during parsing?
  • priority - I do not know how simple_switch assigns values to this field, so putting it here as a "maybe" to consider.

These fields seem pretty useless or counter-productive to use in a P4 parser, to me. It would seem reasonable to say that for I2E clone packets (and perhaps even packets arriving from ports), these values are all always 0 (or error.NoError for parser_error) when the v1model parser begins:

  • egress_spec
  • egress_port
  • enq_timestamp
  • enq_qdepth
  • deq_timedelta
  • deq_qdepth
  • egress_global_timestamp
  • mcast_grp
  • egress_rid
  • checksum_error
  • parser_error

These fields currently in v1model.p4 are deprecated, and/or may be removed at some point:

  • clone_spec
  • drop
  • recirculate_port
  • lf_field_list
  • resubmit_flag
  • recirculate_flag

@smolkaj
Copy link
Member

smolkaj commented Jan 10, 2023

We are running into this exact issue in the context of implementing P4Runtime packet-IO.

We use BMv2's CPU port feature by passing in --cpu_port=SAI_P4_CPU_PORT. This causes BMv2 to inject packets received as a P4Runtime packet-out on the SAI_P4_CPU_PORT port of BMv2's packet processing pipeline, in the following wire format:

+------------------------+---------------+
| P4RT packet-out header | Actual packet |
+------------------------+---------------+

Accordingly, our parser looks as follows:

state start {
  transition select(standard_metadata.ingress_port) {
    SAI_P4_CPU_PORT: parse_packet_out_header;
    _              : parse_ethernet;
  }
}

state parse_packet_out_header {
  packet.extract(headers.packet_out_header);
  transition parse_ethernet;
}
...

Unfortunately, this breaks when we I2E-clone a packet, since the clone will be re-parsed with standard_metadata.ingress_port == 0.

I tried working around this by branching on standard_metadata.instance_type, but as was already pointed out by @jafingerhut above, that field is set to 0 as well and only takes on its correct value during egress processing.

@smolkaj
Copy link
Member

smolkaj commented Jan 10, 2023

I agree with solution (2) proposed above.

@jafingerhut @antoninbas is this a thing that can be changed easily, or a deeper issue?

@jafingerhut
Copy link
Contributor Author

jafingerhut commented Jan 10, 2023

As far as I know approach (2) should be fairly easy to create a quick experimental implementation of to try out locally on your side, but I have not made such changes myself and tested them. I would try changing this line:

parser->parse(packet_copy.get());

to something like this:

// add lines here that assign desired values to selected metadata fields of pkt_copy, e.g.
// to instance_type and ingress_port
parser->parse(packet_copy.get());

Then test it to see whether it behaves as desired.

@antoninbas
Copy link
Member

@jafingerhut is describing the correct approach, and IMO it's a very simple change.
The field can be set with packet_copy->get_phv()->get_field("standard_metadata.ingress_port").set(ingress_port);, right before the call to parse.

@smolkaj
Copy link
Member

smolkaj commented Jan 12, 2023

Will give it a try, thank you!

@smolkaj
Copy link
Member

smolkaj commented Jan 12, 2023

Works!

@antoninbas
Copy link
Member

@smolkaj any plan to open a PR? I think it's a change that generally makes sense, not just for your specific use case

@smolkaj
Copy link
Member

smolkaj commented Jan 17, 2023

Yes, I am planning to submit a fix for at least the ingress port.

I can't promise a more complete solution (#795 (comment)) since there seem to be some open question, AFAICT

@jafingerhut
Copy link
Contributor Author

@smolkaj Even if you submit a change that does what you need for now, i.e. only initialize the ingress_port field before re-parsing, that is still useful. It would be good to put a comment in the code to the other issue for anyone that wants more information about additional enhancements they might wish to make in the future.

smolkaj added a commit to smolkaj/behavioral-model that referenced this issue Jan 20, 2023
@smolkaj
Copy link
Member

smolkaj commented Jan 20, 2023

Sounds good! Here is such a patch: #1175. PTAL.

antoninbas pushed a commit that referenced this issue Jan 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants