Description
In talking with internal Google ML frameworks teams, one theme has come up repeatedly when discussing ML execution: the need for a predictable, core set of operators with precisely defined behavior. Without such a set, frameworks can't guarantee consistent results across runtimes, and can't reliably express higher-level concepts that a given execution runtime lacks. We've seen work by internal and external ML frameworks towards defining these core operator sets. We believe this concept is important for WebNN to adopt, and that WebNN should ideally align with any emerging standards for core op sets.
We'd like to build consensus on the following:
- WebNN should define a core op set that focuses on low-level, indecomposable ops and captures the functional completeness of the API.
- Implementations of WebNN MUST (in the RFC 2119 sense) implement this core op set.
- The behavior of these core ops will be specified precisely with conformance tests in WPT.
- We must validate with multiple ML frameworks that the identified core op set meets their needs.
Follow-up work:
- Actually define the core op set - both the list of ops and their behavior
- Have at least two implementations, to confirm that the interface, including its specified constraints, can be supported on multiple platforms
- Come up with a rubric for how rigorously the core op set is limited
  - E.g. would we include both sin() and cos(), even though you can define one in terms of the other? Do we only need nand(), because you can then build and/or/not/xor from it?
- Determine if a subset of a "standard" core op set is acceptable for v1 (i.e. do we need control flow (#559, "Control flow operations: if, while") and bitwise operators (#496, "Bitwise operators and logical operators naming (rename not to logicalNot)")?)
- Define core op set standardization / evolution over time (e.g. in conjunction with frameworks)
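To make the rubric question concrete, here is a minimal sketch of the two decompositions mentioned above: building not/and/or/xor from nand alone, and expressing cos through sin. It uses plain scalars rather than tensors, and every function name here is hypothetical, chosen for illustration; none of them is a WebNN API call.

```typescript
// Illustrative only: scalar stand-ins for ops, with hypothetical names.
const nand = (a: boolean, b: boolean): boolean => !(a && b);

// Every other logical op below is built from nand alone.
const notViaNand = (a: boolean): boolean => nand(a, a);
const andViaNand = (a: boolean, b: boolean): boolean => notViaNand(nand(a, b));
const orViaNand = (a: boolean, b: boolean): boolean =>
  nand(notViaNand(a), notViaNand(b));
const xorViaNand = (a: boolean, b: boolean): boolean => {
  const n = nand(a, b);
  return nand(nand(a, n), nand(b, n));
};

// cos expressed through sin using the identity cos(x) = sin(x + pi/2).
const cosViaSin = (x: number): number => Math.sin(x + Math.PI / 2);
```

The sketch also hints at the cost of a very strict rubric: xorViaNand needs four nand evaluations where a dedicated xor needs one, and cosViaSin adds an extra addition whose rounding can reduce accuracy for large inputs. A rubric would need to weigh minimality against that kind of overhead.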
Related questions, but maybe out of scope for this issue:
- What do we call non-core ops? (Composite? High-level? …)
- Should all non-core ops be defined in terms of these core ops?
- How should we structure the spec to make core vs non-core ops clear?
- How precisely should the behavior of non-core ops be constrained?
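As one illustration of what "defined in terms of core ops" could look like, the sketch below expresses softmax as a composition of smaller ops. It assumes, purely for the sake of the example, that exp, sub, div, reduceMax and reduceSum would be in the core set; this issue does not define the set. The helpers operate on plain 1-D number arrays and their names are hypothetical, not WebNN API calls.

```typescript
// Hypothetical stand-ins for candidate core ops, restricted to 1-D data.
const coreExp = (x: number[]): number[] => x.map((v) => Math.exp(v));
const coreSub = (x: number[], s: number): number[] => x.map((v) => v - s);
const coreDiv = (x: number[], s: number): number[] => x.map((v) => v / s);
const coreReduceMax = (x: number[]): number =>
  x.reduce((m, v) => Math.max(m, v), -Infinity);
const coreReduceSum = (x: number[]): number =>
  x.reduce((acc, v) => acc + v, 0);

// A non-core op written purely as a composition of the ops above.
function softmaxFromCoreOps(x: number[]): number[] {
  // Subtract the max first so exp() cannot overflow for large inputs.
  const shifted = coreSub(x, coreReduceMax(x));
  const e = coreExp(shifted);
  return coreDiv(e, coreReduceSum(e));
}
```

If non-core ops were specified this way, the decomposition could serve as the reference behavior, while implementations would remain free to use a fused native kernel as long as results stay within whatever tolerance the spec sets.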
There are some high-level questions that need to be hashed out:
- Whether and how to expose backend-specific limits, similar to GPUSupportedLimits? This probably needs more implementation experience before it can be answered.
See also: