So we are going to start off with a high-level overview of what we have been doing in the CMS subproject, and we'll also cover the plans going into next year. I co-lead this subproject with Manoj. She is getting ready for the next presentation, so I am going to cover this by myself.
I don't know how many of you actually tuned into the Future Technologies Initiative; we used to call this software-defined memory. We started it as a very small project to focus on what is happening within the context of memory, and Compute Express Link is one good example. We said we need to do something in this space under the OCP umbrella, so we started the FTI initiative called software-defined memory and developed the architecture. We then felt it was time to graduate into an OCP subproject, so we moved into the OCP subproject at the beginning of this year, with the focus of broadening the scope: not just the foundational architecture, but also form factors, fabric manageability, and so on, and delivering white papers, specifications, and related collateral. That has been the focus so far this year.
We started as a very small group, and it grew to around 15 attendees last year. Right now, if you attend the weekly calls, you will see approximately 40-plus attendees dialing in and actively participating in the discussions.
The membership is fairly diverse. It includes silicon vendors, memory vendors, interconnect, SerDes, and cabling vendors, and obviously the hyperscalers driving the end-user use cases.

From a focus perspective, we are looking at the server side of composable memory as the initial focus; that doesn't mean it will be the only focus going into next year. Then there are the memory fabric designs, primarily targeting hyperscaler, enterprise, and telco use cases. There has been a lot of discussion about the needs of AI systems, and quite a bit of interest in this space, so you are going to see more on the AI fabric and on the best way for us to deliver specifications plus reference architecture solutions around it going into next year.

We recently finished four white papers. They will most likely be available on the portal by tomorrow, if not by Monday; there is a two-week review window, and we are targeting these to be available on the OCP website. We are also looking at code contributions, which you may not see in most of the other subprojects and projects. In fact, if you visit the experience center, there are a few demos targeting specific sets of workloads, from simple workloads to caching workloads and AI workloads. We are looking at taking all these workloads and moving them into GitHub, so anyone who wants to try them on their own system can take those workloads and run them. That is really the intent of those GitHub contributions. We also have specifications in the pipeline that will probably start right after this summit. We are not a standards body.
Anytime we look at a certain set of standards, we need to work with our co-travellers. For example, we are currently working on hotness tracking requirements, and we work with the Compute Express Link (CXL) Consortium to drive the standardization part. We will be looking at JEDEC for the memory module specifications, which covers the form factors, and at DMTF for Redfish profiles; you will see more work on the Redfish profile specifications for data center memory fabric manageability, so we will end up working with DMTF as well. This is not a complete list; there are other co-travellers too, but these are the ones we will likely engage with most over the next six months.
The CMS subproject has essentially five work streams. We started with native memory expansion, because that is something we can make real today: we have silicon support and memory buffer vendor support. The goal is to start there and then look at the fabric side of the story, which includes the AI fabric as well.

Not to forget, there is a dire need for fabric manageability. This covers everything from discovering the fabric topologies, to figuring out the best way to provision memory, assign it to a specific node, consume it, and retire it when done, including security aspects such as encryption. So there is quite a bit of scope in front of us when it comes to fabric management: defining the profiles, plus POCs, architecture specifications, and so on.

Near-memory compute is something we have barely scratched the surface of this year. The goal is to look at how we bring compute closer to memory and what constructs are needed to do this cooperatively between host-side and device-side processing. There is a memory element in the host, a memory element in the device, and there are caching aspects, so we have to determine what we need to do in OCP for near-memory compute.

Last but not least, we want to keep an eye on academia. We want to understand what they are doing, provide feedback, and bring that work into the Open Compute scope; it goes both ways. Like I said, we have finished the hotness tracking requirements, which are currently going through the ECN process in the CXL Consortium.
The OCP team is working with the CXL team to drive that ECN specification and everything needed to get the ECN out the door, so that device vendors can start implementing hotness tracking.

As you can imagine, this is a full-day schedule for the CMS track, so we have a packed agenda. If you do get time, please visit the CMS experience center. There are about 12 demos with 13 companies participating: a very broad set, from form factors all the way to implementation solutions, workloads, and a real working solution built on CXL memory. I strongly recommend you take a look at the CMS station demos in the experience center.
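Hotness tracking, in this context, means counting accesses per memory granule so that software can promote frequently used ("hot") pages to a faster tier. A minimal software sketch of the idea follows; the 4 KiB granule and the counting scheme are assumptions for illustration, not the mechanism defined by the CXL ECN, where the tracking is done by the device.

```python
from collections import Counter

PAGE_SIZE = 4096  # bytes; 4 KiB granule, assumed for this sketch

def hot_pages(access_addresses, top_n=2):
    """Count accesses per page and return the hottest page numbers.

    Illustrative only: real hotness tracking is implemented in device
    hardware, which reports hot regions to the host for tiering decisions.
    """
    counts = Counter(addr // PAGE_SIZE for addr in access_addresses)
    return [page for page, _ in counts.most_common(top_n)]

# Page 1 is touched four times, page 5 twice, page 9 once.
accesses = [0x1000, 0x1008, 0x1010, 0x5000, 0x1020, 0x9000, 0x5004]
print(hot_pages(accesses))  # → [1, 5]
```

The host would then migrate the hottest pages to local DRAM and leave cold pages on the fabric-attached tier.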
Going into next year, we will focus on all the facets of bringing this memory into an end-to-end solution. That includes form factors: the simple CXL memory buffer, memory module, and memory expansion board all need specifications so that the rest of the industry can benefit and build on top, and the same is true for switches.

There is an emerging need to look at NVM media exposed as CXL. This is mostly about the performance characteristics, which workloads can benefit, and how we ensure the right set of requirements is driven through the OCP CMS subproject to meet hyperscaler, telco, and edge customer needs.

We will bring in the GPU focus as well. We started with server-based memory expansion as the initial target, but the goal is to bring in the GPU element. There is also peer-to-peer (P2P) computing we need to look into, and then the AI fabric as well as the memory fabric. Computational memory is a big focus area.

Then there is CXL or alternate transports. There is quite a bit of activity happening in other industry forums, so we need to figure out the best way to collaborate with them and look at CXL or alternate transports as a way to expand reachability beyond the rack.

Not to forget: whatever we do on the foundational ingredients, the form factors, profiles, and specifications, we need a solution that ties together the hardware designs and the software elements, including the operating system and everything else. For that, we need to focus on solution recipes. Without them, none of this is really consumable, so that has to be the end-state goal.
On data center fabric management, we are looking specifically at the missing APIs in the Redfish profiles, making sure we work with DMTF to get those covered, and doing POCs as well as solution architectures. We also need to keep refining the workloads; this includes AI workloads and all the workloads that can benefit from memory expansion and fabric manageability use cases. And we will target a few webinars going into next year.
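For context on what a Redfish profile covers here: Redfish models a fabric as a collection of resources under the `/redfish/v1/Fabrics` path, and a client walks the collection by following `@odata.id` links. The sketch below parses a hand-written sample payload in that shape; the member names (`CXL0`, `CXL1`) are made up for illustration and do not come from any real service or profile.

```python
import json

# Hand-written sample shaped like a Redfish Fabrics collection response.
# The two member paths are illustrative, not from a real service.
sample = json.loads("""
{
  "@odata.id": "/redfish/v1/Fabrics",
  "Members@odata.count": 2,
  "Members": [
    {"@odata.id": "/redfish/v1/Fabrics/CXL0"},
    {"@odata.id": "/redfish/v1/Fabrics/CXL1"}
  ]
}
""")

def member_ids(collection: dict) -> list[str]:
    """Extract the member resource paths from a Redfish collection payload."""
    return [m["@odata.id"] for m in collection.get("Members", [])]

print(member_ids(sample))  # → ['/redfish/v1/Fabrics/CXL0', '/redfish/v1/Fabrics/CXL1']
```

A real client would issue HTTP GETs against a Redfish service and follow these links to switch, endpoint, and zone resources; identifying which such APIs are missing for memory fabrics is the gap being worked with DMTF.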
I am going to leave you with one simple request: please do join the CMS subproject and contribute wherever you can. It could be active participation, reviewing the white papers, or reviewing the GitHub contributions. We would love your participation, and if you can join, that will be really helpful. Thank you. With that, I will transition over to Manoj and Sameer. Thank you.