All right, perfect. It's great to see some familiar faces. For those who don't know me, I joined Meta last November. I support the optics team and mechanical and thermal, so I do get visibility across the needs of our data centers.
I wanted to start out here with the evolution of our pluggables. We all know about the 100Gig CWDM4. We moved on the front end to 200Gig FR4, and then on the back end side we supported 400Gig FR4. Now, with the 2x400Gig FR4, we're supporting both front end and back end with this pluggable. The key for all of our deployments is interoperability. We have to support backwards compatibility through all the ports, and forwards compatibility. Areas that we're looking into could be FR8, if we wanted duplex fiber and a reduced fiber count. And the obvious next step, as Andy discussed, if we want to talk about 1.6T, is 200Gig electrical, which is also in play. But that's for the future. Two of the key areas we're investigating are the FR4 Lite and LPO. I'll talk about the Lite in a second. Just to bring you back, I think what's special about Meta is that we don't over-engineer our optics. We look at the excess margin in our network, and we're able to reduce the temperature range, change the link budget, and really allow for optimized cost in our data centers. The same thing with 2x400Gig FR4 Lite: we've analyzed our link budget and our use case. The reach is mostly within 3 kilometers, but the majority of all deployments can get down to 500 meters. Once you bring the link loss down, you can support a lighter spec.
So what do I mean? If you do the analysis of our link budget, you can get down to 2.5 dB insertion loss versus 4 dB. You can support a relaxed Tx OMA spec, and that allows reduced cost, which is always something we need for our data centers.
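To make that arithmetic concrete, here is a minimal link-budget sketch in Python. The only figures taken from the talk are the channel insertion-loss targets (4 dB standard versus 2.5 dB for the Lite variant); the Tx OMA, receiver sensitivity, and penalty values are hypothetical placeholders, not Meta specifications.

```python
# Illustrative link-budget sketch for a relaxed "Lite" spec.
# Only the 4 dB vs. 2.5 dB channel-loss targets come from the talk;
# the Tx OMA, Rx sensitivity, and penalty numbers are assumptions.

def link_margin_db(tx_oma_dbm: float, rx_sensitivity_dbm: float,
                   channel_loss_db: float, penalties_db: float = 1.0) -> float:
    """Remaining optical margin after channel loss and impairment penalties."""
    return (tx_oma_dbm - rx_sensitivity_dbm) - channel_loss_db - penalties_db

# Standard FR4-style budget: higher Tx OMA, 4 dB of allowed channel loss.
standard = link_margin_db(tx_oma_dbm=-0.8, rx_sensitivity_dbm=-6.8, channel_loss_db=4.0)

# "Lite" budget: shorter reach (most links under 500 m) cuts channel loss
# to ~2.5 dB, which lets the Tx OMA spec relax while keeping similar margin.
lite = link_margin_db(tx_oma_dbm=-2.3, rx_sensitivity_dbm=-6.8, channel_loss_db=2.5)

print(f"standard margin: {standard:.1f} dB, lite margin: {lite:.1f} dB")
```

The point is simply that cutting the allowed channel loss by roughly 1.5 dB lets the transmitter OMA spec relax by about the same amount while keeping comparable margin, which is where the cost reduction comes from.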
So what are we talking about here? Pluggables. We have the broad ecosystem, highly interoperable, straightforward to validate. 200 gig per lane is relatively viable; we're going to have to change what the switch looks like because of all the loss associated with those data rates. But you're always up against cost and power, and we talked about the large numbers associated with all that connectivity. Linear pluggables, right? There are power savings there: you remove the DSP. You can always manipulate the data on how much power and how much cost you'll save with this. Right now it's a niche, proprietary system; there's not really an ecosystem here. Most solutions are designed as MVPs, minimum viable products. And when you scale to Meta's level, link accountability is a huge issue for us. When you're doing pluggables, when you're validating and qualifying, that's in the lab. But to scale to millions of links and understand all the connectivity, we have to understand, when something goes down, why it's going down. We can't just replace, replace, replace. So that is an issue for LPO: removing the diagnostic capability. We're also looking at how it looks from a future-proofing standpoint at 200 gig per lane. You can look at a half-retimed solution, like a Tx-retimed solution; there it might make sense, but then you're also losing half your cost savings and half your power savings. If you look at it from a 100 gig SerDes perspective, you can balance the ROI: does it make sense as a stopgap solution with 100 gig SerDes? And then how fast you're moving to 200 gig SerDes will decide how fast you want to move, or not move, to LPO. And then we move to CPO. I'm talking about highly integrated solutions, not silicon photonics in a transceiver, but silicon photonics across a platform, where we can really take advantage of the area and size reduction. That's where you can see meaningful cost and power savings, but you're going to have to take integrated optics much further than a transceiver. And of course, again, beyond lab demos, we've got to actually get it into the ecosystem. We've got to get it into the data center to understand what the deployment strategy would be for such a new paradigm shift.
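A rough power-accounting sketch of the retimed versus half-retimed versus linear trade-off described above. The module power and the 50/50 split between DSP and the rest of the module are purely illustrative assumptions, not Meta figures.

```python
# Rough power-accounting sketch for the DSP / half-retimed / linear trade-off.
# The module power and DSP share below are assumptions for illustration only.

FULL_DSP_MODULE_W = 14.0   # hypothetical fully retimed 2x400G module power
DSP_SHARE = 0.5            # assume the DSP is roughly half of module power

dsp_w = FULL_DSP_MODULE_W * DSP_SHARE
optics_w = FULL_DSP_MODULE_W - dsp_w

lpo_w = optics_w                       # linear pluggable: DSP removed entirely
half_retimed_w = optics_w + dsp_w / 2  # Tx-retimed: keep roughly half the DSP power

for name, watts in [("fully retimed", FULL_DSP_MODULE_W),
                    ("half-retimed (Tx)", half_retimed_w),
                    ("linear (LPO)", lpo_w)]:
    saving = 100 * (1 - watts / FULL_DSP_MODULE_W)
    print(f"{name:18s} {watts:4.1f} W  ({saving:3.0f}% saving)")
```

Under these assumptions the half-retimed option recovers only about half of the LPO saving, which is the "losing half your cost savings, half your power savings" point above.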
I think you guys have seen a lot about AI models here, doubling every four months, right? And I think it's not controversial to point out that all the components around that ecosystem are doubling every two years, and that's not sustainable. But that creates an opportunity for integrated optics, or optical I/O. And that's what we're here for today.
I believe most of you saw the keynotes yesterday, so I'm not going to go into how large language models, inference, training, prefill, and decode all have different requirements that push and tax the limits of our system development. But the key question is: how do optics impact these parameters?
The kind of obvious ones are scale, model size, compute, and network bandwidth. Those are all going to create very large growth in optics, and greater data rates and bandwidth will all push the limits, as we just heard. The back end network is really going to focus on driving data rates faster and faster. The front end network is going to lag behind, though there will be volume differences, obviously. But the back end network is really what's driving the faster data rates for connectivity. For network latency, you're going to look at bit error rate floors, and you're going to look at modulation formats that affect FEC; those are all parts of how optics affect it. From a memory capacity perspective, that's where you have an opportunity for pooled memory and disaggregated GPUs, all enabled by future optical I/O. Lastly, memory bandwidth: if you're actually going to try to attack this parameter, you're going to have to have comparable shoreline bandwidth densities. That's what it will take to impact that parameter.
I think some of you have seen this slide before. Basically, in today's network, scale-out comes at a cost, and the node size is constrained by cooling and power delivery. That's why we have whole sessions on liquid cooling and cooling in the data center. Everything we're doing right now is constrained by power delivery and cooling delivery. So that's where we are now. In the future, we're going to move to a disaggregated solution, enabled by optical I/O, where the nodes are fragmented and the power constraints are relaxed. That will be the approach we use in the future.
So, areas for collaboration and partnership: chiplet and die disaggregation; advanced packaging; we know about the cooling, about warpage, anything that will impact the flatness of chips and chiplets; high-power thermal solutions. We're going to be moving to efficient fabric interconnects, both optical and electrical. So how would we actually do that? What are the required optical parameters that we need to drive this innovation?
Today we're at, you could argue, 10 to 15 picojoules per bit from a pluggable. But for chiplets we're going to have to approach 3 picojoules per bit. Latency-wise, down to 10 nanoseconds. Shoreline density: shoreline density has to approach the memory bandwidth density of the network, and that's going to be greater than 2 terabits per second per millimeter. On the last two, we're going to shorten the reach, and as a result you can lower the insertion loss: go to 2.5 dB and down to 100 meters. But we're going to be looking at racks so large now, clusters so large, that you're really going to need the bandwidth density to connect everything together.
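A back-of-the-envelope sketch of how the 3 pJ/bit and 2 Tb/s/mm targets combine. Only the energy-per-bit and shoreline-density targets come from the talk; the usable die-edge length is an assumed, illustrative value.

```python
# Back-of-the-envelope sketch of why the pJ/bit and shoreline-density
# targets matter together. The shoreline length is an assumed example value.

ENERGY_PER_BIT_PJ = 3.0               # talk's target for chiplet-class optical I/O
SHORELINE_DENSITY_TBPS_PER_MM = 2.0   # talk's >2 Tb/s per mm target
SHORELINE_MM = 40.0                   # assumed usable die-edge length (illustrative)

bandwidth_tbps = SHORELINE_DENSITY_TBPS_PER_MM * SHORELINE_MM
io_power_w = ENERGY_PER_BIT_PJ * 1e-12 * bandwidth_tbps * 1e12

print(f"escape bandwidth: {bandwidth_tbps:.0f} Tb/s")
print(f"optical I/O power at 3 pJ/bit: {io_power_w:.0f} W")
# At a pluggable-era 12 pJ/bit, the same escape bandwidth would need ~4x this power.
```

The takeaway is that once shoreline bandwidth approaches memory-class density, pluggable-era energy per bit becomes untenable, which is why the per-bit target has to drop so far.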
So this is my last slide. Call to action. We all know the large size and growth of AI models and applications. That's going to take technical innovation from the ecosystem here that we have at OCP. And compute disaggregation and scale up, scale out are the two major opportunities that we see. But we're going to require the whole industry to look at different ways-- comb lasers, multi-wavelength, all those things that we're going to need to move this forward. And we support these efforts through UCIe and energy efficient optical interfaces. Thank you.
All right. Thank you. We have a couple of minutes for questions. So please use the microphones in the middle if you have a question. Go ahead.
Thank you, and good morning. You mentioned 3 picojoules per bit for future networks. Can you elaborate on how we would even get there? Because if you just take the optics out, the host SerDes alone actually consumes more than that. So it just seems to me an untenable request.
I mean, these are challenging times. So yeah, it's going to take a lot of vision to do it.
Thank you.
Any more questions from the audience? I have a question, Darrell. You know, you're talking about developing a new ecosystem for solutions like LPO or CPO. And it's understandable because you have a very big infrastructure to support. Are you concerned that some of your competitors that may be starting from scratch will be forced to use new optical connectivity and get ahead of more established players in the performance of their AI clusters and the size of their AI clusters eventually?
Yeah, I don't think there's a... No, I mean, that's why we have this team, right? To look at future technologies, get them into hacks, and future-proof them. So I'm not worried that we're going to close our doors to future technologies and fail to make the chasm jump because we're not looking. I think we do have enough bandwidth on the team to look at future technologies, so I'm not too worried in that sense. But we do have to look at it in sizable chunks, right? You can't say it has to be deployed immediately into our data center, but we do have to have the ability to do pilots and hacks to look at new technologies. And I think that addresses your point about a team focused on one innovative approach and not being constrained by deployment. I think we have the access and resources to support that.
Okay, perfect. All right. Okay, let's take another question from the audience. Go ahead.
Okay. Darren, thank you very much, a very nice talk. So my question is about the LPO. You're talking about one of the challenges for deploying LPO being field debug, or troubleshooting. So my question is: if you look at your DAC cables right now, if you have a problem, you just throw it away, no problem. So at what kind of price point for optics would you no longer need to do any troubleshooting?
I think we'll always need troubleshooting, right? Even with fully retimed pluggable modules, we'll always need troubleshooting. And we have a great talk on LPO specifically in about an hour, so they'll go into more detail on all the work we've done around LPO.
Okay, one last question. Go ahead.
So Google is investing in all-optical circuit switches and everything, right? So are you guys also into that? Optical circuit switches in aggregation devices.
Oh, optical switching. Yeah, obviously there are some large players that do that. We haven't... that would change our complete network topology, so we haven't pursued that.