Spanning Across the Data Center or Breakouts Within the Rack
Short Reach (SR) multi-mode optics are the lowest priced optical interconnects available today that use optical connectors to separate the transceiver from the optical fibers. Although both support 100m reaches, AOCs are much less expensive than SR optics, but an AOC is a complete cable assembly: its transceiver ends cannot be separated from the fibers. Multi-mode fiber has a large light-carrying core and is easier and less expensive to manufacture with than single-mode fiber, whose tiny core makes transceivers difficult and expensive to build. For these reasons, multi-mode, short reach optics are very popular in modern hyperscale, enterprise and storage data center applications.
While mostly used to link Top-of-Rack (ToR) switches to other remote switches and storage subsystems in the network, as short reach transceiver prices continue to fall, more data center operators are using SR optics to connect ToR switches down to servers and local storage – within a single rack. This is due to the configuration flexibility connectorized optics provide and the tiny fiber diameters compared to DAC cabling. All the optical cables supporting a 32-port switch have a combined diameter of less than 2.5cm (1 inch), compared to about 10-13cm (4-5 inches) for copper DAC cables. Thirty-two optical fiber cables would blow off the table with a sneeze, but 32 DAC cables could qualify as exercise equipment!
This blog is Part 2 in a 3-part series on Mellanox’s LinkX branded, high-speed interconnect products. Mellanox sells short reach multi-mode optics at 10Gb/s and 25Gb/s line rates in single- and four-channel configurations, enabling 10Gb/s to 100Gb/s of link bandwidth. These are available in SFP and QSFP connector form-factors that use LC and MPO optical connectors, respectively.
Short reach optics is not new and has a long history of different fibers, connectors and transceiver types at different data rates, but modern data centers have zeroed in on the SFP/LC and QSFP/MPO form-factors. Mellanox offers both 10Gb/s and 25Gb/s single-channel SR transceivers with LC optical connectors as well as four-channel 40Gb/s and 100Gb/s SR4 transceivers with MPO optical connectors.
VCSEL, Multi-mode …Huh?
SR transceivers employ a large core diameter, 50-um, optical fiber that is easy to interface lasers and detectors to, so the costs are much lower than single-mode optics with their tiny 9-um core diameter fiber. But the SR laser pulse tends to scatter into multiple transmission “modes” in the large diameter fiber and becomes unusable after about 100m, so the IEEE standards body sets the limit at 100m, assuming four connectors in the run. Multi-mode can reach 400m, but this requires specialized lasers, fibers and connectors.
Multi-mode optics uses a laser called a VCSEL, pronounced “Vix-Sell” (Vertical Cavity Surface Emitting Laser). This laser is built as a vertical cavity in a semiconductor wafer and emits light perpendicular to the surface of the chip, hence the name. Multi-mode optics use the 850nm wavelength of infrared laser light, which sits in an optically transparent window of the glass fiber.
Key SR/SR4 Transceiver features:
Many data centers have structured cabling where the fiber infrastructure is fixed and installed in cabling pipes, under raised floors and integrated into optical patch panels used to manually reconfigure the fiber run end points. Sometimes, fibers run to other system rows, rooms, floors, or even other buildings necessitating the ability to disconnect the fibers from the transceivers installed in the systems. This is something that DAC and AOCs integrated cables cannot do as the wires or fibers are integrated into the plug or transceiver end.
Point-to-Point Applications: ToR-to-Leaf/spine EOR switches
One of the main applications for SR and SR4 transceivers is to link Top-of-Rack (ToR) switches to other parts of the network such as aggregation switches, middle- and end-of-row switches, and leaf switches in a leaf-spine network. These links are typically high-bandwidth trunks using four-channel SR4s at 40Gb/s or 100Gb/s. For 1GbE-based servers, a 10Gb/s or 25Gb/s SR link may be adequate for the ToR uplink. Multi-mode optics is well suited to this application, as the reaches in these designs are typically short, spanning a single row or perhaps a few rows.
While several enormous hyperscale operators have made a lot of noise in the press about moving to single-mode fiber, many big hyperscale and enterprise installations still operate as groups of small system clusters where all the systems are well within the 100m reach of multi-mode fiber. Interestingly, while multi-mode fiber itself is about three times more expensive than single-mode fiber, single-mode transceivers are 50 percent to 10X more expensive than multi-mode transceivers. Single-mode transceivers are difficult to build but offer reaches up to 10km vs. only 100m for multi-mode.
Breakout Application: ToR QSFP Breakouts to SFP Servers & Storage
Linking Top-of-Rack switches down to servers and storage subsystems within the same rack is another popular use for SR and SR4 optics. In the past, SR4 transceivers could only transfer four channels at a time to another SR4. Newer models can split the four channels into individual single-channel links that connect to different systems and operate independently. This is important when the link reach needed is greater than the 3-meter capability of DAC copper cables, perhaps spanning more than one rack. The passive fiber breakout cable has a single 4-channel MPO connector on one end, connecting to the SR4 transceiver, and four duplex LC optical connectors on the other end, connecting to four separate SFP transceivers, each with its own 100m fiber run.
Similarly, two 50Gb/s links can be created from one 100Gb/s port using an MPO breakout cable with two MPOs connected to 50Gb/s SR2 transceivers, each using only two channels (2x25G).
Since each link can be 100m long, a single SR4 port broken out into four SR links can have each far end located 100m from the SR4 port. The SR4 is theoretically able to link to anything within a 200m (656-foot) diameter circle, with the SR4 in a switch located in the middle and one SR transceiver at each of North, South, East, and West, each 100m away!
How Can 50Gb/s Cost Less Than 40Gb/s?
Top-of-Rack switches such as the Mellanox SN2700 support 32 ports and are available in 40G and 100G port versions. Using breakout fibers, the 32 transceiver ports can be split into 64 50Gb/s or 128 25Gb/s ports, and configured in multiple mixtures depending on the configuration and bandwidth required. In this way, one 32-port Mellanox SN2700 100Gb/s ToR switch makes 50Gb/s less expensive overall than a 32-port SN2700 40Gb/s switch. Additionally, it provides an upgrade path to 100Gb/s by simply changing the fibers and the end-point 25Gb/s or 50Gb/s transceivers to 100Gb/s SR4s.
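The port-count arithmetic behind these breakout options can be sketched in a few lines of Python (port counts and per-lane rates are taken from the text; the helper function and its names are illustrative, not a Mellanox tool):

```python
# Sketch of the ToR breakout arithmetic described above (rates in Gb/s).
# A 32-port QSFP switch with 4 x 25G lanes per port can run each port as
# 1x100G, or split it into 2x50G or 4x25G links via passive breakout fibers.
PORTS = 32

def breakout(lanes_per_port, lanes_per_link, lane_rate):
    """Return (link_count, per_link_bw, total_bw) for one breakout mode."""
    links_per_port = lanes_per_port // lanes_per_link
    link_count = PORTS * links_per_port
    per_link_bw = lanes_per_link * lane_rate
    return link_count, per_link_bw, link_count * per_link_bw

print(breakout(4, 4, 25))  # (32, 100, 3200)  -> 32 x 100G ports
print(breakout(4, 2, 25))  # (64, 50, 3200)   -> 64 x 50G links
print(breakout(4, 1, 25))  # (128, 25, 3200)  -> 128 x 25G links
```

Note that total switch bandwidth is the same 3.2Tb/s in every mode; only the granularity of the links changes, which is why the 50Gb/s configuration doubles the port count without doubling the cost.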
32-ports at 40Gb/s with no upgrade path –or- 64 ports at 50Gb/s with an upgrade path.
2.5X bandwidth at only ~50% price premium – you do the math!
The following graphic shows only part of Mellanox’s complete, “end-to-end” portfolio of switches, adapters, DAC/AOC cables, and optical transceivers. Mellanox is one of the few companies in the data center business that designs its own switch and network adapter ICs, transceiver control and Silicon Photonics ICs, and sells complete switch, adapter, cable and transceiver system solutions. The figure shows SR and SR4 transceivers with breakout fibers used in server/storage racks, between system racks within rows, and in switch-to-switch networking infrastructure over longer reaches.
Not every sub-system application today needs 100Gb/s SR4 bandwidth, so breakouts are a convenient way to split a single ToR port into two 50Gb/s or four 25Gb/s links. Most single CPU socket servers today use a network adapter, such as the Mellanox ConnectX-series network adapters, with four-to-eight 8GT/s PCIe Gen3 bus lanes. (PCIe is a serial bus protocol specified in giga-transfers per second; usable bandwidth is lower than the raw transfer rate because of encoding and protocol overhead.) Four times 8GT/s is 32GT/s, and after subtracting approximately 20 percent PCIe overhead, this traffic can neatly fit into a single 25Gb/s link. Similarly, four 10GbE CPU I/Os can fit into a 40GbE or 50GbE link. Many hyperscale builders use two-socket servers with 50Gb/s links, and some with four sockets need 100Gb/s links to the Top-of-Rack switch. Next generation CPUs are becoming available with multiple 25Gb/s ports integrated into the CPU chips, and servers are adding enormous amounts of DRAM and flash memory requiring faster I/O. Soon, 100Gb/s will be too slow!
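A back-of-the-envelope version of that PCIe arithmetic (the ~20 percent overhead figure is the text's approximation, not an exact protocol constant):

```python
# PCIe Gen3 runs at 8 GT/s per lane; usable payload bandwidth is lower
# due to 128b/130b encoding plus packet/protocol overhead (~20% total,
# per the rough figure used in the text).
lanes = 4
gt_per_lane = 8.0            # GT/s per PCIe Gen3 lane
raw = lanes * gt_per_lane    # 32 GT/s raw across four lanes
usable = raw * (1 - 0.20)    # ~25.6 Gb/s of usable bandwidth
print(raw, usable)           # 32.0 25.6 -> roughly a single 25Gb/s link
```

The result slightly exceeds 25Gb/s, which is why a single 25Gb/s SR link is a comfortable match for a four-lane Gen3 adapter.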
The Open Compute Project (OCP) “Yosemite” is a four-socket server and uses a 100Gb/s uplink – ideal for SR4 transceivers if optical connectors are needed, or for DAC or AOC cables if not. Mellanox offers ConnectX-4 Lx SFP- and QSFP-based network adapter cards designed for the OCP Yosemite server.
“High-Speed” and “Storage” Can Be Used in the Same Sentence Now!
Storage is jumping into the high-speed game as well. For storage links, 10Gb/s was adequate for HDD arrays in the past, but with big SSDs and newer all-electronic flash storage, the jump to 25Gb/s is rapidly being adopted. Just three NVMe flash cards require about 80Gb/s and can nearly saturate a single 100G SR4 QSFP28 transceiver link!
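A rough sanity check of that claim, assuming each NVMe card sits on a PCIe Gen3 x4 slot and applying the same ~20 percent overhead estimate the text uses elsewhere (the per-card figure is an illustrative assumption, not a spec value):

```python
# Each NVMe card on a PCIe Gen3 x4 slot delivers roughly
# 4 lanes x 8 GT/s x ~0.8 (overhead) ~= 25.6 Gb/s of usable bandwidth.
per_card = 4 * 8.0 * 0.8       # ~25.6 Gb/s per NVMe card (assumed Gen3 x4)
three_cards = 3 * per_card     # ~76.8 Gb/s for three cards
share = three_cards / 100.0    # fraction of a 100G SR4 link consumed
print(three_cards, share)      # ~76.8 Gb/s, ~77% of a 100G link
```

That lands close to the ~80Gb/s figure in the text: three flash cards consume most of a 100G SR4 link, and a fourth would oversubscribe it.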
Optical Buzzword Cheat Sheet
Optical technology has more buzzwords than you would ever believe, and it keeps getting worse! Here are definitions for a few of the most popular terms:
Two types of transceiver form-factors (connector shells):
SFP: single-channel, 10Gb/s or 25Gb/s, paired with a duplex LC optical connector.
QSFP: four-channel, 40Gb/s or 100Gb/s, paired with an MPO optical connector.
Two types of optical connectors:
Duplex LC: two fibers, one transmit and one receive, for single-channel links.
MPO: a multi-fiber connector carrying the four transmit and four receive fibers of a four-channel link.
Types of optical fibers:
Multi-mode: large 50-um core, easy to couple to 850nm VCSEL lasers, reaches up to 100m.
Single-mode: tiny 9-um core, harder and more expensive to build with, reaches up to 10km.
Mellanox LinkX SR and SR4 transceivers offer the lowest-cost, lowest power consuming, short reach, optical links that use detachable optical connectors.
This enables the most cost-efficient, highest-ROI, lowest Capex and Opex interconnect solution. The full portfolio of 10Gb/s and 25Gb/s line rates in SFP and QSFP form-factors enables customers to build a wide variety of configurations that meet every application, for both InfiniBand and Ethernet protocols.
Mellanox designs and builds its own switch, adapter and transceiver ICs as well as Silicon Photonics.
This enables designing complete end-to-end systems with optimal performance and low power consumption between components. Because we design our own ICs and transceivers, the SR4 transceiver offers 2.0W power consumption with the CDRs turned off and 2.8W with them all on – among the lowest power consuming devices in the industry. Matching the transceiver electronics to the switch and network adapter ICs guarantees the best possible performance, which is why they are used by major hyperscale, enterprise, and storage system builders. Other features include:
Check out the Mellanox LinkX website, more blogs, and log into the Mellanox Community for more detailed DAC, AOC and transceiver white papers and articles in the future. LinkX is Mellanox’s brand name for its cables and transceivers product line.
Contact your Mellanox sales representative for availability and pricing options, and stay tuned to my blog for more interconnect news and tips.