TE Perspective on Data Center Performance
Author: Sudhakar Sabada, SVP & GM, Data & Devices
As artificial intelligence (AI) models become increasingly sophisticated, data centers are rearchitecting themselves to process ever-larger volumes of data faster and more efficiently.
The business insights produced by AI models have boosted productivity across a range of industries. From AI-powered chatbots that deliver 24/7 customer support at financial institutions to healthcare platforms that can analyze patient data in real time and help providers predict potential complications and intervene faster, the applications for data-driven computing systems continue to expand. As these models become more sophisticated, the amount of data they need to consume continues to increase as well. That’s all before taking into account the development of generative AI, which relies on increasingly large language models requiring more and more computing power to produce results.
To support these applications, data centers have had to become much more efficient and effective at processing huge amounts of data. That trend is changing the equipment they use as well as the technology they use to connect it.
Supporting AI workloads effectively requires the highest-bandwidth, lowest-latency systems available. The processing-intensive workload has moved beyond the standard central processing units (CPUs) that have traditionally powered computers to more powerful graphics processing units (GPUs), so named because they were originally designed to render complex images by performing a large number of relatively simple calculations simultaneously. That same massively parallel architecture made GPUs the go-to engine for AI applications. Now, GPUs are being augmented by tensor processing units (TPUs), which accelerate AI calculations even further.
There’s a limit to what a single processor can accomplish, however. By linking clusters of processors together, data centers can multiply the computing power available. The technological challenge in building these clusters is connecting them efficiently.
Moving massive amounts of data among multiple components quickly and reliably requires a range of different connectors. The GPUs that do the heavy lifting and the CPUs that orchestrate the management of the workload throughout the process rely on socket and mezzanine connectors to attach them to printed circuit boards. High-speed cable assemblies and cable cartridges connect the electrical connections on the server’s backplane to the circuit boards and other components on the server. Other input/output (I/O) connectors move data from one server to another and connect clusters across multiple servers.
To operate efficiently and effectively, these connectors have to meet form factor specifications while maximizing data transfer speed. The fastest AI solutions today transfer data at around 56 gigabits per second. In deployed systems, that number will grow to 112 gigabits per second within the next year or so, and eventually to 224 gigabits per second two to three years after that.
With each step up in data rate, the margin for maintaining signal integrity, and with it reliable system performance, shrinks. Passing 224 gigabits per second reliably across a copper connection means operating at the limits of physics. These exacting electrical specifications come on top of the need to engineer connectors mechanically and thermally robust enough for a harsh operating environment.
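One way to see why the margin shrinks is to look at the time each symbol occupies on the wire. The sketch below, which assumes PAM4 signaling (two bits per symbol, the standard practice at 112 and 224 gigabits per second; 56-gigabit lanes often use NRZ instead), computes the unit interval at each of the per-lane rates mentioned above:

```python
# Illustrative only: how the per-symbol timing budget shrinks as
# per-lane data rates climb. Assumes PAM4 signaling (2 bits per
# symbol); actual link budgets depend on the full channel design.

BITS_PER_SYMBOL_PAM4 = 2

def unit_interval_ps(bit_rate_gbps: float,
                     bits_per_symbol: int = BITS_PER_SYMBOL_PAM4) -> float:
    """Return the symbol period (unit interval) in picoseconds."""
    symbol_rate_gbd = bit_rate_gbps / bits_per_symbol  # gigabaud
    return 1000.0 / symbol_rate_gbd  # 1/GBd = ns per symbol; x1000 -> ps

for rate in (56, 112, 224):
    print(f"{rate:>3} Gb/s PAM4: {unit_interval_ps(rate):.1f} ps per symbol")
```

At 224 gigabits per second, each PAM4 symbol lasts under 9 picoseconds, so every picosecond of jitter or skew a connector introduces consumes a much larger share of the timing budget than it did a generation earlier.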
To meet these demands, TE produces a variety of connectors designed with the right features while balancing performance, cost, reliability and durability. These include connector interfaces that mount accelerated compute processing units onto different circuit boards, as well as sockets for seating the processors that control the movement of data throughout the system. To connect these components at very high speeds, TE has also developed a family of internal cable assemblies for high-speed board-level connectivity, cable backplane assemblies, and cartridges and high-speed connectors that simplify system integration and support a modular approach to building and scaling these systems, always with an eye toward the highest speed and lowest latency practicable.
Moving data where it needs to be is only half the battle. The components that make up AI clusters also need power to do their job, and as a general rule, more computing power requires more electrical power to drive it. Distributing that power also requires more efficient connectors that support the highest level of system performance.
To support compute-intensive applications, these components must also be robust enough to withstand the demands of continuous operation. To ensure evolving architectures continue to meet these demanding specifications, component manufacturers will need to supply a wide range of power cables and connectors across all form factors.
The higher power required by sophisticated AI computing components also generates more heat, which makes thermal dissipation a critical concern. Connectivity on the front panel of an AI system is often among the greatest heat-generating sources, which makes that area an important target for efficiency gains. For example, TE’s I/O products have built-in heat-sink capability to transfer thermal energy away from these modules and keep them running cooler, improving the system’s overall efficiency and reliability.
The demand for more speed and bandwidth to support increasingly sophisticated AI applications at the data center level is essentially insatiable. Even as they deploy today’s solutions, our customers are actively thinking about how to design a faster, more efficient architecture for the next step in the evolution of the data center.
Sometimes, the capabilities designed into connectors can change the approach to the system’s architecture. For example, as we worked closely with one customer early in the design exploration of their system, the strategy evolved from a board-to-board connector-based system to one that used a cable-based backplane instead, resulting in a more flexible and efficient design.
Such innovations are possible because we engage early with our customers to understand their current requirements and where they intend to go tomorrow. With AI accelerating the transformation of data centers, this kind of collaboration will be essential to moving the industry forward fast enough to keep up with the surging demand for ever more computing power.
Sudhakar Sabada is senior vice president and general manager of the Data & Devices business at TE Connectivity. In this role, he is responsible for the overall P&L of the business, which broadly serves the electronics industry covering the cloud, artificial intelligence, enterprise, telecom, and business retail market segments. He also oversees the development of the Internet of Things (IoT) business that brings communication solutions and innovations across all areas of life. He leads business and product strategy, go-to-market activities, as well as engineering and manufacturing functions.