Intel Xeon 6+ “Clearwater Forest” - A leap forward with Intel 18A and Darkmont

intel-xeon-6-plus-clearwater-forest-migovi

Clearwater Forest, branded Xeon 6+, debuted at Intel Tech Tour 2025, showing a leap forward with the Intel 18A process and the Darkmont E-core.

Intel and the vision for the data center

New leadership and direction

Intel Tech Tour 2025 marked an important moment for Intel’s Data Center Group (DCG), both in the technology shown and in the strategic message from its new leadership. Kevork Kechichian, the new Executive Vice President and General Manager of DCG, frankly acknowledged the group’s past challenges. He emphasized that the roadmap delays and loss of competitiveness were not “R&D problems” but “leadership problems”. Notably, Kechichian had been at Intel for only 18 days at the time of the event. His candor signals a profound cultural change toward greater transparency and accountability to everyone with a stake in Intel: investors, media, employees and customers.

Kechichian did not stop at admitting the company's weaknesses. He also outlined three core values that will shape DCG’s future: listening to customers, delivering competitive products, and executing on time with high quality. These are basic business principles, but emphasizing them in the current context shows Intel’s effort to re-establish trust with customers and partners after a long period of skepticism. Kechichian also shared that he had repeatedly turned down earlier offers from Intel because he could not trust the old leadership. His decision to join now shows confidence in the vision and execution of the new leadership team under Lip-Bu Tan. The implication is that Intel’s problems lie not in a lack of engineering talent but in strategic direction and decision-making at the top.

“Fabless” Thinking in IDM

Kechichian also described the idea of adopting a fabless mindset within a leading integrated device manufacturer (IDM) like Intel. Drawing on extensive experience at fabless companies such as Qualcomm and Arm, he has found that the convenience of having a fab nearby can inadvertently erode discipline in the design process. One example is relying on repeated tape-outs to fix bugs rather than getting the design right from the start.

The new strategy requires Intel’s R&D team to operate with the rigor of a fabless company, which must ensure designs are right the first time to avoid huge manufacturing costs. Once this design discipline is established, having the most advanced factories will no longer be a convenience but a true multiplier - a strategic competitive advantage. The combination of optimal design and leading manufacturing processes will allow Intel to create products that are superior in performance and energy efficiency.

The fabless mindset within an IDM is a defense against complacency and a foundation for the success of the Intel Foundry Services (IFS) business model. When Intel’s internal design teams, the largest and most demanding customers of Intel’s own foundries, are also the most disciplined, they push the foundries to provide the best Process Design Kits (PDKs) and supporting tools. This makes Intel’s manufacturing capabilities more attractive to external fabless customers. In essence, Intel is using its most advanced products, such as Xeon 6+, to refine the workflow between design and manufacturing, creating a positive feedback loop that helps both businesses grow.

Xeon 6 Roadmap

Modern data centers have diverse needs, so Intel pursues a dual-microarchitecture strategy for the Xeon 6 product line. Launched in 2024, the Xeon 6 series consists of two distinct branches optimized for different types of workloads. Intel Xeon 6 with P-cores (Granite Rapids) is designed for compute-intensive workloads such as HPC and AI, focusing on performance per core. Intel Xeon 6 with E-cores (Sierra Forest) focuses on performance per watt, serving high-density, scale-out workloads such as cloud-native computing, microservices and telecommunications (telco).

There is no single CPU architecture that can be optimized for all usage scenarios, as evidenced by the Xeon 6 with its two distinct branches. Instead of creating a “one size fits all” chip, Intel is clearly segmenting the market. The P-core line will compete directly with its rivals’ high-performance x86 CPUs, while the E-core line is an answer to ARM-based CPUs in the cloud and telecom space, where core density and total cost of ownership (TCO) are top factors.

In that context, Xeon 6 with E-cores (Sierra Forest) can be seen as a strategic defensive move, giving Intel a competitive product to retain customers in the segment under attack from ARM competitors. Sierra Forest saw early success with major partners such as Ericsson, followed by HPE, Nokia and phoenixNAP. Intel Xeon 6+ (Clearwater Forest), however, is Intel's real offensive. With double the core count, the Intel 18A process and the Darkmont microarchitecture, Clearwater Forest is not just a "good enough" product but one designed to lead the market in density and efficiency.

Intel Xeon 6+ platform

Clearwater Forest

Intel Xeon 6+, codenamed Clearwater Forest, is the first 18A CPU for data centers and is built around E-cores. The “+” in the name signals not a minor refresh but a significant leap over the current Xeon 6 platform. Clearwater Forest is a pioneering product that integrates a series of Intel's newest technologies in a single package.

Intel Xeon 6+ offers up to 288 E-cores based on the Darkmont microarchitecture, supports DDR5 memory at up to 8000 MT/s, and carries a Last Level Cache (LLC) of up to 576 MB. Most importantly, it is Intel's first data center processor built on the Intel 18A manufacturing process and the first Xeon product to commercialize the advanced 3D packaging technology Foveros Direct. It is no exaggeration to call Clearwater Forest a technological showcase: Intel's engineering team has combined a new manufacturing process, new packaging technology and a new core microarchitecture in a single product.

Intel 18A with RibbonFET and PowerVia

The Intel 18A process is the foundation for Xeon 6+ performance and energy efficiency advancements. Intel 18A features two groundbreaking innovations: RibbonFET and PowerVia.

RibbonFET is Intel’s Gate-All-Around (GAA) transistor architecture. Unlike previous FinFET architectures, the gate in RibbonFET completely encloses the ribbon channels. This structure allows for tight control of current, increasing drive current while significantly reducing leakage current. The result is a significant improvement in performance per watt, allowing the transistor to operate at lower voltages, which contributes significantly to the chip’s overall energy efficiency.

PowerVia is the industry’s first backside power delivery technology. Traditionally, both the signal wires and the power wires sit on the front side of the wafer, which causes routing congestion and IR drop (voltage lost across the resistance of the power grid). PowerVia moves the entire power grid to the back of the wafer and connects it directly to the transistors through tiny Through-Silicon Vias (TSVs). This frees the front-side metal layers for signal routing and lowers the power grid resistance, which Intel says reduces power loss by 4-5% and allows standard cell utilization of over 90%. PowerVia is a key factor in letting Intel “cram” twice the number of cores into the same socket area as the previous generation.

When Intel successfully deploys the 18A process to a complex product like Clearwater Forest, it shows that Intel has regained its technology leadership. For the IDM 2.0 business model and IFS strategy, bringing Intel 18A to mass production with a high-end server product is also a demonstration of capabilities for potential IFS customers. Looking at Xeon 6+, fabless customers can trust that Intel's process and technology are ready for the most complex designs.

Foveros Direct 3D

Foveros Direct 3D packaging is the latest stage of Intel's disaggregation journey. Intel has evolved from monolithic designs on the Intel 10nm process (Ice Lake) to a 2.5D architecture using EMIB bridges on the Intel 7 process (Sapphire Rapids), and then to a 2.5D architecture whose functional blocks (tiles) use different processes (Xeon 6 Sierra Forest, combining Intel 7 and Intel 3). Clearwater Forest is the newest step in this journey, combining 2.5D (EMIB) and 3D (Foveros Direct) packaging with three different manufacturing processes (Intel 18A, Intel 3 and Intel 7) in the same package.

Foveros Direct 3D is Intel's next-generation 3D die-stacking technology, making its first appearance in the Xeon family. It uses direct copper-to-copper (Cu-to-Cu) bonding instead of microbumps, allowing a pitch of just 9 micrometers (9 μm). The resulting interconnect density enables massive die-to-die bandwidth at a very low energy cost of just 0.05 picojoules per bit (0.05 pJ/bit), or 50 femtojoules per bit. This matters because moving data between stacked dies would otherwise consume significant power, negating the benefits of 3D stacking.
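
To put the energy-per-bit figure in perspective, link power is simply energy per bit multiplied by bits moved per second. The sketch below applies that relation to a hypothetical 1000 GB/s die-to-die link. The bandwidth value and the 0.5 pJ/bit comparison point are illustrative assumptions, not Intel figures; only the 0.05 pJ/bit value comes from the article.

```python
# Link power = energy per bit x bits transferred per second.
PICOJOULE = 1e-12  # joules

def link_power_watts(energy_pj_per_bit: float, bandwidth_gb_per_s: float) -> float:
    """Return the power in watts needed to sustain the given bandwidth."""
    bits_per_second = bandwidth_gb_per_s * 1e9 * 8
    return energy_pj_per_bit * PICOJOULE * bits_per_second

# 0.05 pJ/bit is the figure quoted for Foveros Direct.
# ASSUMPTION: 1000 GB/s of die-to-die traffic (hypothetical).
# ASSUMPTION: 0.5 pJ/bit as a 10x costlier link for comparison (hypothetical).
foveros_watts = link_power_watts(0.05, 1000)
comparison_watts = link_power_watts(0.5, 1000)

print(f"Foveros Direct link: {foveros_watts:.2f} W")
print(f"10x costlier link:   {comparison_watts:.2f} W")
```

Under these assumptions the link costs a fraction of a watt, which illustrates why a tenfold difference in energy per bit matters once it is multiplied across many stacked tiles.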

The disaggregation architecture gives Intel a flexible business strategy. By breaking down the SoC into functional “LEGO blocks” (tiles), Intel can:

  • Optimize cost and efficiency: Only the CPU cores (Compute Tile) use the most expensive Intel 18A process, while I/O and fabric blocks can use cheaper, mature processes like Intel 7 and Intel 3.
  • Accelerate product launches: Reusing proven tiles, such as the I/O Tile from the Granite Rapids generation, significantly reduces R&D time and costs while ensuring compatibility with the existing Birch Stream platform.
  • Create a diverse product range: In theory, Intel can flexibly combine different types of Compute Tiles (P-core, E-core, or specialized accelerator tiles) with the same set of I/O Tiles and Base Tiles to quickly create customized products for many different market segments.

Clearwater Forest SoC Architecture

Clearwater Forest demonstrates the complexity and sophistication of modern packaging technology. Instead of a single monolithic silicon die, the complete Clearwater Forest processor is a system-in-package consisting of multiple chiplets (or tiles) connected together both horizontally and vertically. The structure on each socket includes:

  • 12 Compute Tiles: Contain the Darkmont E-cores and their L2 cache, manufactured on the most advanced Intel 18A process. Compute Tiles connect to Base Tiles using Foveros Direct 3D.
  • 3 Base Tiles: Act as the foundation, containing the LLC, interconnect fabric and memory controllers, manufactured on the Intel 3 process. Each Base Tile connects to its Compute Tiles using Foveros Direct 3D and to the other Base Tiles and the I/O Tiles using EMIB.
  • 2 I/O Tiles: Handle all peripheral communications (PCIe, CXL and UPI connections), manufactured on the Intel 7 process and reused from the previous generation. Each I/O Tile connects to a Base Tile via EMIB.

These blocks are assembled in a 2-layer structure. On the lower layer (2.5D), 3 Base Tiles and 2 I/O Tiles are placed side by side on an organic substrate, connected via 12 EMIB bridges. On the upper layer (3D), each Base Tile has 4 Compute Tiles stacked directly on top using Foveros Direct 3D technology. This structure is like building a skyscraper (3D layers) on a carefully planned urban area (2.5D base), requiring absolute precision in signal management, power supply and heat dissipation.

I/O Tile (Intel 7)

The I/O Tile acts as the communication gateway for the entire SoC to the outside world. Intel decided to completely reuse the I/O Tile from the Granite Rapids generation, saving costs and R&D time. Reusing the I/O Tile also ensures that Clearwater Forest is fully compatible with the already launched Birch Stream platform, allowing OEM/ODM partners to upgrade to the new CPU without redesigning the motherboard.

Each I/O Tile is equipped with hardware interfaces and accelerators including:

  • Peripheral connectivity: 48 PCIe Gen 5.0 lanes, of which 32 lanes can operate in CXL 2.0 mode to connect to next-generation memory and accelerators.
  • Multi-socket connectivity: 96 UPI 2.0 (Ultra Path Interconnect) lanes for high-speed communication between sockets in Xeon 6+ multi-processor server systems.
  • Integrated Accelerators: Each tile contains eight accelerators, including Intel QuickAssist Technology (QAT) for encryption and compression, Intel Dynamic Load Balancer (DLB) for network load distribution, Intel Data Streaming Accelerator (DSA) for optimizing data movement and Intel In-Memory Analytics Accelerator (IAA) for database and analytics tasks.

With two I/O Tiles per socket, Clearwater Forest’s total bandwidth is massive, enough to handle the most demanding storage, networking and AI systems. Built-in accelerator blocks offload infrastructure tasks from CPU cores, freeing up computing power for core applications.
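
The per-socket totals follow directly from the per-tile figures listed above. The following is a minimal sketch of that arithmetic; the class and function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IOTile:
    """Per-tile resources as listed in the article (class name is hypothetical)."""
    pcie5_lanes: int = 48
    cxl2_capable_lanes: int = 32  # subset of the PCIe lanes that can run in CXL 2.0 mode
    upi2_lanes: int = 96
    accelerators: int = 8

def socket_totals(tile: IOTile, tiles_per_socket: int = 2) -> dict[str, int]:
    """Multiply each per-tile resource by the number of I/O Tiles in a socket."""
    return {
        "pcie5_lanes": tile.pcie5_lanes * tiles_per_socket,
        "cxl2_capable_lanes": tile.cxl2_capable_lanes * tiles_per_socket,
        "upi2_lanes": tile.upi2_lanes * tiles_per_socket,
        "accelerators": tile.accelerators * tiles_per_socket,
    }

totals = socket_totals(IOTile())
for name, value in totals.items():
    print(f"{name}: {value}")
```

With two tiles this gives 96 PCIe 5.0 lanes, 64 CXL-capable lanes and 16 accelerator blocks per socket.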

Base Tile (Intel 3)

The Base Tile is the nerve center of the SoC, where all data streams converge. It is manufactured on the mature Intel 3 process and optimized for high SRAM density. Each Base Tile performs two main functions:

  • Last Level Cache (LLC): Each Base Tile contains a massive 192 MB LLC cache. With 3 Base Tiles in 1 socket, the total LLC capacity is up to 576 MB, providing a huge data buffer. Large LLC helps reduce main memory access latency and keeps CPU cores supplied with enough data.
  • Memory Controller: Each Base Tile integrates a controller for 4 DDR5 memory channels. As a result, the entire socket supports 12 DDR5 channels. This is a significant upgrade over the 8 channels of the Sierra Forest 6700E generation, providing the memory bandwidth needed to support 288 cores.

In addition, the Base Tile also contains the coherent fabric, ensuring data consistency between the caches of all cores and the LLC.
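
As a rough guide to what 12 channels mean in practice, theoretical peak DDR5 bandwidth is channels × transfer rate × bytes per transfer. The sketch below assumes the standard 64-bit (8-byte) data width per DDR5 channel, excluding ECC bits; real sustained bandwidth is lower than this peak.

```python
def peak_ddr5_bandwidth_gb_s(channels: int, mega_transfers_per_s: int,
                             bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes).

    ASSUMPTION: each channel moves 8 data bytes per transfer (64-bit data bus).
    """
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000

CORES_PER_SOCKET = 288  # from the article

socket_peak = peak_ddr5_bandwidth_gb_s(channels=12, mega_transfers_per_s=8000)
per_core_peak = socket_peak / CORES_PER_SOCKET

print(f"Peak per socket: {socket_peak:.0f} GB/s")
print(f"Peak per core:   {per_core_peak:.2f} GB/s")
```

Dividing the peak by 288 cores shows how thin the per-core share is even at 8000 MT/s, which is why the large LLC described above matters.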

Compute Tile (Intel 18A)

The Compute Tile houses the “brains” of the Clearwater Forest processor - the Darkmont E-cores. Splitting the 288 cores into 12 separate Compute Tiles, each containing just 24 cores, is a key strategy for achieving high yields on the fledgling 18A process. If a Compute Tile fails during production, Intel can simply disable it and ship a SKU with fewer cores, rather than scrapping an entire large, expensive die.
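
The yield argument can be illustrated with the simple Poisson yield model, in which the probability that a die has no defects is exp(-area × defect density). This is a textbook approximation, not Intel's actual yield model, and the tile area and defect density below are hypothetical values picked only to show the trend.

```python
import math

def poisson_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Probability that a die of the given area has zero defects (Poisson model)."""
    return math.exp(-area_cm2 * defects_per_cm2)

# ASSUMPTIONS: both values are hypothetical, chosen only for illustration.
TILE_AREA_CM2 = 0.5
DEFECT_DENSITY_PER_CM2 = 0.4
TILES = 12

small_tile_yield = poisson_yield(TILE_AREA_CM2, DEFECT_DENSITY_PER_CM2)
monolithic_yield = poisson_yield(TILE_AREA_CM2 * TILES, DEFECT_DENSITY_PER_CM2)

print(f"Yield of one small compute tile:    {small_tile_yield:.1%}")
print(f"Yield of one 12x larger monolithic: {monolithic_yield:.1%}")
```

In this model a die twelve times larger yields the twelfth power of the small-die yield, so the same defect density that is tolerable for a small tile becomes prohibitive for one large die.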

Each Compute Tile is built with 6 modules. Each module contains 4 Darkmont cores that share a 4MB unified L2 cache. The 4-core module architecture with shared L2 cache balances low L2 access latency (due to the proximity of the cores) and saves die space. In total, with 12 Compute Tiles, 72 modules and 288 cores, Clearwater Forest delivers extremely high compute density.
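
The totals above can be derived from the hierarchy itself. The sketch below multiplies out the per-level figures given in the article; the constant names are made up for this example.

```python
# Per-level figures taken from the article; names are illustrative only.
CORES_PER_MODULE = 4
L2_MB_PER_MODULE = 4
MODULES_PER_COMPUTE_TILE = 6
COMPUTE_TILES_PER_BASE_TILE = 4
BASE_TILES_PER_SOCKET = 3
LLC_MB_PER_BASE_TILE = 192

compute_tiles = COMPUTE_TILES_PER_BASE_TILE * BASE_TILES_PER_SOCKET
modules = compute_tiles * MODULES_PER_COMPUTE_TILE
cores = modules * CORES_PER_MODULE
l2_total_mb = modules * L2_MB_PER_MODULE
llc_total_mb = BASE_TILES_PER_SOCKET * LLC_MB_PER_BASE_TILE
llc_mb_per_core = llc_total_mb / cores

print(f"Compute tiles: {compute_tiles}, modules: {modules}, cores: {cores}")
print(f"Total L2: {l2_total_mb} MB, total LLC: {llc_total_mb} MB")
print(f"LLC per core: {llc_mb_per_core} MB")
```

A fully enabled socket therefore has 1 MB of L2 and 2 MB of LLC per core on average.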

Darkmont E-core microarchitecture

"Tiny Beast"

If Clearwater Forest is a major leap in system architecture, the Darkmont E-core microarchitecture is the source of its power. Described as a “tiny beast” by Stephen Robinson, Intel Fellow and chief architect for x86 cores, Darkmont erases the stereotype of E-cores as “weak” cores meant for background tasks.

Compared to Crestmont in Sierra Forest, Darkmont is a complete redesign. It is an 8-wide microarchitecture fed by a 9-wide front-end and backed by a massive out-of-order execution engine with a 416-entry window and 26 execution ports. These numbers not only surpass Crestmont but also approach previous-generation high-performance P-cores such as Golden Cove (which has a 512-entry window). Darkmont is designed to take full advantage of Instruction-Level Parallelism (ILP) and to process complex server code with large code footprints efficiently.

| Block | Component | Crestmont | Darkmont | Change |
|---|---|---|---|---|
| Front-End | Decode width | 6-wide (2×3) | 9-wide (3×3) | +50% bandwidth |
| Front-End | uop queue | 64 entries | 96 entries | +50% capacity |
| Out-of-Order Engine | Allocate width | 6-wide | 8-wide | +33% |
| Out-of-Order Engine | Retire width | N/A (less than 16) | 16-wide | Significant increase |
| Out-of-Order Engine | ROB window | 256 entries | 416 entries | +62.5% |
| Execution Engine | Execution ports | 17 | 26 | +53% |
| Execution Engine | Integer ALUs | 4 | 8 | 2x |
| Execution Engine | FMA vector units | 2×128-bit | 4×128-bit | 2x throughput |
| Execution Engine | AGUs (address generation) | 2 | 4 | 2x |
| Memory Subsystem | L2 bandwidth | 64B/cycle | 128B/cycle | 2x |

Front-End

To feed the massive execution engine, Darkmont has a powerful front-end with a large 64KB instruction cache (I-Cache) and an upgraded branch predictor whose capacity grows by about a third, from 6,000 to 8,000 targets. Better branch prediction is critical for server workloads, which tend to have more branches and to be less predictable.

What sets Darkmont (and Crestmont) apart is the clustered front-end architecture. Instead of a single monolithic 9-wide decoder, Darkmont uses three independent 3-wide decode clusters that can work on different parts of the instruction stream out of order. Each time a branch is predicted taken, another decode cluster starts decoding instructions from the new target address. As long as the branch predictor stays ahead, all three clusters operate in parallel, for a total decode bandwidth of 9 instructions per cycle. Intel has also added heuristics to reduce the over-speculation seen in the Sierra Forest implementation, an important optimization for server environments.
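
The benefit of clustering can be sketched with a toy throughput model. It is a deliberate simplification, not a description of Intel's hardware: it assumes the instruction stream is already split into blocks at predicted-taken branches, that blocks are handed to clusters round-robin, that the branch predictor is always far enough ahead, and that each cluster decodes 3 instructions per cycle. The block lengths are made up.

```python
import math

def decode_cycles(block_lengths: list[int], clusters: int = 3, width: int = 3) -> int:
    """Cycles to decode all blocks when blocks are dealt round-robin to clusters.

    Toy model: each cluster decodes `width` instructions per cycle and works
    through its own blocks one after another; clusters run fully in parallel.
    """
    busy_cycles = [0] * clusters
    for index, length in enumerate(block_lengths):
        busy_cycles[index % clusters] += math.ceil(length / width)
    return max(busy_cycles)

# ASSUMPTION: hypothetical basic-block lengths, in instructions.
blocks = [6, 3, 9, 4, 5, 6]

single_cluster = decode_cycles(blocks, clusters=1)
three_clusters = decode_cycles(blocks, clusters=3)
one_long_block = decode_cycles([33], clusters=3)

print(f"One 3-wide cluster:    {single_cluster} cycles")
print(f"Three 3-wide clusters: {three_clusters} cycles")
print(f"Single 33-instruction block on three clusters: {one_long_block} cycles")
```

In this model, code with frequent taken branches spreads across all three clusters, while one long straight-line block stays on a single cluster and sees no speedup.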

Out-of-Order Engine and Execution Engine

Once decoded, micro-ops are fed into the out-of-order engine. Darkmont can allocate 8 micro-ops per cycle into the 416-entry ROB window and retire up to 16 micro-ops per cycle. Doubling the retirement width compared to previous designs frees machine resources significantly faster, which is especially important for store instructions, since a store cannot be committed speculatively. From the ROB window, micro-ops are sent to one of 26 execution ports, a large number that reflects Darkmont’s parallel processing capability. The execution resources include:

  • 8 integer arithmetic and logic units (ALU).
  • 3 load ports and 2 store ports.
  • Vector computation (FMA) and AI (VNNI) throughput doubled compared to Crestmont.

The increase in execution resources, especially vector/AI throughput, shows that Intel is positioning E-core not only for traditional integer tasks but also for increasingly popular multimedia and AI inference workloads.

Memory Subsystem and RAS Features

To “feed” a 288-core processor, the memory subsystem must be extremely capable. Darkmont can sustain up to 48 bytes per cycle from the L1 cache (3 loads × 16 bytes per load). The bandwidth between the L1 and L2 caches is doubled compared to Crestmont, reaching 128 bytes per cycle. The prefetchers at all cache levels have been improved, and they can also receive “telemetry” from the SoC fabric to adjust how aggressively they run, avoiding unnecessary congestion of system bandwidth.

The real highlight of Darkmont, however, is its enterprise-grade Reliability, Availability and Serviceability (RAS) features, which are typically found only on high-end P-cores.

  • L1 Data Cache ECC: L1 data cache is equipped with Error Correcting Code, which is capable of correcting single bit errors and detecting double bit errors, ensuring data integrity right from the level closest to the processor core.
  • Data Poisoning & Recoverable Machine Check: When an uncorrectable error occurs, the system has the ability to “poison” the corrupted data to prevent it from propagating and causing damage to other parts of the system. Recoverable machine check mechanisms allow the software (operating system or hypervisor) to handle errors more accurately, possibly just terminating the application or virtual machine instead of crashing the entire system.
  • Core Lockstep: This is the most advanced RAS feature integrated. Lockstep allows two cores in the same module to operate synchronously, executing the same instruction stream. At each cycle, their results are compared. If there is any difference, even the smallest (due to a temporary hardware error), the system will immediately detect and report the error.
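
The lockstep idea described in the last item can be sketched in software: run the same work on two simulated cores, compare the results at every step, and flag the first mismatch. This is only a conceptual illustration with hypothetical names. Real lockstep is implemented in hardware and compares results every cycle, not per function call.

```python
from typing import Callable, Iterable, Optional

def run_lockstep(program: Callable[[int], int],
                 inputs: Iterable[int],
                 fault_at: Optional[int] = None) -> Optional[int]:
    """Run `program` on two simulated cores and compare results step by step.

    Returns the index of the first step where the cores disagree,
    or None if every step matched. `fault_at` injects a single bit flip
    into the second core's result to simulate a transient hardware error.
    """
    for step, value in enumerate(inputs):
        primary_result = program(value)
        shadow_result = program(value)
        if step == fault_at:
            shadow_result ^= 1  # flip the lowest bit on the shadow core
        if primary_result != shadow_result:
            return step  # mismatch detected: report instead of continuing
    return None

def workload(x: int) -> int:
    """Stand-in for the instruction stream both cores execute."""
    return x * x + 1

clean_run = run_lockstep(workload, range(10))
faulty_run = run_lockstep(workload, range(10), fault_at=3)

print(f"Clean run mismatch step:  {clean_run}")
print(f"Faulty run mismatch step: {faulty_run}")
```

The trade-off is visible even in the sketch: every result is computed twice, so lockstep gives up half the paired cores' throughput in exchange for detecting errors that would otherwise pass silently.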

Integrating a mainframe-level feature like Core Lockstep into E-core is extremely important. This shows that Intel is not only targeting the general cloud market but also segments that require extremely high reliability such as 5G Core networks, financial trading systems and scientific applications, where a single undetected data error can have catastrophic consequences. Core Lockstep is a distinct competitive advantage that competitors will find difficult to replicate in the short term, helping Intel strengthen its position in core markets.

Performance, efficiency and software

Performance and energy efficiency

Intel presented impressive performance figures for Xeon 6+ (Clearwater Forest), showing significant improvements over previous generations. Compared to the Xeon 6 6780E (144-core Sierra Forest), Clearwater Forest delivers up to 1.9x higher performance. The gain comes from improvements across the architecture: a 17% increase in IPC (Instructions Per Cycle) per core, more than 5x the LLC capacity (576 MB vs. 108 MB), 50% more memory channels (12 vs. 8) and 25% faster memory (8000 MT/s vs. 6400 MT/s).

The new platform also demonstrates up to a 23% improvement in energy efficiency across the entire load line, from idle to peak load. For data centers considering an upgrade from five-year-old infrastructure (2nd Gen Xeon), Clearwater Forest offers compelling value with up to 8:1 server consolidation. According to Intel, this can cut a cluster's power consumption by up to 750 kW and the required floor space by 71%, while delivering 3.5x more performance per watt.

Simply doubling the core count from 144 to 288 would theoretically double performance. Achieving a near-linear (1.9x) increase shows that Intel has successfully addressed potential bottlenecks. By significantly increasing the memory bandwidth, cache size and IPC performance of each core, Intel ensures that the 288 processing cores are not starved for data, allowing them to reach their full potential.
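
The scaling claim can be checked with simple ratios. The sketch below uses only figures quoted in this article and computes how the 1.9x speedup compares with the 2x core count, alongside the growth in cache and memory resources. The variable names are illustrative.

```python
# Figures quoted in the article: Sierra Forest (6780E) vs. Clearwater Forest.
OLD = {"cores": 144, "llc_mb": 108, "memory_channels": 8, "memory_mt_s": 6400}
NEW = {"cores": 288, "llc_mb": 576, "memory_channels": 12, "memory_mt_s": 8000}
QUOTED_SPEEDUP = 1.9  # "up to" figure from Intel

# Growth factor of each resource between the two generations.
ratios = {name: NEW[name] / OLD[name] for name in OLD}

# Fraction of ideal linear scaling that the quoted speedup represents.
scaling_efficiency = QUOTED_SPEEDUP / ratios["cores"]

# Peak memory bandwidth scales with channels x transfer rate.
peak_bandwidth_ratio = ratios["memory_channels"] * ratios["memory_mt_s"]

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
print(f"Scaling efficiency vs. core count: {scaling_efficiency:.0%}")
print(f"Peak memory bandwidth ratio: {peak_bandwidth_ratio:.3f}x")
```

Peak memory bandwidth grows by roughly the same factor as the quoted performance, which is consistent with the point that the additional cores are kept fed with data.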

Optimized for Cloud-Native and Telco

Intel clearly positions Xeon 6+ as the ideal platform for next-generation cloud-native operations and 5G Core networks for the telecommunications industry. These workloads have common characteristics: they are built on microservices architectures, are highly horizontally scalable and are extremely sensitive to total cost of ownership (TCO), especially power and physical space costs.

Clearwater Forest perfectly meets these requirements. With its ultra-high core density (up to 576 cores in a 2-socket server), superior power efficiency thanks to the 18A process and Darkmont microarchitecture and massive memory bandwidth, the platform enables cloud and telecom service providers to serve more users or connections per rack. This directly improves operational efficiency and profitability.

Software ecosystem

As always, powerful hardware is only half the story. Intel's competitive advantage remains its large and mature x86 software ecosystem. Developers can take advantage of Intel's highly optimized tools and libraries to get the most out of new hardware without having to rewrite low-level code.

Libraries like the Intel oneAPI Deep Neural Network Library (oneDNN) are constantly updated to support new instructions, helping to accelerate AI and deep learning models. Intel compilers are also optimized to automatically vectorize source code, taking advantage of the powerful vector processing units in the Darkmont core.

Additionally, Intel introduced advanced power management tools such as Intel Application Energy Telemetry (Intel AET). This tool provides application-level energy usage monitoring and analysis, allowing data center administrators to make intelligent optimization decisions to reduce operating costs. Tools such as AET are becoming increasingly important in the context of rising energy costs.

Specifications

| Category | Item | Specification |
|---|---|---|
| Platform | Sockets | 1S – 2S (Xeon 6900P compatible) |
| Platform | Max TDP | 300W to 500W per CPU |
| Compute & Memory | Cores | Up to 288 E-cores (Darkmont) |
| Compute & Memory | L2 Cache | Up to 288 MB (4 MB per 4-core cluster) |
| Compute & Memory | Last Level Cache | 576 MB |
| Compute & Memory | Memory | 12 channels DDR5, up to 8000 MT/s |
| Connectivity | Intel UPI | Up to 6 UPI 2.0 links (up to 24 GT/s per lane) |
| Connectivity | PCI Express | Up to 96 PCIe 5.0 lanes (x16, x8, x4, x2) |
| Connectivity | Compute Express Link | Up to 64 CXL 2.0 lanes |
| Acceleration & Security | Integrated accelerators | Up to 16 blocks (4x QAT, 4x DLB, 4x DSA, 4x IAA) |
| Acceleration & Security | AI instructions | Intel Advanced Vector Extensions 2 (VNNI/INT8) |
| Acceleration & Security | Security | Intel SGX, Intel TDX |
| Acceleration & Security | Power management | Intel AET, Intel Turbo Rate Limiter |

Conclusion

By combining the advanced Intel 18A process, groundbreaking Foveros Direct 3D packaging technology and the powerful E-core Darkmont microarchitecture, Clearwater Forest sets a new standard for compute density and power efficiency. Integrating enterprise-class RAS features like Core Lockstep into E-core also demonstrates a strategy to protect high-value markets, creating a distinct competitive barrier.

With Clearwater Forest, Intel has provided a compelling answer to its competitors in the high-performance computing segment while also demonstrating the capabilities of the IDM 2.0 model. The success of Xeon 6+ will have a positive impact not only on the future of the data center business, but also on Intel's semiconductor manufacturing business in the coming years.
