AMD Launches MI350P: A 144GB PCIe Card Challenging Nvidia

Key Takeaways

- The MI350P runs on CDNA4 architecture with 144GB HBM3E and 4TB/s bandwidth in a standard PCIe dual-slot form factor
- AMD claims roughly 40% faster FP16 and FP8 theoretical compute versus Nvidia's H200 NVL
- The card supports 450W or 600W power configurations for different server environments

AMD has unveiled the Instinct MI350P, a PCIe AI accelerator that brings the company's latest CDNA4 architecture to standard rack-mounted servers. The card packs 144GB of HBM3E memory and aims to give data centers a straightforward upgrade path for AI inference workloads.
The MI350P is designed as a drop-in solution for existing air-cooled servers. It fits into a 10.5-inch dual-slot form factor with a fanless design that relies on chassis airflow. AMD rates it for 600W but includes a 450W configuration for power-constrained environments.
What's Inside the MI350P
The MI350P runs on AMD's CDNA4 architecture, built on TSMC's 3nm and 6nm FinFET processes. It features 8,192 cores, 128 compute units, and 512 Matrix Cores with a maximum clock speed of 2.2GHz. The GPU pairs with 144GB of HBM3E memory offering 4TB/s of bandwidth and includes a 128MB last-level cache.
These specs are exactly half of what AMD's flagship MI350X and MI355X OAM accelerators offer. The tradeoff is clear: you get a card that fits into standard PCIe infrastructure instead of requiring specialized OAM platforms.

Performance Claims and Target Workloads
AMD claims the MI350P delivers roughly 40% faster FP16 and FP8 theoretical compute compared to Nvidia's H200 NVL, the current top PCIe AI accelerator. The company estimates 2,299 TFLOPs of FP16 performance and 4,600 peak TFLOPs using the MXFP4 format.
The card supports lower-precision MXFP6 and MXFP4 formats natively, matching the capabilities of the higher-end MI350X and MI355X. These formats accelerate large language model inference, where reduced precision can speed up computations without meaningfully affecting output quality.
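To make the low-precision idea concrete, here is a minimal sketch of block-scaled 4-bit quantization in the spirit of MXFP4. The details are assumptions drawn from the OCP Microscaling format family, not from AMD's announcement: values are grouped into blocks that share one power-of-two scale, and each element is rounded to the nearest FP4 (E2M1) magnitude from the set {0, 0.5, 1, 1.5, 2, 3, 4, 6}.

```python
import math

# FP4 (E2M1) representable magnitudes, per the OCP Microscaling spec.
FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block(values):
    """Quantize one block of floats to a shared scale plus FP4 values.

    Returns (scale, dequantized_values) so the rounding error is visible.
    """
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 1.0, [0.0] * len(values)
    # Power-of-two scale chosen so the largest magnitude fits under 6.0,
    # the FP4 maximum, without clipping.
    scale = 2.0 ** math.ceil(math.log2(amax / 6.0))
    quantized = []
    for v in values:
        mag = min(abs(v) / scale, 6.0)
        nearest = min(FP4_MAGNITUDES, key=lambda m: abs(m - mag))
        quantized.append(math.copysign(nearest, v) * scale)
    return scale, quantized
```

Real MXFP4 hardware packs the 4-bit codes and the shared 8-bit scale rather than materializing dequantized floats; the sketch only shows why a block-shared scale preserves most of the dynamic range that a flat 4-bit format would lose.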
AMD is targeting inference and retrieval-augmented generation (RAG) pipelines with this card. Up to eight MI350P cards can work together in a single system, letting data centers scale performance incrementally.
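The capacity math behind that eight-card ceiling is straightforward. The sketch below uses only the figures from the article (144GB per card, up to eight cards); the one-byte-per-parameter FP8 rule of thumb and the 1.2x overhead factor for KV cache and activations are illustrative assumptions, not AMD guidance.

```python
# Aggregate memory across the maximum eight-card MI350P configuration.
CARD_MEMORY_GB = 144
MAX_CARDS = 8

total_gb = CARD_MEMORY_GB * MAX_CARDS  # 1152 GB of HBM3E

# Hypothetical sizing rule: an FP8 model needs ~1 byte per parameter,
# plus headroom for KV cache and activations (the 1.2 factor is assumed).
def fits_in_system(params_billions, bytes_per_param=1, overhead=1.2):
    needed_gb = params_billions * bytes_per_param * overhead
    return needed_gb <= total_gb

print(total_gb)              # 1152
print(fits_in_system(700))   # True: ~840 GB needed
```

By this rough measure, a hypothetical 700-billion-parameter model quantized to FP8 would fit comfortably across a full eight-card system.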
How It Stacks Up: MI350 Family Specs
| Spec | MI350P PCIe | MI325X OAM | MI350X OAM | MI355X OAM |
|---|---|---|---|---|
| Architecture | CDNA 4 | CDNA 3 | CDNA 4 | CDNA 4 |
| Memory | 144GB HBM3E | 256GB HBM3E | 288GB HBM3E | 288GB HBM3E |
| Memory Bandwidth | 4 TB/s | 6 TB/s | 8 TB/s | 8 TB/s |
| FP64 Performance | 36 TFLOPs | — | 72 TFLOPs | 78.6 TFLOPs |
| FP16 Performance | 2.3 PFLOPs | 2.61 PFLOPs | 4.6 PFLOPs | 5 PFLOPs |
| FP8 Performance | 4.6 PFLOPs | 5.22 PFLOPs | 9.2 PFLOPs | 10.1 PFLOPs |
| FP4 Performance | — | — | 18.45 PFLOPs | 20.1 PFLOPs |
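As a sanity check on the "exactly half" claim, a few lines of Python confirm the relationship using only the figures from the table above:

```python
# Spec figures copied from the table (PCIe MI350P vs. OAM MI350X).
mi350p = {"memory_gb": 144, "bandwidth_tbs": 4, "fp64_tflops": 36,
          "fp16_pflops": 2.3, "fp8_pflops": 4.6}
mi350x = {"memory_gb": 288, "bandwidth_tbs": 8, "fp64_tflops": 72,
          "fp16_pflops": 4.6, "fp8_pflops": 9.2}

# Every listed spec of the PCIe card is half the OAM flagship's.
ratios = {k: mi350x[k] / mi350p[k] for k in mi350p}
print(ratios)  # every ratio is 2.0
```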
Why PCIe Still Matters
The MI350P fills a gap in AMD's lineup. While the MI350X and MI355X target purpose-built AI infrastructure with OAM form factors, many organizations run AI workloads on existing server hardware. A PCIe card that slots into standard infrastructure reduces the barrier to adoption.
Nvidia's H200 NVL currently dominates this segment. If AMD's 40% performance claims hold up in real-world testing, the MI350P could give enterprises a compelling alternative, particularly those already running AMD systems or looking to diversify their GPU suppliers.
What We Don't Know Yet
AMD has not announced pricing or availability for the MI350P. The 40% performance advantage is a theoretical compute claim, which rarely translates directly to real-world workloads. Independent benchmarks comparing the MI350P to the H200 NVL on actual inference tasks will be crucial.
Software maturity is another consideration. Nvidia's CUDA ecosystem has years of optimization and broad framework support. AMD's ROCm platform has improved but still trails in some areas. Enterprises will weigh hardware performance against the total cost of software migration.
Frequently Asked Questions
How much memory does the AMD MI350P have?
The MI350P comes with 144GB of HBM3E memory offering 4TB/s of bandwidth.
Is the MI350P faster than Nvidia's H200 NVL?
AMD claims roughly 40% faster FP16 and FP8 theoretical compute compared to the H200 NVL. Real-world performance will depend on specific workloads.
What power does the MI350P require?
The card can be configured for either 600W or 450W operation, depending on server thermal and power constraints.
How many MI350P cards can run in one system?
Up to eight MI350P cards can be paired together in a single system for scaled performance.
What architecture does the MI350P use?
The MI350P runs on AMD's CDNA4 architecture, built on TSMC's 3nm and 6nm FinFET processes.
Source: Tom's Hardware
Huma Shazia
Senior AI & Tech Writer