Regarding the original post's question, the answer is "it depends."
I have worked on professional workstations for a few decades. There is a lot of general-user/general-knowledge information posted here (including in the YouTube video), but some of it lacks nuance from an engineering perspective.
If cost were no issue, then a W7-2575X or higher would not be a problem. However, special care may be needed when configuring the operating system, software, BIOS, etc., if there are strict clock-rate and latency requirements. For example, both CPUs will reach a maximum turbo rate of 4.8 GHz when only a few cores are loaded, but the BIOS/OS must allow C6 sleep/core parking to hit those clocks. That introduces latency when switching rapidly between a few and many active cores (which the OS scheduler may do). This latency is usually only an issue for applications with time-critical audio loops (monitoring). The workaround for getting the best of both worlds is to enable core parking on most cores while leaving a few cores always active. Further tweaks to the OS scheduler or power management are needed for special applications.
What about the lower base clock rate? With turbo enabled, you only drop to the base clock when the heaviest CPU instructions (e.g., AMX tile matrix multiplication) are running and all cores are maxed out. Most of the time, you will run about 400 MHz higher with all cores loaded on AVX2/AVX-512 work, or about 600 MHz higher on standard IA/SSE instructions (very common). This assumes adequate power and thermal management, and the figures above are specific to the Sapphire Rapids CPUs.
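To make the offsets concrete, here is a small sketch using the figures above. The 2.8 GHz base is taken from the W7-2595X spec mentioned later in this post; the exact offsets vary by SKU, power limit, and cooling, so treat these as illustrative numbers rather than guarantees.

```python
# Worked example of the all-core clock offsets described above.
# BASE_GHZ is the W7-2595X's spec base clock, reached only when all
# cores are maxed out on the heaviest (AMX tile) instructions.
BASE_GHZ = 2.8

OFFSET_GHZ = {
    "AMX": 0.0,           # heaviest instructions: base clock only
    "AVX2/AVX512": 0.4,   # ~400 MHz higher, all cores loaded
    "IA/SSE": 0.6,        # ~600 MHz higher, standard instructions
}

def all_core_clock(instruction_mix: str) -> float:
    """Approximate all-core clock (GHz) for a given instruction mix."""
    return round(BASE_GHZ + OFFSET_GHZ[instruction_mix], 1)

print(all_core_clock("AVX2/AVX512"))  # 3.2
print(all_core_clock("IA/SSE"))       # 3.4
```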
Instructions per cycle (IPC) has improved over time and isn't equal across vendors, so comparing a 4 GHz AMD part to a 4 GHz Intel part isn't a fair comparison. A single core of a 4.1 GHz Intel Pentium G6500 is slower than a single core of a W7-2575X, even though the W7-2575X lists a 3.0 GHz base clock; in practice it uses turbo modes to run anywhere from 3.6 GHz to 4.8 GHz.
If your goal is the highest clock rates for low-threaded applications, the W5-2545, W5-2555X, and W5-2565X are the sweet spot for this generation of CPUs. However, this application appears to be multi-threaded-optimized, in which case higher-core-count CPUs will be better. When you are only using the UI and few cores are loaded, the CPU will turbo up to 4.8 GHz and be highly responsive; once many cores are loaded, clock rates will drop to around 3.8 GHz (W5-2565X). I like how VMware presents this for high-core-count CPUs: it multiplies the clock rate by the number of cores, which gives you an idea of the total work the CPU could do if fully loaded. For example (excluding HT for simplicity), the W7-2595X has 26 cores at a minimum of 2.8 GHz (72.8 GHz total), while the W7-2575X has 22 cores at a minimum of 3.0 GHz (66 GHz total). So if your application truly uses all available cores, even at slightly lower clocks per core, the higher-core-count CPU does more total work in the same time (about 10% more in this case). Also remember that, if properly configured, the CPU will push toward its maximum TDP when not all cores are in use: give the W7-2595X only 10 cores of work and those cores will likely run around 4.4 GHz, which should be fast enough for general use.
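The aggregate-clock comparison above can be reproduced in a couple of lines (physical cores times minimum all-core clock, HT excluded for simplicity, per the VMware-style presentation):

```python
# Back-of-envelope "aggregate GHz" comparison from the paragraph above:
# physical cores x minimum all-core clock, hyper-threading excluded.
def aggregate_ghz(cores: int, min_clock_ghz: float) -> float:
    return round(cores * min_clock_ghz, 1)

w7_2595x = aggregate_ghz(26, 2.8)  # 72.8 "aggregate GHz"
w7_2575x = aggregate_ghz(22, 3.0)  # 66.0

advantage_pct = (w7_2595x / w7_2575x - 1) * 100
print(f"{w7_2595x} vs {w7_2575x}: +{advantage_pct:.0f}% for the higher-core-count part")
```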
This still comes down to "it depends": if the application really maxes out the CPU and lowers the clock rates, GPU performance may suffer because the GPU may not be fed new instructions fast enough (this does not apply to encoding/decoding). Does anyone have CPU and GPU usage reports from when the program is in use and taxing the system? Finding the bottleneck would answer which matters more here: higher clock speed/lower latency, or more cores/more throughput. Also consider that "consumer" CPU/motherboard combos usually get newer cores and technologies sooner, with fewer cores; the new Core Ultra 2/3 series can hit over 5 GHz! Benchmark the same RTX Pro 6000 on a low-thread workload, and the higher-clocked consumer CPU will usually come out significantly ahead. If that series had more PCIe lanes available, it would probably be the better fit here. These are the CPUs used in the Opal/Ruby/Topaz HD systems, and they probably only move to the workstation platform for the 12G SDI input bandwidth over the CPU-attached PCIe slots.
Because adding cores to recent CPUs reduces clock rates, current datacenter CPUs are increasingly configurable. Increasingly, you can buy a high-core-count CPU and then set its total power budget, pin a specific number of cores to higher base clock rates (while lowering the others to stay within limits), limit the total number of enabled cores, raise the clocks of all enabled cores, and so on. Intel calls this "Speed Select Technology."
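The trade-off behind that feature can be sketched as a toy model: within a fixed budget, pinning a few cores at a higher clock forces the remaining cores lower. All numbers here are illustrative assumptions (using aggregate clock as a crude proxy for the power budget), not Intel specifications or the actual SST algorithm.

```python
# Toy model of the Speed Select idea: trade clock rate on most cores
# for a higher pinned clock on a chosen few, within a fixed budget.
# Illustrative numbers only; not Intel's actual behavior.
TOTAL_CORES = 26
BUDGET_GHZ = 26 * 2.8  # aggregate clock used as a crude power-budget proxy

def split_clocks(high_cores: int, high_ghz: float) -> float:
    """Given high_cores pinned at high_ghz, return the clock (GHz) the
    remaining cores must drop to so the aggregate stays within budget."""
    remaining = TOTAL_CORES - high_cores
    return round((BUDGET_GHZ - high_cores * high_ghz) / remaining, 2)

# Pin 4 cores at 3.6 GHz; the other 22 fall to roughly 2.65 GHz
print(split_clocks(4, 3.6))
```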
Encoding performance of recent NVIDIA GPUs: NVIDIA has posted performance numbers (see the NVENC Application Note in the NVIDIA Video Codec SDK documentation). Performance depends on the quality settings and codec used. If you need 60 fps streams, a top-of-the-line RTX Pro Blackwell card with one NVENC engine encoding AV1 at 1080p/YUV 4:2:0/8-bit can handle 16 streams at the lowest quality preset and 6 at the highest. If the Blackwell GPU has multiple NVENC engines, multiply the stream count by the number of encoders available. Note that the highest-quality VBR HEVC encoding from Turing through Blackwell is about three streams at 60 fps per NVENC.
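The stream-count arithmetic scales linearly with the engine count, per the figures quoted above:

```python
# 1080p AV1 60 fps stream capacity per NVENC engine, from the
# figures quoted above; total capacity scales with engine count.
STREAMS_PER_ENGINE = {"lowest_preset": 16, "highest_preset": 6}

def max_streams(preset: str, nvenc_engines: int) -> int:
    """Total simultaneous 60 fps streams for a given preset and GPU."""
    return STREAMS_PER_ENGINE[preset] * nvenc_engines

print(max_streams("lowest_preset", 2))   # 32 streams with two engines
print(max_streams("highest_preset", 2))  # 12
```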
About the Gigabyte MW83-RP0 issues brought up:
Cooling is handled by the case airflow design. If done correctly, cooling is not an issue.
Two M.2 drives hang off the chipset DMI link. This would only be a realistic issue if the M.2 drives could sustain 7+ GB/s while you are also using the remaining bandwidth via Thunderbolt. Otherwise, you are unlikely to hit a limit that causes latency.
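A rough headroom check makes the point. This assumes a DMI 4.0 x8 uplink with roughly 15.75 GB/s usable, as on W790-class boards; actual usable bandwidth varies with protocol overhead, so treat these numbers as an estimate.

```python
# Rough headroom check for two chipset-attached M.2 drives sharing the
# DMI uplink with Thunderbolt/USB. ~15.75 GB/s is an assumed usable
# figure for a DMI 4.0 x8 link, before protocol overhead.
DMI_GBPS = 15.75
M2_SUSTAINED_GBPS = 7.0  # per-drive sustained rate from the text

def dmi_headroom(active_m2_drives: int) -> float:
    """Bandwidth (GB/s) left on the DMI link for Thunderbolt, USB, etc."""
    return round(DMI_GBPS - active_m2_drives * M2_SUSTAINED_GBPS, 2)

print(dmi_headroom(2))  # ~1.75 GB/s left before the uplink saturates
```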
Regarding some points in the video:
A professional graphics card uses two slots and exhausts heat outside the case. Their diagram shows a card using 3+ slots, probably a consumer graphics card.
M.2 drives getting warm is normal; this is a case/cooling design issue to resolve.
The AMD CPU cooler bracket mounting issue looks like a problem specific to the cooler being used rather than a motherboard issue.
Seeing frame drops during streaming - I would like to see data indicating whether this was due to networking or to other system bottlenecks.
Storage performance concerns - if sustained writes (and write endurance) are a concern, I would not recommend consumer drives. Get enterprise drives rated for the workload, or RAID them if even higher performance is needed.
I suggest going with workstations from the major OEMs (HP, Lenovo, Dell), as they have already handled the cooling, design, and performance. A custom workstation would also be an option; it would use similar hardware or be built to your specs while addressing the power and cooling concerns. There might be some advantage to building your own, but I found I could never match the dedicated companies, which have far more resources and time invested in getting it right. I spent more time and money handling custom builds, workarounds, and independent support solutions.
I have a personal "small/entry" Z4 G5 workstation. It is enough for most use cases. It has a W5-2465X, and if I were to upgrade, it would be to the W7-2575X or W7-2595X (both 250 W TDP), since the more power potentially available, the higher my turbo clocks can stay at low core usage. On average, my Windows 11 system keeps 6-8 cores active due to background tasks, which leaves cores around 4.2 GHz; with a higher-TDP CPU, I could probably hit 4.4-4.5 GHz. When all cores are in use, they run at 3.7 GHz, even though the CPU spec lists a 3.1 GHz base rate. GPU: RTX 5000 Ada, 2-slot, blower style, 250 W, 2x NVENC/2x NVDEC + 1x AV1 encode/decode. Thunderbolt 4, Wi-Fi 7, Mellanox 25/50 Gb/s NIC, 6x NVMe storage drives: 4x Optane in a 6 TiB Intel VROC RAID, 1x general use/OS Samsung 4 TiB 9100 Pro, 1x Optane 5800X-series accelerator/cache. 128 GiB ECC RAM (4x 32 GiB), 2x SATA 14 TiB HDDs, 1175 W PSU. I use USB3 20 Gb/USB4/Thunderbolt 4 for some additional external connections. If I needed more inputs/outputs, I'd probably step up to a physically larger system and CPU platform like a Z6/Z8. I don't do massive video production on my personal systems, so that would be overkill for me. But if you are doing this professionally (making $ off the work), spend the $ on the hardware.
Info on next-gen Granite Rapids CPUs is leaking out, so they will probably arrive in the coming year. Sadly, they are not based on Panther Cove, which we won't see until Diamond Rapids, so that HEDT refresh might be in 2027 - meaning Granite Rapids is a poor bet for longevity. Fortunately, Nova Lake (Core Ultra 4 series) should be out next year with 32 PCIe lanes on the CPU and DMI 5.0 x8, meaning the coming "consumer"-level enthusiast parts should be enough to handle what Obtanium is trying to accomplish. HEDT will probably still have a place if we keep pushing for higher resolutions and sources, such as 16x 8K 120 fps simultaneous streams.