NVidia announced their new generation of GPUs at the beginning of this year, at CES in Las Vegas. The first of those “Blackwell” architecture based cards are now hitting the market, starting with the top end 5090, and the second tier 5080, which I have had the opportunity to test. The next steps down, the 5070 and 5070 Ti, will be available in February. The new Blackwell cards are the first to support PCIe 5.0, for a faster interface with the host system, and GDDR7 for greater memory bandwidth and lower power consumption. NVidia also announced laptop versions of the Blackwell lineup, which should be coming soon as well.
The Blackwell Chips
With 92 Billion transistors, and 21760 CUDA cores, the GB202 Blackwell chip in the GeForce 5090 is a beast. The GB203 chip in this GeForce 5080 has about half that many transistors at 45 Billion, which is the same total as the previous generation 4080, on the same 4nm 4N process from TSMC. This should still lead to a 15% gain in performance due to higher clock speed and faster GDDR7 memory, but without the benefit of a more efficient process, we aren’t seeing huge gains in efficiency like we did in the last Ada release. Basically this new generation doesn’t seem as impressive, because the last step (in the Ada generation) was such a huge leap forward. This is the diagram for the top end Blackwell GB202 chip.
DLSS4
These cards are expected to offer more than minimal improvements in gaming, but that is primarily due to the shift towards AI processing. NVidia also announced the newest version of their AI powered rendering acceleration “Deep Learning Super Sampling” in DLSS 4, which now can double the resolution of a ray traced frame, as well as generate up to 3 new frames to display, before the next ray traced frame is rendered. This could effectively double the performance of DLSS3, in games that support it.
The GeForce 3000 series (Ampere) doubled the resolution with AI allowing more rendered frames, the GeForce 4000 series (Ada) doubled the framerate via AI, and this development will double that again, and may be backwards compatible with existing hardware. The original frame generator used optical flow hardware exclusive to the 4000 series, but this new approach just uses tensor cores. So gamers should see a huge performance leap with DLSS4 in the new cards, but that is technically a software improvement, not due to major changes to the underlying hardware processing power. So some of those new functions will be supported on existing cards, and gamers will see improved performance without even having to buy a new card. Professional users: not so much, unless you can utilize DLSS4.
The New Product Lineup
NVidia announced a total of 4 new GPU cards, the 5090, 5080, 5070Ti, and 5070. The 5090 is a huge step forward from the 4090, with a 33% increase in core count (16K to 21K), memory bus width (384bit to 512bit), and capacity (24GB to 32GB). But the other models in the lineup are all subtle upgrades from their previous generation equivalents, with similar core counts and memory specs. The jump to GDDR7 appears to provide a 30% increase in memory bandwidth across the lineup, which is a welcome upgrade, as is the fact that they are slightly cheaper at each step.
One interesting trend to note over the last few generations of cards, is that the difference between the top end #090 series cards, and the second tier #080 series cards is increasing. When the 3090 was released, it was 20% more powerful than the 3080, and then the 4090 was 50% more powerful than the 4080. But now the 5090 is double the 5080, measured in GPU core count, memory capacity, bandwidth, or price. Basically the #080 series cards have remained around 10K cores, while the #090 series cards have grown to over 20K cores.
Another interesting observation is that the price has remained close to 10 cents per core for the last three generations. But NVidia has been able to squeeze out more performance per core in the newer generations, both in raw compute performance measured in TFLOPs, and far more effectively via AI acceleration via DLSS. So gaming frame rates will increase dramatically in the newest titles that are optimized for those changes, but GPU compute tasks won’t see quite as much of an improvement.
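The price per core figure is easy to check for the two cards covered here. The snippet below is only a rough sketch: the 5090 core count and the roughly $1000 price of the 5080 come from this article, while the 5080 core count (10752) and the 5090 list price ($1999) are assumptions taken from public spec sheets, and the function name is my own.

```python
# Launch figures for the two cards discussed here. The 5090 core count and the
# 5080 price (rounded to $1000 in the text) come from the article; the 5080
# core count and the 5090 price are assumptions from public spec sheets.
CARDS = {
    "RTX 5090": {"cuda_cores": 21760, "msrp_usd": 1999},
    "RTX 5080": {"cuda_cores": 10752, "msrp_usd": 999},
}


def cents_per_core(cuda_cores: int, msrp_usd: float) -> float:
    """Return the list price in US cents divided by the CUDA core count."""
    return msrp_usd * 100 / cuda_cores


for name, spec in CARDS.items():
    print(f"{name}: {cents_per_core(**spec):.1f} cents per core")
```

Both cards land a little under 10 cents per core at list price, which fits the trend described above. Street prices will obviously shift the result.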
Mobile GPUs
There is also a new lineup of mobile Blackwell GPUs with the same names, but different specs, and the prices are harder to define, because they come as part of an entire laptop computer. But NVidia claims that RTX 5090 systems will be under $3K, down to the RTX 5070 for $1300. We will need more details before we can draw too many conclusions, but DLSS4 will make these attractive for gaming, while professional users may not see the same level of draw to the newest generation.
New Hardware Features
There are two new hardware capabilities which were added in this generation, FP4 for AI workloads, and support for 4:2:2 video encoding and decoding for H.264 and HEVC. Intel first launched 4:2:2 hardware acceleration in 2021, and now NVidia supports that format in their cards as well. There are a number of cameras that record 4:2:2 footage in those codecs, but very few people are exporting it, due to playback compatibility limitations. (99% of MP4 files are 4:2:0)
One other update in the hardware is that the cards have more video encode and decode engines, with this 5080 having two of each, while the 5090 also has a third encode engine. This should increase the overall encode and decode capacity, and allow larger frame sizes and higher frame rates. NVidia claims this should directly increase the speed of certain professional encoding renders.
New Software Features
Besides DLSS 4 for gaming, there are new features within NVidia Broadcast, including Virtual Keylight to improve your camera feed, and Studio Voice, to improve your microphone feed. These should work in any relevant application, because NVidia Broadcast functions by creating virtual devices for your applications to see. Then it runs GPU accelerated processing on your AV feeds (Speakers, Microphone, and Webcam) which are then sent to your applications.
The new audio feature being added is Studio Voice, which “enhances the quality of your mic to simulate a high end recording studio.” Now I have a pretty decent Yeti microphone, so I didn’t hear a major difference at first, although I could identify some background audio removal. But once I pulled out my headset mic, I got more substantial results, raising the levels and filtering the background. Will it put actual recording rooms out of business? Of course not, but it can increase the quality of your microphone input, at the cost of about 20% of your GPU power. So this is intended for recordings and live calls, not necessarily game streaming, where the performance hit would be too high.
This is even more true for the new Virtual Keylight webcam effect. It identifies the subject, and then brightens their face and softens the shadows, without removing much detail. I feel that it looks a little plastic-y, which is common for highly processed video, but there are no user controls to fine tune the effect, besides temperature. This is nice for beginners because there is very little they need to learn, but limits the higher end users to the default setting. I wish I could scale the effect back 50% in my case. And it uses pretty much all of my GPU power to process my feed, regardless of the resolution. So this is not for gaming, it is exclusively for optimizing video calls, where your GPU is otherwise unused. This makes sense, in that they dialed up the processing to maximum, to fully utilize what is available, but I would have limited it to 90% so it didn’t affect performance of minimally graphics intensive applications. But it is a good tool to have in the chest, especially if your webcam or lighting situation leaves much to be desired.
The Card Itself
The GeForce 5080 card I received to test and review comes in a pretty innovative package, even more so than the previous generations. The packaging is supposed to be more sustainable, and the inner box uses no ink. Instead the info is cut into the layers of (presumably recycled) cardboard that the box is made from.
Unlike the previous two generations, both the new Founders Edition cards only take up two PCIe slots. They do accept the same 16pin power cord as the previous generation cards, but confusingly enough, the plug is different, with an improved design. After the connection issues and melting problems of the original implementation, the ’12VHPWR’ plug on the card side was improved to the ’12V-2×6’ design, which has different length pins to fit into the existing cables. Basically the sensing pins are noticeably shorter, so they only make contact if the cable is fully inserted, triggering a UI warning if you boot the system without the power cable fully installed. This is because poor connections from improperly seated power cables led to melting issues with previous generation cards.
The cable also fits in at an angle, to a recessed plug, which should help it fit more easily into a larger selection of cases. So you can use your existing 16pin cord to power the new cards, but they are less likely to have issues now. The new card also has the same basic output ports, with 3 DisplayPort connectors and 1 HDMI. (Interestingly all of the ports are inverted, compared to the previous generation.) The DisplayPort connectors are now 2.1, increasing the display bandwidth from 32Gbps to 80Gbps, and supporting 4K @ 480Hz or 8K @ 120Hz monitors with Display Stream Compression.
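To see why Display Stream Compression is part of that claim, it helps to estimate the uncompressed data rate of those two display modes. This is a back-of-the-envelope sketch, assuming 10-bit RGB (30 bits per pixel) and ignoring blanking intervals and link encoding overhead, so real requirements are somewhat higher. The function name and the constant are my own.

```python
def uncompressed_gbps(width: int, height: int, refresh_hz: int,
                      bits_per_pixel: int = 30) -> float:
    """Active-pixel data rate in Gbit/s, ignoring blanking and link overhead."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


DP21_LINK_GBPS = 80  # raw DisplayPort 2.1 link rate quoted in the article

modes = {
    "4K @ 480Hz": (3840, 2160, 480),
    "8K @ 120Hz": (7680, 4320, 120),
}

for label, (w, h, hz) in modes.items():
    rate = uncompressed_gbps(w, h, hz)
    fits = rate <= DP21_LINK_GBPS
    print(f"{label}: {rate:.1f} Gbit/s uncompressed, fits without DSC: {fits}")
```

Both modes push the same number of pixels per second, and under these assumptions both need more than the 80Gb link can carry uncompressed, which is where DSC comes in.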
Installing the card was simple, and once I had the correct driver, I was good to go. Most applications, like Premiere and Photoshop took advantage of the card right away. Others like Resolve required a beta version, and some like CineBench don’t support it at all yet. I ran some benchmarking tests with mixed results. I have not tested the 4080 or 3080 in the past, so most of my previous comparison data is based on the top end 4090 and 3090 cards. While this new 5080 can compete well with the 5 year old 3090, it is not going to best the 4090 unless it uses DLSS4 in supported games. I expect that the 5090 would set new records in nearly every test, but the 5080 is only half of that card. It does beat the older professional RTX A4500 in most measurements, but that is hardly a surprise.
Benchmarking
Blender and V-Ray’s benchmarking tools both revealed that my new GeForce 5080 can’t quite beat the previous flagship 4090, although it is pretty close. But it should beat every other card on the market, including the 3090, by a secure margin.
The only application I tested which can leverage the new DLSS4 optimizations was 3D Mark, which showed that previous DLSS implementations tripled the frame rate, but new 4X frame generation support in DLSS4 offered 6x the performance of straight ray traced rendering. While less than the theoretically possible 8x improvement, 6x the frame rate is nothing to scoff at, and should greatly improve the smoothness of top end games on high resolution displays with high refresh rates.
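The 8x figure comes from multiplying the two AI steps together. Here is a minimal sketch of that arithmetic, assuming (as this article does) that upscaling roughly doubles the frame rate and that generated frames cost nothing. Neither assumption holds exactly in practice, and the function names are only illustrative.

```python
def theoretical_multiplier(upscale_gain: float, generated_per_rendered: int) -> float:
    """Frame-rate multiplier over native rendering if every AI step were free."""
    return upscale_gain * (1 + generated_per_rendered)


def efficiency(measured: float, theoretical: float) -> float:
    """Fraction of the theoretical multiplier that was actually measured."""
    return measured / theoretical


# DLSS3: 2x from upscaling, 1 generated frame per rendered frame.
dlss3_peak = theoretical_multiplier(upscale_gain=2.0, generated_per_rendered=1)
# DLSS4: 2x from upscaling, up to 3 generated frames per rendered frame.
dlss4_peak = theoretical_multiplier(upscale_gain=2.0, generated_per_rendered=3)

print(f"DLSS3: measured 3x of {dlss3_peak:.0f}x ({efficiency(3, dlss3_peak):.0%})")
print(f"DLSS4: measured 6x of {dlss4_peak:.0f}x ({efficiency(6, dlss4_peak):.0%})")
```

By this rough measure, both generations deliver the same share of their theoretical ceiling in my 3D Mark results.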
4:2:2 HEVC Encode and Decode Acceleration
The new 4:2:2 HEVC support is probably the single biggest performance leap for non-gamers in this new architecture, for the select users who need it. While H.264 is also supported, HEVC is far more likely to be the compression of choice for 4:2:2 content. While the hardware capability is available now, the software is still catching up, because this is new functionality. Resolve will support accelerated 4:2:2 input and output in the near future, and I was able to test a beta version. The new accelerated encoding was nearly 20 times faster than CPU encoding of 10bit 4:2:2 HEVC files in Resolve. I can tell that accelerated decoding is properly implemented for playback by analyzing the CPU and GPU usage. So I was surprised that the CPU encoding speeds were consistent across cards, because accelerating the decoding of the 4:2:2 source content should free up the CPU for the encode threads. But you can see it in the NVidia encodes, which took 3 times as long on the older cards, even though I was going to an accelerated 4:4:4 output (because 4:2:2 output is not supported at all on those cards). In that case, the decoding of the 4:2:2 source footage is what made the difference. (The 5080 was doing GPU decode and encode, while the others were doing CPU decode, and GPU encode.)
Similarly, Premiere Pro doesn’t support encoding to 4:2:2 HEVC files, but playback of 4:2:2 footage can be accelerated in the timeline and during export. Currently this is only supported on Intel QuickSync hardware, but I expect that it will be supported on these new NVidia cards in a coming software update. I recently created a new series of benchmark tests for Premiere Pro and Media Encoder, which utilize Adobe’s new color management system. All of the various color space calculations do tax the GPU more, which makes it a good test of performance. Surprisingly the Blackwell card did not fare as well, which might be related to needed driver or software optimizations, or just because it has less GPU memory and this was mostly 8K source content.
Currently 4:2:2 H.264 and HEVC footage is a bit of a corner case. While 4:2:2 video is common in professional workflows, it is usually stored in less compressed formats, like ProRes or DNxHR. Highly compressed formats like H.264 and HEVC valued size over quality, and one of the first steps to save space was to drop to 4:2:0 color resolution before further compressing the data. Now that HEVC is hardware accelerated, it offers good performance at higher bitrates, leading to the need for higher quality than was supported in initial implementations. 4:2:2 HEVC files could become the basis for pro level video workflows in the future, to compete with ProRes, due to the increased quality while supporting 10bit and HDR content, and full hardware acceleration for peak performance. Both Sony and Canon use this underlying 4:2:2 HEVC technology in variations of their XAVC and XF formats, which hopefully will benefit from accelerated playback on these new video cards. Exporting in that 4:2:2 format offers some interesting workflow possibilities as well, but most final delivery HEVC files will remain 4:2:0 for the foreseeable future, because many HEVC playback devices for end viewers don’t support the newer 4:2:2 files. But if you do edit XAVC or Canon XF content regularly, buying a Blackwell GPU should greatly accelerate your performance.
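For anyone unfamiliar with the notation, the space savings from dropping color resolution can be worked out directly from the J:a:b numbers. The sketch below is illustrative only: it computes uncompressed frame sizes before any codec compression is applied, the function names are my own, and the UHD 10-bit frame is just an example.

```python
def samples_per_pixel(scheme: str) -> float:
    """Average samples stored per pixel for a J:a:b chroma subsampling scheme.

    In a block J pixels wide and 2 rows tall, every pixel keeps its luma
    sample, while each of the 2 chroma planes keeps `a` samples on the first
    row and `b` samples on the second row.
    """
    j, a, b = (int(part) for part in scheme.split(":"))
    luma = 1.0
    chroma = 2 * (a + b) / (2 * j)
    return luma + chroma


def raw_frame_megabytes(width: int, height: int, scheme: str, bit_depth: int = 10) -> float:
    """Uncompressed size of one frame in megabytes (10^6 bytes)."""
    bits = width * height * samples_per_pixel(scheme) * bit_depth
    return bits / 8 / 1e6


for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    size = raw_frame_megabytes(3840, 2160, scheme)
    print(f"{scheme}: {samples_per_pixel(scheme):.1f} samples/pixel, {size:.1f} MB per UHD frame")
```

So 4:2:2 carries a third more raw data than 4:2:0, and twice the color resolution, which is the tradeoff those camera formats are making.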
Concluding Observations
While the new GeForce 5090 is huge record setting news, this GeForce 5080 is a more moderate update. The top tier cards have made huge leaps forward in overall raw performance over the last few generations, but the lower tier models have only made modest gains. Yes, they are getting faster, but only by 5-10% in many cases. The big exciting leaps in performance have been due to the shift to AI frame generation, which is effective in video games, but does not improve video processing performance or other professional graphics processing tasks. There are new hardware accelerated encoding and playback features, like AV1 last time, and 4:2:2 color coming out now, but if users aren’t using those file types, they are of no general benefit.
So if you are doing high level grading and VFX work with a 4090, the new 5090 will probably be a noticeable improvement. But if you are on a lower tier 4080, the jump to this 5080 probably won’t make as significant a difference, unless you are editing 4:2:2 MP4 source footage on an AMD desktop. (Intel already accelerates 4:2:2 playback in the CPU.) For laptops, it may be similar, in that the performance jumps will be marginal unless you are using DLSS4 to play games. But for professional uses, DLSS is rarely going to be a factor. Similarly, if you are doing a very particular type of AI processing, then the new FP4 support will be helpful, but like the codecs, those are narrow corner cases, instead of general use. If you are using any older Ampere series card, even a 3090, this new GeForce 5080 that I tested would be an upgrade that will take less space in your system, and provide a performance boost, as well as support for all of the newest features and codecs. And it “only” costs $1000, if you can find one for MSRP. But you can start looking for one soon, as the new cards release January 30th.