Live Production Software Forums


Integrion  
#1 Posted : Wednesday, January 21, 2015 1:35:23 AM(UTC)
Integrion

Rank: Newbie

Groups: Registered
Joined: 1/21/2015(UTC)
Posts: 7
Location: New York

Hi there,

First off, I'm new to the forum, so a big hello to everyone here. Before posting, I searched to see whether anyone else had reported what I'm experiencing but didn't find anything similar, so here goes:

I have a perplexing issue involving VMIX. I have a Dell T7500 workstation configured with dual Xeon E5645 processors (6 cores each, 2.4 GHz), 24 GB RAM, and dual Nvidia Quadro NVS420 graphics cards running the latest Nvidia driver (341.21). I have six 24-inch 1920x1080 monitors attached to this machine. I am running VMIX HD v14.0.0.112 x64. For video camera inputs, I use three Logitech C920 webcams configured for 1280x720 at 30p. I primarily use this system for software development and for creating and streaming online multimedia presentations.

My issue is as follows. When running VMIX, I have noticed that VMIX is maxing out one of my four Quadro NVS420 GPUs (94%+) while the other 3 GPUs just sit idling doing nothing. As a result, I frequently get jerky streaming video.

While I presume that VMIX is fairly GPU compute intensive, it would appear that VMIX is only taking advantage of one of our 4 GPUs in our system.

In summary, I'm not sure if this is the correct question to ask, but is there any way to tune VMIX, or to tell it to load balance across our system's available GPUs? The single maxed-out GPU appears to be the reason I am experiencing jerky streaming video with our system.

Any assistance would be deeply appreciated.

Thanks,
JeanClaude
Integrion attached the following image(s):
Nvdia GPU Utilization -VMIX HD.jpg (28kb) downloaded 1 time(s).

admin  
#2 Posted : Wednesday, January 21, 2015 4:49:12 AM(UTC)
admin

Rank: Administration

Groups: Administrators
Joined: 1/13/2010(UTC)
Posts: 5,208
Location: Gold Coast, Australia

Was thanked: 4288 time(s) in 1520 post(s)
Hi JeanClaude,

Unfortunately it is not possible to load balance or distribute processing to multiple graphics cards.
The main reason being the memory bandwidth intensive nature of vMix. It has to transfer large amounts of
video to the GPU every second and having to transfer this to multiple cards simultaneously would negate any benefit
of load balancing.

For what you are doing, vMix should not need much GPU power anyway, but unfortunately the NVS420
is quite slow compared to current mid range GeForce cards.

Regards,

Martin
vMix
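
A rough check on the bandwidth point in the reply above: the short Python sketch below estimates how much uncompressed video the three webcams from post #1 would produce per second. The 4 bytes per pixel figure is an assumption (a BGRA-style frame layout); the thread does not say which pixel format vMix uploads, so treat the result as an order-of-magnitude figure only.

```python
# Rough estimate of uncompressed video traffic from system memory to the GPU.
# Assumption (not stated in the thread): 4 bytes per pixel, e.g. a BGRA layout.
BYTES_PER_PIXEL = 4


def feed_bytes_per_second(width, height, fps, bytes_per_pixel=BYTES_PER_PIXEL):
    """Bytes per second for one uncompressed video feed."""
    return width * height * bytes_per_pixel * fps


# Three webcams at 1280x720, 30 frames per second, as described in post #1.
per_camera = feed_bytes_per_second(1280, 720, 30)
total = 3 * per_camera

print(f"per camera:    {per_camera / 1e6:.1f} MB/s")  # 110.6 MB/s
print(f"three cameras: {total / 1e6:.1f} MB/s")       # 331.8 MB/s
```

Under this assumption, sending every frame to N GPUs instead of one would multiply that upload traffic by N, which is the cost the reply refers to. The estimate covers camera inputs only and ignores outputs, overlays and any other inputs.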
Integrion  
#3 Posted : Wednesday, January 21, 2015 3:08:01 PM(UTC)
Integrion

Rank: Newbie

Groups: Registered
Joined: 1/21/2015(UTC)
Posts: 7
Location: New York

Hi Martin,

Thanks much for the reply. I think perhaps a bit more discussion is in order. The NVS420 is designed to drive multiple video displays, and each NVS420 has two physical GPUs. Each NVS420 card provides multiple physical video outputs that can be used to drive up to four separate 1920 x 1080 video displays. When more than one NVS420 is installed in a system, as is the case with our system, it is the graphics subsystem driver that leverages the GPUs available on each installed graphics card. In our case, we have 4 available GPUs. This is accomplished via the PCIe multi-channel bus which inter-communicates directly with the NVS420s without involving the CPU or the application itself.

However, for this to work correctly, the application has to be properly coded to take advantage of the available graphics subsystem hardware via the graphics subsystem's API. If the application doesn't do this, the result will be that the application will not take advantage of the available graphics hardware. This is unfortunately precisely what is happening here with VMIX and fully explains why the Nvidia GPU Utilization Monitor shows one GPU is max'ed out, whilst the other 3 GPUs sit idle doing absolutely nothing. In our case, it appears VMIX is running in what NVidia defines as Compatibility Mode. To quote Nvidia, "In this mode only [one GPU] is used by the graphics API device (or context) and any other GPU in the system may be idle, used on a separate device (for either a graphics or compute API), or used by other applications. This offers no graphics performance scaling but ensures compatibility. This is the default setting for all applications [..]." Unquote

I would actively encourage you to investigate this issue, for in our specific case, the NVS420s are a plenty powerful graphics card solution with more than sufficient graphics processing power to handle the VMIX application requirements. I am sure many other users are being negatively impacted by the choice of API mode used by VMIX. The fact that the NVS420 is quite slow relative to current mid range GeForce cards has nothing to do with the performance issue described herein. VMIX simply isn't leveraging the available graphics hardware to its advantage. By running in Compatibility Mode, VMIX limits itself to running on a single GPU regardless of how many GPUs are available to it in the graphics subsystem. In our case, the result is jerky streaming video.

Thanks much,
Jean-Claude
Ittaidv  
#4 Posted : Wednesday, January 21, 2015 6:36:52 PM(UTC)
Ittaidv

Rank: Advanced Member

Groups: Registered
Joined: 12/19/2013(UTC)
Posts: 600
Location: Belgium

Thanks: 75 times
Was thanked: 91 time(s) in 75 post(s)
Might be me, but personally, I don't see how the vast majority of users could be negatively impacted by the current architecture of vMix. Most users build their system specifically for vMix. Since it doesn't need that much GPU power, I don't see why anyone would build a multi-GPU system. Since a simple gaming card will do, most people will pick that option.

Isn't it an option to swap one Quadro card for a more powerful one?
Integrion  
#5 Posted : Wednesday, January 21, 2015 8:11:02 PM(UTC)
Integrion

Rank: Newbie

Groups: Registered
Joined: 1/21/2015(UTC)
Posts: 7
Location: New York

No, in our case, it's not an option to swap out one of the Quadro cards. Our system needs to support multiple displays as do many powerful business workstations today.

I guess the bigger question is why one would write a modern application like VMIX to spin hard on only a single GPU when virtually all of today's graphics cards are based on multiple GPUs and unified driver APIs that make it simple to leverage the horsepower and load balancing of those cards without burdening or complicating the application. You appear to confuse a multi graphics card system with a multi GPU system. The two are not the same. Even today's simple gaming cards have multiple GPUs. Your response makes no sense.
admin  
#6 Posted : Wednesday, January 21, 2015 11:07:46 PM(UTC)
admin

Rank: Administration

Groups: Administrators
Joined: 1/13/2010(UTC)
Posts: 5,208
Location: Gold Coast, Australia

Was thanked: 4288 time(s) in 1520 post(s)
Jean-Claude,

We have looked into what you are saying, however what you are suggesting is not actually how multi GPU workloads work in the real world.
While you are correct that there can be direct data transfer between the GPUs, there is a heavy penalty for doing this with a
large amount of data (such as live video), and it provides no performance benefit for what vMix is doing.

vMix is certainly not running in "compatibility mode". There just isn't the sort of API you are thinking of.

Have a look at the benchmarks for the NVS420 and you will see the performance is quite low compared to modern graphics cards.
Also, most graphics cards today support up to three simultaneous outputs. So you could look
at one powerful three output GeForce and perhaps a second card for a fourth display.

Regards,

Martin
vMix
fordry  
#7 Posted : Thursday, January 22, 2015 2:23:01 PM(UTC)
fordry

Rank: Advanced Member

Groups: Registered
Joined: 1/25/2012(UTC)
Posts: 78

Was thanked: 12 time(s) in 12 post(s)
Just for the sake of comparison and perspective: a GeForce GT 610 would handle your setup, as far as running vMix, with ease...

http://www.newegg.com/Pr...spx?Item=N82E16814130815
Integrion  
#8 Posted : Thursday, January 22, 2015 3:55:26 PM(UTC)
Integrion

Rank: Newbie

Groups: Registered
Joined: 1/21/2015(UTC)
Posts: 7
Location: New York

Martin,

Yes, you are quite right, the benchmarks for the NVS420 are quite low relative to other graphics cards that are today targeted for high-end gaming. However this is not what is causing the lack of performance in our system. The issue is that VMIX isn't taking advantage of the available GPU processing that's available to it in our system. This is what is causing the problem.

For example in our system with two NVS 420s installed, when we go to the main settings panel in VMIX under the Performance section, we see six NVIDIA Quadro NVS 420 entries and one Default entry. These entries correspond not to the number of GPUs that are available in our system, but rather to the number of display adapters that are connected to our system. In our system, we have 4 display adapters connected to one NVS 420, and 2 display adapters connected to the other. Each NVS 420 has two available GPUs that any application can access via the NVIDIA API. However, since the NVS 420 does not support SLI, the NVIDIA API does not support virtual bonding of the two installed NVS 420s as one virtual graphics processor. This is essentially what SLI provides. In lieu of this, the NVIDIA API instead provides the ability for an application to either have direct access to each GPU located on each NVS 420, or in the alternate, to access the two available GPUs on each NVS 420 as one virtual "bonded" GPU, wherein the application simply needs to read and write to the graphics card and the NVIDIA API takes over the task of load balancing the work across the two available GPUs on the graphics card (but not across the two graphics cards). This functionality as described herein is available via the NVIDIA API on all NVIDIA graphics cards that support NVIDIA drivers 244 and greater.

Unfortunately, our testing shows that VMIX does not take advantage of this native performance scaling capability. VMIX instead runs in what NVIDIA refers to as Compatibility Mode, wherein only 1 GPU is selected and used by the graphics API device (or application context) and any other GPU in the system remains idle. At minimum, VMIX should be capable of leveraging all GPUs that are available to it on a single graphics card. Preferably, however, VMIX should be capable of directly accessing all GPUs that are available to it in the system. In our system configuration with two installed NVS 420s this would provide VMIX with a total of four available GPUs. As it is now, VMIX only uses one GPU out of the four. As a result, VMIX quickly maxes out one GPU with only a few inputs added while the other GPUs remain idle.

So this really isn't a question of the VMIX application doing more work. Quite the contrary. It's a question of whether the VMIX application intelligently leverages the GPUs that are available to it on the graphics card via the driver native API. Our analysis shows that the VMIX application doesn't do this well at all. It just spins very hard and maxes out on a single GPU. Your suggestion to swap out one of the NVS 420s and to substitute a more powerful graphics card, while I appreciate this, isn't the solution for the reasons stated herein.

More detailed developer information regarding the NVIDIA API may be found at https://developer.nvidia.com/nvapi. The NVIDIA API allows direct access to all NVIDIA GPUs and drivers on all Windows platforms. I would kindly encourage you to take a look at this. If what I have discussed here sounds a bit too technically daunting, then I would also encourage you to contact NVIDIA directly about possibly sending your VMIX application to NVIDIA so that they can analyze and create a profile for your application. This may obviate the need for some of the common changes suggested in their API developer documentation to handle, for example, SLI configurations. In some cases, however, driver profiles may not be the most optimal solution, and application changes may be recommended. Once NVIDIA has created a profile for your application, this profile will be automatically added to their next driver release, making it available to all end users as soon as they install the updated driver.

I believe this would benefit your entire installed VMIX user base.

Best,
Jean-Claude
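
To make the counting in the post above concrete, here is a minimal Python sketch of the described hardware: two NVS 420 cards with two GPUs each, driving six monitors in total. The `Card` and `Gpu` classes are hypothetical names used only for illustration, and the way monitors are split across the GPUs of each card is an assumption, since the post gives only per-card totals (4 and 2).

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    displays: int  # monitors driven by this GPU


@dataclass
class Card:
    name: str
    gpus: list = field(default_factory=list)


# Two NVS 420 cards with two GPUs each, per the post.
# The per-GPU monitor split is assumed; only the per-card totals are given.
cards = [
    Card("NVS 420 #1", [Gpu(displays=2), Gpu(displays=2)]),
    Card("NVS 420 #2", [Gpu(displays=1), Gpu(displays=1)]),
]

gpu_count = sum(len(card.gpus) for card in cards)
adapter_entries = sum(gpu.displays for card in cards for gpu in card.gpus)

print("physical GPUs:", gpu_count)                  # 4
print("display adapter entries:", adapter_entries)  # 6
```

The two totals differ because the settings list described in the post has one entry per connected display rather than one per GPU.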
PFBM  
#9 Posted : Thursday, January 22, 2015 4:51:41 PM(UTC)
PFBM

Rank: Advanced Member

Groups: Registered
Joined: 3/30/2011(UTC)
Posts: 308
Location: Portugal

Thanks: 347 times
Was thanked: 35 time(s) in 30 post(s)
vMix works like a 3D game.
Get a high-powered GeForce!
Quadros are not for gamers and are not for vMix!! :)
It's a big waste of money :)


ovinas  
#10 Posted : Thursday, January 22, 2015 5:02:59 PM(UTC)
ovinas

Rank: Advanced Member

Groups: Registered
Joined: 6/4/2013(UTC)
Posts: 308
Location: Germany

Thanks: 1 times
Was thanked: 57 time(s) in 49 post(s)
Did you read Martin's replies? Don't think so...
And why should he take time for 1% of the users that are unable/unwilling to use a recommended GPU instead of investing the time for useful features for the other 99%?
fordry  
#11 Posted : Thursday, January 22, 2015 6:29:34 PM(UTC)
fordry

Rank: Advanced Member

Groups: Registered
Joined: 1/25/2012(UTC)
Posts: 78

Was thanked: 12 time(s) in 12 post(s)
Integrion wrote:
No, in our case, it's not an option to swap out one of the Quadro cards. Our system needs to support multiple displays as do many powerful business workstations today.

I guess the bigger question is why one would write a modern application like VMIX to spin hard on only a single GPU when virtually all of today's graphics cards are based on multiple GPUs and unified driver APIs that make it simple to leverage the horsepower and load balancing of those cards without burdening or complicating the application. You appear to confuse a multi graphics card system with a multi GPU system. The two are not the same. Even today's simple gaming cards have multiple GPUs. Your response makes no sense.


I don't know what all your needs are but if the reason you're saying you can't swap out cards is because you think there aren't other reasonable options that support 4 monitors, well, http://www.newegg.com/Pr...spx?Item=N82E16814150719

That card will handle vMix and your monitor setup without blinking.

And if you think you need a workstation-class Nvidia card, here is your fairly reasonable option on that front, which again will work just fine with vMix.
http://www.newegg.com/Pr...%20600050357%20600044686

As has been stated already, but I'll chime in as well: you are in such a severe minority (probably the only person trying to make that setup work) that there is just no point in trying to develop that. Why? A simple $50 graphics card will handle 90% of the usage scenarios vMix will see, and the other 10% can be easily met by graphics cards under $300. So what is the point of putting all the effort into making an old NVS420 relevant to vMix users? There are plenty of other things far more pressing than supporting that usage scenario, because it's one that is so easily dealt with.
Integrion  
#12 Posted : Thursday, January 22, 2015 6:38:06 PM(UTC)
Integrion

Rank: Newbie

Groups: Registered
Joined: 1/21/2015(UTC)
Posts: 7
Location: New York

Thanks for all the wonderful feedback, guys. I have no intent of engaging in a subjective pissing contest with anyone here. Granted, Quadros aren't for everyone. But for those of us who use them, they're indispensable for what we do. If you're a gamer and a gaming card works well for you in your VMIX application, then great for you. Unfortunately, they don't work well for us in our use case. The fact that our test case involves an NVS 420 setup isn't what's relevant here. What matters is that our NVS 420 setup exposes a graphics subsystem issue with VMIX that gaming cards simply mask, and that if this issue were properly addressed under the hood as suggested, VMIX would perform even better on all types of systems. Just saying, guys.
fordry  
#13 Posted : Thursday, January 22, 2015 6:43:16 PM(UTC)
fordry

Rank: Advanced Member

Groups: Registered
Joined: 1/25/2012(UTC)
Posts: 78

Was thanked: 12 time(s) in 12 post(s)
Integrion wrote:
Thanks for all the wonderful feedback, guys. I have no intent of engaging in a subjective pissing contest with anyone here. Granted, Quadros aren't for everyone. But for those of us who use them, they're indispensable for what we do. If you're a gamer and a gaming card works well for you in your VMIX application, then great for you. Unfortunately, they don't work well for us in our use case. Martin is a smart guy. I'm sure he doesn't need you guys to tell him what to do.


Well I just linked you a Quadro card...
thanks 1 user thanked fordry for this useful post.
Ittaidv on 1/22/2015(UTC)
admin  
#14 Posted : Thursday, January 22, 2015 11:24:49 PM(UTC)
admin

Rank: Administration

Groups: Administrators
Joined: 1/13/2010(UTC)
Posts: 5,208
Location: Gold Coast, Australia

Was thanked: 4288 time(s) in 1520 post(s)
Hi Jean-Claude,

NVAPI is a configuration API and does not provide any rendering capabilities as far as I am aware.
Perhaps you are thinking of something else?

Maybe the CUDA API? That is a compute API, not a render API, and a render API is what vMix needs to process
and composite multiple video feeds in real time.

As I said, memory bandwidth is the key here and multi GPU workloads will actually run slower because of the amount of data
vMix needs to process.

In typical workstation applications, only one or two video feeds are processed by the GPU at a time (typically by effects and
editing applications), or only a small amount of data in the case of CAD.
Thus it is very easy to copy this across all GPUs and submit code for processing.

vMix is only capable of doing what it does by taking advantage of every last bit of power available on modern PCs.
I have researched in depth the various APIs and methods available over the years, and the way vMix currently works is the fastest available, Quadro or not.

Regards,

Martin
vMix
thanks 2 users thanked admin for this useful post.
Ittaidv on 1/23/2015(UTC), Barney Box Lane on 11/12/2019(UTC)
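
The bandwidth argument in the reply above can be illustrated with a toy cost model. This is only a sketch under stated assumptions and does not describe how vMix is implemented. It assumes every GPU needs its own copy of the input frames, that uploads share one bus and run one after another, and that rendering divides perfectly across GPUs. The function names and all numbers are hypothetical.

```python
def frame_time_ms(frame_bytes, bus_bytes_per_s, render_ms, gpus=1):
    """Toy per-frame cost when every GPU needs its own copy of the input.

    Assumptions (illustrative only): uploads are serialized on a shared bus,
    and render work splits evenly across GPUs with no other overhead.
    """
    upload_ms = gpus * frame_bytes / bus_bytes_per_s * 1000.0
    return upload_ms + render_ms / gpus


def two_gpus_slower(frame_bytes, bus_bytes_per_s, render_ms):
    """True if, in this model, two GPUs take longer per frame than one."""
    one = frame_time_ms(frame_bytes, bus_bytes_per_s, render_ms, gpus=1)
    two = frame_time_ms(frame_bytes, bus_bytes_per_s, render_ms, gpus=2)
    return two > one


# Hypothetical inputs: 1 MB of frame data per tick over a 1 GB/s bus,
# which costs 1 ms of upload time per GPU.
print(two_gpus_slower(1_000_000, 1e9, render_ms=1.0))  # upload cost dominates
print(two_gpus_slower(1_000_000, 1e9, render_ms=4.0))  # render cost dominates
```

In this model, a second GPU only helps when the render time it saves exceeds the extra upload time it adds. The reply's claim is that live video mixing sits on the upload-dominated side, while the workstation workloads it mentions move comparatively little data.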