Live Production Software Forums


BloodyIron  
#1 Posted : Friday, January 20, 2017 5:10:39 PM(UTC)
BloodyIron

Rank: Advanced Member

Groups: Registered
Joined: 9/25/2013(UTC)
Posts: 48
Location: Canada

Thanks: 8 times
Was thanked: 3 time(s) in 3 post(s)
Hi Folks,

We're in the process of designing some absurdly large recording workflows, and we're torn as to whether GTX 10 series GPUs are better for vMix than say Quadro M4000 cards.

We're considering on the GTX front, 1060/1070/1080 models, and the Quadro M4000 8GB models.

We are not yet certain if we want to go x264 or NVENC, as I am getting inconsistent information about how NVENC session limitations may impact us. And I also am unsure how we can offload the rendering of x264 at scale.

How much scale? Well, we're trying to design a system that could ingest 8-16x HD-SDI inputs, 1920x1080@60FPS, and have all of them not drop frames. Oh, and then do like 3 of these boxes maybe, and pipe them into each other for some external display flows.

Of course we're going to be throwing super beefy CPUs at them too, considering E5-2637 v4's (2x per system).

So, I'm torn as to what kind of direction we should be heading for GPU offloading.

Also, the resulting broadcasts would go to Twitch.

Can we get some advice here please? NVIDIA's published specs don't give us enough to make an informed decision.
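
To put the scale in the question above into numbers, here is a rough sizing sketch. It only estimates the uncompressed video payload for 8 and 16 inputs at 1920x1080@60FPS. The 2 bytes per pixel figure is an assumption (8-bit 4:2:2 capture frames), and `raw_rate_gbit` is a made-up helper name, not anything from vMix.

```python
# Rough sizing sketch; the pixel format is an assumption for illustration.
WIDTH, HEIGHT, FPS = 1920, 1080, 60
BYTES_PER_PIXEL = 2  # assumes 8-bit 4:2:2 frames (e.g. UYVY)


def raw_rate_gbit(inputs: int) -> float:
    """Uncompressed video payload in Gbit/s for `inputs` simultaneous feeds."""
    bytes_per_second = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS * inputs
    return bytes_per_second * 8 / 1e9


for n in (8, 16):
    print(f"{n} inputs: {raw_rate_gbit(n):.1f} Gbit/s uncompressed")
```

Under that assumption each input is roughly 2 Gbit/s before any encoding, which is what the capture cards, PCIe lanes and memory have to move regardless of whether x264 or NVENC does the compression.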
PFBM  
#2 Posted : Friday, January 20, 2017 8:33:11 PM(UTC)
PFBM

Rank: Advanced Member

Groups: Registered
Joined: 3/30/2011(UTC)
Posts: 308
Man
Location: Portugal

Thanks: 347 times
Was thanked: 35 time(s) in 30 post(s)
https://www.quora.com/Wh...-the-Nvidia-Quadro-M4000

1080

and get an HP Z620!

Cheers,

PFBM
BloodyIron  
#3 Posted : Friday, January 20, 2017 8:48:31 PM(UTC)
1. That discussion doesn't even talk about vMix, and I care about vMix performance.
2. I was also asking about 1060/1070 models, not just 1080.
3. The Z620 doesn't even come close to the scale of system we're working on. Sorry, not going to work here. ;)
4. Raw/synthetic benchmarks have no applicability with my _very specific_ performance inquiries about things like NVENC/DX/x264 encoding...

Sorry mate, while I know you're trying to help, this isn't what I'm looking for :)


PFBM wrote:
https://www.quora.com/Which-is-a-better-GPU-the-Nvidia-GTX-1080-or-the-Nvidia-Quadro-M4000

1080

and get an HP Z620!

Cheers,

PFBM

PFBM  
#4 Posted : Saturday, January 21, 2017 6:53:22 AM(UTC)
1: vMix is based on a game engine

but:

2: http://www8.hp.com/us/en/workstations/z840.html if you want E5-2637 v4 Xeons (or the Z640)

Get a workstation PC, not a customized one, if you have the money...
The Z620 with two 10-core Xeons can be very cost effective (but it's just an example).
vMix prefers more cores over higher clock speed for managing the non-GPU parts...
FFmpeg and NDI, for example.

regards,

PFBM

BloodyIron  
#5 Posted : Saturday, January 21, 2017 12:40:48 PM(UTC)
Do you know how x264 offloading is handled with vMix?

Also, I'm a sys admin, I design, integrate and support systems. We're designing beefier than that ;)




PFBM  
#6 Posted : Saturday, January 21, 2017 9:09:17 PM(UTC)
Fourth Generation (Pascal GP10x)

Fourth generation NVENC implements HEVC Main10 10-bit hardware encoding. It also doubles the encoding performance of 4K H.264 & HEVC when compared to previous generation NVENC. It supports HEVC 8K, 4:4:4 chroma subsampling, lossless encoding and sample adaptive offset (SAO). There is no B-Frame support for HEVC encoding and maximum CU size is limited to 32x32.

I use NVENC all the time. Yes, it works. It cuts CPU usage quite a bit. vMix manages machine resources quite well.

Cheers,

PFBM

BloodyIron  
#7 Posted : Saturday, January 21, 2017 11:11:26 PM(UTC)
What's the max NVENC sessions for these then? :
-Quadro M4000 8GB
-GTX 1060
-GTX 1070
-GTX 1080

Also, the reason I am looking into x264 offloading stuff is in the circumstance where not enough NVENC sessions are available.




PFBM  
#8 Posted : Sunday, January 22, 2017 9:30:40 AM(UTC)
NVENC works on the CUDA engine.
The more cores you have... the more power you have :)

And... yes, I like customizing vMix machines more :)

CUDA cores:
NVIDIA TITAN X: 3584
GEFORCE GTX 1080: 2560
QUADRO M4000: 1664

http://www.geforce.com/hardware/desktop-gpus


You can also balance the resources with good, lightweight codecs:
the NewTek SpeedHQ 4:2:2 codec, or the VC-3 (DNxHD) codec.


Cheers,

PFBM
DWAM  
#9 Posted : Sunday, January 22, 2017 11:32:47 AM(UTC)
DWAM

Rank: Advanced Member

Groups: Registered
Joined: 3/20/2014(UTC)
Posts: 2,721
Man
France
Location: Bordeaux, France

Thanks: 243 times
Was thanked: 794 time(s) in 589 post(s)
3. NVENC LICENSING POLICY

There is no change in licensing policy in the current SDK in comparison to the earlier SDK (Video Codec SDK 6.0). The licensing policy is explained as follows:
The underlying software puts a limit of “two” concurrent encoding sessions on the combined number of encoding sessions executed on all non-qualified cards present on the system.

For example, on a system with one Quadro K4000 card and three GeForce cards, the application can run N simultaneous encode sessions on Quadro K4000 card (where N is defined by the encoder/memory/hardware limitations) and two sessions on all the three GeForce cards combined. Thus the limit on the number of simultaneous encode sessions for such a system is N + 2.
For the purposes of this discussion, non-qualified hardware is defined as any GeForce GPUs or low-end Quadro GPUs
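
The N + 2 rule quoted above can be sketched as a small calculation. This is only an illustration of the policy as worded here: `max_encode_sessions` is a hypothetical helper, and the per-card value of 6 for the Quadro K4000 is a placeholder for N, not a figure from NVIDIA.

```python
def max_encode_sessions(cards, qualified_limits, unqualified_cap=2):
    """Estimate the system-wide concurrent NVENC session limit.

    cards: list of card names installed in the system.
    qualified_limits: dict mapping each *qualified* card name to its own
        session limit N (placeholder values, set by encoder/memory/hardware).
    unqualified_cap: shared cap for all non-qualified cards combined.
    """
    # Each qualified card contributes its own limit.
    qualified_total = sum(
        qualified_limits[name] for name in cards if name in qualified_limits
    )
    # All non-qualified cards together share one cap, however many there are.
    has_unqualified = any(name not in qualified_limits for name in cards)
    return qualified_total + (unqualified_cap if has_unqualified else 0)


# The example from the policy text: one Quadro K4000 plus three GeForce cards.
system = ["Quadro K4000", "GeForce", "GeForce", "GeForce"]
limits = {"Quadro K4000": 6}  # 6 is a placeholder for N
print(max_encode_sessions(system, limits))
```

The point of the sketch is that adding more GeForce cards does not raise the total, because the cap of two is shared across all of them.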

Reference link:
https://developer.nvidia...m/nvidia-video-codec-sdk

Apparently there are no such restrictions on Intel QuickSync, but I have never personally tried to encode with QuickSync, so...

Guillaume


Mathijs  
#10 Posted : Sunday, January 22, 2017 3:39:22 PM(UTC)
Mathijs

Rank: Advanced Member

Groups: Registered
Joined: 5/24/2015(UTC)
Posts: 370
Location: Netherlands

Thanks: 16 times
Was thanked: 81 time(s) in 72 post(s)
Quote:
What's the max NVENC sessions for these then? :
-Quadro M4000 8GB
-GTX 1060
-GTX 1070
-GTX 1080


For the GTX cards it is 2.

For the professional cards see this:

UserPostedImage

So you need M or P series.
BloodyIron  
#11 Posted : Sunday, January 22, 2017 3:54:10 PM(UTC)
Oh man, some solid info here folks! Thanks for all this! :D

Looks like the M4000 is probably where I should head :3

In regards to the SDK, does it actually need to be installed to use NVENC, or is that baked into vMix? I can't recall this very moment.
Mathijs  
#12 Posted : Sunday, January 22, 2017 3:57:07 PM(UTC)
SDK does not need to be installed.
BloodyIron  
#13 Posted : Sunday, January 22, 2017 3:58:29 PM(UTC)
Also, those graphs, when it says @30, does that mean 30FPS? Can I expect about half if we encode at 60FPS? Our target is 1080p60.
Mathijs  
#14 Posted : Sunday, January 22, 2017 4:01:11 PM(UTC)
Yep, I think you can calculate like that.
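
As a quick estimate, the halving rule discussed above looks like this. It is only a rule of thumb that assumes encoder throughput scales inversely with frame rate, which nobody in this thread has measured. `sessions_at_fps` is a hypothetical helper, and the value 10 is a made-up example, not a number read from the chart.

```python
def sessions_at_fps(sessions_at_30fps: int, target_fps: int) -> int:
    """Scale a session count quoted at 30 FPS to another frame rate.

    Assumes the encoder handles a fixed number of frames per second in
    total, so doubling the frame rate halves the number of streams.
    """
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return (sessions_at_30fps * 30) // target_fps


# Hypothetical card rated for 10 concurrent 1080p streams at 30 FPS.
print(sessions_at_fps(10, 60))
```

Rounding down is deliberate: a fractional session is not usable, so the estimate stays on the conservative side.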
BloodyIron  
#15 Posted : Sunday, January 22, 2017 4:03:31 PM(UTC)
Also, if I add another, say, M4000, making 2 of them in the system, does that logically "double" the number of concurrent NVENC streams possible? At that point I'm throwing hardware at a problem and it mostly scales linearly? Or is it something else? (Do I need to even "SLI" them, or is it all over PCIe bus?)
Mathijs  
#16 Posted : Sunday, January 22, 2017 4:11:39 PM(UTC)
That is a good question. I never had the chance to test it, so I'm not going to say yes, but I would think it works like that. I do not think you need SLI for it when using FFmpeg, as that is an application in itself and not bound to one card in particular. Martin can answer that better, though, because he knows best how it works.
I also don't know how the rest of the card behaves when you use all the NVENC resources. It is not limited by drivers, so why does it perform the same on the M4000, M5000 and M6000? The higher models have more CUDA cores, and the M6000 even has more memory than the other two.
If you are going to put money into it, please give us a detailed report of your real-life results, as the numbers in this thread are purely hypothetical.
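
If the linear scaling guessed at above does hold (untested, as noted), the card count for a given number of streams is a simple ceiling division. `cards_needed` is a hypothetical helper and the per-card figure of 5 sessions at 1080p60 is a placeholder, not a measured or published value.

```python
import math


def cards_needed(streams: int, sessions_per_card: int) -> int:
    """Cards required if concurrent sessions scale linearly with card count."""
    if sessions_per_card <= 0:
        raise ValueError("sessions_per_card must be positive")
    return math.ceil(streams / sessions_per_card)


# Placeholder: assume each card sustains 5 concurrent 1080p60 sessions.
for streams in (8, 16):
    print(f"{streams} streams -> {cards_needed(streams, 5)} card(s)")
```

Real results could differ if something other than NVENC becomes the bottleneck first, so this is a starting point for a test plan and not a purchasing guide.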