I'm starting to get used to the new rig and what works and what doesn't. Here's a video describing how a render works on the new Dell SC1430, dual quad core xeon box running FC6, 64-bit Cinelerra. The render format (not shown in the video) is the following:
File format: Quicktime for Linux
Audio compression scheme: MPEG-4 audio, 128kbps bitrate, quantization quality 50%
Video compression scheme: H.264, 1000000 fixed bitrate (1Mbps)
I was listening to Aaron Newcomb's SourceShow podcast from LinuxWorld in Ohio while describing the render, so that's the voice you'll hear in the background.
PS - I rendered to Quicktime for Linux in this example, but you can use the
-threads [numberOfCpus]
command line switch to use more than one CPU.
enjoy,
the mule
Showing posts with label mpstat. Show all posts
Showing posts with label mpstat. Show all posts
Sunday, October 14, 2007
rendering on the dual quad core Dell SC1430
Labels:
64-bit,
dell sc1430,
mpstat,
performance,
quad core,
sourceshow
If this post was useful to you..consider buying me a beer via PayPal!
Even a $1 Draft will keep the Mule happily working..and help pay for equipment upgrades!
Monday, October 01, 2007
multithreading in ffmpeg and the mpstat program
My new server, the Dell SC1430, is dual Xeon processor, quad core. Therefore, I have a full eight cores available for processing tasks. As I have recently completed a new install of FC6, 64-bit on this system, I've been focused on Cinelerra performance optimization. As an adjunct, I happened to notice that when I ran command line ffmpeg, only one of my processors was being used. I had thought that FFMPEG was multithreaded by default, so I was perplexed.
Chasing My Tail
Thinking it was a compile option that needed to be specified, I bounced a few ideas off my friend Graham and at the time, we were thinking "compile option." I googled FFMPEG_CFLAGS, ffmpeg smp and a host of other searches while sniffing down what was to be the wrong track. Taking a step back, I figured I'd try to find information from the source, rather than looking for just a command line option solution. I found from the FFMPEG site (http://ffmpeg.mplayerhq.hu/changelog.html) that that as of version 0.4.9-pre1, FFMPEG supports multithreading/smp for the following codecs:
- multithreaded/SMP motion estimation
- multithreaded/SMP encoding for MPEG-1/MPEG-2/MPEG-4/H.263
- multithreaded/SMP decoding for MPEG-2
OK. So it supports multithreading, but not for all codecs. The absence of multithreading of jpeg/mjpeg was a bummer. And when I ran the following conversion script to convert a DVD to a smaller format MPEG:
ffmpeg -i testdvd.mpg -target svcd output.mpg
I saw that only one of my processors was being utilized. Let's investigate this further.
mpstat to the rescue!
A new find for me is mpstat. mpstat is a program available in RedHat/Fedora that allows you to view the CPU utilization of each processor in your system. Nice! From its output, I saw that only one processor out of eight was being utilized:
[root@localhost ~]# mpstat -P 0 -P 1 -P 2 -P 3 -P 4 -P 5 -P 6 -P 7 4
09:51:21 AM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
09:51:23 AM 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 6.00
09:51:23 AM 1 89.50 0.00 1.50 1.00 0.00 0.00 0.00 7.50 17.00
09:51:23 AM 2 7.00 0.00 0.50 0.00 0.00 0.00 0.00 93.00 3.00
09:51:23 AM 3 8.00 0.00 0.00 0.00 0.00 0.00 0.00 92.00 0.00
09:51:23 AM 4 5.50 0.00 0.00 0.00 0.00 0.00 0.00 94.50 250.50
09:51:23 AM 5 3.00 0.00 0.00 0.00 0.00 0.00 0.00 97.00 0.00
09:51:23 AM 6 5.50 0.00 0.50 1.00 0.00 0.00 0.00 93.00 0.00
09:51:23 AM 7 9.00 0.00 0.00 0.00 0.00 0.00 0.00 91.00 0.00
So something is wrong. As I was out of ideas, I finally decided to ask the folks who should know: the ffmpeg-users mailing list:
http://lists.mplayerhq.hu/mailman/listinfo/ffmpeg-user
I soon received an answer from Lukas: the "-threads" option!
I tried the "-threads" parameter with various settings (1,2,8 threads). As I have eight processors, the limit was eight threads. If I used more threads than available CPUs, I saw this error at the bottom of the FFMPEG output:
[mpeg2video @ 0x3bfd518850]too many threads
So I then ran a couple of interesting tests.
TEST 1
Convert QT mov file to MPEG2 DVD
Syntax:
ffmpeg -i test.mov -threads 8 -target dvd output.mpg
In this test, the Quicktime file used MJPEG video compression scheme and is not supported for multithreading in FFMPEG. However, MPEG2 is supported.
From the output of top, I did see that process utilization increased slightly each time I increased the number of threads:
1 thread: 12.5% cpu used
2 threads: 14.7% cpu used
8 threads: 16.3% cpu used
However, when I looked at the output of mpstat, it showed the original behavior, whereby one processor was getting fed the entire task:
09:51:29 AM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
09:51:31 AM 0 7.00 0.00 0.00 0.00 0.00 0.00 0.00 93.00 6.00
09:51:31 AM 1 95.00 0.00 1.00 1.00 0.00 0.00 0.00 3.00 10.00
09:51:31 AM 2 5.00 0.00 0.00 0.00 0.00 0.00 0.00 95.00 3.00
09:51:31 AM 3 7.00 0.00 0.50 0.00 0.00 0.00 0.00 92.50 0.00
09:51:31 AM 4 5.00 0.00 0.00 0.00 0.00 0.00 0.00 95.00 0.00
09:51:31 AM 5 3.50 0.00 0.00 0.00 0.00 0.00 0.00 97.00 250.50
09:51:31 AM 6 5.00 0.00 0.00 0.00 0.00 0.00 0.00 95.00 0.00
09:51:31 AM 7 3.50 0.00 0.00 0.00 0.00 0.00 0.00 96.50 0.00
Hmmm. On to test two:
TEST 2
Convert a DVD of high quality to smaller resolution mpeg2video
Syntax:
ffmpeg -i testdvd.mpg -threads 8 -target svcd output.mpg
In this test, both the source and destination codecs are supported for multithreading in FFMPEG. Now this is where the testing got fun. The output from mpstat was somewhat different this time:
10:00:28 AM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
10:00:32 AM 0 22.50 0.00 0.50 0.00 0.00 0.00 0.00 77.00 5.00
10:00:32 AM 1 17.50 0.00 0.00 0.00 0.00 0.00 0.00 82.50 3.00
10:00:32 AM 2 23.00 0.00 1.50 0.00 0.00 0.00 0.00 75.75 0.00
10:00:32 AM 3 12.00 0.00 0.25 0.00 0.00 0.00 0.00 88.00 250.25
10:00:32 AM 4 29.00 0.00 1.25 0.00 0.00 0.00 0.00 70.25 0.00
10:00:32 AM 5 12.00 0.00 0.25 2.25 0.00 0.00 0.00 85.75 0.00
10:00:32 AM 6 71.25 0.00 3.25 4.00 0.00 0.00 0.00 22.00 17.00
10:00:32 AM 7 18.75 0.00 0.25 0.00 0.00 0.00 0.00 81.00 0.00
Sweet! Notice that all my processors are being utilized. Best part of all, my resulting render fps went from 48fps to 150fps. Awesome!
The Key Thing to Remember
So the key is that multithreading using the "-threads" option in FFMPEG only works when BOTH the source and destination files are of the supported types:
- multithreaded/SMP motion estimation
- multithreaded/SMP encoding for MPEG-1/MPEG-2/MPEG-4/H.263
- multithreaded/SMP decoding for MPEG-2
Remember this, Grasshopper.
And I am so very happy that I don't have to recompile..
thanks to Graham Evans and the ffmpeg-users mail list!
The Mule
related posts
http://crazedmuleproductions.blogspot.com/2010/01/batch-render-redux.html
/2010/01/compile-times-performance-improved.html
Chasing My Tail
Thinking it was a compile option that needed to be specified, I bounced a few ideas off my friend Graham and at the time, we were thinking "compile option." I googled FFMPEG_CFLAGS, ffmpeg smp and a host of other searches while sniffing down what was to be the wrong track. Taking a step back, I figured I'd try to find information from the source, rather than looking for just a command line option solution. I found from the FFMPEG site (http://ffmpeg.mplayerhq.hu/changelog.html) that that as of version 0.4.9-pre1, FFMPEG supports multithreading/smp for the following codecs:
- multithreaded/SMP motion estimation
- multithreaded/SMP encoding for MPEG-1/MPEG-2/MPEG-4/H.263
- multithreaded/SMP decoding for MPEG-2
OK. So it supports multithreading, but not for all codecs. The absence of multithreading of jpeg/mjpeg was a bummer. And when I ran the following conversion script to convert a DVD to a smaller format MPEG:
ffmpeg -i testdvd.mpg -target svcd output.mpg
I saw that only one of my processors was being utilized. Let's investigate this further.
mpstat to the rescue!
A new find for me is mpstat. mpstat is a program available in RedHat/Fedora that allows you to view the CPU utilization of each processor in your system. Nice! From its output, I saw that only one processor out of eight was being utilized:
[root@localhost ~]# mpstat -P 0 -P 1 -P 2 -P 3 -P 4 -P 5 -P 6 -P 7 4
09:51:21 AM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
09:51:23 AM 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 6.00
09:51:23 AM 1 89.50 0.00 1.50 1.00 0.00 0.00 0.00 7.50 17.00
09:51:23 AM 2 7.00 0.00 0.50 0.00 0.00 0.00 0.00 93.00 3.00
09:51:23 AM 3 8.00 0.00 0.00 0.00 0.00 0.00 0.00 92.00 0.00
09:51:23 AM 4 5.50 0.00 0.00 0.00 0.00 0.00 0.00 94.50 250.50
09:51:23 AM 5 3.00 0.00 0.00 0.00 0.00 0.00 0.00 97.00 0.00
09:51:23 AM 6 5.50 0.00 0.50 1.00 0.00 0.00 0.00 93.00 0.00
09:51:23 AM 7 9.00 0.00 0.00 0.00 0.00 0.00 0.00 91.00 0.00
So something is wrong. As I was out of ideas, I finally decided to ask the folks who should know: the ffmpeg-users mailing list:
http://lists.mplayerhq.hu/mailman/listinfo/ffmpeg-user
I soon received an answer from Lukas: the "-threads" option!
I tried the "-threads" parameter with various settings (1,2,8 threads). As I have eight processors, the limit was eight threads. If I used more threads than available CPUs, I saw this error at the bottom of the FFMPEG output:
[mpeg2video @ 0x3bfd518850]too many threads
So I then ran a couple of interesting tests.
TEST 1
Convert QT mov file to MPEG2 DVD
Syntax:
ffmpeg -i test.mov -threads 8 -target dvd output.mpg
In this test, the Quicktime file used MJPEG video compression scheme and is not supported for multithreading in FFMPEG. However, MPEG2 is supported.
From the output of top, I did see that process utilization increased slightly each time I increased the number of threads:
1 thread: 12.5% cpu used
2 threads: 14.7% cpu used
8 threads: 16.3% cpu used
However, when I looked at the output of mpstat, it showed the original behavior, whereby one processor was getting fed the entire task:
09:51:29 AM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
09:51:31 AM 0 7.00 0.00 0.00 0.00 0.00 0.00 0.00 93.00 6.00
09:51:31 AM 1 95.00 0.00 1.00 1.00 0.00 0.00 0.00 3.00 10.00
09:51:31 AM 2 5.00 0.00 0.00 0.00 0.00 0.00 0.00 95.00 3.00
09:51:31 AM 3 7.00 0.00 0.50 0.00 0.00 0.00 0.00 92.50 0.00
09:51:31 AM 4 5.00 0.00 0.00 0.00 0.00 0.00 0.00 95.00 0.00
09:51:31 AM 5 3.50 0.00 0.00 0.00 0.00 0.00 0.00 97.00 250.50
09:51:31 AM 6 5.00 0.00 0.00 0.00 0.00 0.00 0.00 95.00 0.00
09:51:31 AM 7 3.50 0.00 0.00 0.00 0.00 0.00 0.00 96.50 0.00
Hmmm. On to test two:
TEST 2
Convert a DVD of high quality to smaller resolution mpeg2video
Syntax:
ffmpeg -i testdvd.mpg -threads 8 -target svcd output.mpg
In this test, both the source and destination codecs are supported for multithreading in FFMPEG. Now this is where the testing got fun. The output from mpstat was somewhat different this time:
10:00:28 AM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
10:00:32 AM 0 22.50 0.00 0.50 0.00 0.00 0.00 0.00 77.00 5.00
10:00:32 AM 1 17.50 0.00 0.00 0.00 0.00 0.00 0.00 82.50 3.00
10:00:32 AM 2 23.00 0.00 1.50 0.00 0.00 0.00 0.00 75.75 0.00
10:00:32 AM 3 12.00 0.00 0.25 0.00 0.00 0.00 0.00 88.00 250.25
10:00:32 AM 4 29.00 0.00 1.25 0.00 0.00 0.00 0.00 70.25 0.00
10:00:32 AM 5 12.00 0.00 0.25 2.25 0.00 0.00 0.00 85.75 0.00
10:00:32 AM 6 71.25 0.00 3.25 4.00 0.00 0.00 0.00 22.00 17.00
10:00:32 AM 7 18.75 0.00 0.25 0.00 0.00 0.00 0.00 81.00 0.00
Sweet! Notice that all my processors are being utilized. Best part of all, my resulting render fps went from 48fps to 150fps. Awesome!
The Key Thing to Remember
So the key is that multithreading using the "-threads" option in FFMPEG only works when BOTH the source and destination files are of the supported types:
- multithreaded/SMP motion estimation
- multithreaded/SMP encoding for MPEG-1/MPEG-2/MPEG-4/H.263
- multithreaded/SMP decoding for MPEG-2
Remember this, Grasshopper.
And I am so very happy that I don't have to recompile..
thanks to Graham Evans and the ffmpeg-users mail list!
The Mule
related posts
http://crazedmuleproductions.blogspot.com/2010/01/batch-render-redux.html
/2010/01/compile-times-performance-improved.html
Labels:
dell sc1430,
ffmpeg,
mpstat,
multicore,
multiple processors,
multithreading,
performance
If this post was useful to you..consider buying me a beer via PayPal!
Even a $1 Draft will keep the Mule happily working..and help pay for equipment upgrades!
Subscribe to:
Posts (Atom)