Showing posts with label multithreading. Show all posts

Sunday, February 21, 2010

fsarchiver, good backup for ext4 partitions

Since I had lost my root partition the other day (!), I needed a decent method to back up my new ext4 partitions. partimage does not currently support ext4, so I went looking and found fsarchiver:
http://www.fsarchiver.org/

Overview
I've partitioned my new 4.5TB drives this way:
/boot
/root 3.7TB
/backup 800GB


I formatted the backup filesystem as ext3. This way, I can simply boot with a Fedora Live CD, mount the backup partition and back up my boot and root partitions to it. Of course, I'll still need to copy that backup off to another storage medium. But this strategy helps when I make major updates to my system, because I can easily roll back to an earlier version that is stored locally.

Most importantly, restore works!

Detail
Here's what I did the other day to get 'er going.

First, I booted to my Fedora Live CD. fsarchiver isn't installed on it by default, so I installed it. You need to become superuser to do this:
[liveuser@localhost ~]$ su
[root@localhost liveuser]# yum install fsarchiver
Loaded plugins: presto, refresh-packagekit
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package fsarchiver.i686 0:0.6.7-1.fc12 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================
Package Arch Version Repository Size
=============================================================================================================================================================
Installing:
fsarchiver i686 0.6.7-1.fc12 updates 93 k

Transaction Summary
=============================================================================================================================================================
Install 1 Package(s)
Upgrade 0 Package(s)

Total download size: 93 k
Is this ok [y/N]: y
Downloading Packages:
Transaction Test Succeeded
Installing : fsarchiver-0.6.7-1.fc12.i686 1/1
Installed:
fsarchiver.i686 0:0.6.7-1.fc12

Complete!


I created my backup directory and mounted it:
[root@localhost liveuser]# mkdir /mnt/backup
[root@localhost liveuser]# mount -t ext3 /dev/mapper/vg_ogre-lv_backup /mnt/backup
[root@localhost liveuser]# ls /mnt/backup
lost+found test.txt


Finally, I ran fsarchiver to do the backup and took advantage of its multithreaded capability:
[root@localhost liveuser]# fsarchiver -j7 -o savefs /mnt/backup/lv_root_backup.fsa /dev/mapper/vg_ogre-lv_root
Statistics for filesystem 0
* files successfully processed:....regfiles=306990, directories=31024, symlinks=16561, hardlinks=4157, specials=28
* files with errors:...............regfiles=0, directories=0, symlinks=0, hardlinks=0, specials=0


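You could combine these steps into a script and run it as root (putting "su -" inside the script would only open an interactive shell and hold up the remaining commands):
[mule@ogre ~]$ cat systemBackup.sh
#!/bin/sh
# Run as root, e.g. from a shell opened with "su -".
yum install fsarchiver
mkdir /mnt/backup
mount -t ext3 /dev/mapper/vg_ogre-lv_backup /mnt/backup
fsarchiver -j7 -o savefs /mnt/backup/lv_root_backup.fsa /dev/mapper/vg_ogre-lv_root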


Voila! In seven hours, I backed up approximately 450GB of data:
[root@localhost liveuser]$ ll /mnt/backup
total 459289180
drwx------. 2 root root 16384 2010-02-10 20:09 lost+found
-rw-r--r--. 1 root root 480252970629 2010-07-21 07:34 lv_root_backup.fsa


Update:
When my RAID set was not being checked, an fsarchiver backup of about 760GB took 3.5 hours. Not bad!
***end update***

Restore
Restore works in a similar way. Since fsarchiver allows you to back up multiple filesystems within one archive, you need to specify which filesystem is getting restored.

In the example below, "id=0" specifies the index (starting at 0) of the filesystem within the archive. The destination filesystem must not be mounted while it is being restored:
fsarchiver restfs /mnt/backup/lv_root_backup.fsa id=0,dest=/dev/mapper/vg_ogre-lv_root
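
Since the destination must not be mounted, a small guard in front of the restore can catch the mistake early. What follows is a minimal sketch, not part of fsarchiver: is_mounted and safe_restore are hypothetical helper names, the check assumes a Linux-style mounts table (/proc/mounts by default), and the device name has to be spelled exactly as it appears in that table. The restore command is only printed, not run.

```shell
#!/bin/sh
# is_mounted DEVICE [MOUNTS_FILE]
# Hypothetical helper: succeeds if DEVICE is the first field of any line
# in the mounts table (default: /proc/mounts).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit found ? 0 : 1 }' "${2:-/proc/mounts}"
}

# safe_restore ARCHIVE DEST [MOUNTS_FILE]
# Prints the restfs command only when DEST is not mounted.
# It echoes instead of executing, so nothing gets overwritten by accident.
safe_restore() {
    if is_mounted "$2" "${3:-/proc/mounts}"; then
        echo "refusing: $2 is mounted, unmount it first" >&2
        return 1
    fi
    echo "fsarchiver restfs $1 id=0,dest=$2"
}
```

Calling safe_restore /mnt/backup/lv_root_backup.fsa /dev/mapper/vg_ogre-lv_root from the Live CD would print the command shown above, provided the root volume is not mounted.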


If you had multiple filesystems stored in the archive "lv_root_backup.fsa", then the id number would increment; e.g., "id=1" for the second filesystem stored in the archive.

"dest" is the destination filesystem which is getting restored, in this case "/dev/mapper/vg_ogre-lv_root"


Password Protection for Archives
You can also protect your archive with a password. The password switch, -c, looks the same on backup and restore:
sudo fsarchiver restfs backup_vg_ogre-lv_root.fsa id=0,dest=/dev/sdb1 -c [password]

Worked for me on many occasions.
the mule

References
http://www.fsarchiver.org/QuickStart

Sunday, January 24, 2010

compile time performance improved!

I was compiling Cinelerra today and noticed that CPU usage was very low during the compile, around 10-15% utilized. I have a dual CPU, quad core box, which makes for a total of eight processors. So with all those CPUs, I figured there must be a way to make compiling faster.

Actually, this low CPU use during compiles was something I had noticed the first time I installed Cinelerra. I'm ashamed to say I never got around to investigating it in the two years that I've had the box. So today I googled for "make compiler see multiple CPUs" and found the -j switch to make, the program that drives the compile:
http://blogs.koolwal.net/2009/04/20/tip-compile-your-programs-fasters-with-multiple-processor-machines/

This article also mentions the CONCURRENCY_LEVEL environment variable, but that variable did not work for my box, a Dell SC1430. So I used the -j switch to make instead:
[mule@ogre my_cinelerra]$ make -j7
(CDPATH="${ZSH_VERSION+.}:" && cd . && /bin/sh /home/sfrase/my_cinelerra/missing --run autoheader)
rm -f stamp-h1
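
Rather than hard-coding the job count, it can be derived from the machine. This is a minimal sketch under a few assumptions: cpu_count and make_jobs are hypothetical helper names, getconf _NPROCESSORS_ONLN is available (it is on typical Linux systems), and leaving one processor free is just a convention that happens to match the -j7 used on this eight-processor box.

```shell
#!/bin/sh
# cpu_count: number of online processors, falling back to 1 if unknown.
cpu_count() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case "$n" in
        ''|*[!0-9]*) n=1 ;;   # empty or non-numeric answer
    esac
    echo "$n"
}

# make_jobs [RESERVED] [TOTAL]
# Leave RESERVED processors free (default 1) out of TOTAL
# (default: detected count), but never return fewer than 1 job.
make_jobs() {
    reserved=${1:-1}
    total=${2:-$(cpu_count)}
    jobs=$((total - reserved))
    [ "$jobs" -lt 1 ] && jobs=1
    echo "$jobs"
}

# Show the command that would be run on this machine.
echo "make -j$(make_jobs)"
```

On an eight-processor machine this prints make -j7; make_jobs 0 would use all eight.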


Here are my results.

the time it took to compile Cinelerra normally, without -j:
3min 45s
the time it took to compile Cinelerra with the -j8 (for eight cores):
1min 16s

Holy crap! That's about three times faster!
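
A quick check of the arithmetic, using only the two timings above. Shell arithmetic is integer-only, so the ratio is scaled by 100 to keep two decimals.

```shell
#!/bin/sh
# Times from the post, converted to seconds.
before=$((3 * 60 + 45))   # 225 s without -j
after=$((1 * 60 + 16))    # 76 s with -j8

# Scale by 100 because $(( )) has no fractions.
speedup_x100=$((before * 100 / after))
saved_pct=$(((before - after) * 100 / before))

printf 'speedup: %d.%02dx\n' $((speedup_x100 / 100)) $((speedup_x100 % 100))
printf 'build time saved: %d%%\n' "$saved_pct"
```

This prints a speedup of 2.96x and a build time saving of 66%: the compile runs roughly three times as fast, which is the same as cutting the wait by about two thirds.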

Glad I finally researched this.
scott

ps - one other note: 7z is a multithreaded file archiver, handy where you might otherwise reach for tar and gzip. On Fedora, use 7za
Installing : p7zip-4.65-2.fc12.x86_64
7za - A file archiver with highest compression ratio

SYNOPSIS
7za [adeltux] [-] [SWITCH]


Related Posts
http://crazedmuleproductions.blogspot.com/2007/10/multithreading-in-ffmpeg-and-mpstat.html

Monday, May 18, 2009

ffmpeg pipe to mpeg2enc

Occasionally, I'll need to send a video stream into mpeg2enc. Mpeg2enc doesn't take an input file; it only accepts a yuv4mpeg stream. To produce that stream, I use ffmpeg with the -f yuv4mpegpipe command line switch. Also, for best quality, I will send the stream using the FFMPEG variant of the Huffyuv lossless compression algorithm. ffvhuff is an enhanced version of Huffyuv that compresses better than the original.

Update 2009/05/19: As per Dan Dennedy's comment below, ffmpeg's yuv4mpegpipe format will ignore the -vcodec option and pipe the video stream to mpeg2enc as a C420jpeg stream, which is an uncompressed YUV format. Certainly good enough for the likes of me!
*** end update ***

Here is a sample command to reencode a 720P video stream as a yuv4mpeg pipe to mpeg2enc:
ffmpeg -threads 4 -i INPUT.M2V -f yuv4mpegpipe - | mpeg2enc --verbose 0 --multi-thread 4 --aspect 3 --format 3 --frame-rate 4 --video-bitrate 18300 --nonvideo-bitrate 384 --interlace-mode 0 --force-b-b-p --video-buffer 448 --video-norm n --keep-hf --no-constraints --sequence-header-every-gop --min-gop-size 6 --max-gop-size 6 -o OUTPUT.M2V

Note that I am taking advantage of the eight processors in my dual quad-core box by using the multithread switches of both ffmpeg and mpeg2enc. The eight threads are split evenly, four to each program, to avoid CPU context switching. (Thanks again, Dan!)
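
The even split can be worked out for any processor count. This is only a sketch: split_threads is a hypothetical helper, and giving the encoder the extra thread on an odd count is an arbitrary choice, not something the tools require. The elided options in the printed command stand for the full option lists used above.

```shell
#!/bin/sh
# split_threads TOTAL: print "DECODER ENCODER" thread counts.
# The two add up to TOTAL when TOTAL is 2 or more; the encoder gets
# the extra thread when TOTAL is odd, and each side gets at least 1.
split_threads() {
    total=$1
    dec=$((total / 2))
    [ "$dec" -lt 1 ] && dec=1
    enc=$((total - dec))
    [ "$enc" -lt 1 ] && enc=1
    echo "$dec $enc"
}

# Load the two counts into $1 and $2, then show where they would go.
set -- $(split_threads 8)
echo "ffmpeg -threads $1 ... | mpeg2enc --multi-thread $2 ..."
```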

Here's another trick: to see the header information of a YUV4MPEG stream, pipe the FFmpeg conversion stream to head -1 like so:
ffmpeg -i intermediate.mov -vcodec mpeg2video -f yuv4mpegpipe - | head -1
ffmpeg -i intermediate.mov -pix_fmt yuv420p -f yuv4mpegpipe - | head -1

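The FFmpeg output should show you some very important information on its last line:
the output format: YUV4MPEG2 stream
width and height: 1280x720
framerate: 30000:1001 (or 29.97fps)
colorspace: C420jpeg
Ip marks the frames as progressive (not interlaced) and A1:1 is the pixel aspect ratio; XYSCSS=420JPEG looks like an extension tag that repeats the chroma subsampling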

Duration: 01:19:46.74, start: 0.000000, bitrate: 110301 kb/s
Stream #0.0(eng): Video: mjpeg, yuvj420p, 1280x720 [PAR 1:1 DAR 16:9], 108762 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 30k tbc
Stream #0.1(eng): Audio: pcm_s16be, 48000 Hz, 2 channels, s16, 1536 kb/s
Output #0, yuv4mpegpipe, to 'pipe:':
Metadata:
encoder : Lavf52.64.2
Stream #0.0(eng): Video: mpeg2video, yuv420p, 1280x720 [PAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 90k tbn, 29.97 tbc
Stream mapping:
Stream #0.0 -> #0.0
Press [q] to stop encoding
YUV4MPEG2 W1280 H720 F30000:1001 Ip A1:1 C420jpeg XYSCSS=420JPEG

Sweet, eh?
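
The header line can also be split into labelled fields. This is a minimal sketch, assuming the usual yuv4mpeg layout of a magic word followed by space-separated fields that each start with a one-letter tag; describe_y4m is a hypothetical helper and the labels it prints are my own names for the tags.

```shell
#!/bin/sh
# describe_y4m: read one YUV4MPEG2 header line on stdin and label each field.
describe_y4m() {
    read -r magic fields
    [ "$magic" = "YUV4MPEG2" ] || { echo "not a YUV4MPEG2 header" >&2; return 1; }
    for f in $fields; do
        value=${f#?}            # everything after the one-letter tag
        case "$f" in
            W*) echo "width=$value" ;;
            H*) echo "height=$value" ;;
            F*) echo "framerate=$value" ;;
            I*) echo "interlacing=$value" ;;
            A*) echo "pixel_aspect=$value" ;;
            C*) echo "colorspace=$value" ;;
            X*) echo "extension=$value" ;;
        esac
    done
}

# The header line from the output above.
echo 'YUV4MPEG2 W1280 H720 F30000:1001 Ip A1:1 C420jpeg XYSCSS=420JPEG' | describe_y4m
```

In practice the input would come from the pipe itself, with "| describe_y4m" appended after "head -1" in either of the two commands above.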

As a final note, I am a bit confused about the differences between two FFMPEG compression algorithms: ffvhuff and ffv1. If someone has pointers to the documentation on these, I'd be interested in finding out more. A google search just added to my confusion.

the mule

References
mpeg2enc man page
mpeg2enc manual
ffmpeg vs mpeg2enc
Huffyuv
FFV1
FFMPEG How To

related posts
http://crazedmuleproductions.blogspot.com/2010/01/batch-render-redux.html
/2010/01/compile-times-performance-improved.html
FFMPEG HowTo

Monday, October 01, 2007

multithreading in ffmpeg and the mpstat program

My new server, the Dell SC1430, has dual quad-core Xeon processors. Therefore, I have a full eight cores available for processing tasks. As I have recently completed a new install of FC6, 64-bit on this system, I've been focused on Cinelerra performance optimization. As an adjunct, I happened to notice that when I ran command line ffmpeg, only one of my processors was being used. I had thought that FFMPEG was multithreaded by default, so I was perplexed.

Chasing My Tail
Thinking it was a compile option that needed to be specified, I bounced a few ideas off my friend Graham. I googled FFMPEG_CFLAGS, ffmpeg smp and a host of other searches while sniffing down what was to be the wrong track. Taking a step back, I figured I'd try to find information from the source, rather than looking for just a command line option solution. I found from the FFMPEG site (http://ffmpeg.mplayerhq.hu/changelog.html) that as of version 0.4.9-pre1, FFMPEG supports multithreading/SMP for the following:
- multithreaded/SMP motion estimation
- multithreaded/SMP encoding for MPEG-1/MPEG-2/MPEG-4/H.263
- multithreaded/SMP decoding for MPEG-2

OK. So it supports multithreading, but not for all codecs. The absence of multithreading of jpeg/mjpeg was a bummer. And when I ran the following conversion script to convert a DVD to a smaller format MPEG:
ffmpeg -i testdvd.mpg -target svcd output.mpg

I saw that only one of my processors was being utilized. Let's investigate this further.

mpstat to the rescue!
A new find for me is mpstat. mpstat is a program available in RedHat/Fedora that allows you to view the CPU utilization of each processor in your system. Nice! From its output, I saw that only one processor out of eight was being utilized:
[root@localhost ~]# mpstat -P 0 -P 1 -P 2 -P 3 -P 4 -P 5 -P 6 -P 7 4
09:51:21 AM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
09:51:23 AM 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 6.00
09:51:23 AM 1 89.50 0.00 1.50 1.00 0.00 0.00 0.00 7.50 17.00
09:51:23 AM 2 7.00 0.00 0.50 0.00 0.00 0.00 0.00 93.00 3.00
09:51:23 AM 3 8.00 0.00 0.00 0.00 0.00 0.00 0.00 92.00 0.00
09:51:23 AM 4 5.50 0.00 0.00 0.00 0.00 0.00 0.00 94.50 250.50
09:51:23 AM 5 3.00 0.00 0.00 0.00 0.00 0.00 0.00 97.00 0.00
09:51:23 AM 6 5.50 0.00 0.50 1.00 0.00 0.00 0.00 93.00 0.00
09:51:23 AM 7 9.00 0.00 0.00 0.00 0.00 0.00 0.00 91.00 0.00
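
Instead of eyeballing the %idle column, the busy processors can be counted. This is a minimal sketch: busy_cpus is a hypothetical helper, the 50% default threshold is arbitrary, and the column positions are looked up from the header line because mpstat's layout varies between versions and locales.

```shell
#!/bin/sh
# busy_cpus [IDLE_LIMIT]
# Read mpstat output on stdin and print how many per-CPU rows show
# %idle below IDLE_LIMIT (default 50).
busy_cpus() {
    awk -v limit="${1:-50}" '
        $1 == "Average:" { next }          # skip the summary block
        /%idle/ {                          # header row: locate the columns
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU") cpu = i
                if ($i == "%idle") idle = i
            }
            next
        }
        cpu && idle && $cpu ~ /^[0-9]+$/ && ($idle + 0) < (limit + 0) { n++ }
        END { print n + 0 }
    '
}
```

Fed the table above, it prints 1. On a live system the input could come from "mpstat -P ALL 4 1 | busy_cpus", where -P ALL is a shorter way to list every processor.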


So something is wrong. As I was out of ideas, I finally decided to ask the folks who should know: the ffmpeg-users mailing list:
http://lists.mplayerhq.hu/mailman/listinfo/ffmpeg-user

I soon received an answer from Lukas: the "-threads" option!

I tried the "-threads" parameter with various settings (1, 2 and 8 threads). As I have eight processors, the limit appeared to be eight threads. If I asked for more threads than that, I saw this error at the bottom of the FFMPEG output:
[mpeg2video @ 0x3bfd518850]too many threads

So I then ran a couple of interesting tests.

TEST 1
Convert QT mov file to MPEG2 DVD

Syntax:
ffmpeg -i test.mov -threads 8 -target dvd output.mpg

In this test, the source Quicktime file uses the MJPEG video compression scheme, which is not supported for multithreading in FFMPEG. The MPEG2 destination, however, is supported.

From the output of top, I did see that process utilization increased slightly each time I increased the number of threads:
1 thread: 12.5% cpu used
2 threads: 14.7% cpu used
8 threads: 16.3% cpu used
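
One way to read these figures is to convert them into fully busy cores. This rests on an assumption the post does not confirm: that top was reporting a share of the whole eight-processor machine rather than of a single core. cores_busy is a hypothetical helper name.

```shell
#!/bin/sh
# cores_busy PERCENT CPUS: convert a whole-machine CPU percentage
# into the equivalent number of fully busy cores.
cores_busy() {
    awk -v pct="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", pct * cpus / 100 }'
}

for pct in 12.5 14.7 16.3; do
    echo "$pct% of 8 CPUs = $(cores_busy "$pct" 8) cores"
done
```

Under that assumption, 12.5% is exactly one saturated core, and even the eight-thread run only reaches about 1.3 cores.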


However, when I looked at the output of mpstat, it showed the original behavior, whereby one processor was getting fed the entire task:
09:51:29 AM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
09:51:31 AM 0 7.00 0.00 0.00 0.00 0.00 0.00 0.00 93.00 6.00
09:51:31 AM 1 95.00 0.00 1.00 1.00 0.00 0.00 0.00 3.00 10.00
09:51:31 AM 2 5.00 0.00 0.00 0.00 0.00 0.00 0.00 95.00 3.00
09:51:31 AM 3 7.00 0.00 0.50 0.00 0.00 0.00 0.00 92.50 0.00
09:51:31 AM 4 5.00 0.00 0.00 0.00 0.00 0.00 0.00 95.00 0.00
09:51:31 AM 5 3.50 0.00 0.00 0.00 0.00 0.00 0.00 97.00 250.50
09:51:31 AM 6 5.00 0.00 0.00 0.00 0.00 0.00 0.00 95.00 0.00
09:51:31 AM 7 3.50 0.00 0.00 0.00 0.00 0.00 0.00 96.50 0.00


Hmmm. On to test two:

TEST 2
Convert a DVD of high quality to smaller resolution mpeg2video

Syntax:
ffmpeg -i testdvd.mpg -threads 8 -target svcd output.mpg

In this test, both the source and destination codecs are supported for multithreading in FFMPEG. Now this is where the testing got fun. The output from mpstat was somewhat different this time:
10:00:28 AM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
10:00:32 AM 0 22.50 0.00 0.50 0.00 0.00 0.00 0.00 77.00 5.00
10:00:32 AM 1 17.50 0.00 0.00 0.00 0.00 0.00 0.00 82.50 3.00
10:00:32 AM 2 23.00 0.00 1.50 0.00 0.00 0.00 0.00 75.75 0.00
10:00:32 AM 3 12.00 0.00 0.25 0.00 0.00 0.00 0.00 88.00 250.25
10:00:32 AM 4 29.00 0.00 1.25 0.00 0.00 0.00 0.00 70.25 0.00
10:00:32 AM 5 12.00 0.00 0.25 2.25 0.00 0.00 0.00 85.75 0.00
10:00:32 AM 6 71.25 0.00 3.25 4.00 0.00 0.00 0.00 22.00 17.00
10:00:32 AM 7 18.75 0.00 0.25 0.00 0.00 0.00 0.00 81.00 0.00


Sweet! Notice that all my processors are being utilized. Best part of all, my resulting render fps went from 48fps to 150fps. Awesome!

The Key Thing to Remember
So the key is that multithreading using the "-threads" option in FFMPEG only works when BOTH the source and destination files are of the supported types:
- multithreaded/SMP motion estimation
- multithreaded/SMP encoding for MPEG-1/MPEG-2/MPEG-4/H.263
- multithreaded/SMP decoding for MPEG-2

Remember this, Grasshopper.

And I am so very happy that I don't have to recompile.

thanks to Graham Evans and the ffmpeg-users mail list!
The Mule

related posts
http://crazedmuleproductions.blogspot.com/2010/01/batch-render-redux.html
/2010/01/compile-times-performance-improved.html