XFS and DPX files

Hello,

For many years and with great success, I have been capturing and editing high-bandwidth video on Linux systems with XFS filesystems exported via Samba. However, I am currently running into a problem, and I am wondering if somebody has some hints about how to solve it.

In the past, I have been working with video formats such as MXF and QuickTime, in which a video clip is represented by a single file (or by a handful of files: one video, several audio). I now find myself having to deal with DPX files. Unlike MXF or QuickTime, the DPX format creates one file for each frame of video or film. For American video, that's about 30 files per second, 1800 files per minute, and so on.

I have a high-performance 10-gigabit-based NAS that allows me to capture and play back "single file" uncompressed HD video streams (up to 160 MB/sec per stream) without any problems. I can also PLAY BACK so-called 2K DPX video, which has the "1 file per frame" structure and a higher data rate than uncompressed HD -- a bit over 300 MB/sec. However, when I go to WRITE DPX files, that's where the trouble begins. Even when I am recording "standard definition" DPX files at a data rate of only about 40 MB/sec, or 1.3 MB/file, I am having trouble.

This is what I am observing:

1) When I begin recording, I can see that data immediately starts moving across the network at a steady rate of about 41-42 MB/sec, and data also starts getting written to the hardware RAID at the same steady rate (3ware 9650 + 16 x 7200 RPM enterprise-class SATA disks).

2) After about 3 minutes of recording, or after about 6000 files have been written, my server suddenly stops writing to the RAID subsystem. The data continues to come in through the network interface, but the writing stops. When I look at vmstat, I can see that "outgoing blocks" pretty much grind to a halt at this point.

3) Then, after a pause of about 5-10 seconds, the system begins writing to the RAID again. All the while, the data has been coming into the network interface at a fairly steady 41-42 MB/sec. The writing never seems to "catch up", and about 10 seconds after the writing begins again, the client application stops sending data because it senses that it has "dropped frames".

4) Samba logs at level 5 show some curious errors now and then about "xfs_quota" failing -- but they don't seem to be concentrated just at the point where the writing stops.

[2009/10/30 15:00:59, 3] lib/sysquotas.c:sys_get_quota(433)
  sys_get_xfs_quota() failed for mntpath[/mnt/vol1] bdev[/dev/sdb1] qtype[2] id[502]: No such file or directory.

By the way, I tried mounting my XFS filesystems without quota support -- I don't see these messages any more, but I still have the same problem: the system stops writing to the disks after about 3 minutes.

5) If I export an iSCSI target from the exact same NAS (via iSCSI Enterprise Target, for example), mount it on my Windows machine and format it as NTFS, I don't have any trouble capturing for an hour or more. So, there is clearly nothing wrong with the network or the cabling or the RAID subsystem itself.

6) Similarly, if I format my storage with EXT3 instead of XFS and export the volume via Samba, I don't have any trouble recording for the same very long periods of time. I DO observe a very different pattern of writing to the storage, however. While 41-42 MB/sec comes in steadily over the network interface, with ext3-formatted disks the NAS writes to the storage in bursts of about 200-250 MB/sec. Then there is no writing activity for about 4-5 seconds. Then another burst of 200-250 MB/sec. And the pattern continues.

7) My NAS system is running a plain-vanilla 2.6.20.15 kernel.org kernel. It is a 64-bit system with 3.2 GHz Quad Core Intel 5482 CPUs and 4 GB of RAM. However, I see EXACTLY the same behavior on an even more powerful Nehalem-based system with a 2.93 GHz Quad Core CPU, 6 GB of RAM and the very latest 2.6.31.4 kernel. So, I don't think it has anything to do with the XFS version or the hardware, for instance. And as I said above, I don't have trouble handling much higher data rates when I am only creating a few files per hour, versus creating 30 files per second.

My hunch is that the problem is related to the number of files I am creating per second. Could it be that XFS is not handling this situation well, whereas this doesn't pose a problem for EXT3 or iSCSI/NTFS? I am wondering if there are any specific XFS formatting or mounting options that would make a huge difference (size of log, sectorsize, agsize, inode size, allocation group count, log buffers at mounting, etc).

Any ideas here? Is this a known issue? And is there a workaround? Any help would be greatly appreciated.

Andrew

_______________________________________________
xfs mailing list
xfs@...
http://oss.sgi.com/mailman/listinfo/xfs
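For anyone who wants to reproduce the "one file per frame" write pattern without a capture application, here is a minimal sketch. The helper names `write_frames` and `stream_rate` are made up for illustration, the files are zero-filled stand-ins rather than real DPX frames, and the loop is not paced at 30 frames per second (it writes as fast as the storage accepts), so it only approximates the workload described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of any XFS or Samba tooling):
# write COUNT zero-filled files of BYTES bytes each into DIR,
# named 0000001.dpx, 0000002.dpx, ... like a DPX frame sequence.
# Usage: write_frames DIR COUNT BYTES
write_frames() {
    dir=$1; count=$2; bytes=$3
    mkdir -p "$dir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        name=$(printf '%07d.dpx' "$i")
        # One file per frame; dd writes a single block of BYTES bytes.
        dd if=/dev/zero of="$dir/$name" bs="$bytes" count=1 2>/dev/null || return 1
        i=$(( i + 1 ))
    done
}

# Sustained data rate in bytes/sec for FPS frames of BYTES bytes each.
# Usage: stream_rate FPS BYTES
stream_rate() {
    echo $(( $1 * $2 ))
}
```

With the figures from the post, `stream_rate 30 1300000` gives 39000000 bytes/sec, which matches the "about 40 MB/sec" quoted for standard-definition DPX. Running `write_frames /mnt/vol1/test 6000 1300000` while watching vmstat would be one way to see whether the stall also appears without Samba in the path; that is a suggestion, not something tested in this thread.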
|
|
Re: XFS and DPX files

On Sat, 31 Oct 2009 08:26:28 -0400 you wrote:

> Any ideas here? Is this a known issue? And is there a workaround? Any
> help would be greatly appreciated.

Maybe you should try mounting the XFS filesystem with these options:

nobarrier,noatime

--
Emmanuel Florac
www.intellique.com
|
|
Re: XFS and DPX files

> Maybe you should try mounting the XFS filesystem with these options:
> nobarrier,noatime

Thanks. Already doing that -- the 3ware controller does not support barriers, so that's automatically ruled out (you see a message in the syslog when mounting the filesystem that barriers will not be used). Already mounting with "noatime". Also have been trying "filestreams".
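To confirm which options a mounted filesystem is actually using, one can look at its entry in /proc/mounts. The sketch below is an illustration only: `mount_has_opt` is a made-up helper name, it reads /proc/mounts-format text from standard input, and the exact option strings the kernel reports can differ between kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem mounted at MNT lists
# OPT among its mount options. Input is /proc/mounts-format text on
# stdin (fields: device, mount point, fs type, options, dump, pass).
# Usage: mount_has_opt MNT OPT < /proc/mounts
mount_has_opt() {
    mnt=$1; opt=$2
    awk -v m="$mnt" -v o="$opt" '
        $2 == m {
            # Field 4 is the comma-separated option list.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == o) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

For example, `mount_has_opt /mnt/vol1 noatime < /proc/mounts && echo yes` prints "yes" only when noatime is in effect on that mount point.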
|
|
Re: XFS and DPX files

On Sat, 31 Oct 2009 10:37:30 -0400 you wrote:

> Thanks. Already doing that -- 3ware controller does not support
> barriers, so that's automatically ruled out (you see a message in the
> syslog when mounting the filesystem that barriers will not be used).
> Already mounting with "noatime". Also have been trying "filestreams".

The 3Ware controller definitely supports barriers... What model are you using, BTW? On the 9650, the latest firmware gave me a solid 15-20% performance boost!

Did you apply the recommended optimizations like these:

echo 512 > /sys/block/sdXX/queue/nr_requests
blockdev --setra 65536 /dev/sdXX

For the read-ahead, the usual values depend heavily on the make and number of the disks, usually ranging from 4096 to 65536.

Another trick is to mkfs the drive with su and sw matching the underlying RAID. For instance, for a 15-drive RAID6 with a 64K stripe, use something like (beware, unverified syntax from memory):

mkfs -t xfs -d su=65536,sw=15 /dev/sdXX

--
Emmanuel Florac
www.intellique.com
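The two tuning commands above can be wrapped in a dry-run helper that only prints them, so the values can be reviewed before anything is changed. This is a sketch under assumptions: `block_tuning_cmds` and `readahead_mib` are made-up names, and the device passed in is assumed to be a whole disk (for example /dev/sdb), since the nr_requests queue setting lives under the disk's /sys/block entry rather than a partition's.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print (do not execute) the queue-depth
# and read-ahead commands for a whole-disk device.
# Usage: block_tuning_cmds DEVICE NR_REQUESTS READAHEAD_SECTORS
block_tuning_cmds() {
    dev=$1; nr=$2; ra=$3
    name=${dev##*/}    # /dev/sdb -> sdb
    printf 'echo %s > /sys/block/%s/queue/nr_requests\n' "$nr" "$name"
    printf 'blockdev --setra %s %s\n' "$ra" "$dev"
}

# blockdev --setra counts in 512-byte sectors; convert to MiB.
# Usage: readahead_mib SECTORS
readahead_mib() {
    echo $(( $1 * 512 / 1048576 ))
}
```

Since `blockdev --setra` takes its value in 512-byte sectors, the suggested range of 4096 to 65536 corresponds to 2 MiB to 32 MiB of read-ahead (`readahead_mib 4096` and `readahead_mib 65536`).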
|
|
Re: XFS and DPX files

On Saturday 31 October 2009 Emmanuel Florac wrote:

> Another trick is to mkfs the drive with su and sw matching the
> underlying RAID, for instance for a 15 drives RAID6 with 64K stripe
> use something like (beware, unverified syntax from memory):
>
> mkfs -t xfs -d su=65536,sw=15 /dev/sdXX

I believe for a 15-drive RAID-6, where 2 disks are used for redundancy, the correct mkfs would be:

mkfs -t xfs -d su=65536,sw=13 /dev/sdXX

That is, you tell XFS how many *data disks* there are, not how many disks the RAID uses, because the important thing is that XFS should distribute its metadata over different disks.

One thing you could try: every 2 minutes, create a new dir and store new files there. It could well be that XFS becomes slower once a dir holds a certain number of files. If you change the dir and everything now writes without drops, that is the problem. If you can't change the dir for your application, start a small batch job that moves the files to another dir, or removes them.

Another thing to try is whether it helps to turn disk write caches *on*, despite all the warnings in the FAQ. That could also give an idea of where to look next.

Best regards, zmi

--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4
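The "small batch job" idea could look roughly like the sketch below. It is only an illustration: `bucket_frames` is a made-up name, the 4-digit bucket directory naming is an assumption, and moving files that the capture application still has open or expects at their original path may upset that application, so this is for diagnosis rather than production use.

```shell
#!/bin/sh
# Hypothetical helper: move *.dpx files from SRC into numbered
# subdirectories of DEST (0000, 0001, ...), at most PER_DIR files each.
# Usage: bucket_frames SRC DEST PER_DIR
bucket_frames() {
    src=$1; dest=$2; per_dir=$3
    n=0
    # The glob expands in sorted order, so frames stay in sequence.
    for f in "$src"/*.dpx; do
        [ -e "$f" ] || continue    # glob matched nothing
        bucket=$(printf '%04d' $(( n / per_dir )))
        mkdir -p "$dest/$bucket" || return 1
        mv "$f" "$dest/$bucket/" || return 1
        n=$(( n + 1 ))
    done
}
```

At 30 files per second, "every 2 minutes" corresponds to 3600 files per directory, so `bucket_frames /mnt/vol1/capture /mnt/vol1/sorted 3600` would approximate the suggestion (the paths are placeholders).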
|
|
Re: XFS and DPX files

On Mon, 2 Nov 2009 12:05:27 +0100 Michael Monnerie <michael.monnerie@...> wrote:

> I believe for a 15 drive RAID-6, where 2 disks are used for
> redundancy, the correct mkfs would be:
> mkfs -t xfs -d su=65536,sw=13 /dev/sdXX

Yes, you're right, I replied a bit too quickly :)

> Another thing to try is if it would help to turn disk cache writes
> *on*, despite all warnings in the FAQ.

The 3Ware is so slow it's almost unusable without write cache. I bet he already uses it anyway.

--
Emmanuel Florac | Intellique
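The corrected rule (sw counts data disks, not all disks) is easy to get wrong by hand, so here is a small sketch that derives the option string. `xfs_stripe_opts` is a made-up helper name; it only prints the argument for `mkfs -t xfs -d` and never touches a device.

```shell
#!/bin/sh
# Hypothetical helper: print the "su=...,sw=..." argument for
# mkfs -t xfs -d, where sw is the number of data disks
# (total disks minus parity disks) and su is given in KiB.
# Usage: xfs_stripe_opts TOTAL_DISKS PARITY_DISKS STRIPE_UNIT_KIB
xfs_stripe_opts() {
    total=$1; parity=$2; su_kib=$3
    data=$(( total - parity ))
    if [ "$data" -lt 1 ]; then
        echo "xfs_stripe_opts: no data disks left" >&2
        return 1
    fi
    printf 'su=%s,sw=%s\n' $(( su_kib * 1024 )) "$data"
}
```

For the 15-drive RAID-6 with a 64K stripe discussed above, `xfs_stripe_opts 15 2 64` prints `su=65536,sw=13`, which matches the corrected mkfs line.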
|
|
Re: XFS and DPX files

AndrewL733@... wrote:

>> Maybe you should try mounting the XFS filesystem with these options:
>> nobarrier,noatime
>
> Thanks. Already doing that -- 3ware controller does not support
> barriers, so that's automatically ruled out (you see a message in the
> syslog when mounting the filesystem that barriers will not be used).
> Already mounting with "noatime". Also have been trying "filestreams".

Filestreams is really only useful if you have multiple threads writing in this manner -- for example, 2 different movie streams.

Normally an allocation group is chosen for all new files in a directory, so if you have 2 streams to 2 directories, you are writing to 2 AGs ... all is good until those AGs get full and things spill over. At that point you may wind up interleaving files from both streams in a 3rd AG. The filestreams option more or less locks other streams out of the new AG, so that stuff stays segregated.

If this is your situation (multiple streams, each to their own directory), then filestreams may help. I think for a single stream of files, from a single video source, it won't matter.

-Eric
|
|
Re: XFS and DPX files

>> I believe for a 15 drive RAID-6, where 2 disks are used for
>> redundancy, the correct mkfs would be:
>> mkfs -t xfs -d su=65536,sw=13 /dev/sdXX
>
> Yes you're right, I replied a bit too quickly :)
>
>> Another thing to try is if it would help to turn disk cache writes
>> *on*, despite all warnings in the FAQ.

Thank you for your suggestions. Yes, I have write caching enabled. And I have StorSave set to "Performance". And I have a UPS on the system at all times!

The information about barriers was useful. In years past I was running much older firmware for the 3ware 9650 cards, and that did not support barriers. But it is true that the current firmware does support barriers. I also believe the 3ware StorSave "Performance" setting will disable barriers as well -- at least it makes the card ignore FUA commands.

Anyway, I have mounted the XFS filesystem with the "nobarrier" flag and I'm still seeing the same behavior. If you want to take a closer look at what I mean, please go to this link:

http://sites.google.com/site/andrewl733info/xfs_and_dpx

At this point, I have tried the following -- and none of these approaches seems to fix the problem:

-- preallocation of DPX files
-- reservation of DPX files (make 10,000 zero-byte files named 0000001.dpx through 0010000.dpx)
-- creating the xfs filesystem with an external log device (also a 16-drive RAID array, because that's what I have available)
-- mounting with large logbsize
-- mounting with more logbufs
-- mounting with larger allocsize

Again, I want to point out that I don't have any problem with the underlying RAID device. On Linux itself, I get Bonnie++ scores of around 740 MB/sec reading and 650 MB/sec writing, minimum. Over 10 Gigabit Ethernet, I can write uncompressed HD streams (160 MB/sec) and I can read 2K DPX files (300+ MB/sec). dd shows similar results.

My gut feeling is that XFS is falling over after creating a certain number of new files. Because the DPX format creates one file for every frame (30 files/sec), it's not really a video stream. It's really like making 30 Photoshop files per second. It seems as if some resource that XFS needs is being used up after a certain number of files are created, and that it is very disruptive and costly to get more of that resource. Why ext3 and ext4 can keep going past 60,000 files while xfs falls over after 4000 or 5000 files, I do not understand.
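For reference, the "reservation" experiment in the list above (zero-byte files named 0000001.dpx through 0010000.dpx) can be sketched as follows. `reserve_dpx_names` is a made-up name, and this is a guess at how such placeholders might be created, not necessarily how it was done in the test described.

```shell
#!/bin/sh
# Hypothetical helper: create COUNT zero-byte placeholder files in DIR,
# named 0000001.dpx, 0000002.dpx, ... (7-digit, zero-padded).
# Note: ": >" truncates a file that already exists under that name.
# Usage: reserve_dpx_names DIR COUNT
reserve_dpx_names() {
    dir=$1; count=$2
    mkdir -p "$dir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        : > "$dir/$(printf '%07d.dpx' "$i")" || return 1
        i=$(( i + 1 ))
    done
}
```

`reserve_dpx_names /mnt/vol1/capture 10000` would create the 10,000 names mentioned above (the directory is a placeholder). The idea behind the experiment is that the inodes and directory entries already exist when recording starts, so the capture only has to fill in file data.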
|
|
Re: XFS and DPX files

On Monday 02 November 2009 Emmanuel Florac wrote:

> The 3Ware is so slow it's almost unusable without write cache. I bet
> he already uses it anyway.

Don't mix up the controller write cache and the disk write cache. The controller write cache should be on whenever you have a BBM installed, because this brings real performance. The disk write cache should always be off in a production environment, because on a power failure you will lose data without anybody noticing (the controller believes the sectors have already been written...).

Hence my suggestion to turn on the disk write cache just to see if it makes a difference.

Best regards, zmi

--
Michael Monnerie
|
|
Re: XFS and DPX files

On Mon, 02 Nov 2009 16:50:48 -0500 you wrote:

> It seems as if some resource that
> XFS needs is being used up after a certain number of files are
> created, and that it is very disruptive and costly to get more of
> that resource. Why ext3 and ext4 can keep going past 60,000 files and
> xfs falls over after 4000 or 5000 files, I do not understand.

I'll do some checks on my side. I have several RAID systems with various RAID controllers (including 3Ware) and a nice "dpx stream" simulator from OpenCube.

--
Emmanuel Florac
www.intellique.com
|
|
Re: XFS and DPX files

AndrewL733@... wrote:

>>> I believe for a 15 drive RAID-6, where 2 disks are used for
>>> redundancy, the correct mkfs would be:
>>> mkfs -t xfs -d su=65536,sw=13 /dev/sdXX
>>
>> Yes you're right, I replied a bit too quickly :)
>>
>>> Another thing to try is if it would help to turn disk cache writes
>>> *on*, despite all warnings in the FAQ.
>
> Thank you for your suggestions. Yes I have write caching enabled. And I
> have StorSave set to "Performance". And I have a UPS on the system at
> all times!
>
> The information about barriers was useful. In years past I was running
> much older firmware for the 3ware 9650 cards and that did not support
> barriers. But it is true the current firmware does support barriers. I
> also believe the 3ware StorSave "Performance" setting will disable
> barriers as well -- at least it makes the card ignore FUA commands.
>
> Anyway, I have mounted the XFS filesystem with the "nobarrier" flag and
> I'm still seeing the same behavior. If you want to take a closer look
> at what I mean, please go to this link:
>
> http://sites.google.com/site/andrewl733info/xfs_and_dpx
>
> At this point, I have tried the following -- and none of these
> approaches seems to fix the problem:
>
> -- preallocation of DPX files
> -- reservation of DPX files (Make 10,000 zero-byte files named
> 0000001.dpx through 0010000.dpx)
> -- creating xfs filesystem with external log device (also a 16-drive
> RAID array, because that's what I have available)
> -- mounting with large logbsize
> -- mounting with more logbufs
> -- mounting with larger allocsize

Have you said how large the filesystem is? If it's > 1T or 2T, and you're on a 64-bit system, have you tried the inode64 mount option to get nicer inode vs. data allocation behavior?

Other suggestions might be to try blktrace/seekwatcher to see where your IO is going, or maybe even oprofile to see if xfs is burning CPU searching for allocations, or some such ...

-Eric
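A sketch of how these two suggestions might be turned into commands follows. Everything here is illustrative: `wants_inode64` and `trace_cmds` are made-up names, the 1 TiB threshold is just the lower of the two sizes mentioned above, and the printed blktrace and seekwatcher invocations use the options as I remember them (`-d`/`-o` for blktrace, `-t`/`-o` for seekwatcher), so check the man pages before running them. The helper only prints commands; it does not trace anything.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a filesystem of SIZE_BYTES is larger
# than 1 TiB, the lower of the thresholds mentioned in the thread.
# Usage: wants_inode64 SIZE_BYTES
wants_inode64() {
    [ "$1" -gt 1099511627776 ]
}

# Hypothetical dry-run helper: print (do not execute) a blktrace
# capture and a seekwatcher plot command for DEVICE, using BASENAME
# for the trace files. Option letters are from memory; verify them.
# Usage: trace_cmds DEVICE BASENAME
trace_cmds() {
    dev=$1; base=$2
    printf 'blktrace -d %s -o %s\n' "$dev" "$base"
    printf 'seekwatcher -t %s -o %s.png\n' "$base" "$base"
}
```

If `wants_inode64` succeeds for the volume's size, adding inode64 to the existing mount options (for example `-o noatime,inode64`) is the experiment being suggested; capturing a trace around the 3-minute mark would then show whether the stall coincides with a change in where the writes land.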
|
|
Re: XFS and DPX files

On Mon, 2 Nov 2009 22:58:35 +0100 Michael Monnerie <michael.monnerie@...> wrote:

> Hence my suggestion for turning on the disk write cache just to see
> if it makes a difference.

Unfortunately there isn't any way to manage that on the 3Ware controllers. Worse, I couldn't get a clear answer from 3Ware support about the controller's policy on drive caches. However, from the tests I've done (pulling the plug on a server while writing), it looks like the 3Ware uses the disk caches in write-through mode (fortunately, now that many drives come with 32 or 64 MB caches).

Some other SATA/SAS RAID controllers (Areca, LSI and Adaptec) allow you to activate the drive write-back cache separately from the controller cache, though.

--
Emmanuel Florac | Intellique
|
|
Re: XFS and DPX files

On Tuesday 03 November 2009 Emmanuel Florac wrote:

> Michael Monnerie <michael.monnerie@...> wrote:
> > Hence my suggestion for turning on the disk write cache just to see
> > if it makes a difference.
>
> Unfortunately there isn't any way in the 3Ware controllers to manage
> that. Worse, I couldn't get a clear answer from 3Ware support about
> the controller policy about drives caches. However from the tests
> I've done (pulling the plug on a server while writing) it looks like
> the 3Ware uses the disks caches in write-thru mode, however
> (fortunately now that many drives come with 32 or 64 MB caches).
>
> Some other SATA/SAS RAID controllers (Areca, LSI and Adaptec) allows
> to activate drive write-back cache separately from the controller
> cache, though.

Looks like you hadn't read the FAQ until now. I tried to document the unclear bits as well as I could:

http://www.xfs.org/index.php/XFS_FAQ#Q._Which_settings_does_my_RAID_controller_need_.3F

Best regards, zmi

--
Michael Monnerie