Bad performance under heavy I/O, and a possible solution

GJones
Donor
Posts: 300
Joined: 22. Jul 2011, 23:27

Bad performance under heavy I/O, and a possible solution

Post by GJones »

For a while I've been noticing some behavior on Linux during heavy disk I/O, which is probably good for servers, but terrible for desktops. Basically, when one program is writing a lot of data to the disk, it hogs all the I/O bandwidth and slows everything else down. This results in very high throughput, but bad desktop responsiveness.

(Whereas on Windows XP pretty much the opposite is true; copying files and such takes a long time, but the desktop remains responsive.)

Apparently I'm not the only one who's seen this. I found this thread on Server Fault: http://serverfault.com/questions/126413 ... irty-pages

Based on that thread, and on some other stuff I'd read about the vm.dirty_* settings in sysctl, I did some primitive tests with dd and eventually settled on this:

Code: Select all

vm.dirty_background_ratio = 1000000
vm.dirty_ratio = 1000000
This hugely increased responsiveness under load, but cut throughput with dd roughly in half relative to the defaults. Since I don't need high write throughput (at this point), I considered that acceptable. However, it is a bit of a hackish solution...
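
In case anyone wants to reproduce this, the settings are applied with the usual sysctl mechanics; nothing Salix-specific, and the values are just whatever you happen to be testing:

Code: Select all

# apply immediately (as root; lost on reboot)
sysctl -w vm.dirty_background_ratio=1000000
sysctl -w vm.dirty_ratio=1000000

# check what's currently set
sysctl vm.dirty_background_ratio vm.dirty_ratio

# to persist, put the "vm.* = value" lines in /etc/sysctl.conf and reload:
sysctl -p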

Has anyone else had issues with performance under load? If so, have you had success with anything like this? Or is there a better way to fix it?
jayseye
Posts: 233
Joined: 24. Jul 2011, 17:22
Location: Brownsmead, Oregon (Center of the Universe)

Re: Bad performance under heavy I/O, and a possible solution

Post by jayseye »

Apologies in advance if this is unrelated; it's generally relevant to desktop (vs. server) settings, though perhaps off-topic regarding performance under load:

Have had success here with decreased "swappiness" as mentioned in other threads, IIRC. Caveat is that this old PC is fairly low on RAM (640 MB) relative to how many desktop processes are running (apps with multiple tabs / windows).
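
For completeness, the swappiness change is just another sysctl knob; the 10 below is only an example (the kernel default is 60), so tune to taste:

Code: Select all

# lower value = kernel less inclined to swap out application memory
sysctl -w vm.swappiness=10

# make it permanent by adding "vm.swappiness = 10" to /etc/sysctl.conf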

Maybe more to your point, a 50% decrease in 'dd' performance would be too much of a sacrifice here, as frequent backups are de rigueur on this old hardware. So I'd be interested if you find a more balanced solution, or at least can detail how to tune the tradeoffs.
GJones
Donor
Posts: 300
Joined: 22. Jul 2011, 23:27

Re: Bad performance under heavy I/O, and a possible solution

Post by GJones »

Well, further experimentation indicates that setting those parameters to something more like 4 MB produces much better throughput, without seriously hurting desktop performance:

Code: Select all

vm.dirty_background_bytes=4000000
vm.dirty_bytes=4000000
(I feel like it might be better to use a power of two, but 4194304 is harder to remember. :P )

Actually it looks like setting them above 4 MB does not significantly help throughput. The major cutoff seems to be below 4 MB. Granted, that's on my netbook... Not sure what I'd suggest as a generic setting. This is probably not a one-size-fits-all tweak.
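
If anyone wants to watch what the dirty cache is actually doing while dd runs, /proc/meminfo shows it; this is just how I've been eyeballing it, nothing scientific:

Code: Select all

# refresh the amount of dirty / in-flight writeback data every second
watch -n 1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'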

Edit: as far as swappiness, I've never found tweaking it to be that helpful. Actually, seeing as setting vm.swappiness to 100 does not immediately create a swapfest, I kind of wonder how much the kernel listens to that setting.
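
For what it's worth, one way to check whether the kernel is actually swapping anything out under load is simply:

Code: Select all

# total swap in use
free -m

# per-device breakdown
swapon -s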
GJones
Donor
Posts: 300
Joined: 22. Jul 2011, 23:27

Re: Bad performance under heavy I/O, and a possible solution

Post by GJones »

Hey jayseye, good news - I ran the dd test again with vm.vfs_cache_pressure=50 and normal dirty_background_ratio and dirty_ratio, and got much better responsiveness *and* less reduction in write speed. It looks like lowering vfs_cache_pressure is the way to go.

NB: as I understand it, this makes the kernel prefer to keep program memory in RAM, and when necessary swap out cached files instead. Obviously that could also slow things down when copying large files, but it doesn't seem to slow things down as much as the dirty_bytes/dirty_background_bytes tweak.
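
If anyone wants to try it, the knob lives in the same place as the dirty_* ones; the 50 is just the value I tested (the default is 100):

Code: Select all

# default is 100; apply the lower value immediately (as root)
sysctl -w vm.vfs_cache_pressure=50

# equivalently:
echo 50 > /proc/sys/vm/vfs_cache_pressure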

Edit: Hmm, interestingly programs are slower to quit under load with vfs_cache_pressure reduced. Odd.

Edit again: the vfs_cache_pressure tweak seems to stop being effective after a while... Maybe program memory is being forced into swap space anyway? Dunno. At any rate, the core of the problem seems to be "writes hog too much I/O bandwidth," so the best solution is probably to go ahead and make the kernel do smaller writes (using vm.dirty_bytes).

(If I understand this properly anyway... Which is a big if.)

For now I think I'd suggest setting dirty_bytes to something like 8 MB for a desktop; that seems to be a decent compromise. Actually it looks like dirty_background_bytes doesn't have to be set, since a lower dirty_bytes value takes care of it already. I think.
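
In /etc/sysctl.conf form the whole thing would then be just one line; the value is only what worked on my hardware, so take it with a grain of salt:

Code: Select all

# cap dirty data at roughly 8 MB before writeback kicks in
vm.dirty_bytes = 8000000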
jayseye
Posts: 233
Joined: 24. Jul 2011, 17:22
Location: Brownsmead, Oregon (Center of the Universe)

Re: Bad performance under heavy I/O, and a possible solution

Post by jayseye »

On a PC with limited RAM, reduced swappiness makes a big difference. So our situations are likely very different: the only "load" I experience is when the system starts thrashing (swapping) due to too many running processes. :|

Usually I handle that by restarting Firefox 7, which still seems to have a memory leak when left running too long. :roll: Before restarting FF I like to save and sort all open tabs, but am too busy ATM. Will get to that sometime today as I hear that FF 8 is out.

Actually I do have Salix (also Fluxbox) installed on another PC with a lot more RAM (3GB), but have been mostly running FreeBSD there for consulting work. Might free up that laptop for more Salix experiments, as I move more FreeBSD stuff over to the old Mac Mini. Finally got xorg installed, and plan to solve a "no screens found" error there today.

Anyway, back to the point: what are you running that does so much writing to disk? :?:
GJones
Donor
Posts: 300
Joined: 22. Jul 2011, 23:27

Re: Bad performance under heavy I/O, and a possible solution

Post by GJones »

For me the main disk I/O hog is slapt-get (or probably more properly, spkg). Without tweaks the desktop gets slow during updates. It's not a big deal, but it's rather annoying.
jayseye
Posts: 233
Joined: 24. Jul 2011, 17:22
Location: Brownsmead, Oregon (Center of the Universe)

Re: Bad performance under heavy I/O, and a possible solution

Post by jayseye »

That's interesting... being lazy and prone to carpal tunnel problems, I mostly use gslapt. Though updates tend to take a while, I chalk that up to needing to tweak my mirrors (Sources). Fixing that is on my list...

Anyway, even on this slow old workhorse (PIII at ~700MHz, 640MB RAM, 40GB HD), with waaay too many programs and tabs open, responsiveness is fine during updates. Under Fluxbox 13.37, I just Alt-Tab or mouse over to another virtual desktop and go about my business.

Forgive me for asking, as I'm a recovering computer support consultant: is there any chance that your hard drive has a hardware problem such as bad sectors? Do you ever run surface scans / check SMART data? SpinRite is my tool of choice for keeping old drives alive, but even new drives occasionally have problems which can really slow down I/O.
GJones
Donor
Posts: 300
Joined: 22. Jul 2011, 23:27

Re: Bad performance under heavy I/O, and a possible solution

Post by GJones »

... You know, I hadn't even thought of that.

Smartctl shows all the statistics as okay. It shows a bunch of DMA errors happening a while back, but nothing recent. Short self-test passes without error.

Hmm, just a thought - this is a netbook, maybe some of the latency is due to CPU throttling?
jayseye
Posts: 233
Joined: 24. Jul 2011, 17:22
Location: Brownsmead, Oregon (Center of the Universe)

Re: Bad performance under heavy I/O, and a possible solution

Post by jayseye »

You could test disabling CPU power saving, though a bad block on the HD seems a way more likely culprit.
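
To rule throttling in or out, you can peek at (and temporarily change) the cpufreq governor via sysfs; assuming the usual paths, something like:

Code: Select all

# see which governor is currently active
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

# temporarily force full clock speed (as root); switch back to "ondemand" afterwards
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor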

If you can do without the netbook for around 90 minutes, the Long Self Test would perform an actual read check of every sector on the drive. Or, overnight, run 'badblocks -svn' for a full, non-destructive read-write test.
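
Roughly, the commands would be as below; substitute your actual drive for /dev/sda, and note that the badblocks read-write test should only be run with the filesystems unmounted (e.g. from a live CD):

Code: Select all

# start the long SMART self-test, then check the results when it's done
smartctl -t long /dev/sda
smartctl -a /dev/sda

# non-destructive read-write scan of the whole drive (slow; run overnight)
badblocks -svn /dev/sda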
GJones
Donor
Posts: 300
Joined: 22. Jul 2011, 23:27

Re: Bad performance under heavy I/O, and a possible solution

Post by GJones »

Some updates...

1. As far as I can tell, Linux (with any I/O scheduler) will get bogged down on any computer with a normal HDD when heavy I/O is going on. Different schedulers behave a little differently; CFQ will make things slow, deadline will make things slower and cause audio skipping, and noop will prevent things from launching, period. (Commands for switching schedulers are in the sketch after this list.)

2. If you use dd like this as the test:

Code: Select all

dd if=/dev/zero of=dumpfile bs=4096 count=1000000

the vfs_cache_pressure tweak will work... For about 30 seconds. After that (presumably as everything is pushed out of RAM) things will start slowing down again.

3. With CFQ, increasing the I/O queue length (/sys/block/sdX/queue/nr_requests) makes things a little faster (see the commands below the list). It has to be absolutely huge though - I didn't see any effects until it was over a million.
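
For reference, the scheduler switching from point 1 and the queue length from point 3 both go through sysfs; sda is just an example device name here, and the huge queue value is obviously not a recommendation:

Code: Select all

# show the available schedulers; the active one is in brackets
cat /sys/block/sda/queue/scheduler

# switch schedulers on the fly (as root)
echo deadline > /sys/block/sda/queue/scheduler

# bump the request queue length (the default is 128)
echo 2000000 > /sys/block/sda/queue/nr_requests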

Limiting dirty_bytes seems to produce good performance no matter what - and it should, because it's hamfistedly cutting off the I/O bandwidth of write operations. But that is a really, really hackish solution. It would be better to have a scheduler that was actually *fair*, and as far as I can tell, none of the three current schedulers (nor the anticipatory scheduler, judging from experiments with Debian Lenny) qualify.

I'm sure there's a better solution... Such a popular OS can't possibly be this crappy, can it? But if there is I haven't found it.