<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<div class="moz-cite-prefix">francis picabia wrote the following on
1/25/2013 7:55 AM:<br>
</div>
<blockquote
cite="mid:CA+AKB6FmLK=DYWT48Lc1YAD7AZ_jR_AWAoTV99v7r1QCvLfdnw@mail.gmail.com"
type="cite"><br>
<div class="gmail_quote">On Thu, Jan 24, 2013 at 12:22 PM, Blake
Hudson <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:blake@ispn.net" target="_blank">blake@ispn.net</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="HOEnZb">
<div class="h5"><br>
<br>
</div>
</div>
There are a couple suggestions I'd like to put forth. First,
improper<br>
partition alignment is generally masked by the controller
cache. I<br>
strongly encourage you to check that your RAID array is making
use of<br>
this cache by enabling the WriteBack caching option on this
array,<br>
especially if your PERC card has a BBU (I think this was
optional on<br>
perc 5). You can install the MegaCLI tool from LSI to verify
this (can<br>
also be checked from OpenManage or reboot into the controller
BIOS).<br>
<br>
</blockquote>
<div><br>
Thanks for this tip. It put me on to what is wrong.<br>
<br>
Jan 18 07:25:39 myserv Server Administrator: Storage Service
EventID: 2335 Controller event log: BBU disabled; changing WB
virtual disks to WT: Controller 0 (PERC 5/i Integrated) <br>
<br>
Bingo! We had write back all along, and the performance
tanked when it<br>
fell back to write through. I was wondering why my policy
change attempts<br>
were flipping back when I tried testing WB this morning!<br>
<br>
This explains everything we've been seeing. Wow. Gotta call
Dell.<br>
<br>
Thanks everyone for the assistance. I didn't think a battery
which shows OK<br>
status in omreport could wound us!<br>
<br>
</div>
</div>
<br>
</blockquote>
The PERC cards will disable write-back caching while the BBU is
charging/exercising. However, within a few hours the BBU should
return to normal status. In rare instances, people on the Dell
mailing list have reported that their caching status never returns
to write-back - even after attempting to force write-back caching on
the array. Power cycling and firmware flashing have been
tried, but seem to be futile in most cases. Often, replacement of
the card is necessary. I'm unsure if it's the battery, the card, or
some software setting, but I would definitely follow up with Dell.<br>
<br>
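If you want to catch this fallback sooner next time, a small log scan may help. The following is only a rough sketch in Python: the pattern is modelled on the single OpenManage syslog line you quoted (EventID 2335), the function name is my own invention, and the second sample line is made-up filler to show a non-match. Other controllers or OMSA versions may word the event differently.<br>
<pre>
```python
import re

# Assumption: pattern is modelled on the one syslog line quoted in this thread.
# Other OpenManage versions or controllers may phrase the event differently.
BBU_EVENT = re.compile(r"EventID:\s*2335\b.*changing WB virtual disks to WT")


def find_wb_fallbacks(lines):
    """Return log lines reporting a write-back to write-through fallback."""
    return [line.rstrip("\n") for line in lines if BBU_EVENT.search(line)]


sample = [
    # Line taken from the message quoted above.
    "Jan 18 07:25:39 myserv Server Administrator: Storage Service "
    "EventID: 2335 Controller event log: BBU disabled; changing WB "
    "virtual disks to WT: Controller 0 (PERC 5/i Integrated)",
    # Hypothetical unrelated entry, included only to show a non-match.
    "Jan 18 07:26:01 myserv cron[123]: unrelated entry",
]

for hit in find_wb_fallbacks(sample):
    print(hit)
```
</pre>
In practice you would pass an open handle on your syslog file instead of the sample list, since the function accepts any iterable of lines.<br>
<br>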
On the next server (or array) you configure, I would attempt to
align your partitions as you've investigated. Sector 2048 seems to
be a good starting position for most RAID levels. I have no
conclusive evidence that a different file system or alignment
improves my performance, because I've never done a fair side-by-side
test with controlled inputs. However, we use ext4 and do align our
partitions using RAID10 on 15k SAS drives for all our Cyrus
installs. I have found some issues with the newer systems that I
attribute to the move from ext3 to ext4 which can result in MySQL
replication problems on power loss/freeze, but these issues are very
rare and usually easy to recover from in our environment. I also
notice that new systems always perform better than the old systems,
even with identical hardware - I've often attributed this to
fragmentation.<br>
<br>
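On the alignment point, the arithmetic behind sector 2048 is easy to check. Here is a minimal sketch in Python, assuming 512-byte logical sectors and power-of-two stripe element sizes (both are assumptions; check your own drives and controller settings). The helper name is hypothetical. Sector 2048 works out to a 1 MiB offset, which is a whole multiple of every stripe element size listed, while the old default of sector 63 is not.<br>
<pre>
```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def is_aligned(start_sector, stripe_kib, sector_bytes=SECTOR_BYTES):
    """Return True if the partition start falls on a stripe element boundary."""
    offset_bytes = start_sector * sector_bytes
    return offset_bytes % (stripe_kib * 1024) == 0


# Compare the legacy start sector (63) with sector 2048 for a few
# power-of-two stripe element sizes, given in KiB.
for stripe_kib in (64, 128, 256, 512, 1024):
    print(stripe_kib, is_aligned(63, stripe_kib), is_aligned(2048, stripe_kib))
```
</pre>
Note that this only checks the stripe element size. A full stripe width that is not a power of two (for example, three data disks in a parity array) will not necessarily divide 1 MiB evenly.<br>
<br>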
--Blake<br>
</body>
</html>