Discussion:
[Beowulf] Optimal BIOS settings for Tyan K8SRE
stephen mulcahy
2006-08-31 13:34:41 UTC
Hi,

I'm maintaining a 20-node cluster of Tyan K8SREs (4GB RAM, dual Opteron
270s) which are being used primarily for Oceanographic modelling (MPICH2
running on Debian/Linux 2.6 kernel).

I had to make some tweaks to make all 4GB of RAM visible to the OS.

I'm now at the point where I'm considering a pass at performance tuning
the system. Before I start on OS level tuning, I'm trying to figure out
whether there are any performance improvements to be had from tweaking
BIOS settings, particularly those relating to Memory and ECC. I have a
reasonable conceptual understanding of what the various settings are doing
(and have glanced at the AMD BIOS developers guide for reference) but I am
very unclear on what the potential performance impact of any of these
settings are. Does anyone have any general advice or pointers to good
reference information on this?

My current settings are,

Hammer Configuration
HT-LDT Frequency Auto
Dual-Core Enable Enabled
ECC Features
ECC Enabled
ECC Scrub Redirection Enabled
Dram ECC Scrub CTL Disabled
Chip-Kill Disabled
DCACHE ECC Scrub CTL Disabled
L2 ECC Scrub CTL Disabled

Memory Hole
4GB Memory Hole Adjust Manual
4GB Memory Hole Size 768 MB
IOMMU Enabled
Size 32 MB
Memhole mapping Hardware

Memory Config
Swizzle Memory Banks Enabled
DDR clock jitter Disabled
DDR Data Transfer Rate Auto
Enable all memory clocks Populated
Controller config mode Auto
Timing config mode Auto
AMD PowerNow! Disabled
Node Memory Interleave Auto
Dram Bank Interleave Auto
GART Error Reporting Disabled
MTRR Mapping Discrete


Any comments on those welcome.

Thanks,

-stephen
--
Stephen Mulcahy Applepie Solutions Ltd http://www.aplpi.com
Unit 30, Industry Support Centre, GMIT, Dublin Rd, Galway, Ireland.
Bruce Allen
2006-09-03 22:10:35 UTC
Stephen,

Search the archives of this list to find some advice from me and others
about setting the ECC features.

Cheers,
Bruce
Post by stephen mulcahy
Hi,
I'm maintaining a 20-node cluster of Tyan K8SREs (4GB RAM, dual Opteron
270s) which are being used primarily for Oceanographic modelling (MPICH2
running on Debian/Linux 2.6 kernel).
I had to make some tweaks to make all 4GB of RAM visible to the OS.
I'm now at the point where I'm considering a pass at performance tuning
the system. Before I start on OS level tuning, I'm trying to figure out
whether there are any performance improvements to be had from tweaking
BIOS settings, particularly those relating to Memory and ECC. I have a
reasonable conceptual understanding of what the various settings are doing
(and have glanced at the AMD BIOS developers guide for reference) but I am
very unclear on what the potential performance impact of any of these
settings are. Does anyone have any general advice or pointers to good
reference information on this?
My current settings are,
Hammer Configuration
HT-LDT Frequency Auto
Dual-Core Enable Enabled
ECC Features
ECC Enabled
ECC Scrub Redirection Enabled
Dram ECC Scrub CTL Disabled
Chip-Kill Disabled
DCACHE ECC Scrub CTL Disabled
L2 ECC Scrub CTL Disabled
Memory Hole
4GB Memory Hole Adjust Manual
4GB Memory Hole Size 768 MB
IOMMU Enabled
Size 32 MB
Memhole mapping Hardware
Memory Config
Swizzle Memory Banks Enabled
DDR clock jitter Disabled
DDR Data Transfer Rate Auto
Enable all memory clocks Populated
Controller config mode Auto
Timing config mode Auto
AMD PowerNow! Disabled
Node Memory Interleave Auto
Dram Bank Interleave Auto
GART Error Reporting Disabled
MTRR Mapping Discrete
Any comments on those welcome.
Thanks,
-stephen
Mark Hahn
2006-09-03 23:06:26 UTC
Post by stephen mulcahy
270s) which are being used primarily for Oceanographic modelling (MPICH2
running on Debian/Linux 2.6 kernel).
on gigabit?
Post by stephen mulcahy
I had to make some tweaks to make all 4GB of RAM visible to the OS.
how much was missing, and was it just graphics aperture-related?
Post by stephen mulcahy
HT-LDT Frequency Auto
Dual-Core Enable Enabled
ECC Features
ECC Enabled
ECC Scrub Redirection Enabled
Dram ECC Scrub CTL Disabled
Chip-Kill Disabled
DCACHE ECC Scrub CTL Disabled
L2 ECC Scrub CTL Disabled
those seem to be the normal settings I see on most machines. the RAS-related
settings seem to be unnecessary for a "normal" cluster (one where there is no
large rate of ECCs, and where a reboot doesn't cause planes to fall
out of the sky.)

on the other hand, I'd love to find out whether there is any performance
impact from enabling scrub, since it does slightly increase memory workload.
then again, if your rate of correctable ECCs is trivial, scrubbing is not
relevant...
Post by stephen mulcahy
Memory Hole
4GB Memory Hole Adjust Manual
4GB Memory Hole Size 768 MB
IOMMU Enabled
Size 32 MB
Memhole mapping Hardware
I don't think there are performance implications here. you seem to have
already found the right combination of iommu/memhole settings that gives
you your full roster of ram. my googling on the topic didn't enlighten
me much, though people apparently recommend "iommu=memaper=3"
Post by stephen mulcahy
Memory Config
Swizzle Memory Banks Enabled
dunno - I don't think this appears in the AMD bios-writers guide
Post by stephen mulcahy
DDR clock jitter Disabled
DDR Data Transfer Rate Auto
Enable all memory clocks Populated
Controller config mode Auto
Timing config mode Auto
those are the settings I normally see as well.
Post by stephen mulcahy
AMD PowerNow! Disabled
Node Memory Interleave Auto
Dram Bank Interleave Auto
for numa-aware OSes (like any modern linux), I think node-memory
interleave should be disabled.

regards, mark hahn.
Bruce Allen
2006-09-04 07:26:43 UTC
Post by stephen mulcahy
ECC Features
ECC Enabled
ECC Scrub Redirection Enabled
Dram ECC Scrub CTL Disabled
Chip-Kill Disabled
DCACHE ECC Scrub CTL Disabled
L2 ECC Scrub CTL Disabled
You can find our systems' BIOS/ECC/scrub settings here:
http://www.lsc-group.phys.uwm.edu/beowulf/nemo/construction/BIOS/bios_settings.txt
Our systems are Supermicro H8SSL-i motherboards, with a
Serverworks/Broadcom HT1000 chipset and a single Opteron 175 (dual core,
2.2 GHz).

The ECC part is:
DRAM ECC Enable = Enabled
MCA DRAM ECC Logging = Enabled
DRAM Scrub Redirect = Enabled
DRAM BG Scrub = 2.62ms
L2 Cache BG Scrub = 84.00ms
Data Cache BG Scrub = 84.00ms

Scrubbing is done one cache line (64 bytes) at a time. Thus with 2GB of
memory and DRAM background scrub interval of 2.62ms we will scrub the
entire memory in approximately:

(2 GB / 64 bytes) * 2.62 ms = (2^31 / 2^6) * 2.62 ms = 87912 secs

So our choices correspond to one complete scrub of DRAM per day. Our
settings scrub the L2 cache more often: about once every half hour. Just
modify the calculation above, using 1MB instead of 2GB, and 84 ms instead
of 2.62 ms. One finds that the L2 cache is scrubbed about once every 1376
seconds (every 23 minutes).
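
In case it helps anyone redo the arithmetic for a different memory size or
scrub interval, the same calculation as a trivial Python script (nothing
more than the numbers above):

    CACHE_LINE = 64  # bytes scrubbed per scrub interval

    def scrub_seconds(size_bytes, interval_ms):
        # seconds to walk the whole region once, one cache line per interval
        return (size_bytes // CACHE_LINE) * (interval_ms / 1000.0)

    # DRAM: 2 GB at one line per 2.62 ms -> ~88000 s, roughly once a day
    print("DRAM scrub: %.0f s" % scrub_seconds(2 * 2**30, 2.62))
    # L2: 1 MB at one line per 84 ms -> ~1376 s, roughly every 23 minutes
    print("L2 scrub:   %.0f s" % scrub_seconds(1 * 2**20, 84.0))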

Cheers,
Bruce
Bruce Allen
2006-09-05 00:48:20 UTC
I did not check, but would not expect any significant performance impact.
With the 84 msec choice of scrub times, we are only touching about a dozen
cache lines (64 bytes each) per second. I don't see how this could have a
significant impact on performance.

Cheers,
Bruce
Post by stephen mulcahy
Hi Bruce,
Do you have any idea what the performance impact from enabling scrubbing
is on your systems? did you do any before/after benchmarking?
Thanks,
-stephen
Post by Bruce Allen
Post by stephen mulcahy
ECC Features
ECC Enabled
ECC Scrub Redirection Enabled
Dram ECC Scrub CTL Disabled
Chip-Kill Disabled
DCACHE ECC Scrub CTL Disabled
L2 ECC Scrub CTL Disabled
http://www.lsc-group.phys.uwm.edu/beowulf/nemo/construction/BIOS/bios_settings.txt
Our systems are Supermicro H8SSL-i motherboards, with a
Serverworks/Broadcom HT1000 chipset and a single Opteron 175 (dual core,
2.2 GHz).
DRAM ECC Enable = Enabled
MCA DRAM ECC Logging = Enabled
DRAM Scrub Redirect = Enabled
DRAM BG Scrub = 2.62ms
L2 Cache BG Scrub = 84.00ms
Data Cache BG Scrub = 84.00ms
Scrubbing is done one cache line (64) bytes at a time. Thus with 2GB of
memory and DRAM background scrub interval of 2.62ms we will scrub the
2 GB/64 Bytes * 2.62 ms = 2^31 / 2^6 * 2.62 ms = 87912 secs
So our choices correspond to one complete scrub of DRAM per day. Our
settings scrub the L2 cache more often: about once every half hour.
Just modify the calculation above, using 1MB instead of 2GB, and 84 ms
instead of 2.62 ms. One finds that the L2 cache is scrubbed about once
every 1376 seconds (every 23 minutes).
Cheers,
Bruce
Bruce Allen
2006-09-05 00:49:28 UTC
PS: be sure to use the 'mcelog' utility and package to monitor for ECC
errors. If you have a large number of nodes, this will help to identify
flaky memory and CPUs with cache-memory issues.
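
A rough way to keep an eye on this across a cluster is a short script along
the lines of the sketch below. The node names, passwordless ssh, and the
/var/log/mcelog path (where a cron-driven mcelog setup typically appends
decoded records) are all assumptions to adapt to your own site:

    import os

    NODES = ["node%02d" % i for i in range(1, 21)]   # hypothetical node names
    LOGFILE = "/var/log/mcelog"                      # assumed location of decoded MCE records

    for node in NODES:
        # a plain line count is crude, but it is enough to rank nodes and
        # spot the one with the noisy DIMM or flaky cache
        cmd = "ssh %s \"wc -l < %s 2>/dev/null\"" % (node, LOGFILE)
        count = os.popen(cmd).read().strip() or "0"
        print("%-8s %s" % (node, count))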
Post by stephen mulcahy
Hi Bruce,
Do you have any idea what the performance impact from enabling scrubbing
is on your systems? did you do any before/after benchmarking?
Thanks,
-stephen
Post by Bruce Allen
Post by stephen mulcahy
ECC Features
ECC Enabled
ECC Scrub Redirection Enabled
Dram ECC Scrub CTL Disabled
Chip-Kill Disabled
DCACHE ECC Scrub CTL Disabled
L2 ECC Scrub CTL Disabled
http://www.lsc-group.phys.uwm.edu/beowulf/nemo/construction/BIOS/bios_settings.txt
Our systems are Supermicro H8SSL-i motherboards, with a
Serverworks/Broadcom HT1000 chipset and a single Opteron 175 (dual core,
2.2 GHz).
DRAM ECC Enable = Enabled
MCA DRAM ECC Logging = Enabled
DRAM Scrub Redirect = Enabled
DRAM BG Scrub = 2.62ms
L2 Cache BG Scrub = 84.00ms
Data Cache BG Scrub = 84.00ms
Scrubbing is done one cache line (64) bytes at a time. Thus with 2GB of
memory and DRAM background scrub interval of 2.62ms we will scrub the
2 GB/64 Bytes * 2.62 ms = 2^31 / 2^6 * 2.62 ms = 87912 secs
So our choices correspond to one complete scrub of DRAM per day. Our
settings scrub the L2 cache more often: about once every half hour.
Just modify the calculation above, using 1MB instead of 2GB, and 84 ms
instead of 2.62 ms. One finds that the L2 cache is scrubbed about once
every 1376 seconds (every 23 minutes).
Cheers,
Bruce
stephen mulcahy
2006-09-04 13:27:40 UTC
Hi Bruce,

Do you have any idea what the performance impact from enabling scrubbing
is on your systems? Did you do any before/after benchmarking?

Thanks,

-stephen
Post by Bruce Allen
Post by stephen mulcahy
ECC Features
ECC Enabled
ECC Scrub Redirection Enabled
Dram ECC Scrub CTL Disabled
Chip-Kill Disabled
DCACHE ECC Scrub CTL Disabled
L2 ECC Scrub CTL Disabled
http://www.lsc-group.phys.uwm.edu/beowulf/nemo/construction/BIOS/bios_settings.txt
Our systems are Supermicro H8SSL-i motherboards, with a
Serverworks/Broadcom HT1000 chipset and a single Opteron 175 (dual core,
2.2 GHz).
DRAM ECC Enable = Enabled
MCA DRAM ECC Logging = Enabled
DRAM Scrub Redirect = Enabled
DRAM BG Scrub = 2.62ms
L2 Cache BG Scrub = 84.00ms
Data Cache BG Scrub = 84.00ms
Scrubbing is done one cache line (64) bytes at a time. Thus with 2GB of
memory and DRAM background scrub interval of 2.62ms we will scrub the
2 GB/64 Bytes * 2.62 ms = 2^31 / 2^6 * 2.62 ms = 87912 secs
So our choices correspond to one complete scrub of DRAM per day. Our
settings scrub the L2 cache more often: about once every half hour.
Just modify the calculation above, using 1MB instead of 2GB, and 84 ms
instead of 2.62 ms. One finds that the L2 cache is scrubbed about once
every 1376 seconds (every 23 minutes).
Cheers,
Bruce
--
Stephen Mulcahy, Applepie Solutions Ltd, Innovation in Business Center,
GMIT, Dublin Rd, Galway, Ireland. mailto:***@aplpi.com
mobile:+353.87.2930252 office:+353.91.751262 http://www.aplpi.com
Mark Hahn
2006-09-07 13:44:00 UTC
Post by Eric W. Biederman
What chipkill buys you is the reasonable assurance that if you
have a long uptime and you get soft memory errors something will
look at and correct the data before multiple errors have time to
accumulate.
true, but I think you mean scrubbing, not chipkill. from AMD's
bios-writers doc, chipkill actually just changes the ECC to
cover a full 128-bit word; the 'chip' part comes from the fact that
with x4 dram chips, the syndrome maps to a particular chip
(and all 4 bits from that chip can be corrected.)

this has been a useful discussion, because I hadn't previously
realized just what the benefit was (preventing accumulation of
multi-bit errors, proactively fixing some singles) and what the cost was
(a tiny amount of bandwidth, O(100 KB/s)).
Post by Eric W. Biederman
For performance I suspect you will see much more variation depending
on the speed of memory in your system.
the max scrub rate (64B in 40ns) is 1.6 GB/s. that's a lot, but is
still lower than the bandwidth from even a single-dimm config.
(NOT to imply it's a sensible setting in that case!)

for my 8GB machines, I suspect a daily scrub (50KB/s per node)
will have undetectable overhead.
Eric W. Biederman
2006-09-07 05:51:58 UTC
Post by stephen mulcahy
Hi Bruce,
Do you have any idea what the performance impact from enabling scrubbing
is on your systems? did you do any before/after benchmarking?
I don't know what the performance impact of scrubbing is for Bruce. I
did some looking a long time ago on Opterons, and at the highest
frequency the scrubbing consumed 50% of the memory bandwidth; at the
lowest frequency you couldn't measure that it was turned on.

Enabling ECC and chipkill ECC has no measurable impact.

What chipkill buys you is the reasonable assurance that if you
have a long uptime and you get soft memory errors, something will
look at and correct the data before multiple errors have time to
accumulate.

For performance I suspect you will see much more variation depending
on the speed of memory in your system.

Eric
stephen mulcahy
2006-09-04 12:51:11 UTC
Hi Mark,

Thanks for your mail.

See my comments below.
Post by Mark Hahn
Post by stephen mulcahy
270s) which are being used primarily for Oceanographic modelling (MPICH2
running on Debian/Linux 2.6 kernel).
on gigabit?
Yes, on gigabit (is this an uh-oh moment? :) Someone has suggested I
should be looking at OpenMPI in preference to MPICH2. We did some
initial testing with a beta of LAM but it was too buggy to be usable (we
hit the NFS bug in 7.1.1 and the test suite had some failures in
7.1.2beta) - is there a significant performance difference between the two?
Post by Mark Hahn
Post by stephen mulcahy
I had to make some tweaks to make all 4GB of RAM visible to the OS.
how much was missing, and was it just graphics aperture-related?
We were missing about 1GB as far as I remember so it was more the
graphics aperture afaics.
Post by Mark Hahn
Post by stephen mulcahy
HT-LDT Frequency Auto
Dual-Core Enable Enabled
ECC Features
ECC Enabled
ECC Scrub Redirection Enabled
Dram ECC Scrub CTL Disabled
Chip-Kill Disabled
DCACHE ECC Scrub CTL Disabled
L2 ECC Scrub CTL Disabled
those seem to be normal settings I see on most machines. the RAS-related
settings seem to be unnecessary for a "normal" cluster (one where no large
rate of ECC's happen, and one where a reboot doesn't cause planes to fall
out of the sky.)
on the other hand, I'd love to find out whether there is any performance
impact from enabling scrub, since it does slightly increase memory workload.
then again, if your rate of correctable ECCs is trivial, scrubbing is
not relevant...
I was wondering if the scrubbing had a performance impact myself. I
guess if there is no performance impact then, since the functionality is
there, I'm inclined to enable as much of it as possible - but if it
costs a few percent of performance then I'm inclined to let a node die
on occasion rather than hobble the whole cluster ... but I'm not clear
on how exactly scrubbing works. Does anyone have any insights? Is
scrubbing something that's only triggered in the event of an error, or
is it something that happens continuously in the background? If so, does
it incur a performance penalty?
Post by Mark Hahn
Post by stephen mulcahy
Memory Hole
4GB Memory Hole Adjust Manual
4GB Memory Hole Size 768 MB
IOMMU Enabled
Size 32 MB
Memhole mapping Hardware
I don't think there are performance implications here. you seem to have
already found the right combination of iommu/memhole settings that give
you your full roster of ram. my googling on the topic didn't enlighten
me much, though people apparently recommend "iommu=memaper=3"
I did some follow-up googling myself and it sounds like
"iommu=memaper=3" is useful if you run out of IOMMU space ... but failing
that there's probably no benefit? Someone has suggested that "software"
Memhole mapping may be "better" but I'm not sure what "better" means yet.
Post by Mark Hahn
Post by stephen mulcahy
Memory Config
Swizzle Memory Banks Enabled
donno - I don't think this appears in the AMD bios-writers guide
Post by stephen mulcahy
DDR clock jitter Disabled
DDR Data Transfer Rate Auto
Enable all memory clocks Populated
Controller config mode Auto
Timing config mode Auto
those are the settings I normally see as well.
Post by stephen mulcahy
AMD PowerNow! Disabled
Node Memory Interleave Auto
Dram Bank Interleave Auto
for numa-aware OS's (like any modern linux), I think node-memory
interleave should be disabled.
Thanks, it seems that "node-memory interleave" could cause a performance
hit alright, and I'll definitely disable it.

(some numbers here -
http://www.digit-life.com/articles2/cpu/rmma-numa.html).
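
(For what it's worth, once I've flipped the setting I'll probably
sanity-check it with something like the sketch below - it just dumps the
per-node meminfo that a NUMA-enabled 2.6 kernel exposes under sysfs, on the
assumption that with interleave disabled each socket should show up as its
own node with its own ~2GB.)

    import glob

    nodes = sorted(glob.glob("/sys/devices/system/node/node[0-9]*"))
    print("NUMA nodes seen by the kernel: %d" % len(nodes))
    for n in nodes:
        # first line of each meminfo should be that node's MemTotal
        print(open(n + "/meminfo").readline().rstrip())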

Thanks again for your response,

-stephen
--
Stephen Mulcahy, Applepie Solutions Ltd, Innovation in Business Center,
GMIT, Dublin Rd, Galway, Ireland. http://www.aplpi.com
Ivan Paganini
2006-09-03 21:52:53 UTC
Can you post the tweaking you had to do to access the full 4GB? Thank you.
Post by stephen mulcahy
Hi,
I'm maintaining a 20-node cluster of Tyan K8SREs (4GB RAM, dual Opteron
270s) which are being used primarily for Oceanographic modelling (MPICH2
running on Debian/Linux 2.6 kernel).
I had to make some tweaks to make all 4GB of RAM visible to the OS.
I'm now at the point where I'm considering a pass at performance tuning
the system. Before I start on OS level tuning, I'm trying to figure out
whether there are any performance improvements to be had from tweaking
BIOS settings, particularly those relating to Memory and ECC. I have a
reasonable conceptual understanding of what the various settings are doing
(and have glanced at the AMD BIOS developers guide for reference) but I am
very unclear on what the potential performance impact of any of these
settings are. Does anyone have any general advice or pointers to good
reference information on this?
My current settings are,
Hammer Configuration
HT-LDT Frequency Auto
Dual-Core Enable Enabled
ECC Features
ECC Enabled
ECC Scrub Redirection Enabled
Dram ECC Scrub CTL Disabled
Chip-Kill Disabled
DCACHE ECC Scrub CTL Disabled
L2 ECC Scrub CTL Disabled
Memory Hole
4GB Memory Hole Adjust Manual
4GB Memory Hole Size 768 MB
IOMMU Enabled
Size 32 MB
Memhole mapping Hardware
Memory Config
Swizzle Memory Banks Enabled
DDR clock jitter Disabled
DDR Data Transfer Rate Auto
Enable all memory clocks Populated
Controller config mode Auto
Timing config mode Auto
AMD PowerNow! Disabled
Node Memory Interleave Auto
Dram Bank Interleave Auto
GART Error Reporting Disabled
MTRR Mapping Discrete
Any comments on those welcome.
Thanks,
-stephen
--
Stephen Mulcahy Applepie Solutions Ltd http://www.aplpi.com
Unit 30, Industry Support Centre, GMIT, Dublin Rd, Galway, Ireland.
--
-----------------------------------------------------------
Ivan S. P. Marin
Laboratório de Física Computacional
lfc.ifsc.usp.br
Instituto de Física de São Carlos - USP
----------------------------------------------------------
stephen mulcahy
2006-09-04 12:31:50 UTC
Hi,

Sure. We have a head node which acts as an NFS server for the diskless
compute nodes in the cluster. On the head node, I had to use the
following BIOS settings to ensure the OS could see the full 4GB of
physical memory installed,


Main
Installed O/S Linux

Memory Hole
4GB Memory Hole Adjust Auto
4GB Memory Hole Size 1024 MB
IOMMU Enabled
Memhole mapping Software

MTRR Mapping Discrete

On the diskless nodes, if I used the above settings, the driver for the
network card wasn't loaded (I guess the probing failed due to the
network device not being visible with the above memory hole config - but
I'm a bit fuzzy on the details of what happens here and whether this is
h/w or driver dependent). Some further experimentation gave the
following configuration, which sees all 4GB of memory available *and* a
working network driver (which is pretty useful for diskless compute
nodes :)

Main
Installed O/S Linux

Memory Hole
4GB Memory Hole Adjust Manual
4GB Memory Hole Size 768 MB
IOMMU Enabled
Size 32 MB
Memhole mapping Hardware

MTRR Mapping Discrete

I'm not sure what the performance difference is between the two above, and
whether there would be an advantage to changing the head node to also
use hardware memhole mapping.
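
(If I get a chance I may do a crude comparison of the two configs on a
single node with something like the following - it just reads MemTotal from
/proc/meminfo and times one big in-memory copy. The array slice assignment
happens in C, so it gives a rough bandwidth figure, though a proper STREAM
run would obviously be more meaningful.)

    import array, time

    # confirm the full 4GB is actually visible to the OS
    for line in open("/proc/meminfo"):
        if line.startswith("MemTotal"):
            print(line.rstrip())
            break

    N = 32 * 1024 * 1024                  # 32M doubles = 256 MB per buffer
    src = array.array("d", [0.0]) * N
    dst = array.array("d", [0.0]) * N
    t0 = time.time()
    dst[:] = src                          # one big memcpy-style copy
    dt = time.time() - t0
    mb = N * 8 / 2**20
    print("copied %d MB in %.2f s (~%.0f MB/s)" % (mb, dt, mb / dt))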

The kernels in both cases are stock Debian kernels (mostly using standard
options, with some config changes to allow diskless NFS booting but nothing
memory related).

-stephen
Post by Ivan Paganini
Can you post what were the tweaking that you had undergone to access the
4GB? Thank you.
Hi,
I'm maintaining a 20-node cluster of Tyan K8SREs (4GB RAM, dual Opteron
270s) which are being used primarily for Oceanographic modelling (MPICH2
running on Debian/Linux 2.6 kernel).
I had to make some tweaks to make all 4GB of RAM visible to the OS.
I'm now at the point where I'm considering a pass at performance tuning
the system. Before I start on OS level tuning, I'm trying to figure out
whether there are any performance improvements to be had from tweaking
BIOS settings, particularly those relating to Memory and ECC. I have a
reasonable conceptual understanding of what the various settings are doing
(and have glanced at the AMD BIOS developers guide for reference) but I am
very unclear on what the potential performance impact of any of these
settings are. Does anyone have any general advice or pointers to good
reference information on this?
My current settings are,
Hammer Configuration
HT-LDT Frequency Auto
Dual-Core Enable Enabled
ECC Features
ECC Enabled
ECC Scrub Redirection Enabled
Dram ECC Scrub CTL Disabled
Chip-Kill Disabled
DCACHE ECC Scrub CTL Disabled
L2 ECC Scrub CTL Disabled
Memory Hole
4GB Memory Hole Adjust Manual
4GB Memory Hole Size 768 MB
IOMMU Enabled
Size 32 MB
Memhole mapping Hardware
Memory Config
Swizzle Memory Banks Enabled
DDR clock jitter Disabled
DDR Data Transfer Rate Auto
Enable all memory clocks Populated
Controller config mode Auto
Timing config mode Auto
AMD PowerNow! Disabled
Node Memory Interleave Auto
Dram Bank Interleave Auto
GART Error Reporting Disabled
MTRR Mapping Discrete
Any comments on those welcome.
Thanks,
-stephen
--
Stephen Mulcahy Applepie Solutions Ltd http://www.aplpi.com
Unit 30, Industry Support Centre, GMIT, Dublin Rd, Galway, Ireland.
--
-----------------------------------------------------------
Ivan S. P. Marin
Laboratório de Física Computacional
lfc.ifsc.usp.br <http://lfc.ifsc.usp.br>
Instituto de Física de São Carlos - USP
----------------------------------------------------------
--
Stephen Mulcahy, Applepie Solutions Ltd, Innovation in Business Center,
GMIT, Dublin Rd, Galway, Ireland. http://www.aplpi.com
stephen mulcahy
2006-09-08 08:54:51 UTC
Hi,

Thanks to everyone who responded to my queries. I've tried to summarise
the responses below for others' reference. Hope this is useful.

For BIOS memory settings, you may want to disable "Node Memory Interleave".
Leaving it enabled may decrease memory bandwidth and noticeably increase
memory latency on a NUMA-aware OS (this is supported by the measurements in
http://www.digit-life.com/articles2/cpu/rmma-numa.html).

With the K8SRE board in particular, there may be issues with the Linux
Broadcom driver in kernels > 2.6.5 which could cause stability problems at
high load. If problems are seen, you may want to use either 2.6.4 or 2.6.16+.
Similarly, there are known issues with the nforce4 chipset which may cause
NFS errors or K8SRE shutdowns; an NFS patch may be needed if these problems
occur.

Enabling ECC Scrubbing (for both cache and DRAM) using the highest scrub
times (normally 84ms) should not have a significant performance impact
(note that using scrubbing with the lowest times/highest frequency may
impact performance) and should make for a slightly more reliable
system. Enabling Chipkill should also increase memory reliability
without any performance impact and is recommended.

It is recommended to use the mcelog package so that any memory errors
are recorded at the operating system level.

Thanks to:
Alex Ninaber
Bruce Allen
Mark Hahn
Eric W. Biederman

-stephen
--
Stephen Mulcahy, Applepie Solutions Ltd, Innovation in Business Center,
GMIT, Dublin Rd, Galway, Ireland. http://www.aplpi.com