Monday, March 29, 2010

Guide to Successful Overclocking

By Pauldovi @ www.overclockers.net

This guide is intended to help seasoned Intel overclockers grasp the new concepts of northbridge strap and help beginner overclockers get the most out of their system. With such a wide range of experience and skill, it is going to be difficult to write a guide that is fully useful to readers at every skill level. This guide is intended for people who already have a system designed for overclocking (i.e. not a Dell).

This guide is going to have a format which I feel will help the beginner completely understand what overclocking is all about. It will take the following format:

* What is the goal of overclocking?
* Computer components that are key to your overclock
* How to overclock and test

I hope for the guide to evolve and mature as people read it, ask questions, and get answers. I will be able to change the guide to better explain subjects that are not understood or that are frequently questioned. It is just about impossible to write a perfect and complete guide without input from the readers. Ideas from the readers will help make this guide 10x better.
So here it goes:

1. What is the goal of overclocking?

Overclocking is manually overriding your system's default settings to obtain more performance. People often spend a considerable amount of time and money on their system. However, you can have a successful overclock at all levels of dedication and funding. Overclocking is often very addictive. People who want performance, who want to brag, or who just really like computers find overclocking to be an enjoyable hobby.

When overclocking an Intel computer system, the goal is to obtain the highest level of memory bandwidth. This is really the only way to generalize the complete goal. Overclocking your system maximizes what you get out of it.

The question is always asked: Can I overclock my Dell, HP, IBM, or Compaq system? The short answer is always no.

2. Computer components that are key to your overclock

When overclocking your system, you must consider the relationship between 3 main components:

1. Your CPU
2. Your Memory
3. Your Northbridge

Other components such as power supply and cooling solution play an indirect role in your overclocked system’s stability, longevity, and potential.

Having a power supply which can deliver sufficient levels of amperage, voltage, and clean power is absolutely critical. The wattage of a power supply (PSU) is often touted as being its most significant specification. It is in fact not. The total amperage and the form compliance are more important. You want a PSU that delivers the maximum amount of amperage. The 12V line is what supplies the processor and video card with power. These are the most power hungry components, which makes the +12V line the most important part of a power supply.

One way to determine how much power a power supply can actually supply is to multiply the rated amperage on the 12V line by 12. For example, a PSU with 30 Amps on the 12V line can deliver 360 Watts of power. Of course, how many Amps the power supply can deliver is dependent upon many things like temperature and the quality of the capacitors.

Most computers will never draw more than 30-35 Amps on the 12V line. Even SLI systems with quad core processors and multiple hard drives rarely exceed 400 Watts (or about 33 Amps on the 12V line).
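The rule of thumb above is easy to check with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes the amperage printed on the label can actually be sustained, which (as noted) depends on temperature and capacitor quality.

```python
RAIL_VOLTAGE = 12.0  # nominal voltage of the +12V line

def rail_watts(amps, volts=RAIL_VOLTAGE):
    """Power in Watts that a rail can deliver at its rated amperage."""
    return amps * volts

def amps_needed(watts, volts=RAIL_VOLTAGE):
    """Current in Amps drawn from the rail by a load given in Watts."""
    return watts / volts

print(rail_watts(30))              # the 30 Amp PSU from the example
print(round(amps_needed(400), 1))  # a 400 Watt load on the 12V line
```

Running it for your own PSU label and your estimated load gives a quick sanity check before you start raising voltages.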

Some power supplies split their 12V line into different rails. Each rail is capable of supplying no more than 20 Amps. However, a multiple rail design is not as good as a single rail design. This is because any amperage you do not use on one rail is wasted; it cannot be used by a component on another rail.

For Example:

You have a PSU with two 20 Amp +12V lines. You have a CPU and a video card connected to the two individual +12V lines. The CPU is only using 8 Amps while the video card is drawing 19 Amps.

Guess what, the 1st 12V line that is connected to the CPU has wasted 12 Amps because the CPU just doesn't need all of the power. While the 1st +12V line is supplying more than ample power to the CPU, the 2nd 12V line is barely able to give the video card enough.

Therefore, a system with a single high amperage 12V rail will out-perform its multi-rail cousin.
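To put numbers on that example, here is a small sketch of the headroom on each rail. The single 40 Amp rail used for comparison is my own assumption (the same total capacity as the two 20 Amp rails), and the function name is made up for illustration.

```python
def rail_headroom(capacity_amps, load_amps):
    """Unused Amps left on a rail after its load is subtracted."""
    return capacity_amps - load_amps

# Two separate 20 Amp rails, one per component (the example above)
cpu_rail_headroom = rail_headroom(20, 8)    # CPU draws 8 Amps
gpu_rail_headroom = rail_headroom(20, 19)   # video card draws 19 Amps

# Assumed comparison: one 40 Amp rail shared by both components
single_rail_headroom = rail_headroom(40, 8 + 19)

print(cpu_rail_headroom, gpu_rail_headroom, single_rail_headroom)
```

On the split design the video card has only 1 Amp to spare even though 12 Amps sit idle on the other rail; on the shared rail all 13 spare Amps are available to whichever component needs them.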

The amount of wattage is also important. However, if you ensure that the power supply you have is carrying sufficient amperage, it will also have enough wattage.

Your cooling solution can have a significant role in the longevity and the potential of your system's overclock. You can overclock your CPU on its stock heatsink and fan (HSF or HS/F); however, it is highly recommended that you get an upgraded active air cooler any time you increase the voltage over stock settings. Once you become more experienced with your system and overclocking in general, you may want to try out more extreme methods of cooling, such as water cooling, TEC (peltier), and phase-change. These extreme methods all involve high risk, high cost, and high maintenance. However, they also offer high levels of performance.

Since we are only dealing with Intel overclocking, I will not get into video card cooling or overclocking. However, there are other components that you may wish to cool with aftermarket solutions in addition to your CPU. Such components include your system's northbridge (NB), southbridge (SB), voltage regulator module (VRM), pulse-width modulator (PWM), and your memory modules (DIMMs). The more you invest in cooling, the longer your system will last and the higher you will be able to overclock.

As with all components, cooling follows the law of diminishing returns. What that means is that the more money and time you put in, the smaller the additional benefit becomes.

For Example:

Adding a $50 HS/F to your system may increase your overclock by 50%. Spending an additional $250 for a watercooling setup may only offer a 10% improvement in overclock over your HS/F, even though you spent 5 times as much!

Therefore, when someone on a budget is considering their system components, they should consider the Law of Diminishing Returns. Cooling can be a very expensive part of your system. If you are on a budget or are not interested in pushing your system to its peak performance, air cooling is the best thing for you. Watercooling may give you a little more performance, but only your benchmarks will notice the difference. Watercooling should be treated more as a hobby than as a practical cooling solution.

The final component that I am going to discuss which plays an indirect role in your system's overclock is your motherboard's power supplying components (VRM, PWM, etc.). The Voltage Regulator and the Pulse-width Modulator regulate the power that goes to your CPU and memory. These components see what is called a voltage droop (Vdroop) when the system goes under load. What happens is that under load the system supplies less voltage than what you set. That can be problematic if your system voltage varies too much. The only way this can be fixed is through hard voltage mods, which require knowledge of soldering and circuit boards. I just want to throw that out there to help you understand everything about your system.

The CPU

Your CPU has an FSB (front side bus), which is the speed at which it communicates with the memory and northbridge. In an Intel system, the FSB is quad pumped, meaning that you multiply the actual FSB by 4 to get the rated FSB. What this means is that your CPU's front side bus sends 4 transfers of data per clock cycle. This is analogous to memory, which can transmit 2 pieces of data per clock cycle (DDR or Double Data Rate); the FSB is instead QDR (Quad Data Rate).

For example:

266.66Mhz FSB x 4 = 1066Mhz rated FSB.

The FSB is the speed of the L2 (level 2) cache. The L2 cache is the largest chunk of memory on the CPU (desktop CPUs only). This is what directly communicates with your system memory. From the L2 cache the data is moved to the much smaller, but also much faster, L1 (level 1) cache. CPU speed is normally referred to as the speed of the L1 cache.

In order to determine your CPU speed, you multiply the FSB by its multiplier.

For example:

266.66Mhz FSB x 9 = 2.40Ghz

The 2.40Ghz is the speed of the L1 cache.
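The two multiplications above can be written as a short Python sketch. The function names are my own, chosen for illustration; the numbers are the stock values from the examples.

```python
FSB_TRANSFERS_PER_CLOCK = 4  # Intel's FSB is quad pumped

def rated_fsb_mhz(actual_fsb_mhz):
    """Rated (quad pumped) FSB from the actual FSB clock."""
    return actual_fsb_mhz * FSB_TRANSFERS_PER_CLOCK

def cpu_clock_mhz(actual_fsb_mhz, multiplier):
    """CPU speed is the actual FSB times the CPU multiplier."""
    return actual_fsb_mhz * multiplier

stock_fsb = 266.66
print(round(rated_fsb_mhz(stock_fsb), 2))     # roughly 1066 Mhz rated
print(round(cpu_clock_mhz(stock_fsb, 9), 2))  # roughly 2400 Mhz, i.e. 2.40Ghz
```

When you raise the FSB in your BIOS, plugging the new value into both functions shows the rated FSB and CPU speed you should expect to see reported.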

Normal Intel processors are limited to their stock multiplier. They can go down in multiplier if the motherboard supports it, but they cannot go up. This is intentionally done by Intel to prevent system vendors from buying cheaper processors, manually increasing the multiplier, and selling the result as a faster system. However, some ES (Engineering Sample) and all XE (Extreme Edition) processors have unlocked multipliers. You can manually increase or decrease the multiplier on these chips (to an extent). You will pay a premium price for this capability.

We will discuss below why the CPU's multiplier has a direct effect on system stability. In short: decrease the multiplier on a non ES / XE processor and you lose stability.

One thing that I think is absolutely critical to discuss is voltage and temperature. Many people will easily exceed Intel's recommended voltage for their processor by 20% or more. However, they won't for a second run their processor over Intel's maximum recommended temperature. This is rather silly. Overvoltage plays a far more significant role in how long your processor will last than temperature does. However, both are important. Your processor will die just as fast at 10C as it will at 80C if it has 25-30% or more overvoltage.

How much voltage you put into your processor is up to you. It is really unknown how long processors will last at a given voltage, so a safe range is not easily determined. However, be aware that your processor will likely not die instantly from overvoltage. It will be a steady and slight decline of performance. You will need more and more voltage just to keep your system running.

The Memory

Current systems use DDR and DDR2 memory, and in the future will use DDR3. DDR stands for double data rate. What this means is that the memory transmits data on both the rising and the falling edge of the clock signal. For those less technically inclined, the result is 2 times the data bandwidth. The number after DDR stands for the generation of memory. Newer generations have the capability of higher speeds than older generations, but are no faster at the same speeds. We will focus on DDR2 memory, since that is what is used in the majority of current Intel systems (Q2 07).

Your memory speeds can be tricky. This is because, like the CPU's FSB, it has rated and actual speeds.

For example, DDR2-800 is DDR2 memory rated at 800Mhz. However, that is its rated (Double Data Rate) speed. The memory is actually only running at 400Mhz, but since data is transferred on both edges of each clock cycle, its rated speed is doubled.

Memory takes data from the system's hard drive and communicates it to the CPU for execution.

People compare the speed of the memory as a ratio to the CPU's FSB. For this ratio, you use the actual memory speed, not the rated speed.

For example, a CPU with an FSB of 266.66Mhz will be in a 1:1 ratio with memory at 266.66Mhz (DDR2-533).

People are confused (misinformed) as to what ratio is optimal for system performance. When looking at the bandwidth in terms of MB/s, your memory needs to be operating 2 times as fast as the CPU's FSB in order to match the CPU's L2 bandwidth. If you want to calculate your memory's bandwidth, you simply multiply the actual frequency by .016 (2 transfers per clock x 8 bytes). For the CPU's FSB, which makes 4 transfers per clock, you multiply the actual frequency by .032. This will give you the maximum theoretical bandwidth in GB/s.

For Example:

DDR2-800 has an actual speed of 400Mhz. 400Mhz x .016 = 6.4GB/s maximum bandwidth.

So, for optimal settings, a CPU with an FSB of 266.66Mhz would want memory running at 533Mhz (DDR2-1066). However, it is highly unlikely that you will have memory that can run in a 2:1 ratio with your FSB. A 1:1 ratio is more often the target ratio, as it is easier to reach with most memory.

A more in depth (mathematical) way of explaining the memory and system relationship is as follows:

If you want to calculate the FSB bandwidth of a Core 2 Duo, you multiply the bus frequency (266.66) by the transfers per clock (4) and the FSB width (64 bit, or 8 bytes).

Therefore, a system with a 266.66Mhz FSB (stock Core 2 Duo) has a FSB bandwidth of:

266.66 x 4 x 8 = 8533.33MB/s

Your memory has a 64bit (8 byte) width and a capability of 2 transfers per clock (DDR).

Therefore, to flood the FSB bandwidth you get:

8533.33MB/s = X Mhz * 2 * 8
8533.33MB/s = X Mhz * 16
533.33Mhz = X

Therefore a memory bus speed of 533.33Mhz or DDR2-1066 will flood the FSB bandwidth.
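The same derivation can be run as a Python sketch for any FSB, not just the stock one. The function and constant names are made up for illustration, and the stock FSB is written as 800 / 3 so that the results match the 8533.33 and 533.33 figures above.

```python
BUS_WIDTH_BYTES = 8    # both the FSB and the memory bus are 64 bits wide
FSB_TRANSFERS = 4      # quad pumped FSB
MEMORY_TRANSFERS = 2   # double data rate memory

def fsb_bandwidth_mb_s(fsb_mhz):
    """Peak theoretical FSB bandwidth in MB/s."""
    return fsb_mhz * FSB_TRANSFERS * BUS_WIDTH_BYTES

def memory_bandwidth_mb_s(memory_mhz):
    """Peak theoretical memory bandwidth in MB/s (actual clock, not rated)."""
    return memory_mhz * MEMORY_TRANSFERS * BUS_WIDTH_BYTES

def memory_clock_to_match_fsb(fsb_mhz):
    """Actual memory clock (Mhz) needed to flood the FSB bandwidth."""
    return fsb_bandwidth_mb_s(fsb_mhz) / (MEMORY_TRANSFERS * BUS_WIDTH_BYTES)

stock_fsb = 800 / 3  # 266.66... Mhz
print(round(fsb_bandwidth_mb_s(stock_fsb), 2))
print(round(memory_clock_to_match_fsb(stock_fsb), 2))
print(memory_bandwidth_mb_s(400))  # DDR2-800
```

Because the FSB moves 4 transfers per clock and the memory only 2, the matching memory clock always works out to exactly twice the FSB, whatever FSB you end up at.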

Memory also has a series of latencies. Latencies are measured in terms of clock cycle delays. In order to understand how the latencies work, you must also understand how the memory reads and writes data.

DDR2 memory is a type of SDRAM. SDRAM stands for Synchronous Dynamic Random Access Memory. The memory is organized like a matrix or chart, with data arranged in rows and columns. The data is stored in blocks whose locations are found by the coordinates of the specific rows and columns. Latencies come from the memory looking for the data in this series of rows and columns. The four most common latencies are:

* Column Address Strobe Latency (tCAS / CAS / tCL). This is the number of clock cycles needed to access a specific column of data.
* RAS to CAS Delay (tRCD). This is the number of clock cycles between the time a row is opened (the coordinates of the data are defined) and the time the memory can actually start reading or writing.
* Row Precharge Time (tRP). This is the number of clock cycles needed to end access to one row of memory and open access to the next row of memory.
* Active to Precharge Delay (tRAS). This is the number of clock cycles needed between the data request and the pre-charge command when accessing a specific row of data in the memory.

So what you have are 4 latencies. If you didn't get much of the above, get this: the lower the latencies, the better for system performance. However, lower latencies mean less stability at any given voltage. Common latency values are 3-3-3-X, 4-4-4-X, and 5-5-5-X. The reason I put X in the last spot is because the latency in this spot varies greatly, but is most commonly between 4 and 18 clock cycles.

Simply comparing memory latencies without considering the speed at which the memory is running those latencies is silly. This is because the overall latency in nano-seconds is derived by dividing your total latencies in cycles by how many cycles your RAM can complete in one second. This gives you the latency per operation in seconds.

For example:

DDR2-800 has an actual clock of 400Mhz, or 400,000,000 cycles per second. Latencies of 4-4-4-12 add up to 24 cycles of latency per operation. Divide 24 cycles by 400,000,000 cycles per second and you get 60 nano-seconds of latency per operation. However, DDR2-1000 (500Mhz actual) with latencies of 5-5-5-15 nets you the same 60 nano-seconds of latency per operation (30 / 500,000,000).

Even though both settings have the same latency, DDR2-1000 @ 5-5-5-15 is better than DDR2-800 @ 4-4-4-12, because DDR2-1000 has more data throughput than DDR2-800.
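If you want to compare your own memory settings this way, the conversion can be sketched in Python. The function name is made up, and the sketch assumes the timings are counted in cycles of the actual memory clock (400Mhz for DDR2-800, 500Mhz for DDR2-1000), so pass that clock in; adding the four timings together is the same simplification used in the example.

```python
def total_latency_ns(timings, clock_mhz):
    """Add up the timings (in clock cycles) and convert them to nanoseconds."""
    cycles = sum(timings)
    # One cycle at clock_mhz lasts 1000 / clock_mhz nanoseconds
    return cycles * 1000 / clock_mhz

ddr2_800 = total_latency_ns((4, 4, 4, 12), clock_mhz=400)
ddr2_1000 = total_latency_ns((5, 5, 5, 15), clock_mhz=500)

print(ddr2_800, ddr2_1000)
print("same latency" if ddr2_800 == ddr2_1000 else "different latency")
```

Whichever clock you plug in, the point of the comparison is the same: looser timings at a proportionally higher speed cost you nothing in latency.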

Now, it is also a common myth that a system will be faster when its memory is "synced" with the processor (i.e. in a 1:1 ratio as opposed to a 5:4 ratio). This is simply not true (or there is no substantial evidence to prove that it is true). Most people who claim this and provide benchmarks are missing a variable that would explain the difference in performance.

A few quick benchmarks prove this:

FSB = 200
Multiplier = 9
CPU Speed = 1.8Ghz

@ 1:1 DDR2-400 Memory bandwidth = 3224MB/s
@ 2:3 DDR2-600 Memory bandwidth = 3774MB/s
@ 1:2 DDR2-800 Memory bandwidth = 4047MB/s
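To see how much each step away from 1:1 actually gained, the three results above can be compared in a short Python sketch. Only the three measured figures come from the benchmarks; the function name is made up for illustration.

```python
# (ratio, memory, measured bandwidth in MB/s) from the benchmarks above
results = [
    ("1:1", "DDR2-400", 3224),
    ("2:3", "DDR2-600", 3774),
    ("1:2", "DDR2-800", 4047),
]

def gain_percent(measured, baseline):
    """Percentage improvement of a measured result over the baseline."""
    return (measured - baseline) / baseline * 100

baseline = results[0][2]  # the "synced" 1:1 result
for ratio, memory, bandwidth in results:
    print(ratio, memory, bandwidth, f"{gain_percent(bandwidth, baseline):+.1f}%")
```

The unsynced settings come out ahead of 1:1 in both cases, although doubling the memory speed does not come close to doubling the measured bandwidth.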

The Northbridge

The Northbridge is the link between your CPU, memory, graphics card (PCI Express) and Southbridge as shown in the diagram below:



Just like your CPU, the Northbridge on your motherboard (i865 and newer) has its own internal frequency and latencies which affect overall system stability. This frequency is referred to as the NBCC (North Bridge Core Clock). The NBCC directly affects the performance and stability of your memory and CPU because Intel systems use a NB based memory controller.

It has been recently discovered that the NBCC varies with your system's FSB and multiplier settings. The NBCC can be calculated by dividing your CPU's default multiplier by its current multiplier and then multiplying the result by your FSB.

For Example:

E6600 @ 500Mhz and a 7 multiplier:

(9 / 7) x 500 = 643Mhz NBCC (rounded)

So it can be seen that lowering your multiplier, even though it offers additional headroom for FSB on the CPU, will increase the NBCC, reduce NB stability, and thus cause the overall system stability to decrease.

XE (Extreme Edition) and ES (Engineering Sample) processors have the unique ability to adjust their multipliers up (all XE, not all ES) and down (all chips) while the chosen multiplier is still treated as the default.

For Example:

X6800 @ 500Mhz and a 7 multiplier (just like above)

(7 / 7) x 500 = 500Mhz NBCC

As you can see, the X6800 has the exact same settings as the E6600, however, the NBCC is lower, resulting in increased system stability.
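Both worked examples can be reproduced with a small Python sketch. It follows the arithmetic of the examples (default multiplier divided by the multiplier in use, times the FSB); the function name is my own, and treating the X6800's chosen multiplier as its default is taken from the explanation above.

```python
def nbcc_mhz(fsb_mhz, default_multiplier, current_multiplier):
    """North Bridge Core Clock: (default / current multiplier) x FSB."""
    return default_multiplier / current_multiplier * fsb_mhz

# E6600: default multiplier 9, lowered to 7, at 500Mhz FSB
e6600 = nbcc_mhz(500, default_multiplier=9, current_multiplier=7)

# X6800: the chosen multiplier of 7 is treated as the default
x6800 = nbcc_mhz(500, default_multiplier=7, current_multiplier=7)

print(round(e6600), round(x6800))
```

Running this for each multiplier your board lets you select shows how quickly the NBCC climbs on a locked chip as the multiplier drops.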

Moving along, the NB has a series of latencies at which it operates. These latencies have a considerable effect on overall system performance. The latencies within the NB increase when your NBCC hits specific values, thus increasing stability, but decreasing performance. A range of NBCC values that share one set of latencies is referred to as a strap. There is a 1066Mhz strap, a 1333Mhz strap, and so on. However, the names are misleading, because motherboard engineers change the frequency at which specific straps set in. The trigger NBCCs for each range of latencies are different on every motherboard. However, you can manually test many different NBCC values by using Super Pi or a memory bandwidth test to find where the latencies within your system change.

What does this mean for our overclocking?
The classical response to an unstable overclock was:

1. Increase CPU voltage.
2. Increase Memory voltage.

However, one must now consider NB voltage (stability) when system instability arises. This is especially true with Core 2 systems, whose CPUs are capable of far higher FSB than the NB can handle.
So with this in mind, an Intel overclocker must be aware of:

* what strap they are in
* what part of the strap they are in

If you are at the limit of a specific strap, you will most likely find your system to be less than stable. However, increase your NBCC to the next strap and your system suddenly becomes more stable, but not as fast.

For example, the P5B Deluxe changes from the 1066Mhz strap to the 1333Mhz strap after 400Mhz. Therefore, 400Mhz on a P5B is faster, but less stable, than 401Mhz.

Some motherboards, like the Intel D975XBX2 and the Abit AB9 QuadGT, allow the user to manually adjust what strap they are in. This unique feature allows overclockers to maximize their experience. If you find the limit of your CPU to be around 412Mhz, you can still maintain the 1066Mhz strap and maximize your CPU's potential.

3. How to overclock and test

After you fully understand how your system's components work and interact with each other, you are ready to overclock. We are going to start off at the point where you have already acquired the proper components, your system is built, Windows is installed, and everything is running fine.

Many people have different styles of overclocking. Some like to move up 2-3Mhz FSB at a time; others like to jump straight to a goal and see if they can make it stable. However, the best way to go about overclocking your system is to research first.

* Go into your BIOS and become familiar with it. Know each setting and what it does. Know what the different voltage settings adjust, how they affect stability, and what other settings will do to help your performance. This can be done the easy way by asking and researching, or done the hard way by trial and error.
* Look around for other people with the same or similar systems. Get an idea of what they have accomplished. Then create a goal.
* Once you have a goal create a chart of different memory speeds, NBCCs, and FSB values leading up to your goal. Once you have an idea of what your system is capable of and what your goal is, it is time to start overclocking.
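A chart like the one described in the last step can be generated with a short Python sketch. Everything here is illustrative: the function name, the 25Mhz step, and the 266Mhz to 400Mhz range are assumptions, the memory column assumes a 1:1 ratio, and the NBCC column uses the default / current multiplier calculation from the Northbridge section.

```python
def overclock_plan(start_fsb, goal_fsb, step, multiplier, default_multiplier):
    """Build one chart row per FSB step, always ending on the goal FSB."""
    fsb_steps = list(range(start_fsb, goal_fsb, step)) + [goal_fsb]
    rows = []
    for fsb in fsb_steps:
        rows.append({
            "fsb": fsb,                   # actual FSB in Mhz
            "rated_fsb": fsb * 4,         # quad pumped
            "cpu_mhz": fsb * multiplier,  # CPU speed
            "ddr2_rating": fsb * 2,       # memory at 1:1 with the FSB
            "nbcc": round(default_multiplier / multiplier * fsb),
        })
    return rows

# Hypothetical plan: stock-ish 266Mhz up to a 400Mhz goal on a 9x chip
plan = overclock_plan(266, 400, 25, multiplier=9, default_multiplier=9)
for row in plan:
    print(row["fsb"], row["rated_fsb"], row["cpu_mhz"],
          f"DDR2-{row['ddr2_rating']}", row["nbcc"])
```

Print the chart out and tick off each row as it passes testing; it makes it much easier to remember which settings were last known to be stable.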

If you have discovered through your research that certain settings should always be set a different way to ensure stability and performance, go ahead and do that. Examples of such settings would be disabling Intel SpeedStep, and setting your BIOS to manual (Jumper Free) control.

You can now take one of two approaches to overclocking:

* Leave all your voltages at stock settings. Set your memory to a 1:1 ratio with stock timings. Gradually increase your FSB by anywhere between 5Mhz and 25Mhz at a time, each time booting into Windows and running a test such as Orthos or SuperPi. You can run Orthos for 15 to 30 minutes or run a 32 million digit SuperPi test. If your system passes these tests without a problem, you can continue increasing your FSB. Once your system either will not pass the test or fails to boot, it is time to increase your voltages. Which voltage you increase depends on your settings. If your memory is not yet running near its stock rated speed, it is unlikely to be the problem, so try increasing your CPU voltage or your NB voltage instead. Continue this process of increasing FSB and voltage as necessary until you reach your goal. Always be mindful of what your NBCC is. You need to know what strap you are in, and where you are in your current strap. If you hit a 'wall' where it seems like no level of voltage will make your system stable, try increasing your FSB by 20 or 30Mhz, or to the next known NB strap. At some point you may find that Windows fails to boot because of hard drive corruption, or you may find video card instability. This is your cue to increase your SB (Southbridge) voltage and lock your PCI to 33.33Mhz and your PCIe to 100Mhz. This will help ensure stability. If you continue to have stability issues, you can increase your PCIe frequency slightly above 100Mhz. However, I would not run more than 110Mhz. Always leave your PCI frequency at 33.33Mhz.
* The second method is to increase all your voltages to high levels. This would include setting your NB and SB to maximum values and probably increasing your memory voltage. Then you set your FSB right to your goal speed. Once you have done this, you also run Orthos or SuperPi. If your system is stable on Orthos for 15 to 30 minutes or passes a 32 million digit SuperPi test, you can lower your CPU, memory, NB, and SB voltages. Continue to do this until your system does not pass the tests. Once you reach the point where your system does not pass, go back up to the last known stable settings.
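The stepping loop of the first method can be sketched in Python to make the logic clear. This is not a tool that touches your hardware: `passes_stress_test` is a hypothetical stand-in for the manual step of setting the FSB in the BIOS, booting, and running Orthos or SuperPi, and the 340Mhz limit in the usage example is invented purely to exercise the loop.

```python
def step_up_fsb(start_fsb, goal_fsb, step, passes_stress_test):
    """Raise the FSB step by step until a test fails or the goal is reached.

    Returns (last FSB that passed, first FSB that failed or None).
    """
    last_good = None
    fsb = start_fsb
    while True:
        if not passes_stress_test(fsb):
            # This is the point where you would raise a voltage and retest
            return last_good, fsb
        last_good = fsb
        if fsb >= goal_fsb:
            return last_good, None
        fsb = min(fsb + step, goal_fsb)  # never overshoot the goal

# Pretend system that is stable up to 340Mhz at its current voltages
result = step_up_fsb(266, 400, 10, lambda fsb: fsb <= 340)
print(result)
```

In practice you would go around this loop several times, raising CPU or NB voltage each time it stops short of your goal, which is why keeping notes on the last good FSB matters.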

Now that you have found a series of voltages and FSB that are within your goal level you want to begin long term testing of the system. Run a 12 to 24 hour Orthos test. If the system passes this test, you can try to increase your memory ratio or tighten the timings. After any adjustment run another 6-24 hour Orthos test.

Software to Aid You in Your Overclocking:

There are several Windows and non-Windows based applications which may aid you with overclocking and playing with your new system:

* Intel Thermal Analysis Tool (TAT) – TAT gives you a clear idea of how hot your CPU cores are running at different levels of activity. This is more accurate than using the motherboard's sensors, which are not on-die.
* Orthos – Orthos is a CPU / memory / NB stress test that was developed from the Prime95 program. This is an excellent tool for determining your system's 24/7 stability.
* CPU-Z – There is nothing like CPU-Z for getting information on your CPU, motherboard, and memory.
* Super Pi Mod 1.5 – An excellent program for testing the speed of your newly overclocked CPU and memory. It calculates pi to a chosen number of digits and times how long your system takes.
* MemTest86 – This program is a stand-alone program. You have to put it on a bootable CD or floppy. It is an excellent program for testing memory and overall system stability.
* SiSoftware Sandra XI – This is a complete benchmark, stress test, and system information program. You can test your CPU, memory, hard drives, network connection, and various other components. It also provides detailed information on your hardware and software.

Software You Should Never Use to Aid Your Overclock:

* Never use the software that comes with your motherboard that claims quick and easy overclocking. Like most things in life, nothing good comes easy. Using these programs can cause nightmares for your overclock. Avoid them.
* Use ClockGen sparingly. This program is not very stable when increasing your FSB and should never be used as your primary overclocking method.
