Raspberry Pi 2 significantly improves on original model
The Raspberry Pi 2 significantly increases performance compared to the original Raspberry Pi. It corrects deficiencies in the design of the original SoC by replacing the single ARM11 core with four more modern and faster Cortex-A7 (ARMv7) CPU cores, all within the constraints of a similar 40 nm manufacturing process. Whereas the CPU inside the BCM2835 processor of the original Raspberry Pi effectively ran without an L2 cache (the cache was tied to the GPU), the new Broadcom BCM2836 SoC contains a dedicated 512 KB CPU cache, improving memory performance and performance in general. The amount of RAM has also doubled to 1 GB. Other changes include more USB ports and a MicroSD card slot for storage instead of SD.
Compatibility with Raspbian
Apart from these changes, the new SoC and the device itself have been engineered to maintain hardware and software compatibility with the original Raspberry Pi, while running considerably faster. With the Raspbian OS, an ARM11-compatible, Debian-based armhf distribution maintained specifically for the Raspberry Pi, only the kernel is specific to the Raspberry Pi 2; the entire userland is 100% compatible. Although this misses out on some of the advantages of the newer ARMv7 instruction set (such as the reduced code size of Thumb2 instructions, which are used in ARMv7 Debian), applications that can take advantage of, for example, NEON SIMD instructions usually do so through run-time detection (as they do in ARMv7 Debian), so the most critical gains from the new instruction set can in theory still be obtained in Raspbian.
Nevertheless, the new device can run an OS specifically configured for ARMv7, such as Debian armhf and derived distributions such as Ubuntu, which take advantage of the reduced-size Thumb2 instruction set. An example of such a distribution that has been applied to the Raspberry Pi 2 is Ubuntu Snappy Core.
Components of Raspberry Pi 2 SoC clocked conservatively out of the box
The maximum CPU clock of the Cortex-A7 cores in the Raspberry Pi 2 is 900 MHz, while the L2 cache appears to be clocked at only 250 MHz by default, inheriting the clock rate of the original Pi's GPU cache. SDRAM is clocked at 450 MHz by default. The GPU is clocked at 250 MHz, similar to the original Raspberry Pi.
The configured speed of the L2 cache is particularly low, as we will see, since speeds up to 600 MHz seem to be stable when overclocking, resulting in a large performance increase. The CPU clock speed can also be bumped up somewhat.
The raspi-config utility in Raspbian at the time of writing contains just one overclocking option for the Raspberry Pi 2, which clocks the CPU at 1000 MHz, doubles L2 cache speed to 500 MHz and clocks SDRAM also at 500 MHz. Unfortunately, this setting turned out to be unstable on my device. This appears to be due to the SDRAM clock speed being set too high and causing problems. Bumping the SDRAM speed down to 483 MHz results in a stable system.
Overclocking test set-up
I have performed a number of overclocking tests with different clock configurations. The test set-up was as follows.
To prevent corruption of the root file system, I modified /etc/fstab to mount the root filesystem read-only at boot by adding "ro" to the mount flags. To remount with read-write capability when necessary after boot (on a stable system), I ran "sudo mount -o remount,rw /dev/mmcblk0p2 /".
The main stability test was performed using the single-threaded memtester package (available in Raspbian and Debian) using the command line "memtester 16M 10" (16 MB memory region, 10 loops). In several cases four of these commands were run in parallel to fully occupy the CPU and provide reliable stability information. In unstable configurations, this test almost always shows errors.
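The parallel run described above can be scripted. The following is a minimal Python sketch, not the procedure actually used for the article: the helper names are hypothetical, and it assumes memtester is installed and exits with status 0 when no errors are found.

```python
import subprocess


def memtester_command(size="16M", loops=10):
    """Build the memtester command line used in the article (16 MB, 10 loops)."""
    return ["memtester", size, str(loops)]


def run_parallel(instances=4, size="16M", loops=10):
    """Start several memtester processes at once and collect their exit codes.

    One process per CPU core keeps all four Cortex-A7 cores busy.
    """
    procs = [
        subprocess.Popen(memtester_command(size, loops), stdout=subprocess.DEVNULL)
        for _ in range(instances)
    ]
    return [proc.wait() for proc in procs]


def all_passed(exit_codes):
    """Assumption: memtester exits with 0 only when every test passed."""
    return all(code == 0 for code in exit_codes)
```

Calling all_passed(run_parallel()) on the Pi then gives a single pass/fail answer for a clock configuration. On an unstable configuration the system may freeze before the processes return, which is why the read-only root file system matters.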
Memory performance was tested using a slightly modified version of the fastarm package (https://www.github.com/hglm/fastarm) with the command line "for x in 0 1 2 3 4 5 6 7 8 9; do ./benchmark --duration 1 --repeat 1 --memcpy e --test 0; done". Because of result variation due to cache allocation effects, I took the best result out of ten. Tests number 0 (memcpy of varying size, aligned, depends on CPU as well as memory) and 43 (4K page-aligned memcpy, a more pure memory subsystem test) were used.
For a real-world CPU performance indication I used the command line "time zcat bullet3-Bullet-2.83-alpha.tar.gz >/dev/null" performed multiple times, which is effectively gzip decompression of a large file out of buffer cache memory.
Table with stability testing results
The following table shows stability testing results for a large number of CPU clock, core clock (L2 cache clock), and SDRAM clock configurations. Also included are some benchmark scores, including memory performance and CPU performance.
The stable configurations show "OK (multi-test)" in the stability column, meaning they were stable during a test with multiple memtester processes running concurrently. Most unstable configurations have an SDRAM clock speed of 500 MHz or higher, or a CPU speed higher than 1100 MHz.
 CPU  +Volt  Core  SDRAM   +Volt    Stability        Memcpy perf.
                           p  i  c  (memtester)      Varied    4K  zcat

Default:
 900    ?     250    450   0  0  0  OK (slow)           716  1015  2.388s

Standard overclock (raspi-config "Pi 2" option):
1000    2     500    500   0  0  0  Fail

Other settings:
 900    0     450    450   0  0  0  OK                  778  1270  2.380s
 900    0     600    467   0  0  0  Almost              804  1431  2.379s
 900    2     600    467   0  0  0  OK (multi-test)
1000    0     467    467   0  0  0  OK (multi-test)     867  1410  2.146s
1000    0     500    483   0  0  0  OK (multi-test)     880  1502  2.146s
1000    0     500    483   2  0  0  OK (multi-test)     878  1502  2.169s
1000    2     500    500   0  0  0  Almost
1000    4     500    500   0  0  0  Almost
1000    0     500    500   2  2  0  Almost
1000    0     500    500   4  4  0  Almost?
1000    0     500    500   4  0  0  Fail                886  1415  2.143s
1000    2     500    500   4  0  0  Fail
1000    4     500    500   4  4  0  Fail (multi)
1000    0     500    500   6  6  6  ?
1000    2     600    467   0  0  0  OK (multi-test)     885  1518  2.145s
1000    2     600    500   4  0  0  OK (multi-test)     890  1553  2.142s
1000    2     667    500   4  0  0  Fail (freeze)
1000    6     667    500   6  0  0  Fail (freeze)
1050    0     466    466   4  4  4  OK
1050    0     466    533   4  4  4  Fail
1050    0     466    533   6  6  6  Fail (bitspr.)
1050    4     600    450   0  0  0  OK (multi-test)     916  1528  2.045s
1050    4     600    483   2  0  0  OK (multi-test)     924  1571  2.041s
1067    6     533    533   6  6  6  Fail
1067    4     533    533   8  8  0  Fail (bitflip)
1067    6     533    533   8  8  0  Fail (bitflip)
1067    6     533    500   4  4  0  Almost
1067    4     533    466   0  0  0  OK (multi-test)     925  1521  2.010s
1100    0     466    466   0  0  0  Fail (boot)
1100    4     466    466   0  0  0  OK?
1100    4     600    467   0  0  0  Fail
1100    4     500    500   6  6  6  OK?
1100    4     500    500   6  6  0  OK?
1100    4     500    500   4  0  0  Almost
1100    4     500    500   6  0  0  OK?                 950  1532  1.950s
1100    6     500    500   6  0  0  Almost
1100    4     533    533   6  0  4  Fail                962  1593  1.948s
1100    4     550    483   0  0  0  OK (multi-test)     944  1549  1.951s
1133    4     567    466   0  0  0  Almost              974  1578  1.893s
1133    4     567    467   4  0  0  Almost
1133    5     567    453   0  0  0  Almost              971  1571  1.896s
1133    8     567    453   0  0  0  Fail
1166    4     466    466   0  0  0  Almost              960  1451  1.841s
1167    4     466    466   2  2  4  Fail
1166    6     466    466   0  0  0  Fail                962  1451  1.841s
1167    8     500    500   4  0  0  Fail                           1.839s
1167    8     500    500   8  8  8  Fail
1200    8     600    450   4  0  0  Fail
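To put the benchmark columns in perspective, the gain of one configuration over the default can be computed directly from the table. The sketch below is only an illustration: the dictionary keys and the function name are made up here, and the figures are copied from the default row and from the stable 1000/500/483 row.

```python
# Figures copied from the stability table (default row and 1000/500/483 row).
DEFAULT = {"memcpy_varied": 716, "memcpy_4k": 1015, "zcat_s": 2.388}
OVERCLOCKED = {"memcpy_varied": 880, "memcpy_4k": 1502, "zcat_s": 2.146}


def relative_gain(base, new):
    """Return a speed-up factor per benchmark (values above 1.0 are faster).

    Keys ending in "_s" are run times in seconds, where lower is better,
    so the ratio is inverted for them.
    """
    gains = {}
    for key, base_value in base.items():
        if key.endswith("_s"):
            gains[key] = base_value / new[key]
        else:
            gains[key] = new[key] / base_value
    return gains


for name, factor in relative_gain(DEFAULT, OVERCLOCKED).items():
    print(f"{name}: {factor:.2f}x")
```

For these two rows, the page-aligned memcpy test gains considerably more than the zcat run time does.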
The CPU frequency corresponds to the "arm_freq=" setting in /boot/config.txt. The CPU/main SoC voltage is set with the over_voltage setting. The core clock (the L2 cache speed on the Raspberry Pi 2) is set with core_freq. The SDRAM frequency is set with sdram_freq, while voltage settings for the SDRAM physical layer, I/O and controller are set using over_voltage_sdram_p, over_voltage_sdram_i and over_voltage_sdram_c, of which the physical layer voltage seems to be the most relevant to overclocking. An example of the relevant lines in /boot/config.txt for a particular overclocking configuration (1000 MHz CPU, with stable 483 MHz SDRAM, as well as 256 MB memory reserved for GPU) follows.
arm_freq=1000
over_voltage=0
core_freq=500
sdram_freq=483
over_voltage_sdram_p=0
over_voltage_sdram_i=0
over_voltage_sdram_c=0
gpu_mem=256
See the official documentation for more details.
Observations based on stability testing
The following is apparent from testing my device:
- The core_freq setting seems to be directly correlated with the L2 CPU cache in the new SoC, which has a large effect on performance. Depending on other frequencies, core_freq frequencies up to 600 MHz seem to be stable, giving a significant performance boost over the default configuration of 250 MHz.
- When increasing CPU speed beyond roughly 1000 MHz, the CPU core voltage has to be bumped up.
- Increasing SDRAM speed beyond about 483 MHz seems to cause instability on my device. Bumping up the SDRAM voltage (in particular the physical layer voltage, but not the I/O voltage or SDRAM controller voltage) may help a little for potential stability. However, SDRAM speeds of 500 MHz and higher tend to cause stability problems regardless of voltages on my device.
- Certain divisor relationships between CPU clock and core (L2 cache) clock (such as 2:1) seem to enhance stability and performance.
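The observations above can be turned into a rough sanity check for a set of config.txt lines. This is only a sketch under stated assumptions: the function names are hypothetical, the thresholds are the values that held on my device rather than general limits, and the parser handles plain key=value lines only.

```python
def parse_config(text):
    """Parse key=value lines as used in /boot/config.txt into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def check_settings(settings):
    """Return warnings based on the article's observations for one device.

    Missing keys fall back to the Raspberry Pi 2 defaults (900/250/450 MHz).
    """
    arm = int(settings.get("arm_freq", 900))
    core = int(settings.get("core_freq", 250))
    sdram = int(settings.get("sdram_freq", 450))
    over_voltage = int(settings.get("over_voltage", 0))

    warnings = []
    if sdram > 483:
        warnings.append("sdram_freq above 483 MHz was unstable in testing")
    if core > 600:
        warnings.append("core_freq above 600 MHz was unstable in testing")
    if arm > 1000 and over_voltage == 0:
        warnings.append("arm_freq above 1000 MHz likely needs over_voltage")
    return warnings
```

For example, the raspi-config preset (1000 MHz CPU, 500 MHz core, 500 MHz SDRAM) produces a single warning about sdram_freq, while the same settings with sdram_freq=483 produce none.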
CPU overclocking conclusions
- The default Raspberry Pi 2 core_freq (L2 CPU cache) setting of 250 MHz appears to be extremely conservative. At the default CPU frequency of 900 MHz, 450 MHz (which has a nice divisor of two) appears to be very stable and even 600 MHz can be stable.
- Unfortunately, the standard Raspberry Pi 2 overclocking setting available in raspi-config at the time of writing (1000 MHz CPU, 500 MHz core clock, 500 MHz SDRAM) appears to be unstable on my device due to an SDRAM clock speed that is slightly too high. Instead of bumping the CPU voltage as performed by this setting, increasing the SDRAM voltage (primarily the physical layer voltage) may improve stability, but clocking the SDRAM slightly lower at 483 or 467 MHz seems to be the best solution.
- It seems likely that certain SDRAM parameters (CAS delay, etc) are set to fixed values by the kernel and that higher SDRAM speeds will be possible when these parameters are configurable or appropriately adjusted by the kernel for higher SDRAM clock speeds. However, the actual RAM chip used is an Elpida/Micron EDB8132B4PB-8D-F LPDDR2-800 chip specified for 400 MHz clock frequency, so the overclocking headroom may not be that high.
Table with stable high-performance clock configurations
The following table shows stable high-performance clock configurations tested on my device and their clock frequency ratios:
CPU    Over-  Core   Base
clock  volt   clock  clock  CPU : Core  SDRAM  Overv.
1067   +4     533    533    2 : 1       467
1050   +4     600    150    7 : 4       483    +2
1000   +2     600    100    5 : 3       500    +4
1000          500    500    2 : 1       483    +2
900    +2     600    133    3 : 2       467
900           450    450    2 : 1       450
However, I may have to retest the configuration with an SDRAM frequency of 500 MHz because other configurations show such a setting to be unstable after extensive testing. Additionally, the 1100 MHz CPU frequency setting turned out not to be completely stable.
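The CPU : Core ratios listed for these configurations can be reproduced with a few lines of Python. This is a minimal sketch with a hypothetical function name. Because settings such as 1067 and 533 are rounded MHz values, it snaps the exact fraction to the nearest one with a small denominator; the limit of 8 is an arbitrary choice made for this example.

```python
from fractions import Fraction


def cpu_core_ratio(cpu_mhz, core_mhz, max_denominator=8):
    """Return the CPU : core clock ratio as a pair of small integers.

    The exact fraction is approximated by the closest fraction whose
    denominator does not exceed max_denominator, so that rounded clock
    values such as 1067/533 still map to 2 : 1.
    """
    ratio = Fraction(cpu_mhz, core_mhz).limit_denominator(max_denominator)
    return ratio.numerator, ratio.denominator


for cpu, core in [(1067, 533), (1050, 600), (1000, 600), (900, 450)]:
    n, d = cpu_core_ratio(cpu, core)
    print(f"{cpu}/{core} MHz -> {n} : {d}")
```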
Overclocking the GPU
By default, the Raspberry Pi as well as the Raspberry Pi 2 will use dynamic clocking, whereby the CPU speed, "core_freq" speed and SDRAM frequency are dynamically adjusted based on CPU load. Any GPU frequency settings, as governed by the "v3d_freq", "h264_freq" and "isp_freq" settings in config.txt, are ignored by default.
Using "force_turbo=1" allows overclocking of the GPU using the "v3d_freq", "h264_freq" and "isp_freq" options. "v3d_freq" corresponds to the frequency of the 3D block (the most relevant for overclocking), while "h264_freq" is the H.264 video block and "isp_freq" governs the camera interface. However, "force_turbo=1" also disables dynamic clocking, locking the CPU, core and SDRAM speeds to fixed maximum values, which is highly undesirable. Also note that using "force_turbo=1" may void the warranty of the device.
There is another setting, "avoid_pwm_pll=1", that allows "core_freq" to be set independently from that of the GPU on the original Raspberry Pi, at the cost of slightly reducing analog audio output quality. However, "force_turbo=1" is still required to be able to modify the GPU clock frequencies.
Because the Raspberry Pi 2 has an independent GPU with its own L2 cache, separate from the L2 cache of the CPU, some of these limitations may have become unnecessary (in particular the requirement that the CPU is locked at a high speed with "force_turbo=1" in order to be able to overclock the GPU). If that is the case, these restrictions will hopefully be removed in the future.
When running 3D benchmarks, the following CPU and SDRAM settings were used (note that when using "force_turbo=1" to overclock the GPU, these frequencies are locked and do not scale down when the CPU is idle):
arm_freq=900
over_voltage=0
core_freq=450
sdram_freq=483
When running 3D GPU benchmarks without overclocking the GPU (force_turbo=0), the CPU and L2 cache frequencies appear to be scaled down quickly because the CPU load is relatively low. This creates a CPU bottleneck that reduces the throughput of the 3D benchmarks: the frame rate peaks initially and then drops to a lower base level. To avoid this, we modify the sampling_down_factor of the ondemand cpufreq governor from 50 to 1000:
sudo sh -c "echo 1000 >/sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor"
The following settings overclock the 3D block (V3D) of the GPU from 250 MHz to 300 MHz:
force_turbo=1
avoid_pwm_pll=1
v3d_freq=300
These are the results of benchmark testing with different V3D clock speeds:
v3d_freq  demo1  demo1   demo2  demo2   demo2    demo5   demo9  game
                 lights         lights  shadows
default   81.1   20.5    26.1   8.87    0.98     50.5    46.4   112
300       95.3           28.4   9.88    1.12     56.7    49.3   130
350       109*   27.4    29.9   10.9    1.24     62.3    51.6   148
400       120*   30.6    31.4   11.7    1.35     40-52*  53.5   108*
450       80*    33.7    20.2*  12.3    1.45     40-56*  55.0   111*
Although the clock frequency of either the CPU or the 3D block seemed to be scaled down in some cases at higher V3D speeds (presumably due to temperature measurements or voltage readings resulting in throttling), there were never any signs of stability issues when overclocking the GPU, up to the maximum tested speed of 450 MHz.
Regular dynamic downclocking of the CPU can occur due to USB power supply/cable issue
Initially, downclocking by the Raspberry Pi 2 kernel's under-voltage monitor seemed to be triggered a lot more frequently than on the original Raspberry Pi. This results in a rainbow-colored icon being displayed in the top-right corner of the screen. This even happens briefly during boot. On such occasions, presumably the CPU and other components are downclocked in order to ensure stability.
The rainbow-colored square suggests a power supply issue since it indicates a voltage that is too low. As it turns out, replacing the USB power cable I was using with a shorter one that is better insulated eliminates the under-voltage warnings, with the same 5V/2A power supply.
Updated 1 March 2015 (update explanation for CPU speed throttling).
Updated 25 March 2015 (update with USB power cable findings).
Most useful overclocking guide for raspberry pi 2 i have found. Many thanks!
Thanks for this great article. Trying my luck with 1067/533/467 now. 1100/550/483 doesn't seem to work for me, and anything with sdram 500 also seems to make it unstable.
I'm also getting the small rainbow square on boot-up... But I'm not sure if this might be a result of the on board regulator having too slow a slew rate and under volting the board for a very short time when it's first turned on. I can't say that I see the rainbow square any other time. I've tried various power supplies some of them even rated at 3A and it still shows the voltage drop warning rainbow in the top right of the screen at first start-up.
I use retropie to stress test overclock settings. Most emulators typically use just one cpu... but this one cpu really gets punished.
Running rock solid at 1050/525/500 with 5 cpu over volt and a 19mm high aluminum heat sink on the CPU.
You report the default SDRAM clock speed of 400 MHz. In my RPi 2, I observed it is 450 MHz (by running 'vcgencmd get_config int' with no overclocking-related setting in boot.config.txt).
Thanks for the correction, I have updated the article.
First of all thank you for the effort made - definitely a lot time was necessary to test, gather and present the results this way.
Since I actually would want to use one of the medium overclocking settings you have tested as stable, I wonder which of the following is actually correct.
In paragraph "Table with stability testing results" in line 5 you mention "1000, 0, 500, 483, 0, 0, 0, OK (multi-test)" indicating no overvolting was used but a stable result was achieved. In paragraph "Table with stable high-performance clock configurations" in line 4 you mention a setting with the same frequencies "1000, 500, 500, 2 : 1, 483, +2" indicating SDRAM needed overvolting. Please let us know - which one is correct? Or am I getting something wrong?
Line 6 - forgive me for not reading on.
After some dark places quakeworld team fortress multiplayer testing, cpu testing and some stability testing I came up with the following:
Raspberry pi 2 b +
arm_freq=1050
over_voltage=4
core_freq=525
gpu_freq=350
sdram_freq=480
over_voltage_sdram_p=2
over_voltage_sdram_i=2
over_voltage_sdram_c=2
gpu_mem=256
*note* It took me trying out 3 USB cables to get one that would power the pi, Shorter and thicker are better. I ended up re-using a motorola tablet cable. I ordered a 1.5ft cable and the cheap thing wouldnt even boot the pi and the next cable kept making the multi colored box appear in the right corner of the screen and the red status light blink indicating low voltage.
Picked core freq based on clock , 1050/2 = 525. picked gpu freq because 525 * (2/3) = 350. Picked ram of 480 because most people said the ram stability is best between 450 and 480 unless cas latency is adjusted and that cant be adjusted outside the kernel. Which brings me to my next point, the auto kernel update by hexxeh seemed faster and more stable.
I also added a heat sync to the cpu and gpu. Wasnt able to put one on the ram because of the case, however the case has vent holes for the ram.
Ran a couple of mvdsv quakeworld servers on stock settings for 130+ days of solid uptime. I normally run a headless unit with gpu mem set to 64 meg. Since my normal computer psu went out I decided to get the pi up and running the weekly quake match. The fps went from 25 to 60 fps to 60 to 110 fps in the dp quake client (I suspect that more optimized games like quake 3 would do much better with higher quality graphics)
The only other thing i was wondering about was if it were possible to bring ram speed to 525, would it help with stability and performance??? Also does cas latency play into the equation, if i remember right cas latency of 1 - 2 was high end for ddr 2 and edo ram I remember seeing cas 4 - 12 for ddr3 memory and remember it having an impact for overclocking and stability.
And so we'll see how the new settings hold up for the next 180 days. Cheers!!
This was very informative. I learnt the basics of Raspberry Pi at http://au.rs-online.com/web/generalDisplay.html?id=infozone&file=expert-reviews/expert-reviews-raspberry
Hi.
Great work !!!
really appreciated!!
One question: how do I get "memtester" to run on OSMC environment?
Thank a lot in advance,
cheers
An official Pi Foundation fix for RAM instability beyond ~480MHz is 'in the works'. Although it's not ready for general public release just yet, I've had the priviledge of playing with it on a Pi2B that was never completely stable at much beyond 475MHz with any amount of manual tweaking. It's now completely stable at 500MHz with no manual tweaks, or 550MHz with minimal manual tweaks. Another tester has complete stability at 600MHz with minimal tweaks. The really good news is that it looks as though everyone should be stable at 500MHz with this fix. I'll try to remember to post back here when the fix goes public. :)
The official Pi Foundation fix for RAM stability problems has been pushed to 'rpi-update' for those brave enough to try it. Bear in mind that you will be running an experimental kernel that may lead to other breakages, so it's highly advisable to make a full backup before going down this route.
This has taken a small group of test Pi2B from a stable RAM clock limit of around 480MHz up to complete stability at 600MHz. Although 600MHz is not guaranteed, running 'rpi-update' followed by a reboot then adding the following to 'config.txt' will most likely get you there if your hardware is capable of it...
sdram_freq=600
sdram_schmoo=0x02000020
over_voltage_sdram_p=6
over_voltage_sdram_i=4
over_voltage_sdram_c=4
If this proves unstable, either drop the RAM frequency a little or raise all three voltages by 1 or 2. As always, YMMV. Hopefully, this should find its way into the official stable Raspbian kernel at some stage in the near future. Have fun! :-)
Awesome! Thanks for the update, completely stable for me now. Just when everything gets tweaked out and I'm sitting happy with my pi2 ready to actually now *use* the darn thing, they go launch the pi3...
You're welcome. :-)
I won't be rushing out to buy a Pi3 just yet as some of the first batch of 300,000 are being reliably reported as reaching around 100°C just stress testing the CPUs at standard clock speeds with no GPU load. Let the early adopters be your guinea pigs. ;-)
The official Foundation advice is that "you may need a heatsink" on a Pi3 under some circumstances. After seeing the results from a precision Flirc camera of just the CPUs being stressed, I sure as hell wouldn't run a Pi3 without one...
https://imgur.com/gallery/tzgPU/
Hi! great research thanks! But the highest stable configuration I can run is:
1067
4
533
466
I get no stability with 1100 or above.