Simple GPIO performance benchmark

Posted: Tue Jul 03, 2012 7:31 pm
by jokkebk
Today I finally had time to investigate GPIO programming further, and decided to see how fast the GPIO pins can be driven using different programming languages / libraries. The test setup was to generate a simple square wave as fast as possible and measure the frequency with an oscilloscope. Detailed results here:

http://codeandlife.com/2012/07/03/bench ... pio-speed/

Based on testing, pure C access is fastest: a square wave of over 20 MHz could be generated (although the signal starts to look more like a sine wave at that point). I used the first example here as a basis:

http://elinux.org/RPi_Low-level_periphe ... re_hacking

The bcm2835 library (C as well) came second at over 5 MHz. A Perl module using the same library clocked in at 35 kHz. Simple shell access (bash script) got to 3400 Hz. Surprisingly, the Python RPi.GPIO library finished dead last at 900 Hz. Definitely nothing to write home about (but still enough for all automation tasks etc.).

I'm planning to try out the sampling bandwidth in the near future, too. What do you guys think, should the benchmark results be available in the RPi wiki somewhere?

Re: Simple GPIO performance benchmark

Posted: Tue Jul 03, 2012 9:00 pm
by Gert van Loo
I think an important parameter would be how big the gaps are.
That is, what is roughly the longest latency for servicing the GPIO pins?

Re: Simple GPIO performance benchmark

Posted: Wed Jul 04, 2012 5:46 am
by jokkebk
All the versions I tested had quite a small variance in the minimum, maximum and average frequency which seems to indicate the latency is most of the time very small (the frequency is basically a function of the latency, if the logic level takes 1ms to change, freq would be ~500 Hz).

Unfortunately I didn't make "longest change time" type of measurements (not sure if my Picoscope has those), so the only thing that can be deduced from the results is that the latency is generally about 0.5/freq seconds, i.e. half of one period.
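
The relationship between frequency and per-change latency described above can be written out as a small helper. This is only a sketch: the function names are made up for illustration, and it assumes a symmetric square wave, so that each logic-level change takes half a period.

```c
#include <assert.h>

/* Hypothetical helper: time in seconds per logic-level change for a
 * symmetric square wave (two level changes per period). */
double change_time_s(double freq_hz)
{
    assert(freq_hz > 0.0);
    return 0.5 / freq_hz;
}

/* Inverse: frequency in Hz reached if every level change takes
 * change_s seconds. */
double freq_from_change_time(double change_s)
{
    assert(change_s > 0.0);
    return 0.5 / change_s;
}
```

For example, `freq_from_change_time(0.001)` gives 500 Hz, matching the 1 ms example above, and `change_time_s(900.0)` gives about 0.56 ms per change for the 900 Hz RPi.GPIO result.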

Re: Simple GPIO performance benchmark

Posted: Wed Jul 04, 2012 2:35 pm
by Grumpy Mike
Interesting, thanks for doing them.
I found there was sometimes a 10 ms or so gap caused by ISRs.
You can see the scope trace in this thread:-
http://www.raspberrypi.org/phpBB3/viewt ... =33&t=7077
This could be a real problem for things that need to be processed in real time like audio signals. It means other techniques rather than straightforward code might have to be used.

Re: Simple GPIO performance benchmark

Posted: Wed Jul 04, 2012 6:35 pm
by Gert van Loo
There was a question related to regular reading of GPIO pins.
Have a look at this thread: http://www.raspberrypi.org/phpBB3/viewt ... 96#p113732

You can use the same (convoluted) method to output data.

Re: Simple GPIO performance benchmark

Posted: Wed Jul 04, 2012 8:29 pm
by texy
Please try the python experiment using the wiringpython method, see this thread -
http://www.raspberrypi.org/phpBB3/viewt ... 32&t=10010

I think you will find it to be considerably faster. It would be interesting to see how it stacks up against the other languages.

Texy

Re: Simple GPIO performance benchmark

Posted: Tue Aug 14, 2012 7:49 pm
by jokkebk
I've now updated the benchmark with wiringPi results. I also re-ran the C benchmarks with a better 200 MHz scope and probes to ensure the reported frequencies and waveforms were accurate (they were, except for the detailed shape of the 22 MHz signal, which is now updated in the article as well).

Also, the list of different access methods started getting quite long, so there's a summary table in the beginning now:

http://codeandlife.com/2012/07/03/bench ... pio-speed/

Re: Simple GPIO performance benchmark

Posted: Tue Aug 14, 2012 8:38 pm
by trouch
Good article, but incomplete in my mind: you benchmarked programming languages rather than GPIO implementations...
For example, just make a loop with print/echo in each language and you will certainly get shell < python < C.
That's it: C has better IO throughput than anything else.
You have to add some sync to measure the time between the call in the code and the IO rise/fall.
That's not easy, but you can also calculate the maximum rate for an empty loop, to make some kind of ratio.

but for a single programming language, it's good to compare each implementation !

Re: Simple GPIO performance benchmark

Posted: Tue Aug 14, 2012 8:41 pm
by texy
Thanks - very useful!
I'm very surprised at the differences between the two python methods. In real-life experiments on a Nokia LCD they seemed virtually identical timewise:
http://www.raspberrypi.org/phpBB3/viewt ... =32&t=9814

Then again, the RPi.GPIO version you used had a major bug and was only out for a very short while. You might like to try again with 0.3.1a

Texy

Re: Simple GPIO performance benchmark

Posted: Tue Aug 14, 2012 8:52 pm
by trouch
that's not surprising at all !
The wiringpi library is interpreted by the wrapper, which is itself interpreted by Python,
whereas RPi.GPIO is only interpreted by Python and relies on native C calls.

Re: Simple GPIO performance benchmark

Posted: Tue Aug 14, 2012 9:00 pm
by texy
OK. Then it's really surprising that in a real-life situation, they perform very similarly!!

T.

Re: Simple GPIO performance benchmark

Posted: Tue Aug 14, 2012 9:15 pm
by trouch
it will depend on what you do...
if you just want to turn on something, you will not notice a difference:
the acceptable delay for a human in usual things is 100 ms (10 Hz), and no difference is noticeable below 10 ms (100 Hz),
so even the poorly designed shell code will do the job.

but if you need to acquire some input data, compute something, and then decide or not to turn something on, you may notice differences depending on the computation complexity, because wiringpi is slower than "pure" python.

Re: Simple GPIO performance benchmark

Posted: Tue Aug 14, 2012 9:38 pm
by jbeale
Thanks for doing and writing up this experiment; very informative! I will try to compare some measurements on my own; as I recall I saw a little bit better risetime from my GPIO outputs, than the ~ 5 ns with a lot of overshoot, that I see in your 200 MHz scope plot.

About the alligator clip causing a reboot on your R-Pi, you probably know already, but you can get the proper grounding leads for your scope probe that connect without shorting to the 0.1" header pins. Or make your own by cutting in half a F-F header jumper cable.
[Attachment: ScopeProbe.jpg]
Overshoot and ringing can be caused by long ground leads; the best signal will be with the shortest ground lead, for example a 2 cm long wire soldered to the ground plane, and wrapped around the scope probe shield, like below
[Attachment: SProbeTest.jpg]

Re: Simple GPIO performance benchmark

Posted: Wed Aug 15, 2012 10:09 am
by Heater
If your aim is to generate a clean continuous square wave you are going to be sadly disappointed.

Problem is:
1) When the kernel schedules other tasks to run your nice 5/20MHz square wave will be paused.
2) Whilst generating at 5/20 MHz you are using all of your CPU capacity.

A more robust way to do this is to use the real-time features of Linux. Specifically, give your program a high priority to make sure the GPIO toggling happens on time and is not paused by lower-priority tasks. The sched_setscheduler() function is used for this. Then use clock_nanosleep() to get the period right.
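
A minimal sketch of that approach follows. It is not the code from the linked example: the helper names, the `count_edge` stand-in for a real GPIO write, and the choice of SCHED_FIFO are assumptions made for illustration. The loop sleeps until absolute deadlines, so wake-up jitter does not accumulate into drift.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Stand-in for a real GPIO write (assumption): it only counts edges. */
static long edge_count;
void count_edge(void) { edge_count++; }

/* Advance an absolute deadline by ns (< 1 s), keeping tv_nsec in range. */
void timespec_add_ns(struct timespec *t, long ns)
{
    assert(ns >= 0 && ns < NSEC_PER_SEC);
    t->tv_nsec += ns;
    if (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Ask for SCHED_FIFO for the calling process. Returns 0 on success,
 * -1 on failure (this normally needs root privileges). */
int request_realtime(int priority)
{
    struct sched_param sp = { .sched_priority = priority };
    return sched_setscheduler(0, SCHED_FIFO, &sp);
}

/* Call toggle() once per half period for the given number of edges.
 * Error returns from clock_nanosleep() are ignored in this sketch. */
void run_square_wave(long half_period_ns, long edges, void (*toggle)(void))
{
    struct timespec next;

    assert(toggle != NULL);
    clock_gettime(CLOCK_MONOTONIC, &next);
    for (long i = 0; i < edges; i++) {
        timespec_add_ns(&next, half_period_ns);
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        toggle();
    }
}
```

A program would call `request_realtime()` once at startup and then, for an 80 us period, something like `run_square_wave(40000L, n, toggle_fn)` with a real pin-toggling function in place of `count_edge`.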

I am running a test program using these ideas based on the example here https://rt.wiki.kernel.org/index.php/Squarewave-example

At a period of 80 us I am seeing a 50% load on my 800 MHz Raspbian setup.
That is only 12.5 kHz.

Now of course you can "bit bang" at 5/20 MHz in a tight loop as you say, but be aware that it will be intermittent and will not be suitable for some tasks, like performing regular serial I/O.

Re: Simple GPIO performance benchmark

Posted: Wed Aug 15, 2012 2:27 pm
by jbeale
I believe the point of the exercise was just a "simple GPIO benchmark". If you did want a high frequency square wave, you could do better by using the clock signal from the onboard I2C or SPI hardware interfaces. According to the Broadcom data sheet that interface clock is programmable up to 150 MHz (although that won't actually be achievable as an output).

Code: Select all

BCM2835 manual p. 34: I2C Clock Divider:
SCL = core clock / CDIV
where core_clk is nominally 150 MHz.
Might be a fun experiment to see just how high it will go, however.
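
The divider formula quoted above can be tried out with a few lines of C. This is just arithmetic on the quoted formula, using the nominal 150 MHz core clock; the function names are invented for this sketch, and it does not touch any hardware register or account for limits on the CDIV field.

```c
#include <assert.h>

/* Nominal core clock quoted from the BCM2835 manual, in Hz. */
#define CORE_CLK_HZ 150000000UL

/* SCL = core clock / CDIV */
unsigned long i2c_scl_hz(unsigned long cdiv)
{
    assert(cdiv > 0);
    return CORE_CLK_HZ / cdiv;
}

/* Smallest divider whose SCL does not exceed target_hz (rounds up). */
unsigned long i2c_cdiv_for(unsigned long target_hz)
{
    assert(target_hz > 0);
    return (CORE_CLK_HZ + target_hz - 1) / target_hz;
}
```

For instance, a divider of 1500 gives a 100 kHz SCL, and a 400 kHz target needs a divider of 375.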

Re: Simple GPIO performance benchmark

Posted: Wed Aug 15, 2012 3:48 pm
by DexOS
Why not add a bare metal ASM example as a comparison?

Re: Simple GPIO performance benchmark

Posted: Wed Aug 15, 2012 7:29 pm
by jokkebk
Thanks to jbeale for his excellent tips on improving the ground lead. I replaced the alligator clip -> male-to-female jumper wire setup with a short piece of wire, and used the probe head directly on the GPIO pin generating the signal (the Pi reset twice during those attempts...).

The resulting waveform was a _lot_ better, and I had to remove all my comments about signal roundness from the blog post. :D Just goes to show how much these things matter. I redid both scope screenshots from the C section, and also the one in wiringPi section, which showed a lot of ringing that wasn't really there:

http://codeandlife.com/2012/07/03/bench ... pio-speed/

And yes, to other commenters, the purpose was just to get some relative idea of the performance levels achievable with different programming languages and libraries. Keeping up with the latest versions and recompiling the Python / Perl / whatever bindings is too much of a hassle, so unless great errors are uncovered, I'll keep the post as it is and maybe do an update after six months or so (development is happening very fast currently and seems like at least one library has been updated every time I get the previous one up to date : )

Re: Simple GPIO performance benchmark

Posted: Wed Aug 15, 2012 7:59 pm
by jbeale
Yes, that is quite a difference! Might be fun to leave the old rounded waveform up too, just as an example of what that extra inductance from a few inches of wire in your ground connection can do. And that's just at 20 MHz, consider how impressive it is that USB2 cables carry data at a 480 Mbps rate...

Re: Simple GPIO performance benchmark

Posted: Sun Jan 04, 2015 4:30 pm
by tchiwam
Reviving an old thread :

My goal is to make a library with good enough timings to bit-bang I2C, cameras, SPI, serial stuff, and the address and data buses for PC104 8- and 16-bit (ISA bus with a pin-through connector). So I started to loop-time the GPIO. I made some wrappers to make it easy and to allow the very neat mmap GPIO methods to be used too when speed is really needed. Looks rather good, as the bus should work from 4.77 MHz up to about 8 MHz.

Rpi B+ 1GHz/500MHz/500MHz
Function ptrgpio 14.985MHz min:0.066563 max:0.067093 avg:0.066733 count: 10
Function ptrgpio + sleep(0) 4.728MHz min:0.211349 max:0.211839 avg:0.211524 count: 10
Macro 21.578MHz min:0.046292 max:0.046576 avg:0.046344 count: 10
Macro + sleep (0) 5.269MHz min:0.189604 max:0.190194 avg:0.189791 count: 10
Macro + 2*sleep(0) 3.001MHz min:0.333006 max:0.333609 avg:0.333242 count: 10

Rpi B+ 900MHz/450MHz/450MHz

Function ptrgpio 14.987MHz min:0.066584 max:0.067110 avg:0.066723 count: 10
Function ptrgpio + sleep(0) 4.725MHz min:0.211473 max:0.211849 avg:0.211618 count: 10
Macro 21.543MHz min:0.046277 max:0.046707 avg:0.046420 count: 10
Macro + sleep (0) 5.267MHz min:0.189686 max:0.190008 avg:0.189848 count: 10
Macro + 2*sleep(0) 3.000MHz min:0.333154 max:0.333543 avg:0.333333 count: 10
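
The MHz column in these results appears consistent with dividing the number of set/clear cycles per run (1,000,000 in the loops of this benchmark) by the average time per run in seconds. A small sketch of that arithmetic, with a made-up function name:

```c
#include <assert.h>

/* Square-wave frequency in MHz, given the number of set/clear cycles
 * and the time they took in seconds. */
double toggle_freq_mhz(double cycles, double seconds)
{
    assert(seconds > 0.0);
    return cycles / seconds / 1e6;
}
```

For the "Macro" row, `toggle_freq_mhz(1e6, 0.046344)` comes to about 21.58 MHz, which is roughly 46 ns per set/clear pair.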

Samples of the code I did, taken from http://elinux.org/RPi_Low-level_periphe ... .28GPIO.29 :

Code: Select all

  /* The Macro way*/
  t = ptrtimer_init(0);
  volatile unsigned *gpio = gpio1->gpio_map;
  printf("Macro                       ");
  for(j=0;j<10;j++)
  {
    ptrtimer_start(t);
    for(i=0; i<1000000; i++)
    {
      GPIO_SET = 1<<4;
      GPIO_CLR = 1<<4;
    }
    ptrtimer_stop(t);
  }
  printf("%.3fMHz ",10/t->timer);
  ptrtimer_report(t);
  ptrtimer_close(t);

  /* The Macro + sleep (0) */
  t = ptrtimer_init(0);
  printf("Macro + sleep (0)           ");
  for(j=0;j<10;j++)
  {
    ptrtimer_start(t);
    for(i=0; i<1000000; i++)
    {
      GPIO_SET = 1<<4;
      sleep(0);
      GPIO_CLR = 1<<4;
      sleep(0);
    }
    ptrtimer_stop(t);
  }
  printf("%.3fMHz ",10/t->timer);
  ptrtimer_report(t);
  ptrtimer_close(t);

  /* The Macro + 2*sleep (0) */
  t = ptrtimer_init(0);
  printf("Macro + 2*sleep(0)          ");
  for(j=0;j<10;j++)
  {
    ptrtimer_start(t);
    for(i=0; i<1000000; i++)
    {
      GPIO_SET = 1<<4;
      sleep(0);
      sleep(0);
      GPIO_CLR = 1<<4;
      sleep(0);
      sleep(0);
    }
    ptrtimer_stop(t);
  }
  printf("%.3fMHz ",10/t->timer);
  ptrtimer_report(t);
  ptrtimer_close(t);
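
The GPIO_SET and GPIO_CLR macros used in these loops are not defined in the post; in the elinux.org example they expand to writes at fixed word offsets from the mapped GPIO base. The sketch below shows the idea under one loud assumption: a plain array stands in for the mmap'ed register block, so it runs without root or Pi hardware and only records what would be written. The name `pulse_pin` is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Word offsets of the set/clear registers as used by the elinux.org
 * example macros (GPSET0 at byte 0x1C, GPCLR0 at byte 0x28). */
#define GPIO_SET (*(gpio + 7))
#define GPIO_CLR (*(gpio + 10))

/* Assumption: this array replaces the real mmap'ed /dev/mem region. */
static volatile uint32_t fake_regs[16];
static volatile uint32_t *gpio = fake_regs;

/* Pulse one pin: one write drives it high, a second drives it low.
 * Only bits written as 1 are affected, so other pins are untouched. */
void pulse_pin(unsigned pin)
{
    assert(pin < 32);
    GPIO_SET = UINT32_C(1) << pin;
    GPIO_CLR = UINT32_C(1) << pin;
}
```

Because each level change is a single store with no read-modify-write, this is the fastest of the methods timed here.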

The trouble I have is ... nanosleep and usleep are dead slow even with a usleep(0);

Code: Select all

 /* The Macro + nanosleep(0) */
  t = ptrtimer_init(0);
  printf("Macro + nanosleep(0)        ");
  struct timespec ns;
  ns.tv_sec = 0;
  ns.tv_nsec = 10L;
  
  for(j=0;j<10;j++)
  {
    printf("%d \n",j);
    ptrtimer_start(t);
    for(i=0; i<1000000; i++)
    {
      GPIO_SET = 1<<4;
      nanosleep(&ns,NULL);
      GPIO_CLR = 1<<4;
      nanosleep(&ns,NULL);
    }
    ptrtimer_stop(t);
  }
  printf("%.3fMHz ",10/t->timer);
  ptrtimer_report(t);
  ptrtimer_close(t);  
ptrblabla is my way to segregate my functions... If someone is interested I can move my git stuff to an open repo somewhere like GitHub. If no one asks I'll simply keep it here so as not to add yet another unused repo. PS: sorry for the coding style, I'm willing to change it if someone wants to join the fun.

Philippe

Re: Simple GPIO performance benchmark

Posted: Sun Jan 04, 2015 4:41 pm
by joan
The code will be more legible if you wrap it in [code] [/code] tags (Code button available when editing posts).

Just because you can toggle the gpio at x MHz doesn't mean you can reliably bit bang at x MHz. I'm not sure what the ratio of toggle to bit bang is, just > 1, perhaps 20 or so.

Re: Simple GPIO performance benchmark

Posted: Sun Jan 04, 2015 5:20 pm
by tchiwam
Thanks for the [code] tip

Code: Select all

Function ptrgpio            14.991MHz min:0.066564 max:0.066976 avg:0.066706 count: 10 
Function ptrgpio + sleep(0) 4.728MHz min:0.211327 max:0.212040 avg:0.211521 count: 10 
Macro                       21.559MHz min:0.046291 max:0.046563 avg:0.046384 count: 10 
Macro + sleep (0)           5.271MHz min:0.189596 max:0.189848 avg:0.189715 count: 10 
Macro + 2*sleep(0)          3.002MHz min:0.332890 max:0.333396 avg:0.333139 count: 10 
Macro + usleep(0)           6.332kHz min:0.156844 max:0.158509 avg:0.157918 count: 10 
Macro + usleep(5000)        98.301Hz min:1.017122 max:1.017452 avg:1.017285 count: 10 
Macro + nanosleep 0ns       6.356kHz min:0.155238 max:0.158968 avg:0.157323 count: 10 
Macro + nanosleep 5000000ns 98.312Hz min:1.016978 max:1.017345 avg:1.017166 count: 10 
Here are the latest results with usleep and nanosleep. Very interesting to see that even at 0, the performance hit is quite big.

I know the output of the GPIO is not going to be a nice square wave at 20 MHz over a 6" 40-pin ribbon cable. But it's a good start to be able to adjust some timings. For example, some chips I have won't go over 4 MHz. The idea is to leave a maximum of sleep time to let everyone else do what they need to ;)
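
The cost of a sleep call on its own can be estimated without any GPIO access. The sketch below is a hypothetical helper, not part of the ptr* code: it times a number of nanosleep() calls with clock_gettime() and returns the average wall-clock cost per call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Average wall-clock cost in nanoseconds of one nanosleep() call that
 * requests req_ns, measured over `calls` consecutive calls. */
double avg_nanosleep_cost_ns(long req_ns, int calls)
{
    struct timespec req = { .tv_sec = 0, .tv_nsec = req_ns };
    struct timespec t0, t1;

    assert(calls > 0 && req_ns >= 0 && req_ns < 1000000000L);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < calls; i++)
        nanosleep(&req, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double elapsed_ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
                      + (double)(t1.tv_nsec - t0.tv_nsec);
    return elapsed_ns / calls;
}
```

As a cross-check against the table: 6.332 kHz with two sleeps per cycle works out to roughly 79 us per usleep(0) call, and 98.3 Hz with usleep(5000) to about 5.09 ms per call, so the fixed overhead is of the same order in both cases.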

Re: Simple GPIO performance benchmark

Posted: Mon Oct 24, 2016 12:38 pm
by mc007ibi
Hi,
does anyone know how to run such a benchmark without an oscilloscope? I would like to run these tests for "Jerry-Script" (a lightweight version of Node-JS) with the pigpio C library on the native side (https://github.com/joan2937/pigpio). Same for Node-JS with the same binding (using https://github.com/fivdi/pigpio#performance).

Re: Simple GPIO performance benchmark

Posted: Sat Oct 29, 2016 5:39 pm
by mc007ibi
Ok, got my oscilloscope. Just for reference, I ran the code from the original blog on the recent Pi 3:

Python: 95 kHz

Javascript (uses C binding, see https://github.com/fivdi/pigpio): 50 kHz

XBlox (my own visual block language, uses the same pigpio lib): 91 kHz. I am confused about this result, because I literally only forward the calls, yet it's faster.

I guess to make the GPIO really faster, in the desired MHz range for both languages, you could send batch jobs to a C daemon. Not sure what that could look like, but I recently found some idiomatic language approaches to translate Python and JS from AST to C. See https://github.com/alehander42/pseudo for more.

Re: Simple GPIO performance benchmark

Posted: Sun Oct 30, 2016 7:39 am
by fivdi
Can you post the code that was used to test the pigpio JavaScript module?

50 kHz doesn't seem correct. I'd expect a value closer to 1 MHz.

Measured with software, the following test can change the state of a GPIO approximately 2,000,000 times per second on a Pi 3, which means a frequency of 1 MHz:
https://github.com/fivdi/pigpio/blob/ma ... ormance.js
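
The same kind of software measurement can be sketched in C. This is not the linked JavaScript test: `fake_gpio_write` is a stand-in (an assumption) for whatever GPIO write call is being benchmarked, and the other names are made up as well.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Stand-in for a real GPIO write; replace with the call under test. */
static volatile unsigned pin_state;
void fake_gpio_write(unsigned level) { pin_state = level; }

/* Time `cycles` high/low pairs and return state changes per second. */
double state_changes_per_sec(long cycles, void (*write_pin)(unsigned))
{
    struct timespec t0, t1;

    assert(cycles > 0 && write_pin != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < cycles; i++) {
        write_pin(1);
        write_pin(0);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return 2.0 * (double)cycles / secs;
}

/* Two state changes make one full square-wave period. */
double changes_to_freq_hz(double changes_per_sec)
{
    return changes_per_sec / 2.0;
}
```

A measurement like this only gives the average call rate seen by the program. It cannot show gaps caused by scheduling or the actual waveform on the pin, which is what the oscilloscope is for.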

Re: Simple GPIO performance benchmark

Posted: Sun Oct 30, 2016 9:29 am
by mc007ibi
Hi,
Ash on my head, I think I measured this wrong and my crappy $90 oscilloscope doesn't measure beyond 1 MHz.
With a setting of 0.1 V and 0.1 ms:
- Python gives me 100Kc (0.1MHz)
- pigpio gives me 50Kc (node-v4.4.7)

I'm not even sure how to translate this correctly to the numbers in the benchmark article, but given he said 90 kHz for Python, these numbers seem to add up. I placed a video here, showing the test. Anyhow, I was more interested in how the languages compare. Seeing Python twice as fast as JS with a native binding makes me wonder where the juice went. I'm going to try that with Samsung's "Jerry-Script". Possibly it removes some overhead you have in plain JS.