jokkebk
Posts: 7
Joined: Sun Jul 01, 2012 5:30 pm

Simple GPIO performance benchmark

Tue Jul 03, 2012 7:31 pm

Today I finally had time to investigate GPIO programming further, and decided to see how fast the GPIO pins can be driven using different programming languages / libraries. The test setup was to generate a simple square wave as fast as possible and measure the frequency with an oscilloscope. Detailed results here:

http://codeandlife.com/2012/07/03/bench ... pio-speed/

Based on testing, pure C access is fastest: a wave of over 20 MHz could be generated (although the signal starts to look more like a sine wave at that point). I used the first example here as a basis:

http://elinux.org/RPi_Low-level_periphe ... re_hacking

The bcm2835 library (C as well) came second at over 5 MHz. A Perl module using the same library clocked in at 35 kHz. Simple shell access (bash script) got to 3400 Hz. Surprisingly, the Python RPi.GPIO library finished dead last at 900 Hz. Definitely nothing to write home about (but still enough for all automation tasks etc.).

I'm planning to try out the sampling bandwidth in the near future, too. What do you guys think, should the benchmark results be available in the RPi wiki somewhere?

Gert van Loo
Posts: 2486
Joined: Tue Aug 02, 2011 7:27 am
Contact: Website

Re: Simple GPIO performance benchmark

Tue Jul 03, 2012 9:00 pm

I think an important parameter would be how big the gaps are.
That is, what is the longest latency for servicing the GPIO pins?

jokkebk
Posts: 7
Joined: Sun Jul 01, 2012 5:30 pm

Re: Simple GPIO performance benchmark

Wed Jul 04, 2012 5:46 am

All the versions I tested had quite a small variance in the minimum, maximum and average frequency, which seems to indicate the latency is very small most of the time (the frequency is basically a function of the latency: if the logic level takes 1 ms to change, the frequency would be ~500 Hz).

Unfortunately I didn't make "longest change time" type of measurements (not sure if my Picoscope has those), so the only thing that can be deduced from the results is that the latency per level change is generally about 0.5/freq seconds.
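
The arithmetic behind that estimate can be written down in a few lines of C. This is only a sketch of the relation (one full period of the square wave is two level changes); the function names are made up for illustration and are not from the benchmark code.

```c
/* One square-wave period consists of two level changes (high, then low),
 * so a per-change latency of t seconds gives f = 1 / (2 * t). */
double freq_from_latency(double latency_s)
{
    return 1.0 / (2.0 * latency_s);
}

/* Inverse relation: average time per level change for a measured frequency. */
double latency_from_freq(double freq_hz)
{
    return 0.5 / freq_hz;
}
```

For example, freq_from_latency(0.001) gives about 500 Hz, and the 900 Hz RPi.GPIO result corresponds to roughly 0.56 ms per level change.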

Grumpy Mike
Posts: 916
Joined: Sat Sep 10, 2011 7:49 pm
Location: Manchester (England England)
Contact: Website

Re: Simple GPIO performance benchmark

Wed Jul 04, 2012 2:35 pm

Interesting, thanks for doing them:-
I found there was sometimes a gap of 10 ms or so caused by ISRs.
You can see the scope trace in this thread:-
http://www.raspberrypi.org/phpBB3/viewt ... =33&t=7077
This could be a real problem for things that need to be processed in real time like audio signals. It means other techniques rather than straightforward code might have to be used.

Gert van Loo
Posts: 2486
Joined: Tue Aug 02, 2011 7:27 am
Contact: Website

Re: Simple GPIO performance benchmark

Wed Jul 04, 2012 6:35 pm

There was a question related to regular reading of GPIO pins.
Have a look at this thread: http://www.raspberrypi.org/phpBB3/viewt ... 96#p113732

You can use the same (convoluted) method to output data.

texy
Forum Moderator
Posts: 5160
Joined: Sat Mar 03, 2012 10:59 am
Location: Berkshire, England

Re: Simple GPIO performance benchmark

Wed Jul 04, 2012 8:29 pm

Please try the python experiment using the wiringpython method, see this thread -
http://www.raspberrypi.org/phpBB3/viewt ... 32&t=10010

I think you will find it to be considerably faster. It would be interesting to see how it stacks up against the other languages.

Texy
Various male/female 40- and 26-way GPIO header for sale here ( IDEAL FOR YOUR PiZero ):
https://www.raspberrypi.org/forums/viewtopic.php?f=93&t=147682#p971555

jokkebk
Posts: 7
Joined: Sun Jul 01, 2012 5:30 pm

Re: Simple GPIO performance benchmark

Tue Aug 14, 2012 7:49 pm

I've now updated the benchmark with wiringPi results. I also re-ran the C benchmarks with a better 200 MHz scope and probes to ensure the reported frequencies and waveforms were accurate (they were, except for the detailed shape of the 22 MHz signal, which is now updated in the article as well).

Also, the list of different access methods started getting quite long, so there's a summary table in the beginning now:

http://codeandlife.com/2012/07/03/bench ... pio-speed/

trouch
Posts: 310
Joined: Fri Aug 03, 2012 7:24 pm
Location: France
Contact: Website

Re: Simple GPIO performance benchmark

Tue Aug 14, 2012 8:38 pm

Good article, but incomplete in my mind: you benchmarked programming languages rather than GPIO implementations.
For example, just make a loop with print/echo in each language and you will certainly get shell < python < C.
That's it: C has better I/O throughput than anything else.
You have to add some sync to measure the time between the call in the code and the I/O rise/fall.
That's not easy, but you can also calculate the maximum rate for an empty loop, to make some kind of ratio.

but for a single programming language, it's good to compare each implementation !

WebIOPi - Raspberry Pi REST Framework to control your Pi from the web
http://store.raspberrypi.com/projects/webiopi
http://code.google.com/p/webiopi/
http://trouch.com

texy
Forum Moderator
Posts: 5160
Joined: Sat Mar 03, 2012 10:59 am
Location: Berkshire, England

Re: Simple GPIO performance benchmark

Tue Aug 14, 2012 8:41 pm

Thanks - very useful!
I'm very surprised at the differences between the two Python methods. In real-life experiments on a Nokia LCD they seemed virtually identical timewise:
http://www.raspberrypi.org/phpBB3/viewt ... =32&t=9814

Then again, the RPi.GPIO version you used had a major bug and was only out for a very short while. You might like to try again with 0.3.1a

Texy

trouch
Posts: 310
Joined: Fri Aug 03, 2012 7:24 pm
Location: France
Contact: Website

Re: Simple GPIO performance benchmark

Tue Aug 14, 2012 8:52 pm

That's not surprising at all!
The wiringpi library is interpreted by the wrapper, which is itself interpreted by Python,
whereas RPi.GPIO is only interpreted by Python and relies on native C calls.


texy
Forum Moderator
Posts: 5160
Joined: Sat Mar 03, 2012 10:59 am
Location: Berkshire, England

Re: Simple GPIO performance benchmark

Tue Aug 14, 2012 9:00 pm

OK. Then it's really surprising that in a real-life situation, they perform very similarly!!

T.

trouch
Posts: 310
Joined: Fri Aug 03, 2012 7:24 pm
Location: France
Contact: Website

Re: Simple GPIO performance benchmark

Tue Aug 14, 2012 9:15 pm

it will depend on what you do...
if you just want to turn on something, you will not notice difference :
the humanly acceptable delay for usual things is 100 ms (10 Hz), and there is no noticeable difference below 10 ms (100 Hz)
so even the poorly designed shell code will do the job.

but if you need to acquire some input data, compute something, and then decide or not to turn something on, you may notice differences depending on the computation complexity, because wiringpi is slower than "pure" python.


jbeale
Posts: 3516
Joined: Tue Nov 22, 2011 11:51 pm
Contact: Website

Re: Simple GPIO performance benchmark

Tue Aug 14, 2012 9:38 pm

Thanks for doing and writing up this experiment; very informative! I will try to compare some measurements on my own; as I recall I saw a little bit better risetime from my GPIO outputs, than the ~ 5 ns with a lot of overshoot, that I see in your 200 MHz scope plot.

About the alligator clip causing a reboot on your R-Pi, you probably know already, but you can get the proper grounding leads for your scope probe that connect without shorting to the 0.1" header pins. Or make your own by cutting in half a F-F header jumper cable.
(Attachment: ScopeProbe.jpg)
Overshoot and ringing can be caused by long ground leads; the best signal will be with the shortest ground lead, for example a 2 cm long wire soldered to the ground plane, and wrapped around the scope probe shield, like below
(Attachment: SProbeTest.jpg)

Heater
Posts: 13878
Joined: Tue Jul 17, 2012 3:02 pm

Re: Simple GPIO performance benchmark

Wed Aug 15, 2012 10:09 am

If your aim is to generate a clean continuous square wave you are going to be sadly disappointed.

Problem is:
1) When the kernel schedules other tasks to run your nice 5/20MHz square wave will be paused.
2) Whilst generating at 5/20 MHz you are using all of your CPU capacity.

A more robust way to do this is to use the real-time features of Linux. Specifically, give your program a high priority to make sure the GPIO toggling happens on time and is not paused by lower-priority tasks. The sched_setscheduler() function is used for this. Use clock_nanosleep() to get the period right.
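
A minimal sketch of that approach follows. It is not the test program mentioned below; the function names and the toggle callback (a placeholder for whatever actually writes the GPIO pin) are assumptions made for illustration. It requests SCHED_FIFO and then sleeps until absolute deadlines on CLOCK_MONOTONIC, so that timing errors do not accumulate from one edge to the next.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance an absolute deadline by ns nanoseconds, keeping tv_nsec below 1 s. */
void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Ask for the SCHED_FIFO real-time policy for the calling process.
 * Returns 0 on success, -1 on failure (this normally needs root). */
int request_realtime(int priority)
{
    struct sched_param sp = { .sched_priority = priority };
    return sched_setscheduler(0, SCHED_FIFO, &sp);
}

/* Call toggle(level) once per half period, alternating 1 and 0.
 * toggle is a placeholder for the code that drives the pin. */
void run_periodic(void (*toggle)(int level), long half_period_ns, long edges)
{
    struct timespec next;
    clock_gettime(CLOCK_MONOTONIC, &next);
    for (long i = 0; i < edges; i++) {
        toggle(i % 2 == 0);
        timespec_add_ns(&next, half_period_ns);
        /* Sleep until the absolute deadline, not for a relative interval. */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
}
```

For an 80 us period, half_period_ns would be 40000. A typical use would call request_realtime() once at startup and then run_periodic() with a function that sets or clears the pin.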

I am running a test program using these ideas based on the example here https://rt.wiki.kernel.org/index.php/Squarewave-example

At a period of 80 us I am seeing a 50% load on my 800 MHz Raspbian.
That is only 12.5 kHz.

Now of course you can "bit bang" at 5/20 MHz in a tight loop as you say, but be aware that it will be intermittent and will not be suitable for some tasks, like performing regular serial I/O.
Memory in C++ is a leaky abstraction .

jbeale
Posts: 3516
Joined: Tue Nov 22, 2011 11:51 pm
Contact: Website

Re: Simple GPIO performance benchmark

Wed Aug 15, 2012 2:27 pm

I believe the point of the exercise was just a "simple GPIO benchmark". If you did want a high frequency square wave, you could do better by using the clock signal from the onboard I2C or SPI hardware interfaces. According to the Broadcom data sheet that interface clock is programmable up to 150 MHz (although that won't actually be achievable as an output).

Code: Select all

BCM2835 manual p. 34: I2C Clock Divider: SCL = core_clk / CDIV, where core_clk is nominally 150 MHz.
Might be a fun experiment to see just how high it will go, however.
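
The divider relation quoted from the manual is plain integer arithmetic. The sketch below only illustrates that formula; the function names are made up, the 150 MHz figure is the nominal value from the quote, and any hardware restrictions on CDIV (register width, rounding) are deliberately ignored and should be checked against the data sheet.

```c
#include <stdint.h>

/* Nominal core clock from the manual quote; the real value may differ. */
#define CORE_CLK_HZ 150000000u

/* SCL = core_clk / CDIV */
uint32_t i2c_scl_hz(uint32_t core_clk_hz, uint32_t cdiv)
{
    return core_clk_hz / cdiv;
}

/* Smallest divider that keeps SCL at or below target_hz (rounds up). */
uint32_t i2c_cdiv_for(uint32_t core_clk_hz, uint32_t target_hz)
{
    return (core_clk_hz + target_hz - 1u) / target_hz;
}
```

With the nominal core clock, a divider of 1500 gives 100 kHz and a 400 kHz target needs a divider of 375.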

DexOS
Posts: 876
Joined: Wed May 16, 2012 6:32 pm
Contact: Website

Re: Simple GPIO performance benchmark

Wed Aug 15, 2012 3:48 pm

Why not add a bare metal ASM example, as a comparison?
Batteries not included, Some assembly required.

jokkebk
Posts: 7
Joined: Sun Jul 01, 2012 5:30 pm

Re: Simple GPIO performance benchmark

Wed Aug 15, 2012 7:29 pm

Thanks to jbeale for his excellent tips on improving the ground lead. I replaced the alligator clip -> male-to-female jumper wire setup with a short piece of wire, and used the probe head directly on the GPIO pin generating the signal (the Pi reset twice during those attempts...).

The resulting waveform was a _lot_ better, and I had to remove all my comments about signal roundness from the blog post. :D Just goes to show how much these things matter. I redid both scope screenshots from the C section, and also the one in wiringPi section, which showed a lot of ringing that wasn't really there:

http://codeandlife.com/2012/07/03/bench ... pio-speed/

And yes, to other commenters, the purpose was just to get some relative idea of the performance levels achievable with different programming languages and libraries. Keeping up with the latest versions and recompiling the Python / Perl / whatever bindings is too much of a hassle, so unless great errors are uncovered, I'll keep the post as it is and maybe do an update after six months or so (development is happening very fast currently and seems like at least one library has been updated every time I get the previous one up to date : )

jbeale
Posts: 3516
Joined: Tue Nov 22, 2011 11:51 pm
Contact: Website

Re: Simple GPIO performance benchmark

Wed Aug 15, 2012 7:59 pm

Yes, that is quite a difference! Might be fun to leave the old rounded waveform up too, just as an example of what that extra inductance from a few inches of wire in your ground connection can do. And that's just at 20 MHz, consider how impressive it is that USB2 cables carry data at a 480 Mbps rate...

tchiwam
Posts: 43
Joined: Mon Nov 24, 2014 4:01 pm

Re: Simple GPIO performance benchmark

Sun Jan 04, 2015 4:30 pm

Reviving an old thread :

My goal is to make a library with good enough timings to bitbang I2C, cameras, SPI, serial stuff, and the address and data buses for PC104 8 and 16 bit (ISA bus with a pin-through connector). So I started to loop-time the GPIO. I made some wrappers to make it easy and to allow the very neat mmap GPIO methods to be used too when speed is really needed. Looks rather good, as the bus should work from 4.77 MHz up to about 8 MHz.

RPi B+ 1 GHz/500 MHz/500 MHz

Function ptrgpio            14.985MHz min:0.066563 max:0.067093 avg:0.066733 count: 10
Function ptrgpio + sleep(0) 4.728MHz min:0.211349 max:0.211839 avg:0.211524 count: 10
Macro                       21.578MHz min:0.046292 max:0.046576 avg:0.046344 count: 10
Macro + sleep (0)           5.269MHz min:0.189604 max:0.190194 avg:0.189791 count: 10
Macro + 2*sleep(0)          3.001MHz min:0.333006 max:0.333609 avg:0.333242 count: 10

RPi B+ 900 MHz/450 MHz/450 MHz

Function ptrgpio            14.987MHz min:0.066584 max:0.067110 avg:0.066723 count: 10
Function ptrgpio + sleep(0) 4.725MHz min:0.211473 max:0.211849 avg:0.211618 count: 10
Macro                       21.543MHz min:0.046277 max:0.046707 avg:0.046420 count: 10
Macro + sleep (0)           5.267MHz min:0.189686 max:0.190008 avg:0.189848 count: 10
Macro + 2*sleep(0)          3.000MHz min:0.333154 max:0.333543 avg:0.333333 count: 10

Samples of the code I did, taken from http://elinux.org/RPi_Low-level_periphe ... .28GPIO.29 :

Code: Select all

  /* The Macro way*/
  t = ptrtimer_init(0);
  volatile unsigned *gpio = gpio1->gpio_map;
  printf("Macro                       ");
  for(j=0;j<10;j++)
  {
    ptrtimer_start(t);
    for(i=0; i<1000000; i++)
    {
      GPIO_SET = 1<<4;
      GPIO_CLR = 1<<4;
    }
    ptrtimer_stop(t);
  }
  printf("%.3fMHz ",10/t->timer);
  ptrtimer_report(t);
  ptrtimer_close(t);

  /* The Macro + sleep (0) */
  t = ptrtimer_init(0);
  printf("Macro + sleep (0)           ");
  for(j=0;j<10;j++)
  {
    ptrtimer_start(t);
    for(i=0; i<1000000; i++)
    {
      GPIO_SET = 1<<4;
      sleep(0);
      GPIO_CLR = 1<<4;
      sleep(0);
    }
    ptrtimer_stop(t);
  }
  printf("%.3fMHz ",10/t->timer);
  ptrtimer_report(t);
  ptrtimer_close(t);

  /* The Macro + 2*sleep (0) */
  t = ptrtimer_init(0);
  printf("Macro + 2*sleep(0)          ");
  for(j=0;j<10;j++)
  {
    ptrtimer_start(t);
    for(i=0; i<1000000; i++)
    {
      GPIO_SET = 1<<4;
      sleep(0);
      sleep(0);
      GPIO_CLR = 1<<4;
      sleep(0);
      sleep(0);
    }
    ptrtimer_stop(t);
  }
  printf("%.3fMHz ",10/t->timer);
  ptrtimer_report(t);
  ptrtimer_close(t);
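
The ptrtimer_* helpers are not shown in the post. A minimal sketch of how such a timer could be built on clock_gettime() follows; the struct layout and the run_timer_* names are guesses for illustration only, not the author's code. It also shows where the printed figure comes from: ten runs of 1,000,000 set/clear cycles take t->timer seconds in total, so the rate is 10,000,000 / timer cycles per second, which is 10 / timer in MHz.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical stand-in for the ptrtimer object used in the post. */
struct run_timer {
    struct timespec started; /* start of the current run */
    double total_s;          /* accumulated time over all runs */
    double min_s, max_s;     /* fastest and slowest single run */
    int count;               /* number of completed runs */
};

void run_timer_init(struct run_timer *t)
{
    t->total_s = 0.0;
    t->min_s = 0.0;
    t->max_s = 0.0;
    t->count = 0;
}

void run_timer_start(struct run_timer *t)
{
    clock_gettime(CLOCK_MONOTONIC, &t->started);
}

/* Record one run that lasted elapsed_s seconds. */
void run_timer_record(struct run_timer *t, double elapsed_s)
{
    if (t->count == 0 || elapsed_s < t->min_s) t->min_s = elapsed_s;
    if (t->count == 0 || elapsed_s > t->max_s) t->max_s = elapsed_s;
    t->total_s += elapsed_s;
    t->count++;
}

/* Stop the current run and record its duration. */
void run_timer_stop(struct run_timer *t)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    run_timer_record(t, (double)(now.tv_sec - t->started.tv_sec)
                        + (double)(now.tv_nsec - t->started.tv_nsec) / 1e9);
}

/* Average toggle frequency in MHz, given the set/clear cycles per run. */
double run_timer_mhz(const struct run_timer *t, double cycles_per_run)
{
    return t->count * cycles_per_run / t->total_s / 1e6;
}
```

A timer like this is also one way to benchmark without an oscilloscope, with the caveat that it measures how fast the loop runs, not what actually appears on the pin.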

The trouble I have is ... nanosleep and usleep are dead slow even with a usleep(0);

Code: Select all

 /* The Macro + nanosleep(0) */
  t = ptrtimer_init(0);
  printf("Macro + nanosleep(0)        ");
  struct timespec ns;
  ns.tv_sec = 0;
  ns.tv_nsec = 10L;
  
  for(j=0;j<10;j++)
  {
    printf("%d \n",j);
    ptrtimer_start(t);
    for(i=0; i<1000000; i++)
    {
      GPIO_SET = 1<<4;
      nanosleep(&ns,NULL);
      GPIO_CLR = 1<<4;
      nanosleep(&ns,NULL);
    }
    ptrtimer_stop(t);
  }
  printf("%.3fMHz ",10/t->timer);
  ptrtimer_report(t);
  ptrtimer_close(t);  
The "ptr" prefix (ptrtimer, ptrgpio, ...) is my way to segregate my functions. If someone is interested I can move my git stuff to an open repo somewhere like GitHub. If no one asks I'll simply keep it here, so as not to add yet another unused repo. PS: sorry for the coding style, I'm willing to change it if someone wants to join the fun.

Philippe
Last edited by tchiwam on Sun Jan 04, 2015 5:15 pm, edited 1 time in total.

joan
Posts: 14472
Joined: Thu Jul 05, 2012 5:09 pm
Location: UK

Re: Simple GPIO performance benchmark

Sun Jan 04, 2015 4:41 pm

The code will be more legible if you wrap it in [code] ... [/code] tags (Code button available when editing posts).

Just because you can toggle the gpio at x MHz doesn't mean you can reliably bit bang at x MHz. I'm not sure what the ratio of toggle to bit bang is, just > 1, perhaps 20 or so.

tchiwam
Posts: 43
Joined: Mon Nov 24, 2014 4:01 pm

Re: Simple GPIO performance benchmark

Sun Jan 04, 2015 5:20 pm

Thanks for the [code] tip

Code: Select all

Function ptrgpio            14.991MHz min:0.066564 max:0.066976 avg:0.066706 count: 10 
Function ptrgpio + sleep(0) 4.728MHz min:0.211327 max:0.212040 avg:0.211521 count: 10 
Macro                       21.559MHz min:0.046291 max:0.046563 avg:0.046384 count: 10 
Macro + sleep (0)           5.271MHz min:0.189596 max:0.189848 avg:0.189715 count: 10 
Macro + 2*sleep(0)          3.002MHz min:0.332890 max:0.333396 avg:0.333139 count: 10 
Macro + usleep(0)           6.332kHz min:0.156844 max:0.158509 avg:0.157918 count: 10 
Macro + usleep(5000)        98.301Hz min:1.017122 max:1.017452 avg:1.017285 count: 10 
Macro + nanosleep 0ns       6.356kHz min:0.155238 max:0.158968 avg:0.157323 count: 10 
Macro + nanosleep 5000000ns 98.312Hz min:1.016978 max:1.017345 avg:1.017166 count: 10 
Here are the latest results with usleep and nanosleep. Very interesting to see that even at 0, the performance hit is quite big.
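
Both usleep() and nanosleep() go through the kernel, which is presumably why even a zero-length request costs tens of microseconds here. For delays shorter than that, one common alternative (not something used in the posts above) is to busy-wait on clock_gettime(). The sketch below assumes CLOCK_MONOTONIC is available; the function names are made up, and unlike a real sleep it burns CPU for the whole delay instead of yielding to other tasks.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Nanoseconds elapsed between two CLOCK_MONOTONIC readings. */
long long elapsed_ns(const struct timespec *from, const struct timespec *to)
{
    return (long long)(to->tv_sec - from->tv_sec) * 1000000000LL
         + (long long)(to->tv_nsec - from->tv_nsec);
}

/* Spin until at least ns nanoseconds have passed. Does not yield the CPU. */
void busy_wait_ns(long long ns)
{
    struct timespec start, now;
    clock_gettime(CLOCK_MONOTONIC, &start);
    do {
        clock_gettime(CLOCK_MONOTONIC, &now);
    } while (elapsed_ns(&start, &now) < ns);
}
```

The actual resolution is limited by the cost of each clock_gettime() call, so very short requests will overshoot, and the scheduler can still preempt the loop.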

I know the output of the GPIO is not going to be a nice square wave at 20 MHz over a 6" 40-pin ribbon cable. But it's a good start to be able to adjust some timings. For example, some chips I have won't go over 4 MHz. The idea is to leave a maximum of sleep time to let everyone else do what they need to ;)

mc007ibi
Posts: 66
Joined: Wed Dec 16, 2015 7:36 pm
Location: barcelona

Re: Simple GPIO performance benchmark

Mon Oct 24, 2016 12:38 pm

Hi,
does someone know how to make such a benchmark without an oscilloscope? I would like to run these tests for "Jerry-Script" (a lightweight version of Node-JS) and, on the native side, the pigpio C library (https://github.com/joan2937/pigpio). Same for Node-JS with the same binding (using https://github.com/fivdi/pigpio#performance).

mc007ibi
Posts: 66
Joined: Wed Dec 16, 2015 7:36 pm
Location: barcelona

Re: Simple GPIO performance benchmark

Sat Oct 29, 2016 5:39 pm

OK, got my oscilloscope. Just for reference, I ran the code from the original blog on the recent Pi 3:

Python: 95 kHz

Javascript (uses C binding, see https://github.com/fivdi/pigpio): 50 kHz

XBlox (my own visual block language, uses the same pigpio lib): 91 kHz. I am confused about this result, because I literally only forward the calls and yet it's faster.

I guess to make the GPIO really faster, in the desired MHz range for both languages, you could send batch jobs to a C daemon. Not sure what that could look like, but I recently found some idiomatic language approaches to translate Python and JS from AST to C. See https://github.com/alehander42/pseudo for more.

fivdi
Posts: 208
Joined: Sun Sep 23, 2012 8:09 pm
Contact: Website

Re: Simple GPIO performance benchmark

Sun Oct 30, 2016 7:39 am

Can you post the code that was used to test the pigpio JavaScript module?

50 kHz doesn't seem correct. I'd expect a value closer to 1 MHz.

Measured with software, the following test can change the state of a GPIO approximately 2,000,000 times per second on a Pi 3, which means a frequency of 1 MHz:
https://github.com/fivdi/pigpio/blob/ma ... ormance.js

mc007ibi
Posts: 66
Joined: Wed Dec 16, 2015 7:36 pm
Location: barcelona

Re: Simple GPIO performance benchmark

Sun Oct 30, 2016 9:29 am

Hi,
Ashes on my head: I think I measured this wrong, and my crappy $90 oscilloscope doesn't measure beyond 1 MHz.
With a setting of 0.1 V and 0.1 ms:
- Python gives me 100Kc (0.1MHz)
- pigpio gives me 50Kc (node-v4.4.7)

I'm not even sure how to translate this correctly to the numbers in the benchmark article, but given he said 90 kHz for Python, these numbers seem to add up. I placed a video here, showing the test. Anyhow, I was more interested in how the languages perform in comparison. Seeing Python twice as fast as JS with a native binding makes me wonder where the juice went. I'm going to try that with Samsung's "Jerry-Script". Possibly it removes some overhead you have in plain JS.
