LdB
Posts: 1373
Joined: Wed Dec 07, 2016 2:29 pm

Pi 3B+ Baremetal sample code

Sun Jan 15, 2017 10:57 am

After all the drama of getting the Pi3B+ to boot bare metal, I put an article and the code up on Code Project, as it may help other people starting out.

https://www.codeproject.com/Articles/11 ... -land-part

Hope it helps someone out.

Ultibo
Posts: 160
Joined: Wed Sep 30, 2015 10:29 am
Location: Australia
Contact: Website

Re: Pi 3B+ Baremetal sample code

Mon Jan 16, 2017 10:36 pm

So it seems that you never actually resolved why you couldn't get your code to boot on a Pi3. That might be your choice, but it is a shame you have decided to present this to others as fact.

There are a number of errors in your article. Firstly, there are multiple bare metal projects that run happily on Pi3, such as Circle and Ultibo, and these projects boot from 0x8000 without needing to use kernel_old=1 or any such thing.

Equally, your SD card image does not contain the fixup.dat file, which is required to allow the firmware to be relocatable and therefore get access to the full 1GB on the Pi2 and Pi3.

If your bare metal code does not boot when loaded at 0x8000 then I can only suggest you have errors in your code; there is certainly no bug in the firmware that prevents this from working. The Pi is all about learning, so if you provide information for others to use as a resource it is important that you make sure it is factually correct.

Garry.
Ultibo.org | Make something amazing
https://ultibo.org

Threads, multi-core, OpenGL, Camera, FAT, NTFS, TCP/IP, USB and more in 3MB with 2 second boot!

LdB
Posts: 1373
Joined: Wed Dec 07, 2016 2:29 pm

Re: Pi 3B+ Baremetal sample code

Tue Jan 17, 2017 2:54 am

Garry, I am sorry that you feel like that, but the article was about saving others the days of frustration I went through. I will happily remove any reference to "bug" and instead call it undocumented behaviour of the Pi firmware loader.

Update: I see you are from Ultibo and I have very bad news for you about Ultibo your image file does not work on my Pi3B+.
If I take the image file from (https://ultibo.org/download/) I just get the same old startup color screen as every other code that fails. So you have the problem on my Pi3B+ as well ... so the question is have you ever seen Ultibo run on a Pi3 B+ or did you just assume it would work.
Here are the date timestamps on the SD card to verify it is your image file but it won't pickup my Pi3B+
http://s000.tinyupload.com/index.php?fi ... 7426447157

Sorry my problem is now your problem :-)
I shouldn't laugh but it is a relief to know it isn't just my code that has that problem and funny in context of your answer above.

I also received another Pi3B+ board today and it behaves exactly the same as first board .. so ruling out bad board. I already guessed that because Raspbian can kick it properly. Your image code does boot properly on my Pi1 and I get the yellow screen and the tutorial start. So you are in the exact same boat, Ultibo can boot from 0x8000 on a Pi1 but not on a Pi3B+.

I will put up a message if I manage to sort this mess out; could you do the same if you solve it?

Update: Your circle repo may have given me the answer to the problem they have a cute macro that is executed immediately at startup and it is even subject to a copyright notice to Russell King at the top of the file (https://github.com/rsta2/circle/blob/ma ... /startup.S). It doesn't have a built img unfortunately so I have to do a full compile to check it works.

Thank you for the detail about the fixup.dat file. I will check it works and add that detail to the article and remove the word bug.

Side note: where in Oz do you hail from? I am in Perth.

rpdom
Posts: 15936
Joined: Sun May 06, 2012 5:17 am
Location: Chelmsford, Essex, UK

Re: Pi 3B+ Baremetal sample code

Tue Jan 17, 2017 5:49 am

LdB wrote:Update: I see you are from Ultibo and I have very bad news for you about Ultibo your image file does not work on my Pi3B+.
There is no such thing as a Pi3B+

Which model Pi do you really have?

Ultibo
Posts: 160
Joined: Wed Sep 30, 2015 10:29 am
Location: Australia
Contact: Website

Re: Pi 3B+ Baremetal sample code

Tue Jan 17, 2017 7:37 am

LdB wrote:Update: I see you are from Ultibo and I have very bad news for you about Ultibo your image file does not work on my Pi3B+.
If I take the image file from (https://ultibo.org/download/) I just get the same old startup color screen as every other code that fails. So you have the problem on my Pi3B+ as well ... so the question is have you ever seen Ultibo run on a Pi3 B+ or did you just assume it would work
That's my point exactly, the Ultibo code is regularly tested on every known model of Pi including the recently released Pi2B rev 1.2 (with the 64-bit chip) and we test them pretty well 24/7.

I can't download your SD image from codeproject.com because I don't have a login but if you are using the set of files shown in the screenshot then I can guarantee you that Ultibo will NOT boot because you are missing the fixup.dat file and that will cause the Pi3B to report only 256MB of memory. I know that because we spent time working out why things work and why they don't so that we know the facts.
LdB wrote:Here are the date timestamps on the SD card to verify it is your image file but it won't pickup my Pi3B+
Sadly that file contains nothing but a piece of Malware so I can't see what you posted.
LdB wrote:Sorry my problem is now your problem :-)
I shouldn't laugh but it is a relief to know it isn't just my code that has that problem and funny in context of your answer above.
No the problem still seems to be yours alone, if you test the Ultibo demo image with the complete set of firmware files (bootcode.bin, start.elf and fixup.dat) or simply extract all the files in the zip, you will see that it works on all Pi models.

I don't really care to carry on this discussion any further but I think if you spent less time lamenting the lack of documentation and a little more time finding out the answers (and there are others here who can help with those answers) then you might just learn something.
LdB wrote:Update: Your circle repo may have given me the answer to the problem they have a cute macro that is executed immediately at startup and it is even subject to a copyright notice to Russell King at the top of the file (https://github.com/rsta2/circle/blob/ma ... /startup.S).
That's the code that deals with exiting HYP mode and returning to SVC mode, something that has been discussed in this forum many times.

LdB
Posts: 1373
Joined: Wed Dec 07, 2016 2:29 pm

Re: Pi 3B+ Baremetal sample code

Tue Jan 17, 2017 8:16 am

http://www.dhgate.com/store/product/201 ... 29703.html
The store calls it a Raspberry Pi 3 Model B+ so I call it that, but apparently it doesn't exist ... yet I have one.

My article has the printout of the board serial number it reports, and the CPU is definitely a Cortex-A53 as I read the CPUID.

I will say the thing is however only labelled as a Raspberry Pi3B v1.2 on the board and my board looks exactly like it does on the RS website for the Raspberry Pi3 SBC.

I have ordered a Raspberry Pi3 SBC from RS which should be here in next few days to compare.
http://au.rs-online.com/web/p/processor ... 431027|alt

The thing that I purchased called a Pi3 Model B+ will not load your SD card image as on your website. Yes I just unzipped it and all I get is the coloured start screen. The image I uploaded showed the datetime stamps so you could see that is exactly what I did. I have no code involved in this and have nothing else on the SD card.

As you don't care to discuss this any further, and as I can't seem to convince you what I am seeing, I will leave it up to others to convince you if and when they run across the problem (which you don't have).

For my part there is one last thing I have just thought of, which is to try a different SD card manufacturer. I have 3 SD cards but they are all SanDisk Ultra 16GB, so it's something I will quickly check. It seems weird it would pick it up from 0x0 but not from 0x8000, but hey, I am running out of ideas.

Ultibo
Posts: 160
Joined: Wed Sep 30, 2015 10:29 am
Location: Australia
Contact: Website

Re: Pi 3B+ Baremetal sample code

Tue Jan 17, 2017 10:13 am

How about a different tack: since you say you've tried almost every piece of bare metal code on the internet and none of them work on your Pi3, maybe you could post a small example piece of code (including the linker script and compile instructions) that doesn't work for you, and others can test it to confirm (or deny) the results with their Pi3 boards.

LdB
Posts: 1373
Joined: Wed Dec 07, 2016 2:29 pm

Re: Pi 3B+ Baremetal sample code

Tue Jan 17, 2017 11:59 am

Let me see if I have this background correct

I have spent about 6 weeks on code and $750 on Pi boards to write articles on those Pi boards, for which I distribute the entire code for free, for no other reason than to help others. However I have gone to all the trouble to LIE about the Pi3 board because I __insert reason here ____.

You say Ultibo is built on Free Pascal, take a look at the 1st name on the credits for Free Pascal (http://www.freepascal.org/credits.var). Yes more free code I have donated to the public domain because I am a sucker for the ungrateful public.

Now I wouldn't ever use the Pi board commercially because of the documentation issue, and I am not going to waste further time and money dealing with this rubbish. I am happy picking the processor up from 0x0, which it does reliably every time. At the end of the day, what do I care about engaging the firmware bootloader to start at 0x8000? It doesn't give me anything extra I can't already do. The cost to pick the processor up at 0x0 is a couple of lines in a text file on a 16GB SD card .. I am sure that is a deal breaker to students etc out there trying to code for fun.

In all that response I am still looking for the part where what you think matters? I am pretty sure my reputation will survive what you make of me but the irony of you using FreePascal made me laugh.

However given your rather strenuous objection I will alter the article and say the Pi Community say there isn't a problem loading from 0x8000 but I had issues which may well be down to my code or the board I have but IDGAF ... end of story.

Hey I am probably lying, but for what it's worth your SD image does flick up and partially start like my SD code on these two boards, but it's always a partial start. It looks weird, like the cache is starting with junk in it or turned on in L2 mode or something, but what would I know. Anyhow one day I might solve the LIE but for now I will just ignore it and keep on trucking.

You are convinced your code is good on all Pi3 boards and I couldn't be happier for you. I have submitted a disclaimer in the article and it has updated and that is about where this trash ends for me.

Now I think the discussion really is over .. before this gets heated. I am not sure I will tolerate being called a liar for a third time.

dwelch67
Posts: 961
Joined: Sat May 26, 2012 5:32 pm

Re: Pi 3B+ Baremetal sample code

Tue Jan 17, 2017 9:47 pm

Saw the article, not sure what the flamewar is really about. Didnt realize they added kernel8 this and that, so I have to go back and play, they do this every so often, thanks for including that.

In case this isnt already covered, there is a huge difference between booting from 0x0000 and 0x8000. At 0x8000 the gpu arm loader has placed code that isolates the cores (on a pi2 and 3) so that only one goes through (to 0x8000) the other three are spinning in a loop watching a register (they each have their own) waiting for a non-zero address to show up so they can start running at that address. You also have the issue which may (must) be different now with this kernel8 64 and 32 bit img thing. That startup code the gpu places for the arm will put the arm in a certain mode, like leave it in 64 bit but perhaps put it in HYP mode or switch to 32 bit HYP or whatever.

0x8000 is the best place to start for a beginner on the pi2 or pi3 as you dont have to figure out how to sort the cores out and you dont have the four of them running on top of each other fighting over whatever peripheral you are trying to use (uart, etc).

Once you figure out or borrow code from the pi folks (just dump the beginning of memory and disassemble) or one of us here, to sort out the cores. Then start booting from 0x0000 and/or form your own personal opinion. If you take someones code and just build and use it you have to load it at the address it was designed and/or linked for otherwise it just wont work. You can be a little position independent but I dont think completely, not without extra work and why bother with that extra work, doesnt gain you anything.

I will say that providing a big package of already done code be it a library or an environment is not really bare metal so just taking something and trying it is not a bare metal experience, taking it and reading it and understanding it and owning it is. 100 lines of code is a lot easier to read and then own than 10000 or 100000, so shop around there are a number of folks with bare metal examples. Language doesnt matter pascal, C are both good bare metal high level languages, easy to read. I am very happy to see Pascal staying alive and those tools being usable for bare metal. Probably an easier language to learn and use if coming at this from nothing, both are worth learning, but are just a means to peek and poke at addresses where peripherals live which I would argue is the reason for bare metal, not to make library calls.

So not really sure what the issue is here...

LdB
Posts: 1373
Joined: Wed Dec 07, 2016 2:29 pm

Re: Pi 3B+ Baremetal sample code

Tue Jan 17, 2017 11:35 pm

Hey David, it was you or Peter Lemon I really needed to talk to, as I have sort of identified the problem by pure accident last night while trying to get it to boot from USB, now that I have the USB running. Can you tell me anything about the dt overlay files with the changes to the latest boot loader and the start frequency?

Really what I need to establish is what the switch between using dt and atags is at a code level. It is almost like it looks to be getting a mix of both and takes what appears to be weird parameters and screws the cache on load up. So being specific I am trying to understand part 3 from this from a baremetal perspective given my kernel image won't have a DT-compatibility trailer (https://www.raspberrypi.org/documentati ... e.md#part3).

The second thing I would love to know is does the firmware loader turn the L2 cache on at any point?

I also noted the bit at the bottom on the documentation on https://github.com/raspberrypi/document ... /bootmodes
It almost sounds like they know a certain number or batch of Pi 3's will fail to boot is that related?

To be honest it took me less time to sort out the core access to the peripherals than this boot mess, as that one is clearly documented. The trap for young players is when you switch to 64 bit mode; that did take me a bit of working out.

I get the advantage the loader gives to all the various linux flavours, but I don't see any advantage it gives baremetal code just headaches. For example you talked about the cores 1,2,3 being parked waiting for an instruction. I assume that is done via the mailbox and is it documented anywhere? I am just trying to sort that out myself for a multicore version and would be interested. I have just completed a funny demo kernel.img that will run seamlessly on a Pi1, Pi2 & Pi3 as it stubs code in and out at runtime based upon the CPU it sees and the Peripheral IO address it detects that will go up in an article shortly which is rather cute. For most of us in the embedded market we are very much familiar with auto-detecting versions and dealing with them the idea of needing different "image" files is a bit of anathema to us. I would rather have a shadow rom to hold last image if you are going to give me the choice of changing image files.

Update: Ignore mailbox question it's answered and works
viewtopic.php?f=72&t=98904&start=25

Anyhow I have ignored the boot problem for now as I have the other boot modes to play with, and it really doesn't affect my play code. It's nice to not be on a commercial footing and having to solve problems for a change, I am getting too old for that rubbish :-)

AlfredJingle
Posts: 69
Joined: Thu Mar 03, 2016 10:43 pm

Re: Pi 3B+ Baremetal sample code

Wed Jan 18, 2017 5:36 pm

The discussion made me curious!

So I downloaded Ultibo, put the files on an empty SD card and booted a Raspi 3 with it. Worked first time! (and very nicely too!) The same for the examples from dwelch (which I knew would run anyhow as I use his bootloaders on a daily basis on a raspi 3).

@LdB: maybe you can post some of your code which does not boot on a Raspi 3. I am more than willing to look at it. Including the config.txt please, as that is where most of my problems with booting stem from in the past.
going from a 6502 on an Oric-1 to an ARMv8 is quite a big step...

dwelch67
Posts: 961
Joined: Sat May 26, 2012 5:32 pm

Re: Pi 3B+ Baremetal sample code

Thu Jan 19, 2017 2:51 am

Well I thought I was catching up tonight

https://github.com/dwelch67/raspberrypi ... i3/imgtest

but reading your reply sounds like I am another two steps behind. Have not messed with the boot over other media or anything like that, and I guess maybe I am lucky that my pi3 just boots every time with the sd card I have, so have not run into this needing a delay or anything like that. Can now run without a config.txt and get aarch64 or aarch32.

As to starting with the cache on, I have some commented out code in one of my bootstraps

;@ stop caching !!!
;@ mrc p15,0,r2,c1,c0,0
;@ bic r2,#0x1000
;@ bic r2,#0x0004
;@ mcr p15,0,r2,c1,c0,0

but I commented it out, so what does that mean? It was a bootloader so if there was caching I would definitely need it. should be easy enough to just read and print to see if they are leaving it in that mode.
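
As a rough illustration of the "read and print" check: the two bic masks in the commented-out code clear bit 12 (instruction cache enable) and bit 2 (data/unified cache enable) of the control register that the mrc reads. A minimal C sketch that decodes a value already read from that register; the helper names are made up for illustration, and getting the value on the Pi still needs the mrc itself (or inline assembly).

```c
#include <stdint.h>

/* Bit positions in the AArch32 system control register (SCTLR),
   the register accessed by "mrc p15,0,r2,c1,c0,0". */
#define SCTLR_M (1u << 0)    /* MMU enable */
#define SCTLR_C (1u << 2)    /* data/unified cache enable: the 0x0004 mask */
#define SCTLR_I (1u << 12)   /* instruction cache enable: the 0x1000 mask */

/* Hypothetical helper: returns 1 if either cache enable bit is set. */
int sctlr_caches_enabled(uint32_t sctlr)
{
    return (sctlr & (SCTLR_C | SCTLR_I)) != 0;
}

/* Hypothetical helper: same effect as the two bic instructions. */
uint32_t sctlr_clear_cache_bits(uint32_t sctlr)
{
    return sctlr & ~(SCTLR_C | SCTLR_I);
}
```

Printing the raw value over the uart together with the result of sctlr_caches_enabled() at the very start of the image would show what state the loader leaves things in.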

There are some Broadcom docs that describe the mailbox registers and where they are, but as far as I remember they do not show that the other three cores are polling them or what they are looking for. Ultibo and other folks as well as myself dumped and disassembled the code they place at 0x0000 when allowing them to boot then jump to 0x8000 (basically the without config.txt setup), and in there you can see the other cores being sent to poll their mailboxes, looking for a non-zero address. Playing with that feature seems to work: you can get the other cores to come up and do something independent of each other.
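
A minimal sketch of that polling in C, run against an ordinary variable rather than the hardware. The mailbox 3 addresses are what I believe the BCM2836 ARM-local peripherals document gives, so treat them as an assumption to check; the function names are made up for illustration, and a real parked core spins forever rather than giving up after max_polls.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed register layout: each core has four mailboxes, 0x10 apart per core,
   and the boot stub is understood to watch mailbox 3. */
#define CORE_MBOX3_SET_BASE   0x4000008Cu  /* core 0, write-to-set */
#define CORE_MBOX3_RDCLR_BASE 0x400000CCu  /* core 0, read / write-to-clear */
#define CORE_MBOX_STRIDE      0x10u

/* Address core 0 would write a start address to in order to release `core`. */
uint32_t mbox3_set_addr(unsigned core)
{
    assert(core < 4);
    return CORE_MBOX3_SET_BASE + CORE_MBOX_STRIDE * core;
}

/* Model of a parked core: poll the mailbox until it holds a non-zero value,
   clear it, and return the value (the address the core would jump to).
   Returns 0 if nothing showed up within max_polls reads. */
uint32_t parked_core_poll(volatile uint32_t *mbox, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        uint32_t entry = *mbox;
        if (entry != 0) {
            *mbox = 0;   /* on hardware this is a write to the read/clear register */
            return entry;
        }
    }
    return 0;
}
```

Under those assumptions, releasing core 1 means writing the entry address to mbox3_set_addr(1), which works out to 0x4000009C.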

The interesting thing I also learned tonight is that before, we didnt have a native 64 bit non-zero address boot solution from Broadcom/raspberrypi.org. Now that we do, it appears for kernel8.img they are booting to 0x80000 and for the 32 bit modes 0x8000, that or I have a bug.

Code: Select all

.globl GETPC
GETPC:
    mov x0,x30
    ret
As far as atags vs device tree, I dont use them at all so have not needed to care. I would assume that, if it is possible, it would be yet another config.txt entry that you would have to research. Perhaps they dropped ATAGS or perhaps you can ask them to add an entry if they dont have one already, perhaps kernel.img does ATAGS and the others do device tree. I dont know, dont use them. Like anything baremetal, gotta research and/or hack your way through it...

David

Gavinmc42
Posts: 4240
Joined: Wed Aug 28, 2013 3:31 am

Re: Pi 3B+ Baremetal sample code

Thu Jan 19, 2017 4:48 am

Ouch, makes me glad I found Ultibo and saw that it was good.
No need for barearsemetal, whew.

Mostly past the age of reading manuals to look at registers, would rather make code that lets Pi's do stuff.

There is baremetal for learning how the Pi works and doing everything from nothing.
Then there is using a tool that makes code with no need for an OS.

Who made the first blacksmith hammer, tongs, anvil?
You can go get all the tools and start making things or you can start with lumps of metal and tongs made from wet sticks and make your first metal tongs. Er bit hard to do without the first hammer, rock on a stick?

I know how micros work, programmed them in hex from a hex keyboard, just after the toggle switch era.
Change this bit in that register..... yes I am grumpy old man too :lol:

If LdB wants to show everyone how to make hammers, tongs and anvils that's fine with me.
But there is no need to whinge about how hard it is, others have been there and done that.

With Ultibo I get the anvil, tongs and hammer plus I can get other FPC tools that mostly just work.
Still working in metal and making things and I have been for nearly a year :P
Saying you cannot get the tool working when others have no problems means YOU are doing something wrong.
Or you just got unlucky and got one of those edge case Pi's.

Bleeding edge man, stick a bandaid on it and move on.
I'm dancing on Rainbows.
Raspberries are not Apples or Oranges

LdB
Posts: 1373
Joined: Wed Dec 07, 2016 2:29 pm

Re: Pi 3B+ Baremetal sample code

Wed Jan 25, 2017 6:04 pm

David the Pi3 boot sequence is still doing my head in.

So I now have 5 Pi3B's and of the 5 only 2 will reliably start at 0x8000. It's looking like an SD card issue, because if I use one particular SD card one more starts.

Update: Got more working by playing with delay

This leaves me still pondering why only the 0x8000 boot is affected by SD card issue the 0x0000 boot never fails?????

dwelch67
Posts: 961
Joined: Sat May 26, 2012 5:32 pm

Re: Pi 3B+ Baremetal sample code

Thu Jan 26, 2017 1:52 am

yeah, that is a head scratcher, perhaps all the sd cards you have are marginal with respect to the pi. Is there something in the config.txt that helps boot from 0x0000? It is strange that all work from zero, implying that they can all successfully use whatever sd card you have, navigate through the filesystem, find config.txt in the directory, then find the sector or sectors that make it up and not slip a bit anywhere. But by the time they want to read kernelx.img, they cant do that?

I have not researched what the problem is or perhaps they have not really described it in enough detail. It just seems like a broken product that the pi folks should take returns on and replace with working product.

Oh, and to even get to config.txt they have to have read bootcode.bin and start.elf without error (well I assume config.txt comes after those). If the products from my day job had this problem, these units would be coming back and get replaced, and/or we would very clearly have documented exactly what parts work, which for sd cards in a pi is a bit silly but in other businesses, it may be the norm that you have to pick from the list.

So trying to understand again what config.txt (is it config.txt) entry made your stuff work? I see these on your blog

kernel_old=1
disable_commandline_tags=1
disable_overscan=1
framebuffer_swap=0

I cant see how any of those have anything to do with why copying kernelx.img to 0x0000 or 0x8000 and adding some bootstrap code makes a difference. If you use a real kernelx.img from the raspberry pi folks, one of the linux ones, does it work on these pi3 cards?

I think I only have two pi3's and they were bought right when they came out, and have done bare metal on both. Or maybe I only have one but it has always worked for me obviously.

I have had lots of problems with the original pis and sd cards, not the brand or make, but the formatting. And if I was not super careful with syncing and unmounting, the pi would just not boot; do a format of the card, try again, success. So I dont really know what was going on there, either my habits got better or perhaps their bootcode had a problem with maybe fragmentation on the filesystem or who knows...Maybe if you reformat the cards, that might help? Actually if you have a card that works in one and not in the others then it is probably not this problem either.

hmmm...

LdB
Posts: 1373
Joined: Wed Dec 07, 2016 2:29 pm

Re: Pi 3B+ Baremetal sample code

Thu Jan 26, 2017 4:36 am

Ok I finally got an SD card that works in all the 5 PI's, so it is most definitely an SD card issue causing the problem.

It is marked as UHS U3 speed rating while my other SD cards are all marked as Class 10. Did a quick search and it would appear the UHS U3 card is a faster card.

I don't understand why a copy to 0x0000 would be better than a copy to 0x8000 other than if it's some alignment issue to the write process.

My smart start test image is identical for both starts, I mean I don't even recompile it .. It is physically the same code. The first bit of code looks at the PC address it's been loaded to and then starts copying blocks of code (all based on the start PC) manually to the new fixed memory positions that the code is designed to run at. So it's like a mini boot loader stub: it doesn't care if it starts at 0x0000 or 0x8000 or any other address, it boots the same, which was the intention, hence its name smartstart. In my case it is reading the exact same image file, so the problem has to be in the write process, which is all that is really different for me.
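
The idea of one image that works out where it was loaded and then moves itself can be sketched in C. This is not the smartstart source, just an illustration with made-up names (load_offset, relocate_image), run here against ordinary buffers. On the Pi the first few instructions would have to be position-independent assembly, since nothing linked for a fixed address can safely be called before the copy.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: signed distance between the address the image was
   actually loaded at and the address it was linked to run at. */
int64_t load_offset(uintptr_t runtime_start, uintptr_t linked_start)
{
    return (int64_t)runtime_start - (int64_t)linked_start;
}

/* Hypothetical helper: copy `size` bytes of the image to the position it was
   linked for, if it is not already there. memmove is used because an image
   larger than 32 KB overlaps itself when moving between 0x0000 and 0x8000.
   Returns 1 if a copy was made, 0 if the image was already in place. */
int relocate_image(void *linked_dst, const void *loaded_src, size_t size)
{
    if (linked_dst == loaded_src)
        return 0;
    memmove(linked_dst, loaded_src, size);
    return 1;
}
```

An image linked for 0x8000 but loaded at 0x0000 sees load_offset() return -0x8000 and copies itself upward before jumping to the linked entry point.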

Update: Found another quirky thing the bootloader plays with which is reversing the Red and Blue channels on the screen. You can control it with a line in config.txt but it's strange they do it. Now my smart image will auto boot I couldn't get the colours right for both and I thought it was a bug and was scratching my head. I am supposed to go and read the order it wants RGB in apparently so another thing to do. They cover it here
viewtopic.php?t=22802&p=214478

I was tearing my hair out thinking there was something still wrong with the loader even on my new SD card. I laughed and sighed relief when I found it out but a real gotcha again.

ChattanoogaDave
Posts: 4
Joined: Tue Dec 13, 2016 10:52 am

Re: Pi 3B+ Baremetal sample code

Fri Jan 27, 2017 3:57 pm

Hi all,

I am quite new to the entire topic of bare-metal programming, so this is probably a stupid question.

I define a storage at 0x8000000 (128 MB) and make it 1 MB long, i.e. 8 Mbit. This storage is first cleared (zeroed) and then I read out the GPIO 25 status to fill in one bit at a time (capture) while the timer (0x3f003004) is running. You can see the code below. I compiled with arm-none-eabi on a Pi 3 B.

The problem which I came across: I realized by switching another pin on when this sequence starts, and off when it is finished, that it takes around 10 s. I was rather expecting a higher speed and would like to know any tips how to make this specifically for ARM since I only started the whole thing about 3 weeks ago.

Thanks for all helpful hints to increase the speed.
Best wishes
Dave

Code: Select all

ldr r9, =1				@blink pin 21 HIGH
lsl r9, #21
str r9, [r12, #28]
@		counts from PCM
ldr r7, =0x8000000		@record data to address 128 MB
ldr r3, =1000000		@measurement size 1 MB
ldr r8, =0				@r8 always counter register
ldr r0, =0
zero_start:			@clear entire storage
	str r0, [r7, r8]
	add r8, #4
	cmp r8, r3
bne zero_start

ldr r3, =8000000		@measurement size in bit
ldr r8, =0
ldr r4, =0x3f003004		@system timer counter, low word
ldr r4, [r4]			@read the counter value (START)

capture_start:
	ldr r0, [r12, #52]	@read the pin levels
	lsr r0, #25		@GPIO 25
	and r0, #1

	and r5, r8, #31		@position within a Dword
	lsl r0, r5

	lsr r5, r8, #5		@new 32-bit-register
	lsl r5, #2
	ldr r1, [r7, r5]
	orr r1, r0, r1		@store new bit into the row
	str r1, [r7, r5]

	add r8, #1
	cmp r3, r8
bne capture_start

@store time difference in front of the count package
ldr r5, =0x3f003004		@system timer counter, low word
ldr r5, [r5]			@read the counter value (END)
sub r5, r4
ldr r4, =0x7FFFFFC		@data base-address
str r5, [r4]
ldr r9, =1				@blink pin 21 LOW
lsl r9, #21
str r9, [r12, #40]
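
In C terms, the bit packing in the capture loop above does roughly the following. This is only a sketch for readers who find the assembly hard to follow; pack_bit and gpio_level are made-up names and it runs on a plain array rather than the GPIO registers.

```c
#include <stdint.h>

/* Extract the level of one pin from a value read at GPIO base + 52. */
uint32_t gpio_level(uint32_t levels, unsigned pin)
{
    return (levels >> pin) & 1u;          /* lsr r0, #25 / and r0, #1 */
}

/* Store one sampled bit at position `index` of a packed bit buffer. */
void pack_bit(uint32_t *buf, uint32_t index, uint32_t level)
{
    uint32_t bit_in_word = index & 31u;   /* and r5, r8, #31 */
    uint32_t word = index >> 5;           /* lsr r5, r8, #5; the lsl #2 that
                                             follows makes it a byte offset */
    buf[word] |= (level & 1u) << bit_in_word;   /* ldr / orr / str */
}
```

Every sample costs a read-modify-write of memory on top of the GPIO read, which is part of why the loop is slower than hoped.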

dwelch67
Posts: 961
Joined: Sat May 26, 2012 5:32 pm

Re: Pi 3B+ Baremetal sample code

Fri Jan 27, 2017 9:06 pm

1MB is 2^20 bytes or 1048576 bytes.

You are writing 4 bytes per loop, which is going to take a while, with much of your time spent fetching instructions, etc. Try something like this and see if it changes things.

Code: Select all

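mov r0,#0x08000000      @ start of the buffer (128 MB)
mov r1,#0x10000         @ 0x10000 iterations * 16 bytes = 1 MB
mov r2,#0
mov r3,#0
mov r4,#0
mov r5,#0
top:
   stmia r0!,{r2,r3,r4,r5}   @ store four zeroed registers, advance r0 by 16
   subs r1,r1,#1
   bne top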
And turn on the instruction cache, dram is really slow in general and certainly for a loop like this.

dwelch67
Posts: 961
Joined: Sat May 26, 2012 5:32 pm

Re: Pi 3B+ Baremetal sample code

Fri Jan 27, 2017 9:11 pm

with your loop you were doing 32 bit writes per transaction so when it gets to the memory which I assume is 64 bits, it has to do a 64 bit read, modify 32 bits then write them back.

Even something like this may help quite a bit.

Code: Select all

mov r0,#0x08000000
mov r1,#0x20000
mov r2,#0
mov r3,#0
top:
    strd r2,[r0],#8
    subs r1,r1,#1
    bne top
64 bit aligned writes per transaction rather than 32. in theory more than twice as fast, perhaps 4 times as fast.
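
The loop counts in these two replies follow from simple arithmetic: the 0x100000 byte buffer divided by the bytes written per iteration. A small C check, with a made-up helper name:

```c
#include <assert.h>
#include <stdint.h>

#define BUFFER_BYTES 0x100000u   /* 1 MB = 2^20 bytes */

/* Hypothetical helper: number of loop iterations needed to cover total_bytes
   when each iteration stores bytes_per_store bytes. */
uint32_t iterations_needed(uint32_t total_bytes, uint32_t bytes_per_store)
{
    assert(bytes_per_store != 0 && total_bytes % bytes_per_store == 0);
    return total_bytes / bytes_per_store;
}
```

iterations_needed(BUFFER_BYTES, 16) gives 0x10000 for the four-register store and iterations_needed(BUFFER_BYTES, 8) gives 0x20000 for strd, while the original 4-byte str loop over 1000000 bytes runs 250000 times.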

ChattanoogaDave
Posts: 4
Joined: Tue Dec 13, 2016 10:52 am

Re: Pi 3B+ Baremetal sample code

Wed Feb 01, 2017 9:32 am

Thanks a lot, guys.
I've adapted the "1 MB" to a hexadecimal 0x100000

By help of a friend, we made the "capture loop" a bit shorter the last days and it gathered points from a function generator (8 kHz input to GPIO 25) with a frequency of around 1.1 MHz (before: 730 kHz). Here it is (r12 = GPIO Base address, r7 = storage target base, r8 = counter register, r3 = storage target size):

Code: Select all

capture_start:
@obtain HI/LO from A (GPIO 25) store to r7++
	ldr r0, [r12, #52]	
	ubfx r0,r0, #25, #1
	strb r0, [r7, r8]
	add r8, #1
	cmp r3, r8
bne capture_start
The old one is in last week's post if you scroll down my code quote. We changed from a bit-to-bit write to a byte-to-byte. That decreased our entire measurement from 8E6 points to 1E6 - fine words butter no parsnips.
Frankly, we expected a bit more improvement (maybe around 5 MHz). If anybody would kindly share experience with me, I'd be more than happy.

Do you think, it's good to avoid instructions like "strb" which eats two cycle counts* and also accesses the storage maybe? What I'm going to try next is to count the reads into a byte (for example). Something like

Code: Select all

mov r8, #0
capture_start:
	mov r2, #0			@restart the read counter for each stored byte
	ldr r0, [r12, #52]		@GPLEV0
	ubfx r0, r0, #25, #1	@GPIO 25
	count_up:			@add the next 255 reads to r0
		ldr r1, [r12, #52]
		ubfx r1, r1, #25, #1
		add r0, r0, r1
		add r2, #1
		cmp r2, #255		@1 Byte. Can also be smaller (a sum of 256 would wrap)
	bne count_up
	strb r0, [r7, r8]
	add r8, #1
	cmp r3, r8
bne capture_start
* http://infocenter.arm.com/help/index.js ... BCJII.html

LdB
Posts: 1373
Joined: Wed Dec 07, 2016 2:29 pm

Re: Pi 3B+ Baremetal sample code

Wed Feb 01, 2017 2:18 pm

@David Welch

David, do you have a contact at Pi HQ you could ask if this problem has been fixed? It says it's still open.
https://github.com/raspberrypi/firmware/issues/577

I started playing with the register because in secure AArch32, AArch64 mode if you write to it, then it raises an EL-3 fault. I was playing around checking I could detect it, trying reading and writing to it in different modes. Let's just say it does some interesting things if you mis-write the register or use the reserved bits; they seem to have undocumented functions.

The Cortex-A53 manual, Section 4.5.32, Non-Secure Access Control Register, describes everything correctly. Just don't break the rules, or prepare for fun, as the reserved bits seem to actually do things.

Ultibo
Posts: 160
Joined: Wed Sep 30, 2015 10:29 am
Location: Australia
Contact: Website

Re: Pi 3B+ Baremetal sample code

Thu Feb 02, 2017 9:33 am

LdB wrote:do you have a contact at Pi HQ you could ask if this problem has been fixed
The ARM boot stubs were open sourced in April 2016 so you can check the code yourself to see if the problem still exists.

The 3 different variants are here:

https://github.com/raspberrypi/tools/tr ... r/armstubs

LdB
Posts: 1373
Joined: Wed Dec 07, 2016 2:29 pm

Re: Pi 3B+ Baremetal sample code

Thu Feb 02, 2017 3:08 pm

Cheers for that. I haven't got my head around the politics and detail supply lines in Pi Land. If that is what they are using, it's writing 0x63fff, and now I get the behaviour. Time to see if I can follow the instructions and build my own fix.

Oh sweet, I see I can also set my own special kernel entry address in that code. That will solve something that has been annoying the heck out of me in trying to host an AArch32 kernel inside an AArch64 kernel. Thank you very much for that.

I joined your site the other night. I am hoping to have a present for you next week, depending on work load and getting this new article finished. I was having some fun with old Pascal code from my archive libraries. I enjoyed your system, it was a lot of fun, although I am not sold on that yellow. Couldn't we go for a calming blue? :-)

On a side note, can I ask whether you have noticed how hot Ultibo gets the CPU on a Pi3 board? I know it's not your fault. I get what it's doing, and if I run any similar task switcher it heats up the same. Several I tried were much worse. I had to throw one of those glue-on heatsinks on it as it was alarming me. I haven't looked much at the temperature and voltage protection stuff on the Pi3 yet to see if my fears were unfounded and it would look after itself. You sort of end up with your thumb on the CPU when you slide the SD card in and out, and it was very hot on my thumb and shocked me the first time.

dwelch67
Posts: 961
Joined: Sat May 26, 2012 5:32 pm

Re: Pi 3B+ Baremetal sample code

Thu Feb 02, 2017 10:15 pm

First, I have no more special connections to the Pi folks than you do. I am in no way, shape or form connected to ARM or Broadcom or anyone related to anything Pi. I just bought one and started playing, and now have almost one of every flavor.

Second, realistically, even with commercial parts like these, if it is not melting your fingerprint off or boiling your skin, it really isn't even remotely hot and likely doesn't need a heat sink. I wouldn't be surprised if they can handle a junction temperature of 80-100C somewhere in there, so the case temperature would need to be a little lower than that. I have touched heat sinks that are 80+ and it wasn't fun; the chip in that case was fine, as it can operate up to a 105C junction temperature. Chips are screened at near their max temperature. Sure, running much lower extends the life, can use less power, can allow you to overclock, etc., so it shouldn't hurt to put a heat sink on so long as it doesn't short things out, but I can't see needing it. If you can hold your finger on it indefinitely (careful not to zap it), it is cold from a chip perspective.

Third, you want to do copies and such aligned and in as large a size per transfer as you can stand. Byte at a time is absolutely the worst for performance. For this platform, try for at least 64 bits at a time, or multiples of that, aligned on 64-bit boundaries. In general, for 32-bit platforms you are not saving anything by using char or short sized variables in your code; they cost you more than they save you. Aim for register size or bus size unless you have a specific reason not to; it doesn't matter if you are counting to 10 or counting to a million. I didn't analyze your code. I hope you understood what mine was doing, though: I was just trying to show aligned larger transfers. I don't know about this core (or I guess we have at least three cores on the Pi platforms), but I know another ARM core pretty well, and despite what the instruction is, I think the writes are all converted into 64 bits at a time, so even an 8-register write in a single stm instruction gets turned into multiple AXI bus cycles, but a large read is taken as is. We were trying to validate our AXI interface and couldn't get the ARM to produce all the transfer lengths...
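To make the "aligned and as large as you can stand" point concrete, here is a minimal C sketch of a copy that moves one 64-bit word per iteration instead of one byte. The function name `copy_words64` is made up for illustration, both pointers are assumed to be 8-byte aligned, and whether the compiler and bus actually issue 64-bit transfers on a given Pi core is not something this sketch demonstrates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy n_words 64-bit words from src to dst.
 * Assumption: both pointers are 8-byte aligned and the regions do not overlap.
 * Each iteration moves 8 bytes, rather than 1 byte as a char-based loop would. */
void copy_words64(uint64_t *dst, const uint64_t *src, size_t n_words)
{
    for (size_t i = 0; i < n_words; ++i) {
        dst[i] = src[i];
    }
}
```

A byte-sized tail (a length that is not a multiple of 8) would need a separate short loop; it is left out here to keep the sketch focused on the wide transfers.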

Are you making a logic analyzer of some sort? Maybe you said this and I missed it, and maybe this is what your code is trying to do, but you could gather up 32 or 64 or 128 bits at a time and then do one write (instruction), which should greatly improve performance. Or, if you are gathering 8 bits of something per loop, you could still go 4 or 8 loops collecting the data in a register or two and then do a 32 or 64 bit write. What I would do, and maybe you did this, is see how fast you can poll, doing nothing else but the polling loop of the GPIO. Then separately see how fast you can save data using various sizes. If the performance already fails at that point, you are done: move to another platform. Then put the two together. It is possible that you can approach the speed of the slower one, as you are talking to two different things: reading from a peripheral and writing to RAM. If the combined loop takes more than the slow plus the fast plus some margin for other code, that is possible but worthy of investigation or discussion.
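The gather-then-write idea can be sketched in C roughly as follows. This is only a sketch under assumptions: `capture_packed`, `gplev0` and `GPIO_PIN` are hypothetical names, the pointer is assumed to address the GPLEV0 level register (the `#52` offset used in the assembly earlier in the thread), and pin 25 is taken from that same assembly. It makes no claim about the sample rate actually reached.

```c
#include <stddef.h>
#include <stdint.h>

#define GPIO_PIN 25u   /* pin sampled in the thread's assembly (assumption) */

/* Hypothetical helper: sample one GPIO pin and pack 32 samples per word.
 * gplev0  - assumed to point at the GPLEV0 register (GPIO base + 0x34)
 * out     - word-aligned capture buffer
 * n_words - number of 32-bit words to fill (32 samples each) */
void capture_packed(volatile const uint32_t *gplev0,
                    uint32_t *out, size_t n_words)
{
    for (size_t w = 0; w < n_words; ++w) {
        uint32_t acc = 0;
        for (unsigned bit = 0; bit < 32u; ++bit) {
            uint32_t level = (*gplev0 >> GPIO_PIN) & 1u;  /* one register read */
            acc |= level << bit;      /* earliest sample lands in bit 0 */
        }
        out[w] = acc;                 /* one 32-bit store per 32 samples */
    }
}
```

Compared with one `strb` per sample, this does one store for every 32 reads, so the RAM side of the loop is touched far less often. Measuring the polling loop and the store loop separately, as suggested above, would show whether that is where the time goes.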

LdB
Posts: 1373
Joined: Wed Dec 07, 2016 2:29 pm

Re: Pi 3B+ Baremetal sample code

Fri Feb 03, 2017 3:19 am

I have no comment on copying data; I will leave that for the other guy. You seem to have mixed the responses for both of us together.

Thank you for clarifying your position with Pi; somehow I got mixed signals that you had a relationship with Pi Foundation people. One of the things I have noticed around the Pi is the community, which is both good and bad. The bad is that it lacks a definitive authority for answers; it's often a bit gray where you go for simple authoritative answers.

Let me preface the next bit by saying I had a history of using the Pi3 for several weeks on code which was only using Core 0 and the Pi3 barely gets warm to touch ... hence my surprise and some alarm.

You seem to be going off about me suggesting the processor got surprisingly hot. So let's start by saying I thought the processor for the Pi3 was supposed to be from the ARM wearables(tm) range, and you couldn't wear that processor. I regularly play with Intel wearable CPUs, and they don't even get warm and run at similar speeds and specs. It would appear that somewhere between the marketing and the numbers I suffered dyslexia and confused the Cortex-A53 with the Cortex-A35 and the range that targets 125mW maximum draw. I know ARM makes ultra-low-power stuff and thought the Pi was using it, which sort of matched my experience with the single Core 0 programs ... my bad.

I know the problem: a good ultra-low-power CPU will set you back more than what you pay for the entire Pi3 board, and they couldn't make the price targets they are after. My comment was, and stands, that I was surprised it got so hot; I wasn't expecting it. I will ignore the rest about the temperatures CPUs can run at, because it is some sort of generalization you are attempting, but let's just agree that the temperature range a CPU is supposed to run at is the one on the data sheet. I have worked on everything from ceramic Xilinx SoC ARM9s that run in the leave-your-skin-on-top-of-the-CPU-if-you-touch-it range, down to CPUs that you can't tell from room temperature. The heat versus power trade-off has an inverse relationship to the price you pay for the CPU, and I get why they had to go for the cheaper silicon.

I can't imagine you wearing a Pi3 board, so it's probably not going to be a problem. Let's just leave it at that and agree it's normal for the Pi3 to get quite warm when you get all 4 cores doing some exercise on a task switcher.
