zeoneo
Posts: 13
Joined: Sun Sep 30, 2018 6:54 am

RPI 3b (not b+) Bootloader in 32 bit mode.

Wed Nov 28, 2018 6:47 pm

Hi guys,

I am trying the boot loader from https://github.com/mrvn/raspbootin on an RPi 3B. It didn't work, even after I modified the MMIO base address.

So I decided to try again using the UART0 code from https://wiki.osdev.org/Raspberry_Pi_Bare_Bones. That serial code works: I can send and receive data with it.

On the host computer (OS X) I am using raspbootcom from https://github.com/mrvn/raspbootin/tree ... aspbootcom.

My kernel8-32.img is 4120 bytes, which is 0x1018 in hex. When I send this size to the Raspberry Pi, it receives 0x1810 instead.

What could be the problem?

This is the RPi-side code that receives the kernel size:

Code: Select all

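	// Assemble the 32-bit size from four bytes, least significant byte first.
	// The casts keep the shifts unsigned (a plain << 24 on a promoted int can overflow).
	uint32_t size = (uint32_t)uart_getc();
	size |= (uint32_t)uart_getc() << 8;
	size |= (uint32_t)uart_getc() << 16;
	size |= (uint32_t)uart_getc() << 24;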




kernel.c

Code: Select all

#include <stddef.h>
#include <stdint.h>
 
extern void BRANCHTO ( unsigned int );

// Memory-Mapped I/O output
static inline void mmio_write(uint32_t reg, uint32_t data)
{
	*(volatile uint32_t*)reg = data;
}
 
// Memory-Mapped I/O input
static inline uint32_t mmio_read(uint32_t reg)
{
	return *(volatile uint32_t*)reg;
}
 
// Loop <delay> times in a way that the compiler won't optimize away
static inline void delay(int32_t count)
{
	asm volatile("__delay_%=: subs %[count], %[count], #1; bne __delay_%=\n"
		 : "=r"(count): [count]"0"(count) : "cc");
}
 
enum
{
    // The GPIO registers base address.
    GPIO_BASE = 0x3F200000, // for raspi2 & 3, 0x20200000 for raspi1
 
    // The offsets for each register.
 
    // Controls actuation of pull up/down to ALL GPIO pins.
    GPPUD = (GPIO_BASE + 0x94),
 
    // Controls actuation of pull up/down for specific GPIO pin.
    GPPUDCLK0 = (GPIO_BASE + 0x98),
 
    // The base address for UART.
    UART0_BASE = 0x3F201000, // for raspi2 & 3, 0x20201000 for raspi1
 
    // The offsets for each register for the UART.
    UART0_DR     = (UART0_BASE + 0x00),
    UART0_RSRECR = (UART0_BASE + 0x04),
    UART0_FR     = (UART0_BASE + 0x18),
    UART0_ILPR   = (UART0_BASE + 0x20),
    UART0_IBRD   = (UART0_BASE + 0x24),
    UART0_FBRD   = (UART0_BASE + 0x28),
    UART0_LCRH   = (UART0_BASE + 0x2C),
    UART0_CR     = (UART0_BASE + 0x30),
    UART0_IFLS   = (UART0_BASE + 0x34),
    UART0_IMSC   = (UART0_BASE + 0x38),
    UART0_RIS    = (UART0_BASE + 0x3C),
    UART0_MIS    = (UART0_BASE + 0x40),
    UART0_ICR    = (UART0_BASE + 0x44),
    UART0_DMACR  = (UART0_BASE + 0x48),
    UART0_ITCR   = (UART0_BASE + 0x80),
    UART0_ITIP   = (UART0_BASE + 0x84),
    UART0_ITOP   = (UART0_BASE + 0x88),
    UART0_TDR    = (UART0_BASE + 0x8C),
};
 
void uart_init()
{
	// Disable UART0.
	mmio_write(UART0_CR, 0x00000000);
	// Setup the GPIO pin 14 && 15.
 
	// Disable pull up/down for all GPIO pins & delay for 150 cycles.
	mmio_write(GPPUD, 0x00000000);
	delay(150);
 
	// Disable pull up/down for pin 14,15 & delay for 150 cycles.
	mmio_write(GPPUDCLK0, (1 << 14) | (1 << 15));
	delay(150);
 
	// Write 0 to GPPUDCLK0 to make it take effect.
	mmio_write(GPPUDCLK0, 0x00000000);
 
	// Clear pending interrupts.
	mmio_write(UART0_ICR, 0x7FF);
 
	// Set integer & fractional part of baud rate.
	// Divider = UART_CLOCK/(16 * Baud)
	// Fraction part register = (Fractional part * 64) + 0.5
	// UART_CLOCK = 3000000; Baud = 115200.
 
	// Divider = 3000000 / (16 * 115200) = 1.627 = ~1.
	mmio_write(UART0_IBRD, 1);
	// Fractional part register = (.627 * 64) + 0.5 = 40.6 = ~40.
	mmio_write(UART0_FBRD, 40);
 
	// Enable FIFO & 8 bit data transmission (1 stop bit, no parity).
	mmio_write(UART0_LCRH, (1 << 4) | (1 << 5) | (1 << 6));
 
	// Mask all interrupts.
	mmio_write(UART0_IMSC, (1 << 1) | (1 << 4) | (1 << 5) | (1 << 6) |
	                       (1 << 7) | (1 << 8) | (1 << 9) | (1 << 10));
 
	// Enable UART0, receive & transfer part of UART.
	mmio_write(UART0_CR, (1 << 0) | (1 << 8) | (1 << 9));
}
 
void uart_putc(unsigned char c)
{
	// Wait for UART to become ready to transmit.
	while ( mmio_read(UART0_FR) & (1 << 5) ) { }
	mmio_write(UART0_DR, c);
}
 
unsigned char uart_getc()
{
    // Wait for UART to have received something.
    while ( mmio_read(UART0_FR) & (1 << 4) ) { }
    return mmio_read(UART0_DR);
}
 
void uart_puts(const char* str)
{
	for (size_t i = 0; str[i] != '\0'; i ++)
		uart_putc((unsigned char)str[i]);
}

void hexstrings ( unsigned int d )
{
    //unsigned int ra;
    unsigned int rb;
    unsigned int rc;

    rb=32;
    while(1)
    {
        rb-=4;
        rc=(d>>rb)&0xF;
        if(rc>9) rc+=0x37; else rc+=0x30;
        uart_putc(rc);
        if(rb==0) break;
    }
    uart_putc(0x20);
}


#if defined(__cplusplus)
extern "C" /* Use C linkage for kernel_main. */
#endif
void kernel_main(uint32_t r0, uint32_t r1, uint32_t atags)
{
	// Declare as unused
	(void) r0;
	(void) r1;
	(void) atags;
 
	uart_init();

again:
	uart_puts("####################\r\n");
	uart_puts("Inside Bootloader\r\n");
	uart_puts("####################\r\n");

	uart_puts("Requesting kernel\r\n");

	hexstrings(0x1018);
	uart_putc(3);
	uart_putc(3);
	uart_putc(3);

	
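	// Read the kernel size: four bytes, least significant byte first.
	uint32_t size = (uint32_t)uart_getc();
	size |= (uint32_t)uart_getc() << 8;
	size |= (uint32_t)uart_getc() << 16;
	size |= (uint32_t)uart_getc() << 24;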


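	// Reject kernels that would overlap the relocated bootloader at 0x2000000.
	// Comparing against the remaining space avoids overflow in 0x8000 + size.
	if (size > 0x2000000 - 0x8000) {
		uart_puts("SE");
		goto again;
	} else {
		uart_puts("OK");
	}

	// Receive the kernel image byte by byte and store it at 0x8000.
	uint8_t *kernel = (uint8_t*)0x8000;
	while (size-- > 0) {
		// uart_putc('k');
		*kernel++ = uart_getc();
	}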


	unsigned int addr  = 0x8000;

	BRANCHTO(addr);

	uart_puts("KERNEL FAILED");

	while (1)
		uart_putc(uart_getc());
}

boot.S

Code: Select all

// To keep this in the first portion of the binary.
.section ".text.boot"
 
// Make the entry point (Start) global.
.globl Start
.globl branch_to_kernel
 
// Entry point for the kernel.
// r15 -> should begin execution at 0x8000.
// r0 -> 0x00000000
// r1 -> 0x00000C42
// r2 -> 0x00000100 - start of ATAGS
// preserve these registers as argument for kernel_main
Start:
	// Setup the stack.
	mov sp, #0x8000
 

 .relocate:
	// copy from r3 to r4.
	mov	r3, #0x8000
	ldr	r4, =_start
	ldr	r9, =_data_end

1:
	// Load multiple from r3, and store at r4.
	ldmia	r3!, {r5-r8}
	stmia	r4!, {r5-r8}

	// If we're still below _data_end, loop.
	cmp	r4, r9
	blo	1b


	// Clear out bss.
	ldr r4, =_bss_start
	ldr r9, =_bss_end
	mov r5, #0
	mov r6, #0
	mov r7, #0
	mov r8, #0
	b       2f
 
1:
	// store multiple at r4.
	stmia r4!, {r5-r8}
 
	// If we are still below bss_end, loop.
2:
	cmp r4, r9
	blo 1b
 
	// Call kernel_main
	ldr r3, =kernel_main
	blx r3
 
	// halt
halt:
	wfe
	b halt

.globl BRANCHTO
BRANCHTO:
    bx r0

.globl dummy
dummy:
    bx lr

linker.ld

Code: Select all

/* link-arm-eabi.ld - linker script for arm eabi */
/* Copyright (C) 2013 Goswin von Brederlow <goswin-v-b@web.de>

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/

ENTRY(Start)

SECTIONS
{
    /* Starts at LOADER_ADDR. */
    . = 0x2000000;
    _start = .;
    _text_start = .;
    .text : {
        KEEP(*(.text.boot))
        *(.text)
    }
    . = ALIGN(4096); /* align to page size */
    _text_end = .;
    _rodata_start = .;
    .rodata : {
        *(.rodata)
    }
    . = ALIGN(4096); /* align to page size */
    _rodata_end = .;
    _data_start = .;
    .data : {
        *(.data)
    }
    . = ALIGN(4096); /* align to page size */
    _data_end = .;
    _bss_start = .;
    .bss : {
        bss = .;
        *(.bss)
    }
    . = ALIGN(4096); /* align to page size */
    _bss_end = .;
    
    _end = .;
}




config.txt

Code: Select all

# For more options and information see
# http://rpf.io/configtxt
# Some settings may impact device functionality. See link above for details

# uncomment if you get no picture on HDMI for a default "safe" mode
#hdmi_safe=1

# uncomment this if your display has a black border of unused pixels visible
# and your display can output without overscan
#disable_overscan=1

# uncomment the following to adjust overscan. Use positive numbers if console
# goes off screen, and negative if there is too much border
#overscan_left=16
#overscan_right=16
#overscan_top=16
#overscan_bottom=16

# uncomment to force a console size. By default it will be display's size minus
# overscan.
#framebuffer_width=1280
#framebuffer_height=720

# uncomment if hdmi display is not detected and composite is being output
#hdmi_force_hotplug=1

# uncomment to force a specific HDMI mode (this will force VGA)
#hdmi_group=1
#hdmi_mode=1

# uncomment to force a HDMI mode rather than DVI. This can make audio work in
# DMT (computer monitor) modes
#hdmi_drive=2

# uncomment to increase signal to HDMI, if you have interference, blanking, or
# no display
#config_hdmi_boost=4

# uncomment for composite PAL
#sdtv_mode=2

#uncomment to overclock the arm. 700 MHz is the default.
#arm_freq=800

# Uncomment some or all of these to enable the optional hardware interfaces
#dtparam=i2c_arm=on
#dtparam=i2s=on
#dtparam=spi=on

# Uncomment this to enable the lirc-rpi module
#dtoverlay=lirc-rpi

# Additional overlays and parameters are documented /boot/overlays/README

# Enable audio (loads snd_bcm2835)
dtparam=audio=on

# UART settings
init_uart_clock=3000000
enable_uart=1
core_freq=250
dtoverlay=pi3-disable-bt

DavidS
Posts: 3794
Joined: Thu Dec 15, 2011 6:39 am
Location: USA

Re: RPI 3b (not b+) Bootloader in 32 bit mode.

Thu Nov 29, 2018 12:32 am

Looks like an endianness mismatch between the two systems. I'm not sure what CPU OS X computers use these days, but the bytes are being sent in reverse order. Correct the byte order on one side and you should not have a problem.
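
To make the byte-order point concrete, here is a minimal sketch (not code from raspbootin; load_le32 and the two test arrays are hypothetical names) that assembles a size the same way the Pi-side code does, least significant byte first, and can be tried on any host:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: combine four bytes, least significant byte first,
// mirroring the four uart_getc() calls in the Pi-side code.
static uint32_t load_le32(const uint8_t *b)
{
    assert(b != NULL);
    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}

// The size 0x1018 as it would appear on the wire in each byte order.
static const uint8_t size_lsb_first[4] = { 0x18, 0x10, 0x00, 0x00 };
static const uint8_t size_msb_first[4] = { 0x00, 0x00, 0x10, 0x18 };
```

load_le32(size_lsb_first) gives 0x00001018, while load_le32(size_msb_first) gives 0x18100000, so a sender that emits the most significant byte first is misread by this receiver.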
RPi = Way for me to have fun and save power.
100% Off Grid.
Household TTL Electricity Usage = 1.4KW/h per day.
500W Solar System, produces 2.8KW/h per day average.

zeoneo
Posts: 13
Joined: Sun Sep 30, 2018 6:54 am

Re: RPI 3b (not b+) Bootloader in 32 bit mode.

Thu Nov 29, 2018 6:20 pm

Thanks @DavidS.

While debugging, I found that the line below in the original source (raspbootcom.cc) was causing a compilation error:

Code: Select all

    size = htole32(off);
So I googled and found a fix that I didn't really understand: replace the line above with the following.

Code: Select all

    size = OSSwapBigToHostConstInt32(off);
That replacement was the root cause of the problem. On a little-endian host it swaps the bytes, turning the kernel size 0x1018 into 0x18100000. Commenting that line out and using the original size, i.e. the variable off, solved the issue.


P.S. I am using macOS Mojave (10.14.1) on an Intel i5 processor.

bzt
Posts: 311
Joined: Sat Oct 14, 2017 9:57 pm

Re: RPI 3b (not b+) Bootloader in 32 bit mode.

Thu Nov 29, 2018 6:53 pm

Hi,

About raspbootin: you shouldn't need to modify the MMIO address, as raspbootin detects it. The problem with its UART code is that you must set the UART clock first (on the RPi 3 it is no longer fixed as it was on older models; it depends on the GPU clock). You have to add a few lines to uart.cc to get it to work, see here. The OSDev wiki and your kernel.c make the same mistake:

Code: Select all

	// Divider = 3000000 / (16 * 115200) = 1.627 = ~1.
	mmio_write(UART0_IBRD, 1);
If you don't set UART_CLOCK explicitly to 3000000 with a mailbox property call, nothing guarantees that assumption, and your divider could be incorrect. That kind of error is really hard to debug: it works most of the time, but not always, and you won't understand why.
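
To show how much the divider depends on the clock, here is a rough sketch of the calculation from the comments in kernel.c. The helper name pl011_divisors is hypothetical, and the 48 MHz figure in the note below is only an assumed alternative clock for comparison, not a value taken from this thread:

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: compute the PL011 integer (IBRD) and fractional
// (FBRD) baud rate divisors for a given UART clock and baud rate.
// divisor = uart_clock / (16 * baud); FBRD holds the fraction in 1/64 steps.
static void pl011_divisors(uint32_t uart_clock, uint32_t baud,
                           uint32_t *ibrd, uint32_t *fbrd)
{
    assert(baud != 0 && ibrd != 0 && fbrd != 0);
    // divisor * 64, rounded to nearest:
    // (clock * 64) / (16 * baud) = (clock * 4) / baud.
    uint64_t scaled = ((uint64_t)uart_clock * 4 + baud / 2) / baud;
    *ibrd = (uint32_t)(scaled >> 6);   // integer part
    *fbrd = (uint32_t)(scaled & 0x3F); // fractional part, 6 bits
}
```

With a 3000000 Hz clock and 115200 baud this yields IBRD = 1 and FBRD = 40, the values hard-coded in kernel.c. With an assumed 48000000 Hz clock the same baud rate needs IBRD = 26 and FBRD = 3, so the hard-coded pair would produce a wrong baud rate.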

You could use the mini UART (AUX) instead, as its clock is fixed. As far as I can tell, this has something to do with the new Bluetooth chip, which is also connected there, so the designers moved the fixed-clock option from the PL011 to the mini UART (but correct me if I'm wrong about this).

About the 0x1018 vs. 0x1810 problem: your RPi code reads the value as little endian. You should not need to do anything on the PC side, as Intel chips are little endian too. Simply

Code: Select all

size = (uint32_t)off;
should be enough. To write portable C code, you can use the htole32() macro, which is declared in "endian.h" on glibc systems (it is a widespread extension, not part of the ISO C standard). If the header is not included, you will get the compilation error you saw. I suppose it's not included by default on Mac OS X, so you have to add

Code: Select all

#include <endian.h>
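
If that header is not available, one alternative is to write the four bytes out explicitly. This is only a sketch (store_le32 is a hypothetical helper, not part of raspbootcom), but because it shifts the value rather than reinterpreting memory, it gives the same bytes on little- and big-endian hosts:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: write a 32-bit value into out[0..3], least
// significant byte first. Shifts operate on the value, not on memory,
// so the result does not depend on the host's byte order.
static void store_le32(uint8_t *out, uint32_t value)
{
    assert(out != NULL);
    out[0] = (uint8_t)(value & 0xFF);
    out[1] = (uint8_t)((value >> 8) & 0xFF);
    out[2] = (uint8_t)((value >> 16) & 0xFF);
    out[3] = (uint8_t)((value >> 24) & 0xFF);
}
```

For a size of 0x1018 the buffer ends up as 18 10 00 00, which is the order the RPi-side code expects. The host would then write these four bytes to the serial port instead of the raw bytes of the size variable.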
Cheers,
bzt

zeoneo
Posts: 13
Joined: Sun Sep 30, 2018 6:54 am

Re: RPI 3b (not b+) Bootloader in 32 bit mode.

Fri Nov 30, 2018 7:12 pm

Hi bzt,

I think it's the other way around: the mini UART (AUX UART) has the variable-frequency problem, and the PL011 (UART0) doesn't. Please refer to the mini UART section of https://www.raspberrypi.org/documentati ... on/uart.md.

Thanks,
zeo

bzt
Posts: 311
Joined: Sat Oct 14, 2017 9:57 pm

Re: RPI 3b (not b+) Bootloader in 32 bit mode.

Sat Dec 01, 2018 12:55 pm

zeoneo wrote:
Fri Nov 30, 2018 7:12 pm
Hi bzt,

I think it's the other way around: the mini UART (AUX UART) has the variable-frequency problem, and the PL011 (UART0) doesn't. Please refer to the mini UART section of https://www.raspberrypi.org/documentati ... on/uart.md.

Thanks,
zeo
That's true on older models. On my RPi 3 it's the other way around; I suppose it has something to do with the new Bluetooth. Others have reported the same UART0 clock problems, for example in the issue I linked. Setting the UART clock will fix that, so just give it a try.

Cheers,
bzt
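
For reference, a rough sketch of what the request buffer for such a mailbox property call could look like. The tag value, clock id and word layout are written from my reading of the firmware's mailbox property interface and should be treated as assumptions to check against that documentation; build_set_uart_clock is a hypothetical name, and actually sending the buffer through the mailbox is left out:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Assumed values from the firmware mailbox property interface.
#define TAG_SET_CLOCK_RATE 0x00038002u
#define CLOCK_ID_UART      2u

// Hypothetical helper: fill a 9-word buffer with a "set clock rate"
// request for the UART clock. On real hardware the buffer must be
// 16-byte aligned before its address is written to the mailbox.
static void build_set_uart_clock(volatile uint32_t *msg, uint32_t rate_hz)
{
    assert(msg != NULL);
    msg[0] = 9 * 4;               // total buffer size in bytes
    msg[1] = 0;                   // request code
    msg[2] = TAG_SET_CLOCK_RATE;  // tag identifier
    msg[3] = 12;                  // value buffer size in bytes
    msg[4] = 0;                   // tag request indicator
    msg[5] = CLOCK_ID_UART;       // clock id
    msg[6] = rate_hz;             // requested rate in Hz
    msg[7] = 0;                   // 0 = do not skip setting turbo
    msg[8] = 0;                   // end tag
}
```

Calling build_set_uart_clock(msg, 3000000) before uart_init() would match the UART_CLOCK = 3000000 assumption behind the IBRD/FBRD values in kernel.c.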
