Sharvin
Posts: 16
Joined: Tue Feb 19, 2019 1:19 pm

Doubt related to the OTA update Mechanism for Debian Operating System

Tue Feb 19, 2019 1:29 pm

I am working on OTA updates for an embedded system project. My hardware is a Raspberry Pi and the operating system is Debian 9.

This OTA update mechanism will be an A/B redundancy update with three partitions in total. Partition A will hold one Debian operating system and Partition B will hold a separate Debian operating system. Partition C will be a persistent partition storing all the configuration that I don't want an OTA update to change or affect. At any time, one of A and B will be the active partition and the other the inactive one.

I'm planning to have two types of update mechanism: a file-level update and an OS-level update.

File-level Update:

To begin with, let's say Partition A is the active partition and Partition B is the inactive one. Partition C is always the persistent configuration partition and is never touched by an OTA update. A file-level update will be deployed by the server, and the hardware will poll for updates periodically (possibly every 24 hours). Once the update has been downloaded from the network, it will be copied to the inactive partition (Partition B) and all the files will be placed in the proper directories according to the package configuration and requirements.

Once the OTA update is done, the boot priority of the hardware will change (Partition B will become the active partition), so that when the user reboots the hardware it boots into the updated partition (in this instance, Partition B).
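For illustration, the slot switch described above could be sketched as follows. This is only a model of the decision, not RAUC or U-Boot code; the slot names "A" and "B" and both function names are assumptions made up for the example.

```python
# Hypothetical slot names; the real layout depends on the partition table.
SLOTS = ("A", "B")


def inactive_slot(active: str) -> str:
    """Return the slot that is not currently booted."""
    if active not in SLOTS:
        raise ValueError(f"unknown slot: {active!r}")
    return SLOTS[1] if active == SLOTS[0] else SLOTS[0]


def next_boot_slot(active: str, install_ok: bool) -> str:
    """Slot to boot next: switch only if the install on the inactive slot succeeded."""
    return inactive_slot(active) if install_ok else active


print(next_boot_slot("A", install_ok=True))   # B
print(next_boot_slot("A", install_ok=False))  # A
```

The point of the second function is that a failed or interrupted install leaves the boot priority untouched, so the device keeps booting the slot it is already running.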

OS-level Update:

For an OS-level update, I'll be sending a new version of the operating system to the user as an update. It follows the same mechanism as the file-level update, with one minor difference: the inactive partition is flashed with the new operating system image. Here too, the U-Boot bootloader can be used to configure the boot priority, as above.

I am using the RAUC framework for the OTA update mechanism on the client side. I hope the above explanation gives a clear idea of the mechanism. As I am a newbie, I have a question about it:

Question:

As I understand it, the above mechanism will provide a fail-safe update (of course I need to add some more validation for that), but I am unsure about the design. Is this the right way to provide OTA updates for an embedded device? Or is there another mechanism that is more feasible and scalable than this one? I would like feedback on the above implementation and on where I might have gone wrong.

Thanks

incognitum
Posts: 279
Joined: Tue Oct 30, 2018 3:34 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Tue Feb 19, 2019 3:17 pm

Sharvin wrote:
Tue Feb 19, 2019 1:29 pm
the operating system is a Debian 9.
How attached are you to Debian?

In embedded development it is more common to build a custom operating system with tools like Buildroot, containing just the software you really need, and no extras like documentation and man pages.
That typically results in images that are, say, 25 MB instead of 250 MB, which makes it a lot easier to just replace the entire image on updates, instead of messing with file/package updates that are not atomic.

While most projects still have their own custom update mechanism, some more standardized tools, like SWUpdate, have been appearing lately.

http://events17.linuxfoundation.org/sit ... Update.pdf
https://sbabic.github.io/swupdate/

epoch1970
Posts: 3045
Joined: Thu May 05, 2016 9:33 am
Location: Paris, France

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Tue Feb 19, 2019 3:26 pm

I don't understand the semantic difference between "file level" update and "OS-level" update.
These A/B systems all work the same: perform any sort of update, switch partition, reboot and see...

ostree is different, it really works at FS level on a single partition.
ubuntu snappy core is again different, it uses containers throughout so you can upgrade/downgrade easily.

As incognitum says, a custom OS might help. In these A/B systems you basically fall back once you've failed booting enough times, meaning you need to unleash the watchdog. Binary distros like Debian might lack the kernel settings you want to harness the boot process. Systemd has a vague idea of what a watchdog is used for.
"S'il n'y a pas de solution, c'est qu'il n'y a pas de problème." Les Shadoks, J. Rouxel

incognitum
Posts: 279
Joined: Tue Oct 30, 2018 3:34 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Tue Feb 19, 2019 4:17 pm

epoch1970 wrote:
Tue Feb 19, 2019 3:26 pm
I don't understand the semantic difference between "file level" update and "OS-level" update.
Well, downloading just individual changed files instead of an entire file system image may save some download time.
It does come with the downside that a power loss during the update may corrupt the file system of the other partition.
If you retry updating on the next boot, the corruption may go unnoticed until you actually reboot into the other partition.

Power loss while writing a file system image to the other partition is less problematic.
If you retry the update on the next boot, the entire file system of the other partition is going to be replaced again anyway, so there is no corruption from a previous update attempt to worry about.
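A rough sketch of why the full-image approach retries cleanly: every attempt rewrites the target from offset 0, and the slot is only considered usable once the written bytes hash to the expected value. The function names and the idea of shipping a SHA-256 alongside the image are assumptions for the example; the target is opened like an ordinary file, which also works for a block device node on Linux.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # copy in 1 MiB pieces


def sha256_prefix(path: str, length: int) -> str:
    """SHA-256 of the first `length` bytes of a file or block device."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            data = f.read(min(CHUNK, remaining))
            if not data:
                break
            digest.update(data)
            remaining -= len(data)
    return digest.hexdigest()


def flash_image(image_path: str, target_path: str, expected_sha256: str) -> bool:
    """Overwrite the target from offset 0 and report whether it verifies.

    A previous, interrupted attempt does not matter: everything it wrote
    is overwritten again before the check.
    """
    size = os.path.getsize(image_path)
    with open(image_path, "rb") as src, open(target_path, "r+b") as dst:
        while True:
            data = src.read(CHUNK)
            if not data:
                break
            dst.write(data)
        dst.flush()
        os.fsync(dst.fileno())  # push the data to the device before verifying
    # Note: this read-back may be served from the page cache, so it checks
    # what was written rather than proving the medium stored it correctly.
    return sha256_prefix(target_path, size) == expected_sha256
```

Only when `flash_image` returns True would the boot priority be changed; a False result or a crash midway simply means the same call is made again on the next attempt.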

chwe
Posts: 114
Joined: Tue Jul 31, 2018 1:35 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Tue Feb 19, 2019 4:45 pm

Considering you don't update u-boot or the binaries provided by RPT to get the board booting.. I've never seen anyone repair a broken u-boot OTA.. :P As someone dealing with a bunch of SBCs (most of them not RPis), tied to a distro that delivers (or has delivered) kernel and u-boot updates via 'apt-get upgrade' for various SBCs, I've seen updates break my boards more often than I wanted. :lol: It happens, and especially if you support various SoCs with various kernels with a small team, things happen that you didn't test beforehand.. If you'll only maintain one series of RPis, I think you could deliver your updates without full redundancy, assuming you know how to properly package Debian packages etc...
Delivering bootloader updates is always risky. Often, when the bootloader is broken, getting the board back to life isn't that easy. (For those familiar with the way these things work, it's a matter of removing the SD card plus a few dd commands and you're done; explaining this to a user and making sure he doesn't dd over his computer's hard disk is the harder part.. :lol: )
epoch1970 wrote:
Tue Feb 19, 2019 3:26 pm
I don't understand the semantic difference between "file level" update and "OS-level" update.
These A/B systems all work the same: perform any sort of update, switch partition, reboot and see...
I assume file-level means updating some of the custom scripts he uses etc., whereas OS-level might be going from stretch to buster, or kernel/bootloader updates (IMO, as long as the bootloader doesn't have a known security issue, just leave it as is.. :lol: ). In this case I assume a two-partition setup is way more fail-safe than a one-partition one. I personally have never upgraded an SBC's Debian version; it's just too messy compared to installing it fresh..
epoch1970 wrote:
Tue Feb 19, 2019 3:26 pm
Meaning you need to unleash the watchdog. Binary distros like Debian might lack the kernel settings you want to harness the boot process.
I'm not sure the watchdog is already up at the point where most of the kernel-related crashes I've seen occurred (well, if you only have one board to support with a defined use case, you really shouldn't be delivering a broken kernel at all - you had only one job.. but that's a different story). But if it's only a kernel config issue, it's not as if you can't activate the needed modules etc., right? I don't know anyone using a 'stock' Debian kernel on their SBC..
But I would partly agree that Debian might be 'too bloated' for such a use case. On the other hand, someone maintains the packages you need for free, and the manpower behind Debian is way bigger... :)

incognitum
Posts: 279
Joined: Tue Oct 30, 2018 3:34 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Tue Feb 19, 2019 5:22 pm

chwe wrote:
Tue Feb 19, 2019 4:45 pm
I'm not sure if the watchdog is already up when most kernel related crashes I saw were coming
Haven't tried on the Pi, but can't you get u-boot to let the dog out?
Even if nobody wrote proper u-boot support for the Pi's dog, you may be able to use its generic commands to write to memory/registers and activate it shortly before booting the kernel.

epoch1970
Posts: 3045
Joined: Thu May 05, 2016 9:33 am
Location: Paris, France

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Tue Feb 19, 2019 5:37 pm

incognitum wrote:
Tue Feb 19, 2019 5:22 pm
Haven't tried on the Pi, but can't you get u-boot to let the dog out?
Yes, you can start it from u-boot, no problem.
"S'il n'y a pas de solution, c'est qu'il n'y a pas de problème." Les Shadoks, J. Rouxel

Sharvin
Posts: 16
Joined: Tue Feb 19, 2019 1:19 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Thu Feb 21, 2019 8:24 am

incognitum wrote:
Tue Feb 19, 2019 3:17 pm
How attached are you to Debian?

In embedded development it is more common to build a custom operating system with tools like Buildroot, containing just the software you really need, and no extras like documentation and man pages.
That typically results in images that are, say, 25 MB instead of 250 MB, which makes it a lot easier to just replace the entire image on updates, instead of messing with file/package updates that are not atomic.

While most projects still have their own custom update mechanism, some more standardized tools, like SWUpdate, have been appearing lately.

http://events17.linuxfoundation.org/sit ... Update.pdf
https://sbabic.github.io/swupdate/
Thanks for the suggestion. It seems right, but I am working on an OTA update mechanism specifically for Debian.
epoch1970 wrote:
Tue Feb 19, 2019 3:26 pm
I don't understand the semantic difference between "file level" update and "OS-level" update.
These A/B systems all work the same: perform any sort of update, switch partition, reboot and see...

ostree is different, it really works at FS level on a single partition.
ubuntu snappy core is again different, it uses containers throughout so you can upgrade/downgrade easily.

As incognitum says, a custom OS might help. In these A/B systems you basically fall back once you've failed booting enough times, meaning you need to unleash the watchdog. Binary distros like Debian might lack the kernel settings you want to harness the boot process. Systemd has a vague idea of what a watchdog is used for.
incognitum wrote:
Tue Feb 19, 2019 4:17 pm
Well, downloading just individual changed files instead of an entire file system image may save some download time.
It does come with the downside that a power loss during the update may corrupt the file system of the other partition.
If you retry updating on the next boot, the corruption may go unnoticed until you actually reboot into the other partition.

Power loss while writing a file system image to the other partition is less problematic.
If you retry the update on the next boot, the entire file system of the other partition is going to be replaced again anyway, so there is no corruption from a previous update attempt to worry about.
Boot priority will change only if the OTA update has been successful. So if there's a power loss, the update will not complete, and once the update is installed again it will replace the corrupted file.

I have 3 partitions on my MMC (SD card) and one on the eMMC. The partition on the eMMC will be used for storing all the common files that I don't want an OTA update to affect. Partition A on the MMC stores the U-Boot environment file uEnv.txt, while Partition B and Partition C each have a Debian operating system installed. When an OTA update is available, it will be downloaded to the active partition and then installed on the inactive partition. Once the installation has completed successfully, uEnv.txt will be configured to boot from the updated partition on the next reboot.
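Since uEnv.txt is the single file that decides which rootfs boots, the way it gets rewritten matters. A minimal sketch, assuming Linux and a hypothetical `bootpart=` key (the thread does not say which variable is actually used): write the new contents to a temporary file, sync it, then rename it over the original.

```python
import os


def set_boot_slot(env_path: str, slot: str, key: str = "bootpart") -> None:
    """Set key=slot in a uEnv.txt-style file using write-temp-then-rename."""
    lines = []
    found = False
    if os.path.exists(env_path):
        with open(env_path, "r", encoding="ascii") as f:
            for line in f.read().splitlines():
                if line.split("=", 1)[0].strip() == key:
                    lines.append(f"{key}={slot}")  # replace the existing entry
                    found = True
                else:
                    lines.append(line)             # keep unrelated settings
    if not found:
        lines.append(f"{key}={slot}")

    tmp_path = env_path + ".tmp"
    with open(tmp_path, "w", encoding="ascii") as f:
        f.write("\n".join(lines) + "\n")
        f.flush()
        os.fsync(f.fileno())        # file contents reach storage first
    os.replace(tmp_path, env_path)  # then swap the name in one step

    # Sync the directory so the rename itself is persisted (Linux).
    dir_fd = os.open(os.path.dirname(os.path.abspath(env_path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

How atomic the rename really is across a power cut depends on the file system holding uEnv.txt, so this reduces the window for a half-written file rather than eliminating it.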

Question:
I would like feedback or comments on the above implementation.

scruss
Posts: 2261
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Thu Feb 21, 2019 11:32 am

Sharvin wrote:
Thu Feb 21, 2019 8:24 am
I have 3 Partitions on my MMC ( Sd card ) and one on the eMMC. …
I guess you're using the CM3? Regular Raspberry Pis don't have eMMC.
‘Remember the Golden Rule of Selling: “Do not resort to violence.”’ — McGlashan.

Sharvin
Posts: 16
Joined: Tue Feb 19, 2019 1:19 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Thu Feb 21, 2019 11:44 am

scruss wrote:
Thu Feb 21, 2019 11:32 am
Sharvin wrote:
Thu Feb 21, 2019 8:24 am
I have 3 Partitions on my MMC ( Sd card ) and one on the eMMC. …
I guess you're using the CM3? Regular Raspberry Pis don't have eMMC.
Yes, I am using CM3.

scruss
Posts: 2261
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Thu Feb 21, 2019 11:50 am

Probably best to ask in the Compute Module forum, since most users on the General forum don't have experience with that hardware
‘Remember the Golden Rule of Selling: “Do not resort to violence.”’ — McGlashan.

incognitum
Posts: 279
Joined: Tue Oct 30, 2018 3:34 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Thu Feb 21, 2019 12:33 pm

Sharvin wrote:
Thu Feb 21, 2019 8:24 am
Boot priority will change only if the OTA update has been successful. So if there's a power loss, the update will not complete, and once the update is installed again it will replace the corrupted file.
You are now assuming that on power loss only the single file you were replacing can get corrupted, and that can be simply fixed by overwriting said file again.
Do keep in mind that it is also modifying directory listings and other file system metadata, though...
And that, due to the way flash storage works, the SD card may have to read, erase and reprogram a block-sized area that is larger than just the bytes the OS wants to change.

Sharvin
Posts: 16
Joined: Tue Feb 19, 2019 1:19 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Thu Feb 21, 2019 1:27 pm

incognitum wrote:
Thu Feb 21, 2019 12:33 pm
You are now assuming that on power loss only the single file you were replacing can get corrupted, and that can be simply fixed by overwriting said file again.
Do keep in mind that it is also modifying directory listings and other file system metadata, though...
And that, due to the way flash storage works, the SD card may have to read, erase and reprogram a block-sized area that is larger than just the bytes the OS wants to change.
Thank you for the feedback. I understood the part about directory listings and metadata, and yes, you're right, it can corrupt the file system.
But I am unable to understand this part:
And that, due to the way flash storage works, the SD card may have to read, erase and reprogram a block-sized area that is larger than just the bytes the OS wants to change.
Can you make it clearer, perhaps with an example or a little bit of explanation?

epoch1970
Posts: 3045
Joined: Thu May 05, 2016 9:33 am
Location: Paris, France

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Thu Feb 21, 2019 1:49 pm

Sharvin wrote:
Thu Feb 21, 2019 8:24 am
Partition A on MMC is used to store the uboot environment file uEnv.txt while Partition B and Partition C .. [are rootfs A and B]. When an OTA update is available it will be ... installed on the inactive partition. Once the installation is done successfully uEnv.txt will be configured to boot from the updated partition on the reboot.
This is the basic mode of operation for A/B OTA updating.
What is your question? If this design exists and is used in so many devices I guess it has some merit.

It's not magic either:
- If you install garbage to the machine it will reboot (hopefully) into garbage. What happens next is open.
- Reboot is mandatory. The outcome of an update is unknown until reboot into the new install is successful and the machine can report "all good".
In SWUpdate, the env variable ustate is set post install, and used after reboot for reporting success/failure to the remote server. No idea how this is done in RAUC.
https://sbabic.github.io/swupdate/suricatta.html#introduction wrote:
Suricatta regularly polls a remote server for updates, downloads, and installs them. Thereafter, it reboots the system and reports the update status to the server, based on an update state variable currently stored in bootloader’s environment ensuring persistent storage across reboots. Some U-Boot script logics or U-Boot’s bootcount feature may be utilized to alter this update state variable, e.g., by setting it to reflect failure in case booting the newly flashed root file system has failed and a switchback had to be performed.
"S'il n'y a pas de solution, c'est qu'il n'y a pas de problème." Les Shadoks, J. Rouxel

Sharvin
Posts: 16
Joined: Tue Feb 19, 2019 1:19 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Thu Feb 21, 2019 2:04 pm

epoch1970 wrote:
Thu Feb 21, 2019 1:49 pm
This is the basic mode of operation for A/B OTA updating.
What is your question? If this design exists and is used in so many devices I guess it has some merit.

It's not magic either:
- If you install garbage to the machine it will reboot (hopefully) into garbage. What happens next is open.
- Reboot is mandatory. The outcome of an update is unknown until reboot into the new install is successful and the machine can report "all good".
In SWUpdate, the env variable ustate is set post install, and used after reboot for reporting success/failure to the remote server. No idea how this is done in RAUC.
https://sbabic.github.io/swupdate/suricatta.html#introduction wrote:
Suricatta regularly polls a remote server for updates, downloads, and installs them. Thereafter, it reboots the system and reports the update status to the server, based on an update state variable currently stored in bootloader’s environment ensuring persistent storage across reboots. Some U-Boot script logics or U-Boot’s bootcount feature may be utilized to alter this update state variable, e.g., by setting it to reflect failure in case booting the newly flashed root file system has failed and a switchback had to be performed.
Yes, in RAUC too, once the update is completed it is reported back to the server as a success or failure. I'm not sure about the reboot (I'll check this), but according to my test, once the update was completely installed, my server was notified that the update had been completed.
Thanks for the Feedback.

epoch1970
Posts: 3045
Joined: Thu May 05, 2016 9:33 am
Location: Paris, France

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Thu Feb 21, 2019 2:27 pm

Update > Install(s)

An install can be a failure (e.g. bad download, signature doesn't check) or a success
If the install is a failure the update is a failure.

If the install is a success,
An update can be a failure (e.g. boot failed 3 times, fallback) or a success.
You only know if the update was a success once you've tried it.

Until you reach a successful update, the system must be able to fall back to the last known good rootfs, so that another OTA update attempt is possible. Otherwise it's game over.
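The "boot failed 3 times, fallback" rule can be modelled in a few lines. This is a Python simulation of the idea only, not U-Boot script; the field names are borrowed loosely from the boot-count concept mentioned in the SWUpdate quote earlier in the thread and are otherwise illustrative.

```python
from dataclasses import dataclass


@dataclass
class BootState:
    active: str              # slot the bootloader will try next
    fallback: str            # last known good slot
    bootcount: int = 0       # attempts since the last confirmed boot
    bootlimit: int = 3       # attempts allowed before falling back
    confirmed: bool = False  # set once the running system reports "all good"


def bootloader_step(state: BootState) -> str:
    """Pick the slot for this boot attempt, falling back past the limit."""
    if state.confirmed:
        return state.active
    state.bootcount += 1
    if state.bootcount > state.bootlimit:
        # Too many unconfirmed attempts: return to the known good slot.
        state.active = state.fallback
        state.bootcount = 0
        state.confirmed = True
    return state.active


def confirm_boot(state: BootState) -> None:
    """Called from userspace once the new system has proven itself healthy."""
    state.bootcount = 0
    state.confirmed = True
    state.fallback = state.active  # the new slot becomes the known good one


# The new rootfs on B never confirms: three attempts, then back to A.
state = BootState(active="B", fallback="A")
attempts = [bootloader_step(state) for _ in range(4)]
print(attempts)  # ['B', 'B', 'B', 'A']
```

The install succeeding only puts the device into the unconfirmed state; the update counts as successful when `confirm_boot` runs on the new slot, which matches the install/update distinction above.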
"S'il n'y a pas de solution, c'est qu'il n'y a pas de problème." Les Shadoks, J. Rouxel

incognitum
Posts: 279
Joined: Tue Oct 30, 2018 3:34 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Thu Feb 21, 2019 2:35 pm

Sharvin wrote:
Thu Feb 21, 2019 1:27 pm
And that, due to the way flash storage works, the SD card may have to read, erase and reprogram a block-sized area that is larger than just the bytes the OS wants to change.
Can you make it more clear, By telling an example or little bit of explanation perhaps?
For simplicity look at how consumer websites explain how SSDs work.

https://www.anandtech.com/show/2738/8

Works the same with other flash storage like SD cards.
You cannot simply overwrite individual bytes of flash memory that already has data in it.
The OS may only intend to replace a 4 KB block of data. But under the hood the SD card may need to erase 4 MB (yes, megabyte) worth of flash first, before it is able to write new data to that region.
(SSDs do have some tricks to try to reduce the times this happens, e.g. it will try to write to another block that is still free instead of overwriting the original one, if that is possible. But that does require that it knows of free blocks. It is not a given that your average SD card is so sophisticated)
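To put numbers on the 4 KB versus 4 MB example, here is a small worked calculation. It assumes the worst case where the card rewrites a whole erase block for the write, and that the write does not straddle two erase blocks; the function name is made up for the example.

```python
KIB = 1024
MIB = 1024 * KIB


def erase_amplification(write_bytes: int, erase_block_bytes: int) -> float:
    """Flash rewritten per byte the OS asked to change (worst case)."""
    blocks = -(-write_bytes // erase_block_bytes)  # ceiling division
    return blocks * erase_block_bytes / write_bytes


print(erase_amplification(4 * KIB, 4 * MIB))  # 1024.0
```

So a 4 KB change can put roughly a thousand times as much data in flight inside the card, which is why a power cut at that moment may damage data well outside the file being written.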

Sharvin
Posts: 16
Joined: Tue Feb 19, 2019 1:19 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Fri Feb 22, 2019 5:23 am

incognitum wrote:
Thu Feb 21, 2019 2:35 pm
For simplicity look at how consumer websites explain how SSDs work.

https://www.anandtech.com/show/2738/8

Works the same with other flash storage like SD cards.
You cannot simply overwrite individual bytes of flash memory that already has data in it.
The OS may only intend to replace a 4 KB block of data. But under the hood the SD card may need to erase 4 MB (yes, megabyte) worth of flash first, before it is able to write new data to that region.
(SSDs do have some tricks to try to reduce the times this happens, e.g. it will try to write to another block that is still free instead of overwriting the original one, if that is possible. But that does require that it knows of free blocks. It is not a given that your average SD card is so sophisticated)
Thank you for the response; that was indeed a good explanation. I just had a small question about that article:
Now let’s say that the user goes back and deletes that original text file. This request doesn’t ever reach our controller, as far as our controller is concerned we’ve got three valid and two empty pages.
If the original file is deleted, how does the request not reach the controller? I didn't get that point from the article.
Last edited by Sharvin on Fri Feb 22, 2019 5:40 am, edited 1 time in total.

Sharvin
Posts: 16
Joined: Tue Feb 19, 2019 1:19 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Fri Feb 22, 2019 5:25 am

epoch1970 wrote:
Thu Feb 21, 2019 2:27 pm
Update > Install(s)

An install can be a failure (e.g. bad download, signature doesn't check) or a success
If the install is a failure the update is a failure.

If the install is a success,
An update can be a failure (e.g. boot failed 3 times, fallback) or a success.
You only know if the update was a success once you've tried it.

Until you reach a successful update, the system must be able to fall back to the last known good rootfs, so that another OTA update attempt is possible. Otherwise it's game over.
Thank you for the feedback.

incognitum
Posts: 279
Joined: Tue Oct 30, 2018 3:34 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Fri Feb 22, 2019 9:31 am

Sharvin wrote:
Fri Feb 22, 2019 5:23 am
Thank you for the response; that was indeed a good explanation. I just had a small question about that article:
Now let’s say that the user goes back and deletes that original text file. This request doesn’t ever reach our controller, as far as our controller is concerned we’ve got three valid and two empty pages.
If the original file is deleted, how does the request not reach the controller? I didn't get that point from the article.
Article is a bit old.

The file does get marked as deleted in the file system’s metadata.
But in the past the controller did not know that the page containing the contents of the text file no longer held data we want to keep.
The OS knows the file system and therefore knows which pages are free and which are not, but the controller did not. Once data had been written to a certain address, the page was occupied with data as far as the controller was aware.

Nowadays SSDs do get informed of that through TRIM/DISCARD commands.
This allows them to pre-emptively erase entire blocks when a larger file is removed, and even to move pages around when smaller ones are removed, a bit like file system defragmentation.
That way, by the time the user wants to write new data, a free page is available and ready to be written to.
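A toy model of the article's example (three valid pages, two empty) may make the difference concrete. The class below is invented for illustration and ignores everything a real flash controller does beyond tracking which pages it believes are in use.

```python
class ToyController:
    """Tracks which pages the controller believes still hold wanted data."""

    def __init__(self, pages: int) -> None:
        self.valid = [False] * pages

    def write(self, page: int) -> None:
        self.valid[page] = True

    def trim(self, page: int) -> None:
        # TRIM/DISCARD: the OS tells the controller this page is unused.
        self.valid[page] = False

    def free_pages(self) -> int:
        return self.valid.count(False)


ctrl = ToyController(5)
for page in (0, 1, 2):
    ctrl.write(page)

# The file on page 0 is deleted. Only file system metadata changes, so
# without TRIM the controller's view stays at three valid, two empty.
free_without_trim = ctrl.free_pages()

# With TRIM, the OS passes the information on and page 0 becomes reusable.
ctrl.trim(0)
free_with_trim = ctrl.free_pages()

print(free_without_trim, free_with_trim)  # 2 3
```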

Sharvin
Posts: 16
Joined: Tue Feb 19, 2019 1:19 pm

Re: Doubt related to the OTA update Mechanism for Debian Operating System

Fri Feb 22, 2019 12:18 pm

incognitum wrote:
Fri Feb 22, 2019 9:31 am
Article is a bit old.

The file does get marked as deleted in the file system’s metadata.
But in the past the controller did not know that the page containing the contents of the text file no longer held data we want to keep.
The OS knows the file system and therefore knows which pages are free and which are not, but the controller did not. Once data had been written to a certain address, the page was occupied with data as far as the controller was aware.

Nowadays SSDs do get informed of that through TRIM/DISCARD commands.
This allows them to pre-emptively erase entire blocks when a larger file is removed, and even to move pages around when smaller ones are removed, a bit like file system defragmentation.
That way, by the time the user wants to write new data, a free page is available and ready to be written to.
Okay, it's clear now. Thanks for the feedback.
