RPi3 Random Reboot Issues

Posted: Tue Jun 14, 2016 5:23 pm
by scollins15
I have 10 Pi 3s that I run through a reboot test cycle. After a random number of boots, the reboot fails, and I am scratching my head as to what is causing this. Have any others noticed this? The last test rebooted successfully 80 times before failing. On each boot, rc.local calls a script that sleeps for 30 seconds and then issues a reboot.

~ $ uname -a
Linux trusign-e7b7 4.4.7-v7+ #6 SMP Mon Apr 18 18:03:48 UTC 2016 armv7l GNU/Linux
I thought this could be temperature related, so I logged the date, time, and temperature to a file prior to each reboot. Results are below:

11:54:28 06/14/16 50464
11:55:24 06/14/16 54768
11:56:19 06/14/16 56920
11:57:14 06/14/16 57996
11:58:12 06/14/16 59072
11:59:09 06/14/16 60148
12:00:06 06/14/16 61224
12:01:03 06/14/16 61224
12:02:00 06/14/16 61761
12:02:58 06/14/16 62300
12:03:54 06/14/16 62300
12:04:52 06/14/16 61761
12:05:49 06/14/16 62838
12:06:46 06/14/16 62300
12:07:44 06/14/16 62838
12:08:43 06/14/16 62300
12:09:39 06/14/16 63914
12:10:34 06/14/16 63914
12:11:32 06/14/16 63376
12:12:26 06/14/16 64451
12:13:24 06/14/16 63914
12:14:20 06/14/16 65528
12:15:16 06/14/16 64451
12:16:14 06/14/16 65528
12:17:12 06/14/16 63914
12:18:10 06/14/16 64990
12:19:07 06/14/16 64451
12:20:03 06/14/16 66066
12:20:59 06/14/16 64451
12:21:56 06/14/16 64451
12:22:54 06/14/16 65528
12:23:49 06/14/16 64990
12:24:46 06/14/16 65528
12:25:40 06/14/16 65528
12:26:35 06/14/16 65528
12:27:32 06/14/16 64451
12:28:31 06/14/16 65528
12:29:31 06/14/16 65528
12:30:29 06/14/16 65528
12:31:24 06/14/16 65528
12:32:21 06/14/16 66066
12:33:19 06/14/16 64451
12:34:16 06/14/16 65528
12:35:12 06/14/16 65528
12:36:10 06/14/16 65528
12:37:08 06/14/16 66066
12:38:06 06/14/16 66066
12:39:06 06/14/16 65528
12:40:01 06/14/16 66604
12:41:00 06/14/16 66604
12:41:58 06/14/16 65528
12:42:55 06/14/16 65528
12:43:51 06/14/16 66604
12:44:50 06/14/16 65528
12:45:49 06/14/16 65528
12:46:46 06/14/16 66604
12:47:43 06/14/16 67142
12:48:41 06/14/16 66604
12:49:37 06/14/16 66604
12:50:35 06/14/16 65528
12:51:33 06/14/16 66604
12:52:31 06/14/16 67142
12:53:27 06/14/16 66604
12:54:24 06/14/16 67142
12:55:22 06/14/16 66604
12:56:17 06/14/16 66604
12:57:11 06/14/16 66604
12:58:08 06/14/16 66604
12:59:06 06/14/16 66604
13:00:02 06/14/16 67142
13:01:00 06/14/16 66604
13:01:57 06/14/16 67142
13:02:53 06/14/16 67679
13:03:50 06/14/16 66604
13:04:47 06/14/16 67142
13:05:44 06/14/16 67142
13:06:41 06/14/16 66604
13:07:38 06/14/16 67142
13:08:36 06/14/16 67142
13:09:33 06/14/16 67142
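
A log like the one above is easier to reason about once it is summarized. The following is a minimal sketch, not the poster's tooling: it assumes each line is `HH:MM:SS MM/DD/YY value` and that the value is in millidegrees Celsius (the unit used by `/sys/class/thermal/thermal_zone0/temp`), which the post does not state. The function names `parse_log_line` and `summarize` are hypothetical.

```python
from datetime import datetime


def parse_log_line(line):
    """Parse 'HH:MM:SS MM/DD/YY value' into (datetime, temperature in deg C).

    Assumption: the third field is millidegrees Celsius.
    """
    time_s, date_s, temp_s = line.split()
    stamp = datetime.strptime(f"{date_s} {time_s}", "%m/%d/%y %H:%M:%S")
    return stamp, int(temp_s) / 1000.0


def summarize(lines):
    """Return boot count, temperature range, and gaps between log entries."""
    entries = [parse_log_line(line) for line in lines if line.strip()]
    stamps = [stamp for stamp, _ in entries]
    temps = [temp for _, temp in entries]
    # Seconds between consecutive entries, i.e. one full reboot cycle each.
    gaps = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
    return {
        "boots": len(entries),
        "min_temp_c": min(temps),
        "max_temp_c": max(temps),
        "mean_gap_s": sum(gaps) / len(gaps) if gaps else None,
        "max_gap_s": max(gaps) if gaps else None,
    }
```

An unusually long gap between two entries would point at a slow or retried boot, which a temperature column alone does not reveal.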

Re: RPi3 Random Reboot Issues

Posted: Tue Jun 14, 2016 7:54 pm
by RoundDuckMan
This probably relates to my issue: I'm seeing something similar, where the RPi3 corrupts the OS after certain reboots for no apparent reason. I'm using Raspbian under NOOBS 1.9; are you too? If you are, maybe it's an issue with NOOBS? If not, maybe Raspbian? What kind of SD cards are you using as well?

Re: RPi3 Random Reboot Issues

Posted: Mon Jun 20, 2016 4:55 pm
by scollins15
I am using a SanDisk commercial product, provided directly by SanDisk. I am using Raspbian Jessie Lite. The filesystem has no corruption when this happens and is 100% healthy after a brute-force power cycle.

Re: RPi3 Random Reboot Issues

Posted: Mon Jun 20, 2016 7:55 pm
by Cancelor
scollins15 wrote:.... I logged date time and temp to a file prior to each reboot. ....
Could you add to your script and get it to record free space on the SD card and free system memory?
I'm assuming that after a power reset all runs normal again for a while?
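
One way to record those two extra values is sketched below. This is only an illustration under assumptions: it reads free space for the filesystem mounted at `/` and takes available memory from `/proc/meminfo`; the function names and the output format are hypothetical, not part of the original script.

```python
import shutil


def parse_meminfo(text):
    """Return {field: value} for the numeric fields of /proc/meminfo text."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])  # kB for most fields
    return info


def resource_snapshot(mount="/", meminfo_path="/proc/meminfo"):
    """Build one log fragment with free disk space and available memory."""
    usage = shutil.disk_usage(mount)
    with open(meminfo_path) as fh:
        mem = parse_meminfo(fh.read())
    # MemAvailable is present on kernels 3.14 and later; fall back to MemFree.
    avail_kb = mem.get("MemAvailable", mem.get("MemFree"))
    return f"disk_free_kb={usage.free // 1024} mem_avail_kb={avail_kb}"
```

Appending the result of `resource_snapshot()` to each line of the existing log would show whether either value trends downward before a failed reboot.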

Re: RPi3 Random Reboot Issues

Posted: Tue Jun 28, 2016 1:11 pm
by scollins15
On the Raspberry Pi 1s and 2s I have noticed that some reboots failed to get beyond the rainbow flash screen. A solution has never been found for this problem, although several people say it's possibly an SD card or power supply issue. In an effort to eliminate the issue I added a line to /etc/rc.local which calls a script that updates a text file for record-keeping and then reboots. This script has logic to delete its own line from /etc/rc.local after a certain number of reboots have been completed. Currently I am trying to reboot 2000 times. On the new Pi 3 each reboot takes about 30 seconds, so the test should take about 16-17 hours to complete. Unfortunately it fails after some random number of reboots.
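
The actual test script is not shown in this thread. The following is a rough sketch of the logic as described (count each boot in a text file, reboot, and remove the rc.local hook once the target is reached). All names, paths, and the `do_reboot` parameter are hypothetical, and the real script may differ.

```python
import subprocess
import time
from pathlib import Path


def bump_counter(counter_file):
    """Increment the boot count stored in counter_file and return it."""
    path = Path(counter_file)
    count = int(path.read_text().strip() or 0) if path.exists() else 0
    count += 1
    path.write_text(f"{count}\n")
    return count


def remove_hook(rc_local, hook_line):
    """Delete every line matching hook_line from the rc.local file."""
    path = Path(rc_local)
    kept = [line for line in path.read_text().splitlines()
            if line.strip() != hook_line.strip()]
    path.write_text("\n".join(kept) + "\n")


def run_cycle(counter_file, rc_local, hook_line,
              target=2000, delay_s=30, do_reboot=None):
    """Run one test cycle; return True if a reboot was requested."""
    count = bump_counter(counter_file)
    if count >= target:
        # Test finished: stop the loop by removing our own rc.local entry.
        remove_hook(rc_local, hook_line)
        return False
    time.sleep(delay_s)
    if do_reboot is None:
        # Assumption: running as root, so plain `reboot` is permitted.
        subprocess.run(["reboot"], check=False)
    else:
        do_reboot()
    return True
```

Passing a stub as `do_reboot` makes it possible to dry-run the counter and hook-removal logic on a desktop machine before deploying it to the Pis.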


For full disclosure, I have 3 partitions as follows:
/boot -
/root - contains symbolic links - will eventually be mounted read only, but not now!
tmp -> /third/linuxReadWrite/tmp/
var -> /third/linuxReadWrite/var/
/third - contains home directory for user excluding root

The kernel messages are below:
[14.450911] Unable to handle kernel paging request at virtual address ad2d6180
[14.462776] pgd = ac87c000
[14.467211] [ad2d6180] *pgd=2d21141e(bad)
[14.473799] Internal error: Oops: 8000000d [#1] SMP ARM
[14.482347] Modules linked in: brcmfmac brcmutil cfg80211 rfkill snd_bcm2835 snd_pcm rtc_pcf2127 snd_timer snd bcm2835_wdt i2c_bcm2708 evdev uio_pdrv_genirq uio ipv6
[14.509036] CPU: 1 PID: 223 Comm: systemd-fsck Not tainted 4.4.14-v7+ #1
[14.519998] Hardware name: BCM2709
[14.525562] task: adb18000 ti: ac9c6000 task.ti: ac9c6000
[14.534396] PC is at 0xad2d6180
[14.539542] LR is at tty_ldisc_close+0x4c/0x64
[14.546812] pc : [<ad2d6180>] lr : [<80378424>] psr: a0000013
[14.546812] sp : ac9c7e88 ip : 00000000 fp : ac9c7e9c
[14.565599] r10: 00000001 r9 : 00000002 r8 : ac862240
[14.574149] r7 : ab09c1c0 r6 : ac814218 r5 : ab09c680 r4 : ac814200
[14.584830] r3 : ad2d6180 r2 : 00400000 r1 : ac81435c r0 : ac814200
[14.595513] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[14.607190] Control: 10c5383d Table: 2c87c06a DAC: 00000055
[14.616590] Process systemd-fsck (pid: 223, stack limit = 0xac9c6210)
[14.627129] Stack: (0xac9c7e88 to 0xac9c8000)
[14.634260] 7e80: ac814200 ac814200 ac9c7ebc ac9c7ea0 803784e8 803783e4
[14.647647] 7ea0: 00400000 00000000 ac814200 ac814218 ac9c7edc ac9c7ec0 80378e78 803784d4
[14.661035] 7ec0: ac814200 00000000 8095ae98 ab09c1c0 ac9c7f1c ac9c7ee0 80372080 80378d54
[14.674423] 7ee0: 00000008 80130af4 ac814314 ac814314 00000000 ac862240 ad178910 ad423550
[14.687810] 7f00: ad9b4f78 00000000 ad9b4f78 00000008 ac9c7f5c ac9c7f20 80158cb4 8037ad0c
[14.701196] 7f20: 00000000 00000000 ac862240 ac862248 ac9c7f54 adb183a0 808cded0 00000000
[14.714583] 7f40: adb18000 8000fd08 ac9c6000 00000000 ac9c7f6c ac9c7f60 80158e74 80158c2c
[14.727971] 7f60: ac9c7f8c ac9c7f70 800409b0 80158e68 ac9c6000 ac9c6010 8000fd08 ac9c7fb0
[14.741356] 7f80: ac9c7fac ac9c7f90 80013914 8004091c 54fc20e8 00000000 00000000 00000006
[14.754741] 7fa0: 00000000 ac9c7fb0 8000fb68 80013854 00000000 00000005 fbad2402 76e67390
[14.768127] 7fc0: 54fc20e8 00000000 00000000 00000006 00000000 00000000 54b379ac 7ed4bd3c
[14.781514] 7fe0: 00000000 7ed4bb60 76e6880c 76e673a0 60000010 00000005 00000000 00000000
[14.794911] [<80378424>] (tty_ldisc_close) from [<803784e8>] (tty_ldisc_kill+0x20/0x90)
[14.808019] [<803784e8>] (tty_ldisc_kill) from [<80378e78>] (tty_ldisc_release+0x130/0x154)
[14.821694] [<80378e78>] (tty_ldisc_release) from [<80372080>] (tty_release+0x380/0x4fc)
[14.834946] [<80372080>] (tty_release) from [<80158cb4>] (__fput+0x94/0x1e4)
[14.846492] [<80158cb4>] (__fput) from [<80158e74>] (___fput+0x18/0x1c)
[14.857471] [<80158e74>] (___fput) from [<800409b0>] (task_work_run+0xa0/0xd4)
[14.869448] [<800409b0>] (task_work_run) from [<80013914>] (do_work_pending+0xcc/0xd0)
[14.882416] [<80013914>] (do_work_pending) from [<8000fb68>] (slow_work_pending+0xc/0x20)
[14.895803] Code: 00000000 00000000 00000000 00000000 (ad2d6540)
[14.905777] ---[ end trace ef393c8432f970a5 ]---
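
When collecting oopses from several boards, it can help to reduce each trace to its call chain so that failures can be compared at a glance. The following is a minimal sketch that assumes the ARM backtrace line format seen above (`[<addr>] (callee) from [<addr>] (caller+0xOFF/0xSIZE)`); the names `FRAME` and `call_chain` are hypothetical.

```python
import re

# One backtrace line: [<addr>] (callee) from [<ret>] (caller+0xOFF/0xSIZE)
FRAME = re.compile(
    r"\[<(?P<addr>[0-9a-f]{8})>\] \((?P<callee>\w+)\) from "
    r"\[<(?P<ret>[0-9a-f]{8})>\] "
    r"\((?P<caller>\w+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)\)"
)


def call_chain(oops_text):
    """Return function names from the innermost frame outward."""
    chain = []
    for match in FRAME.finditer(oops_text):
        if not chain:
            # The first line also names the innermost function.
            chain.append(match.group("callee"))
        chain.append(match.group("caller"))
    return chain
```

Lines that were mistyped during transcription will not match the pattern and are silently skipped, so a shorter chain than expected is a hint to recheck the captured text.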

Re: RPi3 Random Reboot Issues

Posted: Wed Jun 29, 2016 8:09 pm
by scollins15
Cancelor wrote:
scollins15 wrote:.... I logged date time and temp to a file prior to each reboot. ....
Could you add to your script and get it to record free space on the SD card and free system memory?
I'm assuming that after a power reset all runs normal again for a while?
This happens right after a boot, when trying to shut down or reboot. I don't believe free space can be an issue.