----
Background:
I have a situation resulting in the following symptoms on an MMIPS hEX S running ROS v6.48.1 (but I tried restoring the backups on netinstalled versions from 6.45 through the latest 7.1 beta with similar results):
- /system sup-output fails or freezes
- /export stalls or freezes (sometimes CTRL-c will recover, other times the console is frozen until reboot)
- /system backup will save and restore .backup files without any reported errors
- Even though manually running /system sup-output fails, as recently as last night (before I formatted the drive with netinstall), an autosupout.rif is recreated on most (all?) system crashes
- The system crashes and reboots either around 3 minutes 20 seconds or 5 minutes and 10 seconds, depending on ... (the installed packages, maybe??)
- LEDs, USB, netinstall, and the Mode and Reset buttons appear to work perfectly.
- My only system access in this situation is through a Woobm-USB console, and whatever files I can read or write to the flash and microSD card.
- I believe this all happened yesterday when a watchdog timer tripped. There may have been a pending update that ran on that reboot, but I don't think so.
This all happens even if the only package installed is system-n.nn.n-mmips.npk.
There is an autosupout.rif file just sitting there, mocking me, seemingly untouchable.
----
Half-baked theories:
One of my novice theories is that due to some corruption in the configuration (which restoring the backup file brings back to a clean netinstall), the switch chip goes offline, or the associated software crashes, or maybe disables the GPIO pins controlling the switch chip, or some Linux kernel extension just crashes. The Linux system boots and RouterOS starts configuration, then the interfaces disappear, and something just can't cope with that reality.
It seems unlikely that the actual plain-text configuration could cause this sort of thing (surely that is screened for syntax and sanitized to prevent buffer-overflow exploits), but perhaps there is some sort of compiled representation of the configuration (i.e. a binary data structure or store kept on disk and cached in memory, or at least an intermediate form such as that used to store scripts in ROS) that has somehow been corrupted. This is based on the symptoms listed above, plus:
- Port activity lights for connected switch ports work on boot (i.e. flash to show line activity), but all go off late in the boot process
- Device is not visible in winbox
- I can print or export information from parts of the cli tree that seem to have no nexus with a physical interface (e.g. /system routerboard, /ip dhcp vendor-table, /ip dns)
- I cannot even print information from many other places. Sometimes after a CTRL-c the console will recover and the headers for that section will be displayed with no data. (print count-only also fails in these cases)
- At least once, I noticed what seemed like a partial default-configuration script in /environment (it looked official, and didn't contain any of my config data)
- There is little to nothing in the logs about these failures (and I have turned on many logging topics, to microSD, memory, and echo)
- The only log entry of note, logged at about 17 seconds of system uptime, is: system,error,critical router rebooted because some critical program crashed
- One time only, this log entry occurred immediately following that one: system,error,critical router was rebooted without proper shutdown
- If I start with a clean configuration or reload an older backup file, all the problems magically seem to go away.
- If I restore this backup file (or any others created after yesterday's crash) on a working router, all of the above symptoms return*.
*(which I guess means that the autosupout.rif file and the backup file may be ... mostly the same thing?, since in this case the backup file is only 10 KiB larger than the autosupout.rif file. That, or maybe the backup file contains more data, plus a compressed version of the autosupout.rif file contents. Either way, one or both of the files in this case are corrupted or incomplete.)
----
What I'm asking for / Tl;dr:
I would love one or more of these to be a possible solution:
- copy the file to a USB drive or microSD card
- specify an alternate location for autosupout.rif if external storage is mounted (i.e. automatically, or configured through either a file of a specific name existing on the internal flash storage, a ROS $global variable, or a normal system configuration flag settable through winbox and the cli)
- display the file in console (even if it was just in hex, so it could be copied/pasted to a pc where one could reconstitute the original file)
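For the hex idea in the last item, the PC-side half would be the easy part. A minimal Python sketch, assuming the console could somehow print the file as plain hex digits (I don't know of any RouterOS command that does this today, and the dump layout used here is purely my assumption); hex_text_to_bytes, bytes_to_hex_text, and rebuild_file are hypothetical helper names:

```python
import re


def hex_text_to_bytes(text: str) -> bytes:
    """Rebuild raw bytes from hex digits pasted out of a terminal window."""
    # Drop spaces, tabs and line breaks added by the terminal or clipboard.
    digits = re.sub(r"\s+", "", text)
    if len(digits) % 2:
        raise ValueError("odd number of hex digits; the paste is probably truncated")
    # bytes.fromhex raises ValueError on any non-hex character.
    return bytes.fromhex(digits)


def bytes_to_hex_text(data: bytes, width: int = 16) -> str:
    """Format bytes as lines of hex (the ASSUMED shape of a console dump)."""
    lines = [data[i:i + width].hex() for i in range(0, len(data), width)]
    return "\n".join(lines)


def rebuild_file(pasted_path: str, output_path: str) -> int:
    """Read pasted hex text from one file and write the binary to another."""
    with open(pasted_path, "r", encoding="ascii") as src:
        raw = hex_text_to_bytes(src.read())
    with open(output_path, "wb") as dst:
        dst.write(raw)
    return len(raw)
```

Something like rebuild_file("pasted.txt", "autosupout.rif") would then recreate the file from the saved terminal capture. A truncated or mangled paste raises ValueError rather than silently writing a damaged file, though it can't detect whole lines that went missing.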
One final idea I had was to spin up a CHR instance and try restoring the .backup file there, but since they are a completely different architecture I figure it's a long shot. (I believe I have seen in several places that they are only designed to be restored to the same model of device as where they were created.)
--> Are any of those possible today, perhaps with some hidden configuration setting or some magic scripting?
--> Did I just miss something super-obvious?
--> Any other ideas?
--> If this is not possible now, can some functionality be added to a future release? (Kermit, anyone? ;)
Thanks for reading! If I find something useful myself, I'll try to come back and post it to help others in the future.
--
P.S. Yes, yes, I know I should have had a better backup solution. Someone will probably tell me anyway. Fair. (I receive autosupouts via e-mail from the watchdog timer, but the last one is over a month old because this device is normally so stable.) Still, six weeks of minor firewall and QoS tweaks made to facilitate pandemic-related home work add up to something more significant than I had predicted... if any of that data is still there, I'd love to see it!