I have a little (headless) ARM box running Debian with a built-in clock module whose battery had died. After the battery was replaced, the box failed to get an IP address via DHCP.
On investigation it appears that because the clock was "wrong" it couldn't perform DHCP queries ("Unable to setup timer"). I think "wrong" doesn't just mean inaccurate (it was claiming to be in 1928) but corrupt (hwclock reported corrupt registers).
The clock would get reset by a time server as soon as it got online, but without a valid clock it couldn't get online to get the valid clock...
Anyone know any more about this or able to point to a general workaround for next time this happens?
All I have found (after the event) is here: http://www.solid-run.com/community/viewtopic.php?f=8&t=525 .. which confirms what I thought but doesn't really offer a solution (logging in to set the clock on a remote headless unit isn't particularly convenient!)
Mark
I can't offer a solution but if you think back to a Raspberry Pi, they're not guaranteed to be online or to have a valid clock time so I imagine they save the clock value from when they were last online and use that the next time they fire up. So at least the clock would be near the local time.
If nothing else, it might help you find a solution that works for you.
On 5 February 2015 at 17:08, Chris Walker alug_cdw@the-walker-household.co.uk wrote:
I can't offer a solution but if you think back to a Raspberry Pi, they're not guaranteed to be online or to have a valid clock time so I imagine they save the clock value from when they were last online and use that the next time they fire up. So at least the clock would be near the local time.
If nothing else, it might help you find a solution that works for you.
That would be a good workaround, provided it only fires when the date isn't set by the RTC.
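A minimal sketch of that save-and-restore idea, including the "only fires when the RTC hasn't set the date" condition. The stamp file path and function names are hypothetical, and this is not how any particular Raspberry Pi package is implemented. The restore only triggers when the system clock is behind the last saved time, which a working RTC should never produce:

```python
import time
from pathlib import Path

# Hypothetical location for the saved timestamp; not from the thread.
STAMP = Path("/var/lib/last-known-time")

def save_time(path=STAMP, now=None):
    """Record the current epoch time; meant to run periodically and at shutdown."""
    now = time.time() if now is None else now
    path.write_text(f"{int(now)}\n")

def time_to_restore(path=STAMP, now=None):
    """Return the saved epoch time if the system clock is behind it, else None."""
    now = time.time() if now is None else now
    try:
        saved = int(path.read_text().strip())
    except (OSError, ValueError):
        return None  # nothing saved yet, or the file is damaged
    return saved if now < saved else None

if __name__ == "__main__":
    restore = time_to_restore()
    if restore is not None:
        # Setting the clock needs root; clock_settime is Unix-only.
        time.clock_settime(time.CLOCK_REALTIME, float(restore))
```

Run at boot before networking starts, this would leave a clock claiming 1928 at roughly the time of the last shutdown instead, while a clock that is already ahead of the saved value is left untouched.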
To be honest I find the whole time configuration thing to be pretty messy - lots of different tools, lots of different ways to achieve the same thing, to the point that I don't understand any of them properly!
From my point of view, because this box logs data, what I really want is that if the clock isn't set then it picks a known time that it hasn't already used (e.g. 1/1/1970 00:00, but next time it picks 2/1/1970, etc.), because otherwise any data it logs gets overwritten in the database because of duplicate timestamps. In which case I think I have some coding to play with :-)
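One way that "known time it hasn't already used" scheme could look, as a rough sketch only. The counter file and the function name are made up for illustration; the idea is simply to persist a counter and hand out one fresh day per boot without a valid clock:

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def next_fallback_time(counter_file):
    """Return a fallback start time that no earlier call has been given."""
    path = Path(counter_file)
    try:
        used = int(path.read_text().strip())
    except (OSError, ValueError):
        used = 0  # first use, or the counter file is unreadable
    # Store the incremented counter before any data is logged.
    path.write_text(f"{used + 1}\n")
    return EPOCH + timedelta(days=used)
```

The first call gives 1970-01-01 00:00 UTC, the second 1970-01-02, and so on. Note the assumption baked in here: if the box runs for more than a day on a fallback time, its timestamps would run into the next fallback day, so a larger step than one day may be safer.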
Well, I could be wrong but....
If it's on all the time, then there won't be a problem, as it will maintain its own time (more or less), and if it drifts, you could always install NTPD (I think that's the Network Time Protocol Daemon, which should keep the time synced to an Internet clock).
If it's rebooted frequently, then it will rely on the clock for its initial time. You'll just have to make sure it's got a decent battery all the time.
If it's rebooted only occasionally, and it fails as described while you're there, replacing the battery, then simply setting the date/time at a prompt should be sufficient to get it to reboot and get an IP address etc.
Steve
On 5 February 2015 at 22:04, steve-ALUG@hst.me.uk wrote:
Well, I could be wrong but....
All good points, but:
If it's rebooted only occasionally, and it fails as described while you're there, replacing the battery, then simply setting the date/time at a prompt should be sufficient to get it to reboot and get an IP address etc.
You missed: If it's rebooted only occasionally, and it fails as described while I'm not there (and quite possibly not even on the same continent), then I can instruct someone to replace the battery and I can fix everything else remotely once I can get a remote connection, which is how I got stuck this time. In the event they sent the box back to me (as we didn't know what the problem was) but for future use I'd like to be able to help it get back on its feet a little easier.
I don't think the problem is the invalid time as such: aside from maybe expiring leases before they are due, DHCP should recover from the clock being wrong. I think the problem had more to do with the corruption you saw hwclock reporting, as a lot of RTC chips, like the Dallas 130x series, don't come up in a clean state after a battery change.
I found this while testing some self-built embedded devices with RTC chips, and had to add specific code to look for a flag that tells you the RTC isn't properly initialised (a PC BIOS does this as well and generates a prompt).
Sadly, not all embedded devices handle this properly, and the clock can cause all sorts of problems until it is reset (including outputting an impossible date string).
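As an illustration of that kind of check, here is a sketch that assumes a DS1307-style register layout (seconds in register 0 with the clock-halt bit in bit 7, then minutes, hours, weekday, date, month, year, all packed BCD). The thread doesn't say which chip is in the box, so the layout is an assumption, and reading the bytes from the chip is left out:

```python
def bcd(byte):
    """Decode one packed-BCD byte, or return None if a nibble is not 0-9."""
    hi, lo = byte >> 4, byte & 0x0F
    return None if hi > 9 or lo > 9 else hi * 10 + lo

def rtc_looks_sane(regs):
    """Check the first seven timekeeping registers (assumed DS1307 layout)."""
    if len(regs) < 7:
        return False
    if regs[0] & 0x80:  # clock-halt bit set: the oscillator is not running
        return False
    sec = bcd(regs[0] & 0x7F)
    minute = bcd(regs[1] & 0x7F)
    day = bcd(regs[4] & 0x3F)
    month = bcd(regs[5] & 0x1F)
    year = bcd(regs[6])
    if None in (sec, minute, day, month, year):
        return False  # at least one field is not valid BCD
    # Hours are skipped here because the 12/24-hour mode bit changes the encoding.
    return sec < 60 and minute < 60 and 1 <= day <= 31 and 1 <= month <= 12
```

A chip that fails this check would be a candidate for being rewritten with a known-good time before anything else tries to use it.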
Do you actually need the RTC? Could you rely on NTP to set the time if the devices are all on a network?
Just wondering if there is a kernel boot flag to ignore the RTC, or if you could put a script in init.d to run before the network scripts which resets the RTC with hwclock to a clean state (even if it is the wrong time)?
Not elegant I know, but unless you have control of how the embedded system behaves when the clock isn't properly initialised I am not sure what else you could do.
On 7 February 2015 at 16:49, Wayne Stallwood ALUGlist@digimatic.co.uk wrote:
Do you actually need the RTC? Could you rely on NTP to set the time if the devices are all on a network?
Network can't be relied upon so unfortunately RTC is required.
Just wondering if there is a kernel boot flag to ignore the RTC, or if you could put a script in init.d to run before the network scripts which resets the RTC with hwclock to a clean state (even if it is the wrong time)?
Just thinking about it, I wonder whether simply putting: hwclock --systohc .. into init.d would suffice? By that point the kernel should have a time (whether it's correct or not is largely moot, but if the hwclock is accurate then it should (I assume?) have been read long before this), so hwclock would thereafter be a non-corrupt (albeit not necessarily accurate) value. That should prevent the DHCP issue.
It does mean it needs to occur before DHCP though, so maybe /etc/network/if-pre-up.d would be a better place?
Can anyone think of a downside to this?
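A rough sketch of what such a pre-up hook might look like, written as an executable script that could be dropped into /etc/network/if-pre-up.d. One possible downside of an unconditional hwclock --systohc is overwriting a good RTC with a bad system time, so this version only rewrites the RTC when it cannot be read. It assumes hwclock --show exits non-zero for a corrupt RTC, which has not been verified on this hardware, and the function names are hypothetical:

```python
#!/usr/bin/env python3
import subprocess

def run(cmd):
    """Run a command quietly and return its exit status (127 if it is missing)."""
    try:
        return subprocess.run(cmd, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL).returncode
    except OSError:
        return 127

def repair_rtc(run=run):
    """Rewrite the RTC from the system clock, but only if it cannot be read."""
    if run(["hwclock", "--show"]) == 0:
        return "ok"  # RTC is readable: leave it alone
    if run(["hwclock", "--systohc"]) == 0:
        return "rewritten"
    return "failed"

if __name__ == "__main__":
    # The script always exits 0 so that a clock problem never blocks the interface.
    repair_rtc()
```

Hooks in that directory run for every interface being brought up, so the check would repeat per interface; that should be harmless since a readable RTC is left untouched.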
Obvious question. Do you have to use DHCP? I always give servers fixed IP addresses. A headless debian box sounds like a "server" to me.
Mick
Mick Morgan
gpg fingerprint: FC23 3338 F664 5E66 876B 72C0 0A1F E60B 5BAD D312
http://baldric.net
On 6 February 2015 at 14:55, mick mbm@rlogin.net wrote:
Obvious question. Do you have to use DHCP? I always give servers fixed IP addresses. A headless debian box sounds like a "server" to me.
It's a data logging box. We used to set a static IP but as it has to rely on whatever network it gets connected to we've decided DHCP gives us more flexibility; it emails us its address on connection, and if we need to access it we can usually do it via a local VPN.
For a while we had both set up (eth0 static, eth0:0 on DHCP) but seemed to have more problems that way. From memory it was DHCP obliterating the DNS configuration, which I am sure we could have fixed, but moving to DHCP also means less bespoke configuration of each box (there are about half a dozen of them, so it's far from being a major enterprise!)