[LinuxPPS] convergence patch

Fri May 8 22:39:46 CEST 2009

On Friday 08 May 2009 01:14:57 am Bernhard Schiffner wrote:
> On Friday 08 May 2009 03:45:32 Hal V. Engel wrote:
> > On Thursday 07 May 2009 11:16:35 am Bernhard Schiffner wrote:
snip
> Yesterday I had a netbook (with standard Debian Lenny): after 1 day with
> internet and ntp connection it was off by 56 ms(!).
>
> This indicates that there are problems.

Using public time servers is a real crapshoot unless they are carefully
selected.  Before I started using LinuxPPS I found that with the time server
pools I would see offsets in the 10ms to 20ms range on some days, 50ms to 60ms
on others, and somewhere in between on the rest.  When using the time server
pools you get different servers on each restart of ntp.  Most distros are
configured to use server pools by default.

Later on I configured specific servers instead of using the server pools and
the offsets were more consistent, in the 10ms to 15ms range.  Later still I
spent some time locating time servers that were "close" to me, meaning close
on the network: low ping times and a minimal number of network hops.  This
reduced the offsets into the 2ms to 5ms range.  But it took a lot of effort to
find these servers since I could not locate any utilities to help with this.
So even a very carefully configured machine using public time servers would
more than likely not meet your 1ms requirement.  I should add that all of this
was well before the nanokernel was released.

> I observed the behaviour, John described: starting ntp, calming down a
> little bit, something internal switches, offset rises and comes back to
> zero over _hours_.
>
> If you start a host for doing some "coordinated" measuring you should be
> sure about timekeeping in a scale the others do too very soon (minutes).
> (video: 30 ms, machines 10 ms, sound 5ms, networking 1 ms)
> It's much better to have some reserves as LinuxPPS provides.
>
> > The kernel convergence patch should help with startup convergence issues
> > for current linux kernels and also help keep offsets low after startup. 
> > In addition, I use the --panicgate parm when starting the ntp daemon
> > which will cause ntp to sync the system clock to the reference clock at
> > start up.
>
> I missed this. Thanks for this hint!
>
> I watched an ntpd:
> 1.) it starts
> 2.) it has a big offset to the ntp server (0,5 s)
> 3.) it "suddenly" reduces this offset.
> 4.) it starts long-time behavior.
> (seems to grep the drift, use something calculated while booting etc.)
> 5.) if the "variables" gotten in 4. are wrong estimates it now takes hours
> to correct it.
> 6.) You switch off before ntpd stabilizes and 5.) will have a problem next
> time too.

I found that it takes ntp a long time to stabilize its drift file setting
since the PLL takes a while to get a really good lock, and until that happens
the stored frequency will be off by a significant amount after a reboot.  I
found that letting my machine run for 24 hours so that the drift file had a
value that was in the right ballpark helped on subsequent restarts.
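
To put rough numbers on why a wrong drift value matters: a frequency error of
1 ppm accumulates 1 microsecond of offset per second of elapsed time for as
long as it goes uncorrected.  A minimal sketch of that arithmetic follows; the
function name and the 10 ppm example are illustrative assumptions, not values
measured on my system.

```python
def accumulated_offset(freq_error_ppm, elapsed_seconds):
    """Offset in seconds built up by an uncorrected frequency error.

    freq_error_ppm: difference between the stored drift value and the
    oscillator's true frequency error, in parts per million.
    """
    return freq_error_ppm * 1e-6 * elapsed_seconds


# Hypothetical example: a drift file that is wrong by 10 ppm, left
# uncorrected for one hour.
one_hour_ms = accumulated_offset(10.0, 3600) * 1000.0
print("offset after one hour: %.1f ms" % one_hour_ms)
```

With those assumed numbers the uncorrected error comes to about 36 ms per
hour, which is why a drift file that is only roughly right can still leave
ntp working for a long time after a restart.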

> > What I
> > find is my starting offset is typically under 50 microseconds once ntp
> > causes the time to sync with the ref clock when I do this and that ntp
> > without the convergence patch will keep offsets below +-100 microseconds
> > as the oscillator temperature comes up to normal operating temperature
> > (perhaps 15 minutes).  I am hoping that ntp will do better during this
> > warm up period with the convergence patch in place.
>
> My bet: "Yes, you can!" :-)

It appears that this is the case.  Today my system very quickly got the offset 
to under 10us and within 10 minutes had it below 5us and close to 1us in about 
an hour.  I am a happy camper since this is the best that ntp has functioned 
on my system since the release of the nanokernel.

Hal