[LinuxPPS] [PATCH 02/11] Streamline interrupt handling; get rid of a lock and idr_find()

Tue Feb 10 11:42:07 CET 2009

Rodolfo Giometti <giometti at enneenne.com> wrote:
>>    (I could make it totally safe using a seq_lock-like mechanism
>>    if necessary.)
>
> Better. :)

Fine.  As I said, you won't get a crash or anything, just a fucked-up
timestamp.  Under circumstances (5s scheduling delays) when timekeeping
isn't working all that well anyway.  You could get the same thing if
some nut was doing strange things with settimeofday().

But okay, I'll complexify PPS_FETCH to close the hole.

> This still considers a particular implementation. You cannot provide a
> core implementation of anything which delegates locking of its
> internal data to the upper layers, if so your code is broken by
> design!

If I may be so blunt, nonsense.

All code has a contract with the code calling it.
In this case, the contract is, "don't call pps_unregister_source while any
pps_event calls are in progress".  This is quite a reasonable contract,
widely used in multiple bits of the kernel.  the unregister function
frees the pointer, so you shouldn't have anything referencing that
pointer still running.

Kind of like you shouldn't free_pages() until the DMA is complete.

I agree that using an index looked up in an IDR makes the code crash-proof
(you can always detect invalidity of the index), but there are ways of
doing that with pointers, too.

More to the point, using an index looked up in an IDR does NOT
make the code bug-free.  Consider the following hypothetical code
that calls pps_unregister_source in parallel with pps_event:

CPU0			CPU1			CPU2
						call pps_register_source
						allocate PPS struct
call pps_event		
			call pps_unregister_source
			lock IDR
						lock IDR...(wait)
lock IDR...(wait)
			idr_find(/dev/pps0)
			decrement usage to 0
			remove pps from IDR
			unlock IDR
			deallocate PPS
						... lock acquired
						Allocate ID 0
						unlock IDR
...lock acquired
idr_find(/dev/pps0)
Get unexpected PPS device!
pps->usage++
unlock IDR
Do things to wrong PPS device

Now, the current pps_register_source code actually delays initializing
the pps struct until *after* idr_get_new, so the CPU0 pps_event()
call can crash on an uninitialized structure, but even if that's fixed
(which isn't too hard), you still have the pps_event() calling idr_find
and getting the WRONG pps device.

There is flatly NO WAY to fix that bug unless you can bound the time
interval AFTER pps_unregister_source during which pps_event is
legal.  And that requires the cooperation of the caller.  And once
you have a bound, what's simpler than a bound of 0?

Remember what torpedoed every last attempt to make loadable modules
responsible for their own locking?  On a preemptible kernel, there is no
bound on the time that may elapse between the call of a function and the
execution of the first instruction in that function.  It can't increase
its own reference count, because that's already too late!

> Again, forgot the ldisc and any particular implementation. The
> question is: does your locking scheme provide 100% race free code? You
> cannot relay on which PPS client a user may implement. Users can be
> evil! :)

You can't NOT rely on the PPS client.  The client could scribble on
memory and crash the machine.  It could pass in timestmp pointers
to hardware registers with read side effects.  It could do all kinds
of strange things.

Even if you assume that all parameters are correct, and only the timing
can be odd, the client could deallocate the PPS device, wait for the
module to be unloaded, and then start making pps_event calls.  Kaboom.

>> Um... okay, style question.  I'm given to adding comments as I'm
>> puzzling out code, to save the next person the trouble.  In this case,
>> the important side effect was buried in the middle of an expression,
>> and I didn't see it the first time.
>> 
>> Admittedly, it's the only patch to pps_cdev_open() and not strictly
>> related, but is it really that much clutter to add a comment here and
>> there to a patch?  Additional patches are are form of clutter, too,
>> and the alternative is to discourage explanatory comments.
>
> If you wish adding "explanatory comments", please provide a patch
> titled "explanatory comments". If your patch fixes some bugs or adds
> new features you may add whatever comments you wish but related with
> the change you are doing.
>
> Apart those stuff, really do you consider "Bump refcount" an
> "explanatory comment"? ;)

Yes, for the resons explained in the last sentence of the first paragraph
above.  "In this case, the important side effect was buried in the middle
of an expression, and I didn't see it the first time."  Admittedly, it's
not very important.  It just came up while I was editing and I didn't
bother to make it a separate commit.

> > > Please, solve (or explain) the locking issue during
> > > pps_unregister_source() and pps_event(), other things are cosmetics!
> > > :)
> > 
> > Hopefully, I've addressed that.  Would you like me to expand the
>
> I disagree. You explanation relays on ldisc implementation. Please,
> check the code considering only the core and any possible case of
> methods call.

I just demonstrated above that's flatly impossible.  Unless you can
hold off reallocating a PPS ID until all other API calls have finished
(or at least acquired the pps_idr_lock), you will have bugs.

Given that you have to impose *some* bounded time limit between
pps_unregister_source and other calls, why not make that bound 0?

That at least makes the PPS code simple; other things make it more
complicated, and I don't think they help the callers.

The closest you can get is RCU trickery.

The pps-ldisc client relies on the guarantees that the ldisc layer
makes.  Your code does that already.  Suppose that pps_tty_dcd_change
could overlap with pps_tty_close.  Well, after pps_tty_close comes
tty_kref_put, which calls release_one_tty, which calls free_tty_struct(tty);

CPU 0			CPU 1

pps_tty_dcd_change(tty)
			pps_tty_close(tty)
			tty_kref_put(tty)
			 release_one_tty(tty)
			  free_tty_struct(tty)
			   kfree(tty)
Fetch tty->id: SEGFAULT!

(This doesn't happen because the code in tty_release_dev that calls
all of this takes care to ensure that the ldisc is *fully* shut down
before getting on to tty_kref_put().)