Last Comment Bug 526805 - Users of multiple PRRWLock reader locks may deadlock
: Users of multiple PRRWLock reader locks may deadlock
Status: ASSIGNED
:
:
Product: NSPR
NSPR
: other
: x86_64 Linux
: -- normal (vote)
: ---
Assigned To:
:
:
:
:
  Show dependency treegraph
 
Reported: 2009-11-05 10:32 PST by
Modified: 2009-11-19 11:36 PST (History)

0 users (edit)
See Also:
blocking-fennec: ---
blocking2.0: ---
status2.0: ---
blocking1.9.2: ---
status1.9.2: ---
blocking1.9.1: ---
status1.9.1: ---


Attachments
Test Program Code (2.92 KB, text/plain)
2009-11-05 10:36 PST,
no flags Details

[reply] [-] Description 2009-11-05 10:32:17 PST
User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.3)
Gecko/20090909 Fedora/3.5.3-1.fc11 Firefox/3.5.3
Build Identifier: 4.8.1

Details from NSPR development mailing list:

> Sorry to bring up such an old message thread, but an issue has come up
> > related
> > to re-entrant use of PR_RWLock_Rlock().
> >
> > The way PR_RWLock works is that a waiting writer will block any threads
> > attempting to get a new read lock.  If you use read locks in a re-entrant
> > manor, a request for a write lock between the two read lock calls will cause
> > a
> > deadlock (the writer is waiting for the reader to exit, and the reader can't
> > get the re-entrant lock since the writer is waiting).  I am running into
> > this
> > deadlock in my application.
> >
> > I'd like to propose that we modify PR_RWLock to behave differently when a
> > re-entrant read lock is made.  If a thread already holds a read lock and
> > tries
> > to get another readlock, this should be allowed, even if a writer is waiting
> > on
> > the write lock.  Any other threads attempting to get a read lock will have
> > to
> > wait on the writer since it is given priority.  This approach would prevent
> > the
> > writer from being starved due to many active readers, yet it would also
> > allow
> > for safe re-entrant use of read locks without chance of a deadlock.
> >
> > Does the above proposal sound feasible?
Hi Nathan,

We have three implementations of PRRWLock.
Two of them (HAVE_UNIX98_RWLOCK and HAVE_UI_RWLOCK)
are based on native thread libraries. Unless the behavior you
proposed is documented in the Unix 98 reader-writer locks,
if we make this change, we won't be able to use Unix 98
reader-writer locks.

It seems that native thread libraries must solve this problem
if they allow a thread to hold multiple concurrent read locks,
and it seems that your proposal is the obvious solution.
I'm worried that native thread libraries actually implement
your solution but fail to document it.

Wan-Teh


Reproducible: Always
[reply] [-] Comment 1 2009-11-05 10:36:17 PST
Created attachment 410550 [details]
Test Program Code

This is the code for a test program that demonstrates the deadlock.

The program creates 2 threads (a reader and a writer).  The PR_Sleep function
is used to guarantee the following order of events:

- Reader thread obtains a reader lock.
- Writer threads attempts to get writer lock (and is blocked by the reader).
- Reader thread attempts to get a second reader lock (and is blocked by the
waiting writer).
[reply] [-] Comment 2 2009-11-05 10:39:46 PST
I have run the test program on a HP-UX 11.23 ia64 system (where
HAVE_UNIX98_RWLOCK is defined), and the deadlock still occurs.  Does this mean
that it is out of the question to modify the way that NSPR's portable
reader-writer locks work?
[reply] [-] Comment 3 2009-11-05 11:13:17 PST
Nathan, thanks for the bug report.  Could you also run your test program
on Solaris?  That's the other platform where NSPR uses the native reader
writer locks.

I thought about this issue a bit, and I believe that unless we remember
all the threads that own a reader lock, we will need to give up giving
preference to waiting writers (which means writers may be starved)
to allow readers to lock recursively.
[reply] [-] Comment 4 2009-11-05 13:59:13 PST
I have run my test program on a Solaris 9 sparc system and it does not
deadlock.  Here is the stdout output from my test program on this system:

Reader running.  Attempting to get first reader lock.
Reader obtained first reader lock.
Reader sleeping for 10 seconds.
Writer running.  Sleeping for 5 seconds.
Writer attempting to get writer lock.
Reader attempting to get second reader lock.
Reader obtained second reader lock.
Reader releasing second reader lock.
Reader releasing first reader lock.
Writer obtained writer lock.
Writer releasing writer lock.
[reply] [-] Comment 5 2009-11-05 14:18:18 PST
(In reply to comment #3) 
> I thought about this issue a bit, and I believe that unless we remember
> all the threads that own a reader lock, we will need to give up giving
> preference to waiting writers (which means writers may be starved)
> to allow readers to lock recursively.

I believe that you are correct that we need to keep track of all of the threads
who own a reader lock as opposed to simply keeping a count of the number of
readers.

Is it legal for a thread that does not own a reader or writer lock to call
PR_RWLock_Unlock()?  It seems like nothing is preventing this currently.  If we
keep track of readers, it seems as if this should not be allowed since there
would be no way of knowing which thread to remove from the owners list.
[reply] [-] Comment 6 2009-11-16 10:34:53 PST
In light of my test findings, do my suggestions from comment#5 seem like a
viable and acceptable approach for fixing this issue?
[reply] [-] Comment 7 2009-11-16 11:06:17 PST
Nathan, keeping track of all the threads who own a reader lock
may slow down PRRWLock.  It also means we won't be able to use
pthread_rwlock_t on HP-UX.  Since Solaris is open source now,
we should study its implementation of reader-writer locks.  I
did a quick search for pthread_rwlock_rdlock and rw_rdlock at
http://src.opensolaris.org/source/ and found:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/port/threads/rwlock.c

Could you also study the code to find out what its algorithm
is?  Thanks.
[reply] [-] Comment 8 2009-11-19 08:40:52 PST
(In reply to comment #7)
> Nathan, keeping track of all the threads who own a reader lock
> may slow down PRRWLock.  It also means we won't be able to use
> pthread_rwlock_t on HP-UX.  Since Solaris is open source now,
> we should study its implementation of reader-writer locks.  I
> did a quick search for pthread_rwlock_rdlock and rw_rdlock at
> http://src.opensolaris.org/source/ and found:
> 
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/port/threads/rwlock.c
> 
> Could you also study the code to find out what its algorithm
> is?  Thanks.

The Solaris implementation of reader-writer locks keeps a list of readers.  It
also keeps a reference count for each reader to keep track of how many locks it
holds.  When a thread that already holds a reader lock attempts to get another
reader lock, the list of readers is checked to see if a reader lock is already
held by the thread.  If so, the reference count for that thread is simply
increased.

The list of reader threads is also used by the unlock function to ensure that
the calling thread actually holds a reader lock.
[reply] [-] Comment 9 2009-11-19 11:36:38 PST
Nathan: thanks for studying the Solaris code.

This is a tough call.  On the one hand, it is reasonable to expect
that a thread should be able to acquire a reader lock recursively
because the thread should be able to share read-only access to the
resource with itself.  On the other hand, to support this the
reader-writer lock implementation will become very heavyweight,
making it even less appealing than regular locks.

If you really can't restructure your code to avoid recursive
reader locks, we'll have to implement the Solaris algorithm
and also stop using the pthread reader-writer locks on HP-UX.

:

Status:
ASSIGNED