Last Comment Bug 354593 - crash/data corruption and assert in MT_safe_localtime() function
: crash/data corruption and assert in MT_safe_localtime() function
Status: RESOLVED FIXED
:
:
Product: NSPR
NSPR
: other
: x86 Windows XP
: -- normal (vote)
: 4.6.5
Assigned To:
:
:
:
:
  Show dependency treegraph
 
Reported: 2006-09-27 17:26 PDT by
Modified: 2006-11-30 16:35 PST (History)

3 users (edit)
  • gavin.sharp@gmail.com
  • marks@coral8.com
  • nelson@bolyard.me


See Also:
blocking-fennec: ---
blocking2.0: ---
status2.0: ---
blocking1.9.2: ---
status1.9.2: ---
blocking1.9.1: ---
status1.9.1: ---


Attachments
A possible fix for this bug (2.12 KB, patch)
2006-09-28 17:02 PDT,
no flags Details | Diff
proposed patch (5.42 KB, patch)
2006-11-29 19:05 PST,
nelson: review+
nelson: superreview+
Details | Diff
Proposed patch v2 (2.18 KB, patch)
2006-11-30 11:36 PST,
no flags Details | Diff
Proposed patch v2 (correct) (5.65 KB, patch)
2006-11-30 11:40 PST,
nelson: review+
Details | Diff

[reply] [-] Description 2006-09-27 17:26:19 PDT
http://lxr.mozilla.org/nspr/source/nsprpub/pr/src/misc/prtime.c#559

Inside MT_safe_localtime() function we create "on demand" lock to protect it in
MT environment. However, if this function is called from multiple threads
*first time* then we have a problem:
- two threads enter MT_safe_localtime() function in the same time 
- static "monitor" variable is not initialized yet so both threads decide to
create it (lines 560-562)
- note that second thread overwrites the value of "monitor" variable!
- now we have two threads executing this code that MUST be executed from one
thread only (this is a potential crash or data corruption)
- then we go to unlock on line 602 and since we overwrite the monitor value for
one of the threads, the code will assert either because thread tries to unlock
"monitor" locked by another thread or because thread tries to unlock "monitor"
which is not locked

we are working on a patch but there might be more code like this....
[reply] [-] Comment 1 2006-09-27 19:12:49 PDT
ok, fixed... the new libxml2 version slightly change the return protocol for
strings and old xpath code we had silently broke... I upgraded operator to use
new XmlForest code and removed the old code completely
[reply] [-] Comment 2 2006-09-27 19:14:11 PDT
(In reply to comment #1)

ops... wrong bug,, wrong bugzilla.... :(
[reply] [-] Comment 3 2006-09-28 17:02:18 PDT
Created attachment 240553 [details]
A possible fix for this bug

Created "localtime" lock during NSPR initialization.
[reply] [-] Comment 4 2006-09-28 18:42:13 PDT
Comment on attachment 240553 [details]
A possible fix for this bug

Thank you.  This patch is correct but incomplete because
there are three implementations of PR_Cleanup.  The other
two implementations are in ptthread.c and btthread.c.
See http://lxr.mozilla.org/nspr/ident?i=PR_Cleanup
[reply] [-] Comment 5 2006-11-29 19:05:10 PST
Created attachment 247027 [details]
proposed patch

Mark, Alexey, could you please test this patch?

Note that I still keep the "if (needLock)" test in MT_safe_localtime
because I haven't verified that NSPR is guaranteed to be initialized
when MT_safe_localtime is called.  Do you remember if you verified
that's the case?
[reply] [-] Comment 6 2006-11-29 19:06:29 PST
Added Mark to the cc list.  Mark, please see my comment 5.  Thanks.
[reply] [-] Comment 7 2006-11-30 09:03:00 PST
yes, the patch works just fine, thanks!
[reply] [-] Comment 8 2006-11-30 09:42:21 PST
Comment on attachment 247027 [details]
proposed patch

Aleksey, thanks for testing the patch.

When you review the patch, note that there is only one implementation
of NSPR initialization (_PR_InitStuff) but there are two implementations
of NSPR cleanup (PR_Cleanup), one for Unix (in ptthread.c) and one for
Windows/OS2/others (in prinit.c).
[reply] [-] Comment 9 2006-11-30 10:15:18 PST
Comment on attachment 247027 [details]
proposed patch

Two issues. 

1. In mozilla/nsprpub/pr/src/misc/prtime.c The patch adds this line:

>+#define _PR_HAVE_LOCALTIME_MONITOR 1

I'm not sure what this symbol is here to accomplish, but it appears
to me that if this code were to be compiled with the compiler option
-u _PR_HAVE_LOCALTIME_MONITOR 
the code would be incorrect.  

I'd suggest either (a) getting rid of that symbol and the ifdefs that
use it, making the code bracketed by those ifdefs be unconditionally 
compiled or (b) instead of deleting 3 lines of code in MT_safe_localtime,
make those 3 lines of code be conditionally compiled with the condition
#ifndef MT_safe_localtime

2. I think new function _PR_CleanupTime shuold test global variable
"monitor" to ensure that it it non-NULL before passing it to PR_DestroyLock.
Otherwise, I think we're going to start seeing crashes there.
[reply] [-] Comment 10 2006-11-30 10:29:57 PST
Comment on attachment 247027 [details]
proposed patch

I'm using the _PR_HAVE_LOCALTIME_MONITOR macro to get a simpler
C preprocessor boolean expression than
!defined(HAVE_INT_LOCALTIME_R) && !defined(HAVE_POINTER_LOCALTIME_R).
It specifies the condition under which there is a static PRLock *monitor
variable that we need to create or destroy.

The reason for this design is to allow _PR_InitTime and _PR_CleanupTime
to do more than creating and destroying the localtime monitor in the
future.  I'll implement your suggested change if you insist because I
want to close this bug.  Just let me know.

None of the _PR_CleanupXXX functions check for null pointers before
destroying the objects they point to.  This is because the _PR_CleanupXXX
functions are only called by PR_Cleanup, which returns immediately
if NSPR isn't initialized.  I can add the null pointer check.
[reply] [-] Comment 11 2006-11-30 11:29:43 PST
Comment on attachment 247027 [details]
proposed patch

I see that I accidentally replaced aleksey's r+ with an r-, when I intended to
change sr? to sr-.  Wish I could undo that.  I will simply set r- back to r+,
but it will not restore aleksey's name to that review. 

In light of Wan-Teh's explanation for that new 
symbol, I will change my 
review to sr+.
[reply] [-] Comment 12 2006-11-30 11:36:16 PST
Created attachment 247085 [details]
Proposed patch v2

Nelson, I implemented your second suggested change, to test
the pointer 'monitor' for NULL before passing it to PR_DestroyLock.

I didn't implement your first suggested change, but addressed
the issue by renaming the macro (removing the _PR_ prefix) and
adding a comment to explain it.

Do you like this patch better?
[reply] [-] Comment 13 2006-11-30 11:40:10 PST
Created attachment 247086 [details]
Proposed patch v2 (correct)

This is the complete patch.  Sorry about the mistake.
[reply] [-] Comment 14 2006-11-30 13:25:58 PST
Comment on attachment 247027 [details]
proposed patch

I checked in the original "proposed patch" on the NSPR trunk (NSPR 4.7)
first.  Once Nelson approves the minor changes in "proposed patch v2
(correct)", I'll check that in.

Checking in pr/include/private/primpl.h;
/cvsroot/mozilla/nsprpub/pr/include/private/primpl.h,v  <--  primpl.h
new revision: 3.86; previous revision: 3.85
done
Checking in pr/src/misc/prinit.c;
/cvsroot/mozilla/nsprpub/pr/src/misc/prinit.c,v  <--  prinit.c
new revision: 3.45; previous revision: 3.44
done
Checking in pr/src/misc/prtime.c;
/cvsroot/mozilla/nsprpub/pr/src/misc/prtime.c,v  <--  prtime.c
new revision: 3.25; previous revision: 3.24
done
Checking in pr/src/pthreads/ptthread.c;
/cvsroot/mozilla/nsprpub/pr/src/pthreads/ptthread.c,v  <--  ptthread.c
new revision: 3.68; previous revision: 3.67
done
[reply] [-] Comment 15 2006-11-30 16:08:52 PST
Comment on attachment 247086 [details]
Proposed patch v2 (correct)

r=nelson.  Thanks, Wan-Teh.
[reply] [-] Comment 16 2006-11-30 16:35:16 PST
I checked in "proposed patch v2 (correct)" on the NSPR trunk (NSPR 4.7)
and NSPR_4_6_BRANCH (NSPR 4.6.5).

Checking in prtime.c;
/cvsroot/mozilla/nsprpub/pr/src/misc/prtime.c,v  <--  prtime.c
new revision: 3.26; previous revision: 3.25
done

Checking in include/private/primpl.h;
/cvsroot/mozilla/nsprpub/pr/include/private/primpl.h,v  <--  primpl.h
new revision: 3.84.2.1; previous revision: 3.84
done
Checking in src/misc/prinit.c;
/cvsroot/mozilla/nsprpub/pr/src/misc/prinit.c,v  <--  prinit.c
new revision: 3.43.2.1; previous revision: 3.43
done
Checking in src/misc/prtime.c;
/cvsroot/mozilla/nsprpub/pr/src/misc/prtime.c,v  <--  prtime.c
new revision: 3.21.2.2; previous revision: 3.21.2.1
done
Checking in src/pthreads/ptthread.c;
/cvsroot/mozilla/nsprpub/pr/src/pthreads/ptthread.c,v  <--  ptthread.c
new revision: 3.66.2.1; previous revision: 3.66
done

:

Status:
RESOLVED FIXED