[JIRA] Issue Comment Edited: (SNOW-196) Deadlock in LLTextureFetchWorker::lockWorkMutex / LLThread::lockData / LLTextureFetch::lockQueue

Aleric Inglewood (JIRA) no-reply at lindenlab.cascadeo.com
Wed Jan 20 16:27:32 PST 2010


    [ http://jira.secondlife.com/browse/SNOW-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=160888#action_160888 ] 

Aleric Inglewood edited comment on SNOW-196 at 1/20/10 4:25 PM:
----------------------------------------------------------------

Hi Bao, some random thoughts:


Consider the following locking theory:

If a thread locks 'A' while holding lock 'B' you can see that as a directed edge
between two nodes A and B, with the arrow going from A to B.

If all those edges together form a graph with a loop in it, and each edge
of the loop can belong to a different thread, then you have a potential deadlock.

In this case we have a loop with three nodes and three edges:

{noformat}
                        A
                       / \
                      B---C
{noformat}

with all arrows pointing clockwise (or all counterclockwise, it doesn't matter) ;).
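
To make the theory a bit more concrete, here is a minimal sketch (not viewer
code) that records every "locks X while holding Y" pair as a directed edge and
then searches the resulting graph for a loop. The names LockGraph, addEdge,
hasCycle and triangleHasLoop are made up for this illustration, and the three
edges are just the A/B/C triangle above, not the actual viewer mutexes.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical helper, not part of the viewer: a directed graph of lock names.
class LockGraph {
public:
    // Record "a thread locks `locked` while holding `held`" as the edge
    // locked -> held (the arrow direction used in the text above).
    void addEdge(const std::string& locked, const std::string& held) {
        mEdges[locked].insert(held);
        mEdges[held];  // make sure the node exists even without outgoing edges
    }

    // True if the recorded edges contain a loop, i.e. a potential deadlock.
    bool hasCycle() const {
        std::map<std::string, int> state;  // 0 = unvisited, 1 = on stack, 2 = done
        for (const auto& node : mEdges) {
            if (visit(node.first, state)) return true;
        }
        return false;
    }

private:
    // Depth-first search; reaching a node that is still on the stack means a loop.
    bool visit(const std::string& node, std::map<std::string, int>& state) const {
        if (state[node] == 1) return true;
        if (state[node] == 2) return false;
        state[node] = 1;
        for (const auto& next : mEdges.at(node)) {
            if (visit(next, state)) return true;
        }
        state[node] = 2;
        return false;
    }

    std::map<std::string, std::set<std::string>> mEdges;
};

// The triangle from the diagram: three edges, each possibly from another thread.
bool triangleHasLoop() {
    LockGraph g;
    g.addEdge("A", "B");
    g.addEdge("B", "C");
    assert(!g.hasCycle());  // two edges alone cannot form a loop
    g.addEdge("C", "A");
    return g.hasCycle();
}
```

Calling triangleHasLoop() reports a loop only once the third edge is added,
which is the point of the theory: any two of the three nestings are harmless
on their own.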

I've only ever seen this bug directly after a message about a missing texture, so
my guess would be that this code path (detection of a missing texture) causes 
one of the edges.

It seems logical to cut the loop by removing that particular edge (since it's the
least likely to occur).
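
As a rough illustration of what "removing an edge" can mean in code: an edge
only exists while one lock is taken with another still held, so releasing the
first lock before taking the second makes that edge disappear. The sketch
below assumes the data needed under the second lock can be copied out first.
Worker, Queue, reportNested and reportUnnested are hypothetical stand-ins, not
the viewer's classes or the actual missing-texture code path.

```cpp
#include <mutex>
#include <vector>

// Hypothetical stand-ins; not the viewer's mutexes or data.
struct Worker {
    std::mutex workMutex;
    int pending = 0;
};

struct Queue {
    std::mutex queueMutex;
    std::vector<int> items;
};

// Before: queueMutex is locked while workMutex is held, which is one edge
// in the lock graph.
void reportNested(Worker& w, Queue& q) {
    std::lock_guard<std::mutex> work(w.workMutex);
    std::lock_guard<std::mutex> queue(q.queueMutex);
    q.items.push_back(w.pending);
}

// After: same result, but the two locks are never held at the same time,
// so this function no longer contributes an edge.
void reportUnnested(Worker& w, Queue& q) {
    int value;
    {
        std::lock_guard<std::mutex> work(w.workMutex);
        value = w.pending;  // copy what is needed while holding workMutex
    }                       // workMutex is released here
    std::lock_guard<std::mutex> queue(q.queueMutex);
    q.items.push_back(value);
}
```

The price is that the copied value may be stale by the time it is pushed, so
this only works where that is acceptable.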


> Deadlock in LLTextureFetchWorker::lockWorkMutex / LLThread::lockData / LLTextureFetch::lockQueue
> ------------------------------------------------------------------------------------------------
>
>                 Key: SNOW-196
>                 URL: http://jira.secondlife.com/browse/SNOW-196
>             Project: 6. Second Life Snowglobe - SNOW
>          Issue Type: Bug
>          Components: Source Code 
>    Affects Versions: Snowglobe 1.2
>            Reporter: Aleric Inglewood
>            Assignee: bao linden
>             Fix For: Snowglobe 1.3
>
>
> Deadlock involving three threads:
> {noformat}
> (gdb) info threads
>   6 Thread 0x7f01ed04f950 (LWP 8014)  0x00007f020b7078f6 in *__GI___poll (fds=0x7f01ed04ef00, nfds=1, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:87
>   5 Thread 0x7f01fd744950 (LWP 8012)  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:130
>   4 Thread 0x7f01fdf45950 (LWP 8011)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:261
>   3 Thread 0x7f01fe746950 (LWP 8010)  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:130
>   2 Thread 0x7f01fef47950 (LWP 8009)  0x00007f020f61ff41 in nanosleep () from /lib/libpthread.so.0
> * 1 Thread 0x7f0201813850 (LWP 8005)  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:130
> {noformat}
> Thread 1 (main thread) waits on lock owned by thread 5:
> {noformat}
> #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:130
> #1  0x00007f020f61aab0 in _L_lock_102 () from /lib/libpthread.so.0
> #2  0x00007f020f61a39e in __pthread_mutex_lock (mutex=0x7f01f25f1bd0) at pthread_mutex_lock.c:86
> #3  0x00007f020c31bf7e in apr_thread_mutex_lock (mutex=0x7f01f25f1bc8) at ../locks/unix/thread_mutex.c:92
> #4  0x00000000021ea81c in LLMutex::lock (this=0x7f01e78ccfc8) at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llcommon/llthread.cpp:307
> #5  0x00000000016335d2 in LLTextureFetchWorker::lockWorkMutex (this=0x7f01e78cce90)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/newview/lltexturefetch.cpp:196
> #6  0x000000000162614f in LLTextureFetch::receiveImagePacket (this=0x7f01f81fb8c0, host=@0x7f01f1f92b08, id=@0x7fff1985ac50, packet_num=38, data_size=1000, 
>     data=0x7f01e7a25300 <binary packet data omitted>) at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/newview/lltexturefetch.cpp:1994
> #7  0x00000000017cef4e in LLViewerImageList::receiveImagePacket (msg=0x7f01f1f89640, user_data=0x0)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/newview/llviewerimagelist.cpp:1225
> #8  0x0000000001de11f9 in LLMessageTemplate::callHandlerFunc (this=0x7f01f217d0a0, msgsystem=0x7f01f1f89640)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llmessage/llmessagetemplate.h:376
> #9  0x0000000001e169a6 in LLTemplateMessageReader::decodeData (this=0x7f01f21e1000, buffer=0x7f01f1f8da78 "@", sender=@0x7fff1985b240)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llmessage/lltemplatemessagereader.cpp:714
> #10 0x0000000001e16f3d in LLTemplateMessageReader::readMessage (this=0x7f01f21e1000, buffer=0x7f01f1f8da78 "@", sender=@0x7fff1985b240)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llmessage/lltemplatemessagereader.cpp:795
> #11 0x0000000001dd6281 in LLMessageSystem::checkMessages (this=0x7f01f1f89640, frame_count=567)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llmessage/message.cpp:727
> #12 0x0000000001dd6625 in LLMessageSystem::checkAllMessages (this=0x7f01f1f89640, frame_count=567, http_pump=0x7f01f1f52ad0)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llmessage/message.cpp:4023
> #13 0x0000000000925f23 in LLAppViewer::idleNetwork (this=0x68ea7f0) at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/newview/llappviewer.cpp:3642
> #14 0x000000000094350d in LLAppViewer::idle (this=0x68ea7f0) at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/newview/llappviewer.cpp:3287
> #15 0x0000000000944c26 in LLAppViewer::mainLoop (this=0x68ea7f0) at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/newview/llappviewer.cpp:912
> #16 0x0000000001c40a81 in main (argc=1, argv=0x7fff1985bc78) at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/newview/llappviewerlinux.cpp:209
> (gdb) fr 2
> #2  0x00007f020f61a39e in __pthread_mutex_lock (mutex=0x7f01f25f1bd0) at pthread_mutex_lock.c:86
> 86	      LLL_MUTEX_LOCK (mutex);
> (gdb) p *mutex
> $9 = {__data = {__lock = 2, __count = 0, __owner = 8012, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
>   __size = "\002\000\000\000\000\000\000\000L\037\000\000\001", '\0' <repeats 26 times>, __align = 2}
> {noformat}
> Thread 3 waits on lock owned by thread 1:
> {noformat}
> #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:130
> #1  0x00007f020f61aab0 in _L_lock_102 () from /lib/libpthread.so.0
> #2  0x00007f020f61a39e in __pthread_mutex_lock (mutex=0x7f01f8204bf0) at pthread_mutex_lock.c:86
> #3  0x00007f020c31bf7e in apr_thread_mutex_lock (mutex=0x7f01f8204be8) at ../locks/unix/thread_mutex.c:92
> #4  0x00000000021ea81c in LLMutex::lock (this=0x7f01f81fc978) at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llcommon/llthread.cpp:307
> #5  0x00000000016335f0 in LLTextureFetch::lockQueue (this=0x7f01f81fb8c0)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/newview/lltexturefetch.h:83
> #6  0x0000000001637221 in LLTextureFetchWorker::DecodeResponder::completed (this=0x7f01e003a570, success=true, raw=0x7f01fa01afa0, aux=0x0)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/newview/lltexturefetch.cpp:121
> #7  0x0000000001cdf80b in LLImageDecodeThread::ImageRequest::finishRequest (this=0x7f01e7f56770, completed=true)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llimage/llimageworker.cpp:166
> #8  0x00000000021bb20e in LLQueuedThread::processNextRequest (this=0x7f01f81e34f0)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llcommon/llqueuedthread.cpp:431
> #9  0x00000000021bb364 in LLQueuedThread::run (this=0x7f01f81e34f0) at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llcommon/llqueuedthread.cpp:491
> #10 0x00000000021eb672 in LLThread::staticRun (apr_threadp=0x7f01f81e4638, datap=0x7f01f81e34f0)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llcommon/llthread.cpp:83
> #11 0x00007f020c328c01 in dummy_worker (opaque=0x7f01f81e4638) at ../threadproc/unix/thread.c:142
> #12 0x00007f020f618faa in start_thread (arg=<value optimized out>) at pthread_create.c:300
> #13 0x00007f020b71029d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
> #14 0x0000000000000000 in ?? ()
> (gdb) fr 2
> #2  0x00007f020f61a39e in __pthread_mutex_lock (mutex=0x7f01f8204bf0) at pthread_mutex_lock.c:86
> 86	      LLL_MUTEX_LOCK (mutex);
> Current language:  auto; currently c
> (gdb) p *mutex
> $10 = {__data = {__lock = 2, __count = 0, __owner = 8005, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
>   __size = "\002\000\000\000\000\000\000\000E\037\000\000\001", '\0' <repeats 26 times>, __align = 2}
> {noformat}
> Thread 5 waits on lock owned by thread 3:
> {noformat}
> #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:130
> #1  0x00007f020f61aab0 in _L_lock_102 () from /lib/libpthread.so.0
> #2  0x00007f020f61a39e in __pthread_mutex_lock (mutex=0x7f01f81e6650) at pthread_mutex_lock.c:86
> #3  0x00007f020c31bf7e in apr_thread_mutex_lock (mutex=0x7f01f81e6648) at ../locks/unix/thread_mutex.c:92
> #4  0x00000000021ea81c in LLMutex::lock (this=0x7f01f80090c0) at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llcommon/llthread.cpp:307
> #5  0x00000000021bd79c in LLThread::lockData (this=0x7f01f81e34f0) at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llcommon/llthread.h:179
> #6  0x00000000021bae80 in LLQueuedThread::generateHandle (this=0x7f01f81e34f0)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llcommon/llqueuedthread.cpp:196
> #7  0x0000000001ce0ee5 in LLImageDecodeThread::decodeImage (this=0x7f01f81e34f0, image=0x7f01fa0463c0, priority=649623872, discard=1, needs_aux=0, 
>     responder=0x7f01e00058c0) at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llimage/llimageworker.cpp:71
> #8  0x000000000162df63 in LLTextureFetchWorker::doWork (this=0x7f01e78cce90, param=0)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/newview/lltexturefetch.cpp:940
> #9  0x00000000021f764f in LLWorkerThread::WorkRequest::processRequest (this=0x7f01e728f670)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llcommon/llworkerthread.cpp:167
> #10 0x00000000021bb1d5 in LLQueuedThread::processNextRequest (this=0x7f01f81fb8c0)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llcommon/llqueuedthread.cpp:425
> #11 0x00000000021bb364 in LLQueuedThread::run (this=0x7f01f81fb8c0) at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llcommon/llqueuedthread.cpp:491
> #12 0x00000000021eb672 in LLThread::staticRun (apr_threadp=0x7f01f81fcba8, datap=0x7f01f81fb8c0)
>     at /usr/src/secondlife/secondlife/snowglobe/snowglobe-svn/indra/llcommon/llthread.cpp:83
> #13 0x00007f020c328c01 in dummy_worker (opaque=0x7f01f81fcba8) at ../threadproc/unix/thread.c:142
> #14 0x00007f020f618faa in start_thread (arg=<value optimized out>) at pthread_create.c:300
> #15 0x00007f020b71029d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
> #16 0x0000000000000000 in ?? ()
> (gdb) fr 2
> #2  0x00007f020f61a39e in __pthread_mutex_lock (mutex=0x7f01f81e6650) at pthread_mutex_lock.c:86
> 86	      LLL_MUTEX_LOCK (mutex);
> Current language:  auto; currently c
> (gdb) p *mutex
> $11 = {__data = {__lock = 2, __count = 0, __owner = 8010, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
>   __size = "\002\000\000\000\000\000\000\000J\037\000\000\001", '\0' <repeats 26 times>, __align = 2}
> {noformat}
