Opened 10 years ago
Closed 10 years ago
#45309 closed defect (fixed)
p11-kit child atfork handler causes gimp forks to crash due to gegl/glib2 dlclose(), no gimp plugins available
Reported by: | anddam (Andrea D'Amore) | Owned by: | ryandesign (Ryan Carsten Schmidt) |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | ports | Version: | |
Keywords: | Cc: | cooljeanius (Eric Gallager), eborisch (Eric A. Borisch), jon.fairbairn@…, ewen-naos-nz (Ewen McNeill), kurtjaeke@…, mdkcore@…, p-bro, wdc@…, su-v, machsna (J. ‘mach’ Wust), ClintO (Clint O), Crosmatron, dershow, lalakerhater@…, skymoo (Adam Mercer), jpenney (Jason Penney), jeremyhu (Jeremy Huddleston Sequoia), dtakahashi@…, EnekoGotzon | |
Port: | gegl p11-kit gimp2 glib2 |
Description
GIMP doesn't load its plug-ins at boot, it prints out a dbus-related error on standard error
Error spawning command line `launchctl getenv DBUS_LAUNCHD_SESSION_BUS_SOCKET': Child process killed by signal 11
and then
LibGimpBase-WARNING **: gimp: gimp_wire_read(): error
The same symptoms have been confirmed by the port maintainer.
Running gimp with --verbose provides a pair of
Querying plug-in: '/opt/local/lib/gimp/2.0/plug-ins/blinds' Terminating plug-in: '/opt/local/lib/gimp/2.0/plug-ins/blinds'
output lines per each plug-in, this results in a GIMP instance with no plug-in that is of very little help, among other features it won't read/write any format other than its own XCF.
On my 10.9.5 system with dbus-system daemon and dbus-session agent from MacPorts running I have a DBUS_LAUNCHD_SESSION_BUS_SOCKET env variable in shell but not in launchd environment, this is probably why GIMP is complaining about the getenv at start.
I'm attaching full standard output and error.
Attachments (8)
Change History (68)
Changed 10 years ago by anddam (Andrea D'Amore)
Attachment: | gimp-out.txt added |
---|
Changed 10 years ago by anddam (Andrea D'Amore)
Attachment: | gimp-err.txt added |
---|
standard error with --verbose flag
comment:1 Changed 10 years ago by dbevans (David B. Evans)
Port: | gimp gimp-app removed |
---|---|
Status: | new → assigned |
Version: | 2.3.1 |
Thanks for opening this ticket and confirming the problem. It appears there are two separate issues involved here, first the dbus problem and secondly the gimp plugin one.
The dbus issue seems to be with the GDBus API provided by the GIO component of glib2. The stock code expects the environment variable DBUS_SESSION_BUS_ADDRESS to be set a session debus address of the form 'unix:path=<path to dbus session socket>'. We have added custom code that attempts to obtain the socket path using the the launchctl command mentioned in the error message if DBUS_SESSION_BUS_ADDRESS is not set.
On my 10.9.5 system, manually running the command yields the socket address as follows.
$ launchctl getenv DBUS_LAUNCHD_SESSION_BUS_SOCKET /tmp/launch-vW5khj/unix_domain_listener
I can override the GIMP dbus error by setting DBUS_SESSION_BUS_ADDRESS with this value
export DBUS_SESSION_BUS_ADDRESS="unix:path=/tmp/launch-vW5khj/unix_domain_listener"
You can get more debugging output from glib2 by setting the additional env variable
export G_DBUS_DEBUG=address
This produces the additional output
GDBus-debug:Address: In g_dbus_address_get_for_bus_sync() for bus type 'session' GDBus-debug:Address: env var DBUS_SESSION_BUS_ADDRESS is not set GDBus-debug:Address: env var DBUS_SYSTEM_BUS_ADDRESS is not set GDBus-debug:Address: env var DBUS_STARTER_BUS_TYPE is not set GDBus-debug:Address: launchctl command line: `launchctl getenv DBUS_LAUNCHD_SESSION_BUS_SOCKET' GDBus-debug:Address: Cannot look-up address bus type 'session': Error spawning command line `launchctl getenv DBUS_LAUNCHD_SESSION_BUS_SOCKET': Child process killed by signal 11
This give me an idea of where to look in glib2 to see what's going on.
No idea what's going on with the plugin load failures as yet. Will look at it today.
comment:4 follow-up: 5 Changed 10 years ago by eborisch (Eric A. Borisch)
Same symptoms on 10.6.8, FWIW.
comment:5 Changed 10 years ago by dbevans (David B. Evans)
Replying to eborisch@…:
Same symptoms on 10.6.8, FWIW.
I've reproduced the same symptoms with fresh builds on SL, Mavericks and even Yosemite Beta 5 so it looks like OS version is not an issue.
comment:6 Changed 10 years ago by dbevans (David B. Evans)
Status:
So far I've been able to confirm that the dbus issue and the failing plugins have a common element: they both involve spawning a child process that terminates prematurely.
In the dbus case, g_dbus_address_get_for_bus_sync() attempts to execute the launchctl command in a spawned child process which reports killed by signal 11. However, this only happens when running gimp. For example, running eog (or most other GNOME apps) the child process is spawned successfully and the correct result returned.
$ eog GDBus-debug:Address: In g_dbus_address_get_for_bus_sync() for bus type 'session' GDBus-debug:Address: env var DBUS_SESSION_BUS_ADDRESS is not set GDBus-debug:Address: env var DBUS_SYSTEM_BUS_ADDRESS is not set GDBus-debug:Address: env var DBUS_STARTER_BUS_TYPE is not set GDBus-debug:Address: launchctl command line: `launchctl getenv DBUS_LAUNCHD_SESSION_BUS_SOCKET' GDBus-debug:Address: launchctl stdout: 0000: 2f 74 6d 70 2f 6c 61 75 6e 63 68 2d 37 4d 46 69 /tmp/launch-7MFi 0010: 64 37 2f 75 6e 69 78 5f 64 6f 6d 61 69 6e 5f 6c d7/unix_domain_l 0020: 69 73 74 65 6e 65 72 0a 00 istener.. GDBus-debug:Address: Returning address 'unix:path=/tmp/launch-7MFid7/unix_domain_listener' for bus type 'session'
In the case of the GIMP plugins, the failure point occurs when GIMP is querying the individual plugins for their characteristics. The plugin is executed as a child process which communicates bidirectionally with the main program using a pair of pipes. After sending a query command to each plugin, the main process attempts to read an expected response but immediately gets an EOF instead and the plugin is skipped. A crash report for the child is produced as follows:
Process: gimp-2.8 [62774] Path: /opt/local/bin/gimp-2.8 Identifier: gimp-2.8 Version: ??? Code Type: X86-64 (Native) Parent Process: gimp-2.8 [62731] Responsible: Terminal [361] User ID: 501 Date/Time: 2014-10-09 12:14:46.255 -0700 OS Version: Mac OS X 10.9.5 (13F34) Report Version: 11 Anonymous UUID: 25BD4C19-81DA-575C-A9EB-3C11B6BF2F84 Crashed Thread: 0 Dispatch queue: com.apple.main-thread Exception Type: EXC_BAD_ACCESS (SIGSEGV) Exception Codes: KERN_INVALID_ADDRESS at 0x00000001118b69b5 VM Regions Near 0x1118b69b5: __LINKEDIT 0000000110f36000-0000000110f47000 [ 68K] r--/rwx SM=COW /opt/local/lib/*.dylib --> MALLOC_TINY 00007ff479400000-00007ff479800000 [ 4096K] rw-/rwx SM=COW Application Specific Information: *** multi-threaded process forked *** crashed on child side of fork pre-exec Thread 0 Crashed:: Dispatch queue: com.apple.main-thread 0 ??? 0x00000001118b69b5 0 + 4589316533 1 libsystem_pthread.dylib 0x00007fff85d06f5b _pthread_fork_child_postinit + 69 2 libsystem_c.dylib 0x00007fff85c0317e fork + 29 3 libglib-2.0.0.dylib 0x000000010f692067 fork_exec_with_pipes + 695 (gspawn.c:1337) 4 libglib-2.0.0.dylib 0x000000010f691293 g_spawn_async_with_pipes + 851 (gspawn.c:661) 5 libglib-2.0.0.dylib 0x000000010f690f26 g_spawn_async + 198 (gspawn.c:146) 6 gimp 0x000000010e078486 gimp_plug_in_open + 1894 (gimpplugin.c:363) 7 gimp 0x000000010e0826a1 gimp_plug_in_manager_call_query + 641 (gimppluginmanager-call.c:75) 8 gimp 0x000000010e087e9e gimp_plug_in_manager_query_new + 382 (gimppluginmanager-restore.c:355) 9 gimp 0x000000010e087688 gimp_plug_in_manager_restore + 536 (gimppluginmanager-restore.c:112) 10 gimp 0x000000010e0b4a71 gimp_real_restore + 97 (gimp.c:715) 11 libgobject-2.0.0.dylib 0x000000010f559674 g_cclosure_marshal_VOID__POINTER + 244 (gmarshal.c:1201) 12 libgobject-2.0.0.dylib 0x000000010f553398 g_type_class_meta_marshal + 120 (gclosure.c:966) 13 libgobject-2.0.0.dylib 0x000000010f55288a g_closure_invoke + 522 (gclosure.c:773) 14 libgobject-2.0.0.dylib 0x000000010f57583b signal_emit_unlocked_R + 2155 (gsignal.c:3591) 15 libgobject-2.0.0.dylib 0x000000010f5774ee g_signal_emit_valist + 6046 (gsignal.c:3309) 16 libgobject-2.0.0.dylib 0x000000010f577e94 g_signal_emit + 356 (gsignal.c:3365) 17 gimp 0x000000010e0b1849 gimp_restore + 1449 (gimp.c:1013) 18 gimp 0x000000010dd56704 app_run + 708 (app.c:221) 19 gimp 0x000000010dd5a017 main + 1255 (main.c:474) 20 libdyld.dylib 0x00007fff85d005fd start + 1 Thread 0 crashed with X86 Thread State (64-bit): rax: 0x0000000000000004 rbx: 0x0000000000000028 rcx: 0x00000001118b69b5 rdx: 0x0000000000000000 rdi: 0x0000000000000103 rsi: 0x0000000000000403 rbp: 0x00007fff51eb5440 rsp: 0x00007fff51eb5418 r8: 0x0000000000000003 r9: 0x0000000000000000 r10: 0x00000000000002bc r11: 0x0000000000000246 r12: 0x000000010e0786b0 r13: 0x0000000000000000 r14: 0x000000010e6d01e0 r15: 0x0000000000000001 rip: 0x00000001118b69b5 rfl: 0x0000000000010202 cr2: 0x00000001118b69b5 Logical CPU: 3 Error Code: 0x00000014 Trap Number: 14
Again spawning the child process has failed (SIGSEGV).
No idea what's causing the problem and why it is specific to GIMP at this point. If any one else has an idea about what's going on here or better how to fix it please let me know. I'm heading out for the weekend but should get back to this on Sunday.
comment:7 Changed 10 years ago by neverpanic (Clemens Lang)
On OS X, you cannot call non-async-safe functions in a child process fork(2)
ed from a multithreaded process before calling execve(2)
. This looks like some post-fork-phtread callback code that calls stuff it shouldn't call. A valid solution for this kind of problem is to replace the fork(2)
and execve(2)
combination with posix_spawn(2)
.
Please make sure upstream is aware of this problem.
For further reference, for example: http://lists.gnu.org/archive/html/coreutils/2013-11/msg00083.html. Googling for
*** multi-threaded process forked ***
also gets you lots of similar hits.
comment:8 follow-up: 12 Changed 10 years ago by ewen-naos-nz (Ewen McNeill)
I ran into this same problem on Mavericks. After spending a while trying reactivating older ports (eg, older gtk2, older glib2 2.40) -- which didn't seem to solve the problem, and involved lots of rebuilding things depending on them, I started looking at the glib2 source.
In particular the verbose error output says:
Error spawning command line `launchctl getenv DBUS_LAUNCHD_SESSION_BUS_SOCKET': Child process killed by signal 11 Dynamic session lookup supported but failed: launchctl terminated abnormally without any error message Not enough memory
which implies that maybe it's a memory allocation being failed. So I started following the glib2 call chain for running a process, from the launchctl example above (seemed easier than trying to find where the plugins get loaded, since I imagine they're going through a common glib2 function.
The call path seems to be:
- g_spawn_command_line_sync()
- g_spawn_sync()
- fork_exec_with_pipes()
- do_exec()
- Optionally: (* child_setup)(user_data)
- g_execute()
and AFAICT in g_execute() (in glib/gspawn.c) it looks like there is a memory allocation performed if the thing being run does not have a full path (around line 1721, glib 2.42):
freeme = name = g_malloc (pathlen + len + 1);
which is consistent with the command line printed out (running bare launchctl).
Memory allocations are one of those things which are typically on the "do not do between fork() and exec()." Although sometimes the rules for single threaded and multi-threaded applications are different about some of these things. (Hard to be sure if that will allocate memory or not, since it goes through all the glib2 memory wrappers too.)
So just as another option to consider for, eg, pthread atfork related things.
Possibly if someone figures out when it last worked (which is why I was trying downgrading glib2/gtk2/etc) then it'll be more obvious why it has stopped working.
Ewen
comment:12 Changed 10 years ago by hohro.st@…
I've also encountered this problem.
Possibly if someone figures out when it last worked (which is why I was trying downgrading glib2/gtk2/etc) then it'll be more obvious why it has stopped working.
There was no such problem at least 13 days ago (on Oct. 5). I think it occurred around the day when this ticket was opened.
comment:14 follow-up: 15 Changed 10 years ago by zhu527812567@…
I encountered this problem when I was using OS X 10.7.5 (~Oct. 13)
I found that something in fork_exec_with_pipes()@glib2 is not allowed. comment:8 indicates the possible offender is g_malloc. I tried to revert these new versions but old versions (1 month ago) did not work. I remembered that I updated my xcode. What about gcc provided by macports? Not work. gcc-apple-4.2 was also used but still not work.
Thus I tried to implement something using posix_spawn but that syscall paniced my kernel. After consulting some Apple SAs, I believed it was CVE-2013-3954 and fixed in 10.9. I decided to update my system entirely. After the upgrade I reinstalled these ports and the problem still exists. On the other hand, posix_spawn does not panic. Thus for 10.9 and above posix_spawn things may be an incomplete alternative.
It seems that we'd better report this issue to upstream glib. This issue seems similar to ours: https://bugzilla.gnome.org/show_bug.cgi?id=738620 .
Thanks.
comment:15 Changed 10 years ago by ewen-naos-nz (Ewen McNeill)
Replying to zhu527812567@…:
It seems that we'd better report this issue to upstream glib. This issue seems similar to ours: https://bugzilla.gnome.org/show_bug.cgi?id=738620 .
That does seem wise. Thanks for pointing out that glib ticket, which does seem similar to me too. Since I've already got an account on the Gnome bug tracker I've added a "BTW" pointer to this MacPorts ticket there with this text:
Potentially related, over at the MacPorts (OS X) site there is ticket #45309 for Gimp failing to launch DBUS or detect any plugins since new versions of glib/gtk/gimp were incorporated into MacPorts about 11 days ago (so around 2014-10-10 ish). On the DBUS launch there's a "Not enough memory" error report. It's worked fine for years prior to that.
As far as I can tell gimp detects plugins by launching something, via glib functions, to auto-discover them (and the DBUS case is definitely supposed to be a program launch). Based on the debugging to date it looks like something in the fork() child is failing before it is able to get as far as calling exec*() -- but we have not yet determined precisely what. Memory allocation is one plausible theory, especially given the "Not enough memory" report.
Ticket URL: https://trac.macports.org/ticket/45309
Seems to affect OS X 10.6.8, 10.7, 10.9 and 10.10 (beta) at least.
It appears that on OS X the relevant glib functions do work in some other gtk applications (eg, eog is mentioned in the ticket), but not in gimp. My hunch is that single threaded applications have different malloc-after-fork rules than multi-threaded applications.
But I suspect if anyone gets closer to narrowing down the trigger event on OS X then it'd also be worth creating our own upstream ticket with more specific detail. And/or if we can narrow it down to a specific upstream version change. Particularly since the tone of that upstream ticket is basically "following POSIX is hard" (which it is, to follow strictly) and suggesting that they "need" to do things that do not follow POSIX.
Ewen
comment:16 Changed 10 years ago by ewen-naos-nz (Ewen McNeill)
Random thought, triggered by an off-hand comment a glib developer made on the glib ticket about it being theoretically possible to write a non-threaded gtk application: maybe what changed is the _order_ in which things are done at startup (ie, maybe in gimp or gtk). Maybe there is now a thread being spawned *before* DBUS launch/plugin discovery, where as previously those were happening before any other threads got started? Certainly to the OS one looks like "single threaded program wants to run something" and the other looks like "multi-threaded program wants to run something", which in several POSIX compliant OS get treated differently. (Not sure what effect this would have later on, eg, using plugins -- but if, eg, the malloc() for the full path name being discovered is the trigger, it might be that gimp is calling it with a full path name -- thus skipping that malloc -- when it uses the plugin.)
Ewen
Changed 10 years ago by zhu527812567@…
Attachment: | gimp-lldb-b-workq.txt added |
---|
lldb transcript, break at workq_open and workq_return
comment:18 Changed 10 years ago by zhu527812567@…
well ... the multi-thread issue (attachment ticket:45309:gimp-lldb-b-workq.txt ):
frame #20: 0x0000000100e5d03f libpango-1.0.0.dylib`pango_language_get_default + 65
calls CFLocaleCopyPreferredLanguages
---> __workq_open
and then thread 10, 11, 12 ....
The crash in dbus seems to be anothor issue.
comment:20 Changed 10 years ago by wdc@…
I too am experiencing this problem with meld under MacOS 10.9:
wdc-home-2:gnucash wdc$ !meld meld JTROS_Cash.csv.save JTROS_Cash.csv Couldn't bind the translation domain. Some translations won't work. 'module' object has no attribute 'bind_textdomain_codeset' ** (process:65721): WARNING **: Trying to register gtype 'GMountMountFlags' as enum when in fact it is of type 'GFlags' ** (process:65721): WARNING **: Trying to register gtype 'GDriveStartFlags' as enum when in fact it is of type 'GFlags' ** (process:65721): WARNING **: Trying to register gtype 'GSocketMsgFlags' as enum when in fact it is of type 'GFlags' /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/gobject/__init__.py:115: Warning: attempting to add an interface (GtkEditable) to class (HistoryFileEntry) after class_init type_register(cls, namespace.get('__gtype_name__')) Dynamic session lookup supported but failed: launchd did not provide a socket path, verify that org.freedesktop.dbus-session.plist is loaded! Traceback (most recent call last): File "/opt/local/bin/meld", line 178, in <module> main() File "/opt/local/bin/meld", line 164, in main already_running, dbus_app = meld.dbus_service.setup(app) File "/opt/local/lib/meld/meld/dbus_service.py", line 56, in setup bus = dbus.SessionBus() File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/dbus/_dbus.py", line 211, in __new__ mainloop=mainloop) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/dbus/_dbus.py", line 100, in __new__ bus = BusConnection.__new__(subclass, bus_type, mainloop=mainloop) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/dbus/bus.py", line 122, in __new__ bus = cls._new_for_bus(address_or_type, mainloop=mainloop) dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoMemory: Not enough memory wdc-home-2:gnucash wdc$ ps axwww | grep dbus 65716 ?? Ss 0:00.01 /opt/local/bin/dbus-daemon --nofork --session 66022 s001 S+ 0:00.01 grep dbus
comment:22 Changed 10 years ago by ewen-naos-nz (Ewen McNeill)
Some thoughts on debugging this, in case someone has more time: immediately after the fork() point there are two processes, the parent and the child. Whatever is going wrong is clearly in the child process, and it seems to fail quickly enough that it's tricky to get a debugger to attach:
- "LANG=C lldb gimp-2.8" in one window
- "b fork", then "run", wait for breakpoint to hit
- "LANG=C lldb" in another window, and "process attach -w -n gimp-2.8" to get it to attach to the next gimp-2.8 process that appears
- "c" in the first window, so the fork() happens
but AFAICT the child exits (due to the error we're trying to debug) before the lldb has time to do anything meaningful with it.
I did try installing gdb, for "set follow-fork-mode child" but even after doing the code signing dance it appears OS X doesn't have kernel support needed for "follow-fork-mode" to work automatically, so gdb has to be used in basically the same way, which also doesn't have time to work. Plus gdb didn't seem to find any symbols, where as lldb at least had function names, so I think gdb is a deadend.
So... I think the only meaningful way to debug this would be a code change in the child side of the fork() (ie, in glib's fork_exec_with_pipes()), which made the child wait a lot longer before it did something that caused it to exit again (ie, immediately at the top of the child part of the function). Say, sleeping for 5 seconds. Which would allow time to attach the debugger to the child (as above), break out to the debugger, and set some more breakpoints, then continue/step through the child side of the code.
It'll take a bit of time to change the glib code, recompile/install, and step through, but at least we'd know which particular bit of the child code is failing which would, eg, make it feasible to report upstream more specifically. I'm unlikely to have time to do this before late next week, but will have a go late next week if no one else has time to try before then.
Ewen
comment:23 follow-up: 25 Changed 10 years ago by ewen-naos-nz (Ewen McNeill)
Additional though: it looks like both the DBUS one and the plugins ones are segfaulting (signal 11), so it's possible to get a core dump:
ulimit -c unlimited gimp-2.8
stop it after the first few have failed. The core files are in /cores/core.PID. Pick one then do:
ewen@ashram:~$ lldb gimp-2.8 -c /cores/core.1118 (lldb) target create "gimp-2.8" --core "/cores/core.1118" warning: (x86_64) /cores/core.1118 load command 298 LC_SEGMENT_64 has a fileoff + filesize (0x20929000) that extends beyond the end of the file (0x20928000), the segment will be truncated to match Core file '/cores/core.1118' (x86_64) was loaded. Process 0 stopped * thread #1: tid = 0x0000, 0x000000010f16b9b5, stop reason = signal SIGSTOP frame #0: 0x000000010f16b9b5 -> 0x10f16b9b5: addb %al, (%rax) 0x10f16b9b7: addb %al, (%rax) 0x10f16b9b9: addb %al, (%rax) 0x10f16b9bb: addb %al, (%rax) (lldb) print $rax (unsigned long) $0 = 4 (lldb)
So it's failing accessing memory at the address in rax, which is 0x4 (confirmed by "register read --all"), which is almost certainly "memory allocation failed, and we didn't catch that, and then we offset by 4". But it's non-trivial without debugging symbols to figure out precisely what this point is (my theory is still on the malloc() I found by source inspection earlier... but no time to dig further now).
Ewen
comment:25 Changed 10 years ago by zhu527812567@…
Replying to macports@…:
For the core files, the backtrace prints something like (frame 0 is 0x00000001069589e0 on my machine)
* frame #0: 0x00000001069589e0 frame #1: 0x00007fff98a4b17e libsystem_c.dylib`fork + 29 frame #2: 0x000000010471bf43 libglib-2.0.0.dylib`fork_exec_with_pipes + 277 frame #3: 0x000000010471ba5d libglib-2.0.0.dylib`g_spawn_sync + 334
OR lldb reports
error: core file does not contain 0x1069589e0
The debugger cannot find 0x00000001069589e0
. One possible cause is that the address does not point to valid executable instructions at all.
-> 0x1069589e0: addb %al, (%rax) 0x1069589e2: addb %al, (%rax) 0x1069589e4: addb %al, (%rax) 0x1069589e6: addb %al, (%rax) .... (many identical instructions)
The machine code (binary representation) for the instruction is binary zero, so maybe a zero page
(lldb) dis -s 0x0000000106958000 0x106958000: addb %al, (%rax) 0x106958002: addb %al, (%rax) 0x106958004: addb %al, (%rax) 0x106958006: addb %al, (%rax) ....
well, this means that fork()
returns to an invalid address page (which is filled with zero).
Not only this page, the pages before and after are all zeros. In this case we cannot debug the child :( , since the kernel does not return to the correct position in the child.
comment:30 Changed 10 years ago by lalakerhater@…
Just wanted to add the phrase "Unknown file type" to this ticket in case others are searching for it. That's the message the UI will return when trying to open files other than .xcf.
comment:33 Changed 10 years ago by wdc@…
I poked at this problem a little more. At the present time I am SUCCESSFULLY running meld and dbus. I'm not seeing the "DBus.Error.NoMemory" error.
To those experiencing this problem I have one question: Have you performed a system restart since installing dbus for the first time? After poking rather a lot at what might be different between my failed and working states, that seems to be the primary difference. I did NOT restart after installing dbus the first time.
I did other stuff, and I'm asking others in another thread if they think the other stuff I did mattered. I think it didn't and that the restart did matter. What do you other folks say?
comment:34 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
_pthread_fork_child_postinit basically just loops over a bunch of atfork handlers that are registered with pthreads and executes them and then resets the pthreads library to its initial state.
I suspect that something is calling pthread_atfork() to register an atfork handler and and the child handler is causing the crash. I suggest that you break on pthread_atfork and see what's doing that.
comment:36 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
FWIW, gimp seems to run fine with nothing printed to stout on my machine.
comment:37 Changed 10 years ago by dtakahashi@…
I still have the same problem on 10.8.5. I tried system reboot after the port upgrade (which restarts dbus), but it did not aid the problem including dbus error, so causes of those two issues may not be the same.
Changed 10 years ago by dtakahashi@…
Attachment: | gimp-lldb-transcript-pthread_atfork.txt added |
---|
lldb transcript breaking at pthread_atfork()
comment:39 follow-up: 45 Changed 10 years ago by ewen-naos-nz (Ewen McNeill)
FTR, on 10.9 I'm also seeming only the same four pthread_atfork() calls as dtakahashi added at comment:38 (gimp-lldb-transcript-pthread_atfork.txt), although I'm getting somewhat fewer symbols resolved in lldb in the "bt". Anyway the four calls seem to be:
- ImageLoaderMachO::doImageInit()
- ImageLoaderMachO::doModInitFunctions()
- getpwuid()
- getpwuid()
The first two seem to be tied up with loading the MachO (executable/libraries) images into memory, and the last two seem to be deep in the callpath of getpwuid(). All of which seem to be inside the OS libraries, rather than inside GIMP/GTK/glib. So I suspect it's unlikely they've changed recently to cause this to suddenly break like this.
There aren't any more pthread_atfork() calls registered before we start getting DBUS dynamic session lookup failures and plugin discovery failures. So I still currently suspect it's something in the actual child path after fork(), rather than an _atfork() handler.
Ewen
comment:40 Changed 10 years ago by ewen-naos-nz (Ewen McNeill)
After spending a bunch of time rebuilding glib2 with various print debugs (and some sleep(5) calls), I take it back. As best I can tell the child is not making it to the next line of code in gspawn.c:fork_exec_with_pipes() -- a print immediately after the fork() runs in the parent, but not obviously in the child. With the patch I was trying (I'll attach in a minute), output looks like this:
ewen@ashram:~$ gimp-2.8 About to fork in fork_exec_with_pipes() Immediately after fork in fork_exec_with_pipes(), pid=66894 Error spawning command line `launchctl getenv DBUS_LAUNCHD_SESSION_BUS_SOCKET': Child process killed by signal 11 Dynamic session lookup supported but failed: launchctl terminated abnormally without any error message Not enough memory About to fork in fork_exec_with_pipes() Immediately after fork in fork_exec_with_pipes(), pid=66901 (gimp-2.8:66893): LibGimpBase-WARNING **: gimp-2.8: gimp_wire_read(): error
The pid=NNNNN is the return value from fork(), so the fact we're getting a non-zero value means that print is coming from the parent process (if it was from the child process it'd say "pid=0"). I believe the gimp_wire_read() is just a result of the child suddenly going away, and thus the pipe to it being closed.
In the hope of getting some idea of what was happening in the child, I found some hints on dtrace on OS X and tried using "dtruss" to see what was happening, viz:
lldb /opt/local/bin/gimp-2.8 b main run
to start a gimp-2.8 process and pause it, then in another window:
sudo dtruss -f -n gimp-2.8 2>&1
to trace (and follow children) of the "gimp-2.8" processes, and then "c" in the "lldb" session.
After letting the gimp-2.8 process run for a little bit, I had a look for fork() calls in the dtruss() output. The most interesting bit is this:
66945/0x224bb8: fork() = 0 0 66945/0x224bb8: thread_selfid(0x7FFF7536B310, 0x7FFF87915E2E, 0x1) = 2247608 0 66945/0x224bb8: bsdthread_register(0x7FFF8AB13FBC, 0x7FFF8AB13FAC, 0x2000) = -1 Err#22
and:
66947/0x224ccd: fork() = 0 0 66947/0x224ccd: thread_selfid(0x7FFF7536B310, 0x7FFF87915E2E, 0x1) = 2247885 0 66947/0x224ccd: bsdthread_register(0x7FFF8AB13FBC, 0x7FFF8AB13FAC, 0x2000) = -1 Err#22
and so on. It appears basically all of the children follow that pattern of three syscalls -- a return from fork(), a thread_selfid(), and then a bsdthread_register() which then appears to fail. I suspect that's not good, but I've been able to find out very little about what bsdthread_register() actually does. (I'll also attach that full dtruss output in a minute.)
The only other time those PIDs are seen is when the parent wait4() collects the return code. So they're definitely vanishing pretty soon after that.
At this point I think we either (a) need a much smaller reproducible test case or (b) someone who knows more about the internals of OS X (or both). It's possible there's some other _atfork() style handler causing havoc, but off hand I don't know other ways those might be registered -- and if it'd cause bsdthread_register() to fail.
Ewen
Changed 10 years ago by ewen-naos-nz (Ewen McNeill)
Attachment: | gspawn.c-print-sleep.diff added |
---|
glib/gspawn.c(): fork_exec_with_pipes() debug print patch
Changed 10 years ago by ewen-naos-nz (Ewen McNeill)
Attachment: | dtruss-gimp-2014-11-02.txt added |
---|
Output of "sudo dtruss -f -n gimp-2.8 2>&1"
comment:41 Changed 10 years ago by ewen-naos-nz (Ewen McNeill)
Finally in case someone else wants to try the patch-glib-and-debug thing, the approximate idea is:
sudo port extract glib2 sudo port patch glib2 cd $(port dir glib2) sudo vi Portfile # add: "revison 1" sudo port configure glib2 cd $(port dir glib2) cd work/glib-*/glib # patch gspawn.c with gspawn.c-print-sleep.diff or otherwise edit sudo port build glib2 sudo port destroot glib2 sudo port -f -k install glib2 # Expect to see ---> Installing glib2 @2.42.0_1
beware that MacPorts will do a clean for you if you bump revision and then do "port configure", so keep a copy of your hacky local changes to copy back. Also beware that you need the (local) revision number to be higher than what is installed to get a new version installed; it'll be ignored otherwise. (I just kept reactivating my upstream _0 and uninstalling my local _1, then rebuilding the _1 with local patch, and then reinstalling the local _1; probably the wrong approach, but I didn't want to run through a bunch of revisions and get out of sync with MacPorts, and it seems to just be a integer "local" revision -- ie MacPorts revision, rather than user revision.)
Ewen
comment:42 Changed 10 years ago by ewen-naos-nz (Ewen McNeill)
FWIW, it looks like bsdthread_register() Err#22 (EINVAL) might be a dtruss artefact (which is someone talking about using dtruss to trace go programs, apparently they were looping in the error handler in that case).
But the fact it's not making it to the next line of code still implies it's something in the _atfork() handlers, not in the glib fork_exec_with_pipes() code.
Ewen
comment:44 Changed 10 years ago by dtakahashi@…
I investigated the issue from the gimp. I attach the simplified main.c
, which can reproduce "the death of lanchctl" (at least here). As far as I tried, the combination of two functions, g_option_context_add_group (context, gegl_get_option_group ());
and g_option_context_parse_strv (context, &argv, &error1);
seems to be important for killing launchctl.
comment:45 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
Replying to macports@…:
FTR, on 10.9 I'm also seeming only the same four pthread_atfork() calls as dtakahashi added at comment:38 (gimp-lldb-transcript-pthread_atfork.txt), although I'm getting somewhat fewer symbols resolved in lldb in the "bt". Anyway the four calls seem to be:
- ImageLoaderMachO::doImageInit()
- ImageLoaderMachO::doModInitFunctions()
- getpwuid()
- getpwuid()
The first two seem to be tied up with loading the MachO (executable/libraries) images into memory, and the last two seem to be deep in the callpath of getpwuid(). All of which seem to be inside the OS libraries, rather than inside GIMP/GTK/glib. So I suspect it's unlikely they've changed recently to cause this to suddenly break like this.
Your analysis is a tad off here.
The first one is part of the initialization of CoreFoundation. The last two happen *AFTER* the fork and are thus not relevant. The second one is surely the culprit:
(lldb) bt 49 * thread #1: tid = 0x39e8, 0x00007fff92e420ff libsystem_c.dylib`pthread_atfork, queue = 'com.apple.main-thread, stop reason = breakpoint 1.1 50 frame #0: 0x00007fff92e420ff libsystem_c.dylib`pthread_atfork 51 frame #1: 0x00007fff92e3bfc0 libsystem_c.dylib`pthread_once + 87 52 frame #2: 0x00007fff5fc13378 dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 236 53 frame #3: 0x00007fff5fc13762 dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 46 54 frame #4: 0x00007fff5fc1006e dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&) + 380 55 frame #5: 0x00007fff5fc0ffc4 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&) + 210 56 frame #6: 0x00007fff5fc0ffc4 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&) + 210 57 frame #7: 0x00007fff5fc0ffc4 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&) + 210 58 frame #8: 0x00007fff5fc0ffc4 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&) + 210 59 frame #9: 0x00007fff5fc0ffc4 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&) + 210 60 frame #10: 0x00007fff5fc0ffc4 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&) + 210 61 frame #11: 0x00007fff5fc0feba dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 54 62 frame #12: 0x00007fff5fc04e38 dyld`dyld::runInitializers(ImageLoader*) + 89 63 frame #13: 0x00007fff5fc0a87c dyld`dlopen + 581 64 frame #14: 0x00007fff952cfdd8 libdyld.dylib`dlopen + 57 65 frame #15: 0x000000010124b319 libgmodule-2.0.0.dylib`g_module_open + 1382 66 frame #16: 0x00000001011fc6ca libgegl-0.2.0.dylib`gegl_module_open + 20 67 frame #17: 0x00000001011fc2c6 libgegl-0.2.0.dylib`gegl_module_load + 84 68 frame #18: 0x00000001011fc231 libgegl-0.2.0.dylib`gegl_module_new + 152 69 frame #19: 0x00000001011fcf57 libgegl-0.2.0.dylib`gegl_module_db_module_initialize + 157 70 frame #20: 0x00000001011fbf5f libgegl-0.2.0.dylib`gegl_datafiles_read_directories + 653 71 frame #21: 0x00000001011df7c3 libgegl-0.2.0.dylib`gegl_post_parse_hook + 587 72 frame #22: 0x000000010149166e libglib-2.0.0.dylib`g_option_context_parse + 2092 73 frame #23: 0x0000000101492076 libglib-2.0.0.dylib`g_option_context_parse_strv + 52 74 frame #24: 0x000000010000fde8 gimp`main + 529 75 frame #25: 0x00007fff952d07e1 libdyld.dylib`start + 1
It is being called by one of the initializers due to the dlopen().
I suspect that something later dlclose()d that library leaving us with a dead pointer.
Perhaps you can break on dlclose() after that pthread_atfork to see if my hunch is correct.
comment:46 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
Also, it looks like you don't have debug symbols for your MacPorts code. For the relevant ports, I suggest that you set the optimize flags to '-O0 -g3' and build from source with -k. Also, ensure that installed libraries aren't stripped. You should then have dSYMs and source available for the MacPorts libraries
comment:47 Changed 10 years ago by ewen-naos-nz (Ewen McNeill)
Prompted by jeremyhu's analysis at comment:45, I noticed that the second pthread_atfork() call is inside libgegl, which is an image handling library -- at that I had an older version of that available. So I tried using the older version:
ewen@ashram:~$ port installed | grep gegl gegl @0.2.0_12+python27+x11 gegl @0.2.0_13+python27+x11 (active) ewen@ashram:~$ sudo port activate gegl @0.2.0_12+python27+x11 ---> Computing dependencies for gegl ---> Deactivating gegl @0.2.0_13+python27+x11 ---> Cleaning gegl ---> Activating gegl @0.2.0_12+python27+x11 ---> Cleaning gegl ewen@ashram:~$
and after doing that, it's then possible to run GIMP with many fewer errors -- and it can actually load non XCF images (eg, it loaded a JPEG image for me) so it looks like the plugins work).
ewen@ashram:~$ /opt/local/bin/gimp-2.8 GEGL-/opt/local/include/glib-2.0/glib/gmessages.h-Message: Module '/opt/local/lib/gegl-0.2/ff-load.so' load error: dlopen(/opt/local/lib/gegl-0.2/ff-load.so, 10): Library not loaded: /opt/local/lib/libavformat.55.dylib Referenced from: /opt/local/lib/gegl-0.2/ff-load.so Reason: image not found While parsing XMP metadata: Error: No XMP packet found ewen@ashram:~$
So it appears we have at least reached a workaround:
sudo port activate gegl @0.2.0_12+python27+x11
assuming you still have the older version around, and that:
- something changed between gegl 0.2.0_12 and 0.2.0_13 which matters
- something changed in gegl's build environment recently which matters
- not being able to load libavformat matters somehow (I suspect it's not loading because it refers to a different libavformat filename)
A very quick play with lldb breaking on dlclose() suggested that dlclose() was being called in that gegl path, so the trigger in the _atfork() might be something else. But I don't have much time to play with it today, so haven't investigated in any detail.
Thanks jeremyhu for pointing us in the right direction.
Ewen
comment:48 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
So the reason this workaround is successful is because you're getting an error even earlier. The dlopen() is failing (due to the upgrade of ffmpeg making a dependency unavailable) and thus pthread_atfork() is never being called to register the atfork handler, and thus that atfork handler is not causing the crash on the child side of the fork because it is never called.
something changed between gegl 0.2.0_12 and 0.2.0_13 which matters
No, it was just a revbump to force a rebuild with newer ffmpeg (r125491)
something changed in gegl's build environment recently which matters
Nope.
not being able to load libavformat matters somehow (I suspect it's not loading because it refers to a different libavformat filename)
Yes. Not being able to load libavformat prevents pthread_atfork from registering the failing atfork handler.
comment:49 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
A very quick play with lldb breaking on dlclose() suggested that dlclose() was being called in that gegl path,
Yep, I suspect that is resulting in the atfork handler getting removed from the process, but we still have a dangling pointer to it. Show the backtrace to see who is calling dlclose(). Assuming you have debug symbols, it should point you to the correct line in source as well which will be helpful.
so the trigger in the _atfork() might be something else.
I'm not really sure what you mean by this. Can you please elaborate?
From my analysis, we essentially have:
- gegl is dlopen()ed.
- Something in gegl or its dependencies registers a child atfork handler.
- Something is dlclose()d.
- fork() is called.
- The child side tries to call the atfork handler and crashes.
This would be expected if the registered atfork handler is inside of the code that was dlclose()d.
Also, if you set your optimization flags to -O0, you will avoid tail call optimizations, so we'll likely see the missing stack frames between:
frame #0: 0x00007fff92e420ff libsystem_c.dylib`pthread_atfork frame #1: 0x00007fff92e3bfc0 libsystem_c.dylib`pthread_once + 87 frame #2: 0x00007fff5fc13378 dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 236
which will tell us exactly what library initializer is doing the pthread_atfork.
comment:50 Changed 10 years ago by ewen-naos-nz (Ewen McNeill)
Thank you jerermyhu for the explanation (comment:49) -- it seems like a plausible theory for what is going wrong (ie, trying to call code that is no longer loaded). We appear to have confirm that it is this second pthread_atfork() that is the cause -- ie in gegl or one of its dependencies -- so that'd be the place to investigate next when someone has time (which is not me today :-) ). Rebuilding relevant libraries (glib, gegl, maybe others) with debug symbols (and no optimisation) would be a place to start.
My hunch is that dtakahashi's minimal example (comment:44) was also linked against the same things as GIMP, so probably has the same trigger.
For completeness:
so the trigger in the _atfork() might be something else.
I'm not really sure what you mean by this. Can you please elaborate?
I was assuming you were curious about the dlclose() for a lack of matching dlopen()/dlclose() pair (in this "discovery" phase), so since I seemed to find a dlclose() was speculating that maybe there was some other trigger for it. But the dangling reference into the now-dlclosed() library, called in the child at fork time, seems much more plausible.
Ewen
comment:51 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
Yeah, it looks like this is a bug in gegl.
It iterates over everything in /opt/local/lib/gegl-0.2 doing a dlopen(), some processing, and then a dlclose(). Since the issue seems to go away when you can't load /opt/local/lib/gegl-0.2/ff-load.so, I suspect that the initializer and atfork handler is somewhere in /opt/local/lib/gegl-0.2/ff-load.so or its dependencies.
Try updating back to the current version of gegl and just moving /opt/local/lib/gegl-0.2/ff-load.so to /tmp. I suspect you will be able to launch gimp just fine.
comment:52 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
Also, I've got quite a number of outdated ports on my machine as I was on vacation for a month and just got back. That's probably why I'm not seeing the issue. I'll update all my ports (which will probably take a day or so) and try to reproduce, but I'm fairly certain we've discovered the bug. We should just not dlclose() inside of gegl. Note that having dlopen() without a dlclose() is fine. You don't end up with two copies of the library loaded if you have two dlopen()s.
comment:53 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
Reproduced after just a few ports were updated:
drwxr-xr-x 4 root admin 136 Nov 2 13:09 gettext drwxr-xr-x 4 root admin 136 Nov 2 13:10 gdbm drwxr-xr-x 4 root admin 136 Nov 2 13:11 libxml2 drwxr-xr-x 4 root admin 136 Nov 2 13:19 openssl drwxr-xr-x 4 root admin 136 Nov 2 13:20 libedit drwxr-xr-x 4 root admin 136 Nov 2 13:20 sqlite3 drwxr-xr-x 3 root admin 102 Nov 2 13:32 db48 drwxr-xr-x 4 root admin 136 Nov 2 13:32 python_select drwxr-xr-x 4 root admin 136 Nov 2 13:37 python27 drwxr-xr-x 4 root admin 136 Nov 2 13:37 py27-libxml2 drwxr-xr-x 4 root admin 136 Nov 2 13:37 itstool drwxr-xr-x 4 root admin 136 Nov 2 13:37 gtk-doc drwxr-xr-x 4 root admin 136 Nov 2 13:38 libtool drwxr-xr-x 4 root admin 136 Nov 2 13:38 libpng drwxr-xr-x 4 root admin 136 Nov 2 13:38 py27-setuptools drwxr-xr-x 4 root admin 136 Nov 2 13:39 gmake drwxr-xr-x 4 root admin 136 Nov 2 13:40 p11-kit
Here's the stack with the call to pthread_atfork:
Process 28367 stopped * thread #1: tid = 0x74a62f, 0x00007fff9223cb0f libsystem_pthread.dylib`pthread_atfork, queue = 'com.apple.main-thread', stop reason = breakpoint 3.1 frame #0: 0x00007fff9223cb0f libsystem_pthread.dylib`pthread_atfork libsystem_pthread.dylib`pthread_atfork: -> 0x7fff9223cb0f: pushq %rbp 0x7fff9223cb10: movq %rsp, %rbp 0x7fff9223cb13: pushq %r15 0x7fff9223cb15: pushq %r14 (lldb) bt * thread #1: tid = 0x74a62f, 0x00007fff9223cb0f libsystem_pthread.dylib`pthread_atfork, queue = 'com.apple.main-thread', stop reason = breakpoint 3.1 * frame #0: 0x00007fff9223cb0f libsystem_pthread.dylib`pthread_atfork frame #1: 0x0000000108c087e3 libp11-kit.0.dylib`p11_library_init_impl + 147 at library.c:124 frame #2: 0x00007fff9223caf4 libsystem_pthread.dylib`__pthread_once_handler + 65 frame #3: 0x00007fff88a28f13 libsystem_platform.dylib`_os_once + 73 frame #4: 0x00007fff9223ca93 libsystem_pthread.dylib`pthread_once + 57 frame #5: 0x0000000108bbd7bb libp11-kit.0.dylib`_p11_kit_init + 27 at util.c:253 ... frame #19: 0x00007fff84950857 libdyld.dylib`dlopen(path=0x0000000108bbd7a0, mode=146938488) + 59 at dyldAPIsInLibSystem.cpp:1371 frame #20: 0x000000010110b48b libgmodule-2.0.0.dylib`g_module_open + 1490 frame #21: 0x00000001010c516d libgegl-0.2.0.dylib`gegl_module_open + 20 frame #22: 0x00000001010c4d57 libgegl-0.2.0.dylib`gegl_module_load + 84 frame #23: 0x00000001010c4cc2 libgegl-0.2.0.dylib`gegl_module_new + 155 frame #24: 0x00000001010c5a01 libgegl-0.2.0.dylib`gegl_module_db_module_initialize + 156 frame #25: 0x00000001010c49f8 libgegl-0.2.0.dylib`gegl_datafiles_read_directories + 626 frame #26: 0x00000001010a79a9 libgegl-0.2.0.dylib`gegl_post_parse_hook + 588 frame #27: 0x0000000101333fdb libglib-2.0.0.dylib`g_option_context_parse + 2265 frame #28: 0x0000000101334ac9 libglib-2.0.0.dylib`g_option_context_parse_strv + 52 frame #29: 0x000000010000f91b gimp-2.8`main + 537 frame #30: 0x00007fff849515c9 libdyld.dylib`start + 1
which points to p11-kit as the cause
comment:54 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
Port: | gegl p11-kit added; gimp2 removed |
---|
While I think gegl should not dlclose() the modules it opens (because dlclose() always risks dangling pointers like this), I'm curious why p11-kit is doing this. All it's doing is counting the number of times a process has forked.
static void count_forks (void) { /* Thread safe, executed in child, one thread exists */ p11_forkid++; } void p11_library_init_impl (void) { p11_debug_init (); p11_debug ("initializing library"); p11_mutex_init (&p11_library_mutex); pthread_key_create (&thread_local, free); p11_message_storage = thread_local_message; pthread_atfork (NULL, NULL, count_forks); }
comment:55 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
Port: | gimp2 added |
---|---|
Summary: | gimp2 @2.8.14 on 10.9 - doesn't load plug-ins → p11-kit child atfork handler causes gimp to crash due to gegl dlclose() |
comment:56 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
Summary: | p11-kit child atfork handler causes gimp to crash due to gegl dlclose() → p11-kit child atfork handler causes gimp forks to crash due to gegl dlclose(), no gimp plugins available |
---|
comment:57 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
It looks like pthread_atfork has been in p11-kit since 2011, so I suspect that p11-kit is a new dependency being brought in. In any event, gegl should just be updated to not dlclose()
comment:58 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
Port: | glib2 added |
---|---|
Summary: | p11-kit child atfork handler causes gimp forks to crash due to gegl dlclose(), no gimp plugins available → p11-kit child atfork handler causes gimp forks to crash due to gegl/glib2 dlclose(), no gimp plugins available |
Actually glib2 is the one that actually does the dlclose(). gegl does the g_module_close.
Perhaps we should just skip the dlclose() in gmodule (in glib2). The thing is, I don't know how many things rely on g_module_close() actually unmapping the file when the gmodule refcount gets to 0, but then again, things should not be expecting that they are the only reference to a dylib.
comment:59 Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
Owner: | changed from devans@… to ryandesign@… |
---|---|
Status: | assigned → new |
Over to ryan as the maintainer of glib2. Patch proposal will be attached shortly.
Changed 10 years ago by jeremyhu (Jeremy Huddleston Sequoia)
Attachment: | glib2-45309.patch added |
---|
glib2-45309.patch
comment:60 Changed 10 years ago by ryandesign (Ryan Carsten Schmidt)
Resolution: | → fixed |
---|---|
Status: | new → closed |
Thanks! r127768
standard output with --verbose flag