#64252 closed defect (fixed)
darktable @4.x+quartz: Crash when opening "darkroom" view of Sony NEX-6 AWR file
Reported by: | thomasrussellmurphy (Thomas Russell Murphy) | Owned by: | mascguy (Christopher Nielsen) |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | ports | Version: | 2.7.0 |
Keywords: | pending | Cc: | MarcusCalhoun-Lopez (Marcus Calhoun-Lopez), parafin, ryandesign (Ryan Carsten Schmidt) |
Port: | darktable GraphicsMagick |
Description
Initially reported to upstream (https://github.com/darktable-org/darktable/issues/10653) and told that this crash does not replicate on an official build for macOS.
Describe the bug/issue Importing a Sony NEX-6 ARW (RAW) file works initially in the lighttable view, but attempting to open the image into the darkroom view results in a crash while trying to load the file.
To Reproduce _Please provide detailed steps to reproduce the behaviour, for example:_
- Acquire a sample NEX-6 ARW [Real world shot ISO 100 (Zipped file - 16.4MB)](http://download.dpreview.com/sony_nex6/DSC00421.ARW.zip) from dpreview [Sony NEX-6 Review page 15](https://www.dpreview.com/reviews/sony-alpha-nex-6/15)
- Unzip file and place in a convenient directory
- Select import > add to library > (navigate to directory) DSC00421.ARW
- RAW file loads in lighttable view
- Select the darkroom view
- Image starts to load and render the RAW file
- Crash to desktop
- Restart results in immediate crash
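The steps above can be sketched as a small script. This is only an approximation: `run`, `repro_nex6_crash`, `DRY_RUN` and `WORKDIR` are names introduced here for illustration, the archive is assumed to unpack to `DSC00421.ARW` at its top level, and passing the file on the command line opens it directly instead of going through the import dialog.

```shell
#!/bin/sh
# Sketch of the reproduction steps. DRY_RUN and WORKDIR are illustrative
# names introduced here, not part of darktable or MacPorts.

# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Fetch the sample file, unpack it, and open it with a fresh config
# directory so the existing darktable library is left untouched.
repro_nex6_crash() {
    workdir="${WORKDIR:-/tmp/dt-nex6-repro}"
    run mkdir -p "$workdir/config" &&
    run curl -L -o "$workdir/sample.zip" \
        "http://download.dpreview.com/sony_nex6/DSC00421.ARW.zip" &&
    run unzip -o "$workdir/sample.zip" -d "$workdir" &&
    run darktable --configdir "$workdir/config" "$workdir/DSC00421.ARW"
}
```

Running `DRY_RUN=1 repro_nex6_crash` prints the four commands without executing them; calling `repro_nex6_crash` on its own performs the download and launches darktable.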
Expected behavior The darkroom view renders the ARW with default RAW development settings. No crash and able to start adjusting RAW settings.
Platform _Please fill as much information as possible in the list given below. Please state "unknown" where you do not know the answer and remove any sections that are not applicable _
- darktable version : darktable @3.6.1_0+quartz (through the MacPorts [port](https://ports.macports.org/port/darktable/details/), [portfile](https://github.com/macports/macports-ports/blob/3fba86a9b1ca7c1aabf4fca0572a91c7e4ea9ef2/graphics/darktable/Portfile))
- OS : macOS 10.14.6
- Memory : 16 GB
- Graphics card : Intel UHD Graphics 630 1536 MB
- Graphics driver : macOS system
- gcc : portfile says "a minimum of Xcode Clang 10.0.1, as of v3.6.x"
Additional context _Please provide any additional information you think may be useful, for example:_
- Can you reproduce with another darktable version(s)? _Not attempted_
- Can you reproduce with a RAW or Jpeg or both? RAW-file-format
- Are the steps above reproducible with a fresh edit (i.e. after discarding history)? N/a
- If the issue is with the output image, attach an XMP file if (you'll have to change the extension to `.txt`)
- Is the issue still present using an empty/new config-dir (e.g. start darktable with --configdir "/tmp")? yes (very repeatable, have to clear config to fix crash-on-reopen)
- Do you use lua scripts? no
Top of stack trace from crash reporter
    Process: darktable [45286]
    Path: /opt/local/bin/darktable
    Identifier: org.darktable.darktable
    Version: 3.6.1 (3.6.1)
    Code Type: X86-64 (Native)
    Parent Process: ??? [1]
    Responsible: darktable [45286]
    User ID: 502
    Date/Time: 2021-12-18 19:23:14.687 -0600
    OS Version: Mac OS X 10.14.6 (18G9323)
    Report Version: 12
    Bridge OS Version: 5.5 (18P4759a)
    Anonymous UUID: A72D88A0-DE09-2340-0C1A-CCC3D5030DA7
    Time Awake Since Boot: 4700000 seconds
    System Integrity Protection: enabled
    Crashed Thread: 6 worker res 1
    Exception Type: EXC_BAD_ACCESS (SIGABRT)
    Exception Codes: KERN_INVALID_ADDRESS at 0x000000015dd5d000
    Exception Note: EXC_CORPSE_NOTIFY
    VM Regions Near 0x15dd5d000:
        MALLOC_LARGE 000000015b858000-000000015dd5d000 [ 37.0M] rw-/rwx SM=PRV
    -->
        STACK GUARD 000070000df9f000-000070000dfa0000 [ 4K] ---/rwx SM=NUL stack guard for thread 15
    Application Specific Information:
    abort() called
Change History (43)
comment:1 Changed 3 years ago by mascguy (Christopher Nielsen)
Cc: | mascguy removed |
---|---|
Owner: | set to mascguy |
Status: | new → assigned |
comment:2 Changed 3 years ago by mascguy (Christopher Nielsen)
comment:3 Changed 3 years ago by Christopher Nielsen <mascguy@…>
comment:4 Changed 3 years ago by thomasrussellmurphy (Thomas Russell Murphy)
Now darktable seems to consistently crash, with a fresh config (clearing ~/.config/darktable since I don't have it set up well) each time: a) on quit and b) upon importing the ARW file to the lighttable (can't get to the darkroom view, even).
    Process: darktable [6712]
    Path: /opt/local/bin/darktable
    Identifier: org.darktable.darktable
    Version: 3.8.1 (3.8.1)
    Code Type: X86-64 (Native)
    Parent Process: ??? [1]
    Responsible: darktable [6712]
    User ID: 502
    Date/Time: 2022-02-27 10:56:56.576 -0600
    OS Version: Mac OS X 10.14.6 (18G9323)
    Report Version: 12
    Bridge OS Version: 5.5 (18P4759a)
    Anonymous UUID: A72D88A0-DE09-2340-0C1A-CCC3D5030DA7
    Time Awake Since Boot: 2900000 seconds
    System Integrity Protection: enabled
    Crashed Thread: 10 worker res 1
    Exception Type: EXC_BAD_ACCESS (SIGABRT)
    Exception Codes: KERN_INVALID_ADDRESS at 0x0000000144c56000
    Exception Note: EXC_CORPSE_NOTIFY
    VM Regions Near 0x144c56000:
        MALLOC_LARGE 0000000140150000-0000000144c56000 [ 75.0M] rw-/rwx SM=PRV
    -->
        STACK GUARD 000070000a788000-000070000a789000 [ 4K] ---/rwx SM=NUL stack guard for thread 1
    Application Specific Information:
    abort() called
comment:5 Changed 2 years ago by Christopher Nielsen <mascguy@…>
comment:6 Changed 2 years ago by mascguy (Christopher Nielsen)
Keywords: | pending added |
---|
comment:7 Changed 2 years ago by mascguy (Christopher Nielsen)
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Testing locally with version 4.0.0 +quartz, I no longer see a crash.
Let me know if you're still having issues. If so, we can reopen.
comment:8 follow-ups: 9 10 Changed 2 years ago by thomasrussellmurphy (Thomas Russell Murphy)
With `darktable @4.0.0_2+quartz` I get a crash at step 4 of my reproduction process, still. Fresh config directory.
However, it does appear to quit cleanly if I haven't imported anything, now.
comment:9 Changed 2 years ago by mascguy (Christopher Nielsen)
Resolution: | fixed |
---|---|
Status: | closed → reopened |
comment:10 Changed 2 years ago by mascguy (Christopher Nielsen)
Replying to thomasrussellmurphy:
With `darktable @4.0.0_2+quartz` I get a crash at step 4 of my reproduction process, still. Fresh config directory. However, it does appear to quit cleanly if I haven't imported anything, now.
I tested on multiple macOS releases, from 10.12 through 10.15. And while most of those work fine, the crash did indeed occur with 10.14. (Which matches with what you're testing on, so that's good.)
Need to do more digging.
comment:11 Changed 2 years ago by thomasrussellmurphy (Thomas Russell Murphy)
Thanks for continuing to investigate! Please let me know if you need additional input.
comment:12 follow-up: 24 Changed 2 years ago by mascguy (Christopher Nielsen)
Cc: | parafin added |
---|---|
Summary: | darktable @3.6.1_0+quartz: Crash when opening "darkroom" view of Sony NEX-6 AWR file → darktable @4.x+quartz: Crash when opening "darkroom" view of Sony NEX-6 AWR file |
Interestingly enough, this crash doesn't occur when `darktable` is built with `+debug`.
Also, it appears that the issue is originating from within `GraphicsMagick`:
    Magick: abort due to signal 11 (SIGSEGV) "Segmentation Fault"...
    Abort trap: 6
Confirmed by thread stack trace:
    Thread 8 Crashed:: worker res 1
    0   libsystem_kernel.dylib     0x00007fff6069f2c2  __pthread_kill + 10
    1   libsystem_pthread.dylib    0x00007fff6075abf1  pthread_kill + 284
    2   libsystem_c.dylib          0x00007fff606096a6  abort + 127
    3   libGraphicsMagick.3.dylib  0x000000011021c4bb  MagickPanicSignalHandler + 52
    4   libsystem_platform.dylib   0x00007fff6074fb5d  _sigtramp + 29
    5   ???                        0x00007f9a9f941f00  0 + 140302078975744
    6   libdarktable.dylib         0x000000010cf09bec  dt_interpolation_resample + 252
    7   libdarktable.dylib         0x000000010cf0a5c8  dt_interpolation_resample_roi + 88
    8   libdarktable.dylib         0x000000010cfa4560  dt_iop_clip_and_zoom_roi + 80
    9   libdemosaic.so             0x000000011a07591a  process + 21322
    10  libdarktable.dylib         0x000000010cfebfb5  pixelpipe_process_on_CPU + 405
    11  libdarktable.dylib         0x000000010cfe9150  dt_dev_pixelpipe_process_rec + 4848
    [...more dt_dev_pixelpipe_process_rec...]
    89  libdarktable.dylib         0x000000010cfe6753  dt_dev_pixelpipe_process + 1075
    90  libdarktable.dylib         0x000000010cf90081  dt_dev_process_preview_job + 481
    91  libdarktable.dylib         0x000000010cf65ff1  dt_dev_process_preview_job_run + 17
    92  libdarktable.dylib         0x000000010cf5ef4d  dt_control_work_res + 525
    93  libsystem_pthread.dylib    0x00007fff607582eb  _pthread_body + 126
    94  libsystem_pthread.dylib    0x00007fff6075b249  _pthread_start + 66
    95  libsystem_pthread.dylib    0x00007fff6075740d  thread_start + 13
Getting warmer!
comment:13 Changed 2 years ago by mascguy (Christopher Nielsen)
Just for kicks, I also tried installing `GraphicsMagick` with `+q32`, for another point of reference. But that didn't make a difference.
comment:14 Changed 2 years ago by mascguy (Christopher Nielsen)
Cc: | ryandesign added |
---|---|
Port: | GraphicsMagick added |
Presently, port `GraphicsMagick` doesn't have a `+debug` variant, which might help diagnosing this.
I'll see if I can quickly figure out what's necessary, and create a PR for @ryandesign.
comment:15 follow-up: 16 Changed 2 years ago by parafin
GraphicsMagick is a red herring in the backtrace - it installs its own SIGSEGV handler, so unless the application does the same it shows up in any backtrace of an application linking to it. Which, I would say, borders on malicious behaviour.
For source of the crash see frames 5-6, which is darktable's own code. This is probably a compiler bug. darktable ended official support for building on macOS 10.14 as of 4.0 release (Xcode version which is possible to install there is too old for the features we want, especially in rawspeed). It's possible though to build on macOS 10.15 with newer Xcode, but target 10.14 as deployment target (older ones are again not officially supported). This is how official macOS DMG package is created.
As a debug suggestion I can propose disabling OpenMP to see if it avoids the crash. It of course will degrade performance significantly.
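To illustrate the point about the handler frames: everything above `_sigtramp` in such a backtrace belongs to signal delivery, so the frame of interest is the first resolved one below it. A minimal sketch, assuming the whitespace-separated column layout of the crash report in comment 12; `first_app_frame` is a hypothetical helper name, not an existing tool.

```shell
# first_app_frame: read a macOS crash-report thread backtrace on stdin and
# print the library and symbol of the first resolved frame below the
# signal trampoline (_sigtramp). Columns: index, library, address, symbol.
first_app_frame() {
    awk '
        # Frames up to and including _sigtramp come from signal delivery.
        $4 == "_sigtramp" { seen = 1; next }
        # Skip unresolved frames ("???") and report the next real one.
        seen && $2 != "???" { print $2, $4; exit }
    '
}
```

Fed the frames from comment 12, this reports `libdarktable.dylib dt_interpolation_resample` rather than anything in GraphicsMagick.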
comment:16 Changed 2 years ago by mascguy (Christopher Nielsen)
Replying to parafin:
For source of the crash see frames 5-6, which is darktable's own code. This is probably a compiler bug.
Confirmed, and the crash doesn't occur if we build with a newer MacPorts Clang.
I'll commit a fix shortly. Thanks for the quick response!
comment:17 follow-up: 19 Changed 2 years ago by mascguy (Christopher Nielsen)
Thomas, I may not be able to push a formal fix until later today, or perhaps even tomorrow.
So in the interim, you can fix your installation, by doing the following:
    $ sudo port -f uninstall darktable
    $ sudo port -N install darktable +quartz configure.compiler=macports-clang-13
comment:18 Changed 2 years ago by Christopher Nielsen <mascguy@…>
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
comment:19 Changed 2 years ago by mascguy (Christopher Nielsen)
Thomas, now that a formal fix has been pushed, you needn't do anything special. Just wait at least two hours, resync your ports, and then upgrade.
Once that's been done, let me know if all is well!
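For reference, resyncing and upgrading usually comes down to two `port` invocations. A minimal sketch: `resync_and_upgrade` and the `PORT_CMD` override are hypothetical names added here so the sequence can be previewed without touching the system.

```shell
# PORT_CMD is a hypothetical override so the sequence can be previewed
# (for example PORT_CMD="echo port") instead of being run with sudo.
resync_and_upgrade() {
    port_cmd="${PORT_CMD:-sudo port}"
    # selfupdate refreshes MacPorts base and the ports tree;
    # upgrade then brings the installed darktable up to the new revision.
    $port_cmd selfupdate &&
    $port_cmd upgrade darktable
}
```

`PORT_CMD="echo port" resync_and_upgrade` prints the two commands; `resync_and_upgrade` on its own runs them.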
comment:20 follow-up: 21 Changed 2 years ago by thomasrussellmurphy (Thomas Russell Murphy)
Cleanly updating to `darktable @4.0.0_3+quartz (active)` still results in the import -> crash behavior with the specified file and null configuration before start. I also encountered a build failure since cleaned after the recommended forced uninstall and assigned compiler suggestion.
comment:21 Changed 2 years ago by mascguy (Christopher Nielsen)
Replying to thomasrussellmurphy:
Cleanly updating to `darktable @4.0.0_3+quartz (active)` still results in the import -> crash behavior with the specified file and null configuration before start. I also encountered a build failure since cleaned after the recommended forced uninstall and assigned compiler suggestion.
The previous compilation error was expected, as there was another change needed. But the crash is troubling, as it's no longer occurring for my 10.14 installations.
Is the stack trace still similar, per our earlier comments from today?
comment:22 Changed 2 years ago by mascguy (Christopher Nielsen)
Resolution: | fixed |
---|---|
Status: | closed → reopened |
comment:23 Changed 2 years ago by mascguy (Christopher Nielsen)
Also, can you install with both `+quartz` and `+debug`, and test with that?
comment:24 follow-up: 25 Changed 2 years ago by thomasrussellmurphy (Thomas Russell Murphy)
Reinstalling with `+debug` added results in... no crash. So potential workaround in that. Switching back to the default, I get the same pattern of offsets as comment 12, now as `Thread 9 Crashed:: worker res 1`.
comment:25 Changed 2 years ago by mascguy (Christopher Nielsen)
Replying to thomasrussellmurphy:
Reinstalling with `+debug` added results in... no crash. So potential workaround in that. Switching back to the default, I get the same pattern of offsets as comment 12, now as `Thread 9 Crashed:: worker res 1`.
That's great news!
Can you also provide the output from `port info --depends_build darktable`? I want to verify which MacPorts clang is being used. (It should default to `clang-13`, but just want to be sure before going any further.)
comment:26 follow-up: 27 Changed 2 years ago by thomasrussellmurphy (Thomas Russell Murphy)
Indeed not. I get `depends_build: cmake, cctools, gettext, intltool, pkgconfig, po4a, perl5.34` with the straight install.
comment:27 Changed 2 years ago by mascguy (Christopher Nielsen)
Replying to thomasrussellmurphy:
Indeed not. I get `depends_build: cmake, cctools, gettext, intltool, pkgconfig, po4a, perl5.34` with the straight install.
Ah, you must have Xcode 11.x installed, which is a potential scenario that I missed.
Can you provide the output from `xcodebuild -version`?
comment:28 Changed 2 years ago by Christopher Nielsen <mascguy@…>
comment:29 Changed 2 years ago by mascguy (Christopher Nielsen)
Thomas, once you sync your ports, can you re-check the output from `port info --depends_build darktable`?
comment:30 follow-up: 31 Changed 2 years ago by thomasrussellmurphy (Thomas Russell Murphy)
Xcode 11.3.1 Build version 11C504
Heading off to sync and re-install. At sync, I now see `depends_build: cmake, cctools, gettext, intltool, pkgconfig, po4a, perl5.34, clang-13`.
Thanks for all the support!
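The before/after comparison of `depends_build` in the comments above can be automated. A minimal sketch, assuming the single-line, comma-separated output quoted in this ticket; `has_macports_clang` is a hypothetical helper name.

```shell
# has_macports_clang: succeed (exit 0) if the depends_build line read from
# stdin lists a MacPorts clang port such as clang-13, otherwise exit 1.
has_macports_clang() {
    awk -F'[:,]' '
        {
            for (i = 1; i <= NF; i++) {
                # Trim surrounding blanks, then look for a clang-N entry.
                gsub(/^[ \t]+|[ \t]+$/, "", $i)
                if ($i ~ /^clang-[0-9.]+$/) found = 1
            }
        }
        END { exit !found }
    '
}
```

Typical use would be `port info --depends_build darktable | has_macports_clang && echo "builds with MacPorts clang"`.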
comment:31 Changed 2 years ago by mascguy (Christopher Nielsen)
Replying to thomasrussellmurphy:
Heading off to sync and re-install. At sync, I now see `depends_build: cmake, cctools, gettext, intltool, pkgconfig, po4a, perl5.34, clang-13`. Thanks for all the support!
My pleasure, glad we could help!
And let me know when you've verified that the issue is fixed. I'll keep the ticket open until then, just-in-case.
comment:32 follow-up: 33 Changed 2 years ago by thomasrussellmurphy (Thomas Russell Murphy)
I got one crash-on-quit with the new install, but haven't been able to replicate with fresh config. In any case, the basic image loading now does work with this sample file, as does moving between modes once the file is loaded.
comment:33 Changed 2 years ago by mascguy (Christopher Nielsen)
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Replying to thomasrussellmurphy:
I got one crash-on-quit with the new install, but haven't been able to replicate with fresh config. In any case, the basic image loading now does work with this sample file, as does moving between modes once the file is loaded.
That's awesome news, thanks for confirming!
comment:34 follow-up: 35 Changed 2 years ago by jmroot (Joshua Root)
I'd be very wary of assuming this to be a compiler bug. Yes, those exist sometimes, but when code mysteriously behaves differently, it is much more likely to be the optimiser taking different shortcuts in response to undefined behaviour in the code. I guess +debug probably compiles with -O0, hiding the issue, so to diagnose this it might be necessary to add -g to the normal flags in order to get line number information in the backtrace. (Note that the build directory needs to be present at runtime for this to work, so use port's -k flag to keep it.)
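A rough sketch of what that could look like on the command line, following the override syntax used in comment 17. Several things here are assumptions: that the darktable port passes `configure.optflags` through to the compiler unchanged, that `-Os` matches the flags of the normal build, and the helper names `debug_symbols_rebuild` and `PORT_CMD`, which are introduced purely for illustration. It also does not by itself select the Xcode compiler that triggered the crash.

```shell
# Sketch only: PORT_CMD is a hypothetical override for previewing, and
# whether the port honors configure.optflags as given is an assumption.
debug_symbols_rebuild() {
    port_cmd="${PORT_CMD:-sudo port}"
    # -k keeps the build directory after install, -s forces a build from
    # source; the optflags override adds -g to the optimized build.
    $port_cmd -f uninstall darktable &&
    $port_cmd -k -s install darktable +quartz configure.optflags="-Os -g"
}
```

With `PORT_CMD="echo port"` the function only prints the two commands, which is a cheap way to check them before running anything.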
comment:35 Changed 2 years ago by mascguy (Christopher Nielsen)
Replying to jmroot:
I'd be very wary of assuming this to be a compiler bug. Yes, those exist sometimes, but when code mysteriously behaves differently, it is much more likely to be the optimiser taking different shortcuts in response to undefined behaviour in the code. I guess +debug probably compiles with -O0, hiding the issue, so to diagnose this it might be necessary to add -g to the normal flags in order to get line number information in the backtrace. (Note that the build directory needs to be present at runtime for this to work, so use port's -k flag to keep it.)
For sure, an optimization-related bug seems like a good possibility. Eventually I'll revisit this in more detail, though my first priority was to ensure it's no longer a blocker for Thomas and other users.
First I'd like to disable the GraphicsMagick lib signal handler though, per your comments in issue:65630.
comment:36 follow-up: 37 Changed 2 years ago by mascguy (Christopher Nielsen)
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Upstream PR opened for `darktable`, to disable signal handlers for `GraphicsMagick`. Thanks @parafin!
12324 - graphicsmagick: use new API to not install signal handlers
comment:37 Changed 2 years ago by mascguy (Christopher Nielsen)
Replying to mascguy:
Upstream PR opened for `darktable`, to disable signal handlers for `GraphicsMagick`. Thanks @parafin!
Testing locally with this patch - combined with building via Xcode Clang - results in the following stacktrace:
    Thread 7 Crashed:: worker res 1
    0   libsystem_platform.dylib  0x00007fff7c71f6de  _platform_memmove$VARIANT$Nehalem + 254
    1   libdarktable.dylib        0x000000010ce921fc  dt_interpolation_resample + 252
    2   libdarktable.dylib        0x000000010ce92bd8  dt_interpolation_resample_roi + 88
    3   libdarktable.dylib        0x000000010cf2cb70  dt_iop_clip_and_zoom_roi + 80
    4   libdemosaic.so            0x000000011a90f91a  process + 21322
    5   libdarktable.dylib        0x000000010cf745c5  pixelpipe_process_on_CPU + 405
    6   libdarktable.dylib        0x000000010cf71760  dt_dev_pixelpipe_process_rec + 4848
Definitely an improvement detail-wise, with the `GraphicsMagick` signal handlers disabled!
Thoughts relative to the crash within `_platform_memmove`?
comment:38 Changed 2 years ago by Christopher Nielsen <mascguy@…>
comment:39 Changed 2 years ago by jmroot (Joshua Root)
There's clearly inlining happening, as `dt_interpolation_resample` just calls either `dt_interpolation_resample_sse` or `dt_interpolation_resample_plain`. That line number information would be really helpful.
comment:40 Changed 2 years ago by mascguy (Christopher Nielsen)
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
While it would be awesome if we could determine the issue with Xcode Clang, there simply isn't enough time to go down this rabbit hole. Closing as fixed.
comment:41 follow-ups: 42 43 Changed 2 years ago by jmroot (Joshua Root)
Undefined behaviour is (by definition, for better or worse) not an issue with the compiler, it's an issue with the code. Just because the code doesn't crash when built with a different compiler doesn't mean it's doing what it's supposed to do. I wouldn't call building with -g a rabbit hole, it's pretty basic debugging. I realise upstream doesn't seem interested in investigating, and if you don't want to either then fair enough I guess, I just wanted to make it clear that this is not so much "fixed" as "it compiles, ship it."
comment:42 Changed 2 years ago by mascguy (Christopher Nielsen)
Replying to jmroot:
Undefined behaviour is (by definition, for better or worse) not an issue with the compiler, it's an issue with the code. Just because the code doesn't crash when built with a different compiler doesn't mean it's doing what it's supposed to do. I wouldn't call building with -g a rabbit hole, it's pretty basic debugging. I realise upstream doesn't seem interested in investigating, and if you don't want to either then fair enough I guess, I just wanted to make it clear that this is not so much "fixed" as "it compiles, ship it."
Yep, understood.
But to correct you on one point: It's not accurate to state that upstream isn't interested in investigating. Indeed, they've dealt with this in the past, and require a newer compiler because of it.
comment:43 Changed 2 years ago by parafin
Replying to jmroot:
Undefined behaviour is (by definition, for better or worse) not an issue with the compiler, it's an issue with the code. Just because the code doesn't crash when built with a different compiler doesn't mean it's doing what it's supposed to do. I wouldn't call building with -g a rabbit hole, it's pretty basic debugging. I realise upstream doesn't seem interested in investigating, and if you don't want to either then fair enough I guess, I just wanted to make it clear that this is not so much "fixed" as "it compiles, ship it."
Compiler bugs do exist and I’ve encountered several myself. Well, you just have to open gcc bugzilla to believe it;)
So I don’t see why exactly you state that darktable triggers an undefined behavior as if it were a fact. While there’s for sure enough bugs of various nature in darktable, I don’t actually believe this to be one of them. Specifically OpenMP support historically was very problematic in various compilers, with differences between implementations and bugs (e.g. compiler just crashing, or is undefined behaviour again to blame?).
Tested with the X11 variant, and the crash doesn't occur in that case. So that's good news.
Next up is installing the Quartz variant - which is the setup covered by this ticket - to see what's happening.