#63480 closed defect (fixed)
pspp-devel @1.5.3_10+gui+x11: failed assertion `!"reached"' when building
Reported by: | evanmiller (Evan Miller) | Owned by: | nerdling (Jeremy Lavergne) |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | ports | Version: | 2.7.1 |
Keywords: | Cc: | mascguy (Christopher Nielsen) | |
Port: | pspp-devel |
Description
This happens when attempting to build pspp-devel against either cairo or cairo-devel (I tried both).
:info:build LSAN_OPTIONS="suppressions=/opt/local/var/macports/build/_Users_emiller_macports.local_math_pspp-devel/pspp-devel/work/pspp-1.5.3-gee1bfc/tests/lsan.supp:print_suppressions=0:$LSAN_OPTIONS" utilities/pspp-output convert doc/pspp-figures/descriptives.spv doc/pspp-figures/descriptives.png -O trim=true -O left-margin=0in -O right-margin=0in -O top-margin=0in -O bottom-margin=0in -O paper-size=7.5x99in --table-look=./doc/tutorial.stt
:info:build cairo-pattern.c:3392: failed assertion `!"reached"'
:info:build make[2]: *** [doc/pspp-figures/descriptives.png] Abort trap
It looks like the failed assertion is here:
https://cgit.freedesktop.org/cairo/tree/src/cairo-pattern.c#n3392
Full log to follow. It's a 32-bit PPC system, which is often relevant.
Attachments (1)
Change History (20)
Changed 3 years ago by evanmiller (Evan Miller)
Attachment: | pspp-devel-main.log added |
---|
comment:1 Changed 3 years ago by kencu (Ken)
comment:2 Changed 3 years ago by evanmiller (Evan Miller)
The plot thickens.... running it again, I get
:info:build LSAN_OPTIONS="suppressions=/opt/local/var/macports/build/_Users_emiller_macports.local_math_pspp-devel/pspp-devel/work/pspp-1.5.3-gee1bfc/tests/lsan.supp:print_suppressions=0:$LSAN_OPTIONS" utilities/pspp-output convert doc/pspp-figures/aggregate.spv doc/pspp-figures/aggregate.png -O trim=true -O left-margin=0in -O right-margin=0in -O top-margin=0in -O bottom-margin=0in -O paper-size=7.5x99in --table-look=./doc/tutorial.stt
:info:build pspp-output(674,0xa000ed88) malloc: *** error for object 0x49079f0: incorrect checksum for freed object - object was probably modified after being freed, break at szone_error to debug
:info:build pspp-output(674,0xa000ed88) malloc: *** set a breakpoint in szone_error to debug
So it looks like there's memory corruption happening somewhere. The initial assert failure is likely just a symptom of that.
comment:3 Changed 3 years ago by kencu (Ken)
no, not memory corruption in the app, exactly, I don't think
This is the error we see caused by the new libgcc_s.1.dylib in gcc7 conflicting with the old libgcc_s.1.dylib in /usr/lib.
We saw the same error in several (but not all) software linked against libgcc7. The exact nature of what causes it is not 100% clear to me at least, but it did not happen with gcc 7.4.0 and it does happen since then.
There are two fixes I know of: build with -static-libgcc (I believe that will work, but I haven't actually done it to prove it), or set DYLD_LIBRARY_PATH to /opt/local/lib/libgcc. Alternatively, use gcc 7.4.0.
MacPorts decided to do the DYLD setting fix. There is an option in the legacysupport PortGroup to wrap the binaries that error, and set this automatically with a wrapper.
Inspect the legacysupport 1.1 PortGroup, and look in the cmake Portfile for a good example.
comment:4 Changed 3 years ago by kencu (Ken)
The darwin gcc maintainer and I were also talking about the idea of replacing some of the older libraries in /usr/lib with newer ones that would not show this incompatibility, in particular the one mentioned. I have not yet tried that to see just what would happen, but it is an available option we might consider, depending on how much trouble this causes...
comment:5 Changed 3 years ago by evanmiller (Evan Miller)
Running the failing command manually, sometimes it succeeds, sometimes I get
GLib (gthread-posix.c): Unexpected error from C library during 'pthread_setspecific': Invalid argument. Aborting.
Other times I get the incorrect checksum message.
To test your GCC theory, I used install_name_tool to remove the /usr/lib/libgcc_s.1.dylib linkage, and get the same errors:
$ sudo -u macports install_name_tool -change /usr/lib/libgcc_s.1.dylib /opt/local/lib/libgcc/libgcc_s.1.dylib utilities/.libs/pspp-output
$ sudo -u macports ./utilities/pspp-output convert doc/pspp-figures/aggregate.spv doc/pspp-figures/aggregate.png -O trim=true -O left-margin=0in -O right-margin=0in -O top-margin=0in -O bottom-margin=0in -O paper-size=7.5x99in --table-look=./doc/tutorial.stt
GLib (gthread-posix.c): Unexpected error from C library during 'pthread_setspecific': Invalid argument. Aborting.
fish: Job 1, 'sudo -u macports ./utilities/...' terminated by signal SIGABRT (Abort)
$ sudo -u macports ./utilities/pspp-output convert doc/pspp-figures/aggregate.spv doc/pspp-figures/aggregate.png -O trim=true -O left-margin=0in -O right-margin=0in -O top-margin=0in -O bottom-margin=0in -O paper-size=7.5x99in --table-look=./doc/tutorial.stt
pspp-output(314,0xa000ed88) malloc: *** error for object 0x491fcd0: incorrect checksum for freed object - object was probably modified after being freed, break at szone_error to debug
pspp-output(314,0xa000ed88) malloc: *** set a breakpoint in szone_error to debug
$ otool -l utilities/.libs/pspp-output | grep 'name /'
name /usr/lib/dyld (offset 12)
name /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (offset 24)
name /opt/local/lib/pspp/libpspp-1.5.3-gee1bfc.dylib (offset 24)
name /opt/local/lib/libgsl.25.dylib (offset 24)
name /opt/local/lib/pspp/libpspp-core-1.5.3-gee1bfc.dylib (offset 24)
name /opt/local/lib/libxml2.2.dylib (offset 24)
name /opt/local/lib/libpangocairo-1.0.0.dylib (offset 24)
name /opt/local/lib/libpango-1.0.0.dylib (offset 24)
name /opt/local/lib/libgobject-2.0.0.dylib (offset 24)
name /opt/local/lib/libglib-2.0.0.dylib (offset 24)
name /opt/local/lib/libharfbuzz.0.dylib (offset 24)
name /opt/local/lib/libcairo.2.dylib (offset 24)
name /opt/local/lib/libiconv.2.dylib (offset 24)
name /opt/local/lib/libreadline.8.dylib (offset 24)
name /opt/local/lib/libgslcblas.0.dylib (offset 24)
name /opt/local/lib/libz.1.dylib (offset 24)
name /opt/local/lib/libintl.8.dylib (offset 24)
name /opt/local/lib/libgcc/libgcc_s.1.dylib (offset 24)
name /opt/local/lib/libgcc/libgcc_s.1.dylib (offset 24)
name /usr/lib/libSystem.B.dylib (offset 24)
comment:6 Changed 3 years ago by kencu (Ken)
You can't change the /usr/lib/libgcc_s.1.dylib to /opt/local/lib/libgcc/libgcc_s.1.dylib and have it work correctly.
There is funky stuff that goes on. As I recall, some of it is stub library stuff. Some of it involves objects being passed between some of those other linked in dylibs and the main executable, and those other linked in dylibs might be linked themselves against /usr/lib/libgcc_s.1.dylib .
So that is not actually as good a test as you would have hoped it might be.
Instead, do this (on the original unmodified executable):
DYLD_LIBRARY_PATH=/opt/local/lib/libgcc ./utilities/pspp-output
and if it is like the other 10,000 examples of this, it will work properly.
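The point above about the other linked dylibs possibly being linked against /usr/lib/libgcc_s.1.dylib themselves can be checked directly. The following is only a rough sketch: the helper name links_system_libgcc is hypothetical, and it assumes its input is the output of otool -L for one library.

```shell
# Hypothetical helper: reads `otool -L` output on stdin and succeeds
# (exit status 0) if the listing references the system libgcc in /usr/lib.
links_system_libgcc() {
    grep -q '/usr/lib/libgcc_s\.1\.dylib'
}

# Possible usage on the affected machine (paths taken from the otool
# listing earlier in this ticket; adjust as needed):
#   for lib in /opt/local/lib/libcairo.2.dylib /opt/local/lib/libglib-2.0.0.dylib; do
#       otool -L "$lib" | links_system_libgcc && echo "$lib links /usr/lib/libgcc_s.1.dylib"
#   done
```

Any library this reports would still pull in the old libgcc even after the main executable has been rewritten with install_name_tool.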
comment:7 Changed 3 years ago by kencu (Ken)
BTW I held my open repos back to gcc 7.4.0 for two years after MacPorts upgraded to 7.5.0 so I would not be bothered by this "bug".
It's not actually a bug, it's a true ABI incompatibility and it was never meant to work to pass objects back and forth between libgcc-4.2 and libgcc-7.5 -- gcc makes no such promises.
comment:8 Changed 3 years ago by evanmiller (Evan Miller)
Same errors using DYLD_LIBRARY_PATH=/opt/local/lib/libgcc after rebuilding the executable.
The symptoms look to me like a classic case of memory corruption due to programmer error.
comment:9 Changed 3 years ago by kencu (Ken)
ok, this one may be different then.
here's an example of the libgcc7 error
comment:10 Changed 3 years ago by evanmiller (Evan Miller)
Just for the record, the linked GCC issue presents as "Non-aligned pointer being freed". Here we are seeing "incorrect checksum for freed object". Without knowing more, perhaps that distinction will help to debug malloc errors in other projects.
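As a small illustration of sorting logs by the signatures seen in this ticket, here is a sketch of a shell function that labels a log read from stdin. The function name and the labels it prints are made up for this example, and whether the distinction actually separates the libgcc issue from other causes is not established here.

```shell
# Hypothetical helper: classify a build or run log (read from stdin)
# by the first known failure signature it contains.
# The printed labels are illustrative names, not MacPorts terminology.
classify_failure() {
    log=$(cat)
    case "$log" in
        *"Non-aligned pointer being freed"*)
            echo "non-aligned-free" ;;
        *"incorrect checksum for freed object"*)
            echo "freed-object-checksum" ;;
        *"pthread_setspecific"*)
            echo "pthread-setspecific" ;;
        *"failed assertion"*)
            echo "assertion" ;;
        *)
            echo "unknown" ;;
    esac
}
```

For example, piping the attached main.log through classify_failure would print one label for the whole log; the order of the case branches decides which label wins when several signatures are present.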
comment:11 Changed 3 years ago by kencu (Ken)
perhaps that might hold up as a solid differentiator.
having this constant underlying toolchain issue always in the wings, and knowing that the manifestations of it are unpredictable, certainly reduces confidence in the process.
whenever there is a disconnect between a created and freed object, the toolchain is the first thing to consider, unfortunately.
comment:12 Changed 3 years ago by evanmiller (Evan Miller)
Running in a debugger, I intermittently get:
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000005
0x0028e5b4 in output_driver_destroy ()
(gdb) bt
#0 0x0028e5b4 in output_driver_destroy ()
#1 0x0028e678 in output_engine_pop ()
#2 0x00007fd8 in run_convert ()
#3 0x0000af7c in main ()
Still get the "incorrect checksum for freed object" error occasionally but haven't gotten a backtrace on it yet.
Given the intermittency I suspect some kind of uninitialized memory somewhere.
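Because the failure is intermittent, a single run says little either way. One way to put a number on it is to repeat the command and count the non-zero exits. This is a minimal sketch; tally_failures is a hypothetical helper name, not something from pspp or MacPorts.

```shell
# Hypothetical helper: run a command N times and report how many runs
# exited with a non-zero status (crashes such as SIGABRT count as failures).
# Usage: tally_failures N command [args...]
tally_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard output; only the exit status matters for the tally.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$runs runs failed"
}
```

It could be invoked as tally_failures 20 ./utilities/pspp-output convert followed by the same arguments as the failing build command, once with and once without a suspected fix, to compare failure rates.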
comment:13 Changed 3 years ago by kencu (Ken)
Methinks there is something wrong in glib2, on tiger at least. This should not be happening:
GLib (gthread-posix.c): Unexpected error from C library during 'pthread_setspecific': Invalid argument.
and seems possibly to be the root of all further evil.
comment:14 Changed 3 years ago by evanmiller (Evan Miller)
My hypothesis with that error was that the argument was being memory corrupted. But it will be hard to know without either deep debugging or a sanitizer.
comment:15 Changed 3 years ago by evanmiller (Evan Miller)
Well, -fsanitize=address isn't supported on this machine, but it would be a useful flag to add on a platform where ASan is supported, to see if that turns up anything.
comment:16 Changed 3 years ago by evanmiller (Evan Miller)
Seeing a similar-looking issue with pspp-1.5.3-g39e99a:
:info:build LSAN_OPTIONS="suppressions=/opt/local/var/macports/build/_Users_emiller_macports.local_math_pspp-devel/pspp-devel/work/pspp-1.5.3-g39e99a/tests/lsan.supp:print_suppressions=0:$LSAN_OPTIONS" utilities/pspp-output convert doc/pspp-figures/crosstabs.spv doc/pspp-figures/crosstabs.png -O trim=true -O left-margin=0in -O right-margin=0in -O top-margin=0in -O bottom-margin=0in -O paper-size=7.5x99in --table-look=./doc/tutorial.stt
:info:build cairo-pattern.c:371: failed assertion `other->status == CAIRO_STATUS_SUCCESS'
:info:build make[2]: *** [doc/pspp-figures/crosstabs.png] Abort trap
comment:17 Changed 3 years ago by evanmiller (Evan Miller)
The latest beta (g82a757) seems to have resolved this issue.
comment:18 Changed 3 years ago by mascguy (Christopher Nielsen)
Cc: | mascguy added |
---|
comment:19 Changed 2 years ago by nerdling (Jeremy Lavergne)
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
According to the referenced commit from evanmiller, this should have been fixed and shipped in v1.5.5. That'd map to pspp-devel @1.6.0_0 for macports.
Given that, this looks to have been fixed upstream already.
Please re-open if that's not the case as I have no hardware to verify this.
pspp builds on Tiger with a minimal fix btw
https://github.com/kencu/TigerPorts/commit/964e19dfa87a639776d00439b99ae9193a6d6043#diff-58a6d6801ae771e632351013ffe8e53628287bf5310e469c8c48573669011ef0