Opened 12 months ago
Closed 11 months ago
#68810 closed defect (fixed)
OpenBLAS: libopenblas.0.dylib cannot find symbol _xerbla_
Reported by: | erikbs | Owned by: | NicosPavlov |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | ports | Version: | |
Keywords: | | Cc: | michaelld (Michael Dickens), catap (Kirill A. Korinsky), Dave-Allured (Dave Allured) |
Port: | OpenBLAS |
Description
After the recent major changes to the OpenBLAS Portfile (migration to CMake etc.), libopenblas.0.dylib fails to find the symbol _xerbla_ when loaded. This breaks e.g. py-numpy, which fails on import, and py-scipy, which depends on NumPy being imported successfully.
Here is the output when calling import numpy from the Python 3.11 REPL after installing a py311-numpy version that depends on the new OpenBLAS version:
>>> import numpy
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/numpy/core/__init__.py", line 24, in <module>
    from . import multiarray
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/numpy/core/multiarray.py", line 10, in <module>
    from . import overrides
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/numpy/core/overrides.py", line 8, in <module>
    from numpy.core._multiarray_umath import (
ImportError: dlopen(/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/numpy/core/_multiarray_umath.cpython-311-darwin.so, 2): Symbol not found: _xerbla_
  Referenced from: /opt/local/lib/libopenblas.0.dylib
  Expected in: flat namespace
 in /opt/local/lib/libopenblas.0.dylib
Change History (11)
comment:1 Changed 12 months ago by erikbs
Port: | removed |
---|
comment:2 Changed 12 months ago by ryandesign (Ryan Carsten Schmidt)
Cc: | michaelld catap added |
---|---|
Owner: | set to NicosPavlov |
Status: | new → assigned |
comment:3 follow-up: 5 Changed 12 months ago by catap (Kirill A. Korinsky)
comment:4 Changed 12 months ago by jmroot (Joshua Root)
It may be relevant which py-numpy subport is being used, as per #68807.
comment:5 Changed 11 months ago by erikbs
I am on 10.9 and I use the default variant configuration for NumPy, which is +openblas +gfortran, I think. It is the OpenBLAS option that is the important one, since it is the OpenBLAS dylib that cannot be loaded.
comment:6 Changed 11 months ago by erikbs
I removed the patch that enables weak linking on older platforms as a test. The build then fails:
...
:info:build [ 0%] Building C object driver/others/CMakeFiles/driver_others.dir/xerbla.c.o
:info:build cd /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others && /opt/local/bin/clang-mp-16 -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25 -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build -pipe -O3 -DNDEBUG -I/opt/local/include -arch x86_64 -DHAVE_C11 -Wall -m64 -mavx2 -mavx -msse -msse2 -msse3 -mssse3 -msse4.1 -fPIC -DSMALL_MATRIX_OPT -DNO_AVX512 -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=8 -DMAX_PARALLEL_NUMBER=1 -DMAX_STACK_ALLOC=2048 -DNO_AFFINITY -DVERSION="\"0.3.25\"" -DBUILD_SINGLE -DBUILD_DOUBLE -DBUILD_COMPLEX -DBUILD_COMPLEX16 -arch x86_64 -mmacosx-version-min=10.9 -MD -MT driver/others/CMakeFiles/driver_others.dir/xerbla.c.o -MF CMakeFiles/driver_others.dir/xerbla.c.o.d -o CMakeFiles/driver_others.dir/xerbla.c.o -c /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25/driver/others/xerbla.c
...
:info:build [ 11%] Building C object interface/CMakeFiles/interface.dir/CMakeFiles/xerbla.c.o
:info:build cd /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/interface && /opt/local/bin/clang-mp-16 -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25 -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build -pipe -O3 -DNDEBUG -I/opt/local/include -arch x86_64 -DHAVE_C11 -Wall -m64 -mavx2 -mavx -msse -msse2 -msse3 -mssse3 -msse4.1 -fPIC -DSMALL_MATRIX_OPT -DNO_AVX512 -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=8 -DMAX_PARALLEL_NUMBER=1 -DMAX_STACK_ALLOC=2048 -DNO_AFFINITY -DVERSION="\"0.3.25\"" -DBUILD_SINGLE -DBUILD_DOUBLE -DBUILD_COMPLEX -DBUILD_COMPLEX16 -arch x86_64 -mmacosx-version-min=10.9 -MD -MT interface/CMakeFiles/interface.dir/CMakeFiles/xerbla.c.o -MF CMakeFiles/interface.dir/CMakeFiles/xerbla.c.o.d -o CMakeFiles/interface.dir/CMakeFiles/xerbla.c.o -c /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/interface/CMakeFiles/xerbla.c
...
:info:build ar: creating archive libopenblas.a
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
:info:build sh -c '/opt/local/bin/ar -ru libopenblas.a /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others/CMakeFiles/driver_others.dir/xerbla.c.o && exit 0'
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
:info:build sh -c 'echo "" | /opt/local/bin/gfortran-mp-13 -o dummy.o -c -x f95-cpp-input - '
:info:build f951: Warning: Reading file '<stdin>' as free form
:info:build sh -c '/opt/local/bin/gfortran-mp-13 -fpic -shared -Wl,-all_load -Wl,-force_load,libopenblas.a -Wl,-noall_load dummy.o -o /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/lib/libopenblas.0.3.dylib'
:info:build ld: warning: option -noall_load is obsolete and being ignored
:info:build Undefined symbols for architecture x86_64:
:info:build "_xerbla_", referenced from:
:info:build _sgemv_ in libopenblas.a(sgemv.c.o)
:info:build _sger_ in libopenblas.a(sger.c.o)
:info:build _strsv_ in libopenblas.a(strsv.c.o)
:info:build _strmv_ in libopenblas.a(strmv.c.o)
:info:build _ssyr2_ in libopenblas.a(ssyr2.c.o)
:info:build _sgbmv_ in libopenblas.a(sgbmv.c.o)
:info:build _ssbmv_ in libopenblas.a(ssbmv.c.o)
:info:build ...
:info:build (maybe you meant: _xerbla_array_)
:info:build ld: symbol(s) not found for architecture x86_64
:info:build collect2: error: ld returned 1 exit status
There are two xerbla.c.o files. Standing in the build directory:
sh-3.2# find . -iname xerbla.c.o
./driver/others/CMakeFiles/driver_others.dir/xerbla.c.o
./interface/CMakeFiles/interface.dir/CMakeFiles/xerbla.c.o
The second does not contain any symbols, but the first one does:
sh-3.2# nm -gU ./driver/others/CMakeFiles/driver_others.dir/xerbla.c.o | grep _xerbla_
0000000000000000 T _xerbla_
However, libopenblas.a does not contain the _xerbla_ symbol:
sh-3.2# nm -gU libopenblas.a | grep _xerbla_
no symbols
no symbols
0000000000000000 T _xerbla_array_
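Note that the plain grep above also matches _xerbla_array_, so the output has to be read by eye. As a minimal sketch (the helper name has_defined_symbol is hypothetical, and it assumes the three-column "address type name" layout that nm prints for defined symbols), an exact-match check could look like this:

```shell
# has_defined_symbol SYMBOL
# Reads nm output on stdin and succeeds only if SYMBOL appears as a defined
# symbol, i.e. on a line of the form "<address> <type> <name>".
# Undefined references ("U <name>") have only two fields and are ignored,
# as are "no symbols" messages.
has_defined_symbol() {
    awk -v sym="$1" 'NF >= 3 && $NF == sym { found = 1 } END { exit !found }'
}

# Example with canned nm output taken from the listing above:
if printf '0000000000000000 T _xerbla_array_\n' | has_defined_symbol _xerbla_; then
    echo "_xerbla_ is defined"
else
    echo "_xerbla_ is NOT defined"
fi
```

Against a real archive this would be used as `nm -gU libopenblas.a 2>/dev/null | has_defined_symbol _xerbla_`, with the same nm flags as in the session above.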
Even when I run
/opt/local/bin/ar -ru libopenblas.a /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others/CMakeFiles/driver_others.dir/xerbla.c.o
manually, libopenblas.a still does not contain it. It warns about files without symbols, but that seems to refer to the other xerbla.c.o file (and another file without symbols), and the command does not fail:
sh-3.2# /opt/local/bin/ar -ru libopenblas.a /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others/CMakeFiles/driver_others.dir/xerbla.c.o
/opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
/opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
sh-3.2# echo $?
0
I get ar/ranlib from cctools:
The following ports are currently installed:
  cctools @949.0.1_3+llvm90 (active)
Per this comment, libopenblas.a should have contained _xerbla_ (“even if as a weak symbol”, but __attribute__((weak)) is #ifdef-ed to only apply to ELF).
I have no idea why, but when I did this:
sh-3.2# cc -o xx.o -I$(pwd) -I ../OpenBLAS-0.3.25/ -c ../OpenBLAS-0.3.25/driver/others/xerbla.c
sh-3.2# /opt/local/bin/ar -ru libopenblas.a xx.o
sh-3.2# chown macports libopenblas.a
followed by
install -o openblas +gcc13 +lapack +native
in the MacPorts shell, the linking succeeds and the build finishes. Even NumPy works.
So why does
cc -o xx.o -I$(pwd) -I ../OpenBLAS-0.3.25/ -c ../OpenBLAS-0.3.25/driver/others/xerbla.c
produce a usable object file when
cd /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others && /opt/local/bin/clang-mp-16 -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25 -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build -pipe -O3 -DNDEBUG -I/opt/local/include -arch x86_64 -DHAVE_C11 -Wall -m64 -mavx2 -mavx -msse -msse2 -msse3 -mssse3 -msse4.1 -fPIC -DSMALL_MATRIX_OPT -DNO_AVX512 -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=8 -DMAX_PARALLEL_NUMBER=1 -DMAX_STACK_ALLOC=2048 -DNO_AFFINITY -DVERSION="\"0.3.25\"" -DBUILD_SINGLE -DBUILD_DOUBLE -DBUILD_COMPLEX -DBUILD_COMPLEX16 -arch x86_64 -mmacosx-version-min=10.9 -MD -MT driver/others/CMakeFiles/driver_others.dir/xerbla.c.o -MF CMakeFiles/driver_others.dir/xerbla.c.o.d -o CMakeFiles/driver_others.dir/xerbla.c.o -c /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25/driver/others/xerbla.c
does not?
My cc is:
sh-3.2# cc --version
clang version 17.0.6
Target: x86_64-apple-darwin13.4.0
Thread model: posix
InstalledDir: /opt/local/libexec/llvm-17/bin
comment:7 Changed 11 months ago by erikbs
produce a usable object file when
cd /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others && /opt/local/bin/clang-mp-16 -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25 -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build -pipe -O3 -DNDEBUG -I/opt/local/include -arch x86_64 -DHAVE_C11 -Wall -m64 -mavx2 -mavx -msse -msse2 -msse3 -mssse3 -msse4.1 -fPIC -DSMALL_MATRIX_OPT -DNO_AVX512 -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=8 -DMAX_PARALLEL_NUMBER=1 -DMAX_STACK_ALLOC=2048 -DNO_AFFINITY -DVERSION="\"0.3.25\"" -DBUILD_SINGLE -DBUILD_DOUBLE -DBUILD_COMPLEX -DBUILD_COMPLEX16 -arch x86_64 -mmacosx-version-min=10.9 -MD -MT driver/others/CMakeFiles/driver_others.dir/xerbla.c.o -MF CMakeFiles/driver_others.dir/xerbla.c.o.d -o CMakeFiles/driver_others.dir/xerbla.c.o -c /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25/driver/others/xerbla.c
does not
Turns out that it actually does … If I manually run this command when the build fails (to regenerate xerbla.c.o) and then resume the build using port install -o, everything works just fine. I can even do
sudo port install -s -o openblas +gcc13 +lapack +native
sudo rm /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others/CMakeFiles/driver_others.dir/xerbla.c.o*
sudo port install -s -o openblas +gcc13 +lapack +native
to make it complete successfully.
Thinking that there must have been something with the build order that caused things to fail, I copied the object file from both runs to a temporary location to compare them. The result confused me:
md5 */xerbla.c.o
MD5 (verkar/xerbla.c.o) = 57ac55a93b3cde59adeaaccb658f6206
MD5 (verkar_ikkje/xerbla.c.o) = 57ac55a93b3cde59adeaaccb658f6206
The files are identical. And sure enough, a simple touch <..>/xerbla.c.o (instead of rm) also makes the build succeed! In fact, after experimenting with timestamps, it seems that it is enough if ./driver/others/CMakeFiles/driver_others.dir/xerbla.c.o is newer than ./interface/CMakeFiles/interface.dir/CMakeFiles/xerbla.c.o. The ar man page revealed that this is the expected behaviour for the -u option: with -u, ar replaces an existing member only if the file on disk has a newer modification time than the member already in the archive.
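The timestamp dependence can be reproduced in isolation. The following is a minimal sketch, not the OpenBLAS build itself: plain text files stand in for the two xerbla.c.o objects (ar stores arbitrary files), and the directory names, archive name and timestamps are made up for the demonstration. It also assumes that some ar builds default to deterministic archives, which zero the stored timestamps and defeat -u, so it probes for the U modifier and uses it where the local ar accepts it.

```shell
# Work in a scratch directory.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir interface driver
printf 'empty stub\n'   > interface/xerbla.c.o   # stand-in for the symbol-less object
printf 'real symbols\n' > driver/xerbla.c.o      # stand-in for the object defining _xerbla_

# Make the driver copy OLDER than the interface copy, as in the failing build.
touch -t 202001010000 driver/xerbla.c.o
touch -t 202101010000 interface/xerbla.c.o

# Assumption: use the non-deterministic modifier U if this ar supports it,
# so that real modification times are stored in the archive.
mod=
if ar -rcU probe.a interface/xerbla.c.o 2>/dev/null; then mod=U; fi
rm -f probe.a

ar -rc"$mod" libdemo.a interface/xerbla.c.o   # the archive now holds the stub
ar -ru"$mod" libdemo.a driver/xerbla.c.o      # -u: file is older than the member, so it is skipped
after_u=$(ar -p libdemo.a xerbla.c.o)

ar -r"$mod" libdemo.a driver/xerbla.c.o       # without -u the member is replaced regardless of age
after_r=$(ar -p libdemo.a xerbla.c.o)

echo "after ar -ru: $after_u"
echo "after ar -r:  $after_r"
```

On an ar that honours -u, the first line should still show the stub content and the second the replaced content, which mirrors why a touch on the driver object was enough to make the build succeed.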
My preliminary conclusion is that it is *not* weak linking that is the solution on older Mac OS X versions, but rather one of these options:
- Ensure that the correct xerbla.c.o is either linked first or compiled last
- Make ar update xerbla.c.o even though the modification time is older than the existing entry. This can be done by using ar -rs instead of ar -ru.
Luckily xerbla.c.o is linked separately and not as part of a bulk operation, so we can safely change ar -ru to ar -rs without consequences for other object files (I found about 75 .o files that are not unique in the build tree).
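For reference, object files whose names collide in a build tree can be listed with standard tools. This is a minimal sketch; the function name list_duplicate_objects is hypothetical, and the demo tree is made up rather than taken from the OpenBLAS build:

```shell
# list_duplicate_objects DIR
# Prints every .o basename that occurs more than once below DIR.
list_duplicate_objects() {
    find "$1" -type f -name '*.o' | sed 's|.*/||' | sort | uniq -d
}

# Small demo tree with one colliding name.
demo=$(mktemp -d)
mkdir -p "$demo/driver/others" "$demo/interface"
: > "$demo/driver/others/xerbla.c.o"
: > "$demo/interface/xerbla.c.o"
: > "$demo/interface/sgemv.c.o"
list_duplicate_objects "$demo"
```

Piping the result through `wc -l` gives the number of colliding names; run from the build directory, the call would be `list_duplicate_objects .`.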
I have submitted a pull request to OpenBLAS: https://github.com/OpenMathLib/OpenBLAS/pull/4353
comment:8 follow-up: 9 Changed 11 months ago by catap (Kirill A. Korinsky)
erikbs, can you open a PR to OpenBLAS and OpenBLAS-devel ports with backport of this patch?
comment:9 Changed 11 months ago by erikbs
Replying to catap:
erikbs, can you open a PR to OpenBLAS and OpenBLAS-devel ports with backport of this patch?
Good idea; https://github.com/macports/macports-ports/pull/21650
Do you have time to test it on a couple of the versions you tested the weak linking solution on (except Mavericks of course)? I think weak linking is no longer necessary, so I removed that.
comment:10 Changed 11 months ago by Dave-Allured (Dave Allured)
Cc: | Dave-Allured added |
---|
comment:11 Changed 11 months ago by erikbs
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
erikbs, which system do you use? I used py-numpy as a test port when making that migration, and it works well on macOS 13:
and the same for py311: