Opened 6 years ago
Closed 5 years ago
#56954 closed defect (worksforme)
py-numpy: numpy.polyfit broken with +gfortran variant on High Sierra
Reported by: | mojca (Mojca Miklavec) | Owned by: | michaelld (Michael Dickens) |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | ports | Version: | |
Keywords: | Cc: | jmroot (Joshua Root), dershow, jsalort (Julien Salort), DanielO (Daniel O'Connor), reneeotten (Renee Otten), majoc-at-astro (majoc-at-astro), Dave-Allured (Dave Allured) | |
Port: | py-numpy |
Description
There seems to be an issue with numpy.polyfit under various versions of python I tested (2.7, 3.6, 3.7).
Running the following code under system python:
import numpy as np
points = np.array([[0, -2.1108348e+04],
                   [3.2768000e+04, -2.7959160e+03],
                   [6.5534000e+04, 1.4279546e+04],
                   [4.9151000e+04, 6.6721514e+03],
                   [1.6384000e+04, -1.3232387e+04]], dtype=np.float32)
koef1 = np.polyfit(points[:,0], points[:,1], 1)
koef2 = np.polyfit(points[:,0], points[:,1], 2)
works and returns
>>> koef1
array([ 5.53485811e-01, -2.13732832e+04], dtype=float32)
>>> koef2
array([ -4.00165135e-07, 5.79710186e-01, -2.15881035e+04], dtype=float32)
However, using Python from MacPorts, it looks much worse:
>>> koef1 = np.polyfit(points[:,0], points[:,1], 1)
Python(83745,0x7fff9e546380) malloc: *** mach_vm_map(size=18446744072450498560) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
init_dgelsd failed init
__main__:1: RankWarning: Polyfit may be poorly conditioned
>>> koef2 = np.polyfit(points[:,0], points[:,1], 2)
Python(83745,0x7fff9e546380) malloc: *** mach_vm_map(size=18446744072450498560) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
init_dgelsd failed init
__main__:1: RankWarning: Polyfit may be poorly conditioned
>>> koef1
array([2.2937462e-317, 1.0438839e-312])
>>> koef2
array([5.05031302e+09, 8.97368555e+04, 2.23606798e+00])
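For anyone who wants to check whether a given installation is affected, the comparison can be scripted. The following is only a minimal sketch and not part of the original report: the helper names `coefficients_match` and `check_installed_numpy` and the tolerance are assumptions chosen for illustration, while the reference coefficients are the values quoted in this ticket for the system Python.

```python
import math

# Reference coefficients quoted in this ticket (system Python, working NumPy).
EXPECTED_KOEF1 = [5.53485811e-01, -2.13732832e+04]
EXPECTED_KOEF2 = [-4.00165135e-07, 5.79710186e-01, -2.15881035e+04]


def coefficients_match(actual, expected, rel_tol=1e-2):
    """Return True if every fitted coefficient is close to its reference value.

    The tolerance is an assumption; it only needs to separate a sane fit
    from the garbage values shown in the failing session.
    """
    if len(actual) != len(expected):
        return False
    return all(math.isclose(float(a), e, rel_tol=rel_tol)
               for a, e in zip(actual, expected))


def check_installed_numpy():
    """Run the ticket's reproduction against whichever NumPy is importable."""
    import numpy as np  # imported lazily so the helper above works without NumPy

    points = np.array([[0, -2.1108348e+04],
                       [3.2768000e+04, -2.7959160e+03],
                       [6.5534000e+04, 1.4279546e+04],
                       [4.9151000e+04, 6.6721514e+03],
                       [1.6384000e+04, -1.3232387e+04]], dtype=np.float32)
    koef1 = np.polyfit(points[:, 0], points[:, 1], 1)
    koef2 = np.polyfit(points[:, 0], points[:, 1], 2)
    return (coefficients_match(koef1, EXPECTED_KOEF1)
            and coefficients_match(koef2, EXPECTED_KOEF2))
```

Calling `check_installed_numpy()` under each MacPorts Python should return `True` on a working build; the coefficients from the failing session above would make it return `False`.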
Attachments (1)
Change History (38)
comment:1 Changed 6 years ago by mojca (Mojca Miklavec)
Changed 6 years ago by mojca (Mojca Miklavec)
Attachment: | numpy-test-suite.txt added |
---|
failed tests of numpy
comment:2 Changed 6 years ago by mf2k (Frank Schima)
I cannot reproduce your polyfit error. It works fine for me with python36 and ipython. Here are my test suite results:
NumPy version 1.15.0 ... 4946 passed, 20 skipped, 7 xfailed in 274.98 seconds
For reference, what are your installed versions of the relevant ports?
$ port installed python36 py36-numpy py36-ipython
The following ports are currently installed:
  py36-ipython @6.4.0_0 (active)
  py36-numpy @1.15.0_0+gcc8 (active)
  python36 @3.6.6_0+optimizations (active)
comment:3 Changed 6 years ago by michaelld (Michael Dickens)
The commands you list work for me using MacPorts' Python 2.7, 3.6, and 3.7 and the latest NumPy, all on the latest 10.12.
comment:4 Changed 6 years ago by reneeotten (Renee Otten)
I am seeing the same errors as reported by mojca with Python 2.7, 3.6, and 3.7. The only difference for me, compared to what Frank reports, is that all py-numpy ports were installed with the default +gfortran variant:
py36-numpy @1.15.0_0+gfortran (active)
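Since the variant turns out to be the distinguishing factor, it can help to extract variants mechanically when comparing `port installed` listings from several reporters. The sketch below is an illustration only: the function name `parse_port_line` is hypothetical, and the line format it accepts is inferred solely from the listings pasted in this ticket, not from any MacPorts specification.

```python
import re

# Matches lines such as "py36-numpy @1.15.0_0+gfortran (active)".
# The format is an assumption based on the output quoted in this ticket.
_PORT_LINE = re.compile(
    r"(?P<name>\S+)\s+@(?P<version>[^\s+]+)(?P<variants>(?:\+\w+)*)"
    r"(?:\s+\((?P<active>active)\))?"
)


def parse_port_line(line):
    """Split one `port installed` line into name, version, variants, active.

    Returns None for lines that do not look like a port entry
    (for example the "The following ports..." header).
    """
    match = _PORT_LINE.fullmatch(line.strip())
    if match is None:
        return None
    variants = [v for v in match.group("variants").split("+") if v]
    return {
        "name": match.group("name"),
        "version": match.group("version"),
        "variants": variants,
        "active": match.group("active") is not None,
    }
```

With this, `parse_port_line("py36-numpy @1.15.0_0+gfortran (active)")["variants"]` gives `["gfortran"]`, which makes it easy to see at a glance which reporters built with which compiler or BLAS variant.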
comment:5 follow-up: 7 Changed 6 years ago by mf2k (Frank Schima)
How about these ports? There was a problem with using llvm for them that surfaced recently.
$ port installed ld64 cctools
The following ports are currently installed:
  cctools @895_6+xcode (active)
  ld64 @3_1+ld64_xcode (active)
comment:6 Changed 6 years ago by michaelld (Michael Dickens)
All of my NumPy ports are installed using the default:
py27-numpy @1.15.0_0+gfortran (active)
py36-numpy @1.15.0_0+gfortran (active)
py37-numpy @1.15.0_0+gfortran (active)
My cctools and ld64 are up to date, but since these NumPy ports were installed well before the latest issues with cctools and ld64, I doubt that's what's causing the issue here (though one never knows ;)
comment:7 Changed 6 years ago by reneeotten (Renee Otten)
Replying to mf2k:
How about these ports? There was a problem with using llvm for them that surfaced recently.
$ port installed ld64 cctools
The following ports are currently installed:
  cctools @895_6+xcode (active)
  ld64 @3_1+ld64_xcode (active)
They seem to be the latest as well:
~> port installed ld64 cctools
The following ports are currently installed:
  cctools @895_6+xcode (active)
  ld64 @3_1+ld64_xcode (active)
Doing an uninstall and clean, followed by sudo port -vst install py37-numpy, didn't help either.
comment:8 Changed 6 years ago by mf2k (Frank Schima)
I switched my py36-numpy to use +gfortran and I can confirm the error reported in this ticket. I installed with the binary from the buildbot and also built from source, and the same failed result occurs. So there is definitely something wrong with the +gfortran variant for py-numpy. I think py-numpy should switch to +gcc8 as the default.
comment:9 Changed 6 years ago by michaelld (Michael Dickens)
What OSX version are folks having issues on?
py*-numpy +gfortran built from source works for me on 10.12. Haven't tried this on other OSX versions, but I can easily do so; I do "import numpy; numpy.test()" on all OSX I have around and it works about the same on all of them. All built from source.
comment:10 Changed 6 years ago by michaelld (Michael Dickens)
I have no objection to switching to +gcc8; just want to make sure that's the correct fix.
comment:12 Changed 6 years ago by michaelld (Michael Dickens)
Gotcha. I confirm the issue with 10.13. Doesn't happen with 10.12 or any prior (in my testing). So this bug is 10.13 only (in my testing).
Thus, wondering if this is NumPy or something else. The routine "init_dgelsd" looks like it's from LAPACK(E).
comment:13 Changed 6 years ago by mf2k (Frank Schima)
Summary: | python: numpy.polyfit broken → py-numpy: numpy.polyfit broken with +gfortran variant on High Sierra |
---|
comment:14 Changed 6 years ago by mf2k (Frank Schima)
Port: | python27 python36 python37 removed |
---|
comment:15 Changed 6 years ago by michaelld (Michael Dickens)
Following the instructions to set a breakpoint at malloc_error_break, here's the backtrace for Python 2.7:
% lldb /opt/local/bin/python2.7
(lldb) target create "/opt/local/bin/python2.7"
Current executable set to '/opt/local/bin/python2.7' (x86_64).
(lldb) b malloc_error_break
Breakpoint 1: where = libsystem_malloc.dylib`malloc_error_break, address = 0x0000000000011962
(lldb) r test_numpy_10_13.py
Process 25452 launched: '/opt/local/bin/python2.7' (x86_64)
Process 25452 stopped
* thread #2, stop reason = exec
    frame #0: 0x000000010000519c dyld`_dyld_start
dyld`_dyld_start:
->  0x10000519c <+0>: popq %rdi
    0x10000519d <+1>: pushq $0x0
    0x10000519f <+3>: movq %rsp, %rbp
    0x1000051a2 <+6>: andq $-0x10, %rsp
Target 0: (Python) stopped.
(lldb) c
Process 25452 resuming
Python(25452,0x7fff9cda4380) malloc: *** mach_vm_map(size=18446744072618995712) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Process 25452 stopped
* thread #2, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x00007fff64772962 libsystem_malloc.dylib`malloc_error_break
libsystem_malloc.dylib`malloc_error_break:
->  0x7fff64772962 <+0>: pushq %rbp
    0x7fff64772963 <+1>: movq %rsp, %rbp
    0x7fff64772966 <+4>: nop
    0x7fff64772967 <+5>: nopl (%rax)
Target 0: (Python) stopped.
(lldb) bt
* thread #2, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00007fff64772962 libsystem_malloc.dylib`malloc_error_break
    frame #1: 0x00007fff6476fa08 libsystem_malloc.dylib`szone_error + 392
    frame #2: 0x00007fff64771ed6 libsystem_malloc.dylib`mvm_allocate_pages + 256
    frame #3: 0x00007fff64767475 libsystem_malloc.dylib`large_malloc + 464
    frame #4: 0x00007fff6476339d libsystem_malloc.dylib`szone_malloc_should_clear + 388
    frame #5: 0x00007fff647631bd libsystem_malloc.dylib`malloc_zone_malloc + 103
    frame #6: 0x00007fff647624c7 libsystem_malloc.dylib`malloc + 24
    frame #7: 0x0000000104fb5a2f _umath_linalg.so`DOUBLE_lstsq + 735
    frame #8: 0x0000000104934e27 umath.so`PyUFunc_GenericFunction + 19415
    frame #9: 0x00000001049378be umath.so`ufunc_generic_call + 174
    frame #10: 0x00000001000b0201 Python`PyObject_Call + 97
    frame #11: 0x000000010015b9aa Python`PyEval_EvalFrameEx + 9130
    frame #12: 0x00000001001593a4 Python`PyEval_EvalCodeEx + 2212
    frame #13: 0x0000000100163f0d Python`fast_function + 109
    frame #14: 0x000000010015b80c Python`PyEval_EvalFrameEx + 8716
    frame #15: 0x00000001001593a4 Python`PyEval_EvalCodeEx + 2212
    frame #16: 0x0000000100163f0d Python`fast_function + 109
    frame #17: 0x000000010015b80c Python`PyEval_EvalFrameEx + 8716
    frame #18: 0x00000001001593a4 Python`PyEval_EvalCodeEx + 2212
    frame #19: 0x0000000100158af2 Python`PyEval_EvalCode + 34
    frame #20: 0x0000000100186fed Python`PyRun_FileExFlags + 157
    frame #21: 0x0000000100186b24 Python`PyRun_SimpleFileExFlags + 740
    frame #22: 0x000000010019e71f Python`Py_Main + 3279
    frame #23: 0x00007fff645ba015 libdyld.dylib`start + 1
(lldb)
comment:16 Changed 6 years ago by michaelld (Michael Dickens)
and Python 3.6:
% lldb /opt/local/bin/python3.6
(lldb) target create "/opt/local/bin/python3.6"
Current executable set to '/opt/local/bin/python3.6' (x86_64).
(lldb) b malloc_error_break
Breakpoint 1: where = libsystem_malloc.dylib`malloc_error_break, address = 0x00007fff64772962
(lldb) r test_numpy_10_13.py
Process 69019 launched: '/opt/local/Library/Frameworks/Python.framework/Versions/3.6/Resources/Python.app/Contents/MacOS/Python' (x86_64)
python3.6(69019,0x7fff9cda4380) malloc: *** mach_vm_map(size=18446744072618991616) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Process 69019 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x00007fff64772962 libsystem_malloc.dylib`malloc_error_break
libsystem_malloc.dylib`malloc_error_break:
->  0x7fff64772962 <+0>: pushq %rbp
    0x7fff64772963 <+1>: movq %rsp, %rbp
    0x7fff64772966 <+4>: nop
    0x7fff64772967 <+5>: nopl (%rax)
Target 0: (Python) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00007fff64772962 libsystem_malloc.dylib`malloc_error_break
    frame #1: 0x00007fff6476fa08 libsystem_malloc.dylib`szone_error + 392
    frame #2: 0x00007fff64771ed6 libsystem_malloc.dylib`mvm_allocate_pages + 256
    frame #3: 0x00007fff64767475 libsystem_malloc.dylib`large_malloc + 464
    frame #4: 0x00007fff6476339d libsystem_malloc.dylib`szone_malloc_should_clear + 388
    frame #5: 0x00007fff647631bd libsystem_malloc.dylib`malloc_zone_malloc + 103
    frame #6: 0x00007fff647624c7 libsystem_malloc.dylib`malloc + 24
    frame #7: 0x0000000104ed4a2f _umath_linalg.cpython-36m-darwin.so`DOUBLE_lstsq + 735
    frame #8: 0x0000000104789267 umath.cpython-36m-darwin.so`PyUFunc_GenericFunction + 19415
    frame #9: 0x000000010478be3e umath.cpython-36m-darwin.so`ufunc_generic_call + 174
    frame #10: 0x00000001000ad243 Python`_PyObject_FastCallDict + 143
    frame #11: 0x00000001000ad5fc Python`_PyObject_FastCallKeywords + 97
    frame #12: 0x000000010014c356 Python`call_function + 443
    frame #13: 0x0000000100144c25 Python`_PyEval_EvalFrameDefault + 4479
    frame #14: 0x000000010014cb06 Python`_PyEval_EvalCodeWithName + 1747
    frame #15: 0x000000010014d1e9 Python`fast_function + 218
    frame #16: 0x000000010014c35d Python`call_function + 450
    frame #17: 0x0000000100144b8d Python`_PyEval_EvalFrameDefault + 4327
    frame #18: 0x000000010014cb06 Python`_PyEval_EvalCodeWithName + 1747
    frame #19: 0x000000010014d1e9 Python`fast_function + 218
    frame #20: 0x000000010014c35d Python`call_function + 450
    frame #21: 0x0000000100144b8d Python`_PyEval_EvalFrameDefault + 4327
    frame #22: 0x000000010014cb06 Python`_PyEval_EvalCodeWithName + 1747
    frame #23: 0x0000000100143a2c Python`PyEval_EvalCode + 42
    frame #24: 0x000000010016cd8f Python`run_mod + 54
    frame #25: 0x000000010016bd9e Python`PyRun_FileExFlags + 164
    frame #26: 0x000000010016b489 Python`PyRun_SimpleFileExFlags + 283
    frame #27: 0x000000010018026a Python`Py_Main + 3466
    frame #28: 0x0000000100001e1d Python`___lldb_unnamed_symbol1$$Python + 227
    frame #29: 0x00007fff645ba015 libdyld.dylib`start + 1
(lldb)
comment:17 Changed 6 years ago by michaelld (Michael Dickens)
and Python 3.7:
% lldb /opt/local/bin/python3.7
(lldb) target create "/opt/local/bin/python3.7"
Current executable set to '/opt/local/bin/python3.7' (x86_64).
(lldb) b malloc_error_break
Breakpoint 1: where = libsystem_malloc.dylib`malloc_error_break, address = 0x0000000000011962
(lldb) r test_numpy_10_13.py
Process 79296 launched: '/opt/local/bin/python3.7' (x86_64)
Process 79296 stopped
* thread #2, stop reason = exec
    frame #0: 0x000000010000519c dyld`_dyld_start
dyld`_dyld_start:
->  0x10000519c <+0>: popq %rdi
    0x10000519d <+1>: pushq $0x0
    0x10000519f <+3>: movq %rsp, %rbp
    0x1000051a2 <+6>: andq $-0x10, %rsp
Target 0: (Python) stopped.
(lldb) c
Process 79296 resuming
Python(79296,0x7fff9cda4380) malloc: *** mach_vm_map(size=18446744072618991616) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Process 79296 stopped
* thread #2, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x00007fff64772962 libsystem_malloc.dylib`malloc_error_break
libsystem_malloc.dylib`malloc_error_break:
->  0x7fff64772962 <+0>: pushq %rbp
    0x7fff64772963 <+1>: movq %rsp, %rbp
    0x7fff64772966 <+4>: nop
    0x7fff64772967 <+5>: nopl (%rax)
Target 0: (Python) stopped.
(lldb) bt
* thread #2, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00007fff64772962 libsystem_malloc.dylib`malloc_error_break
    frame #1: 0x00007fff6476fa08 libsystem_malloc.dylib`szone_error + 392
    frame #2: 0x00007fff64771ed6 libsystem_malloc.dylib`mvm_allocate_pages + 256
    frame #3: 0x00007fff64767475 libsystem_malloc.dylib`large_malloc + 464
    frame #4: 0x00007fff6476339d libsystem_malloc.dylib`szone_malloc_should_clear + 388
    frame #5: 0x00007fff647631bd libsystem_malloc.dylib`malloc_zone_malloc + 103
    frame #6: 0x00007fff647624c7 libsystem_malloc.dylib`malloc + 24
    frame #7: 0x0000000105256a2f _umath_linalg.cpython-37m-darwin.so`DOUBLE_lstsq + 735
    frame #8: 0x0000000104389267 umath.cpython-37m-darwin.so`PyUFunc_GenericFunction + 19415
    frame #9: 0x000000010438be3e umath.cpython-37m-darwin.so`ufunc_generic_call + 174
    frame #10: 0x00000001000bc6dd Python`_PyObject_FastCallKeywords + 359
    frame #11: 0x00000001001525d3 Python`call_function + 568
    frame #12: 0x000000010014a6ab Python`_PyEval_EvalFrameDefault + 2706
    frame #13: 0x0000000100152f31 Python`_PyEval_EvalCodeWithName + 1837
    frame #14: 0x00000001000bc83c Python`_PyFunction_FastCallKeywords + 225
    frame #15: 0x00000001001525da Python`call_function + 575
    frame #16: 0x000000010014a617 Python`_PyEval_EvalFrameDefault + 2558
    frame #17: 0x0000000100152f31 Python`_PyEval_EvalCodeWithName + 1837
    frame #18: 0x00000001000bc83c Python`_PyFunction_FastCallKeywords + 225
    frame #19: 0x00000001001525da Python`call_function + 575
    frame #20: 0x000000010014a586 Python`_PyEval_EvalFrameDefault + 2413
    frame #21: 0x0000000100152f31 Python`_PyEval_EvalCodeWithName + 1837
    frame #22: 0x0000000100149b91 Python`PyEval_EvalCode + 42
    frame #23: 0x0000000100177e7f Python`run_mod + 54
    frame #24: 0x0000000100176e9a Python`PyRun_FileExFlags + 164
    frame #25: 0x0000000100176579 Python`PyRun_SimpleFileExFlags + 283
    frame #26: 0x000000010018dc4e Python`pymain_main + 5114
    frame #27: 0x000000010018e3e0 Python`_Py_UnixMain + 104
    frame #28: 0x00007fff645ba015 libdyld.dylib`start + 1
    frame #29: 0x00007fff645ba015 libdyld.dylib`start + 1
(lldb)
comment:18 Changed 6 years ago by michaelld (Michael Dickens)
So it seems like there's a memory allocation error in umath_linalg routine DOUBLE_lstsq ... or, something like that.
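One detail worth noting about the failed allocation: the sizes printed by mach_vm_map are just below 2^64, which is what a small negative number looks like once it is treated as an unsigned 64-bit size. The sketch below only does that arithmetic; the helper name `as_signed_64` is made up for illustration, and the idea that a negative size reached malloc is a reading of the numbers, not something confirmed in this ticket.

```python
def as_signed_64(value):
    """Reinterpret an unsigned 64-bit integer as two's-complement signed."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


# Sizes copied from the malloc messages in this ticket.
for requested in (18446744072450498560,
                  18446744072618995712,
                  18446744072618991616):
    print(requested, "->", as_signed_64(requested))
```

All three values come out negative (for example -1259053056 for the first one), which would be consistent with a bad or uninitialized size being used when the least-squares workspace is allocated.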
comment:19 Changed 6 years ago by michaelld (Michael Dickens)
lstsq == "least squares" ... which is for solving linear problems. So not a huge surprise given that the ticket issue is for polyfit ... fitting polynomial curves to data, which is a least squares type of problem.
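To make the least-squares connection concrete, here is a minimal pure-Python sketch of a degree-1 fit using the textbook closed form (slope = Sxy / Sxx). It is not how NumPy computes polyfit (NumPy goes through LAPACK, as the backtraces show); the function name `linear_least_squares` is hypothetical, and the data points are the ones from the ticket description.

```python
def linear_least_squares(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered sums of squares and cross-products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Data points from the ticket description.
xs = [0.0, 3.2768000e+04, 6.5534000e+04, 4.9151000e+04, 1.6384000e+04]
ys = [-2.1108348e+04, -2.7959160e+03, 1.4279546e+04, 6.6721514e+03, -1.3232387e+04]

slope, intercept = linear_least_squares(xs, ys)
print(slope, intercept)
```

The result should land close to the working koef1 reported for the system Python (about 0.5535 and -21373.28), which gives a NumPy-independent cross-check of what polyfit is expected to return here.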
comment:20 Changed 6 years ago by michaelld (Michael Dickens)
Here is where the error message is coming from: https://github.com/numpy/numpy/blob/master/numpy/linalg/umath_linalg.c.src#L2561 .
and here's the printed warning: https://github.com/numpy/numpy/blob/master/numpy/lib/polynomial.py#L585 .
No idea if this is useful, but here we are!
comment:21 Changed 6 years ago by michaelld (Michael Dickens)
I don't have gcc8 installed yet, so I can't test with it. That said, py27-numpy +gcc7 works for me on 10.13 ... which is strange because the +gfortran variant has the same dependencies as +gcc7 ... which makes sense because ${prefix}/bin/gfortran-mp-7 is provided by gcc7 ... so, what is the difference in the Portfile when using the 2 variants? That seems to make the difference.
comment:22 Changed 6 years ago by mojca (Mojca Miklavec)
These are the ports that I have installed:
cctools @895_6+xcode (active)
ld64 @3_1+ld64_xcode (active)
py27-numpy @1.15.0_0+gfortran (active)
py36-numpy @1.15.0_0+gfortran (active)
py37-numpy @1.15.0_0+openblas (active) # after playing with the idea that gfortran might be the cause, but it didn't change anything
OpenBLAS @0.3.2_0+clang+gcc7+lapack (active)
Apparently either
sudo port install py37-numpy +gcc7
or
sudo port install py37-numpy +gcc8
fixed the problem, but it would be ideal to figure out why exactly before blindly changing the default.
comment:23 Changed 6 years ago by dershow
Cc: | dershow added |
---|
comment:24 Changed 6 years ago by andreavicere
On my system (macOS Mojave 10.14, MacPorts 2.5.3), switching only to gcc8 did not suffice.
I also had to switch to openblas: running
sudo port install py27-numpy +gcc8 +openblas
sudo port install py37-numpy +gcc8 +openblas
fixed the numpy.polyfit issue in both Python 2 and 3.
comment:25 Changed 6 years ago by jsalort (Julien Salort)
Cc: | jsalort added |
---|
comment:26 Changed 6 years ago by DanielO (Daniel O'Connor)
Cc: | DanielO added |
---|
comment:27 Changed 6 years ago by reneeotten (Renee Otten)
Cc: | reneeotten added |
---|
comment:28 Changed 6 years ago by lpsinger (Leo Singer)
Just selecting openblas and not gcc8 was sufficient for me.
sudo port install py27-numpy +gcc8 +openblas
comment:29 Changed 6 years ago by reneeotten (Renee Otten)
FYI, the upstream bug-report is here.
Installing with +gcc8 and +openblas did work for me though...
py27-numpy @1.15.4_0+gcc8+openblas (active)
OpenBLAS @0.3.3_0+clang+gcc8+lapack (active)
comment:30 Changed 6 years ago by majoc-at-astro (majoc-at-astro)
Cc: | majoc-at-astro added |
---|
comment:31 Changed 6 years ago by mf2k (Frank Schima)
I am not able to get a working py37-scipy. It keeps trying to rebuild and fails after 3 attempts. I'm not sure if this is related to this issue or not? Here are my relevant installed ports.
$ port installed OpenBLAS-devel ld64 cctools py37-numpy py37-scipy
The following ports are currently installed:
  cctools @921_1+llvm70 (active)
  ld64 @3_1+ld64_xcode (active)
  OpenBLAS-devel @20190210_0+gcc8+lapack (active)
  py37-numpy @1.16.1_0+gcc8+openblas (active)
  py37-scipy @1.2.1_0+gcc8+openblas (active)
I have tried various combinations with no success. Can anyone share a working combo?
comment:33 Changed 6 years ago by Dave-Allured (Dave Allured)
Cc: | Dave-Allured added |
---|
comment:34 Changed 6 years ago by michaelld (Michael Dickens)
comment:35 Changed 6 years ago by michaelld (Michael Dickens)
In my testing, using +gfortran +openblas does the trick, so I just committed making +openblas a default variant. This change will fix new default installs, but it won't help folks who already have this port installed and are just updating it.
After reading through the NumPy bug report, I'm wondering if the issue is at least partly with Apple's lapack, since that's what's used if -atlas -openblas is selected (which was the default). Wondering if this might be fixed by using the port vecLibFort.
Anyway: Progress!
comment:36 Changed 6 years ago by michaelld (Michael Dickens)
I might think of making a +accelerate variant that's optional, since NumPy seems to provide an internal version of LAPACK if no other BLAS is selected.
comment:37 Changed 5 years ago by mf2k (Frank Schima)
Resolution: | → worksforme |
---|---|
Status: | assigned → closed |
I cannot replicate this problem anymore.
The test suite for numpy can be run via
and returns a number of errors: