Opened 2 years ago
Last modified 2 years ago
#66307 assigned defect
Scalapack: will not configure properly on PPC since mpi PG cannot handle mpich-gcc* but wants mpich-default
Reported by: | barracuda156 | Owned by: | catap (Kirill A. Korinsky) |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | ports | Version: | 2.8.0 |
Keywords: | powerpc, leopard, snowleopard | Cc: | |
Port: | scalapack |
Description
Existing configure fails with mpich-gcc. To begin with, it tries to install mpich-default, warning that it is going to fail. After removing the mpi specification line, configure fails on:
-- Check for working Fortran compiler: /opt/local/bin/gfortran-mp-12 - skipped
-- Found MPI_C: /usr/lib/libmpi.dylib (found version "2.0")
-- Could NOT find MPI_Fortran (missing: MPI_Fortran_LIB_NAMES MPI_Fortran_F77_HEADER_DIR MPI_Fortran_MODULE_DIR MPI_Fortran_WORKS)
-- Could NOT find MPI (missing: MPI_Fortran_FOUND) (found version "2.0")
-- Found MPI_LIBRARY : FALSE
CMake Error at CMakeLists.txt:74 (message):
  --> MPI Library NOT FOUND -- please set MPI_BASE_DIR accordingly --
-- Configuring incomplete, errors occurred!
Yet mpich is installed with gfortran. The problem is that ${mpi.fc} is not the name of a compiler, but only a prefix; naturally, it fails to be found.
This works correctly:
pre-configure {
    configure.args-append \
        -DMPI_C_COMPILER=${mpi.cc}-mpich-gcc12 \
        -DMPI_Fortran_COMPILER=${mpi.fc}-mpich-gcc12 \
        -DMPIEXEC=${prefix}/bin/${mpi.exec} \
        -DLAPACK_LIBRARIES="-L${prefix}/lib ${linalglib}"
}
Obviously, a general fix should be added either to the PG or the port. I just quoted what fixed the failure locally.
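As a rough illustration of what a less hard-coded version of the local workaround might look like, the sketch below factors the wrapper suffix into a single variable. This is only a sketch under assumptions: `mpich_suffix` is a hypothetical Portfile-local variable, not something the mpi PortGroup provides, and a real general fix would have to derive the suffix from the selected mpich subport rather than set it by hand.

```tcl
# Hypothetical Portfile-local variable (NOT provided by the mpi PortGroup):
# the suffix of the mpich subport whose compiler wrappers should be used.
set mpich_suffix    mpich-gcc12

pre-configure {
    # ${mpi.cc} and ${mpi.fc} hold only a prefix here (as described above),
    # so the suffix is appended to obtain the full wrapper names.
    configure.args-append \
        -DMPI_C_COMPILER=${mpi.cc}-${mpich_suffix} \
        -DMPI_Fortran_COMPILER=${mpi.fc}-${mpich_suffix} \
        -DMPIEXEC=${prefix}/bin/${mpi.exec} \
        -DLAPACK_LIBRARIES="-L${prefix}/lib ${linalglib}"
}
```

With `mpich_suffix` set to `mpich-gcc12` this passes the same arguments as the snippet quoted above; only the suffix would need to change for another mpich-gcc subport.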
Also, this should be added, otherwise gcc-4.2 is picked and it fails on configure due to unsupported flags:
compiler.blacklist *gcc-4.*
Attachments (1)
Change History (29)
comment:1 follow-up: 2 Changed 2 years ago by catap (Kirill A. Korinsky)
Sergey, may I ask you to open PR? :)
comment:2 Changed 2 years ago by barracuda156
Replying to catap:
Sergey, may I ask you to open PR? :)
Sure, but we need a solution for mpi PG for mpich-gcc to be correctly supported or otherwise “required” should be dropped from mpi options.
comment:3 follow-up: 4 Changed 2 years ago by kencu (Ken)
Not to mention that gcc12 is not presently a supported compiler on MacPorts for systems < SnowLeopard or for PPC.
The mpich ecosystem is quite difficult to approach, and a general fix for older systems is not simple.
In the end, I just wrote up my own mpich-default port in about 10 minutes that builds and runs on every system from 10.4 PPC on up.
comment:4 follow-up: 6 Changed 2 years ago by barracuda156
Replying to kencu:
Not to mention that gcc12 is not presently a supported compiler on MacPorts for systems < SnowLeopard or for PPC.
The time has come. I will make a PR today to move Leopard to modern libgcc and enable building gcc10+ with gcc10-bootstrap. I have been using that ever since gcc10-bootstrap was made by catap across three systems, I reckon this is a thorough testing.
The mpich ecosystem is quite difficult to approach, and a general fix for older systems is not simple.
In the end, I just wrote up my own mpich-default port in about 10 minutes that builds and runs on every system from 10.4 PPC on up.
That can be another solution. Could you commit your update to mpich-default or share it, so that we don’t do the work twice?
comment:5 Changed 2 years ago by kencu (Ken)
I will post up my mpich-default port here for you to try out. I just use it as a drop-in replacement for the current one in MacPorts, and don't touch anything else in the Portfiles or PortGroups.
Once mpich-default is installed, scalapack configures easily, finds all the correct compilers, and builds through to the end on 10.4 PPC without any alterations to the scalapack Portfile.
There is, however, a linking error at the very final step that I haven't looked into yet...
comment:6 follow-up: 11 Changed 2 years ago by kencu (Ken)
Replying to barracuda156:
The time has come. I will make a PR today to move Leopard to modern libgcc and enable building gcc10+ with gcc10-bootstrap. I have been using that ever since gcc10-bootstrap was made by catap across three systems, I reckon this is a thorough testing.
Sounds good! You need to:
- make sure it builds on 10.4, 10.5 and 10.6 PPC and 10.4 and 10.5 Intel
- control for the possibility that a user has a newer clang installed on 10.5 Intel that will be used as the assembler (this messes up gcc sometimes).
- make some kind of consideration for the fact that libgcc8, 9, 10, and 11 will be missing (I presume), so the libgcc Port will need to somehow handle that
- have a "force-deactivate" phase such as the one done when we upgraded those systems from libgcc6 to libgcc7 a few years ago, and like the one I did when I updated 10.6 Intel from libgcc7 to libgcc12.
You will most likely need some help, as these things are somewhat hard to do right. Catap here has developed many of the needed skills, and there have been a few folks around with strong opinions on things that will be able to dig in and test your proposal.
Good luck!
comment:7 Changed 2 years ago by kencu (Ken)
Summary: | Scalapack: configure options breaking build on PPC → Scalapack: will not configure properly if mpi is removed from the Portfile |
---|
comment:8 follow-up: 9 Changed 2 years ago by kencu (Ken)
I changed the title of your ticket here, as scalapack configures 100% fine if it finds mpich-default to configure against.
However, if you remove the mpi specification line, I suppose it is not a big surprise that it won't configure right.
comment:9 Changed 2 years ago by barracuda156
Replying to kencu:
I changed the title of your ticket here, as scalapack configures 100% fine if it finds mpich-default to configure against.
However, if you remove the mpi specification line, I suppose it is not a big surprise that it won't configure right.
Well, I did not remove mpi from the portfile, of course. I only removed the required option, which wanted mpich-default, which in turn warned it is broken.
As it is, settings are wrong. Either mpich-default has to be fixed for PPC or mpich-gcc enabled correctly in the mpi PG.
comment:10 Changed 2 years ago by barracuda156
Summary: | Scalapack: will not configure properly if mpi is removed from the Portfile → Scalapack: will not configure properly on PPC since mpi PG cannot handle mpich-gcc* but wants mpich-default |
---|
comment:11 Changed 2 years ago by barracuda156
Replying to kencu:
You will most likely need some help, as these things are somewhat hard to do right. Catap here has developed many of the needed skills, and there have been a few folks around with strong opinions on things that will be able to dig in and test your proposal.
Good luck!
Thank you. I will certainly need help re Intel part: this is not something I am able to do.
I will check re force deactivation (point 4).
As for point 3, all libgccs build fine, though I am not sure we need silly ports that do not install anything but take several hours to build, just to delete their build directory, leaving a line in the registry. AFAIR, libgcc8 installs exactly nothing. libgcc9 installs extensions dylibs, though those perhaps should be installed by libgcc10, if at all needed (Iain said we can use symlinks instead). libgcc10 installs a Fortran dylib; again, not sure if that is something needed. libgcc11 installs nothing again.
comment:12 Changed 2 years ago by barracuda156
Okay, so to be precise, here is what the libgccs install:
- libgcc6: libgfortran.3.dylib
- libgcc7: libgfortran.4.dylib
- libgcc8: NOTHING
- libgcc9: libgcc_ext.10.4.dylib libgcc_ext.10.5.dylib
- libgcc10: NOTHING
- libgcc11: NOTHING
- libgcc12: full runtime, of which into libgcc: libatomic.1.dylib libatomic.dylib libgcc_ehs.1.1.dylib libgcc_ehs.dylib libgcc_s.1.1.dylib libgcc_s.1.dylib libgcc_s.dylib libgfortran.5.dylib libgfortran.dylib libgomp.1.dylib libgomp.dylib libitm.1.dylib libitm.dylib libobjc-gnu.4.dylib libobjc-gnu.dylib libssp.0.dylib libssp.dylib libstdc++.6.dylib libstdc++.dylib
That is, we have three parasitic dependencies (libgcc8, libgcc10 and libgcc11) which install strictly nothing but are required to be built. On the fastest G5 (I have a G5 Quad with 16 GB RAM and an SSD) it still takes about 3 hours per arch, so if we consider Leopard and universal builds, that translates into 18 hours of useless compilation (3 ports, 2 archs each, about 3 hours per build).
comment:13 follow-up: 16 Changed 2 years ago by kencu (Ken)
I have not seen that all gccs 8-11 build on 10.4 through 10.6, Intel and PPC. Especially not in the MacPorts environment.
Where does your assertion come from?
OTOH, I also see no reason to support all those either.
comment:14 Changed 2 years ago by kencu (Ken)
btw, we already do use symlinks where it works.
And sure, once it was realized that libgcc11, for example, installs nothing, the libgcc11 port could have been set up to skip building it. You’d have to take that up with Chris, who set that up.
comment:15 follow-up: 17 Changed 2 years ago by kencu (Ken)
Oh, I remember why Chris set it up to always build.
Iain changes things around sometimes between version bumps, so gcc11.1 sometimes installed libraries that gcc11.0 did not, for example, and it was hard to keep up with that.
Also, different OS versions and different archs installed different libraries.
So rather than try to keep track of all that nonsense, the portfile just looks at the dylibs and installs what is missing. Sometimes that is nothing, sometimes that is not nothing.
comment:16 Changed 2 years ago by barracuda156
Replying to kencu:
i have not seen that all gccs 8-11 build on 10.4 through 10.6 Intel and PPC. Esp build in MacPorts environment.
Where does your assertion come from?
I have built those myself on 10.5.8 and 10A190. In the MacPorts environment, of course. I have used gcc10–gcc12, they all work on Leopard, SL PPC and SL Rosetta.
Not sure if we have any case for gcc8–gcc10 at all, and certainly not for parasitic libgccs (gcc10-bootstrap is the only gcc10 that is essential). We do want to keep gcc11 for the time being, since gcc12 has occasional failures, see discussion here: https://github.com/iains/darwin-toolchains-start-here/discussions/41
comment:17 follow-up: 19 Changed 2 years ago by barracuda156
Replying to kencu:
Oh, I remember why Chris set it up to always build.
Iain changes things around sometimes between version bumps, so gcc11.1 sometimes installed libraries that gcc11.0 did not, for example, and it was hard to keep up with that.
Also, different OS versions and different archs installed different libraries.
So rather than try to keep track of all that nonsense, the portfile just looks at the dylibs and installs what is missing. Sometimes that is nothing, sometimes that is not nothing.
Well, you perhaps remember what Iain said in that regard: it should not be necessary at all.
However, I do not want to push this change – that may prove too hard, and can cause PR to be closed unnecessarily. Also, there is no reason to do it in one go. Once the current PR is merged, we can discuss what to do with unneeded libgccs. For gcc12 we only need gcc10-bootstrap and libgcc12, not other versions.
P.S. By the way, there is one issue which I forgot about: blacklisting of gccs in the portfile of gcc12 has a weird effect of causing a dependency cycle. I have no idea why. But I had to remove at least the blacklist of gcc-4.2, even though it is not used for anything at all, in order for gcc10-bootstrap to do its work.
comment:18 follow-up: 22 Changed 2 years ago by kencu (Ken)
OK, please post up the actual proven build successes you have (in the PR, or some other place than this scalapack ticket).
No committer will be able to take anyone's word for this as there are too many opportunities for errors and too many systems to cover off.
comment:19 Changed 2 years ago by kencu (Ken)
Replying to barracuda156:
Replying to kencu:
Oh, I remember why Chris set it up to always build.
Iain changes things around sometimes between version bumps, so gcc11.1 sometimes installed libraries that gcc11.0 did not, for example, and it was hard to keep up with that.
Also, different OS versions and different archs installed different libraries.
So rather than try to keep track of all that nonsense, the portfile just looks at the dylibs and installs what is missing. Sometimes that is nothing, sometimes that is not nothing.
Well, you perhaps remember what Iain said in that regard: it should not be necessary at all.
It is needed though, as it caused build failures otherwise when libraries could not be found.
However, I do not want to push this change – that may prove too hard, and can cause PR to be closed unnecessarily. Also, there is no reason to do it in one go. Once the current PR is merged, we can discuss what to do with unneeded libgccs.
check.
For gcc12 we only need gcc10-bootstrap and libgcc12, not other versions.
There will be some work to do, as currently libgcc7 depends on libgcc8 which depends on ... through to libgcc12. So that has to be sorted out.
P.S. By the way, there is one issue which I forgot about: blacklisting of gccs in the portfile of gcc12 has a weird effect of causing a dependency cycle. I have no idea why. But I had to remove at least the blacklist of gcc-4.2, even though it is not used for anything at all, in order for gcc10-bootstrap to do its work.
No idea why this would happen.
comment:20 follow-up: 21 Changed 2 years ago by kencu (Ken)
here is my port of mpich-default 4.0.3. Passes all tests.
3.4.2 built on all systems. I have only tried this updated one so far on Tiger PPC, though.
To run the test suite, you have to "sudo port select python3 python310" or similar, as I didn't as yet rewrite a full python into the test files.
Changed 2 years ago by kencu (Ken)
Attachment: | science-mpich-default-4.0.3-Porttree.zip added |
---|
greatly simplified mpich-default that I use on older systems
comment:21 Changed 2 years ago by barracuda156
Replying to kencu:
here is my port of mpich-default 4.0.3. Passes all tests.
Thank you! Any idea why was it even disabled?
I looked through the existing mpich port, and it takes little to enable mpich-default from there. Also, it appears that mpich-default is technically no different from mpich-gcc12 (or whatever is the default system compiler). The only thing required is to blacklist old GCCs, like you have done, or otherwise set the C++11 standard (TBH, I did not check if that is required, but given that gcc-4.2 does not build it, failing immediately at configure, I guess yes), so that the correct GCC is used.
P.S. Also +native needs a fix for PPC, like I did for folly or something alike (-march=native is not supported on PPC).
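For illustration only, a PPC guard of the kind mentioned above could be sketched as follows. This is an assumption-laden sketch, not the actual mpich or folly change: the variant body and the flags it appends are hypothetical, and a universal build would need a per-arch check rather than just `${build_arch}`.

```tcl
# Hypothetical +native variant body; the real variant in the mpich port may
# set different options.
variant native description {Optimize for the build machine (hypothetical sketch)} {
    # -march=native is not supported on PPC, so add it only on other archs.
    if {${build_arch} ni {ppc ppc64}} {
        configure.cflags-append     -march=native
        configure.cxxflags-append   -march=native
    }
}
```

On PPC the variant then becomes a no-op instead of passing a flag the compiler rejects.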
comment:22 follow-up: 23 Changed 2 years ago by barracuda156
Replying to kencu:
OK, please post up the actual proven build successes you have (in the PR, or some other place than this scalapack ticket).
No committer will be able to take anyone's word for this as there are too many opportunities for errors and too many systems to cover off.
In fact you should be able to see that from statistics. For example, you can see libgcc9 installed for PPC: https://ports.macports.org/port/libgcc9/stats/
TBH, I do not get why you doubt that so much, given that we know for a fact that gcc11 and gcc12 build and work on PPC across three systems, starting from Leopard. Moreover, all GCCs are supported by upstream, so they must build, unless MacPorts breaks something on its side. There is nothing surprising in that.
comment:23 follow-up: 24 Changed 2 years ago by kencu (Ken)
Replying to barracuda156:
TBH, I do not get why you doubt that so much
I doubt it so much because I've been watching well-meaning PRs fail for many years now, and I have in particular seen how fragile building gcc can be, especially in the "non-sterile" MacPorts environment.
How do you think I got to know Iain so well in the first place ;>
Anyway, if it builds so easily, folks should have no trouble showing that. I can't get anything useful off of the stats website for this question as I would have no idea how they built it, if they did build it.
comment:24 follow-ups: 26 27 Changed 2 years ago by barracuda156
Replying to kencu:
Anyway, if it builds so easily, folks should have no trouble showing that.
Well, I have no trouble showing that, but what would you consider the evidence? :)
P.S. But again, we do not need any of the libgccs aside from libgcc12 in order to move to gcc12.
comment:25 Changed 2 years ago by barracuda156
On a side note, I have built mpich-default on 10.6 by simply allowing it in the portfile. I wonder why it was banned in the first place. It does not seem to need any kind of hacks whatsoever.
comment:26 Changed 2 years ago by kencu (Ken)
Replying to barracuda156:
P.S. But again, we do not need any of the libgccs aside from libgcc12 in order to move to gcc12.
P.S. But again, gcc7 and all older gccs will then be broken :>
comment:27 Changed 2 years ago by kencu (Ken)
Replying to barracuda156:
Well, I have no trouble showing that, but what would you consider the evidence? :)
Somebody putting their name down beside verification that the given gcc did indeed build on the given system with the given PR.
So that later, if/when the build fails, we know who to ask.
comment:28 Changed 2 years ago by kencu (Ken)
Anyway, I think it's time to stop talking about hypotheticals in this scalapack ticket, and put the effort into the actual PR. So no more responses about libgcc/gcc from me here.
Re: mpich-default -- that was a project I undertook 18 months ago that I eventually gave up on and wrote my own. If you can sell a PR to the maintainers, feel free to float one.