Opened 8 years ago
Last modified 3 years ago
#51642 assigned defect
gcc49 @4.9.3: does not honor -march=native
Reported by: | noloader (Jeffrey Walton) | Owned by: | macports-tickets@… |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | ports | Version: | 2.3.4 |
Keywords: | Cc: | pgf, colinhowell (Colin Douglas Howell) | |
Port: | gcc49 |
Description
I'm attempting a compile under MacPorts' GCC compiler:
$ make CXX=/opt/local//bin/x86_64-apple-darwin12-gcc-4.9.3
/opt/local//bin/x86_64-apple-darwin12-gcc-4.9.3 -DNDEBUG -g2 -O2 -fPIC -march=native -pipe -c cryptlib.cpp
:1057:no such instruction: `vzeroupper'
:1507:no such instruction: `vzeroupper'
:1517:no such instruction: `vzeroupper'
:1662:no such instruction: `vzeroupper'
:1838:no such instruction: `vzeroupper'
...
Piping it into wc reveals:
$ make CXX=/opt/local//bin/x86_64-apple-darwin12-gcc-4.9.3 2>&1 | wc -l
190
The machine is a 2012 MacBook Pro with a Core i7-2760QM. Broadly speaking, the cpu includes SSE4 and AES-NI, but lacks RDRAND and RDSEED:
$ sysctl -a | grep machdep.cpu.features
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC POPCNT AES PCID XSAVE OSXSAVE TSCTMR AVX1.0
vzeroupper is part of AVX, and AVX is available in the CPU feature set.
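The feature check described above can be scripted. The following is a minimal sketch, not part of the original report: `has_cpu_feature` is a hypothetical helper name, and the feature list is an excerpt of the `machdep.cpu.features` value shown in this ticket, passed in as a plain string so the check does not depend on `sysctl`.

```shell
# has_cpu_feature FEATURE "SPACE SEPARATED FEATURE LIST"
# Hypothetical helper: succeeds only if FEATURE appears as a whole word in the list.
has_cpu_feature() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Excerpt of the feature string from this ticket. On macOS the full list
# would come from: sysctl -n machdep.cpu.features
features="SSE4.1 SSE4.2 x2APIC POPCNT AES PCID XSAVE OSXSAVE TSCTMR AVX1.0"

if has_cpu_feature "AVX1.0" "$features"; then
    echo "AVX: yes"
else
    echo "AVX: no"
fi
```

Matching on whole words keeps a query for AVX from matching an unrelated longer name, at the cost of requiring the exact spelling that `sysctl` reports (here `AVX1.0`).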
We have users who use a few MacPorts compilers, so I'm guessing additional testing will reveal the same issue under GCC 5.1 and 5.3. I'm only going to file this one report, however.
Here's some additional information that may be useful.
$ port version
Version: 2.3.4
And:
$ sysctl -a | grep machdep.cpu
machdep.cpu.max_basic: 13
machdep.cpu.max_ext: 2147483656
machdep.cpu.vendor: GenuineIntel
machdep.cpu.brand_string: Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz
machdep.cpu.family: 6
machdep.cpu.model: 42
machdep.cpu.extmodel: 2
machdep.cpu.extfamily: 0
machdep.cpu.stepping: 7
machdep.cpu.feature_bits: 3219913727 532341759
machdep.cpu.extfeature_bits: 672139520 1
machdep.cpu.signature: 132775
machdep.cpu.brand: 0
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC POPCNT AES PCID XSAVE OSXSAVE TSCTMR AVX1.0
machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF RDTSCP TSCI
machdep.cpu.logical_per_package: 16
machdep.cpu.cores_per_package: 8
machdep.cpu.microcode_version: 26
machdep.cpu.processor_flag: 4
machdep.cpu.mwait.linesize_min: 64
machdep.cpu.mwait.linesize_max: 64
machdep.cpu.mwait.extensions: 3
machdep.cpu.mwait.sub_Cstates: 135456
machdep.cpu.thermal.sensor: 1
machdep.cpu.thermal.dynamic_acceleration: 1
machdep.cpu.thermal.invariant_APIC_timer: 1
machdep.cpu.thermal.thresholds: 2
machdep.cpu.thermal.ACNT_MCNT: 1
machdep.cpu.thermal.core_power_limits: 1
machdep.cpu.thermal.fine_grain_clock_mod: 1
machdep.cpu.thermal.package_thermal_intr: 1
machdep.cpu.thermal.hardware_feedback: 0
machdep.cpu.thermal.energy_policy: 0
machdep.cpu.xsave.extended_state: 7 832 832 0
machdep.cpu.arch_perf.version: 3
machdep.cpu.arch_perf.number: 4
machdep.cpu.arch_perf.width: 48
machdep.cpu.arch_perf.events_number: 7
machdep.cpu.arch_perf.events: 0
machdep.cpu.arch_perf.fixed_number: 3
machdep.cpu.arch_perf.fixed_width: 48
machdep.cpu.cache.linesize: 64
machdep.cpu.cache.L2_associativity: 8
machdep.cpu.cache.size: 256
machdep.cpu.tlb.inst.small: 64
machdep.cpu.tlb.inst.large: 8
machdep.cpu.tlb.data.small: 64
machdep.cpu.tlb.data.large: 32
machdep.cpu.tlb.shared: 512
machdep.cpu.address_bits.physical: 36
machdep.cpu.address_bits.virtual: 48
machdep.cpu.core_count: 4
machdep.cpu.thread_count: 8
Change History (18)
comment:1 Changed 8 years ago by noloader (Jeffrey Walton)
comment:2 follow-ups: 3 5 Changed 8 years ago by ryandesign (Ryan Carsten Schmidt)
Setting CXX=/opt/local//bin/x86_64-apple-darwin12-gcc-4.9.3 is unusual. Usually, you would set CXX=/opt/local/bin/g++-mp-4.9. Have you tried that already? Does it change anything?
Any particular reason you're using gcc 4.9.3? The latest stable version is 6.1.0, available in the gcc6 port.
comment:3 Changed 8 years ago by ryandesign (Ryan Carsten Schmidt)
Replying to ryandesign@…:
Setting CXX=/opt/local//bin/x86_64-apple-darwin12-gcc-4.9.3 is unusual. Usually, you would set CXX=/opt/local/bin/g++-mp-4.9. Have you tried that already? Does it change anything?
I guess they're both hardlinks to the same file so it shouldn't make any difference. Still, /opt/local/bin/g++-mp-4.9 is the recommended form, since it does not vary by processor, OS version, or minor gcc version number.
comment:4 Changed 8 years ago by ryandesign (Ryan Carsten Schmidt)
Owner: | changed from macports-tickets@… to mww@… |
---|---|
Port: | gcc49 added |
Summary: | GCC does not honor -march=native → gcc49 @4.9.3: does not honor -march=native |
comment:5 Changed 8 years ago by noloader (Jeffrey Walton)
Replying to ryandesign@…:
Setting CXX=/opt/local/bin/x86_64-apple-darwin12-gcc-4.9.3 is unusual. Usually, you would set CXX=/opt/local/bin/g++-mp-4.9. Have you tried that already? Does it change anything?
Yeah, Uri Blumenthal told me it was wrong. I couldn't find a c++ compiler, so I installed GCC and hoped it would do the right thing based on file extensions:
$ port search 'g++'
No match for g++ found
$ port search '*g++*'
No match for *g++* found
$ port search 'g\+\+'
No match for g\+\+ found
$ port search '*g\+\+*'
No match for *g\+\+* found
Any particular reason you're using gcc 4.9.3? The latest stable version is 6.1.0, available in the gcc6 port.
Yes. I am setting up tests for MacPorts GCC compilers using a 4.x, 5.x and 6.x. They will all be tested eventually.
comment:6 Changed 8 years ago by noloader (Jeffrey Walton)
This is probably related to -Wa,-q. When I originally tried to use it, I used it with a typo: -Wa,q. The typo resulted in:
$ /opt/local/bin/g++-mp-4.9 -Wa,q -c test.cxx
as:file(q) Can't open source file for input! No such file or directory.
At that point, I thought the compiler did not support the option.
Now that the typo has been corrected, it appears the option is supported:
$ /opt/local/bin/g++-mp-4.9 -Wa,-q -c test.cxx
/opt/local/bin/as: assembler (/opt/local/bin/clang) not installed
(The next mystery is why GCC cannot find any of the seven Clangs installed. I'll sort that out next on Stack Overflow.)
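The error above names /opt/local/bin/clang as the path the assembler wrapper looks for. As a minimal sketch under that assumption, a build script could pass -Wa,-q only when that path is executable; `assembler_flag` is a hypothetical helper name, not something MacPorts provides.

```shell
# assembler_flag [CLANG_PATH]
# Hypothetical helper: prints "-Wa,-q" when CLANG_PATH is an executable file,
# and prints nothing otherwise. The default path is taken from the error
# message quoted in this comment.
assembler_flag() {
    clang_path="${1:-/opt/local/bin/clang}"
    if [ -x "$clang_path" ]; then
        printf '%s\n' "-Wa,-q"
    fi
}

# Possible usage (the flag is left unquoted on purpose so that an empty
# result adds no argument):
#   /opt/local/bin/g++-mp-4.9 $(assembler_flag) -c test.cxx
```

This only avoids the "not installed" failure; it does not make an installed Clang visible at that path.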
comment:7 Changed 8 years ago by noloader (Jeffrey Walton)
It's not clear to me if -Wa,-q is the solution to the problem. We are trying to get our changes for MacPorts GCC tested now: Need MacPorts testers who use GCC.
I'll report back if we hear something from one of the users.
comment:8 follow-ups: 9 10 Changed 8 years ago by Veence (Vincent)
I am sorry to butt in, but this sounds to me like the old 'as' story. GCC 4 uses the Apple-supplied /usr/bin/as (or /opt/local/bin/as for that matter):
as -v
Apple Inc version cctools-877.8, GNU assembler version 1.38
Which is so ancient it does not know about SSE 4 or AVX.
Clang has its own modern assembler, and can (maybe?) be used as a backend for GCC.
comment:9 Changed 8 years ago by Veence (Vincent)
Replying to vince@…:
I am sorry to butt in, but this sounds to me like the old 'as' story. GCC 4 uses the Apple-supplied /usr/bin/as (or /opt/local/bin/as for that matter):
as -v
Apple Inc version cctools-877.8, GNU assembler version 1.38
Which is so ancient it does not know about SSE 4 or AVX.
Clang has its own modern assembler, and can (maybe?) be used as a backend for GCC.
comment:10 Changed 8 years ago by pgf
Replying to Veence:
I am sorry to butt in, but this sounds to me like the old 'as' story. GCC 4 uses the Apple-supplied /usr/bin/as (or /opt/local/bin/as for that matter):
as -v
Apple Inc version cctools-877.8, GNU assembler version 1.38
Which is so ancient it does not know about SSE 4 or AVX.
The GCCs provided by MacPorts depend on cctools, which in turn provides that really old as 1.38 (/opt/local/bin/as), which does not support newer instructions like AVX. That's why the compilation fails with the error "no such instruction" if one tries to enable those instructions with -march=, -mavx, etc. Indeed, there are ports which fail to build for this reason depending on the host CPU, for example gdal with the variants +gccX and +perf on my machine (Core i7 Haswell), because the latter appends -march=native to configure.optflags.
Clang has its own modern assembler, and can (maybe?) be used as a backend for GCC
Yes, that's exactly what the -Wa,-q compiler option does. It instructs the assembler (/opt/local/bin/as) to switch to the clang integrated assembler (/opt/local/bin/clang -integrated-as):
[foo@bar: ~]$ /opt/local/bin/as -v
Apple Inc version cctools-886, GNU assembler version 1.38
[foo@bar: ~]$ /opt/local/bin/as -v -q
clang version 3.9.0 (tags/RELEASE_390/final)
Target: x86_64-apple-darwin16.1.0
Thread model: posix
InstalledDir: /opt/local/libexec/llvm-3.9/bin
"/opt/local/libexec/llvm-3.9/bin/clang" -cc1as -triple x86_64-apple-macosx10.12.0 -filetype obj -main-file-name - -target-cpu core2 -fdebug-compilation-dir /Users/foo -dwarf-debug-producer clang version 3.9.0 (tags/RELEASE_390/final) -dwarf-version=2 -mrelocation-model pic -o a.out -
The clang compiler provided by MacPorts must be installed and selected in order to use the integrated assembler:
[foo@bar: ~]$ sudo port select --show clang
The currently selected version for 'clang' is 'mp-clang-3.9'.
[foo@bar: ~]$ sudo port -f deactivate clang-3.9
---> Deactivating clang-3.9 @3.9.0_1+analyzer
---> Cleaning clang-3.9
[foo@bar: ~]$ /opt/local/bin/as -v -q
/opt/local/bin/as: assembler (/opt/local/bin/clang) not installed
The good news is that one doesn't need to mess with the compiler options. The same behaviour can be triggered by setting the AS_INTEGRATED_ASSEMBLER environment variable:
[foo@bar: ~]$ AS_INTEGRATED_ASSEMBLER=1 /opt/local/bin/as -v
clang version 3.9.0 (tags/RELEASE_390/final)
Target: x86_64-apple-darwin16.1.0
Thread model: posix
InstalledDir: /opt/local/libexec/llvm-3.9/bin
"/opt/local/libexec/llvm-3.9/bin/clang" -cc1as -triple x86_64-apple-macosx10.12.0 -filetype obj -main-file-name - -target-cpu core2 -fdebug-compilation-dir /Users/foo -dwarf-debug-producer clang version 3.9.0 (tags/RELEASE_390/final) -dwarf-version=2 -mrelocation-model pic -o a.out -
This is really useful because one doesn't need to change anything (Makefiles, scripts, etc.) to switch to the clang integrated assembler; just set that environment variable. Indeed, a whole port can be compiled this way by adding the variable to the relevant keywords: configure.env, build.env, test.env, etc.
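For builds run by hand rather than through a Portfile, the environment-variable approach can be wrapped in a small shell function. This is only a sketch: `with_clang_as` is a hypothetical name, and the `make` invocation in the comment reuses the compiler path recommended earlier in this ticket.

```shell
# with_clang_as COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND with AS_INTEGRATED_ASSEMBLER=1 set in its
# environment only, so the calling shell's environment is left unchanged.
with_clang_as() {
    env AS_INTEGRATED_ASSEMBLER=1 "$@"
}

# Possible usage:
#   with_clang_as make CXX=/opt/local/bin/g++-mp-4.9
```

Scoping the variable to one command keeps other builds in the same shell on the default assembler, which may matter when comparing the two behaviours.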
comment:11 Changed 8 years ago by pgf
Cc: | pgf added |
---|
comment:12 Changed 8 years ago by kurthindenburg (Kurt Hindenburg)
Owner: | changed from mww@… to macports-tickets@… |
---|---|
Status: | new → assigned |
comment:13 Changed 7 years ago by colinhowell (Colin Douglas Howell)
I'm glad for pgf's workaround, because I was being bitten by this bug with the gcc7 port, using current MacPorts 2.4.2.
The gcc compiler proper only generates assembly language output, and it must invoke an assembler to assemble that output into binary object code. All of MacPorts' gcc ports have been configured to use /opt/local/bin/as, the old cctools assembler. So by the very nature of this problem, it will affect every GCC port that can generate instructions not recognized by that assembler, which means everything from gcc44 onward. With any of those compilers, this can happen either through using -march=native to generate optimal code for one's own machine, if that machine is reasonably new (one with a "Sandy Bridge" processor or newer, so around 2011 or newer), or through explicitly asking to generate optimal code for a newer architecture (e.g. -march=sandybridge), or through asking to use an instruction-set feature (e.g. -mavx) that would generate unrecognized instructions.
Therefore this ticket should be updated to include all the gcc ports from gcc44 on (except for cross-compilers for non-x86 architectures, of course) and to mention it still afflicts the latest MacPorts version. This is a serious problem and really should be fixed.
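Because the compiler and the assembler are separate stages, the compiler's output can be inspected before the assembler ever sees it. The sketch below assumes the assembly was produced with GCC's -S option; `count_avx_lines` is a hypothetical helper, and its pattern is deliberately narrow (the vzeroupper mnemonic and %ymm register references), not a complete list of instructions the old assembler rejects.

```shell
# count_avx_lines FILE
# Hypothetical helper: prints the number of lines in an assembly file that
# contain the vzeroupper mnemonic or reference a %ymm register.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function usable under "set -e".
count_avx_lines() {
    grep -c -E '(^|[[:space:]])vzeroupper([[:space:]]|$)|%ymm[0-9]+' "$1" || true
}

# Possible usage with the source file from this ticket:
#   /opt/local/bin/g++-mp-4.9 -O2 -march=native -S cryptlib.cpp -o cryptlib.s
#   count_avx_lines cryptlib.s
```

A non-zero count suggests the file would need the clang integrated assembler; a zero count does not prove the old assembler will accept everything in it.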
comment:14 Changed 7 years ago by colinhowell (Colin Douglas Howell)
comment:15 Changed 7 years ago by colinhowell (Colin Douglas Howell)
Cc: | colinhowell added |
---|
comment:16 Changed 7 years ago by mouse07410 (Mouse)
For all the non-cross-compiled stuff, it would probably be advisable to set the environment variable AS_INTEGRATED_ASSEMBLER=1.
comment:17 Changed 7 years ago by kencu (Ken)
I'm not certain which option would be better. We can leave things as they are, and use either the -Wa,-q flag or the AS_INTEGRATED_ASSEMBLER=1 environment variable to get the alternate behaviour (clang as assembler instead of as).
Or we could presumably change the gcc build (on Intel) to set clang as the assembler by default by specifying some version of clang here instead of as and then setting a build dep on that:
AS_FOR_TARGET=${prefix}/path/to/some/clang
HOWEVER, that could cause some irritating bootstrap issues and other issues...
My initial inclination is to leave things as they are, as there is an easy workaround to get the "clang as as" option, but none to go the other way...
comment:18 Changed 3 years ago by barracuda156
Several crypto-related ports fail for x86_64 now on 10.6.8 when built with gcc11 (nettle, gnutls, nss). So there is basically no solution for gcc?
I probably should have supplied this with the original report. Sorry about that.