Opened 15 years ago

Closed 15 years ago

Last modified 14 years ago

#24033 closed defect (worksforme)

atlas-3.8.3 fails to compile on Core i7

Reported by: martisa@…
Owned by: jameskyle@…
Priority: Normal
Milestone:
Component: ports
Version: 1.8.2
Keywords:
Cc: jowens@…
Port: atlas

Description

I'm quite new here, but I have long Linux experience (Gentoo). Atlas fails to compile on every attempt on my new i7.

BEGINNING GEMV PRIMITIVE TESTING:

   TEST TA=N, M=997, N=177, lda=1004, beta=(0.000000,0.000000) STARTED
   TEST TA=N, M=997, N=177, lda=1004, beta=(0.000000,0.000000) PASSED
   TEST TA=-, M=997, N=177, lda=1004, beta=(0.000000,0.000000) STARTED
   TEST TA=-, M=997, N=177, lda=1004, beta=(0.000000,0.000000) PASSED
   TEST TA=N, M=997, N=177, lda=1004, beta=(1.000000,0.000000) STARTED
   TEST TA=N, M=997, N=177, lda=1004, beta=(1.000000,0.000000) PASSED
   TEST TA=-, M=997, N=177, lda=1004, beta=(1.000000,0.000000) STARTED
   TEST TA=-, M=997, N=177, lda=1004, beta=(1.000000,0.000000) PASSED
   TEST TA=N, M=997, N=177, lda=1004, beta=(0.800000,0.000000) STARTED
   TEST TA=N, M=997, N=177, lda=1004, beta=(0.800000,0.000000) PASSED
   TEST TA=-, M=997, N=177, lda=1004, beta=(0.800000,0.000000) STARTED
   TEST TA=-, M=997, N=177, lda=1004, beta=(0.800000,0.000000) PASSED
   TEST TA=N, M=997, N=177, lda=1004, beta=(0.800000,0.300000) STARTED
   TEST TA=N, M=997, N=177, lda=1004, beta=(0.800000,0.300000) PASSED
   TEST TA=-, M=997, N=177, lda=1004, beta=(0.800000,0.300000) STARTED
   TEST TA=-, M=997, N=177, lda=1004, beta=(0.800000,0.300000) PASSED


GEMV PRIMITIVE PASSED ALL TESTS

Assertion failed: (fscanf(fp, " %lf %lf %lf", mfs, mfs+1, mfs+2) == 3), function mvcase, file /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_math_atlas/work/atlas-3.8.3/build/..//tune/blas/gemv/mvsearch.c, line 211.
ATL_cgemvN_mm.c : 2397.39
make[3]: *** [res/cMVRES] Abort trap
make[2]: *** [/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_math_atlas/work/atlas-3.8.3/build/tune/blas/gemv/res/cMVRES] Error 2
ERROR 734 DURING MVTUNE!!.  CHECK INSTALL_LOG/cMVTUNE.LOG FOR DETAILS.
cd /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_math_atlas/work/atlas-3.8.3/build ; make error_report
make -f Make.top error_report
uname -a 2>&1 >> bin/INSTALL_LOG/ERROR.LOG
/opt/local/bin/gcc-mp-4.3 -v 2>&1  >> bin/INSTALL_LOG/ERROR.LOG
Using built-in specs.
Target: x86_64-apple-darwin10
Configured with: ../gcc-4.3.4/configure --prefix=/opt/local --build=x86_64-apple-darwin10 --enable-languages=c,c++,objc,obj-c++,java,fortran --libdir=/opt/local/lib/gcc43 --includedir=/opt/local/include/gcc43 --infodir=/opt/local/share/info --mandir=/opt/local/share/man --with-local-prefix=/opt/local --with-system-zlib --disable-nls --program-suffix=-mp-4.3 --with-gxx-include-dir=/opt/local/include/gcc43/c++/ --with-gmp=/opt/local --with-mpfr=/opt/local
Thread model: posix
gcc version 4.3.4 (GCC) 
/opt/local/bin/gcc-mp-4.3 -V 2>&1  >> bin/INSTALL_LOG/ERROR.LOG
gcc-mp-4.3: '-V' option must have argument
make[4]: [error_report] Error 1 (ignored)
/opt/local/bin/gcc-mp-4.3 --version 2>&1  >> bin/INSTALL_LOG/ERROR.LOG
tar cf error_UNKNOWNx8664SSE3.tar Make.inc bin/INSTALL_LOG/*
gzip --best error_UNKNOWNx8664SSE3.tar
mv error_UNKNOWNx8664SSE3.tar.gz error_UNKNOWNx8664SSE3.tgz
Error report error_<ARCH>.tgz has been created in your top-level ATLAS
directory.  Be sure to include this file in any help request.
cat: ../../CONFIG/error.txt: No such file or directory
cat: ../../CONFIG/error.txt: No such file or directory


IN STAGE 1 INSTALL:  SYSTEM PROBE/AUX COMPILE


   Level 1 cache size calculated as 32KB
   dFPU: Separate multiply and add instructions with 5 cycle pipeline.
         Apparent number of registers : 17
         Register-register performance=6737.95MFLOPS
   sFPU: Separate multiply and add instructions with 4 cycle pipeline.
         Apparent number of registers : 17
         Register-register performance=6757.74MFLOPS


IN STAGE 2 INSTALL:  TYPE-DEPENDENT TUNING


STAGE 2-1: TUNING PREC='d' (precision 1 of 4)


   STAGE 2-1-1 : BUILDING BLOCK MATMUL TUNE
      The best matmul kernel was ATL_dmm4x2x128_sse2.c, NB=56, written by Whaley & Voronenko
      Performance: 11750.77MFLOPS (419.67 percent of of detected clock rate)
        (Gen case got 5754.66MFLOPS)
      NCgemmNN : muladd=1, lat=2, pf=512, nb=44, mu=13, nu=1 ku=44,
                 ForceFetch=1, ifetch=13 nfetch=1
                 Performance = 5187.76 (44.15 of copy matmul, 185.28 of clock)
      NCgemmNT : muladd=1, lat=2, pf=512, nb=44, mu=13, nu=1 ku=4,
                 ForceFetch=1, ifetch=13 nfetch=1
                 Performance = 4222.83 (35.94 of copy matmul, 150.82 of clock)
      NCgemmTN : muladd=1, lat=4, pf=512, nb=44, mu=13, nu=1 ku=44,
                 ForceFetch=1, ifetch=13 nfetch=1
                 Performance = 5063.36 (43.09 of copy matmul, 180.83 of clock)
      NCgemmTT : muladd=1, lat=5, pf=512, nb=44, mu=13, nu=1 ku=44,
                 ForceFetch=1, ifetch=13 nfetch=1
                 Performance = 3961.07 (33.71 of copy matmul, 141.47 of clock)
make -f Makefile MMinstall pre=d 2>&1 | ./xatlas_tee INSTALL_LOG/Stage1.log


   STAGE 2-1-2: CacheEdge DETECTION


   STAGE 2-1-3: LARGE/SMALL CASE CROSSOVER DETECTION


   STAGE 2-1-4: LEVEL 3 BLAS TUNE
make -f Makefile INSTALL_LOG/atlas_dtrsmXover.h pre=d 2>&1 | ./xatlas_tee INSTALL_LOG/dL3TUNE.LOG
make -f Makefile dcblaslib 2>&1 | ./xatlas_tee INSTALL_LOG/dL3TUNE.LOG
      done.


   STAGE 2-1-5: GEMV TUNE
      gemvN : chose routine 9:ATL_gemvN_32x4_1.c written by R. Clint Whaley
              Yunroll=32, Xunroll=4, using 100 percent of L1
              Performance = 3100.26 (26.38 of copy matmul, 110.72 of clock)
      gemvT : chose routine 105:ATL_gemvT_2x16_1.c written by R. Clint Whaley
              Yunroll=2, Xunroll=16, using 100 percent of L1
              Performance = 3432.03 (29.21 of copy matmul, 122.57 of clock)


   STAGE 2-1-6: GER TUNE
      ger : chose routine 2:ATL_ger1_4x4_1.c written by R. Clint Whaley
            mu=4, nu=4, using  0.86 percent of L1 Cache
              Performance = 3230.32 (27.49 of copy matmul, 115.37 of clock)


STAGE 2-2: TUNING PREC='s' (precision 2 of 4)


   STAGE 2-2-1 : BUILDING BLOCK MATMUL TUNE
      The best matmul kernel was ATL_smm2x2x256_sse.c, NB=72, written by R. Clint Whaley
      Performance: 21457.33MFLOPS (766.33 percent of of detected clock rate)
        (Gen case got 6025.71MFLOPS)
      NCgemmNN : muladd=1, lat=4, pf=0, nb=56, mu=14, nu=1 ku=56,
                 ForceFetch=0, ifetch=10 nfetch=4
                 Performance = 5454.38 (25.42 of copy matmul, 194.80 of clock)
      NCgemmNT : muladd=1, lat=8, pf=0, nb=56, mu=14, nu=1 ku=4,
                 ForceFetch=0, ifetch=10 nfetch=4
                 Performance = 4509.88 (21.02 of copy matmul, 161.07 of clock)
      NCgemmTN : muladd=1, lat=2, pf=0, nb=56, mu=14, nu=1 ku=56,
                 ForceFetch=0, ifetch=10 nfetch=4
                 Performance = 4692.17 (21.87 of copy matmul, 167.58 of clock)
      NCgemmTT : muladd=1, lat=8, pf=0, nb=56, mu=14, nu=1 ku=56,
                 ForceFetch=0, ifetch=10 nfetch=4
                 Performance = 4269.04 (19.90 of copy matmul, 152.47 of clock)
make -f Makefile MMinstall pre=s 2>&1 | ./xatlas_tee INSTALL_LOG/dL3TUNE.LOG


   STAGE 2-2-2: CacheEdge DETECTION


   STAGE 2-2-3: LARGE/SMALL CASE CROSSOVER DETECTION


   STAGE 2-2-4: LEVEL 3 BLAS TUNE
make -f Makefile INSTALL_LOG/atlas_strsmXover.h pre=s 2>&1 | ./xatlas_tee INSTALL_LOG/sL3TUNE.LOG
make -f Makefile scblaslib 2>&1 | ./xatlas_tee INSTALL_LOG/sL3TUNE.LOG
      done.


   STAGE 2-2-5: GEMV TUNE
      gemvN : chose routine 3:ATL_gemvN_1x1_1a.c written by R. Clint Whaley
              Yunroll=32, Xunroll=1, using 75 percent of L1
              Performance = 6812.25 (31.75 of copy matmul, 243.29 of clock)
      gemvT : chose routine 105:ATL_gemvT_2x16_1.c written by R. Clint Whaley
              Yunroll=2, Xunroll=16, using 75 percent of L1
              Performance = 3679.49 (17.15 of copy matmul, 131.41 of clock)


   STAGE 2-2-6: GER TUNE
      ger : chose routine 1:ATL_ger1_axpy.c written by R. Clint Whaley
            mu=16, nu=1, using  1.00 percent of L1 Cache
              Performance = 7419.17 (34.58 of copy matmul, 264.97 of clock)


STAGE 2-3: TUNING PREC='z' (precision 3 of 4)


   STAGE 2-3-1 : BUILDING BLOCK MATMUL TUNE
      The best matmul kernel was ATL_dmm4x2x128_sse2.c, NB=52, written by Whaley & Voronenko
      Performance: 11465.97MFLOPS (409.50 percent of of detected clock rate)
        (Gen case got 5714.61MFLOPS)
      NCgemmNN : muladd=1, lat=6, pf=512, nb=24, mu=13, nu=1 ku=24,
                 ForceFetch=1, ifetch=13 nfetch=1
                 Performance = 5059.39 (44.13 of copy matmul, 180.69 of clock)
      NCgemmNT : muladd=1, lat=4, pf=512, nb=24, mu=13, nu=1 ku=4,
                 ForceFetch=1, ifetch=13 nfetch=1
                 Performance = 4024.63 (35.10 of copy matmul, 143.74 of clock)
      NCgemmTN : muladd=1, lat=8, pf=512, nb=24, mu=13, nu=1 ku=24,
                 ForceFetch=1, ifetch=13 nfetch=1
                 Performance = 4621.62 (40.31 of copy matmul, 165.06 of clock)
      NCgemmTT : muladd=1, lat=3, pf=512, nb=24, mu=13, nu=1 ku=24,
                 ForceFetch=1, ifetch=13 nfetch=1
                 Performance = 3594.39 (31.35 of copy matmul, 128.37 of clock)
make -f Makefile MMinstall pre=z 2>&1 | ./xatlas_tee INSTALL_LOG/sL3TUNE.LOG


   STAGE 2-3-2: CacheEdge DETECTION


   STAGE 2-3-3: LARGE/SMALL CASE CROSSOVER DETECTION


   STAGE 2-3-4: LEVEL 3 BLAS TUNE
make -f Makefile Il3lib pre=z 2>&1 | ./xatlas_tee INSTALL_LOG/zL3TUNE.LOG
make -f Makefile zcblaslib 2>&1 | ./xatlas_tee INSTALL_LOG/zL3TUNE.LOG
      done.


   STAGE 2-3-5: GEMV TUNE
      gemvN : chose routine 3:ATL_cgemvN_1x1_1a.c written by R. Clint Whaley
              Yunroll=32, Xunroll=1, using 96 percent of L1
              Performance = 4072.57 (35.52 of copy matmul, 145.45 of clock)
      gemvT : chose routine 103:ATL_cgemvT_2x4_1.c written by R. Clint Whaley
              Yunroll=4, Xunroll=8, using 96 percent of L1
              Performance = 3254.37 (28.38 of copy matmul, 116.23 of clock)


   STAGE 2-3-6: GER TUNE
      ger : chose routine 1:ATL_cger1_axpy.c written by R. Clint Whaley
            mu=16, nu=1, using  0.75 percent of L1 Cache
              Performance = 2994.13 (26.11 of copy matmul, 106.93 of clock)


STAGE 2-4: TUNING PREC='c' (precision 4 of 4)


   STAGE 2-4-1 : BUILDING BLOCK MATMUL TUNE
      The best matmul kernel was ATL_smm2x2x256_sse.c, NB=80, written by R. Clint Whaley
      Performance: 20194.66MFLOPS (721.24 percent of of detected clock rate)
        (Gen case got 5823.04MFLOPS)
      NCgemmNN : muladd=1, lat=8, pf=0, nb=40, mu=14, nu=1 ku=40,
                 ForceFetch=0, ifetch=14 nfetch=1
                 Performance = 5206.52 (25.78 of copy matmul, 185.95 of clock)
      NCgemmNT : muladd=1, lat=8, pf=0, nb=40, mu=14, nu=1 ku=4,
                 ForceFetch=0, ifetch=14 nfetch=1
                 Performance = 4335.99 (21.47 of copy matmul, 154.86 of clock)
      NCgemmTN : muladd=1, lat=8, pf=0, nb=40, mu=14, nu=1 ku=40,
                 ForceFetch=0, ifetch=14 nfetch=1
                 Performance = 4022.89 (19.92 of copy matmul, 143.67 of clock)
      NCgemmTT : muladd=1, lat=5, pf=0, nb=40, mu=14, nu=1 ku=40,
                 ForceFetch=0, ifetch=14 nfetch=1
                 Performance = 3702.76 (18.34 of copy matmul, 132.24 of clock)
make -f Makefile MMinstall pre=c 2>&1 | ./xatlas_tee INSTALL_LOG/zL3TUNE.LOG


   STAGE 2-4-2: CacheEdge DETECTION


   STAGE 2-4-3: LARGE/SMALL CASE CROSSOVER DETECTION


   STAGE 2-4-3: COPY/NO-COPY CROSSOVER DETECTION
make -f Makefile INSTALL_LOG/cXover.h pre=c 2>&1 | ./xatlas_tee INSTALL_LOG/cMMCROSSOVER.LOG
      done.


   STAGE 2-4-4: LEVEL 3 BLAS TUNE
make -f Makefile Il3lib pre=c 2>&1 | ./xatlas_tee INSTALL_LOG/cL3TUNE.LOG
make -f Makefile ccblaslib 2>&1 | ./xatlas_tee INSTALL_LOG/cL3TUNE.LOG
      done.


   STAGE 2-4-5: GEMV TUNE
make -f Makefile INSTALL_LOG/cMVRES pre=c 2>&1 | ./xatlas_tee INSTALL_LOG/cMVTUNE.LOG
make[1]: *** [build] Error 255
make: *** [build] Error 2
Error: Target org.macports.build returned: shell command " cd "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_math_atlas/work/atlas-3.8.3/build" && /usr/bin/make build " returned error 2
DEBUG: Backtrace: shell command " cd "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_math_atlas/work/atlas-3.8.3/build" && /usr/bin/make build " returned error 2
    while executing
"command_exec build"
    (procedure "portbuild::build_main" line 9)
    invoked from within
"$procedure $targetname"
Warning: the following items did not execute (for atlas): org.macports.activate org.macports.build org.macports.destroot org.macports.install
Error: The following dependencies failed to build: py25-numpy atlas fftw-3 py25-nose py25-xml
Error: Status 1 encountered during processing.
To report a bug, see <http://guide.macports.org/#project.tickets>
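
Editorial note on the failing assertion: the abort in mvsearch.c comes from a check that fscanf converted exactly three doubles from a tuning results file. Which file was read, and why it could not be parsed, is not shown in this log; an empty or truncated results file is one possible cause, but that is an assumption. The sketch below is not ATLAS code. `read_three_mflops` is a hypothetical helper that performs the same fscanf check but reports failure to the caller instead of aborting.

```c
#include <stdio.h>

/* Hypothetical helper, not part of ATLAS: read three MFLOPS values from
 * an already-open results file. Returns 1 on success and 0 if the file
 * is empty, truncated, or malformed, instead of aborting via assert(). */
int read_three_mflops(FILE *fp, double mfs[3])
{
    /* fscanf returns the number of items converted, or EOF if input
     * ends before the first conversion, so anything but 3 is a failure. */
    int n = fscanf(fp, " %lf %lf %lf", mfs, mfs + 1, mfs + 2);
    return n == 3;
}
```

A caller could use the zero return to print the name of the unreadable file before giving up, which would make a failure like the one above easier to diagnose than a bare "Abort trap".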

Attachments (1)

main.log.bz2 (606.7 KB) - added by andre.david@… 14 years ago.
Failure log


Change History (8)

comment:1 Changed 15 years ago by martisa@…

Cc: martisa@… added

Cc Me!

comment:2 Changed 15 years ago by mf2k (Frank Schima)

Cc: martisa@… removed
Keywords: atlas i7 removed
Owner: changed from macports-tickets@… to jameskyle@…

comment:3 Changed 15 years ago by jowens@…

Cc: jowens@… added

Cc Me!

comment:4 Changed 15 years ago by jameskyle@…

I don't have an i7 to test on (this may change in the near, but not immediate, future).

I would wager that the atlas script is not detecting the CPU correctly though.

comment:5 Changed 15 years ago by jameskyle@…

Resolution: worksforme
Status: new → closed

The submitter was kind enough to provide a build environment on his i7. I was unable to reproduce the issue, so I am closing this ticket.

comment:6 Changed 14 years ago by andre.david@…

I am unable to build "atlas @3.8.3, Revision 4" using '+gcc44+universal' (see attached log). I tried the same with '-gcc44+universal', with the same result.

Changed 14 years ago by andre.david@…

Attachment: main.log.bz2 added

Failure log

comment:7 Changed 14 years ago by Veence (Vincent)

Try using gcc45 instead; it should work.
