Opened 4 years ago

Last modified 18 months ago

#62474 assigned defect

tensorflow build fails on 10.15

Reported by: essandess (Steve Smith)
Owned by: mascguy (Christopher Nielsen)
Priority: Normal
Milestone:
Component: ports
Version: 2.6.4
Keywords:
Cc: cjones051073 (Chris Jones), blair (Blair Zajac), emcrisostomo (Enrico Maria Crisostomo)
Port: py-tensorflow

Description

sudo port install py39-tensorflow

:info:build ERROR: /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_python_py-tensorflow/py39-tensorflow/work/846a2d310fd0286a6b86467e524348b4/external/com_google_protobuf/BUILD:301:11: C++ compilation of rule '@com_google_protobuf//:protoc_lib' failed due to Exec failed due to IOException: xcrun failed with code 1.
:info:build This most likely indicates that SDK version [10.15] for platform [MacOSX] is unsupported for the target version of xcode.
:info:build Process exited with status 1
:info:build stdout: stderr: xcodebuild: error: SDK "macosx10.15" cannot be located.
:info:build xcodebuild: error: SDK "macosx10.15" cannot be located.
:info:build xcrun: error: unable to lookup item 'Path' in SDK 'macosx10.15'
:info:build java.io.IOException: xcrun failed with code 1.
:info:build This most likely indicates that SDK version [10.15] for platform [MacOSX] is unsupported for the target version of xcode.
:info:build Process exited with status 1
:info:build stdout: stderr: xcodebuild: error: SDK "macosx10.15" cannot be located.
:info:build xcodebuild: error: SDK "macosx10.15" cannot be located.
:info:build xcrun: error: unable to lookup item 'Path' in SDK 'macosx10.15'
:info:build     at com.google.devtools.build.lib.exec.local.XcodeLocalEnvProvider.querySdkRoot(XcodeLocalEnvProvider.java:150)

Attachments (1)

main.log (3.0 MB) - added by essandess (Steve Smith) 4 years ago.

Change History (29)

Changed 4 years ago by essandess (Steve Smith)

Attachment: main.log added

comment:1 Changed 4 years ago by cjones051073 (Chris Jones)

I guess the first question is: which SDKs do you have installed?

comment:2 Changed 4 years ago by cjones051073 (Chris Jones)

:debug:extract SDKROOT='/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk'

Does the above exist or not?

comment:3 Changed 4 years ago by cjones051073 (Chris Jones)

OK, I can reproduce the same failure here on my macOS 11 system.

comment:4 Changed 4 years ago by cjones051073 (Chris Jones)

[a69f6ca5c235867b17eba59cd42a608bb03eccae/macports-ports]

It looks like that workaround may no longer be required, but we need to see what the buildbots make of it. I was at least able to get past the point where it was failing, but my lowly 2015 MacBook cannot really handle building the whole thing, let alone testing other OSes.

Last edited 4 years ago by ryandesign (Ryan Carsten Schmidt)

comment:5 Changed 4 years ago by mf2k (Frank Schima)

Cc: emcrisostomo removed
Owner: set to emcrisostomo
Status: new → assigned

comment:6 Changed 4 years ago by essandess (Steve Smith)

Thanks. FWIW, I tried reinstalling the CLT (ProblemHotlist#clt) just to FAFO, but this doesn't help.

Last edited 4 years ago by ryandesign (Ryan Carsten Schmidt)

comment:7 Changed 4 years ago by cjones051073 (Chris Jones)

Sorry, I don’t get what point you are trying to make.

comment:8 Changed 4 years ago by essandess (Steve Smith)

@Chris,

Tangentially, I hacked your tensorflow bazel build code into a bunch of other ports. This would all benefit from a bazel port group, but I personally don't feel I have the expertise to design it.

I made my own minor improvements (?) to the bazel calls and workflow. Please see:

Last edited 4 years ago by essandess (Steve Smith)

comment:9 in reply to:  7 Changed 4 years ago by essandess (Steve Smith)

Replying to cjones051073:

Sorry, I don’t get what point you are trying to make.

Reinstalling CLT doesn't fix the issue.

comment:10 Changed 4 years ago by cjones051073 (Chris Jones)

I never said it would.

comment:11 in reply to: 10 Changed 4 years ago by essandess (Steve Smith)

Replying to cjones051073:

I never said it would.

FAFO == https://www.urbandictionary.com/define.php?term=FAFO

comment:12 Changed 4 years ago by cjones051073 (Chris Jones)

A bazel PG might indeed make sense, if there is now more than one port using it. I also don’t really have huge enthusiasm to do it, given the hassle I have had with tensorflow... it’s not a build system I like that much...

comment:13 in reply to:  11 Changed 4 years ago by cjones051073 (Chris Jones)

Replying to essandess:

Replying to cjones051073:

I never said it would.

FAFO == https://www.urbandictionary.com/define.php?term=FAFO

Right, hadn’t heard of that one...

comment:14 in reply to:  12 Changed 4 years ago by essandess (Steve Smith)

Replying to cjones051073:

A bazel PG might indeed make sense, if there is now more than one port using it. I also don’t really have huge enthusiasm to do it, given the hassle I have had with tensorflow... it’s not a build system I like that much...

This is the future: https://blog.tensorflow.org/2020/11/accelerating-tensorflow-performance-on-mac.html

But we'll still need all the bazel-built dependencies for the higher level frameworks. Hate it or hate it, we'll need to use bazel.

comment:15 in reply to: 2 Changed 4 years ago by ryandesign (Ryan Carsten Schmidt)

Replying to cjones051073:

:debug:extract SDKROOT='/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk'

Does the above exist or not?

This still hasn't been answered.

It does exist on the 10.15 buildbot worker because it uses Xcode 11 not 12.

comment:16 in reply to:  15 Changed 4 years ago by essandess (Steve Smith)

Yes, /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk exists on my system.

Replying to ryandesign:

Replying to cjones051073:

:debug:extract SDKROOT='/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk'

Does the above exist or not?

This still hasn't been answered.

It does exist on the 10.15 buildbot worker because it uses Xcode 11 not 12.

comment:17 Changed 4 years ago by essandess (Steve Smith)

If relevant:

$ xcode-select -p
/Applications/Xcode.app/Contents/Developer

comment:18 Changed 4 years ago by essandess (Steve Smith)

Also:

$ ls -d /Library/Developer/CommandLineTools/SDKs/*
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk@
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/
/Library/Developer/CommandLineTools/SDKs/MacOSX11.1.sdk/
$ ls -d /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/*
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/DriverKit20.2.sdk/
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk@
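
Note that in the two listings above, MacOSX10.15.sdk appears only under the Command Line Tools directory, not under Xcode.app. A minimal sketch for repeating that comparison, assuming a POSIX shell; the function name `list_macos_sdks` is hypothetical and not part of MacPorts or Xcode:

```shell
# Hypothetical helper: print every MacOSX*.sdk entry found under each
# SDKs directory given as an argument. Missing directories are skipped.
list_macos_sdks() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        for sdk in "$dir"/MacOSX*.sdk; do
            # When nothing matches, the glob stays literal; skip it.
            [ -e "$sdk" ] || continue
            echo "$sdk"
        done
    done
}
```

It could be called as `list_macos_sdks /Library/Developer/CommandLineTools/SDKs /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs` to see which location, if any, provides a given SDK version.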

comment:19 Changed 4 years ago by essandess (Steve Smith)

This is also an issue on other macOS versions with bazel builds of other ports, as observed on the buildbot.

See:

failed: I/O exception during sandboxed execution: xcrun failed with code 1.
2021-03-17T13:54:58.5545850Z This most likely indicates that SDK version [10.14] for platform [MacOSX] is unsupported for the target version of xcode.
2021-03-17T13:54:58.5648320Z Process exited with status 1
2021-03-17T13:54:58.5750100Z stdout: stderr: xcodebuild: error: SDK "macosx10.14" cannot be located.
2021-03-17T13:54:58.5853880Z xcodebuild: error: SDK "macosx10.14" cannot be located.
2021-03-17T13:54:58.5956190Z xcrun: error: unable to lookup item 'Path' in SDK 'macosx10.14'
2021-03-17T13:54:58.6058010Z Target //tree:_tree failed to build

comment:20 Changed 4 years ago by essandess (Steve Smith)

This issue isn't bazel-related—it's xcode-select-related. My vanilla-configured box with Xcode and CLT installed throws this error with xcrun:

$ xcode-select --reset
$ xcrun --show-sdk-path
/Applications/Xcode.app/Contents/Developer
$ xcrun --show-sdk-version
"xcrun" error: unable to lookup item 'SDKVersion' in SDK '/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk'

This is the error that bazel is hitting.

To correct this:

$ sudo xcode-select --switch /Library/Developer/CommandLineTools
$ xcrun --show-sdk-path
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk
$ xcrun --show-sdk-version
10.15.6

Then: sudo port install py39-tensorflow (and other bazel-based builds).

But how should this be corrected from within a Portfile?
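
This does not answer the Portfile question, but the check above can be scripted before starting a long build. A minimal sketch, assuming a POSIX shell; the function name `check_xcrun_sdk` is hypothetical, and it only reports whether `xcrun --show-sdk-version` succeeds under the currently active developer directory:

```shell
# Hypothetical pre-flight check: succeed only if xcrun can report an SDK
# version under the active developer directory.
check_xcrun_sdk() {
    if ! command -v xcrun >/dev/null 2>&1; then
        echo "xcrun not found" >&2
        return 2
    fi
    # Capture the version; treat a non-zero exit or empty output as failure.
    if version=$(xcrun --show-sdk-version 2>/dev/null) && [ -n "$version" ]; then
        echo "SDK version: $version"
        return 0
    fi
    echo "xcrun cannot resolve an SDK version; consider: sudo xcode-select --switch /Library/Developer/CommandLineTools" >&2
    return 1
}
```

Running `check_xcrun_sdk && sudo port install py39-tensorflow` would then skip the build when the lookup fails, instead of failing partway through compilation.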

comment:21 Changed 4 years ago by cjones051073 (Chris Jones)

This isn't an issue an individual port can, or should try to, fix itself. It's an issue with the user's Xcode installation, and as such the user is expected to fix it themselves. If you think there is a bug here with xcode-select, then you should take it up with Apple.

comment:22 in reply to:  21 Changed 4 years ago by essandess (Steve Smith)

Replying to cjones051073:

This isn't an issue an individual port can, or should try to, fix itself. It's an issue with the user's Xcode installation, and as such the user is expected to fix it themselves. If you think there is a bug here with xcode-select, then you should take it up with Apple.

FB9051310

comment:23 Changed 4 years ago by essandess (Steve Smith)

FWIW, I’m still not certain this isn’t a local configuration issue. For posterity, these commands work to build tensorflow:

$ xcrun --kill-cache
$ sudo xcode-select --switch /Library/Developer/CommandLineTools
$ xcrun --show-sdk-path
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk

Resetting the CLT path to its default breaks the build. This doesn’t work:

$ sudo xcode-select --reset
$ xcrun --show-sdk-path
/Applications/Xcode.app/Contents/Developer
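
For comparing the two configurations without switching the system-wide setting back and forth, the `DEVELOPER_DIR` environment variable documented in the xcode-select man page overrides the active developer directory for a single command. A minimal sketch, assuming a POSIX shell; the wrapper name `with_clt` is hypothetical, and nothing in this ticket confirms that the variable survives into the MacPorts or bazel build environment:

```shell
# Hypothetical wrapper: run one command with the Command Line Tools as the
# developer directory, leaving the xcode-select setting untouched.
with_clt() {
    env DEVELOPER_DIR=/Library/Developer/CommandLineTools "$@"
}
```

For example, `with_clt xcrun --show-sdk-path` can be compared against a plain `xcrun --show-sdk-path` to see whether the two developer directories resolve different SDKs.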

comment:24 Changed 4 years ago by essandess (Steve Smith)

I'm on to this chronic issue building tensorflow: Symbol not found: __ZN10tensorflow4data12experimental19SnapshotDatasetV2Op13kReaderPrefixE.

Tensorflow builds and installs using the latest bazel PG and Portfile, but it doesn't load because of some missing symbol.

This was an issue for a long time in the MacPorts pre-compiled version. Was it ever resolved?

python3 -c 'import tensorflow as tf'
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/tensorflow/python/pywrap_tensorflow.py", line 64, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: dlopen(/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so, 6): Symbol not found: __ZN10tensorflow4data12experimental19SnapshotDatasetV2Op13kReaderPrefixE
  Referenced from: /opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
  Expected in: flat namespace
 in /opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/tensorflow/__init__.py", line 41, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/tensorflow/python/__init__.py", line 39, in <module>
    from tensorflow.python import pywrap_tensorflow as _pywrap_tensorflow
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/tensorflow/python/pywrap_tensorflow.py", line 83, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/tensorflow/python/pywrap_tensorflow.py", line 64, in <module>
    from tensorflow.python._pywrap_tensorflow_internal import *
ImportError: dlopen(/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so, 6): Symbol not found: __ZN10tensorflow4data12experimental19SnapshotDatasetV2Op13kReaderPrefixE
  Referenced from: /opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
  Expected in: flat namespace
 in /opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

comment:25 Changed 4 years ago by essandess (Steve Smith)

I already posted this issue and am taking it there: #61279

Last edited 4 years ago by ryandesign (Ryan Carsten Schmidt)

comment:26 Changed 4 years ago by blair (Blair Zajac)

Cc: blair added

comment:27 Changed 18 months ago by mascguy (Christopher Nielsen)

Cc: emcrisostomo added
Owner: changed from emcrisostomo to mascguy

Currently looking at options to fix the build for 10.14 and 10.15, which presently fails with "argument list too long." There's at least one proposed patch for Bazel, which might finally resolve these issues. Stay tuned!

comment:28 in reply to:  27 Changed 18 months ago by mascguy (Christopher Nielsen)

Replying to mascguy:

Currently looking at options to fix the build for 10.14 and 10.15, which presently fails with "argument list too long." There's at least one proposed patch for Bazel, which might finally resolve these issues. Stay tuned!

In case anyone's interested in the details, the upstream discussion is here:

The author of the patches implemented incremental linking, to avoid a single massive final link command.

The proposed changes - albeit a year old, so they may need some modifications - are here:
