Avoid FoundationNetworking and libcurl on non-Darwin platforms #681

Draft: wants to merge 13 commits into base: main

@MaxDesiatov (Member) commented Aug 2, 2023

Summary

When the docc binary is linked statically with Swift Foundation and the stdlib (with swift build --product docc --static-swift-stdlib), it still depends on FoundationNetworking on non-Darwin platforms. That library depends on curl, which pulls in a long list of transitive dependencies. Those libraries are still linked dynamically, as this command shows on Linux:

# ldd ./.build/debug/docc
	linux-vdso.so.1 (0x0000ffff84448000)
	libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff7f320000)
	libcurl.so.4 => /lib/aarch64-linux-gnu/libcurl.so.4 (0x0000ffff7f270000)
	libxml2.so.2 => /lib/aarch64-linux-gnu/libxml2.so.2 (0x0000ffff7f080000)
	libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffff7ee50000)
	libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffff7ee20000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff7ec70000)
	/lib/ld-linux-aarch64.so.1 (0x0000ffff8440f000)
	libnghttp2.so.14 => /lib/aarch64-linux-gnu/libnghttp2.so.14 (0x0000ffff7ec30000)
	libidn2.so.0 => /lib/aarch64-linux-gnu/libidn2.so.0 (0x0000ffff7ec00000)
	librtmp.so.1 => /lib/aarch64-linux-gnu/librtmp.so.1 (0x0000ffff7ebd0000)
	libssh.so.4 => /lib/aarch64-linux-gnu/libssh.so.4 (0x0000ffff7eb50000)
	libpsl.so.5 => /lib/aarch64-linux-gnu/libpsl.so.5 (0x0000ffff7eb20000)
	libssl.so.3 => /lib/aarch64-linux-gnu/libssl.so.3 (0x0000ffff7ea70000)
	libcrypto.so.3 => /lib/aarch64-linux-gnu/libcrypto.so.3 (0x0000ffff7e680000)
	libgssapi_krb5.so.2 => /lib/aarch64-linux-gnu/libgssapi_krb5.so.2 (0x0000ffff7e620000)
	libldap-2.5.so.0 => /lib/aarch64-linux-gnu/libldap-2.5.so.0 (0x0000ffff7e5b0000)
	liblber-2.5.so.0 => /lib/aarch64-linux-gnu/liblber-2.5.so.0 (0x0000ffff7e590000)
	libzstd.so.1 => /lib/aarch64-linux-gnu/libzstd.so.1 (0x0000ffff7e4c0000)
	libbrotlidec.so.1 => /lib/aarch64-linux-gnu/libbrotlidec.so.1 (0x0000ffff7e4a0000)
	libz.so.1 => /lib/aarch64-linux-gnu/libz.so.1 (0x0000ffff7e470000)
	libicuuc.so.70 => /lib/aarch64-linux-gnu/libicuuc.so.70 (0x0000ffff7e260000)
	liblzma.so.5 => /lib/aarch64-linux-gnu/liblzma.so.5 (0x0000ffff7e220000)
	libunistring.so.2 => /lib/aarch64-linux-gnu/libunistring.so.2 (0x0000ffff7e060000)
	libgnutls.so.30 => /lib/aarch64-linux-gnu/libgnutls.so.30 (0x0000ffff7de60000)
	libhogweed.so.6 => /lib/aarch64-linux-gnu/libhogweed.so.6 (0x0000ffff7de00000)
	libnettle.so.8 => /lib/aarch64-linux-gnu/libnettle.so.8 (0x0000ffff7dda0000)
	libgmp.so.10 => /lib/aarch64-linux-gnu/libgmp.so.10 (0x0000ffff7dd10000)
	libkrb5.so.3 => /lib/aarch64-linux-gnu/libkrb5.so.3 (0x0000ffff7dc30000)
	libk5crypto.so.3 => /lib/aarch64-linux-gnu/libk5crypto.so.3 (0x0000ffff7dbf0000)
	libcom_err.so.2 => /lib/aarch64-linux-gnu/libcom_err.so.2 (0x0000ffff7dbd0000)
	libkrb5support.so.0 => /lib/aarch64-linux-gnu/libkrb5support.so.0 (0x0000ffff7dbb0000)
	libsasl2.so.2 => /lib/aarch64-linux-gnu/libsasl2.so.2 (0x0000ffff7db80000)
	libbrotlicommon.so.1 => /lib/aarch64-linux-gnu/libbrotlicommon.so.1 (0x0000ffff7db40000)
	libicudata.so.70 => /lib/aarch64-linux-gnu/libicudata.so.70 (0x0000ffff7bf10000)
	libp11-kit.so.0 => /lib/aarch64-linux-gnu/libp11-kit.so.0 (0x0000ffff7bdc0000)
	libtasn1.so.6 => /lib/aarch64-linux-gnu/libtasn1.so.6 (0x0000ffff7bd90000)
	libkeyutils.so.1 => /lib/aarch64-linux-gnu/libkeyutils.so.1 (0x0000ffff7bd70000)
	libresolv.so.2 => /lib/aarch64-linux-gnu/libresolv.so.2 (0x0000ffff7bd40000)
	libffi.so.8 => /lib/aarch64-linux-gnu/libffi.so.8 (0x0000ffff7bd20000)

This makes the docc binary hard to distribute on Linux even when linking Foundation and stdlib statically.

DocC doesn't actually need anything from FoundationNetworking other than these two things:

  1. The Data(contentsOf: URL) initializer. It pulls in networking code because it can't know upfront whether a URL is local or remote. I've audited its uses in the DocC codebase and none of them rely on remote URLs. In fact, I'd advocate replacing such uses of URL with Swift System's FilePath in a future PR. That would guarantee at compile time, at the type-system level, that these are actual file paths and that no URL can imply the use of an HTTP client under the hood. In the meantime, calls to Data(contentsOf:) are replaced with FileHandle.readToEnd().
  2. URLRequest and URLResponse, but only as model types; no actual networking calls are made with them on non-Darwin platforms. HTTPRequest and HTTPResponse from the new apple/swift-http-types library cover this functionality.
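
As a minimal sketch of the first replacement, reading a local file through FileHandle.readToEnd() could look like the following. The helper name readLocalFile(at:) and the EmptyFileError type are hypothetical and exist only for this illustration; the PR itself throws its own FileSystemError.

```swift
import Foundation

// Hypothetical error type for this sketch; the PR uses its own FileSystemError.
struct EmptyFileError: Error {
    let path: String
}

/// Reads a local file without going through `Data(contentsOf:)`.
func readLocalFile(at url: URL) throws -> Data {
    let fileHandle = try FileHandle(forReadingFrom: url)
    // Close the handle on every exit path, including thrown errors.
    defer { try? fileHandle.close() }

    // `readToEnd()` reports I/O failures as thrown Swift errors.
    guard let data = try fileHandle.readToEnd() else {
        throw EmptyFileError(path: url.path)
    }
    return data
}

// Usage: write a small temporary file and read it back.
let sketchURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("docc-read-sketch-\(UUID().uuidString).txt")
try Data("hello".utf8).write(to: sketchURL)
let sketchContents = try readLocalFile(at: sketchURL)
try FileManager.default.removeItem(at: sketchURL)
```

The function takes a URL only to match the existing call sites; nothing in it can trigger a network request.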

Dependencies

A new dependency on apple/swift-http-types 0.2.1 is introduced to replace existing uses of URLRequest and URLResponse on non-Darwin platforms.

Testing

Build on Linux with swift build --product docc --static-swift-stdlib, run ldd ./.build/debug/docc, and observe that the list of linked libraries shrinks to this:

# ldd ./.build/debug/docc
	linux-vdso.so.1 (0x0000ffff9c81a000)
	libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff97540000)
	libxml2.so.2 => /lib/aarch64-linux-gnu/libxml2.so.2 (0x0000ffff97350000)
	libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffff97120000)
	libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffff970f0000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff96f40000)
	/lib/ld-linux-aarch64.so.1 (0x0000ffff9c7e1000)
	libicuuc.so.70 => /lib/aarch64-linux-gnu/libicuuc.so.70 (0x0000ffff96d30000)
	libz.so.1 => /lib/aarch64-linux-gnu/libz.so.1 (0x0000ffff96d00000)
	liblzma.so.5 => /lib/aarch64-linux-gnu/liblzma.so.5 (0x0000ffff96cc0000)
	libicudata.so.70 => /lib/aarch64-linux-gnu/libicudata.so.70 (0x0000ffff95090000)

Checklist

  • Added tests
  • Ran the ./bin/test script and it succeeded
  • Updated documentation if necessary

`Data(contentsOf:)` on non-Darwin platforms comes from the `FoundationNetworking` module, which pulls in a massive amount of transitive dependencies of libcurl. That makes it harder to redistribute a statically linked executable binary of DocC.
@MaxDesiatov added the "enhancement" label (Improvements or enhancements to existing functionality) on Aug 2, 2023
@MaxDesiatov (Member Author)

@swift-ci test

@@ -19,7 +19,7 @@ let swiftSettings: [SwiftSetting] = [
 let package = Package(
     name: "SwiftDocC",
     platforms: [
-        .macOS(.v10_15),
+        .macOS(.v11),
Member Author

Bumped to .v11 per a discussion with @franklinsch. This lets us use the throwing FileHandle.readToEnd() instead of the unsafe non-throwing FileHandle.readDataToEndOfFile, which crashes the test suite on Linux.

@@ -90,7 +90,7 @@ public class FileServer {
                 xlog("Tried to load an invalid path: \(path).\nFalling back to serve index.html.")
             }
             mimeType = "text/html"
-            data = self.data(for: path.appendingPathComponent("/index.html"))
+            data = self.data(for: "/index.html")
Contributor

Can you explain why this isn't a change in behavior? What's needless about this?

Member Author

Only the path component of the url that data(for:) took as an argument was used. The rest of the components are ignored by DocC server implementations.

Additionally, HTTPRequest, unlike URLRequest, doesn't store full URLs, only paths. A full URL can be assembled from the scheme and authority properties on HTTPRequest, but those aren't used anywhere in the DocC codebase, other than in a test further down in this diff, where we have to construct a new URLRequest to pass to WKWebView on Darwin platforms.
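
To illustrate what "only the path component" means, here is a small Foundation-only sketch. The helper name requestPath(from:) and the example URL are assumptions made for this illustration and don't come from the DocC codebase.

```swift
import Foundation

/// Returns only the path of a URL string, dropping the scheme, authority,
/// query, and fragment.
func requestPath(from urlString: String) -> String? {
    URLComponents(string: urlString)?.path
}

let fullURL = "https://example.com/documentation/index.html?language=swift"
let pathOnly = requestPath(from: fullURL)
```

Everything a path-only request type would lose (scheme, host, query) is already absent from pathOnly.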

Contributor

But if request.path is "/something" then this used to pass "/something/index.html" as an argument but now it only passes "index.html". Isn't that returning data for the wrong path?

Member Author

If request.path is "/something" then that is handled by the if branch above, not this else branch. The latter branch didn't use request.path previously, and that logic is unchanged, as verified by the test suite.
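
A simplified sketch of the branching described in this reply, taking the author's description at face value. The function name and the servablePaths set are hypothetical stand-ins; the condition the real FileServer checks in its if branch isn't shown in this diff.

```swift
/// Chooses which path to serve. `servablePaths` is a hypothetical stand-in
/// for whatever check the real server performs in its `if` branch.
func pathToServe(for requestPath: String, servablePaths: Set<String>) -> String {
    if servablePaths.contains(requestPath) {
        // Valid request: serve exactly what was asked for.
        return requestPath
    } else {
        // Invalid request: the fallback is a fixed path that ignores `requestPath`.
        return "/index.html"
    }
}

let served = pathToServe(for: "/something", servablePaths: ["/something"])
let fallback = pathToServe(for: "/missing", servablePaths: ["/something"])
```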

@MaxDesiatov (Member Author)

@swift-ci test

Package.swift (outdated, resolved)
-    public func response(to request: URLRequest) -> (URLResponse, Data?) {
-        guard let url = request.url else {
-            return (HTTPURLResponse(url: baseURL, statusCode: 400, httpVersion: "HTTP/1.1", headerFields: nil)!, nil)
+    public func response(to request: HTTPRequest) -> (HTTPTypes.HTTPResponse, Data?) {
Contributor

This is a source-breaking change. Is there any way to maintain both declarations to ease this transition?

Member Author

I can guard the old function that uses URLRequest and URLResponse with #if canImport(Darwin), if that's ok? Otherwise its presence on non-Darwin platforms adds a dependency on FoundationNetworking, with curl and the rest of the system networking/crypto libraries, which is what this PR was meant to remove in the first place.

Contributor

I'm not sure. I'm concerned that someone could have an existing setup on Linux with curl and the rest of the system networking libraries and that their setup would break when they update to a new DocC version.

We try to provide a transition period for breaking changes. If that's not possible then so be it (although we would try to inform people about it ahead of time in the Forums). But if there's a way to stage this, then I think that it would be less disruptive that way.

Would it be possible to use a compile time define to exclude FoundationNetworking? That way we could have a 3 stage transition (first opt-in to exclude, then exclude by default, and finally remove it completely)
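
One possible shape for the opt-in stage, as a sketch only. The flag name DOCC_EXCLUDE_FOUNDATION_NETWORKING and the function legacyRequestPath(for:) are made up for this illustration; no such flag exists in the PR as shown. The flag could be passed with swift build -Xswiftc -DDOCC_EXCLUDE_FOUNDATION_NETWORKING.

```swift
import Foundation
// On non-Darwin platforms URLRequest lives in FoundationNetworking. Import it
// only while the hypothetical exclusion flag is not set.
#if canImport(FoundationNetworking) && !DOCC_EXCLUDE_FOUNDATION_NETWORKING
import FoundationNetworking
#endif

#if canImport(Darwin) || !DOCC_EXCLUDE_FOUNDATION_NETWORKING
/// Legacy entry point, kept for source compatibility during the transition.
func legacyRequestPath(for request: URLRequest) -> String? {
    request.url?.path
}
let legacyAPIIsCompiledIn = true
#else
// With the flag set on a non-Darwin platform, the legacy API is compiled out.
let legacyAPIIsCompiledIn = false
#endif
```

Flipping the default in the second stage would mean inverting the flag so that including FoundationNetworking becomes the opt-in; the third stage would delete the guarded code.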

Co-authored-by: David Rönnqvist
Comment on lines 51 to 54
let fileHandle = try FileHandle(forReadingFrom: url)
guard let data = try fileHandle.readToEnd() else {
    throw FileSystemError.noDataReadFromFile(path: url.path)
}
Contributor

If we have to do this instead of Data(contentsOf:) everywhere in the future then I'm concerned that either we or someone new to the repo would forget and break things.

Is there any way that we can shim Data(contentsOf:) on non-Darwin platforms to do this, or any way that we could mark Data(contentsOf:) as unavailable so that we don't accidentally break things in the future?

Member Author

I'm not aware of a way to do that. There's no way to prevent someone from adding import FoundationNetworking in the future and shadowing the locally declared shim. The right way to do this is for swift-corelibs-foundation to deprecate this initializer, but that's up to its maintainers and whether they want to do the same thing on Darwin so that these different platform implementations don't diverge even more.

Member Author

Another thing that comes to mind is just checking for FoundationNetworking imports in ./bin/test and raising an error if one is found. That means that if anyone adds Data(contentsOf:), it will fail to build on Linux and Windows unless they add import FoundationNetworking, which in turn will make ./bin/test fail.

Is ./bin/test what you run on CI?

Member Author

Applied the latter suggestion in f2b5908.
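
The check added in f2b5908 isn't shown in this conversation. As a rough sketch of the idea only, written in Swift rather than as part of the ./bin/test script, a scan for the offending import could look like this. The function name is hypothetical and the matching is deliberately simple.

```swift
/// Returns the 1-based line numbers in `source` that import FoundationNetworking.
func foundationNetworkingImportLines(in source: String) -> [Int] {
    var offendingLines: [Int] = []
    let lines = source.split(separator: "\n", omittingEmptySubsequences: false)
    for (index, line) in lines.enumerated() {
        let tokens = line.split(whereSeparator: { $0 == " " || $0 == "\t" })
        // Skip blank lines and line comments.
        guard let first = tokens.first, !first.hasPrefix("//") else { continue }
        // Flag `import FoundationNetworking`, including attributed forms such as
        // `@testable import FoundationNetworking`.
        if let importIndex = tokens.firstIndex(of: "import"),
           importIndex + 1 < tokens.count,
           tokens[importIndex + 1] == "FoundationNetworking" {
            offendingLines.append(index + 1)
        }
    }
    return offendingLines
}

let sampleSource = """
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif
// import FoundationNetworking
"""
let flagged = foundationNetworkingImportLines(in: sampleSource)
```

A real check would run this over every source file and fail the script when any line is flagged.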

Contributor

> Is ./bin/test what you run on CI?

Yes

@MaxDesiatov (Member Author)

@swift-ci test

@MaxDesiatov (Member Author)

@swift-ci test

@MaxDesiatov changed the title from "Remove dependency on FoundationNetworking and libcurl on non-Darwin platforms" to "Avoid FoundationNetworking and libcurl on non-Darwin platforms" on Aug 2, 2023
@MaxDesiatov (Member Author)

@swift-ci test

@MaxDesiatov (Member Author)

@swift-ci test

@MaxDesiatov (Member Author)

I will get back to this PR and make exclusion of the old API conditional on a build setting. Marking it as a draft in the meantime.

@MaxDesiatov marked this pull request as draft on August 8, 2023 18:21
@d-ronnqvist added the "source breaking" label (DocC's public API isn't source compatible with earlier versions) on Nov 27, 2023
2 participants