Better Non-Rust Dependency Tracking in Cargo #1972

Closed
mark-i-m opened this issue Apr 14, 2017 · 16 comments
Labels
T-cargo Relevant to the Cargo team, which will review and decide on the RFC.

Comments

@mark-i-m
Member

Hey all,

I have been working on a few projects that involve C++ and assembly code with Rust, and one of the annoying pain points is build systems. Here are some thoughts I had. Any feedback or alternate solutions are welcome. Thanks 😄

I propose adding an optional field in the [dependencies] section(s) of Cargo.toml to allow Cargo to be minimally aware of non-Rust dependencies and remove some boilerplate.

[dependencies]
# these are the same old rust dependencies we already have
hello_utils = { path = "hello_utils", version = "0.1.0" }
rand = { version = "0.3", optional = true }
rlibc = "*"

# but we can also have non-rust dependencies
my_asm = {path = "../my_asm/", non-rust-build = "make build"}
# can fetch from github, just as with other dependencies
my_cpp = {git = "http://github.com/example/my_cpp.git", non-rust-build = "make build"}

Suppose the my_asm dependency contains some assembly that needs to be built before the Rust project. The non-rust-build field in the snippet above tells Cargo that it should just invoke make build and ensure that it exits with status 0 before continuing with the rest of the build. To the best of my knowledge, and from spending some time skimming through the docs on dependencies on crates.io, I don't think such a feature exists yet (please correct me if I am wrong).

Motivation

So far, Cargo is great at tracking rust dependencies, but not so great for projects with mixes of rust and not-rust. For example, suppose you have a Rust project with a dependency on some assembly code. You need to first compile the assembly into some sort of object before it can be linked in with your Rust program (aside: #1489 would make the linking portion significantly easier). AFAIK, you can take either of two routes: (1) write a big makefile (or some other build tool) to invoke Cargo, gcc, and then the linker, or (2) use a Cargo build script to invoke the makefile. Such build systems can get messy and hard to debug and require a lot of boilerplate. In some cases, it is just more ergonomic to write a makefile that invokes rustc directly, and I have done this in one of my projects.
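For reference, route (2) might look roughly like the following. This is only a sketch of the kind of boilerplate involved, not code from any of the projects mentioned; the helper name `run_build` is made up for illustration, and it assumes the external build is a plain command that signals success with exit status 0.

```rust
use std::path::Path;
use std::process::Command;

/// Runs `program args...` inside `dir` and returns Ok(()) only if the
/// process could be started and exited with status 0.
/// (`run_build` is a hypothetical helper, not part of Cargo.)
fn run_build(program: &str, args: &[&str], dir: &Path) -> Result<(), String> {
    let status = Command::new(program)
        .args(args)
        .current_dir(dir)
        .status()
        .map_err(|e| format!("could not start `{}`: {}", program, e))?;

    if status.success() {
        Ok(())
    } else {
        Err(format!("`{}` failed with {}", program, status))
    }
}
```

In a build.rs, `main` would call something like `run_build("make", &["build"], Path::new("../my_asm"))` and panic on `Err`, since a panicking build script fails the whole Cargo build. The `../my_asm` path and the `build` target are taken from the example Cargo.toml above.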

To be clear, I am not proposing Cargo as a replacement for make. I simply think that Cargo could be made more aware of the other dependencies and build systems in a project in some minimal way.

Specifically, I think there are two cases from the point of view of Cargo: (1) the Rust crate is dependent on some non-Rust object which needs to be obtained somehow, and (2) the Rust crate is a dependency of some non-Rust object which needs to obtain the Rust object before it can be compiled/linked. I would argue that in case (2), some other build system should be responsible for tracking dependencies and making sure that Cargo is invoked when needed. However, in case (1), I believe Cargo should be responsible for obtaining the needed object, even if that simply involves invoking some programmer-defined command (e.g. make build).

In my opinion, it's OK to have a build system with multiple parts (e.g. cargo + make) as long as the parts are modular, somewhat independent (i.e. they have very limited interactions that are easy to trace and debug), and they don't contain lots of boilerplate.

@durka
Contributor

durka commented Apr 15, 2017

What would go in Cargo.lock for a non-Rust dependency?

@mark-i-m
Member Author

TBH, I am not 100% sure since we cannot guarantee versioning for non-Rust build systems.

One could imagine a few alternatives:

  • Nothing: let the programmer fend for themselves
  • The git commit SHA if the non-rust dependency is from a git repo. Likewise, you can imagine storing some comparable value for other source control systems like SVN, Mercurial, Perforce, etc... if they have them (do they?)
  • Simply a boolean: has this dependency been downloaded?
  • Maybe there is a way to retrofit someone else's repo with version numbers without changing their repo (e.g. add some uncommitted file or script or something to the workspace)? I am not sure how this option would work or if it would be more confusing than it's worth.
  • A SHA hash of the contents of the dependency. This seems like a pretty sketchy option, since you would need to know not to hash random artifacts that happen to be left over from the last build. Also, it does not record anything about versions, so is it really that useful?

My leaning is towards the commit hash option for source-controlled repos and the "simply a boolean" option for anything else. I am not sure if cargo could/should try to do a consistent versioning scheme for all non-rust dependencies, though. It seems like it would be a good idea, especially for source-controlled dependencies where there is a notion of versions. But it could just be confusing if the dependency is something you curl and untar or something like that.
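To make that preference concrete, here is a rough sketch of what a lock record for a non-Rust dependency could hold. Everything in it (the `NonRustLock` type and the `lock_entry` function) is hypothetical and invented for illustration; nothing like this exists in Cargo or Cargo.lock.

```rust
/// Hypothetical lock record for a non-Rust dependency (not part of Cargo).
#[derive(Debug, PartialEq)]
enum NonRustLock {
    /// The dependency comes from a git repository: pin the exact commit.
    GitCommit(String),
    /// Anything else: only record whether it has been downloaded.
    Downloaded(bool),
}

/// Chooses the record: a commit hash when one is available,
/// otherwise just the "has it been downloaded?" boolean.
fn lock_entry(git_commit: Option<&str>, downloaded: bool) -> NonRustLock {
    match git_commit {
        Some(sha) => NonRustLock::GitCommit(sha.to_string()),
        None => NonRustLock::Downloaded(downloaded),
    }
}
```

Other version control systems would presumably need their own variants, which is part of why a single consistent scheme is hard to promise.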

There are a couple of related questions that this brings up too:

  1. Should cargo try to cache non-rust dependencies like it does with rust crates? I think not because this would complicate the registry, right?
  2. I don't think it should attempt to track whether non-rust dependencies are built or not. It should just blindly invoke the build command (e.g. make build). The dependency's build system should be responsible for tracking if their sub-dependencies are already built, which is convenient because most build systems I have seen already do this. This means that cargo doesn't gain a lot of complexity.

@est31
Member

est31 commented Apr 15, 2017

You can package non-Rust code with Cargo; look at curl-sys for an example. That could be made more ergonomic, though not at the cost of stuff like pkg-config checks.

@mark-i-m
Member Author

@est31 curl-sys is an interesting example. IIUC, a large portion of its build script is simply figuring out which make to call, for example. TBH, I don't really know much about pkg-config, though.

This does raise an interesting question: is it enough to just provide a really nice make crate like the gcc crate? It still involves some boilerplate, but it seems a lot more tolerable.


Also, a related note from some experimenting I did today... It would be nice if cargo left behind a symlink to the most recent build it has done in some fixed location which does not depend on whether the build was a release or debug build. That way, larger build systems can invoke cargo and just look in one place for the artifact, rather than having to remember if they should look in target/debug or target/release...

@strega-nil

strega-nil commented Apr 16, 2017

So, just an idea:

support building non-rust code with cargo. Add C++ package and build management. They really would like, and need, a cargo-alike. Why not use actual cargo? 😺

(this is mostly because I hate, when writing C++, the lack of a good build system)

@mark-i-m
Member Author

@ubsan The idea sounds great in principle, but I am hesitant... This seems like it would open a new can of worms. Why just C++? What about C? Or Python? Or Java? I think if Cargo also tries to be a build manager for every language, it will be stretched too thin (unless somebody comes up with a really clever idea). There are already lots of good, well-supported, well-known build managers out there. In my mind, the real value of Cargo is semantic versioning (which every large multi-component project has problems with) and crates.io. That's what other languages are (sorely) missing IMO.

What would be cool IMO is if Cargo/crates.io could add semantic versioning to other languages (or really, arbitrary projects), which does not seem like much of a stretch (or even a long distance from what we have), but then shells out build management to some other build system, like make...

@Dushistov

You can run the build system of a foreign language from build.rs;
the problem is caching. For example, if you change something in
the C part of your project but nothing changed in the Rust part, cargo build does nothing,
because Cargo decides not to rerun build.rs. So you need something like touch build.rs to rerun the build.

@mark-i-m
Member Author

@Dushistov I would still argue that running other build systems through build.rs contains a lot of boilerplate (witness curl-sys, linked above by @est31) and it could be made a lot more streamlined.

👍 I do agree with the caching problem. My recent experiments have been of the form "make starts cargo", so I had forgotten about that.

@eddyb
Member

eddyb commented Apr 17, 2017

@Dushistov There are a bunch of directives you can print from the build script to handle error output, track input files, trigger various effects for the actual final compilation etc. Not sure if/where they're documented.

@durka
Contributor

durka commented Apr 17, 2017

They are documented here, in particular you can use rerun-if-changed. (Plug for foreman).

But anyway, I don't really see how Cargo can expand in scope to manage other languages -- it interacts with Rust's design quite a bit to ensure reproducible builds etc. Maybe a lack of imagination on my part. We can and should develop better build script utilities along the lines of the gcc crate though.
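As an illustration of the rerun-if-changed directive mentioned above: a build script can print `cargo:rerun-if-changed=PATH` lines on standard output, and Cargo will rerun the script when one of those paths changes. The sketch below is only an assumption about how one might wrap that; the helper names are invented for this example.

```rust
use std::path::Path;

/// Formats the directive that asks Cargo to rerun the build script
/// when `path` changes.
fn rerun_if_changed(path: &Path) -> String {
    format!("cargo:rerun-if-changed={}", path.display())
}

/// Prints one directive per input file; Cargo reads these lines from
/// the build script's standard output.
/// (`emit_rerun_directives` is a hypothetical helper name.)
fn emit_rerun_directives(inputs: &[&str]) {
    for input in inputs {
        println!("{}", rerun_if_changed(Path::new(input)));
    }
}
```

A build.rs would call, for example, `emit_rerun_directives(&["src/boot.S", "Makefile"])` from `main` (both file names are placeholders). Once a script emits any rerun-if-changed line, Cargo tracks only the listed paths, so every non-Rust input needs to be listed.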

@mark-i-m
Member Author

Maybe a lack of imagination on my part.

Nope, I have trouble imagining it, too, but we can dream 😛

Also, thanks @eddyb and @durka for the info; I was not aware of those... It strikes me that even though the crates.io docs are pretty thorough, it still seems like there are features of Cargo I am completely unaware of! Maybe another book is in order? The Cargo book + Cargonomicon 😝

@eddyb
Member

eddyb commented Apr 17, 2017

Also, you're not supposed to write your own build scripts from absolute scratch most of the time, and I believe the gcc and cmake crates do some advanced integration (using the features I mentioned).

@strega-nil

@mark-i-m mostly a joke, unless someone actually wants to take it up. I just... I really don't want to have to use any build systems but cargo.

@mark-i-m
Member Author

Another passing idea: we could require that the dependency is in a git repo or other VCS as a first pass (we could extend it later). A Cargo.toml dependency entry would look like this:

[dependencies.my_cpp]
git = "http://github.com/example/my_cpp.git"
version="40a964bf658d3fff9a956634d4dfeaf54547b089"
non-rust-build = "make build"

@mark-i-m
Member Author

mark-i-m commented Sep 2, 2017

I think #2136 solves this

@Centril added the T-cargo label Dec 6, 2017
@mark-i-m
Member Author

Going to close in favor of #2136 to reduce clutter

7 participants