In the first part of this series we gave formal definitions for Programming Language and Software Development Language, followed by a list of typical expectations that working software devs have of their tools. In this post, we'll explore to what degree the Rust Language meets those expectations.
With a few Rust projects now under my belt, I can say that Rust is a very strong Software Development Language.
If you have been curious about trying Rust, I highly recommend it.

The sections below don't need to be read in order. Feel free to jump around to what interests you via the Table of Contents above.
Rust is a fantastic language for Software Development. Not because it has or doesn't have Feature X, but because it enables me to efficiently write and maintain software. Here are some specific needs that Rust meets in each of the Five Pillars of Software Development:
Rust projects are managed with the Cargo tool.
cargo new <project-name>
This produces a `Cargo.toml` project config file, a `src/main.rs`, and a `.gitignore`. It also creates an empty git repo. This new project is compilable as-is.
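The generated `src/main.rs` is nothing more than a hello-world stub:

fn main() {
    println!("Hello, world!");
}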
To compile, we typically don't invoke the compiler directly. Instead we use `cargo` here too. Building a project is just:
cargo build
Similarly, `cargo test` will run your tests, and `cargo run` will run your executable.
There is actually no need to frequently recompile: `cargo check` (see also cargo-watch) tells me immediately where and why my code is incorrect without fully compiling. Error messages are clear, well-formatted, and even provide Error Codes that you can query the compiler for to obtain elaborate explanations of the cause of the error.
error[E0728]: `await` is only allowed inside `async` functions and blocks
   --> src/main.rs:250:5
    |
249 | fn use_it() {
    |    ------ this is not `async`
250 |     this_is_special().await;
    |     ^^^^^^^^^^^^^^^^^^^^^^^ only allowed inside `async` functions and blocks

error: aborting due to previous error

For more information about this error, try `rustc --explain E0728`.
Otherwise, Rust provides an official LSP implementation, so you can expect all the conveniences that come with it.
Rust could be called a Post-OOP language (like Golang), in that we're returning to fundamentals. We have structs, enums, functions, loops, and variables. Mutability is available, but is always explicit and not the default. There are no Classes to be seen, but the concept of the Method was accepted and is quite useful for code organization. Exceptions were left out, as was the concept of `null`. Well wait, how do we model errors then?
Rust also paid attention to what the Functional Languages have been doing over the past decades. So enums can hold inner values (aka "Sums of Products" or "ADTs"), we have Pattern Matching, and we have Traits (which can be auto-derived!). Errors are handled explicitly with the `Option` and `Result` types. Further, we can do a kind of "error threading" with the `?` operator, which I believe is a wonderful distillation of the Lesson of Monads:
// If a line with `?` fails, the entire function is short-circuited.
// The `?` and `let` together automatically unwrap the inner value.
fn work() -> anyhow::Result<()> {
let config = cargo_config()?;
release_build()?;
tarball(&config.package)?;
let sha256 = sha256sum(&config.package)?;
let pkgbuild = pkgbuild(&config.package, &sha256);
fs::write("PKGBUILD", pkgbuild)?;
Ok(())
}
Rust is primarily strict, but accepts the Lesson of Laziness in the form of chainable Iterator operations.
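As a small sketch of that (the function and values here are only for illustration), adapters like `map` and `filter` build up a lazy pipeline, and no work happens until a consumer like `sum` asks for the values:

fn laziness() {
    let nums = vec![1, 2, 3, 4, 5];

    // Nothing has been computed yet: `map` and `filter` merely describe
    // the pipeline.
    let pipeline = nums.iter().map(|n| n * 10).filter(|&n| n > 20);

    // Only this consuming call actually walks the data.
    let total: i32 = pipeline.sum();
    assert_eq!(total, 120);
}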
Finally, Rust adds the new concepts of Ownership and Borrowing, which allow for its GC-less automatic memory management.
I find Rust more verbose than Haskell, but less than Golang. Here is what a few common constructs look like:
// Importing
use std::collections::HashMap;
use serde::Deserialize; // From the `serde` crate, needed for the derive below.
// A public struct with private fields. `derive` macro lets us auto-derive Trait
// implementations.
#[derive(Deserialize)]
pub struct User {
name: String,
age: u32,
tall: bool,
}
impl User {
// Public method that borrows the `self` mutably.
pub fn older(&mut self) {
self.age += 1
}
}
// A publicly exposed method with a docstring hyperlinked to other types.
/// Try to extract a position from the `Mess` as a nice integer, as if it
/// were a [`SemVer`](struct.SemVer.html).
pub fn nth(&self, x: usize) -> Option<u32> {
let i = self.chunk.get(x)?;
let (i, n) = parsers::unsigned(i).ok()?;
match i {
"" => Some(n),
_ => None,
}
}
`rustfmt` makes all code layout standard, so say goodbye to style arguments.
Strong types and no `null`. Thanks to Rust's Ownership system, the pitfalls of pointer and memory management in C are long gone. Yes, there is technically `IO` everywhere, but once again Ownership makes this hard to abuse. Special `IO` and `STM` Monads aren't necessary here.
Unit tests go in the file of the functions they're testing (even your `main.rs`!):
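As a minimal sketch (the function here is made up for illustration), a `#[cfg(test)]` module sits right beside the code it tests, and `cargo test` finds it automatically:

fn double(n: u32) -> u32 {
    n * 2
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn doubling_works() {
        assert_eq!(double(2), 4);
    }
}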
You can also add tests to your docstrings inside a markdown ``` block, and `cargo` will detect and run these. This way, your code samples can never drift out of date.
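For example, a docstring like this (on a hypothetical `increment` function in a crate called `my_crate`) is compiled and run by `cargo test`:

/// Adds one to the given number.
///
/// ```
/// assert_eq!(my_crate::increment(1), 2);
/// ```
pub fn increment(n: u32) -> u32 {
    n + 1
}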
A quick note on the wording of this section title: when it comes to the multi-staged-multi-person development of long-lived software, raw executable performance is often not a priority for the business. This is due to a number of factors:
Of course there are fields where executable performance is critical. And at a point, sufficiently bad default performance can noticeably sour a user's experience. Hence the implication of the title: is it easy to accidentally write code that will perform poorly? Some languages punish you for writing them idiomatically, but luckily Rust is not one of them.
A major path to performance in any language is the avoidance of allocation. In Rust, mutability is readily available and hard to screw up:
fn mutability() {
let mut hm = HashMap::new();
hm.insert(1, 'a');
hm.insert(2, 'b');
hm.insert(3, 'c');
// The map is borrowed immutably by the next function, so can still be
// manipulated here. No memory is copied.
use_the_map(&hm);
// We still own the map, so we're free to continue mutating it.
hm.insert(4, 'd');
    // Ownership has now passed to the next function, so the map can no longer
    // be referenced here. It is deallocated automatically when `move_the_map`
    // returns.
move_the_map(hm);
// Won't compile.
// hm.insert(5, 'e');
}
We can also see how memory-conscious Rust is: heap memory is basically never copied without the programmer's consent. Further, by default, Rust puts as much onto the stack as it can. Primitive types are unboxed, and we have fast, compact Array types. Chaining iterator operations in a functional style is idiomatic and compiles to highly optimized code.
The lesson: If you write idiomatic Rust and use standard data structures, you will get good off-the-shelf performance.
GitHub's default Rust Action will have your project built and tested within a few minutes, even without a cache of dependencies. There's even an Action to automatically publish Rust Books.
Foremost, the Rust User Forums. Each question I have asked there was answered in about 15 minutes and by more than one person.
Release announcements and other interesting articles are frequently posted on the official Rust blog. A weekly summary of community developments is also available with the This Week in Rust newsletter.
Haskell and Scala devs will know what I mean by this question. Rust is mostly Rust when it comes to idioms or "sublanguages" introduced by libraries. The exception is the recent addition of the `async` keyword and its associated functionality.
Concurrency was always possible with Rust, and still is without `async`. Want to fork two system threads and share data between them? Go ahead:
use std::sync::{Arc, Mutex};
use std::thread;
fn concurrency() -> thread::Result<u32> {
    // `Arc` is an "Atomically Reference Counted" pointer. Wrapping the
    // `Mutex` in one lets us share it across threads responsibly.
let mutex0 = Arc::new(Mutex::new(0));
let mutex1 = mutex0.clone();
let mutex2 = mutex0.clone();
// Spawn system threads and mutate shared memory.
// No explicit unlock call is necessary.
let handle0 = thread::spawn(move || {
*mutex0.lock().unwrap() += 1;
});
let handle1 = thread::spawn(move || {
*mutex1.lock().unwrap() += 1;
});
// Wait for the threads to complete.
handle0.join()?;
handle1.join()?;
// 2
let result = *mutex2.lock().unwrap();
Ok(result)
}
Want to iterate over a collection in parallel? Go ahead:
use rayon::prelude::*;
fn parallel_iteration() {
let nums = vec![1, 2, 3, 4, 5]; // Could be any size.
// Maps, filters, and prints entirely in parallel with as many CPU cores as
// you have.
nums.par_iter()
.map(|n| n + 1)
.filter(|n| n % 2 == 0)
.for_each(|n| println!("{}", n));
}
Whereas `async` functions look like this:
async fn this_is_special() {
println!("Hello, ");
}
async fn use_it() {
this_is_special().await;
println!("World!");
}
Here, `await` pauses the current function (Task, actually), yields control back to the concurrent runtime so that other Tasks can be run, and resumes eventually once the runtime sees that `this_is_special` has completed. `await` can't be called in a function that isn't itself marked with `async`, so the asyncness spreads, much like `IO` in Haskell.
`async` was added as a way to formalize the creation of highly concurrent applications. However, this was all done at the Trait level: no runtime to manage Tasks / Green Threads was provided by Rust itself. There are currently two main runtimes: async-std and Tokio. Both have growing ecosystems and seem well-adopted.
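As a minimal sketch of what using one looks like (assuming Tokio, with its macro and runtime features enabled in `Cargo.toml`, and reusing `use_it` from above), the runtime attribute turns `main` itself into a Task and drives it to completion:

#[tokio::main]
async fn main() {
    // The Tokio runtime polls this Task (and any it spawns) until done.
    use_it().await;
}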
Libraries are now either "async-compatible" or not, but for libraries that are unconcerned with networking, this is an irrelevant distinction. For many uses of Rust, `async` can be entirely ignored. This also means that the binary weight of the concurrent runtime is entirely left out of such projects.
Rust projects are called "crates" and are found on crates.io. `cargo` manages dependencies for us too, downloading them if missing. Depending on another library looks like:
[dependencies]
anyhow = "1.0"
chrono = { version = "0.4", features = ["serde"] }
counter = "0.5"
Many libraries have extra features that you can optionally activate. The version numbers follow Semantic Versioning, and this is strictly enforced.
Publishing a crate to crates.io is as easy as running `cargo publish`, and the result appears as a page like this. Uploading a new version is the same command. Buggy versions can also be "yanked" off the registry to avoid accidental usage.
Rust docstrings are markdown and render quite nicely. As mentioned above, code samples in a docstring found within a ``` block will be run as a test, and there is no extra configuration necessary to enable this.
All published libraries have docs automatically generated for them. You can also open your project's documentation (with all dependencies too!) locally with `cargo doc --open`. From there, you can search for any type or function name.
Luckily, no. If two of your dependencies require different versions of the same transitive dependency, both will be brought into your binary. In practice this isn't a real problem because:
Release binaries are built with:

cargo build --release

This will recompile all dependencies and activate optimizations. Add the following to your `Cargo.toml` to reduce binary size and further improve performance:
[profile.release]
lto = true
You can also run `strip` on the final binary to reduce its size even more. Here are the stripped binary sizes of a few simple programs:
| Program | Go | Haskell | Rust |
| --- | --- | --- | --- |
| Hello World | 1.4mb | 695kb | 207kb |
| Server with `/` endpoint | 5.2mb | 2.0mb | 1.6mb |
| Simple JSON Server | 5.5mb | 2.5mb | 1.7mb |
And since Rust has no runtime (unlike Go or Haskell), there are no mysterious flags to pass to your executable to have it perform sanely.
For more information on how to reduce Rust binary sizes specifically, see this repo.
No matter the platform, all build-tool commands are the same. To discover what platforms are supported, do:
rustup target list
As of this writing, Rust supports 84 different platforms. Among those we see:
x86_64-apple-darwin
x86_64-pc-windows-gnu
x86_64-pc-windows-msvc
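Cross-compiling to one of these targets is then just a matter of installing it and naming it during the build (assuming a suitable linker for the target is available on your machine):

rustup target add x86_64-pc-windows-gnu
cargo build --release --target x86_64-pc-windows-gnu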
So Rust isn't Linux-only by any stretch of the imagination.
Rust was specifically designed not to crash for the usual reasons we encounter:

- Memory misuse
- Unhandled exceptions
- `null` dereferences
- Partial functions (a la Haskell's `head []`)

All of these are to a varying degree due to programmer negligence, and every language takes a stance (or a non-stance) on how to address them.
Garbage Collectors are convenient but can't save you from leaks or runaway processes that allocate more and more memory. Rust memory is freed as soon as it is no longer needed, the timing for which is known at compile time, not at runtime as with a GC.
Rust has no runtime at all, so there is no worry of your process hitting an arbitrary memory cap as with, say, the JVM.
Like Golang, Rust doesn't have Exceptions. Also like Golang, it does have "panics", which are errors that should, morally, never be recovered from. It is a convention of documentation to warn a user if a function can panic, but in general panics should only occur in truly exceptional situations. Otherwise, all errors are modelled with the `Option` and `Result` types, even IO errors that other languages throw Exceptions for.
There are certain operations which Rust names as being unsafe. These are usually impossible to perform unless marked by the `unsafe` keyword. Sometimes you'll have a legitimate reason to do this, but most of the time you won't need to.
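A small sketch: dereferencing a raw pointer is one such operation, and the compiler refuses to do it outside of an `unsafe` block:

fn raw_pointer() {
    let n = 5;
    let p = &n as *const i32;

    // let m = *p;          // Won't compile: raw pointers need `unsafe`.
    let m = unsafe { *p };  // Allowed, but the responsibility is now ours.
    assert_eq!(m, 5);
}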
Otherwise, the Ownership system is what makes it virtually impossible to misuse memory. Even higher-level constructs like file handles, mutexes, and database connections can't be reused after they've been relinquished. Code that attempts to do so won't even compile.
Sums-of-products with named fields can't be directly referenced as they can in Haskell:
enum Colour {
Red { a: bool },
Blue { b: u32 },
Green { c: char },
}
fn bad(colour: Colour) -> u32 {
colour.b // Won't compile.
}
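Instead, the compiler makes you pattern match, which forces every variant to be considered:

fn good(colour: Colour) -> u32 {
    match colour {
        Colour::Blue { b } => b,
        // The other variants carry no `u32`, so we fall back to a default.
        Colour::Red { .. } | Colour::Green { .. } => 0,
    }
}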
But types like `Option` and `Result` do still have an `unwrap` method that panics in the error case. You're generally encouraged to use `?`, pattern match on the type, or call `unwrap_or` instead. Likewise, types like `Vec` offer methods like `get_unchecked` for when you're very confident that you can avoid the `Option` wrapping.
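A quick sketch of the difference:

fn unwrapping(opt: Option<u32>) -> u32 {
    // `opt.unwrap()` would panic on `None`. `unwrap_or` falls back to a
    // default value instead.
    opt.unwrap_or(0)
}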
This is one of the most important aspects of development when considering software intended to last decades. As I described in another article, a language's ecosystem can "leave you behind" if you wait too long to upgrade your toolchain / dependencies.
Rust has three release channels (nightly, beta, and stable) and has frequent releases. They call this their "train schedule". Further, every three years a new "Edition" is released which, breaking or not, allows the Rust team to look back, summarize the changes, and segregate language idioms. Which edition of Rust you're using is specified in your project's `Cargo.toml`, so this is never a surprise:
[package]
name = "foo"
version = "0.1.0"
edition = "2018"
New compiler/toolchain versions are also simple to upgrade to:
rustup update stable
Updating your compiler will require that you recompile whatever projects you're currently working on. Rust follows SemVer, and for the time being there are no plans to bump the major version, so updates are harmless. Unless, of course, your code depends on unsoundness, in which case it should never have compiled, and you can't complain about it being fixed.
Thanks to Semver, code that compiled once should always compile, since compatible versions of dependencies will always be fetched. Even a "yanked" version of a crate can still be downloaded by projects that were already using it. Yanking only prevents new projects from depending on the bad version.
Note also that the compiler has a CI system that runs the test suites of all crates on crates.io to look for regressions. In theory, a change to the compiler that would fundamentally break your library should be seen a long way off.
Old executables can break from underneath you if system libraries that they dynamically link to change. Rust binaries are mostly statically linked, but our friend `libc` is always hanging around:
> ldd setwall
        linux-vdso.so.1 (0x00007ffe253ea000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f77a6eaa000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f77a6e88000)
        libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f77a6e82000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007f77a6e68000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f77a70fd000)
Fortunately, Rust projects can be compiled with MUSL to be fully statically linked:
> cargo build --release --target x86_64-unknown-linux-musl
> cd target/x86_64-unknown-linux-musl/release/
> ldd setwall
        not a dynamic executable
Because of good namespacing, all symbols and function names can be given clear, logical names without the need for mangling to ensure uniqueness:
struct Foo {
a: u32,
b: bool,
c: String,
}
struct Bar {
a: bool,
b: String,
c: u32,
}
enum Colour {
Red,
Green,
Blue,
}
enum Light {
Red,
Green,
Blue,
}
These same-namings cause no compilation problems. This is par-for-the-course for many languages, but Haskellers would appreciate this.
Further, `rustfmt` output is optimized for clean diffs. This sometimes makes code longer (top-to-bottom) than it otherwise could be, but small diffs improve the experience of code reviewers.
Rust has the strongest dead-code analysis that I've seen, and it is a first-class feature of the compiler.
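Even a plain `cargo check` flags unused items (a sketch; the exact wording of the warning varies by compiler version):

fn used() -> u32 {
    1
}

// warning: function is never used: `unused`
fn unused() -> u32 {
    2
}

fn main() {
    println!("{}", used());
}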
There are also many tools to analyse one's dependencies for deprecations, bloat, or usage of `unsafe` in transitive dependencies.

I try not to "fanboy" when it comes to languages. As someone who creates software, I have a set of needs. If those needs are met, I like the language. If I discover that another language meets them better, I move on.
Rust is a serious tool for Software Development, and not because of its language features, its performance, or how it looks. It's the entire package, and I see myself enjoying it for some time.
In the next addition to this series, we'll analyse Haskell in the same way.
Thanks to ssokolow for mentioning a number of dependency tools.
1. While Rust does have a `LinkedList` type, its use is not common. `Vec` is preferred.
2. `streaming`, `pipes`, `conduit`, etc.
3. Rust's equivalent of the `NumericUnderscores` syntax is enabled by default.
4. Nothing quite like `GeneralizedNewtypeDeriving` exists, although the derive_more crate offers a solution at the library level.