In the first part of this series we gave formal definitions for Programming Language and Software Development Language, followed by a list of typical expectations that working software devs have of their tools. In this post, we'll explore to what degree the Rust Language meets those expectations.
With a few Rust projects now under my belt, I can say that Rust is a very strong Software Development Language.
If you have been curious about trying Rust, I highly recommend it.

The sections below don't need to be read in order. Feel free to jump around to what interests you via the Table of Contents above.
Rust is a fantastic language for Software Development. Not because it has or doesn't have Feature X, but because it enables me to efficiently write and maintain software. Here are some specific needs that Rust meets in each of the Five Pillars of Software Development:
Rust projects are managed with the Cargo tool.
cargo new <project-name>
This produces a `Cargo.toml` project config file, a `src/main.rs`, and a `.gitignore`. It also creates an empty git repo. This new project is compilable as-is.
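The generated `src/main.rs` is nothing more than a hello-world stub:

fn main() {
    println!("Hello, world!");
}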
To compile, we typically don't invoke the compiler directly. Instead we use `cargo` here too. Building a project is just:
cargo build
Similarly, `cargo test` will run your tests, and `cargo run` will run your executable.
There is actually no need to frequently recompile: `cargo check` (see also cargo-watch) tells me immediately where and why my code is incorrect without fully compiling. Error messages are clear, well-formatted, and even provide Error Codes that you can query the compiler for to obtain elaborate explanations of the cause of the error.
error[E0728]: `await` is only allowed inside `async` functions and blocks
   --> src/main.rs:250:5
    |
249 | fn use_it() {
    |    ------ this is not `async`
250 |     this_is_special().await;
    |     ^^^^^^^^^^^^^^^^^^^^^^^ only allowed inside `async` functions and blocks

error: aborting due to previous error

For more information about this error, try `rustc --explain E0728`.
Otherwise, Rust provides an official LSP implementation, so you can expect all the conveniences that come with it.
Rust could be called a Post-OOP language (like Golang), in that we're returning to fundamentals. We have structs, enums, functions, loops, and variables. Mutability is available, but is always explicit and not the default. There are no Classes to be seen, but the concept of the Method was accepted and is quite useful for code organization. Exceptions were left out, as was the concept of `null`. Well wait, how do we model errors then?
Rust also paid attention to what the Functional Languages have been doing over the past decades. So enums can hold inner values (aka "Sums of Products" or "ADTs"), we have Pattern Matching, and we have Traits (which can be auto-derived!). Errors are handled explicitly with the `Option` and `Result` types. Further, we can do a kind of "error threading" with the `?` operator, which I believe is a wonderful distillation of the Lesson of Monads:
// If a line with `?` fails, the entire function is short-circuited.
// The `?` and `let` together automatically unwrap the inner value.
fn work() -> anyhow::Result<()> {
let config = cargo_config()?;
release_build()?;
tarball(&config.package)?;
let sha256 = sha256sum(&config.package)?;
let pkgbuild = pkgbuild(&config.package, &sha256);
fs::write("PKGBUILD", pkgbuild)?;
Ok(())
}
Rust is primarily strict, but accepts the Lesson of Laziness in the form of chainable Iterator operations.
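As a small sketch of that (the function and values here are only for illustration), adapters like `map` and `filter` build up a lazy pipeline, and no work happens until a consumer like `sum` asks for the values:

fn laziness() {
    let nums = vec![1, 2, 3, 4, 5];

    // Nothing has been computed yet: `map` and `filter` merely describe
    // the pipeline.
    let pipeline = nums.iter().map(|n| n * 10).filter(|&n| n > 20);

    // Only this consuming call actually walks the data.
    let total: i32 = pipeline.sum();
    assert_eq!(total, 120);
}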
Finally, Rust adds the new concepts of Ownership and Borrowing, which allow for its GC-less automatic memory management.
I find Rust more verbose than Haskell, but less than Golang. Here is what a few common constructs look like:
// Importing
use std::collections::HashMap;
use serde::Deserialize; // From the `serde` crate, needed for the derive below.
// A public struct with private fields. `derive` macro lets us auto-derive Trait
// implementations.
#[derive(Deserialize)]
pub struct User {
name: String,
age: u32,
tall: bool,
}
impl User {
// Public method that borrows the `self` mutably.
pub fn older(&mut self) {
self.age += 1
}
}
// A publicly exposed method with a docstring hyperlinked to other types.
/// Try to extract a position from the `Mess` as a nice integer, as if it
/// were a [`SemVer`](struct.SemVer.html).
pub fn nth(&self, x: usize) -> Option<u32> {
let i = self.chunk.get(x)?;
let (i, n) = parsers::unsigned(i).ok()?;
match i {
"" => Some(n),
_ => None,
}
}
`rustfmt` makes all code layout standard, so say goodbye to style arguments.
Strong types and no `null`. Thanks to Rust's Ownership system, the pitfalls of pointer and memory management in C are long gone. Yes, there is technically `IO` everywhere, but once again Ownership makes this hard to abuse. Special `IO` and `STM` Monads aren't necessary here.
Unit tests go in the file of the functions they're testing (even your `main.rs`!):
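As a minimal sketch (the function here is made up for illustration), a `#[cfg(test)]` module sits right beside the code it tests, and `cargo test` finds it automatically:

fn double(n: u32) -> u32 {
    n * 2
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn doubling_works() {
        assert_eq!(double(2), 4);
    }
}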
You can also add tests to your docstrings inside a markdown ``` block, and `cargo` will detect and run these. This way, your code samples can never drift out of date.
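For example, a docstring like this (on a hypothetical `increment` function in a crate called `my_crate`) is compiled and run by `cargo test`:

/// Adds one to the given number.
///
/// ```
/// assert_eq!(my_crate::increment(1), 2);
/// ```
pub fn increment(n: u32) -> u32 {
    n + 1
}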
A quick note on the wording of this section title: when it comes to the multi-staged-multi-person development of long-lived software, raw executable performance is often not a priority for the business. This is due to a number of factors:
Of course there are fields where executable performance is critical. And at a point, sufficiently bad default performance can noticeably sour a user's experience. Hence the implication of the title: is it easy to accidentally write code that will perform poorly? Some languages punish you for writing them idiomatically, but luckily Rust is not one of them.
A major path to performance in any language is the avoidance of allocation. In Rust, mutability is readily available and hard to screw up:
fn mutability() {
let mut hm = HashMap::new();
hm.insert(1, 'a');
hm.insert(2, 'b');
hm.insert(3, 'c');
// The map is borrowed immutably by the next function, so can still be
// manipulated here. No memory is copied.
use_the_map(&hm);
// We still own the map, so we're free to continue mutating it.
hm.insert(4, 'd');
    // Ownership has now passed to the next function, so the map can no longer
    // be referenced here. It is deallocated automatically when `move_the_map`
    // returns.
move_the_map(hm);
// Won't compile.
// hm.insert(5, 'e');
}
We can also see how memory-conscious Rust is: heap memory is basically never copied without the programmer's consent. Further, by default, Rust puts as much onto the stack as it can. Primitive types are unboxed, and we have fast, compact Array types. Chaining iterator operations in a functional style is idiomatic and compiles to highly optimized code.
The lesson: If you write idiomatic Rust and use standard data structures, you will get good off-the-shelf performance.
GitHub's default Rust Action will have your project built and tested within a few minutes, even without a cache of dependencies. There's even an Action to automatically publish Rust Books.
Foremost, the Rust User Forums. Each question I have asked there was answered in about 15 minutes and by more than one person.
Release announcements and other interesting articles are frequently posted on the official Rust blog. A weekly summary of community developments is also available with the This Week in Rust newsletter.
Haskell and Scala devs will know what I mean by this question. Rust is mostly Rust when it comes to idioms or "sublanguages" introduced by libraries. The exception is the recent addition of the `async` keyword and its associated functionality.
Concurrency was always possible with Rust, and still is without `async`. Want to fork two system threads and share data between them? Go ahead:
use std::sync::{Arc, Mutex};
use std::thread;
fn concurrency() -> thread::Result<u32> {
    // `Arc` is an "Atomically Reference Counted" pointer. Wrapping the
    // `Mutex` in one lets us share it across threads responsibly.
let mutex0 = Arc::new(Mutex::new(0));
let mutex1 = mutex0.clone();
let mutex2 = mutex0.clone();
// Spawn system threads and mutate shared memory.
// No explicit unlock call is necessary.
let handle0 = thread::spawn(move || {
*mutex0.lock().unwrap() += 1;
});
let handle1 = thread::spawn(move || {
*mutex1.lock().unwrap() += 1;
});
// Wait for the threads to complete.
handle0.join()?;
handle1.join()?;
// 2
let result = *mutex2.lock().unwrap();
Ok(result)
}
Want to iterate over a collection in parallel? Go ahead:
use rayon::prelude::*;
fn parallel_iteration() {
let nums = vec![1, 2, 3, 4, 5]; // Could be any size.
// Maps, filters, and prints entirely in parallel with as many CPU cores as
// you have.
nums.par_iter()
.map(|n| n + 1)
.filter(|n| n % 2 == 0)
.for_each(|n| println!("{}", n));
}
Whereas `async` functions look like this:
async fn this_is_special() {
println!("Hello, ");
}
async fn use_it() {
this_is_special().await;
println!("World!");
}
Here, `await` pauses the current function (Task, actually), yields control back to the concurrent runtime so that other Tasks can be run, and resumes eventually once the runtime sees that `this_is_special` has completed. `await` can't be called in a function that isn't itself marked with `async`, so the asyncness spreads, much like `IO` in Haskell.
`async` was added as a way to formalize the creation of highly concurrent applications. However, this was all done at the Trait level: no runtime to manage Tasks / Green Threads was provided by Rust itself. There are currently two main runtimes: async-std and Tokio. Both have growing ecosystems and seem well-adopted.
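As a minimal sketch of what using one looks like (assuming Tokio, with its macro and runtime features enabled in `Cargo.toml`, and reusing `use_it` from above), the runtime attribute turns `main` itself into a Task and drives it to completion:

#[tokio::main]
async fn main() {
    // The Tokio runtime polls this Task (and any it spawns) until done.
    use_it().await;
}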
Libraries are now either "async-compatible" or not, but for libraries that are unconcerned with networking, this is an irrelevant distinction. For many uses of Rust, `async` can be entirely ignored. This also means that the binary weight of the concurrent runtime is entirely left out of such projects.
Rust projects are called "crates" and are found on crates.io. `cargo` manages dependencies for us too, downloading them if missing. Depending on another library looks like:
[dependencies]
anyhow = "1.0"
chrono = { version = "0.4", features = ["serde"] }
counter = "0.5"
Many libraries have extra features that you can optionally activate. The version numbers follow Semantic Versioning, and this is strictly enforced.
Publishing a crate to crates.io is as easy as running `cargo publish`, and the result appears as a page like this. Uploading a new version is the same command. Buggy versions can also be "yanked" off the registry to avoid accidental usage.
Rust docstrings are markdown and render quite nicely. As mentioned above, code samples in a docstring found within a ``` block will be run as a test, and there is no extra configuration necessary to enable this.
All published libraries have docs automatically generated for them. You can also open your project's documentation (with all dependencies too!) locally with `cargo doc --open`. From there, you can search for any type or function name.
Luckily, no. If two of your dependencies require different versions of the same transitive dependency, both will be brought into your binary. In practice this isn't a real problem because:
Release binaries are built with:

cargo build --release

This will recompile all dependencies and activate optimizations. Add the following to your `Cargo.toml` to reduce binary size and further improve performance:
[profile.release]
lto = true
You can also run `strip` on the final binary to reduce its size even more. Here are the stripped binary sizes of a few simple programs:
| Program | Go | Haskell | Rust |
| --- | --- | --- | --- |
| Hello World | 1.4mb | 695kb | 207kb |
| Server with `/` endpoint | 5.2mb | 2.0mb | 1.6mb |
| Simple JSON Server | 5.5mb | 2.5mb | 1.7mb |
And since Rust has no runtime (unlike Go or Haskell), there are no mysterious flags to pass to your executable to have it perform sanely.
For more information on how to reduce Rust binary sizes specifically, see this repo.
No matter the platform, all build-tool commands are the same. To discover what platforms are supported, do:
rustup target list
As of this writing, Rust supports 84 different platforms. Among those we see:
x86_64-apple-darwin
x86_64-pc-windows-gnu
x86_64-pc-windows-msvc
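Cross-compiling to one of these targets is then just a matter of installing it and naming it during the build (assuming a suitable linker for the target is available on your machine):

rustup target add x86_64-pc-windows-gnu
cargo build --release --target x86_64-pc-windows-gnu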
So Rust isn't Linux-only by any stretch of the imagination.
Rust was specifically designed not to crash for the usual reasons we encounter:

- Memory misuse
- Unhandled exceptions
- `null` dereferences
- Partial functions (a la Haskell's `head []`)

All of these are to a varying degree due to programmer negligence, and every language takes a stance (or a non-stance) on how to address them.
Garbage Collectors are convenient but can't save you from leaks or runaway processes that allocate more and more memory. Rust memory is freed as soon as it is no longer needed, the timing for which is known at compile time, not at runtime as with a GC.
Rust has no runtime at all, so there is no worry of your process hitting an arbitrary memory cap as with, say, the JVM.
Like Golang, Rust doesn't have Exceptions. Also like Golang, it does have "panics", which are errors that should, morally, never be recovered from. It is a convention of documentation to warn a user if a function can panic, but in general panics should only occur in truly exceptional situations. Otherwise, all errors are modelled with the `Option` and `Result` types, even IO errors that other languages throw Exceptions for.
There are certain operations which Rust names as being unsafe. These are usually impossible to perform unless marked by the `unsafe` keyword. Sometimes you'll have a legitimate reason to do this, but most of the time you won't need to.
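A small sketch: dereferencing a raw pointer is one such operation, and the compiler refuses to do it outside of an `unsafe` block:

fn raw_pointer() {
    let n = 5;
    let p = &n as *const i32;

    // let m = *p;          // Won't compile: raw pointers need `unsafe`.
    let m = unsafe { *p };  // Allowed, but the responsibility is now ours.
    assert_eq!(m, 5);
}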
Otherwise, the Ownership system is what makes it virtually impossible to misuse memory. Even higher-level constructs like file handles, mutexes, and database connections can't be reused after they've been relinquished. Code that attempts to do so won't even compile.
Sums-of-products with named fields can't be directly referenced as they can in Haskell:
enum Colour {
Red { a: bool },
Blue { b: u32 },
Green { c: char },
}
fn bad(colour: Colour) -> u32 {
colour.b // Won't compile.
}
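Instead, the compiler makes you pattern match, which forces every variant to be considered:

fn good(colour: Colour) -> u32 {
    match colour {
        Colour::Blue { b } => b,
        // The other variants carry no `u32`, so we fall back to a default.
        Colour::Red { .. } | Colour::Green { .. } => 0,
    }
}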
But types like `Option` and `Result` do still have an `unwrap` method that panics in the error case. You're generally encouraged to use `?`, pattern match on the type, or call `unwrap_or` instead. Likewise, types like `Vec` offer methods like `get_unchecked` for when you're very confident that you can avoid the `Option` wrapping.
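A quick sketch of the difference:

fn unwrapping(opt: Option<u32>) -> u32 {
    // `opt.unwrap()` would panic on `None`. `unwrap_or` falls back to a
    // default value instead.
    opt.unwrap_or(0)
}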
This is one of the most important aspects of development when considering software intended to last decades. As I described in another article, a language's ecosystem can "leave you behind" if you wait too long to upgrade your toolchain / dependencies.
Rust has three release channels (nightly, beta, and stable) and has frequent releases. They call this their "train schedule". Further, every three years a new "Edition" is released which, breaking or not, allows the Rust team to look back, summarize the changes, and segregate language idioms. Which edition of Rust you're using is specified in your project's `Cargo.toml`, so this is never a surprise:
[package]
name = "foo"
version = "0.1.0"
edition = "2018"
New compiler/toolchain versions are also simple to upgrade to:
rustup update stable
Updating your compiler will require that you recompile whatever projects you're currently working on. Rust follows SemVer, and for the time being there are no plans to bump the major version, so updates are harmless. Unless, of course, your code depends on unsoundness, in which case it should never have compiled, and you can't complain about it being fixed.
Thanks to Semver, code that compiled once should always compile, since compatible versions of dependencies will always be fetched. Even a "yanked" version of a crate can still be downloaded by projects that were already using it. Yanking only prevents new projects from depending on the bad version.
Note also that the compiler has a CI system that runs the test suites of all crates on crates.io to look for regressions. In theory, a change to the compiler that would fundamentally break your library should be seen a long way off.
Old executables can break from underneath you if system libraries that they dynamically link to change. Rust binaries are mostly statically linked, but our friend `libc` is always hanging around:
> ldd setwall
        linux-vdso.so.1 (0x00007ffe253ea000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f77a6eaa000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f77a6e88000)
        libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f77a6e82000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007f77a6e68000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f77a70fd000)
Fortunately, Rust projects can be compiled with MUSL to be fully statically linked:
> cargo build --release --target x86_64-unknown-linux-musl
> cd target/x86_64-unknown-linux-musl/release/
> ldd setwall
        not a dynamic executable
Because of good namespacing, all symbols and function names can be given clear, logical names without the need for mangling to ensure uniqueness:
struct Foo {
a: u32,
b: bool,
c: String,
}
struct Bar {
a: bool,
b: String,
c: u32,
}
enum Colour {
Red,
Green,
Blue,
}
enum Light {
Red,
Green,
Blue,
}
These same-namings cause no compilation problems. This is par-for-the-course for many languages, but Haskellers would appreciate this.
Further, `rustfmt` output is optimized for clean diffs. This sometimes makes code longer (top-to-bottom) than it otherwise could be, but small diffs improve the experience of code reviewers.
Rust has the strongest dead-code analysis that I've seen, and it is a first-class feature of the compiler.
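Even a plain `cargo check` flags unused items (a sketch; the exact wording of the warning varies by compiler version):

fn used() -> u32 {
    1
}

// warning: function is never used: `unused`
fn unused() -> u32 {
    2
}

fn main() {
    println!("{}", used());
}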
There are also many tools to analyse one's dependencies for deprecations, bloat, or usage of `unsafe` in transitive dependencies.

I try not to "fanboy" when it comes to languages. As someone who creates software, I have a set of needs. If those needs are met, I like the language. If I discover that another language meets them better, I move on.
Rust is a serious tool for Software Development, and not because of its language features, its performance, or how it looks. It's the entire package, and I see myself enjoying it for some time.
In the next addition to this series, we'll analyse Haskell in the same way.
Thanks to ssokolow for mentioning a number of dependency tools.
1. While Rust does have a `LinkedList` type, its use is not common. `Vec` is preferred.
2. `streaming`, `pipes`, `conduit`, etc.
3. Rust's equivalent of the `NumericUnderscores` syntax is enabled by default.
4. Nothing quite like `GeneralizedNewtypeDeriving` exists, although the derive_more crate offers a solution at the library level.