Table of Contents

We often move to and from the `[]`

type in Haskell, for example in the transformation `Map k v -> Vector v`

. Luckily, GHC is good at "fusing away" these intermediate lists. One function that's easy to reach for during this process is `Foldable.toList`

, since it is always imported by the `Prelude`

. But we also see that every major container type exports its own variant, like `Map.toList`

, `Vector.toList`

, etc. Similar to my post on Measuring Haskell
Containers Sizes, this made me wonder if `Foldable.toList`

is "safe" (performance wise) in all cases, or if we should trust the variants exported by each module. Let's find out.

`Foldable.toList`

The default implementation of `Foldable.toList`

looks like this:

`build`

? This is where the magic comes from:

```
-- | A list producer that can be fused with 'foldr'.
-- This function is merely
--
-- > build g = g (:) []
--
-- but GHC's simplifier will transform an expression of the form
-- @'foldr' k z ('build' g)@, which may arise after inlining,
-- to @g k z@, which avoids producing an intermediate list.
build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
{-# INLINE [1] build #-}
-- The INLINE is important, even though build is tiny,
-- because it prevents [] getting inlined in the version that
-- appears in the interface file. If [] *is* inlined, it
-- won't match with [] appearing in rules in an importing module.
build g = g (:) []
```

For the purpose of this analysis, if we see `build`

in an implementation, we'll assume that fusion occurs. Otherwise, we'll assume it doesn't. Unless the code comments claim otherwise. **If this assumption is incorrect, please let me
know.**

*Emily Pillmore points out:*

This is an okay assumption to make. It may be correct in some cases, but wrong in others. It's implementation dependent. It also depends on the caller - if one uses: V.fromList $! M.toList m then you may break the fusion rules so even in the case of "yes this fuses", you have to assume you're lazily composing.

- "Same?" means "Is
`Foldable.toList`

defined in terms of the custom`toList`

? - "Fusion?" means that the
`build`

function was used in a`toList`

implementation.

Container | Custom `toList` | `Foldable.toList` | Same? | Fusion? | Notes |
---|---|---|---|---|---|

`NonEmpty` | `O(1)` | `O(1)` | No | N/A | |

`Map` | `O(n)` | `O(n)` | No | Yes? | Implementations nearly identical |

`Set` | `O(n)` | `O(n)` | Yes | Yes? | |

`Seq` | N/A | `O(n)` | N/A | No? | |

`Tree` | `O(n)` | `O(n)` | Yes | Yes | Custom called `flatten` |

`Array` | `O(n)` | `O(n)` | Yes | Yes | Custom called `elems` |

`HashMap` | `O(n)` | `O(n)` | No | Yes | Custom called `elems` |

`HashSet` | `O(n)` | `O(n)` | No | Yes | |

`Vector` | `O(n)` | `O(n)` | Yes | Yes |

`NonEmpty`

What we'd expect:

`Map`

```
-- | /O(n)/. Convert the map to a list of key\/value pairs.
-- Subject to list fusion.
toList :: Map k a -> [(k,a)]
toList = toAscList
toAscList :: Map k a -> [(k,a)]
toAscList = foldrWithKey (\k x xs -> (k,x):xs) []
foldrWithKey :: (k -> a -> b -> b) -> b -> Map k a -> b
foldrWithKey f z = go z
where
go z' Tip = z'
go z' (Bin _ kx x l r) = go (f kx x (go z' r)) l
```

But the `Foldable`

instance advertises proactively that its `toList`

isn't the same.

```
instance Foldable (Map k) where
toList = elems -- NB: Foldable.toList /= Map.toList
elems :: Map k a -> [a]
elems = foldr (:) []
-- | /O(n)/. Fold the values in the map using the given
-- right-associative binary operator, such that
-- @'foldr' f z == 'Prelude.foldr' f z . 'elems'@.
foldr :: (a -> b -> b) -> b -> Map k a -> b
foldr f z = go z
where
go z' Tip = z'
go z' (Bin _ _ x l r) = go (f x (go z' r)) l
```

Notice that this `foldr`

does not claim that it is subject to list fusion, although as far as I can tell by comparing to `foldrWithKey`

, the implementations appear identical. Further, I don't see any call to `build`

.

`Set`

Although `Set`

and `Map`

have a similar internal structure, it seems that `Set`

*does* use its custom `toList`

as its `Foldable.toList`

.

`Seq`

`Seq`

has no custom `toList`

, nor does its `Foldable`

instance provide a custom implementation. This means its `toList`

is based on its `foldr`

:

```
instance Foldable Seq where
foldr f z = foldr (f .# getElem) z .# getSeq
getSeq :: Seq a -> FingerTree (Elem a)
getSeq (Seq xs) = xs
```

The `Foldable`

instance of `FingerTree`

has a large custom `foldr`

, and thus no call to `build`

. This is `O(n)`

, but it's not clear to me that it fuses.

`Tree`

Tree doesn't have a function named `toList`

, but it does have ```
flatten :: Tree a
-> [a]
```

which returns its elements in pre-order.

```
flatten :: Tree a -> [a]
flatten t = squish t []
where
squish (Node x ts) xs = x:Prelude.foldr squish xs ts
```

And look!

So, `toList = flatten`

if you're using any recent GHC. `base-4.8.0`

was bundled with GHC 7.10. Since `Prelude.foldr`

is `Foldable.foldr`

, let's assume this fuses.

`Array`

`Array`

has `elems`

for fetching all its elements as a list:

```
elems :: Array i e -> [e]
elems arr@(Array _ _ n _) =
[e | i <- [0 .. n - 1], e <- unsafeAtA arr i]
instance Foldable (Array i) where
toList = elems
```

This paper (see section 3.2) suggests that raw list comprehensions are subject to fusion, so we can trust this, even though the docstring for `elems`

makes no claim about fusion.

`HashMap`

Recall that `HashMap`

had an interesting story when we were doing the analysis for `Foldable.length`

. Let's see what happens for `toList`

:

```
-- | /O(n)/ Return a list of this map's values.
-- The list is produced lazily.
elems :: HashMap k v -> [v]
elems = L.map snd . toList
toList :: HashMap k v -> [(k, v)]
toList t = build (\ c z -> foldrWithKey (curry c) z t)
foldrWithKey :: (k -> v -> a -> a) -> a -> HashMap k v -> a
foldrWithKey f = go
where
go z Empty = z
go z (Leaf _ (L k v)) = f k v z
go z (BitmapIndexed _ ary) = A.foldr (flip go) z ary
go z (Full ary) = A.foldr (flip go) z ary
go z (Collision _ ary) =
A.foldr (\ (L k v) z' -> f k v z') z ary
```

So, a function named `toList`

exists but it gives us the pairs. As with `Array`

, `elems`

is the "real" version. And look, `build`

is back.

And `Foldable`

?

Right, nearly empty. But since the default `toList`

is defined in terms of `foldr`

, both variants go through `foldrWithKey`

and use `build`

, so we basically end up with the same thing.

`HashSet`

Different from `HashMap`

:

And `foldrWithKey`

comes from `HashMap`

. So as with `length`

, this is the same.

`Vector`

```
-- | /O(n)/ Convert a vector to a list
toList :: Vector a -> [a]
toList = G.toList
toList :: Vector v a => v a -> [a]
toList = Bundle.toList . stream
toList :: Bundle v a -> [a]
-- toList s = unId (M.toList s)
toList s = build (\c n -> toListFB c n s)
-- This supports foldr/build list fusion that GHC implements
toListFB :: (a -> b -> b) -> b -> Bundle v a -> b
toListFB c n M.Bundle{M.sElems = Stream step t} = go t
where
go s = case unId (step s) of
Yield x s' -> x `c` go s'
Skip s' -> go s'
Done -> n
```

`Bundle.toList`

has a strange commented-out previous implementation, but otherwise this looks good. This also claims directly that it supports the list fusion we're looking for.

And for `Foldable`

:

So, it uses the custom implementation so long as you're on any recent GHC.

Most of these structures convert to `[]`

in `O(n)`

, but most also (have evidence to suggest that they) fuse. Overall, I think we've shown that it's **safe to use
Foldable.toList**!

Thanks to Emily Pillmore for elaborating on a number of points.

Blog Archive