# Difference between revisions of "Thunk"

Uchchwhash (talk | contribs) (start) |
m (Removed an unnecessary comma.) |
||

(6 intermediate revisions by 4 users not shown) | |||

Line 1: | Line 1: | ||

⚫ | |||

+ | [[Category:Glossary]] |
||

+ | A '''thunk''' is a value that is yet to be evaluated. |
||

+ | It is used in Haskell systems that implement [[non-strict semantics]] by [[lazy evaluation]]. |
||

+ | A lazy run-time system does not evaluate a thunk unless it has to. |
||

⚫ | Expressions are translated into a graph (''not'' a tree, as you might have expected, this enables sharing, and infinite lists!) and a Spineless Tagless G-machine (STG, G-machine is short for graph reduction machine) [https://en.wikipedia.org/wiki/Lambda_calculus#Reduction ''reduces''] it, chucking out any unneeded thunk, unevaluated. |
||

== Why are thunks useful? == |
== Why are thunks useful? == |
||

Line 24: | Line 28: | ||

== When are thunks not so useful? == |
== When are thunks not so useful? == |
||

− | If you keep building up a very |
+ | If you keep building up a very complicated graph to reduce later, it consumes memory (naturally), and can hinder performance, like, (from a blog comment made by Cale Gibbard) |

<haskell> |
<haskell> |
||

foldl (+) 0 [1,2,3] |
foldl (+) 0 [1,2,3] |

## Latest revision as of 20:07, 19 October 2020

A **thunk** is a value that is yet to be evaluated.
It is used in Haskell systems that implement non-strict semantics by lazy evaluation.
A lazy run-time system does not evaluate a thunk unless it has to.
Expressions are translated into a graph (*not* a tree, as you might have expected, this enables sharing, and infinite lists!) and a Spineless Tagless G-machine (STG, G-machine is short for graph reduction machine) *reduces* it, chucking out any unneeded thunk, unevaluated.

## Why are thunks useful?

Well, if you don't need it, why evaluate it? Take for example the "lazy" `&&`

(and) operation. Two boolean expressions joined by `&&`

together is true if and only if both of them are. If you find out that one of them is false, you immediately know the joined expression cannot be true.

```
-- the first is false, so is the answer, don't even need to know what the other is
False && _ = False
-- so the first turns out to be true, hmm...
-- if the second is true, then the result is true
-- if it's false, so is the result
-- in other words, the result is the second!
True && x = x
```

This function only evaluates the first parameter, because that's all that is needed. Even if the first parameter *is* true, you don't need to evaluate the second, so don't (so this version effectively is smarter than the explicit truth table). Who knows, the second parameter may get thrown out later as well!

Perhaps a more convincing example is a (naive but intuitive) algorithm to find out if a given number is prime.

```
-- the non-trivial factors are those who divide the number so no remainder
factors n = filter (\m -> n `mod` m == 0) [2 .. (n - 1)]
-- a number is a prime if it has no non-trivial factors
isPrime n = n > 1 && null (factors n)
```

Fascinatingly, `isPrime`

evaluates to `False`

as soon as it finds a factor (due to the lazy definition of `null`

), and discards the rest of the list.

## When are thunks not so useful?

If you keep building up a very complicated graph to reduce later, it consumes memory (naturally), and can hinder performance, like, (from a blog comment made by Cale Gibbard)

```
foldl (+) 0 [1,2,3]
==> foldl (+) (0 + 1) [2,3]
==> foldl (+) ((0 + 1) + 2) [3]
==> foldl (+) (((0 + 1) + 2) + 3) []
==> ((0 + 1) + 2) + 3
==> (1 + 2) + 3
==> 3 + 3
==> 6
```

and by `==>`

, "is reduced to" is meant. Lucky, the example is not applied to `[1 .. 2^40]`

because it would have taken a very long time to load this page then, and so would your program when run. For more involved (and subtle!) examples, see Performance/Strictness.