Queenside Castling on Cardano: Bitwise Operations for Better Solutions and Improved Plutus Scripts
Introduction
Currently, MLabs is undertaking the work of bringing bitwise primitive operations to Plutus Core, specifically CIP-122 and CIP-123. As part of this work, we were asked to provide some benchmarks demonstrating how such primitive operations can be applied. We decided that one of these should be a backtracking solver for the 8-queens problem. While designing and implementing this, we came across an interesting solution that we thought would be fun to share and demonstrate.
In this article, we will describe two solutions to the n-queens problem, both using operations over a bit-oriented structure. Our first solution will be intuitive, while our second will demonstrate how we can improve on our first attempt, both in theory and in practice. We will also cover some interesting topics relating to bitwise operations, mutable operations, and how we can represent bit-packed structures, specifically bitvectors.
No knowledge of Plutus or Plutus Core is required to enjoy this article. We refer to some intermediate Haskell concepts (most notably the use of ST and mutability), as well as some light asymptotic analysis, but our goal is to make this article accessible to all Haskellers.
The problem
As the title suggests, we're going to talk about the venerable n-queens problem. To recap, the problem statement is this:
Given a dimension n, give n sets of row-column coordinates on an n by n chessboard such that, if a queen is placed at each such position, no two queens would threaten each other.
This problem has been used as a puzzle, as well as a demonstration of backtracking performance, for a considerable time[1]. While completely analytical solutions exist, which technically make backtracking unnecessary, we will work on a backtracking-oriented solution here.
Furthermore, we're going to take a bitwise approach, making use of (a flavour of) rank-select dictionary to help us. Specifically, we will use such a structure to keep track of our search state while demonstrating (both theoretically and practically) how this can be beneficial. To keep the presentation manageable (as rank-select dictionaries are quite a deep topic), we will use a simplified presentation of the relevant operations here.
Basics
Before we present any solutions, we cover a few basic ideas that we will use throughout this article. We assume all indexing is zero-based. Furthermore, to make the presentation less fiddly, we will assume that any n we are given is an exact multiple of 8[2].
Backtracking and n-queens
We make the following observation:
Any valid n-queens solution will use each row and column exactly once.
To see why this is the case, we first note that if a row or column is not used, then since we have to place n queens, we would have to use some other row or column more than once. Furthermore, if we placed two (or more) queens in the same row or column, our solution would immediately fail, as they'd threaten each other.
Based on this observation, we can follow a row-by-row approach that does something similar to this:
1. Try placing a queen somewhere in the current row; if you can't, backtrack.
2. Mark all positions the queen from step 1 would threaten in the rows below.
3. Repeat for the row below; if you get backtracked to, unmark, then try another position. If you can't find one, backtrack again.
Vital to this approach is the ability to track the state of our search. Specifically, we need to know:
- What choices of placement we have at the current row;
- What positions are 'marked'; and
- What choice ordinal (first, second, etc.) we're up to on the current row.
This information must be modifiable, but also restorable: every time we backtrack, we may need to 'undo' modifications, potentially all the way back to the start. We also want to use as little space as possible to represent this information: more space used will mean more work for us every time we backtrack or commit.
Bitwise
Throughout, we will use a bitvector data structure. We can think of a bitvector as an array of individual bits packed as tightly together as possible. Such structures have a range of applications, but in our case, we will use them as extremely compact arrays of Bools. There are numerous reasons why this is beneficial: not only is the required memory pretty much as low as possible (indeed, bitvectors are an implicit data structure), it is also packed densely in memory, making it cache-friendly and efficient.
There is a range of data types provided by Haskell libraries that could be used to implement a bitvector type:

- Strict ByteString from bytestring
- ByteArray from primitive (or its corresponding unboxed variant from base)
- UArray i w from array, where w is any type with a FiniteBits instance
- Unboxed Vectors from vector, of any type with a FiniteBits instance (or of Bit from bitvec)
- Storable versions of all the above
- massiv array versions of all the above
We will use ByteArray for our description, for three reasons: firstly, it is now available in base (although its operations still reside in primitive for now); secondly, it has a small and simple API, which helps our presentation; and thirdly, it has a mutable counterpart in MutableByteArray, which we can use in ST, making what we want to do clearer.
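For the code snippets that follow, we assume a preamble roughly along these lines. The exact extensions and module layout are our own choice rather than something spelled out in the article, but every BA-qualified function used below lives in Data.Primitive.ByteArray:

{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TupleSections #-}

import Control.Monad.ST (ST, runST)
import Data.Bits (bit, complement, countTrailingZeros, popCount, testBit, unsafeShiftL, unsafeShiftR, (.&.), (.|.))
import Data.Foldable (for_, traverse_)
import Data.Kind (Type)
import Data.List (foldl')
import Data.Primitive.ByteArray (ByteArray, MutableByteArray)
import qualified Data.Primitive.ByteArray as BA
import Data.Word (Word8)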
Defining this type is straightforward:
newtype Bitvector = Bitvector ByteArray
The next choice we have to make is the bit indexing scheme. This is because ByteArray is oriented around byte accesses, not bit accesses, and we have several choices in how we can access individual bits in a packed, byte-oriented structure. We use a 'right-to-left' indexing scheme; namely, the bit at position 0 is the rightmost (the least significant bit of the last byte), and indexes proceed leftwards (increasing significance within bytes, decreasing byte index across them). For an example, consider the Bitvector whose representation is a length-2 ByteArray with (byte) index 0 being 0x8F and (byte) index 1 being 0x37. Then, its bit indexes will look as follows:
Bit indexes | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
Bit values | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 |
Byte indexes | 0 | 1 |
We make this choice primarily because, in this setting, bit indexing within a Bitvector agrees with bit indexing for bytes (by way of the Bits type class), and because consecutive bit indexes within a byte are consistent with consecutive bit indexes across bytes. This will make certain operations easier to define. Unfortunately, this choice forces us to 'run backwards' relative to byte indexes, but since we don't need to access whole bytes of a Bitvector as part of our required API, we won't worry about it.
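To make the scheme concrete, here is a small read-only helper, purely illustrative and not part of the API we need later, using the same byte/bit split that the later operations rely on:

-- Illustrative only: look up the bit at a given index. The byte index runs
-- 'backwards' (from the end of the array), while the bit index within a byte
-- follows the usual Bits convention.
readBitvectorBit :: Bitvector -> Int -> Bool
readBitvectorBit (Bitvector ba) ix =
  let !len = BA.sizeofByteArray ba
      !bigIx = len - (ix `quot` 8) - 1
      !smallIx = ix `rem` 8
      byte :: Word8
      byte = BA.indexByteArray ba bigIx
   in testBit byte smallIx

For the example above, this would report index 3 as clear and index 15 as set, matching the table.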
We define one construction operation for Bitvectors, which produces a Bitvector storing n bits, all of which are clear. Due to our restriction to exact multiples of 8 for n-queens dimensions, we won't worry about being given a length that's not a multiple of 8.
zeroes :: Int -> Bitvector
zeroes n = runST $ do
  let !len = n `quot` 8
  result <- BA.newByteArray len
  BA.fillByteArray result 0 len 0x00
  Bitvector <$> BA.unsafeFreezeByteArray result
We will use this pattern of mutable allocation, followed by mutable operations, followed by freezing, throughout this article. We do this inside of ST, as this allows us to minimize the amount of copying we need to do, as well as giving us control over when any copying occurs. This is convenient with MutableByteArray, which works in any PrimMonad, a capability-based type class indicating that we have access to mutable state. The processing itself just involves writing a zero byte to every position.
'Freezing' in this context refers to converting from a mutable to an immutable representation of the same data. There is a reverse counterpart to this operation called 'thawing' that converts an immutable representation to a mutable one. Each comes in two variants: a 'safe' one (which copies the memory before converting it), and an 'unsafe' one (which doesn't). Which to use depends on whether the memory being operated on could be aliased (namely, someone other than our function would be able to see its changes). In this case, we use the 'unsafe' freeze operation, as the memory is newly allocated. Nobody else can have a reference to it.
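For reference, here are the four conversion operations from Data.Primitive.ByteArray, with types specialized to ST for clarity (in the library they are polymorphic over any PrimMonad); the safe variants take an offset and a length because they copy a slice:

-- Safe (copying) versus unsafe (non-copying) conversions, specialized to ST:
--
--   BA.freezeByteArray       :: MutableByteArray s -> Int -> Int -> ST s ByteArray
--   BA.thawByteArray         :: ByteArray -> Int -> Int -> ST s (MutableByteArray s)
--   BA.unsafeFreezeByteArray :: MutableByteArray s -> ST s ByteArray
--   BA.unsafeThawByteArray   :: ByteArray -> ST s (MutableByteArray s)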
One key operation we will need is the ability to set the bit at an index (that is, make the bit at that position into a 1). Because Bitvector is an immutable type, we don't modify it in place: instead, we return a new copy with the change applied.
setBit :: Bitvector -> Int -> Bitvector
setBit (Bitvector ba) ix = runST $ do
  let !len = BA.sizeofByteArray ba
  thawed <- BA.thawByteArray ba 0 len
  let !bigIx = len - (ix `quot` 8) - 1
  let !smallIx = ix `rem` 8
  !originalByte :: Word8 <- BA.readByteArray thawed bigIx
  let !newByte = originalByte .|. bit smallIx
  BA.writeByteArray thawed bigIx newByte
  Bitvector <$> BA.unsafeFreezeByteArray thawed
This operation is similar in structure to zeroes, except instead of allocating new memory, we use a 'safe' thaw to first copy our input Bitvector, then mark that copy as mutable. This is exactly what we want, as then the original Bitvector we passed as an argument won't be modified. To set the bit at the given index, we do the following:
1. Split the index into a 'byte part' (bigIx) and a 'bit part' (smallIx). The bigIx will need adjustment, as it will be 'from the end'.
2. Read the byte at bigIx from the original Bitvector.
3. Use a bitwise OR, combined with a single-bit mask based on smallIx, to set the required bit in the byte read at step 2.
4. Write back the result from step 3 to the mutable copy of our original Bitvector at bigIx.
This operation requires a copy of the entire input Bitvector. We may sometimes want to set many bits in a row, which means an operation like this would make sense to have:
setBits :: Bitvector -> [Int] -> Bitvector
We could implement it as a fold:
setBits = foldl' setBit
However, this would be extremely inefficient: for a bitvector of length n and a list of m changes, we would have to copy the bitvector m times (for a total of θ(mn) work), only to throw all but the last copy away. To avoid this, we instead factor out the mutable bit-setting code and define both setBit and setBits relative to that helper:
setBit :: Bitvector -> Int -> Bitvector
setBit (Bitvector ba) ix = runST $ do
  let !len = BA.sizeofByteArray ba
  thawed <- BA.thawByteArray ba 0 len
  setBit' thawed len ix
  Bitvector <$> BA.unsafeFreezeByteArray thawed

setBits :: Bitvector -> [Int] -> Bitvector
setBits (Bitvector ba) ixes = runST $ do
  let !len = BA.sizeofByteArray ba
  thawed <- BA.thawByteArray ba 0 len
  traverse_ (setBit' thawed len) ixes
  Bitvector <$> BA.unsafeFreezeByteArray thawed

setBit' ::
  forall (s :: Type).
  MutableByteArray s ->
  Int ->
  Int ->
  ST s ()
setBit' mba !len !ix = do
  let !bigIx = len - (ix `quot` 8) - 1
  let !smallIx = ix `rem` 8
  !originalByte :: Word8 <- BA.readByteArray mba bigIx
  let !newByte = originalByte .|. bit smallIx
  BA.writeByteArray mba bigIx newByte
The passing of len is a little redundant here, as we have this information available in constant time (MutableByteArrays are counted). Yet, since we have to obtain this information anyway in order to thaw, we may as well use it as much as we can. This way, we only need θ(n) space regardless of how many changes we need to make.
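As a quick usage sketch (our own example, not from the article): setting several bits in one call performs a single thaw-and-copy of the underlying bytes, rather than one copy per bit as the naive fold would.

-- Hypothetical example: a 16-bit vector with bits 0, 5 and 9 set, built with
-- a single copy of the two underlying bytes.
someBits :: Bitvector
someBits = setBits (zeroes 16) [0, 5, 9]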
The last operation we will need is one of the two standard operations provided by rank-select dictionaries: select, more specifically select0. Traditionally, select0, when given a structure and a rank (first, second, etc.), returns the index of the 0 bit of that rank. For example, if we ask for the second 0, this will give the index of the 0 whose index is the second lowest. If we request too high a rank, select0 can fail: thus, we return in Maybe.
Additionally, we augment select0 to also allow us to specify a subrange of a Bitvector's bits in which to search. We specify this range as a pair (start, span), where start is the beginning of the subrange we are interested in, and span is how many bits the range extends from the beginning. For simplicity, we will assume these are both multiples of 8.
selectIn :: Bitvector -> (Int, Int) -> Int -> Maybe Int
selectIn (Bitvector ba) (start, span) !which = do
  let !len = BA.sizeofByteArray ba
  let !startIx = len - (start `quot` 8) - 1
  let !stopIx = startIx - (span `quot` 8)
  go startIx stopIx which
  where
    go :: Int -> Int -> Int -> Maybe Int
    go !ix !lim !remaining
      | ix == lim = Nothing
      | otherwise = do
          let !compCurrentByte :: Word8 = complement . BA.indexByteArray ba $ ix
          -- Since we complemented the byte, this counts zeroes
          let !counted = popCount compCurrentByte
          if counted > remaining
            then pure . dig compCurrentByte $ remaining
            else (8 +) <$> go (ix - 1) lim (remaining - counted)
    -- Goes through the complemented byte to find our position
    dig :: Word8 -> Int -> Int
    dig !src !pos =
      let !counted = countTrailingZeros src
       in if pos == 0
            then counted
            else
              let !offset = counted + 1
               in offset + dig (unsafeShiftR src offset) (pos - 1)
The operation proceeds in four stages:

1. Determine, based on start and span, what the index range for the search should be. This is similar to how we do bit indexing.
2. Starting from the highest byte index in the range determined at step 1, read the byte at that index, complement it, then determine its Hamming weight. This tells us the number of 0 bits in this byte.
3. If the required rank is less than the Hamming weight calculated in step 2, go to step 4. Otherwise, add 8 to the result of recursively searching at the next index down, reducing the required rank by the Hamming weight. If we run out of bytes without having reached step 4, fail with Nothing.
4. 'Count off' zero bits in the byte from step 2 until we find the one that has the required rank. Return its position (which will accumulate with step 3 if needed).
This is a somewhat inefficient method of calculating select0: more capable rank-select dictionaries use a range of techniques to make this operation more efficient both asymptotically and in practice. To simplify the presentation of our ideas, we won't delve into this topic.
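As an illustration (our own worked example, not from the article), take a Bitvector whose two bytes are 0x8F and 0x37, as in the indexing table earlier; its 0 bits sit at indexes 3, 6, 7, 12, 13 and 14, so we would expect:

-- exampleVector has bytes 0x8F and 0x37, as in the indexing table above.
--
--   selectIn exampleVector (0, 16) 0 == Just 3   -- the first (lowest-indexed) 0 bit
--   selectIn exampleVector (0, 16) 2 == Just 7   -- the third 0 bit
--   selectIn exampleVector (0, 16) 6 == Nothing  -- there are only six 0 bits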
First attempt
With the preceding operations and insights, we are ready to try and implement an n-queens solver.
We'll start by choosing the first representation of our state that could possibly work: an n by n bitmatrix (a two-dimensional bitvector), represented in row-major order. We treat set bits in this bitmatrix as marked, and clear bits as, well, clear. Essentially, we represent the entire board, marking it as we descend. The board starts out completely clear.
We first define a data type to represent boards, as well as a way to construct a blank one:
data Board = Board Int Bitvector
-- Assume that dim is a multiple of 8
blankBoard :: Int -> Board
blankBoard n = Board n . zeroes $ n * n
We have to store the dimension, as otherwise, we have no way of knowing exactly how large each row is. While we could potentially avoid this, it would require computing square roots, which isn't particularly pleasant. We also note that this representation is quadratic relative to n (or θ(n^2) if you prefer).
At each stage of our search, we have a board, a current row, and a current choice ordinal. We start with a blank board, row 0, and choice ordinal 0. Our first step is to determine what column our current choice ordinal refers to. To illustrate this, consider the following row, with X indicating a marked position and _ indicating an unmarked one:
0 1 2 3 4 5 6 7
_ _ X X X _ X X
Here, choice ordinal 2 (that is, the third option) corresponds to position 5. However, if we instead had a row like this:
0 1 2 3 4 5 6 7
_ X X X X X _ _
then choice ordinal 2 would instead correspond to position 7. It's also possible that we request a choice ordinal that simply does not exist. In this row
0 1 2 3 4 5 6 7
_ X X X X _ X X
choice ordinal 2 is unavailable (as there is no third available position).
To determine the corresponding position (if any), we can use the selectIn operation we described previously:
-- The choice ordinal argument is implicitly passed
availableInRow :: Board -> Int -> Int -> Maybe Int
availableInRow (Board dim vec) row = selectIn vec (row * dim, dim)
This is why we chose a row-major ordering for our bitmatrix, as it keeps row bits adjacent to each other. This also ties into our choice of bit indexing scheme. We also see why we need to store the dimension together with a Board: without this information, it would be difficult to determine the start index, or span, that we need for any given row.
If we fail to find a position corresponding to a given choice ordinal, we are 'locked out', because any queen placement in this row would threaten a queen placed previously. In that case, we backtrack. Otherwise, suppose we have a position pos from availableInRow. What we do next depends on which row we are in:
- If we are in the last row, there are no more queens to place, so we 'lock in' all our placements and produce the solution.
- Otherwise, we mark all positions on our Board that a queen placed in the column pos would threaten, and attempt to move to the next row.
The attempt in the next row might fail; if we are backtracked to, we raise the choice ordinal and try this whole process again.
When marking our board, we must mark the entire column pos, as well as the left and right diagonals extending downwards from our current position. We don't have to consider upward diagonals, as we cannot affect queens we have placed in such diagonals[3]. We do this marking in one step, using our previously-specified setBits function:
markUnavailable :: Board -> Int -> Int -> Board
markUnavailable (Board dim vec) row col =
  Board dim $ setBits vec updateIxes
  where
    updateIxes :: [Int]
    updateIxes =
      map rowColToIx $
        -- Our column
        map (,col) subsequentRows
          -- Left diagonal below us
          <> zip subsequentRows [col - 1, col - 2 .. 0]
          -- Right diagonal below us
          <> zip subsequentRows [col + 1, col + 2 .. dim - 1]
    subsequentRows :: [Int]
    subsequentRows = [row + 1, row + 2 .. dim - 1]
    rowColToIx :: (Int, Int) -> Int
    rowColToIx (r, c) = r * dim + c
The column positions are straightforward. The left and right diagonals work as described because, given pos as a marked queen position, the column corresponding to its left diagonal k rows down will be pos - k (to a minimum of 0), and the column corresponding to its right diagonal k rows down will be pos + k (to a maximum of n - 1). We can use the inherent truncating behaviour of zip to ensure that we don't accidentally produce an invalid position.
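To make this concrete, here is what the index-generating lists inside markUnavailable produce for a dimension-8 board with a queen committed at row 0, column 2 (the same placement used in the diagrams of the next section); this is our own worked example:

-- With dim = 8, row = 0, col = 2:
--
--   subsequentRows                             == [1 .. 7]
--   zip subsequentRows [col - 1, col - 2 .. 0] == [(1,1),(2,0)]                    -- left diagonal
--   zip subsequentRows [col + 1, col + 2 .. 7] == [(1,3),(2,4),(3,5),(4,6),(5,7)]  -- right diagonal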
Putting this all together, we have:
nqueens :: Int -> [(Int, Int)]
nqueens dim
  -- Solution for 1 queen is trivial, 2 or 3 is impossible
  | dim < 4 = []
  -- Our restriction from before
  | dim `rem` 8 /= 0 = []
  | otherwise =
      let state = blankBoard dim
       in go 0 0 state
  where
    go :: Int -> Int -> Board -> [(Int, Int)]
    go row choice state
      -- Choice ordinal exceeds dimension
      | choice == dim = []
      -- Try to see if the current choice ordinal is available and where
      | otherwise = case availableInRow state row choice of
          -- There is no choice available at this ordinal
          Nothing -> []
          Just pos ->
            if row == dim - 1
              then -- This is the last row, commit
                [(row, pos)]
              else
                let marked = markUnavailable state row pos
                 in case go (row + 1) 0 marked of
                      -- We dead-ended ourselves, go to the next choice
                      -- ordinal and retry.
                      [] -> go row (choice + 1) state
                      -- We managed, so stick this choice on.
                      next -> (row, pos) : next
Note the use of recursion here for backtracking: after we 'commit' to a position for a queen, we update our Board, then try to continue. If we succeed, we can 'assemble' the solution, and if not, we increase the choice ordinal and continue.
Running this code for a standard chessboard, we get the following solution:
ghci> nqueens 8
[(0,0),(1,4),(2,7),(3,5),(4,2),(5,6),(6,1),(7,3)]
This corresponds to the following chess board, where Q marks a queen and _ marks an empty cell:
0 1 2 3 4 5 6 7
0 Q _ _ _ _ _ _ _
1 _ _ _ _ Q _ _ _
2 _ _ _ _ _ _ _ Q
3 _ _ _ _ _ Q _ _
4 _ _ Q _ _ _ _ _
5 _ _ _ _ _ _ Q _
6 _ Q _ _ _ _ _ _
7 _ _ _ Q _ _ _ _
However, if we time how long this procedure takes, we will be disappointed:
8 queens: OK
22.4 μs ± 1.6 μs, 180 KB allocated, 3 B copied, 34 MB peak memory
16 queens: OK
4.26 ms ± 300 μs, 29 MB allocated, 900 B copied, 34 MB peak memory
24 queens: OK
248 ms ± 8.1 ms, 1.6 GB allocated, 54 KB copied, 34 MB peak memory
These results are not great: both the time and space required seem to grow much faster than the input size. Ultimately, this isn't surprising if we consider the markUnavailable function we defined above. Given a dimension of size n, we have to modify (and thus copy) something whose size is θ(n^2) each time we 'commit' to a queen placement[4]. This means our algorithm is Ω(n^3) and O(n^4), with the exact running time depending on how much backtracking we end up doing.
Can we do better than this?
Doing better
It turns out that we can: in the paper Bit-vector encoding of n-queen problem, Qiu Zongyan demonstrates a way we can avoid storing the entire board state. This involves using an observation we have already made when defining markUnavailable, which we will discuss and implement.
The key idea of the paper is that, if we place a queen in a row r and a column c, for any subsequent row, we can establish which positions are unavailable knowing only r, c and the offset k to the subsequent row in question. More precisely, the left diagonal from this queen in row r + k will make column c - k unavailable in that row, as long as c - k >= 0; analogously, the right diagonal from this queen in row r + k will make column c + k unavailable in that row, as long as c + k < n, where n is the dimension. Furthermore, we observe that in the case of multiple queens, we can 'overlay' these observations: even if the unavailable diagonals (or indeed, any unavailable positions) stemming from two different queens overlap, this doesn't matter. Once a position is marked unavailable as a result of one 'commitment' to a queen's position, it will remain unavailable no matter how many other queen 'commitments' may block it.
To see this in effect, consider the following partial board with dimension 8:
0 1 2 3 4 5 6 7
0 _ _ _ _ _ _ _ _
1 _ _ _ _ _ _ _ _
2 _ _ _ _ _ _ _ _
3 _ _ _ _ _ _ _ _
...
Suppose we 'commit' to a queen in position (0, 2). This will make column 2 unavailable for all future placements:
0 1 2 3 4 5 6 7
0 _ _ X _ _ _ _ _
1 _ _ X _ _ _ _ _
2 _ _ X _ _ _ _ _
3 _ _ X _ _ _ _ _
...
We will also block the left diagonal from that position:
0 1 2 3 4 5 6 7
0 _ _ _ _ _ _ _ _
1 _ X _ _ _ _ _ _
2 X _ _ _ _ _ _ _
3 _ _ _ _ _ _ _ _
...
And the right diagonal from that position:
0 1 2 3 4 5 6 7
0 _ _ _ _ _ _ _ _
1 _ _ _ X _ _ _ _
2 _ _ _ _ X _ _ _
3 _ _ _ _ _ X _ _
...
Consider row 3 in the context of the rules stated above: here, r = 0, c = 2, k = 3. Thus, in this row, the left diagonal of our queen placed in row 0 would make column 2 - 3 unavailable, but as this is less than 0, this diagonal has no effect on row 3. Analogously, the right diagonal of our queen from row 0 would make column 2 + 3 unavailable; this is less than 8, so that makes (3, 5) unavailable for queen placement. Our diagram above agrees with this, which suggests that we don't have to mark all subsequent rows 'ahead of time', as we do in our original approach.
To implement this, instead of maintaining an entire board, the paper suggests maintaining three bitvectors, named down, left, and right. These track the positions made unavailable in the current row by the queens placed in previous rows: specifically, the column(s) directly below them, their left diagonal(s), and their right diagonal(s) respectively. Each of these has a length equal to the dimension of the problem, and all of them start with all their bits clear. We also track the current row number and choice ordinal in the same way as in our previous attempt, and both start the same way.
In light of this, we use a slightly modified approach for each row:
1. Construct a bitvector control such that for any bit position in control, that bit position is clear if the corresponding position in all of down, left, and right is clear, and is set otherwise[5].
2. Use control and the choice ordinal to determine in which column we can place a queen for this row. If none are available, backtrack.
3. Update down, left, and right by setting the bit corresponding to our choice from step 2.
4. 'Slide over' all bits in left by one bit position higher, filling in 0 at the lowest bit position. Do the same for right, but by one bit position lower, filling in 0 at the highest bit position.
5. Repeat for the row below; if you get backtracked to, revert down, left, and right, and try another position. If none is available, backtrack.
To perform these steps, we will need some slightly modified operations. Firstly, to construct the control bitvector, we define an operation that effectively takes the bitwise OR of three other bitvectors:
or3 :: Bitvector -> Bitvector -> Bitvector -> Bitvector
or3 (Bitvector x) (Bitvector y) (Bitvector z) = runST $ do
  let !len = BA.sizeofByteArray x
  result <- BA.newByteArray len
  for_ [0 .. len - 1] $ \ix -> do
    let !newByte :: Word8 =
          BA.indexByteArray x ix
            .|. BA.indexByteArray y ix
            .|. BA.indexByteArray z ix
    BA.writeByteArray result ix newByte
  Bitvector <$> BA.unsafeFreezeByteArray result
The main reason we define this operation in this way is that we can avoid an unnecessary intermediate copy. If we defined a more standard bitwise OR of two bitvectors, we would end up creating an intermediate copy, only to throw it away immediately.
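For contrast, here is what a conventional two-argument OR might look like; this is a sketch for comparison only and is not part of the final solver. Combining three bitvectors with it would allocate an intermediate Bitvector only to discard it immediately:

-- A sketch of a two-input bitwise OR, for comparison only.
or2 :: Bitvector -> Bitvector -> Bitvector
or2 (Bitvector x) (Bitvector y) = runST $ do
  let !len = BA.sizeofByteArray x
  result <- BA.newByteArray len
  for_ [0 .. len - 1] $ \ix -> do
    let !newByte :: Word8 = BA.indexByteArray x ix .|. BA.indexByteArray y ix
    BA.writeByteArray result ix newByte
  Bitvector <$> BA.unsafeFreezeByteArray result

-- Building the control vector this way creates (down `or2` left) as a
-- throwaway intermediate:
--
--   control = (down `or2` left) `or2` right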
Since our position selection no longer requires slicing, we have a wrapper for selectIn that starts selection from the beginning and spans the entire bitvector:
select :: Bitvector -> Int -> Maybe Int
select bv@(Bitvector ba) = selectIn bv (0, BA.sizeofByteArray ba * 8)
Lastly, we need an operation that combines the bit setting and 'sliding' we need to update left and right. We describe the update operation for adjustLeft here, as adjustRight is similar (just symmetric):
adjustLeft :: Bitvector -> Int -> Bitvector
adjustLeft (Bitvector ba) i = runST $ do
  let !len = BA.sizeofByteArray ba
  thawed <- BA.thawByteArray ba 0 len
  setBit' thawed len i
  let !topBitMask :: Word8 = 0x80
  for_ [0, 1 .. len - 2] $ \ix -> do
    !currentByte :: Word8 <- BA.readByteArray thawed ix
    !nextByte :: Word8 <- BA.readByteArray thawed (ix + 1)
    let !nextOverflowBit = nextByte .&. topBitMask
    let !newCurrentByte = unsafeShiftL currentByte 1 .|. unsafeShiftR nextOverflowBit 7
    BA.writeByteArray thawed ix newCurrentByte
  let !lastIx = len - 1
  !lastByte :: Word8 <- BA.readByteArray thawed lastIx
  BA.writeByteArray thawed lastIx (unsafeShiftL lastByte 1)
  Bitvector <$> BA.unsafeFreezeByteArray thawed
The bit-setting helper is re-used from our definition of setBit and setBits. The 'sliding' operation is a little more difficult, as we have to move bits across byte boundaries. To make this work, we use a loop that looks not only at a 'current' position but also at the position at the next index. We then use a mask to extract the top bit of the next byte, then use a combination of shift and OR to construct a new byte, based on the current byte and the extracted bit. The only exception is the last byte, as this just requires 'filling in' a zero bit, which a shift can already do. This is important, as it ensures that left diagonals 'fall off' when the row number gets large enough.
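The article leaves adjustRight implied; a possible implementation, mirroring adjustLeft under the same conventions (our own sketch, not taken from the original), could look like this:

-- A sketch of adjustRight: set the chosen bit, then slide every bit one
-- position lower, filling in a zero at the highest bit position. We walk the
-- bytes from last to first, pulling the lowest bit of the preceding (more
-- significant) byte in as the new top bit.
adjustRight :: Bitvector -> Int -> Bitvector
adjustRight (Bitvector ba) i = runST $ do
  let !len = BA.sizeofByteArray ba
  thawed <- BA.thawByteArray ba 0 len
  setBit' thawed len i
  let !bottomBitMask :: Word8 = 0x01
  for_ [len - 1, len - 2 .. 1] $ \ix -> do
    !currentByte :: Word8 <- BA.readByteArray thawed ix
    !prevByte :: Word8 <- BA.readByteArray thawed (ix - 1)
    let !prevUnderflowBit = prevByte .&. bottomBitMask
    let !newCurrentByte = unsafeShiftR currentByte 1 .|. unsafeShiftL prevUnderflowBit 7
    BA.writeByteArray thawed ix newCurrentByte
  !firstByte :: Word8 <- BA.readByteArray thawed 0
  BA.writeByteArray thawed 0 (unsafeShiftR firstByte 1)
  Bitvector <$> BA.unsafeFreezeByteArray thawed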
Putting all these together, we have the following improved nqueens:
nqueens :: Int -> [(Int, Int)]
nqueens dim
  | dim < 4 = []
  | dim `rem` 8 /= 0 = []
  | otherwise =
      let down = zeroes dim
          left = zeroes dim
          right = zeroes dim
       in go 0 0 down left right . zeroes $ dim
  where
    -- We hoist the control vector into the arguments, as this way, we won't be
    -- forced to reconstruct it on each backtrack.
    go ::
      Int ->
      Int ->
      Bitvector ->
      Bitvector ->
      Bitvector ->
      Bitvector ->
      [(Int, Int)]
    go row choice down left right control
      -- Choice ordinal matches dimension
      | choice == dim = []
      -- Try to see if the current choice ordinal is available and where
      | otherwise = case select control choice of
          -- No choice here
          Nothing -> []
          Just pos ->
            if row == dim - 1
              then -- This is the last row, commit
                [(row, pos)]
              else
                let newDown = setBit down pos
                    newLeft = adjustLeft left pos
                    newRight = adjustRight right pos
                    newControl = or3 newDown newLeft newRight
                 in case go (row + 1) 0 newDown newLeft newRight newControl of
                      -- We dead-ended ourselves, go to the next choice ordinal
                      -- and retry
                      [] -> go row (choice + 1) down left right control
                      next -> (row, pos) : next
We see that the basic structure of the solution is the same. The only difference is that the state we maintain is different, and the way we convert a choice ordinal into a position is slightly different. We also no longer need a special data type for boards, as modifications to the bitvectors we work with don't need the extra information.
Even in theory, this implementation should be more efficient: we now require only θ(n) copying (and work) at each step of the search, as we no longer maintain an entire board. Thus, the entire search is Ω(n^2) and O(n^3), depending on how much backtracking we have to do. This is an asymptotic improvement in both time and space over our initial attempt.
However, does this actually help in practice?
8 queens: OK
8.77 μs ± 201 ns, 40 KB allocated, 1 B copied, 34 MB peak memory
16 queens: OK
942 μs ± 28 μs, 3.6 MB allocated, 223 B copied, 34 MB peak memory
24 queens: OK
41.2 ms ± 2.1 ms, 159 MB allocated, 10 KB copied, 34 MB peak memory
Yes, significantly! Even the smallest case is almost three times faster (and uses almost five times less memory overall), and for larger cases, the improvement is much larger. We can also see that the amount copied is smaller for our new solution, likely because we avoid the list-passing from markUnavailable. We still, unfortunately, see the non-linear growth from before, but it appears slower, which is consistent with the analysis predicting an asymptotic improvement.
Could we do even better than this? Not asymptotically, at least: due to our row-by-row approach, we have to accept a factor of n even in the best case (and as much as n^2 in the worst), and our requirement for backtracking means that it's difficult to see how we would need less than linear time or space at each search step. While it might be possible to use a structure that doesn't require copying in its entirety (such as some kind of red-black tree), it's unlikely that we could find a structure that efficiently supports all the operations we need (select being particularly tricky) while also not having large constant factors in its memory use. This is one advantage of a bitwise representation: while the copying is linear, the amount of memory copied is small and compact, which makes it significantly cheaper. While we could potentially reduce the constant factors at play here, asymptotically there aren't really improvements to be had.
The Plutus bit
This article came from the author's experience with implementing CIP-122 and CIP-123, as well as the n-queens benchmark, in Plutus. In this context, the type of bitwise-oriented code we described above has advantages above and beyond what would be the case in regular Haskell. This is due to the severe limitations on memory imposed on any Plutus script; thus, succinct (or, in our case, implicit) data structures go from a luxury to practically a necessity. Furthermore, while we've had a compact byte-oriented structure available to us in Plutus from the very beginning (in the form of BuiltinByteString), until the work by MLabs, there were no reasonable ways to work with individual bits[6].
With CIP-122 and CIP-123, the kind of code we described above is possible in Plutus scripts as well. While it is unlikely that any script would want an 8-queens solver, the ability to define and use succinct data structures in such Plutus scripts enables work that was previously either impractical or impossible. We had this kind of problem - and this kind of use case - in mind when we wrote CIP-122 and CIP-123. We also provided a version of the second approach to solving n-queens as a benchmark for Plutus[7], both to demonstrate the effectiveness of these operations and to provide a guide to others for how they could be used.
Your Project, Our Expertise
As we continue to explore innovative solutions like the bitwise operations discussed in this article, we at MLabs are dedicated to transforming ideas into reality. Whether optimizing existing work or building from scratch, we assist teams at any stage of their project journey. As a leading AI and blockchain consultancy, we pride ourselves on helping clients tackle complex projects with outstanding project management and developer support.
Our mission is to empower businesses with innovative, secure, and mission-critical software solutions that deliver exceptional value and drive growth. We have a proven track record of building robust projects tailored to client needs.
If you have an idea or project that could benefit from our expertise, we invite you to contact us today. Discover how MLabs can turn your vision into a successful reality.
1. In fact, GHC itself uses it as part of its nofib benchmarking suite.
2. This is not a significant restriction and can be worked around. We don't do it here because it would make the bitwise operations significantly more fiddly, and wouldn't help the exposition at all.
3. This is because threatening is symmetric: if a queen threatens us, we also threaten her, and since we've marked all positions that prior-placed queens could threaten, if we avoid them, we can't threaten them either.
4. Working mutably won't save us here. Due to backtracking, we have to retain the prior version of our state each time we recurse, possibly as deep as n recursions!
5. Technically, we don't even need left and right to construct control, as knowing down, as well as the current row number, is enough to reconstruct left and right on demand. However, this is unlikely to be of much benefit: we exchange less stored state for significantly more work. We might be able to do less copying, but it's not clear if this would even benefit us much.
6. This is essentially the case we made as part of CIP-122 and CIP-123.
7. The only differences are due to the slightly different operations available to us. Conceptually, the implementation is the same.