I have been trying to use the bitvec: Space-efficient bit vectors library, but most of the interesting operations are using Immutable Vectors. I however is planning to use mutable bit-vectors in my code.
Is it possible to use a mutable vector temporarily as a vector in a function call, without copying it?
Essentially, I want something like this:
extract ::
PrimMonad m =>
MVector (PrimState s) a
-> (Vector a -> b)
-> m b
You cannot do it in a safe way: what if you modify the original MVector while the Vector a -> b function is running? However, you can use unsafeFreeze.
b might contain (Vector a) and therefore keep the reference alive outside the scope, and
since (Vector a -> b) is a pure function and might first be evaluated when b is forced.
Keeping that i mind, Iām unsure how unsafeFreeze solves the problem since it states that: āThe mutable vector may not be used after this operation.ā
Can I implement the unsafeExtract function like this:
unsafeExtract ::
PrimMonad m =>
MVector (PrimState s) a
-> (Vector a -> b)
-> m b
unsafeExtact mv fn = do
v <- unsafeFreeze mv
let !b = fn v
pure b
And use it in a context like a bit-vector set:
import Data.Bit as BV
import Data.Vector.Unboxed.Mutable as UM
add ::
PrimMonad m =>
UM.MVector (PrimState m) BV.Bit
-> Int
-> m ()
add mv ix = do
UM.insert mv ix 1
countUnique ::
PrimMonad m =>
UM.MVector (PrimState m) BV.Bit
-> m Int
countUnique mv = extractUnsafe mv countBits
Yeah, you can do that and it might be somewhat safer than just using unsafeFreeze directly. But either way there is some burden of proof on the user of these functions. The compiler does not check that you are always using them correctly.
This is kind of interesting, so we are missing a way in the type-system to represent āBurn after readingā items, with two abilities 1) cannot be saved in hunks, and 2) cannot be saved in constructors or returned without being copied. This smells a little like the borrow checker in Rust.
Youāre completely right and there is already some work in that direction for Haskell. We call it Linear Types. I think this video is a pretty good introduction.
Linear types are not quite right, since itās all about arguments being consumed exactly once, in this case we want our argument to be consumed many times, but never returned. Itās about āborrowingā the argument to the function, expecting it to not leak out on execution.
āBorrowingā the syntax of linear types, we might want something like a %0 -> b, ei, something that is returned exactly zero times. Destructive functions have these types:
index :: Int -> Vector a %0-> a
maybe :: (a -> b) -> b -> Maybe a %0-> a
Maybe something like this could also be beneficial to reduce the use of the garbage collector. If you call a destructive function with an argument, you can reuse the piece of memory right after the call, because it is not refereed to in the results.
Iām looking at the ācountBitsā and ānthBitIndexā, there does not seem to be mutable versions of the same operations. Iām specifically trying to implement a succinct dictionary to improve the storage of a data structure Iām working on in the https://discourse.haskell.org/t/counting-words-but-can-we-go-faster/ project.
Ps. Always happy to talk to the author of the library
Iāll look into them, they look interesting! Do you recommend them instead of the bitvec library and should bitvec be considered obsolete? Alternatively, is calling the immutable operators on a mutable bit vector through the āunsafeFreezeā operator the correct way to use the library?
bitvec is a general purpose library for bit vectors, while popkey is an implementation of a succinct data structure. They have different purposes and are not interchangeable.
I have not used popkey myself, so cannot tell if itās a right choice for your purposes.
Yes, your unsafeExtract is fine, as much as āunsafeā functions can be āfineā. But Iām also open for someone to extend bitvec with countBitsM and nthBitIndexM for mutable vectors, it should be straightforward to implement.
Itās mostly a pet project so āpopkeyā is not the best choice! Thank you for your help though. If I can figure it out, Iāll love to send a pull request. Iāll be busy the next couple of though.