Best way to remove failures due to lazy IO?

Take a look at this example:

#! cabal
{- cabal:
build-depends : base, text
default-extensions: BlockArguments GHC2021 UnicodeSyntax
-}
import Data.Text.Lazy.IO qualified as UnicodeStream
main = do
  let fileName = "example.txt"
  fileContents ← UnicodeStream.readFile fileName
  UnicodeStream.writeFile fileName fileContents

It will error out like so:

% ./script.hs
cabal-script-script.hs: example.txt: withFile: resource busy (file is locked)

What is the standard way of removing such errors from my programs?

As I understand it, this error happens precisely because I am trying to write to a file while it is still being lazily read. I have been hearing people say «do not use lazy IO, lazy IO is bad». I guess this is an example of lazy IO. How could I have told that this particular function readFile is «lazy IO»? Why does it exist in standard libraries? Could it not at least have had some helpful type annotations?
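For comparison, here is a minimal sketch of the strict alternative, assuming the failure really does come from the lazy read keeping its handle open. It uses the strict `Data.Text.IO.readFile`, which reads the whole file into memory before returning, so nothing should still be holding the file when the write starts. The name `rewriteFileStrictly` is made up for illustration.

```haskell
import qualified Data.Text as Text
import qualified Data.Text.IO as StrictIO

-- | Hypothetical helper: read a file strictly, transform its contents,
-- and write the result back to the same path.
rewriteFileStrictly :: FilePath -> (Text.Text -> Text.Text) -> IO ()
rewriteFileStrictly path function = do
  -- The strict readFile returns only after the whole file has been read.
  contents <- StrictIO.readFile path
  StrictIO.writeFile path (function contents)
```

Calling `rewriteFileStrictly "example.txt" Text.reverse` from `main` should then rewrite the file in place, at the cost of holding the whole file in memory.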

I can imagine that this error is hard to prevent in general while retaining lazy IO. For example, say I have two hard linked files:

main = do
  fileContents ← UnicodeStream.readFile "example.txt"
  UnicodeStream.writeFile "example-hardlink.txt" fileContents

% ln example.txt example-hardlink.txt
% ./script.hs
cabal-script-script.hs: example-hardlink.txt: withFile: resource busy (file is locked)

Tragic. My program has failed through no fault of my own.

However, we could in principle have found out that these two are the same file, by looking more closely at the file system. Then, it seems like we could automatically lock the files being lazily read in such a way that an attempt to write forces the lazy IO all the way, thus avoiding the error.
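A rough sketch of the «looking more closely» part, assuming a POSIX system and the `unix` package: two paths name the same file when their device and inode numbers agree. `sameFile` is a hypothetical helper, not something taken from a standard library.

```haskell
import System.Posix.Files

-- | Hypothetical helper: do two existing paths refer to the same file?
-- Compares device ID and inode number, so hard links compare equal.
sameFile :: FilePath -> FilePath -> IO Bool
sameFile pathA pathB = do
  statusA <- getFileStatus pathA
  statusB <- getFileStatus pathB
  pure (deviceID statusA == deviceID statusB && fileID statusA == fileID statusB)
```

This only detects the aliasing. It does not by itself provide the locking scheme described above, and it throws if either path does not exist.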

In my ideal world, there is indeed such a fancy locking system and moreover all standard libraries work this way. Second best would be to have clear documentation that explains this issue — but I am sure I have never seen any specific study of it. Yet this seems to me an urgent issue.

I recall reading somewhere that writing to existing files is problematic anyway. Instead, you should generally write to a new file and then replace the old one with it once you are done writing. That also means you don’t necessarily have to hold the whole file in memory.

Everything about files is problematic anyway. This is why «general purpose» programmers like me should not be asked to write such code. There must be a better way, a safer set of IO primitives.

✼ ✼ ✼

So, this is the «safe» way to do what I asked, the way you suggest:

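-- Script header as before, with `directory` added to build-depends.
import Data.Text.Lazy.IO qualified as UnicodeStream
import System.Directory (removeFile, renameFile)
import System.IO (hClose, openTempFile)

main = do
  fileContents ← UnicodeStream.readFile "example.txt"
  (temporaryFile, temporaryFileHandle) ← openTempFile "." "example.txt"
  hClose temporaryFileHandle
  UnicodeStream.writeFile temporaryFile fileContents
  removeFile "example.txt"
  renameFile temporaryFile "example.txt"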

Is it truly safe in the face of a sudden power off event? For example, does the temporary file get written wholly before the original file is removed? I cannot tell.

✼ ✼ ✼

Anyway… this is too much code for a task so simple. It can be abstracted like so:

#! cabal
{- cabal:
build-depends : base, directory, filepath, text
default-extensions: BlockArguments GHC2021 OverloadedStrings UnicodeSyntax
-}

import Data.Text.Lazy qualified as UnicodeStream
import Data.Text.Lazy.IO qualified as UnicodeStream
import System.IO
import System.Directory
import System.FilePath

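main ∷ IO ()
main = interactWithFile "example.txt" UnicodeStream.reverse

-- Read the file, apply the function, and write the result back via a temporary file.
interactWithFile ∷ FilePath → (UnicodeStream.Text → UnicodeStream.Text) → IO ()
interactWithFile path function = do
  fileContents ← UnicodeStream.readFile path
  -- Create the temporary file next to the original, so that the rename stays within one directory.
  (temporaryFile, temporaryFileHandle) ← openTempFile (takeDirectory path) (takeFileName path)
  hClose temporaryFileHandle
  UnicodeStream.writeFile temporaryFile (function fileContents)
  removeFile path
  renameFile temporaryFile path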

Looks good?

Yes, writeFile (at least the one from base) writes the file fully before performing further I/O actions.

You don’t need the removeFile. If you leave it out, you get the benefit that the renaming is atomic (at least on non-Windows platforms).
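Putting that together, a minimal sketch of the helper without the removeFile might look like this. It is an illustration under assumptions, not a vetted implementation: the name `replaceFileContents` is made up, it swaps in strict text for simplicity, and it writes through the temporary file's handle instead of closing and reopening it. It also cleans up the temporary file if anything fails before the rename.

```haskell
import Control.Exception (bracketOnError)
import qualified Data.Text as Text
import qualified Data.Text.IO as StrictIO
import System.Directory (removeFile, renameFile)
import System.FilePath (takeDirectory, takeFileName)
import System.IO (hClose, openTempFile)

-- | Hypothetical helper: transform a file's contents by writing the result
-- to a temporary file in the same directory and renaming it over the original.
replaceFileContents :: FilePath -> (Text.Text -> Text.Text) -> IO ()
replaceFileContents path function = do
  contents <- StrictIO.readFile path
  bracketOnError
    (openTempFile (takeDirectory path) (takeFileName path))
    -- Runs only if the body throws: close the handle and drop the temporary file.
    (\(temporaryFile, handle) -> hClose handle >> removeFile temporaryFile)
    (\(temporaryFile, handle) -> do
        StrictIO.hPutStr handle (function contents)
        hClose handle
        -- No removeFile first: the rename replaces the original in one step.
        renameFile temporaryFile path)
```

Since the write goes to a different file, the lazy readFile from the earlier version could be kept here as well, which avoids holding the whole file in memory. Note that the temporary file is created with its own permissions, so the replaced file may not keep the permissions of the original.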

Yeah, but I’m also not an expert in this area.
