How to properly write to a file using File::Map?

An mmap is a fixed-sized mapping of a portion of a file to memory.

The various mapping functions set the string buffer of the provided scalar to the mapped memory page. The OS will reflect any changes to that buffer to the file and vice versa if requested.

The proper way to work with an mmap is to modify the string buffer, not replace it.

  • Anything that changes the string buffer without changing its size is appropriate.

    $ perl -e'print "\0"x16' >scratch
    
    $ perl -MFile::Map=map_file -we'
       map_file my $map, "scratch", "+<";
       $map =~ s/\x00/\xFF/g;             # ok
       substr($map, 6, 2, "00");          # ok
       substr($map, 8, 2) = "11";         # ok
       substr($map, 7, 2) =~ s/../22/;    # ok
    '
    
    $ hexdump -C scratch
    00000000  ff ff ff ff ff ff 30 32  32 31 ff ff ff ff ff ff  |......0221......|
    00000010
    
  • Anything that replaces the string buffer (such as assigning to the scalar) is not ok.

    ...kinda. The module notices you've replaced the scalar's buffer. It proceeds to copy the contents of the new buffer to the mapped memory, then replaces the scalar's buffer with the pointer to the mapped memory.

    $ perl -e'print "\0"x16' >scratch
    
    $ perl -MFile::Map=map_file -we'
       map_file my $map, "scratch", "+<";
       $map = "4" x 16;  # Effectively: substr($map, 0, 16, "4" x 16)
    '
    Writing directly to a memory mapped file is not recommended at -e line 3.
    
    $ hexdump -C scratch
    00000000  34 34 34 34 34 34 34 34  34 34 34 34 34 34 34 34  |4444444444444444|
    00000010
    

    Aside from the warning (which can be silenced using no warnings qw( substr );),[1] the only downside is that doing it this way requires using memcpy to copy length($map) bytes, while using substr($map, $pos, length($repl), $repl) only requires copying length($repl) bytes.

  • Anything that changes the size of the string buffer is not ok.

    $ perl -MFile::Map=map_file -we'
       map_file my $map, "scratch", "+<";
       $map = "5" x 32;  # Effectively: substr($map, 0, 16, "5" x 16)
    '
    Writing directly to a memory mapped file is not recommended at -e line 3.
    Truncating new value to size of the memory map at -e line 3.
    
    $ hexdump -C scratch
    00000000  35 35 35 35 35 35 35 35  35 35 35 35 35 35 35 35  |5555555555555555|
    00000010
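
The three rules above can be folded into a small helper. This is only a sketch: overwrite_at is a hypothetical name, not part of File::Map, and it is demonstrated on an ordinary scalar so that it runs without the module. I'd expect a scalar set up by map_file to be passed the same way, since $_[0] aliases the caller's variable.

```perl
use strict;
use warnings;
use Carp qw( croak );

# Hypothetical helper (not part of File::Map): overwrite bytes in place
# without ever changing the length of the target scalar.
sub overwrite_at {
    my (undef, $pos, $repl) = @_;   # $_[0] stays an alias to the caller's scalar
    my $size = length($_[0]);

    croak "offset $pos is outside the buffer"
        if $pos < 0 || $pos > $size;
    croak "replacement would run past the end of the buffer"
        if $pos + length($repl) > $size;

    # 4-arg substr swaps exactly length($repl) bytes for $repl,
    # so the buffer keeps its size.
    substr($_[0], $pos, length($repl), $repl);
    return;
}

# Ordinary scalar standing in for a mapped one.
my $buf = "\0" x 16;
overwrite_at($buf, 6, "00");
overwrite_at($buf, 8, "11");
print unpack("H*", $buf), "\n";

# A write that would grow the buffer is refused rather than truncated.
eval { overwrite_at($buf, 15, "22"); 1 }
    or print "refused: $@";
```

Because the replacement always has the same length as the span it overwrites, the helper can neither grow nor shrink the buffer.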
    

WARNING: The module doesn't warn if you shrink the buffer, even though this has no effect except to clobber one of the bytes with a NUL.

$ perl -e'print "\0"x16' >scratch

$ perl -MFile::Map=map_file -we'
   map_file my $map, "scratch", "+<";
   substr($map, 0, 16, "6" x 16);
   substr($map, 14, 2, "");
'

$ hexdump -C scratch
00000000  36 36 36 36 36 36 36 36  36 36 36 36 36 36 00 36  |66666666666666.6|
00000010

I've submitted a ticket.


  1. This is somewhat ironic, seeing as it more or less warns when not using substr, but I suppose it also warns when using substr "incorrectly".

The first quote,

Files are mapped into a variable that can be read just like any other variable, and it can be written to using standard Perl techniques such as regexps and substr.

is under the heading "Simplicity".

And it is true: You can simply write Perl code that manipulates strings and the data will end up in the file.

However, in the section Warnings we have:

Writing directly to a memory mapped file is not recommended

Due to the way perl works internally, it's not possible to write a mapping implementation that allows direct assignment yet performs well. As a compromise, File::Map is capable of fixing up the mess if you do it nonetheless, but it will warn you that you're doing something you shouldn't. This warning is only given when use warnings 'substr' is in effect.

That is, writing through an mmap'd variable is only efficient if the string buffer can be modified in place. Otherwise, the new string has to be assembled and stored in memory first and is only copied over to the file afterwards. If you're OK with this, you can disable the warning with no warnings 'substr'.
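
If you do disable the warning, keep the scope small, because the substr category also covers perl's own substr warnings. A minimal sketch using only core Perl (the out-of-range substr stands in for the File::Map assignment, which isn't run here):

```perl
use strict;
use warnings;

my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };   # collect warnings instead of printing them

my $s = "abc";

{
    no warnings 'substr';            # in effect only until the closing brace
    my $x = substr($s, 10, 1);       # out of range, but silenced here
    # With File::Map, the assignment to the mapped scalar would go here.
}

my $y = substr($s, 10, 1);           # outside the block: warns as usual

print scalar(@seen), " warning(s) seen\n";
```

Only the second substr produces a warning, so anything outside the block still gets reported.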

Additionally, looking at the code, there seems to be only one use case in which writing doesn't result in a warning at all, though I don't understand how I'm able to trigger that.

That's the case where you're trying to write a buffer to itself. This happens when a scalar is actually modified in place. The other cases are workarounds for when the string buffer is replaced (e.g. because it's overwritten: $foo = $bar). For a real in-place modification no extra work is necessary and you don't get the warning.

But this doesn't help you because growing a string cannot be done in-place with a fixed size mapped buffer.

Changing the size of the file through the mapping is not possible. This is not because of File::Map, but because the underlying mmap system call works on fixed size mappings and does not provide any option to resize files automatically.
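
If the file does have to grow, one option is to resize it through an ordinary file handle and only map it afterwards. A rough sketch with core modules only: the temporary file stands in for scratch, and the map_file call is left as a comment since I'm assuming the mapping is created after the resize.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Stand-in for a 16-byte scratch file.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "\0" x 16;
close $fh or die "close: $!";

# Grow the file by appending through an ordinary handle,
# not through a mapping.
open my $out, '>>', $path or die "open $path: $!";
binmode $out;
print {$out} "\0" x 16;
close $out or die "close: $!";

my $size = -s $path;
print "$size\n";

# A mapping created from here on would cover the new size, e.g.
#   map_file my $map, $path, "+<";
```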

If you need to edit files (especially small files), I recommend using edit in Path::Tiny instead.
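
For example, something along these lines (a sketch, assuming Path::Tiny is installed; the temporary file and its contents are made up for the demonstration):

```perl
use strict;
use warnings;
use Path::Tiny qw( path );

my $file = Path::Tiny->tempfile;     # temporary file, cleaned up automatically
$file->spew_raw("aaaa\n");

# edit_raw slurps the file, runs the callback with $_ set to the content,
# and writes the result back. The length is free to change.
$file->edit_raw(sub { s/a+/a much longer line/ });

print $file->slurp_raw;
```

Unlike the mmap approach, this reads and rewrites the whole file, which is why it suits small files best.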