Garbage collection of seemingly PROTECTed pairlist

Instead of growing a pairlist and then converting it, you can use a standard list (a VECSXP). You don't need to grow the list, because a quick loop through your matrix tells you how many "gaps" there are between consecutive rows, and therefore how many elements to pre-allocate in the list. This makes things considerably simpler and probably a bit more efficient too.
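The counting step can be sketched in plain C, without the R API, to make the indexing explicit. This is a minimal sketch that assumes an nr x 2 integer matrix stored column-major (as R stores matrices), with range starts in the first column and range ends in the second. The name count_list_elements is hypothetical and is not part of the code in this answer.

```c
/* Counts the list elements needed for a rows x 2 column-major integer
 * matrix of (start, end) ranges: one element per row, plus one extra
 * element for every gap between the end of a row and the start of the
 * next row. */
int count_list_elements(const int *values, int rows)
{
  int total = rows;
  for (int i = 0; i + 1 < rows; ++i) {
    int end_of_this   = values[rows + i]; /* column 2, row i     */
    int start_of_next = values[i + 1];    /* column 1, row i + 1 */
    if (end_of_this != start_of_next - 1) ++total;
  }
  return total;
}
```

For the rows (2, 10), (11, 20), (30, 40), (50, 60) this counts four rows plus two gaps, so six elements.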

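The other changes I have made are to move to a single helper function, which simply creates a length-2 integer vector from two ints, and to UNPROTECT en masse at the end of your C_fullocate function. This is simple to do, since we only create one vector per element of the final list, plus the list itself. Note that an element is already protected once it has been stored in a protected list with SET_VECTOR_ELT, so for very large matrices you could drop the per-row PROTECT calls to avoid filling the protection stack.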

The function for creating length-2 INTSXPs from two ints looks like this:

#include <Rinternals.h>

SEXP C_intsxp2(int first, int second) 
{
  SEXP out = PROTECT(Rf_allocVector(INTSXP, 2));
  INTEGER(out)[0] = first;
  INTEGER(out)[1] = second;
  UNPROTECT(1);
  return out;
}

And your main function becomes:

SEXP C_fullocate(SEXP int_mat)
{
  int rows       = Rf_nrows(int_mat);
  int *values    = INTEGER(int_mat);
  int total_rows = rows;
  int rownum     = 1;

  // Counts how many elements we need in our list
  for(int i = 0; i < (rows - 1); ++i) {
    if(values[rows + i] != values[i + 1] - 1) ++total_rows;
  }

  // Creates the main list we will output at the end of the function
  SEXP list = PROTECT(Rf_allocVector(VECSXP, total_rows));

  // Creates and assigns first row
  SET_VECTOR_ELT(list, 0, PROTECT(C_intsxp2(values[0], values[rows])));

  for(int i = 1; i < rows; ++i) // Cycle through rest of the rows
  {
    if(values[rows + i - 1] != values[i] - 1) // Insert extra row if there's a gap
    {
      SEXP extra = PROTECT(C_intsxp2(values[rows + i - 1] + 1, values[i] - 1));
      SET_VECTOR_ELT(list, rownum++, extra);
    }
    // Copy next row of original matrix into our list
    SEXP next_row = PROTECT(C_intsxp2(values[i], values[i + rows]));
    SET_VECTOR_ELT(list, rownum++, next_row);
  }

  UNPROTECT(total_rows + 1);  // Unprotects all assigned rows plus main list

  return list;
}

So in R we have

test_mat <- matrix(as.integer(c(2, 10, 11, 20, 30, 40, 50, 60)),
                   ncol = 2, byrow = TRUE)

test_mat
#>      [,1] [,2]
#> [1,]    2   10
#> [2,]   11   20
#> [3,]   30   40
#> [4,]   50   60

And we can do:

fullocate(test_mat)
#> [[1]]
#> [1]  2 10
#> 
#> [[2]]
#> [1] 11 20
#> 
#> [[3]]
#> [1] 21 29
#> 
#> [[4]]
#> [1] 30 40
#> 
#> [[5]]
#> [1] 41 49
#> 
#> [[6]]
#> [1] 50 60
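To check the gap-filling logic outside R, the same expansion can be sketched in plain C on ordinary arrays. This is only a model of the approach above, not the package code: the function name fill_ranges and the flat output layout (start, end, start, end, ...) are assumptions made for illustration, and the caller is assumed to have sized the output buffer from a prior counting pass.

```c
/* Expands a rows x 2 column-major matrix of (start, end) ranges into
 * out, inserting an extra (prev_end + 1, start - 1) pair wherever two
 * consecutive rows are not contiguous. out must hold 2 ints per pair.
 * Returns the number of pairs written. */
int fill_ranges(const int *values, int rows, int *out)
{
  if (rows <= 0) return 0;

  int k = 0; /* index of the next pair to write */

  /* First row is copied as-is */
  out[2 * k]     = values[0];
  out[2 * k + 1] = values[rows];
  ++k;

  for (int i = 1; i < rows; ++i) {
    int prev_end = values[rows + i - 1];
    int start    = values[i];
    if (prev_end != start - 1) { /* gap: insert the missing range */
      out[2 * k]     = prev_end + 1;
      out[2 * k + 1] = start - 1;
      ++k;
    }
    out[2 * k]     = start;      /* copy the original row */
    out[2 * k + 1] = values[rows + i];
    ++k;
  }
  return k;
}
```

With the test matrix above, this writes six pairs, including the filled gaps (21, 29) and (41, 49).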

Of course, the whole thing can be done much more simply using a single function in Rcpp. Here's an example where you can just grow the list, making the code considerably simpler (if maybe a little less efficient).

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
List fullocate(IntegerMatrix m)
{
  List l = List::create(m(0, _));
  for(int i = 1; i < m.nrow(); ++i)
  {
    if(m(i, 0) != m(i - 1, 1) + 1){
      // Insert an extra row covering the gap (IntegerVector keeps the type integer)
      l.push_back(IntegerVector::create(m(i - 1, 1) + 1, m(i, 0) - 1));
    }
    l.push_back(IntegerVector::create(m(i, 0), m(i, 1)));
  }
  return l;
}

This one is really complicated. You made a great effort to create a reproducible example of this hard-to-track error.

I tried to fix your problem but unfortunately failed. Nevertheless, I'll share my findings, since nobody else has answered so far; maybe they help.

I installed your testpkg and added the fullocate function to the NAMESPACE so that it is exported.

This way I was able to build the package, run the function with testpkg::fullocate(int_mat), and run it via devtools::check().

Interestingly, when I run it via check(), it fails every time while running your testthat test.

Running ‘testthat.R’:

 ── Test failures ───────────────────────── testthat ────

 library(testthat)
 library(testpkg)
 
 test_check("testpkg")
row_num: 2
[[1]]
.Primitive("for")

here1row_num: 3
[[1]]
.Primitive("for")

[[2]]
[[2]][[1]]

 *** caught segfault ***
address 0xa00000007, cause 'memory not mapped'

Traceback:
 1: fullocate(int_mat)
 2: eval_bare(expr, quo_get_env(quo))
 3: quasi_label(enquo(object), label, arg = "object")
 4: expect_equal(fullocate(int_mat), list(c(5L, 6L), c(7L, 10L),     c(11L, 19L), c(20L, 30L)))
 5: eval(code, test_env)
 6: eval(code, test_env)
 7: withCallingHandlers({    eval(code, test_env)    if (!handled && !is.null(test)) {        skip_empty()    }}, expectation = handle_expectation, skip = handle_skip, warning = handle_warning,     message = handle_message, error = handle_error)
 8: doTryCatch(return(expr), name, parentenv, handler)

So this is pretty similar to what you got, some kind of memory issue:

address 0xa00000007, cause 'memory not mapped'

Interestingly, when I just run the function, I can run it successfully several times before it gives an error. Whether it succeeds or not seems fairly random. From time to time the complete R session crashes.

Here is the error I get when running it without check().

Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'object' in selecting a method for function 'show': unimplemented type (27) in 'eval'
Error during wrapup: unimplemented type (27) in 'lazy_duplicate'

Error: no more error handlers available (recursive errors?); invoking 'abort' restart

That does not tell us much...

I had some ideas about why it might have failed, based on the Writing R Extensions manual, which has a dedicated section on garbage collection issues in C code (https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Garbage-Collection). It is definitely worth a look if you have not read it yet.

Some interesting things to check:

  1. Notice that it is the object which is protected, not the pointer variable. It is a common mistake to believe that if you invoked PROTECT(p) at some point then p is protected from then on, but that is not true once a new object is assigned to p.

  2. In some cases it is necessary to keep better track of whether protection is really needed. Be particularly aware of situations where a large number of objects are generated. The pointer protection stack has a fixed size (default 10,000) and can become full.

It shouldn't be the second case, since the test example is quite small ;) Because the problem occurs so randomly, I (like you) would guess that something that needs to be protected isn't actually protected.

I wasn't sure about the line of code that you pointed out as the cause of the failure. But if Rf_PrintValue(prlst); really is always the point where the error occurs, that may be a hint to check prlst and its contents more closely.

As I said, in the end I couldn't fix it, but I also did not spend too much time on it.
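The first point from the manual is easy to misread, so here is a toy model of it in plain C. This is not R's implementation: toy_obj, toy_protect, toy_unprotect and toy_is_protected are hypothetical stand-ins invented for this sketch. It only illustrates that a protection stack records object addresses, so reassigning the pointer variable leaves the newly assigned object unprotected.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_STACK_SIZE 16

typedef struct { int payload; } toy_obj;   /* stand-in for an R object */

static toy_obj *toy_stack[TOY_STACK_SIZE]; /* stand-in for the protect stack */
static size_t toy_top = 0;

/* Records the object's address and returns it, in the spirit of PROTECT.
 * Unlike R, this toy silently ignores a full stack instead of raising
 * an error. */
toy_obj *toy_protect(toy_obj *obj)
{
  if (toy_top < TOY_STACK_SIZE) toy_stack[toy_top++] = obj;
  return obj;
}

/* Pops the n most recent entries, in the spirit of UNPROTECT(n). */
void toy_unprotect(size_t n)
{
  toy_top = (n > toy_top) ? 0 : toy_top - n;
}

/* An object counts as protected only if its own address is recorded. */
bool toy_is_protected(const toy_obj *obj)
{
  for (size_t i = 0; i < toy_top; ++i) {
    if (toy_stack[i] == obj) return true;
  }
  return false;
}
```

In this model, after `toy_obj *p = toy_protect(&a); p = &b;` the object `a` is still recorded, but the object that `p` now points to is not, which is the situation the manual warns about.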


The function C_int_mat_nth_row_nrnc writes values beyond the end of the vector it allocates.

  1. The allocation in line 5 is of size nc.
  2. Then, the loop in line 12 uses nr as its limit
  3. ... which is larger than nc at the call in line 39, where 2 is passed as nc.

SEXP C_int_mat_nth_row_nrnc(int *int_mat_int, int nr, int nc, int n) {
  SEXP out = PROTECT(Rf_allocVector(INTSXP, nc)); // allocating with `nc`
  ...
      for (int i = 0; i != nr; ++i) { // but `nr` is used as a limit
        out_int[i] = ...
      }
}
...
SEXP C_fullocate(SEXP int_mat) {
  ...
  row_num = 2;
  while (row_num <= nr) {
    ...
    SEXP row = PROTECT(C_int_mat_nth_row_nrnc(int_mat_int, nr, 2, row_num)); // !!!
    ...
  }
}
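A bounds-safe version of the row copy can be sketched in plain C as follows. This is a minimal sketch, not the original function: the name copy_nth_row is hypothetical, the matrix is assumed to be column-major, and n is assumed to be a 1-based row number (as the loop starting at row_num = 2 suggests). The point is only that the loop limit is the same nc that the output buffer was sized with.

```c
/* Copies row n (1-based) of an nr x nc column-major integer matrix into
 * out, which must have room for nc ints. The loop runs over the nc
 * columns, so it never writes past the end of out. */
void copy_nth_row(const int *mat, int nr, int nc, int n, int *out)
{
  for (int j = 0; j != nc; ++j) {
    out[j] = mat[(n - 1) + j * nr]; /* element (n, j + 1) */
  }
}
```

For a 4 x 2 matrix, each call writes exactly two values, however many rows the matrix has.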

Tags: C, R