Reading the last n lines from a file in C/C++

Comments in the code

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *in, *out;
    int count = 0;
    long int pos;
    char s[100];

    in = fopen("input.txt", "r");
    /* always check return of fopen */
    if (in == NULL) {
        perror("fopen");
        exit(EXIT_FAILURE);
    }
    out = fopen("output.txt", "w");
    if (out == NULL) {
        perror("fopen");
        exit(EXIT_FAILURE);
    }
    fseek(in, 0, SEEK_END);
    pos = ftell(in);
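    /* Don't write each char on output.txt, just search for '\n' */
    while (pos) {
        fseek(in, --pos, SEEK_SET); /* seek from begin */
        if (fgetc(in) == '\n') {
            /* the 11th '\n' from the end precedes the last 10 lines */
            if (count++ == 10) break;
        }
    }
    /* Fewer than 11 '\n' found: the fgetc above consumed the first char,
       so go back to the start of the file */
    if (count <= 10) rewind(in);
    /* Write line by line; this is faster than calling fputc for each char */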
    while (fgets(s, sizeof(s), in) != NULL) {
        fprintf(out, "%s", s);
    }
    fclose(in);
    fclose(out);
    return 0;
}

There are a number of problems with your code. The most important one is that you never check that any of the functions succeeded. Saving the result of ftell in an int isn't a very good idea either, since ftell returns a long. Then there's the test pos < begin; this can only occur if there was an error. You're also putting the result of fgetc in a char, which results in a loss of information. The first read you do is at the end of file, so it will fail (and once a stream enters an error state, it stays there). Finally, you can't reliably do arithmetic on the values returned by ftell (except under Unix) if the file was opened in text mode.

Oh, and there is no "EOF character"; 'ÿ' is a perfectly valid character (0xFF in Latin-1). Once you assign the return value of fgetc to a char, you've lost any possibility of testing for end of file.
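To illustrate the point about fgetc, here is a minimal sketch that keeps the return value in an int, so that 0xFF and EOF stay distinct. The function name countNewlines is made up for this example and does not come from the original code.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: counts the '\n' characters left in a stream.
// Returns -1 if a read error occurred.
long countNewlines(std::FILE* in)
{
    assert(in != NULL);
    long count = 0;
    int ch;  // int, not char: it must hold every unsigned char value plus EOF
    while ((ch = std::fgetc(in)) != EOF) {
        if (ch == '\n') {
            ++count;
        }
    }
    // EOF is returned both at end of file and on error; ferror tells them apart.
    return std::ferror(in) ? -1 : count;
}
```

With a char instead of an int, a 0xFF byte could compare equal to EOF (where char is signed), or EOF could never match at all (where char is unsigned).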

I might add that reading backwards one character at a time is extremely inefficient. The usual solution would be to allocate a sufficiently large buffer, then count the '\n' in it.

EDIT:

Just a quick bit of code to give the idea:

#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

std::string
getLastLines( std::string const& filename, int lineCount )
{
    if ( lineCount <= 0 ) {
        return std::string();   // also avoids a zero granularity below
    }
    size_t const granularity = 100 * lineCount;
    std::ifstream source( filename.c_str(), std::ios_base::binary );
    source.seekg( 0, std::ios_base::end );
    size_t size = static_cast<size_t>( source.tellg() );
    std::vector<char> buffer;
    int newlineCount = 0;
    // Read ever larger tails of the file until one holds more than
    // lineCount newlines, so that the first line kept is complete
    // (this assumes the last line ends with '\n').
    while ( source 
            && buffer.size() != size
            && newlineCount <= lineCount ) {
        buffer.resize( std::min( buffer.size() + granularity, size ) );
        source.seekg( -static_cast<std::streamoff>( buffer.size() ),
                      std::ios_base::end );
        source.read( buffer.data(), buffer.size() );
        newlineCount = static_cast<int>(
            std::count( buffer.begin(), buffer.end(), '\n' ) );
    }
    // Skip the surplus lines at the front of the buffer.
    std::vector<char>::iterator start = buffer.begin();
    while ( newlineCount > lineCount ) {
        start = std::find( start, buffer.end(), '\n' ) + 1;
        -- newlineCount;
    }
    std::vector<char>::iterator end = std::remove( start, buffer.end(), '\r' );
    return std::string( start, end );
}

This is a bit weak in the error handling; in particular, you probably want to distinguish between the inability to open a file and any other errors. (No other errors should occur, but you never know.)
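One possible way to make that distinction is sketched below: the open step throws one exception type, anything that goes wrong afterwards throws another. The names FileOpenError, FileAccessError and openAndMeasure are assumptions made up for this sketch, not part of the code above.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical exception types, one per kind of failure.
struct FileOpenError : std::runtime_error {
    explicit FileOpenError(std::string const& name)
        : std::runtime_error("cannot open " + name) {}
};
struct FileAccessError : std::runtime_error {
    explicit FileAccessError(std::string const& name)
        : std::runtime_error("error while accessing " + name) {}
};

// Opens `filename` in binary mode and returns its size in bytes.
std::streamoff openAndMeasure(std::ifstream& source, std::string const& filename)
{
    source.open(filename.c_str(), std::ios_base::binary);
    if (!source.is_open()) {
        throw FileOpenError(filename);      // the file could not be opened
    }
    source.seekg(0, std::ios_base::end);
    std::streamoff const size = source.tellg();
    if (!source || size < 0) {
        throw FileAccessError(filename);    // opened, but seek/tell failed
    }
    return size;
}
```

A caller can then catch FileOpenError to report a bad file name and treat FileAccessError as the "should never happen" case.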

Also, this is purely Windows, and it supposes that the actual file contains pure text, and doesn't contain any '\r' that aren't part of a CRLF. (For Unix, just drop the next-to-last line, the one that removes the '\r', and return everything from start to buffer.end().)
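If the file might contain a stray '\r' that is not part of a CRLF, one alternative is to drop only those '\r' that are immediately followed by '\n'. This is a sketch under that assumption; stripCarriageReturns is a hypothetical helper, not something the code above uses.

```cpp
#include <string>

// Hypothetical helper: converts CRLF to LF and leaves any lone '\r' untouched.
std::string stripCarriageReturns(std::string const& text)
{
    std::string result;
    result.reserve(text.size());
    for (std::string::size_type i = 0; i != text.size(); ++i) {
        bool const partOfCrLf = text[i] == '\r'
                             && i + 1 != text.size()
                             && text[i + 1] == '\n';
        if (!partOfCrLf) {
            result += text[i];
        }
    }
    return result;
}
```

It would be applied to the returned string in place of the blanket removal of every '\r'.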


This can be done very efficiently using a circular array, which holds only the last n lines in memory. No additional buffer is required.

#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

void printlast_n_lines(const char* fileName, int n){

    if(n <= 0) return; // nothing to print; also avoids a modulo by zero
    const int k = n;
    ifstream file(fileName);
    vector<string> l(k); // a vector, since variable-length arrays are not standard C++
    string line;
    int size = 0 ;

    while(getline(file, line)){ // test the read itself, so no phantom line at EOF
        l[size%k] = line; //this is just circular array
        size++;
    }

    //start of circular array & size of it 
    int start = size > k ? (size%k) : 0 ; //index of the oldest of the last k lines 
    int count = min(k, size); // no of lines to print

    for(int i = 0; i< count ; i++){
        cout << l[(start+i)%k] << '\n' ; // start at the oldest line and wrap around via the remainder
    }
}

Please provide feedback.
