The std::string constructor that takes a const char* stops at the first \0

Think about it: if you are given a const char*, how would you determine which 0 is the true terminator and which one is embedded?

You need to either pass the size of the string explicitly or construct the string from two iterators (pointers work as iterators here).

#include <string>
#include <iostream>


int main()
{
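    // auto& keeps the array type, so std::begin/std::end see the whole literal.
    auto& str = "String!\0 This is a string too!";
    // The range covers the entire array, including the implicit terminating '\0',
    // so s.size() is 31. Use std::end(str) - 1 to leave the terminator out.
    std::string s(std::begin(str), std::end(str));
    std::cout << s.size() << '\n' << s << '\n';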
}

Example: http://coliru.stacked-crooked.com/a/d42211b7199d458d
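
The other option mentioned above, passing the size explicitly, can be sketched like this. The names raw, raw_len, make_with_size and check_explicit_size are made up for illustration; the only standard piece used is the (pointer, count) constructor of std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Same text as in the answer above. sizeof counts the implicit terminator,
// so the payload is one character shorter than the array.
constexpr char raw[] = "String!\0 This is a string too!";
constexpr std::size_t raw_len = sizeof(raw) - 1;

// Hypothetical helper: the (pointer, count) constructor copies exactly
// `size` characters and never searches for a '\0'.
std::string make_with_size(const char* data, std::size_t size)
{
    return std::string(data, size);
}

// Hypothetical check comparing both constructors.
void check_explicit_size()
{
    const std::string full = make_with_size(raw, raw_len);
    const std::string cut(raw);   // const char* constructor

    assert(full.size() == 30);    // embedded '\0' is kept
    assert(full[7] == '\0');
    assert(cut.size() == 7);      // stops at the first '\0'
}
```

Calling check_explicit_size() from main should pass silently: the sized version keeps all 30 characters, the pointer-only version keeps 7.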

Edit: @Rakete1111 reminded me about string literals:

using namespace std::literals::string_literals;
auto str = "String!\0 This is a string too!"s;

Your std::string really has only 7 characters and a terminating '\0', because that's how you construct it. Look at the list of std::basic_string constructors: There is no array version which would be able to remember the size of the string literal. The one at work here is this one:

basic_string( const CharT* s,
              const Allocator& alloc = Allocator() );

The "String!\0 This is a string too!" char const[] array is converted to a pointer to its first char element. That pointer is passed to the constructor and is all the information it has. To determine the size of the string, the constructor has to increment the pointer until it finds the first '\0', and that happens to be the one in the middle of the array.
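
To illustrate what an "array version" could look like, here is a minimal sketch. from_literal and check_from_literal are hypothetical names and not part of the standard library; the point is only that a reference-to-array parameter keeps the length that a plain pointer loses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not a standard facility: taking the literal by
// reference to array preserves N, the array length including the final '\0'.
template <std::size_t N>
std::string from_literal(const char (&lit)[N])
{
    return std::string(lit, N - 1);  // N - 1 drops only the implicit terminator
}

// Hypothetical check: same literal, two very different sizes.
void check_from_literal()
{
    assert(from_literal("String!\0 This is a string too!").size() == 30);
    assert(std::string("String!\0 This is a string too!").size() == 7);
}
```

This only works when the argument really is an array. Once the literal has decayed to a const char*, the length is gone and no helper can recover it.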


If you happen to work with a lot of zero bytes in your strings, then chances are that std::vector<char> or even std::vector<unsigned char> would be a more natural solution to your problem.
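
A small sketch of that idea, assuming the data is really a byte buffer rather than text. make_payload, count_zero_bytes and check_payload are invented for this example, and so are the byte values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical payload with zero bytes in the middle. A vector stores its
// own size, so a 0 is just another element.
std::vector<unsigned char> make_payload()
{
    return {0x01, 0x00, 0x02, 0x00, 0x00, 0x03};
}

// Counts how many bytes in the buffer are zero.
std::size_t count_zero_bytes(const std::vector<unsigned char>& bytes)
{
    std::size_t zeros = 0;
    for (unsigned char b : bytes) {
        if (b == 0) ++zeros;
    }
    return zeros;
}

// Hypothetical check: nothing is cut off at the first zero byte.
void check_payload()
{
    const std::vector<unsigned char> payload = make_payload();
    assert(payload.size() == 6);
    assert(count_zero_bytes(payload) == 3);
}
```

No constructor here ever scans for a terminator, so the question of which 0 is the "real" end does not come up.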


You are constructing your std::string from a string literal. String literals are automatically terminated with a '\0'. A string literal "f\0o" is thus encoded as the following array of characters:

{'f', '\0', 'o', '\0'}

The string constructor taking a char const* will be called, and will be implemented something like this:

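string(char const* s) {
    auto e = s;
    while (*e != '\0') ++e;           // scan forward to the first '\0'

    m_length = e - s;                 // number of characters before that '\0'
    m_data = new char[m_length + 1];
    memcpy(m_data, s, m_length + 1);  // copy the characters plus the terminator
}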

Obviously this isn't a technically correct implementation, but you get the idea. The '\0' you manually inserted will be interpreted as the end of the string, so everything after it is never copied.
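
The difference between the size of the array and the length found by scanning can be checked directly. This is just a sketch using the "f\0o" literal from above; check_literal_length is a made-up name.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// The literal is an array of 4 chars: 'f', '\0', 'o', '\0'.
static_assert(sizeof("f\0o") == 4, "array holds both '\\0' characters");

// Hypothetical check of what each way of measuring the literal reports.
void check_literal_length()
{
    assert(std::strlen("f\0o") == 1);            // scan stops at the first '\0'
    assert(std::string("f\0o").size() == 1);     // same scan inside the constructor
    assert(std::string("f\0o", 3).size() == 3);  // explicit count keeps everything
}
```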

If you want the embedded '\0' to be kept as an ordinary character instead of ending the string, you can use a std::string literal (the s suffix, available since C++14):

#include <iostream>
#include <string>

int main ()
{
    using namespace std::string_literals;

    std::string s("String!\0 This is a string too!"s);
    std::cout << s.length(); // same result as with s.size()
    std::cout << std::endl << s;

    return 0;
}

Output:

30
String! This is a string too!
