C++ for a C# developer

I know you say you've got a good grasp of pointers and memory management, but I'd still like to explain an important trick. As a general rule of thumb, never have new/delete in your user code.

Every resource acquisition (whether it's a synchronization lock, a database connection or a chunk of memory or anything else that must be acquired and released) should be wrapped in an object so that the constructor performs the acquisition, and the destructor releases the resource. The technique is known as RAII, and is basically the way to avoid memory leaks. Get used to it. The C++ standard library obviously uses this extensively, so you can get a feel for how it works there. Jumping a bit in your questions, the equivalent of List<T> is std::vector<T>, and it uses RAII to manage its memory. You'd use it something like this:

#include <vector>

void foo() {

  // declare a vector *without* using new. We want it allocated on the stack, not
  // the heap. The vector can allocate data on the heap if and when it feels like
  // it internally. We just don't need to see it in our user code
  std::vector<int> v;
  v.push_back(4);
  v.push_back(42); // Add a few numbers to it

  // And that is all. When we leave the scope of this function, the destructors 
  // of all local variables, in this case our vector, are called - regardless of
  // *how* we leave the function. Even if an exception is thrown, v still goes 
  // out of scope, so its destructor is called, and it cleans up nicely. That's 
  // also why C++ doesn't have a finally clause for exception handling, but only 
  // try/catch. Anything that would otherwise go in the finally clause can be put
  // in the destructor of a local object.
} 

If I had to pick one single principle that a C++ programmer must learn and embrace, it's the above. Let the scoping rules and the destructors work for you. They offer all the guarantees you need to write safe code.
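To make that concrete for resources other than memory, here is a minimal sketch of a hand-written RAII wrapper. The names (Handle, open_handles, use_resource, handles_after_use) are made up for illustration, and a plain counter stands in for whatever real resource (lock, connection, file) you'd be guarding:

```cpp
#include <stdexcept>

// Hypothetical stand-in for a real resource: counts how many are currently held.
int open_handles = 0;

class Handle {
public:
    Handle()  { ++open_handles; }  // constructor acquires
    ~Handle() { --open_handles; }  // destructor releases

    // Copying would release the same resource twice, so forbid it.
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
};

// Leaves either by returning or by throwing; h is destroyed in both cases.
void use_resource(bool fail) {
    Handle h;
    if (fail) throw std::runtime_error("something went wrong");
}

// Runs use_resource and reports how many handles are still held afterwards.
int handles_after_use(bool fail) {
    try {
        use_resource(fail);
    } catch (const std::runtime_error&) {
        // Nothing to clean up here: the destructor already ran.
    }
    return open_handles;
}
```

Whether use_resource returns normally or throws, the count is back where it started, and there is no cleanup code at the call site at all.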

String handling:

std::string is your friend there. In C, you'd use arrays of chars (or char pointers), but those are nasty, because they don't behave as strings. In C++, you have a std::string class, which behaves as you'd expect. The only thing to keep in mind is that "hello world" is of type const char[12] and NOT std::string (for C compatibility), so sometimes you have to explicitly convert your string literal (something enclosed in quotes, like "hello world") to a std::string to get the behavior you want. You can still write

std::string s = "hello world";

because C-style strings (such as literals, like "hello world") are implicitly convertible to std::string, but it doesn't always work: "hello" + " world" won't compile, because the + operator isn't defined for two pointers. "hello worl" + 'd' however, will compile, but it won't do anything sensible. Instead of appending a char to a string, it will take the integral value of the char (which gets promoted to an int), and add that to the value of the pointer.

std::string("hello worl") + "d", however, works, because the left hand side is already a std::string, and the addition operator is overloaded for std::string to do as you'd expect, even when the right hand side is a char* or a single character.
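A small sketch of the cases that do work, with hypothetical function names (greet, join_literals) chosen just for this example:

```cpp
#include <string>

// Appending a single char to a std::string does what you'd expect.
std::string greet() {
    std::string s = "hello worl";  // implicit conversion from the literal
    s = s + 'd';                   // std::string + char appends the character
    return s;
}

// "hello" + " world" would not compile, so convert one side first.
std::string join_literals() {
    return std::string("hello") + " world";
}
```

The rule of thumb is that at least one operand of + has to be a std::string already; after that, the overloads take care of literals and single characters.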

One final note on strings: std::string uses char, which is a single-byte datatype. That is, a single char cannot hold an arbitrary Unicode character (a std::string can carry a multi-byte encoding such as UTF-8, but it just sees bytes). C++ provides the wide character type wchar_t, which is 2 or 4 bytes depending on platform, and is typically used for Unicode text (although in neither case does the C++ standard really specify the character set). A string of wchar_t is called std::wstring.
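To illustrate the difference, here is a rough sketch that stores the same character, é (U+00E9), both ways. It assumes the narrow string is encoded as UTF-8, which is my assumption and not something the language guarantees; the function names are made up:

```cpp
#include <cstddef>
#include <string>

// The two UTF-8 bytes of U+00E9, written out explicitly so the result does
// not depend on how the source file itself is encoded.
std::size_t utf8_length() {
    std::string s = "\xC3\xA9";
    return s.size();  // counts chars (bytes), not characters
}

// The same character as a wide string: one wchar_t holds it.
std::size_t wide_length() {
    std::wstring w = L"\u00E9";
    return w.size();  // counts wchar_t units
}
```

So size() on a std::string tells you how many bytes you have, which is only the number of characters as long as you stick to plain ASCII.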

Libraries:

They don't exist, fundamentally. The C++ language has no notion of libraries, and this takes some getting used to. It allows you to #include another file (typically a header file with the extension .h or .hpp), but this is simply a verbatim copy/paste. The preprocessor simply combines the two files, resulting in what is called a translation unit. Multiple source files will typically include the same headers, and that only works under certain specific circumstances, so this bit is key to understanding the C++ compilation model, which is notoriously quirky. Instead of compiling a bunch of separate modules and exchanging some kind of metadata between them, as a C# compiler would, each translation unit is compiled in isolation, and the resulting object files are passed to a linker, which then tries to merge the common bits back together (if multiple translation units included the same header, you essentially have code duplicated across translation units, so the linker merges them back into a single definition) ;)
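As a sketch of what those circumstances usually look like in practice: a header carries an include guard, declarations, and at most inline definitions, while the ordinary definitions live in exactly one source file. The file names and functions below (ids.hpp, ids.cpp, twice, next_id) are hypothetical, and both "files" are shown in one listing so it compiles as is:

```cpp
// ---- what a hypothetical header "ids.hpp" might contain ----
#ifndef IDS_HPP   // include guard: a second #include of this file in the
#define IDS_HPP   // same translation unit expands to nothing

// A declaration only. Repeating it in every translation unit is harmless.
int twice(int x);

// 'inline' allows this definition to appear in several translation units;
// the linker keeps a single copy.
inline int next_id() {
    static int id = 0;
    return ++id;
}

#endif // IDS_HPP

// ---- what a hypothetical source file "ids.cpp" might contain ----
// #include "ids.hpp"   // would paste the text above in here
int twice(int x) { return 2 * x; }
```

If twice were defined in the header without inline and the header were included from two source files, each object file would carry its own definition and the linker would complain about the duplicate.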

Of course there are platform-specific ways to write libraries. On Windows, you can make .dll's or .libs, with the difference that a .lib is linked into your application, while a .dll is a separate file you have to bundle with your app, just like in .NET. On Linux, the equivalent filetypes are .so and .a, and in all cases, you have to supply the relevant header files as well, for people to be able to develop against your libraries.

Data type conversions:

I'm not sure exactly what you're looking for there, but one point I feel is significant is that the "traditional" cast as in the following, is bad:

int i = (int)42.0f; 

There are several reasons for this. First, it attempts to perform several different types of casts in order, and you may be surprised by which one the compiler ends up applying. Second, it's hard to find in a search, and third, it's not ugly enough. Casts are generally best avoided, and in C++, they're made a bit ugly to remind you of this. ;)

// The most common cast, when the types are known at compile-time. That is, if 
// inheritance isn't involved, this is generally the one to use
static_cast<U>(T); 

// The equivalent for polymorphic types. Does the same as above, but performs a 
// runtime typecheck to ensure that the cast is actually valid
dynamic_cast<U>(T); 

// Is mainly used for converting pointer types. Basically, it says "don't perform
// an actual conversion of the data (like from 42.0f to 42), but simply take the
// same bit pattern and reinterpret it as if it had been something else". It is
// usually not portable, and in fact, guarantees less than I just said.
reinterpret_cast<U>(T); 

// For adding or removing const-ness. You can't call a non-const member function
// of a const object, but with a const-cast you can remove the const-ness from 
// the object. Generally a bad idea, but can be necessary.
const_cast<U>(T);

As you'll note, these casts are much more specific, which means the compiler can give you an error if the cast is invalid (unlike the traditional syntax, where it'd just try any of the above casts until it finds one that works). They're also big and verbose, which makes them easy to search for and reminds you that they should be avoided when possible. ;)
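Since the U and T above are placeholders, here is a minimal sketch with concrete types. All the names (Base, Derived, Other, to_int, is_derived, bump, bumped, round_trips) are invented for the example:

```cpp
#include <cstdint>

struct Base    { virtual ~Base() {} };  // polymorphic, so dynamic_cast works
struct Derived : Base {};
struct Other   : Base {};

// static_cast: an ordinary value conversion, checked at compile time.
int to_int(float f) { return static_cast<int>(f); }

// dynamic_cast: checked at runtime; on pointers it yields a null pointer
// when the object is not actually of the requested type.
bool is_derived(Base* b) { return dynamic_cast<Derived*>(b) != nullptr; }

// const_cast: strips const. Only safe because the object that bumped()
// passes in was not declared const in the first place.
void bump(const int& r) { ++const_cast<int&>(r); }
int bumped() {
    int n = 1;
    bump(n);
    return n;
}

// reinterpret_cast: pointer to integer and back. About the only thing you
// may rely on is that the round trip gives back the original pointer.
bool round_trips(int* p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(bits) == p;
}
```

Note that dynamic_cast on a reference throws std::bad_cast instead of returning null, since there is no such thing as a null reference.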

The standard library:

Finally, getting back to data structures, put some effort into understanding the standard library. It is small, but amazingly versatile, and once you learn how to use it, you'll be in a far better position.

The standard library consists of several pretty distinct building blocks, because it has kind of accumulated over time. Parts of it were ported from C. The I/O streams library was adopted from one place, and the container classes and their associated functionality were adopted from a completely different library and are designed noticeably differently. The latter are part of what is often referred to as the STL (Standard Template Library). Strictly speaking, that is the name of the library that, slightly modified, got adopted into the C++ Standard Library.

The STL is key to understanding "modern C++". It is composed of three pillars, containers, iterators and algorithms. In a nutshell, containers expose iterators, and algorithms work on iterator pairs.

The following example takes a vector of int's, adds 1 to each element, and copies it to a linked list, just for the sake of example:

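#include <algorithm> // std::transform
#include <iterator>  // std::back_inserter
#include <list>
#include <vector>

int add1(int i) { return i+1; } // The function we wish to apply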

void foo() {
  std::vector<int> v;
  v.push_back(1);
  v.push_back(2);
  v.push_back(3);
  v.push_back(4);
  v.push_back(5); // Add the numbers 1-5 to the vector

  std::list<int> l;

  // Transform is an algorithm which applies some transformation to every element
  // in an iterator range, and stores the output to a separate iterator
  std::transform (
  v.begin(),
  v.end(), // Get an iterator range spanning the entire vector
  // Create a special iterator which, when you move it forward, adds a new
  // element to the container it points to. The output will be assigned to this
  std::back_inserter(l),
  add1); // And finally, the function we wish to apply to each element
}

The above style takes some getting used to, but it is extremely powerful and concise. Because the transform function is templated, it can accept any types as input, as long as they behave as iterators. This means that the function can be used to combine any types of containers, or even streams or anything else that can be iterated through, as long as the iterator is designed to be compatible with the STL. We also don't have to use the begin/end pair. Instead of the end iterator, we could have passed one pointing to the third element, and the algorithm would then have stopped there. Or we could have written custom iterators which skipped every other element, or whatever else we liked. The above is a basic example of each of the three pillars. We use a container to store our data, but the algorithm we use to process it doesn't actually have to know about the container. It just has to know about the iterator range on which it has to work. And of course each of these three pillars can be extended by writing new classes, which will then work smoothly together with the rest of the STL.
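As a sketch of the "stop at the third element" variation (the function name transform_until_third is made up; add1 is the same function as in the example above):

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <vector>

int add1(int i) { return i + 1; }

std::list<int> transform_until_third() {
    std::vector<int> v;
    for (int i = 1; i <= 5; ++i) v.push_back(i);  // the numbers 1-5

    std::list<int> l;
    // v.begin() + 2 points at the third element. Iterator ranges are
    // half-open, so the element the end iterator points at is NOT processed:
    // only 1 and 2 are transformed.
    std::transform(v.begin(), v.begin() + 2, std::back_inserter(l), add1);
    return l;
}
```

The half-open convention is the same reason v.end() points one past the last element instead of at it.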

In a sense, this is very similar to LINQ, so since you're coming from .NET, you can probably see some analogies. The STL counterpart is a bit more flexible though, at the cost of slightly weirder syntax. :) (As mentioned in comments, it is also more efficient. In general, there is zero overhead to STL algorithms, they can be just as efficient as hand-coded loops. This is often surprising, but is possible because all relevant types are known at compile-time (which is a requirement for templates to work), and C++ compilers tend to inline aggressively.)


You've got some toolkits available. For example, there are STL (Standard Template Library) and Boost/TR1 (extensions to STL) that are considered industry standards (well, STL is, at least). These provide lists, maps, sets, shared pointers, strings, streams, and all sorts of other handy tools. Best of all, they're widely supported across compilers.

As for data conversions, you can either do casts or create explicit converter functions.

Libraries - You can either create static libraries (which get absorbed into the final executable) or DLLs (you're familiar with these already). MSDN is an awesome resource for DLLs. How you build static libraries depends on your build environment.

In general, this is my advice:
- Get to know your IDE of choice very well
- Purchase "C++ The Complete Reference" by Herbert Schildt, which I consider to be an excellent tome on all things C++ (includes STL)

Considering your background, you should be well set once you do both of those.


I'll not repeat what others have said about libraries and such, but if you're serious about C++, do yourself a favor and pick up Bjarne Stroustrup's "The C++ Programming Language."

It took me years of working in C++ to finally pick up a copy, and once I did, I spent an afternoon slapping my forehead saying "of course! I should have realized! etc."

(Ironically, I had EXACTLY the same experience with K&R's "The C Programming Language." Someday, I'll learn to just go get "The Book" on day 1.)

Tags:

C++