JavaScript String concatenation behavior with null or undefined values

You can use Array.prototype.join to ignore undefined and null:

['a', 'b', void 0, null, 6].join(''); // 'ab6'

According to the spec:

If element is undefined or null, let next be the empty String; otherwise, let next be ToString(element).


Given that,

  • What is the history behind the oddity that makes JS convert null or undefined to their string values in String concatenation?

    In fact, in some cases, the current behavior makes sense.

    function showSum(a,b) {
        alert(a + ' + ' + b + ' = ' + (+a + +b));
    }

    For example, if the function above is called without arguments, undefined + undefined = NaN is probably better than + = NaN.

    In general, I think that if you want to insert some variables in a string, displaying undefined or null makes sense. Probably, Eich thought that too.

    Of course, there are cases in which ignoring those would be better, such as when joining strings together. But for those cases you can use Array.prototype.join.

  • Is there any chance for a change in this behavior in future ECMAScript versions?

    Most likely not.

    Since there already is Array.prototype.join, modifying the behavior of string concatenation would only bring disadvantages, no advantages. Moreover, it would break old code, so it wouldn't be backwards compatible.

  • What is the prettiest way to concatenate String with potential null or undefined?

    Array.prototype.join seems to be the simplest one. Whether it's the prettiest or not may be opinion-based.
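
    For example, a minimal sketch of how that could look when building a message from parts that may be missing (the variable names are just illustrative):

    const greeting = "Hello";
    const name = undefined; // an optional value that might be missing

    // join('') turns null and undefined into the empty string
    const message = [greeting, ", ", name, "!"].join("");
    // "Hello, !" (instead of "Hello, undefined!" with +)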


What is the prettiest way to concatenate String with potential null or undefined object without falling into this problem [...]?

There are several ways, and you partly mentioned them yourself. To make it short, the only clean way I can think of is a function:

const Strings = {};
Strings.orEmpty = function( entity ) {
    return entity || "";
};

// usage ("test" is any value that might be null or undefined)
const message = "This is a " + Strings.orEmpty( test );

Of course, you can (and should) change the actual implementation to suit your needs. And this is already why I think this method is superior: it introduces encapsulation.
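
For instance, if you later decide that only null and undefined should be dropped (but false or 0 should still show up), a sketch of the change could look like this, and no call site has to be touched:

// only treat null and undefined as "empty"; false, 0 and "" are kept
Strings.orEmpty = function( entity ) {
    return entity == null ? "" : String( entity );
};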

Really, you only have to ask what the "prettiest" way is if you don't have encapsulation. You ask yourself this question because you already know that you are going to end up in a place where you cannot change the implementation anymore, so you want it to be perfect right away. But that's the thing: requirements, views and even environments change. They evolve. So why not allow yourself to change the implementation with as little as adapting one line and perhaps one or two tests?

You could call this cheating, because it doesn't really answer how to implement the actual logic. But that's my point: it doesn't matter. Well, maybe a little. But really, there is no need to worry, because of how simple it would be to change. And since it's not inlined, it also looks a lot prettier – whether you implement it this way or in a more sophisticated way.

If, throughout your code, you keep repeating the || inline, you run into two problems:

  • You duplicate code.
  • And because you duplicate code, you make it hard to maintain and change in the future.

And these are two points commonly known to be anti-patterns when it comes to high-quality software development.

Some people will say that this is too much overhead; they will talk about performance. It's nonsense. For one, this barely adds overhead. If this is what you are worried about, you chose the wrong language. Even jQuery uses functions. People need to get over micro-optimization.

The other thing is: you can use a code "compiler", i.e. a minifier. Good tools in this area will try to detect which statements to inline during the compilation step. This way, you keep your code clean and maintainable and can still get that last drop of performance if you still believe in it or really do have an environment where this matters.

Lastly, have some faith in browsers. They will optimize code and they do a pretty darn good job at it these days.


The ECMA Specification

Just to flesh out the reason it behaves this way in terms of the spec, this behavior has been present since version one. The definitions there and in 5.1 are semantically equivalent, so I'll show the 5.1 definitions.

Section 11.6.1: The Addition operator ( + )

The addition operator either performs string concatenation or numeric addition.

The production AdditiveExpression : AdditiveExpression + MultiplicativeExpression is evaluated as follows:

  1. Let lref be the result of evaluating AdditiveExpression.
  2. Let lval be GetValue(lref).
  3. Let rref be the result of evaluating MultiplicativeExpression.
  4. Let rval be GetValue(rref).
  5. Let lprim be ToPrimitive(lval).
  6. Let rprim be ToPrimitive(rval).
  7. If Type(lprim) is String or Type(rprim) is String, then
    a. Return the String that is the result of concatenating ToString(lprim) followed by ToString(rprim)
  8. Return the result of applying the addition operation to ToNumber(lprim) and ToNumber(rprim). See the Note below 11.6.3.

So, if either value ends up being a String, then ToString is used on both arguments (step 7) and those are concatenated (step 7a). ToPrimitive returns all non-object values unchanged, so null and undefined are untouched:

Section 9.1 ToPrimitive

The abstract operation ToPrimitive takes an input argument and an optional argument PreferredType. The abstract operation ToPrimitive converts its input argument to a non-Object type ... Conversion occurs according to Table 10:

For all non-Object types, including both Null and Undefined, [t]he result equals the input argument (no conversion). So ToPrimitive does nothing here.

Finally, Section 9.8 ToString

The abstract operation ToString converts its argument to a value of type String according to Table 13:

Table 13 gives "undefined" for the Undefined type and "null" for the Null type.
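
You can see those table entries, together with step 7 above, directly in a console:

"value: " + null;        // "value: null"      (ToString(null) is "null")
"value: " + undefined;   // "value: undefined" (ToString(undefined) is "undefined")
1 + null;                // 1   (no String operand, so numeric addition: ToNumber(null) is 0)
1 + undefined;           // NaN (ToNumber(undefined) is NaN)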

Will it change? Is it even an "oddity"?

As others have pointed out, this is very unlikely to change as it would break backward compatibility (and bring no real benefit), even more so given that this behavior has been the same since the 1997 version of the spec. I would also not really consider it an oddity.

If you were to change this behavior, would you change the definition of ToString for null and undefined, or would you special-case the addition operator for these values? ToString is used in many, many places throughout the spec, and "null" seems like an uncontroversial choice for representing null. Just to give a couple of examples, in Java "" + null is the string "null", and in Python str(None) is the string "None".

Workaround

Others have given good workarounds, but I would add that I doubt you want to use entity || "" as your strategy, since it treats every falsy value as missing: true becomes "true", but false, 0 and NaN all become "". The array join in this answer has the more expected behavior, or you could change the implementation of this answer to check entity == null instead (both null == null and undefined == null are true).
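
To make that difference concrete, a few illustrative expressions:

// || treats every falsy value as missing, not just null and undefined
"flag: " + (false || "");   // "flag: "  (false silently disappears)
"count: " + (0 || "");      // "count: " (so does 0)

// entity == null is true only for null and undefined,
// so a check based on it keeps false and 0 intact
false == null;      // false
0 == null;          // false
null == null;       // true
undefined == null;  // true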