Using Integer vs String for a "type" value (Database and class design)

Well, here goes.. potential downVote fodder, but enums are one of my favorite things..

Perspective

Focus on what's best for your design and code. Not using enums because a DB doesn't have that "data type"? OK, then let's not make any custom classes for the same reason. That's nonsense.

enum goodness

(Note: I code in C#)

  1. Allows you to declare things in terms of your problem domain. A string is a string but a MonsterType enum IS a Monster (type).
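  2. enum is strongly typed, essentially. Using an undefined enum member is a compile error. OTOH, a string typo is another debugging session. (You can still cast an arbitrary integer to an enum, so check values that come from outside your code.)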
  3. I particularly like it for switch statements. Beats theHellOutta integers: switch(MonsterType)
  4. an enum can exhaustively define all valid values
  5. tends to be document-y
  6. Code quality enhancer - sure you have to recompile to add a new member, but that's deliberate, and missing a related code change will blow up (a good thing), whereas a new string value not handled somewhere shows up as an elusive logic/processing error that costs you a debug session.
  7. defaults to zero, not null. It's funny how the null concept can muck things up. Keep "null" in the database where it belongs.
  8. Pro coding tip: ALWAYS explicitly define a member for the default, e.g. MonsterType.Unknown = 0. Your reward will be code that is easier to write, less buggy and easier to read. Your maintenance programmer will thank you.
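
Here's a minimal sketch of points 3, 7 and 8. The member names, their numbers and the Describe helper are made up for illustration; they are not from the question.

```csharp
using System;

// Hypothetical monster types. Unknown is deliberately the zero value,
// so default(MonsterType) is a named, meaningful member.
public enum MonsterType
{
    Unknown = 0,
    Ground = 1,
    Water = 2,
    Undead = 3,
    Aerial = 4
}

public static class MonsterText
{
    // Switch on domain names instead of bare integers or strings.
    public static string Describe(MonsterType type)
    {
        switch (type)
        {
            case MonsterType.Ground: return "walks";
            case MonsterType.Water: return "swims";
            case MonsterType.Undead: return "shambles";
            case MonsterType.Aerial: return "flies";
            default: return "unknown monster type";
        }
    }
}
```

An uninitialized MonsterType field is MonsterType.Unknown, never null, so Describe always has something sensible to say.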

string badness

  1. default value is null. It's a constant battle flip-flopping around null and "empty string". I see stupid, bug-ridden code like this frequently: if (string.IsNullOrEmpty(myString.Trim())) ... - if myString is null your program blows up with a runtime exception. Cannot happen with enums.
  2. string comparison is case sensitive by default
  3. error prone to typos
  4. "string" type is generic and in no way helps you express your problem in terms of your domain
  5. a myString that is null (say, because that's what came back from the database) IS NOT a variable with a value, it IS NOT even an object, but we try to treat it like it is!
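
To make points 1 and 2 concrete, here's a small sketch. StringChecks and its method names are my own invention for the example; string.IsNullOrWhiteSpace and string.Equals with a StringComparison are standard .NET.

```csharp
using System;

public static class StringChecks
{
    // The buggy pattern from point 1: Trim() runs before the null check,
    // so a null argument throws NullReferenceException.
    public static bool IsBlankBuggy(string s)
    {
        return string.IsNullOrEmpty(s.Trim());
    }

    // Handles null, "" and whitespace-only strings in one call.
    public static bool IsBlank(string s)
    {
        return string.IsNullOrWhiteSpace(s);
    }

    // Case-insensitive comparison has to be asked for explicitly.
    public static bool SameName(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }
}
```

Every place that touches the string has to remember to do this. With an enum there is nothing to remember.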

integer obfuscation

if (MonsterType == 3)... What the heck does that mean? Nuff said.

Here's a practical exercise. In your IDE, click on that "3" and ask it to "find definition." If your IDE could talk it would say "why don't you define the things in your problem domain instead of expecting me to guess what "3" means?"

Oh, I know. I'll declare a mess of constants somewhere.. somewhere. OMG. Let's pretend we're using enums! And it's more fun than working with a cohesive set of type safe values!
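
If the database column really is an integer, you can still keep the magic numbers at the edge of your code. A rough sketch, assuming a nullable integer column and the same made-up MonsterType members as before; FromDatabase is a hypothetical helper, not an existing API:

```csharp
using System;

// Hypothetical monster types with explicit numbers.
public enum MonsterType
{
    Unknown = 0,
    Ground = 1,
    Water = 2,
    Undead = 3,
    Aerial = 4
}

public static class MonsterTypeMapper
{
    // Converts the raw column value once, at the boundary.
    // NULL and numbers that are not defined members both become Unknown.
    public static MonsterType FromDatabase(int? raw)
    {
        if (raw.HasValue && Enum.IsDefined(typeof(MonsterType), raw.Value))
        {
            return (MonsterType)raw.Value;
        }
        return MonsterType.Unknown;
    }
}
```

After that, the rest of the code reads if (type == MonsterType.Undead) instead of if (type == 3), and "find definition" actually finds something.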

null drives me nuts

C# has a String.Empty static field for a reason. The point is to not hijack a valid value (a space, for example) or null to represent an instantiated string object whose value is explicitly not any (valid) string. THIS is not the same thing as null.

null means nothing, literally. It means "go away, no one is home". Coders too often want it to mean "monster type unknown" for example, but for crying out loud define such concepts explicitly (see enum pro tip above). IMHO using null like this means your design is missing something.

null is the code zombie. It's walking around but it's nothing, dead, whatever. And it will bite you if you're not careful!

You will experience the never ending joy of deciding to store "empty" strings as a space character or as null;

and that a space is actually a valid string-domain value, not a "no value" value;

and that null and "empty string" really don't mean the same thing anyway and trying to make them so causes problems when you need their "natural" functionality;

and the constant fussing within code converting from string.Empty to null to space and vice versa to suit the task at hand. Over time your code will become inconsistent in this regard.

Hear me now and believe me later.


Long answer

Your database guy is obviously wrong. There is, of course, something like an ENUM data type. You provided one in your example. And MySQL knows of it. And lots of programming languages have something like an ENUM.

But he is also right, in that ENUMs are often (always?) optimized by compilers. There may be four choices that could be represented perfectly by 1 through 4, and we just happen to find identifiable strings easier to read in code. Compilers have no such problem and in fact only care about the number: whether type 4 is called the Aerial monster or the Flying Spaghetti Monster makes no difference. It is also far cheaper for a CPU to compare bytes than to compare strings.

Also, he is right that having an ENUM in C code or whatever code can be a problem:

  1. If you change the definitions (especially the order), you need to recompile the program and all linked programs if it is a library. That is a pain.
  2. If you need to interface with different languages, that can also be a pain. You need to sync multiple definitions.
  3. Managing ENUMs is cumbersome, especially removing types.

You can work around this by having a function that translates strings to enums or the other way around.
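
A minimal sketch of such a translation function, written in C# only as an example (the same idea works in C or anything else). The MonsterType members and the MonsterTypeNames helper are hypothetical; Enum.TryParse and Enum.IsDefined are standard .NET calls.

```csharp
using System;

// Hypothetical monster types with explicit numbers.
public enum MonsterType
{
    Unknown = 0,
    Ground = 1,
    Water = 2,
    Undead = 3,
    Aerial = 4
}

public static class MonsterTypeNames
{
    // Enum to string: the member name as written in the source.
    public static string ToName(MonsterType type)
    {
        return type.ToString();
    }

    // String to enum, ignoring case. TryParse also accepts numeric text
    // such as "42", so IsDefined rejects values that are not real members.
    public static MonsterType FromName(string name)
    {
        MonsterType parsed;
        if (Enum.TryParse(name, true, out parsed)
            && Enum.IsDefined(typeof(MonsterType), parsed))
        {
            return parsed;
        }
        return MonsterType.Unknown;
    }
}
```

Other languages and the database then only need to agree on the names (or the numbers), and the mapping lives in one place.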

But there are also benefits:

  1. If you change the name of the monster type, any Aerial monster can remain type 4, even if you rename them to Flying monsters. Data consistency in the database is guaranteed, because it is a number. If you use strings, there is no painless way of converting. Well, find and replace will do in code, but not in databases.
  2. It is an efficient format. It will save you something like 10 bytes per value. My experience is that this rarely matters, except if you have tens of millions of entries.
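
Both benefits can be sketched like this, again in C# and again with made-up names. It assumes the numbers are written out explicitly and the column is a single byte; MonsterStorage is a hypothetical helper.

```csharp
using System;

// Explicit numbers on a one-byte underlying type. Renaming a member
// changes only the source code, not the number stored in the database.
public enum MonsterType : byte
{
    Unknown = 0,
    Ground = 1,
    Water = 2,
    Undead = 3,
    Flying = 4 // formerly named Aerial; rows that store 4 are untouched
}

public static class MonsterStorage
{
    // What gets written to the column: one byte.
    public static byte ToColumn(MonsterType type)
    {
        return (byte)type;
    }

    // What gets read back from the column.
    public static MonsterType FromColumn(byte value)
    {
        return (MonsterType)value;
    }
}
```

Because the numbers are pinned, reordering or inserting members doesn't silently shift them either, which takes some of the sting out of the first problem listed above.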

TL;DR

No, there is no objective answer.

If you find programmer ease most important, then strings can be a better option. If you find compiler optimization most important, then an enum is a better option.

My opinion is that compiler optimization is rarely important, but scarce programmer time is. I myself mostly use strings, except in certain databases.

So yes, your guy has a point.