Why can Kotlin data classes have nulls in non-nullable fields with Gson?

This happens because, for classes without a no-argument constructor (which includes most Kotlin data classes), Gson uses an unsafe (as in sun.misc.Unsafe) instance construction mechanism that bypasses their constructors, and then sets their fields directly via reflection.

See this Q&A for some research: Gson Deserialization with Kotlin, Initializer block not called.

As a consequence, Gson ignores both the construction logic and the class state invariants, so it is not recommended to use it for complex classes which may be affected by this. It ignores the value checks in the setters as well.
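
The mechanism can be reproduced without Gson. The following is a minimal sketch that calls sun.misc.Unsafe directly, assuming a JVM where the `theUnsafe` field is reachable through reflection. The `User` class and the helper names are hypothetical and exist only for illustration; this is not Gson's actual code.

```kotlin
import sun.misc.Unsafe

// Hypothetical class: `name` is non-nullable and the init block enforces an invariant.
data class User(val name: String, val age: Int) {
    init {
        require(name.isNotBlank()) { "name must not be blank" }
    }
}

// Obtains the JVM's Unsafe singleton through reflection (assumption: field is accessible).
fun unsafe(): Unsafe {
    val field = Unsafe::class.java.getDeclaredField("theUnsafe")
    field.isAccessible = true
    return field.get(null) as Unsafe
}

// Allocates a User without running any constructor or init block,
// so every field keeps its JVM default value.
fun allocateWithoutConstructor(): User =
    unsafe().allocateInstance(User::class.java) as User

// Reads a backing field reflectively, so the result does not depend on
// how the compiler treats the non-null getter.
fun rawField(user: User, fieldName: String): Any? {
    val field = User::class.java.getDeclaredField(fieldName)
    field.isAccessible = true
    return field.get(user)
}
```

With these helpers, `rawField(allocateWithoutConstructor(), "name")` returns null and `rawField(allocateWithoutConstructor(), "age")` returns 0, and the `require` in the init block never runs. A deserializer that allocates objects this way and then fills in only the fields present in the JSON leaves the rest in that state.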

Consider a Kotlin-aware serialization solution, such as Jackson (mentioned in the Q&A linked above) or kotlinx.serialization.


A JSON parser translates between two inherently incompatible worlds. One is Java/Kotlin, with static typing and null safety. The other is JSON/JavaScript, where anything can be anything, including null or absent altogether, and where the concept of "mandatory" belongs to your design, not the language.

So gaps are bound to happen, and they have to be handled somehow. One approach is to throw an exception at the slightest problem (which makes lots of people angry on the spot), and the other is to fabricate values on the fly (which also makes lots of people angry, just a bit later).

Gson takes the second approach. It silently accepts absent fields, leaving object references null and primitives at 0 or false, which masks API errors and causes cryptic failures further downstream.

For this reason, I recommend 2-stage parsing:

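// File 1: transport layer
package com.example.transport

// This class is passed to Gson (or any other parser), so every field is nullable.
// MandatoryIsNullException, CountryTransport and Country are assumed to be defined elsewhere.
data class CountriesResponseTransport(
    val count: Int?,
    val countries: List<CountryTransport>?,
    val error: String?
) {
    // Converts to the domain type, failing fast on the first missing mandatory field.
    fun toDomain() = CountriesResponse(
        count ?: throw MandatoryIsNullException("count"),
        countries?.map { it.toDomain() } ?: throw MandatoryIsNullException("countries"),
        error ?: throw MandatoryIsNullException("error")
    )
}

// File 2: domain layer
package com.example.domain

// This one is actually used in the app; nothing here is nullable.
data class CountriesResponse(
    val count: Int,
    val countries: Collection<Country>,
    val error: String
)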

Yes, it's twice as much work, but it pinpoints API errors immediately and gives you a place to handle those errors if you can't fix them, for example:

   fun toDomain() = CountriesResponse(
           count ?: countries?.size ?: -1, // just to brag we can default to non-zero
           countries?.map { it.toDomain() } ?: emptyList(),
           error ?: MyApplication.INSTANCE.getDefaultErrorMessage()
       )
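
To make the two-stage idea easier to try out, here is a self-contained sketch that fills in the types the snippets above leave undefined. The fields of `Country`, the exception's base class and its message format are assumptions made for illustration, and the `error` field is left out to keep the sketch short.

```kotlin
// Hypothetical exception type; its shape is an assumption.
class MandatoryIsNullException(field: String) :
    IllegalStateException("Mandatory field '$field' is missing or null")

// Transport classes: everything is nullable, so any parser can fill them safely.
data class CountryTransport(val code: String?, val name: String?) {
    fun toDomain() = Country(
        code ?: throw MandatoryIsNullException("code"),
        name ?: throw MandatoryIsNullException("name")
    )
}

data class CountriesResponseTransport(
    val count: Int?,
    val countries: List<CountryTransport>?
) {
    fun toDomain() = CountriesResponse(
        count ?: throw MandatoryIsNullException("count"),
        countries?.map { it.toDomain() } ?: throw MandatoryIsNullException("countries")
    )
}

// Domain classes: non-nullable, used by the rest of the app.
data class Country(val code: String, val name: String)

data class CountriesResponse(val count: Int, val countries: Collection<Country>)
```

Calling `toDomain()` on a transport object whose `count` is null throws a `MandatoryIsNullException` that names the offending field, so the failure surfaces at the API boundary instead of somewhere downstream.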

Yes, you can use a better parser with more options, but you shouldn't. What you should do is abstract the parser away so you can swap in any parser. No matter how advanced and configurable a parser you find today, eventually you'll need a feature that it doesn't support. That's why I treat Gson as the lowest common denominator.
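
One possible shape for that abstraction is sketched below. The `JsonParser` interface, `CannedJsonParser`, `PingTransport` and `PingClient` are all hypothetical names invented for this example. The Gson adapter is shown only as a comment so the sketch stays dependency-free.

```kotlin
// Hypothetical abstraction: the rest of the app depends only on this interface.
interface JsonParser {
    fun <T : Any> parse(json: String, type: Class<T>): T
}

// A Gson-backed adapter could look like this (requires the Gson dependency):
// class GsonJsonParser(private val gson: com.google.gson.Gson) : JsonParser {
//     override fun <T : Any> parse(json: String, type: Class<T>): T =
//         gson.fromJson(json, type)
// }

// Dependency-free stand-in that returns canned objects, e.g. for unit tests.
class CannedJsonParser(private val canned: Map<String, Any>) : JsonParser {
    override fun <T : Any> parse(json: String, type: Class<T>): T =
        type.cast(canned[json] ?: error("No canned response for: $json"))
}

// Hypothetical transport type with nullable fields, as in the two-stage approach.
data class PingTransport(val status: String?)

// Client code never names a concrete parser, so the parser can be swapped freely.
class PingClient(private val parser: JsonParser) {
    fun status(json: String): String =
        parser.parse(json, PingTransport::class.java).status ?: "unknown"
}
```

Because `PingClient` only sees `JsonParser`, replacing Gson with another library means writing one new adapter class, and the nullable transport types keep working with whichever parser sits behind the interface.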

There's an article that explains this concept as used (and expanded) in the bigger context of the repository pattern.