Polymorphism and inheritance in Avro schemas

I found this question having similar problem. In my case I needed just to impose marker interface and only to some types (to distinguish particular classes later). Thanks to your answer, I dug deeper into structure of record.vm template. I found out it's possible to define "javaAnnotation": "my.full.AnnotationName" key in .avsc definition JSON. @my.full.AnnotationName is then added to generated class.

Admittedly, this solution is not built on marker interface finally, though for my purpose is good enough and keeping template untouched is big advantage.


I decided to use the ReflectData API to generate the Schema from the class at runtime and then use the ReflectDatumWriter for serialization. Use of reflection will be slower. But, it looks like the schema is cached internally. I will report back if I see performance issues.

Schema schema = ReflectData.AllowNull.get().getSchema(sourceObject.getClass());
ReflectDatumWriter<T> reflectDatumWriter = new ReflectDatumWriter<>(schema);

DataFileWriter<T> writer = new DataFileWriter<>(reflectDatumWriter);
try {
    writer.setCodec(CodecFactory.snappyCodec());
    writer.create(schema, new File("data.avro"));
    writer.append(sourceObject);
    writer.close();
}
catch (IOException e) {
    // log exception
}

I found a better way to solve this problem. Looking at the Schema generation source in Avro, I figured out that internally the class generation logic uses Velocity schemas to generate the classes.

I modified the record.vm template to also implement my specific interface. There is a way to specify the location of velocity directory using the templateDirectory configuration in the maven build plugin.

I also switched to using SpecificDatumWriter instead of reflectDatumWriter.

<plugin>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-maven-plugin</artifactId>
   <version>${avro.version}</version>
   <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals>
        <goal>schema</goal>
      </goals>
      <configuration>
         <sourceDirectory>${basedir}/src/main/resources/avro/schema</sourceDirectory>
         <outputDirectory>${basedir}/target/java-gen</outputDirectory>
         <fieldVisibility>private</fieldVisibility>
         <stringType>String</stringType>
         <templateDirectory>${basedir}/src/main/resources/avro/velocity-templates/</templateDirectory>
       </configuration>
    </execution>
  </executions>
</plugin>

I hope it will be helpful for others if I'll write it here that I've created maven plugin for exactly this case - https://github.com/tunguski/interfacer.

It goes through auto generated classes and check do they conform to interfaces found on classpath in specific package. If yes, interface is added to the class. It works with generic interfaces, at least in basic examples I had to deal with.

The plugin is not avro specific, works as a generated code post processor, so it may be used in other cases too.

<!-- 
  post process avro generated sources and add interfaces from package
  pl.matsuo.interfacer.showcase to every generated class that has 
  all methods from specific interface
 -->
<plugin>
    <groupId>pl.matsuo.interfacer</groupId>
    <artifactId>interfacer-maven-plugin</artifactId>
    <version>0.0.6</version>
    <executions>
        <execution>
            <configuration>
                <interfacesDirectory>${project.basedir}/src/main/java</interfacesDirectory>
                <interfacePackage>pl.matsuo.interfacer.showcase</interfacePackage>
            </configuration>
            <goals>
                <goal>add-interfaces</goal>
            </goals>
        </execution>
    </executions>
</plugin>
// src/main/java manually defined interface
public interface HasName {
  String getName();
}

// target/generated-sources/avro
public class Person {

  String name;

  public String getName() {
    return name;
  }
  // [...]
}

public class Company {

  String name;

  public String getName() {
    return name;
  }
  // [...]
}

// after this plugin run

// target/generated-sources/avro
public class Person implements HasName {

  String name;

  public String getName() {
    return name;
  }
  // [...]
}

public class Company implements HasName {

  String name;

  public String getName() {
    return name;
  }
  // [...]
}

Tags:

Avro