How to extract the schema from an Avro file in Java

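You can use the Databricks spark-avro library (https://github.com/databricks/spark-avro), which loads the Avro file into a DataFrame (Dataset<Row>), e.g. spark.read().format("com.databricks.spark.avro").load("file.avro").

Once you have a Dataset<Row>, you can get the schema directly with df.schema(). Note that this returns a Spark StructType, not an Avro Schema object.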


Thanks to @Helder Pereira for the answer. As a complement, the schema can also be obtained by calling getSchema() on a GenericRecord instance.
Here is a live demo of it; the link above shows how to read the data and the schema in Java for the Parquet, ORC and Avro data formats.


If you want to know the schema of an Avro file without having to generate the corresponding classes or care about which class the file belongs to, you can use a GenericDatumReader:

DatumReader<GenericRecord> datumReader = new GenericDatumReader<>();
DataFileReader<GenericRecord> dataFileReader = new DataFileReader<>(new File("file.avro"), datumReader);
Schema schema = dataFileReader.getSchema();
System.out.println(schema);

And then you can read the data inside the file:

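// Reuse one record instance across iterations to avoid extra allocations
GenericRecord record = null;
while (dataFileReader.hasNext()) {
    record = dataFileReader.next(record);
    System.out.println(record);
}
// Release the underlying file handle once you are done
dataFileReader.close();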
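
For the curious: getSchema() works without any generated classes because an Avro container file carries its own schema in the file header, as the value of the "avro.schema" metadata key. Below is a rough, dependency-free sketch of reading just that header with plain java.io, assuming the file follows the standard Avro object container layout (magic bytes "Obj" + 1, then a metadata map, then a sync marker). The class name AvroHeaderSchema and its methods are made up for this illustration and are not part of the Avro library; for real work, DataFileReader as shown above is the safer choice.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Apache Avro: reads only the file header.
public class AvroHeaderSchema {

    // Decodes an Avro "long": variable-length bytes, zig-zag encoded.
    static long readLong(InputStream in) throws IOException {
        long raw = 0;
        int shift = 0;
        while (true) {
            int b = in.read();
            if (b < 0) throw new EOFException("truncated varint");
            raw |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) break;       // high bit clear: last byte
            shift += 7;
            if (shift > 63) throw new IOException("varint too long");
        }
        return (raw >>> 1) ^ -(raw & 1);      // undo zig-zag encoding
    }

    // Reads a length-prefixed byte sequence (Avro "bytes" / "string").
    static byte[] readBytes(DataInputStream in) throws IOException {
        long len = readLong(in);
        if (len < 0 || len > Integer.MAX_VALUE) throw new IOException("bad length: " + len);
        byte[] buf = new byte[(int) len];
        in.readFully(buf);
        return buf;
    }

    // Parses the magic bytes and the metadata map at the start of the file.
    static Map<String, byte[]> readMetadata(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] magic = new byte[4];
        in.readFully(magic);
        if (magic[0] != 'O' || magic[1] != 'b' || magic[2] != 'j' || magic[3] != 1) {
            throw new IOException("not an Avro container file");
        }
        Map<String, byte[]> meta = new LinkedHashMap<>();
        // A map is written as blocks of entries; a block count of 0 ends it.
        for (long count = readLong(in); count != 0; count = readLong(in)) {
            if (count < 0) {
                count = -count;               // negative count: a byte size follows
                readLong(in);                 // block size in bytes, not needed here
            }
            for (long i = 0; i < count; i++) {
                String key = new String(readBytes(in), StandardCharsets.UTF_8);
                meta.put(key, readBytes(in));
            }
        }
        return meta;
    }

    // Returns the schema JSON stored under "avro.schema", or null if absent.
    static String readSchemaJson(InputStream in) throws IOException {
        byte[] schema = readMetadata(in).get("avro.schema");
        return schema == null ? null : new String(schema, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java AvroHeaderSchema <file.avro>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(readSchemaJson(in));
        }
    }
}
```

This only returns the schema as a JSON string and does no validation; if you have the Avro library on the classpath anyway, you could hand that string to Schema.Parser to get a Schema object, or just stick with dataFileReader.getSchema().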