Automatically remove fields from .bib file containing biblatex entries such as @Thesis

Notice: By default, biber silently drops fields which are unknown to the datamodel. So, if you happen to use non-standard fields, see update below.

You can use biber's tool mode with an appropriate sourcemap.

In tool mode, biber operates directly on your datasource, so you should run it on the command line, e.g.:

biber --tool --configfile=biber-tool.conf <mybibfile>.bib

(Of course, <> are there just for you to substitute with the adequate file name).

biber-tool.conf specifies what you want biber to do with your file. In your case, you want to delete certain fields from your entries, so a sourcemap is the adequate tool for that. The contents of biber-tool.conf would then be (with some other options relevant to the control of output appearance):

<?xml version="1.0" encoding="UTF-8"?>
<config>
    <output_fieldcase>lower</output_fieldcase>
    <output_indent>2</output_indent>
    <output_align>true</output_align>
    <sourcemap>
        <maps datatype="bibtex" map_overwrite="1">
            <map map_overwrite="1">
                <map_step map_field_set="abstract" map_null="1"/>
                <map_step map_field_set="review" map_null="1"/>
                <map_step map_field_set="groups" map_null="1"/>
                <map_step map_field_set="file" map_null="1"/>
            </map>
        </maps>
    </sourcemap>
</config>

With this setup, the biber command above will output a new file, <mybibfile>_bibertool.bib, with the specified fields removed.

The result for your entry would be:

@thesis{Author_18_TheThesis,
  author      = {Author, Mr},
  institution = {Department of Documents, University of Stackexchange},
  date        = {2018},
  title       = {The Thesis},
  type        = {Doctoral Dissertation},
}
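
The sourcemap needs one map_step per field, so for a long field list it may be easier to generate the file than to type it. Below is a minimal Python sketch of that idea. The function name build_tool_config is made up for illustration, the element and attribute names are simply copied from the config above, and the field list in the call is only an example.

```python
import xml.etree.ElementTree as ET

def build_tool_config(fields_to_drop):
    """Return the text of a biber tool-mode config that nulls each given field."""
    config = ET.Element("config")
    # Output formatting options, copied from the hand-written config.
    ET.SubElement(config, "output_fieldcase").text = "lower"
    ET.SubElement(config, "output_indent").text = "2"
    ET.SubElement(config, "output_align").text = "true"

    sourcemap = ET.SubElement(config, "sourcemap")
    maps = ET.SubElement(sourcemap, "maps", datatype="bibtex", map_overwrite="1")
    one_map = ET.SubElement(maps, "map", map_overwrite="1")
    # One map_step per field to delete.
    for name in fields_to_drop:
        ET.SubElement(one_map, "map_step", map_field_set=name, map_null="1")

    ET.indent(config, space="    ")  # needs Python 3.9 or newer
    body = ET.tostring(config, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"

# Example field list (an assumption); replace it with your own.
print(build_tool_config(["abstract", "review", "file"]))
```

Redirect the printed text into biber-tool.conf and run biber as shown above.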

Update: By default, biber silently drops fields which are unknown to the datamodel. So, if you have any of those in your datasource, or if you are unsure and want to be warned about any ignored fields, use the option --validate-datamodel:

biber --tool --validate-datamodel --configfile=biber-tool.conf <mybibfile>.bib

For your entry, that would give you the following warnings:

WARN - Datamodel: Entry 'Author_18_TheThesis' (references.bib): Field 'groups' invalid in data model - ignoring
WARN - Datamodel: Entry 'Author_18_TheThesis' (references.bib): Field 'ispreprintpublic' invalid in data model - ignoring
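
With a large .bib file the list of warnings can get long. As a rough sketch, the unknown field names can be collected from the log with a few lines of Python. This assumes the warning lines have exactly the wording shown above (other biber versions may phrase them differently); unknown_fields is a hypothetical helper name.

```python
import re

# Pattern modelled on the two warning lines quoted above (an assumption
# about the log format, not something taken from biber's documentation).
WARNING_RE = re.compile(
    r"WARN - Datamodel: Entry '(?P<key>[^']+)' \([^)]*\): "
    r"Field '(?P<field>[^']+)' invalid in data model"
)

def unknown_fields(log_text):
    """Return the sorted, de-duplicated field names reported as invalid."""
    return sorted({m.group("field") for m in WARNING_RE.finditer(log_text)})

sample_log = """\
WARN - Datamodel: Entry 'Author_18_TheThesis' (references.bib): Field 'groups' invalid in data model - ignoring
WARN - Datamodel: Entry 'Author_18_TheThesis' (references.bib): Field 'ispreprintpublic' invalid in data model - ignoring
"""

print(unknown_fields(sample_log))  # ['groups', 'ispreprintpublic']
```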

Now, if the dropping of these fields is not wanted and you must keep them, you have to provide biber with a data model which includes them. As far as I could tell, unfortunately one cannot simply "add" a field to the default data model, so you have to bring the whole default data model into your custom biber-tool.conf. Biber provides an easy way to find the default biber-tool.conf which contains the default data model:

biber --tool-config

That should return the location of the default biber-tool.conf. If you open that file, you will find the default data model specifications (everything between <datamodel> and </datamodel>). Copy that (yes, all that) in your custom biber-tool.conf, just below your sourcemap specifications, as defined above. Then add the line(s) of your non-standard field(s) within the <fields>...</fields> group. In your case (assuming here these are "literal" type fields):

<field fieldtype="field" datatype="literal">ispreprintpublic</field>
<field fieldtype="field" datatype="literal">groups</field>

And, within the group <entryfields><entrytype>thesis</entrytype>...</entryfields> add:

<field>ispreprintpublic</field>
<field>groups</field>
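
If you would rather not edit the copied data model by hand, the two additions can be scripted. The following Python sketch runs on a tiny stand-in for the data model, not on the real file, which is much longer. It assumes the layout is as in the snippets above (a <fields> group plus <entryfields> groups that each list their <entrytype>s), which may differ between biber versions. add_custom_fields is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def add_custom_fields(datamodel, entrytype, new_fields, datatype="literal"):
    """Declare new_fields in <fields> and allow them for one entry type."""
    fields = datamodel.find("fields")
    for name in new_fields:
        decl = ET.SubElement(fields, "field", fieldtype="field", datatype=datatype)
        decl.text = name

    # Find the <entryfields> group that lists the wanted entry type.
    for group in datamodel.findall("entryfields"):
        if any(et.text == entrytype for et in group.findall("entrytype")):
            for name in new_fields:
                ET.SubElement(group, "field").text = name
            return
    raise ValueError(f"no <entryfields> group for entry type {entrytype!r}")

# Tiny stand-in for the copied data model (an assumption, for illustration only).
sample = """
<datamodel>
  <fields>
    <field fieldtype="field" datatype="literal">title</field>
  </fields>
  <entryfields>
    <entrytype>thesis</entrytype>
    <field>title</field>
  </entryfields>
</datamodel>
"""

datamodel = ET.fromstring(sample)
add_custom_fields(datamodel, "thesis", ["ispreprintpublic", "groups"])
ET.indent(datamodel)  # needs Python 3.9 or newer
print(ET.tostring(datamodel, encoding="unicode"))
```

On the real file you would parse your custom biber-tool.conf, locate its <datamodel> element, and write the result back. Keep a backup, since I have only tried this on the stand-in above.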

Unfortunately, I cannot include the entire resulting biber-tool.conf, for it exceeds the limits of the site. But I hope the procedure is clear. Having done this, for this input:

@Thesis{Author_18_TheThesis,
  author      = {Mr Author},
  title       = {The Thesis},
  type        = {Doctoral Dissertation},
  institution = {Department of Documents, University of Stackexchange},
  year        = {2018},
  abstract    = {This is the abstract.},
  file        = {:author/Author_18_TheThesis.pdf:PDF},
  review      = {This is the review.},
  groups      = {publications},
  ispreprintpublic = {test},
}

The output is:

@thesis{Author_18_TheThesis,
  author           = {Author, Mr},
  institution      = {Department of Documents, University of Stackexchange},
  date             = {2018},
  ispreprintpublic = {test},
  title            = {The Thesis},
  type             = {Doctoral Dissertation},
}

This is not especially straightforward. But, to quote a comment from PLK on the matter: "The benefits of having a datamodel in tool mode outweigh this sort of problem."


Andrew Swann's answer using bibtool, originally linked in the OP, does work, provided the resource biblatex is given (hat tip to moewe).

So, for a file remove-fields.rsc:

preserve.keys = On
preserve.key.case = On
resource{biblatex}
delete.field = { abstract }
delete.field = { review }
delete.field = { groups }
delete.field = { file }

The command:

bibtool -r remove-fields ./references.bib -o new.bib

will result in:

@Thesis{      Author_18_TheThesis,
  Author    = {Mr Author},
  Title     = {The Thesis},
  Type      = {Doctoral Dissertation},
  Institution   = {Department of Documents, University of Stackexchange},
  Year      = {2018},
  ispreprintpublic={test}
}

Another option is the bib2bib tool, which provides pretty flexible and reliable ways to filter/extract/expand bibtex entries. This (little known) utility is part of the bibtex2html tool suite. (Note: you have to look for the PDF documentation, the HTML documentation does not discuss bib2bib.)

For instance, to remove certain fields from a biblatex.bib file and save the output to bibtex.bib:

bib2bib --remove abstract --remove file --remove review -ob bibtex.bib biblatex.bib

It is also possible to specify filter and sorting options, rename fields (--rename <old> <new>) and so on.