Double entry bookkeeping database design

A Journal is a chronological listing of all transactions of a specified type for an accounting system. Here is a classical presentation on ledger paper of a simple Sales (on Account) Journal:

enter image description here

Note that every line is a single transaction, with Total Debits = Total Credits; and that every transaction hits the same three accounts. A Cash Sales Journal would look similar but replace the column Accounts Receivable Debit with one labelled Cash Debit. A Cash Disbursements Journal would have a first column labelled Cash Credit and additional columns such as Accounts Payable Debit and Employee Expenses Debit.

This presentation was standard for hundreds of years until personal computers became affordable a couple of decades ago. It has a significant advantage of easily verifying that every transaction is balanced by simply checking each row. Likewise, prior to posting a page of such transaction to the Ledgers the page totals could be similarly verified. That is a batch-processing model used in many accounting systems.

It is easy to see that this paper model has significant disadvantages when literally transcribed to an automated system:

  1. The data structure is a pivoted one, whereas experience has shown that it is much simpler (also more robust and more easily verified) to program against a folded data structure; and
  2. Because every specialized Journal hits a different set of accounts, each such Journal would have to be separately designed and programmed, a wasteful and error-prone activity.

For these reasons it is generally recommended to use the Specialized Journals design as the interface to the accounting system, but to design a data structure that can be the numerical repository for multiple journals. In a modern RDBMS this has the potential advantage that the General Ledger, and even the specialized subledgers, can become Indexed Views on the Journal, completely eliminating the requirement to code a Posting Process (the step where a Journal transaction is locked and has its account totals transcribed to the various ledgers).

Whatever data design you end up with, the key is to have a single Posting Entry for each transaction type (ie each specialized Journal in the equivalent paper system) where balance checks are made.


My point is that both approaches you propose are unsound mechanisms for double-entry bookkeeping. First point: Journals are write-once tables for very good reasons. Sometimes the two virtual tables Pending Entries and Posted Entries are collocated in a single data structure, but the IsPosted bit is always write-only and the system must ensure that the read-only nature of the Posted Entries records is maintained.

The way accountants have posted Journal Entries for the past 800 years is fully normalized. The only difference between a paper presentation and a sound electronic presentation is that a folded table structure is more convenient in the latter case while a pivoted table structure in the former, which historically was most significant when highly parallel processing was desired - i.e. many clerks each maintaining a single or small number of specialized journals. Only the Controller historically had permissions for the General Journal.

Note carefully that in the above I specified that Journal Entries are fully normalized; Journal Entries are a chronological record of all transactions, grouped in Journals specific to each transaction type. The posting of journal entries to the Ledgers is a separate endeavour and, while fully normalized on its own, is a redundant copy of the journal entries where all transactions are summarized (General Ledger) or detailed (Sub Ledger) by account. Both Journals and Ledgers employ double-entry bookkeeping independently.

@Codism Any accounting system, DEB or SEB, give you generalized reporting for all accounts recorded. Note that, internally, a sub-ledger is by definition a single-entry bookkeeping record; the other side is the corresponding control account(s) on the Balance Sheet. DEB ensures that for every asset dollar there is a precise record of the business claim on the asset, i.e. the equity in the asset whether it be a debt equity (aka liability) or an ownership equity, and of the priority of those claims if the organization becomes insolvent.


Indeed, the single row accounting schema proposed allows to do proper double entry accounting (to always specify the account debited and credited) without introducing the redundancy of the "amount" data.

The one row schema gives you an implementation of a double entry that balances by construction, so that it is impossible to ever "loose balance". The machine may recompute the ledgers on the fly.

You have to do 2 select instead of one to retrieve a ledger.

Note, besides transaction split, there are other transactions such as foreign exchange might end up with 2 records instead of 4. It all depends if you would denormalize a bit just enter 4 transactions with similar description.

You may prevent the entry or modification of any transaction to maintain an audit trail, this is required if you want to be able to audit the transaction log.

It appears that in the thread above, for the CPA, "fully normalized" appears to mean rules recognized by all accountants, whereas for the programmer it has a diferent meaning, that there is no derived or redundant data stored.

All there is to accounting data is a set of transactions which give amount and the accounts from which and to which they flow, along with their date, some description (and other attachements). Ledgers and balances are simple views derived from this transactional data by doing sums.


May I suggest that you take a look at the treasure trove of Open Source software that is out there in this area?

A Google of "open source double entry accounting software" gives several promising avenues of investigation.

You could look at the GNU accounting package here, a couple of review sites (1 & 2) and finally the wiki of accounting software with a section on Free Software.

I'm sure that by examining some of the F/LOSS source code and schemas, you might get some good hints and indications of how to write schemas and/or software which suit your own particular needs.