eCommerce orders table. Save prices, or use an audit/history table?

As you've identified, storing the price on the order makes the technical implementation easier. There are a number of business reasons why this may be beneficial though.

In addition to web transactions, many businesses support sales through other channels, e.g.:

Over the phone
Sales agents "on the road"
At a physical location (e.g. shop, office)

In these cases the order may be entered into the system at some time after the transaction took place. In these circumstances it can be difficult to impossible to correctly identify which historical price record should be used - storing the unit price directly on the order is the only feasible option.

Multiple channels often bring another challenge - different prices for the same product. Surcharges for phone orders are common - and some customers may negotiate themselves a discount. You may be able to represent all possible prices for all channels in your product schema, but incorporating this into your order tables can become (very) complex.

Anywhere that negotiation is allowed it becomes very difficult to link price history to the order price agreed (unlesss agents have very narrow negotiation limits). You need to store the price on the order itself.

Even if you only support web transactions and having a relatively simple pricing structure, there's still an interesting problem to overcome - how should price increases be handled for in flight transactions? Does the business insist that the customer must pay increases or do they honour the original price (when the product was added to the basket)? If it's the latter it the technical implementation is complicated - you need to find a way to ensure you're maintaining the price version in the session correctly.

Finally, many businesses are starting to use highly dynamic pricing. There may not be one fixed price for a given product - it is always calculated at runtime based on factors such as time of day, demand for the product and so on. In these cases the price may not be stored against the product in the first place!

I will add some practical points that I have seen.

Products are transient.

What they may signify today, may not be the same as what they used to signify an year back. The same sku code (and hence the product_id), might refer to different variant/kind of the product at different stages.

Not everyone understands all the concerns at hand; hence a user may change the atrributes of the original product instead of creating a fresh one out of his own ignorance. A lot of times, this could happen because of the plan a user is on (Hey! I can have 100 sku's only, so why not keep changing the older ones instead of upgrading the plan) So, you see, in a lot of carts, a product will never signify the same thing forever.
Different prices based on ordering and shipping conditions

As user @Chris has mentioned, different prices may be applicable in different scenarios.

In most carts, you will find at least 3 different fields being stored - the unit price, the discount amount and the discounted price. In more advanced ones, you will find 2 more - unit price with tax, discounted price with tax. You may find a couple more fields to describe the shipping method charges, and additional payment method charges. The tax percents may vary depending on the state, the product, the country, shipping method, and so on, and so do the other cost heads. Similarly the discounts can vary depending on geography, promotions, time of sale and so on. Hence, there is information which can be obtained at order level only, and this combined information cannot be generated from data in products table alone.
Separation of concerns

A lot of cart are implemented in a way, so that different teams can have control over different parts of data. Some one managing the order system doesn't always need to know what all products are in stock, what were there prices at different point of time, what are alternatives for a given sku, and so on. Keeping product related data alongwith the order data helps achieve a separation of concern. This could also be true at development stages, if different teams manage different parts of the system.
Easier scalability across multiple systems

A lot of the times, the Order Management System, the Rule Engine, the Catalog Engine, Content Management System are all built/maintained as separate systems of their own. This helps optimize for various load conditions and generate specialized intelligence for each of the system. One system, then, cannot be held to ransom because of non-availabilty of information from another system.
Faster development and running time

I have used the term "development time" here, though using "debugging time" would be more apt. Whenever any new development is happening, it will be faster if data needed is available without adding complexity of its own, because then, there will be comparatively smaller debugging cycles.

Imagine you were asked to generate on-demand reports for discounts offered on a day to day basis for a given month half a year back. If you have the original price, the discounted price in 1-2 tables along with the order, order item details, this is pretty much straight forward. If however, you have to go and fetch prices from another table, and then the applicable discounts from another table, and then figure out the details, both the development and running time will be higher.

A good design should try to optimize as much for the future, as it should for the present.

It may end up costing more in storage, but I prefer to house all relevant details of the sale with the transaction itself, so that if for whatever reason our audit trail gets broken, or an administrator overrides the safeties in place, the details of the sale like: currency used, unit-price, qty, taxes applied and what value they came to, etc. are all available. I generally store that as XML so it can be flexible from sale to sale.

EDIT: To expand on what I was briefly saying above, in my follow up comment below, and what @a_horse_with_no_name touched on above, redundancy in transaction data is not only important, but it's also necessary at scale.

I'm assuming that you're building out using OOP and so you should likely have a transaction object and either an all encompassing product object and/or a price object. In my own personal experience, I prefer to be verbose in my history, storage is relatively cheep.

What we've done is create an object history which you can facilitate using a your existing RDBMS or some flavour of NOSQL key value store ( or even better a RDBMS that allows NoSQL like connections like handlersocket or memcache ), and we store the object history that way, with every detail and price change in one place easily and quickly available. If you're serious you could even use DIFF's to save on storage and only store the changes forward, although it has it's own caveats. That should take care of your history, and the advantage of serialized objects is that your system will/should be able to bring them back up as the objects they were stored. That takes care of history.

With regard to my suggestion, storing the transaction details like taxes, currency, etc. with the transaction itself means no need to look elsewhere for those details, your transaction object will be aware of its properties and you're views can take care of presenting the varying data as you see fit. You get speedy access to the snapshot and have the added benefit of redundant and verifiable records.

It's worth it, trust me!

eCommerce orders table. Save prices, or use an audit/history table?

Tags:

Mysql

Database Design

Schema

Audit

Data Integrity

Related

Recent Posts