How would I track all price changes in a db in order to get the price of 'x' product on 'y' date

If I understand the scenario appropriately, you should define a table that retains a Price time series; therefore, I agree, this has a lot to do with the temporal aspect of the database you are working with.

Business rules

Let us start analyzing the situation from the conceptual level. So, if, in your business domain,

  • a Product is purchased at one-to-many Prices,
  • each Price of purchase becomes Current at an exact StartDate, and
  • the Price EndDate (which indicates the Date when a Price ceases to be Current) is equal to the StartDate of the immediately subsequent Price,

then that means that

  • there are no Gaps between the distinct Periods during which the Prices are Current (the time series is continuous or conjunct), and
  • the EndDate of a Price is a derivable datum.

The IDEF1X diagram shown in Figure 1, although highly simplified, depicts such a scenario:

Fig. 1 - Product Prices Simplified IDEF1X Diagram - Scenario A

Expository logical layout

And the following SQL-DDL logical-level design, based on said IDEF1X diagram, illustrates a feasible approach that you can adapt to your own exact needs:

-- At the physical level, you should define a convenient 
-- indexing strategy based on the data manipulation tendencies
-- so that you can supply an optimal execution speed of the
-- queries declared at the logical level; thus, some testing 
-- sessions with considerable data load should be carried out.

CREATE TABLE Product (
    ProductNumber INT      NOT NULL,
    Etcetera      CHAR(30) NOT NULL,
    --
    CONSTRAINT Product_PK PRIMARY KEY (ProductNumber)
);

CREATE TABLE Price (
    ProductNumber INT  NOT NULL,
    StartDate     DATE NOT NULL,
    Amount        INT  NOT NULL, -- Retains the amount in cents, but there are other options regarding the type of use.
    --
    CONSTRAINT Price_PK            PRIMARY KEY (ProductNumber, StartDate),
    CONSTRAINT Price_to_Product_FK FOREIGN KEY (ProductNumber)
        REFERENCES Product (ProductNumber),
    CONSTRAINT AmountIsValid_CK    CHECK       (Amount >= 0)
);

The Price table has a composite PRIMARY KEY made up of two columns, i.e., ProductNumber (constrained, in turn, as a FOREIGN KEY that makes a reference to Product.ProductNumber) and StartDate (pointing out the particular Date in which a certain Product was purchased at a specific Price).

In case that Products are purchased at different Prices during the same Day, instead of the StartDate column, you may include one labeled as StartDateTime that keeps the Instant when a given Product was purchased at an exact Price. The PRIMARY KEY would then have to be declared as (ProductNumber, StartDateTime).

As demonstrated, the aforementioned table is an ordinary one, because you can declare SELECT, INSERT, UPDATE and DELETE operations to manipulate its data directly, hence it (a) permits avoiding the installation of additional components and (b) can be used in all the major SQL platforms with some few adjustments, if necessitated.

Data manipulation samples

To exemplify some manipulation operations that appear useful, let us say that you have INSERTed the following data in the Product and Price tables, respectively:

INSERT INTO Product
    (ProductNumber, Etcetera)
VALUES
    (1750, 'Price time series sample'); 

INSERT INTO Price
    (ProductNumber, StartDate, Amount)
VALUES
    (1750, '20170601', 1000),
    (1750, '20170603', 3000),   
    (1750, '20170605', 4000),
    (1750, '20170607', 3000);

Since the Price.EndDate is a derivable data point, then you must to obtain it via, precisely, a derived table that can be created as a view in order to produce the “full” time series, as exemplified below:

CREATE VIEW PriceWithEndDate AS

    SELECT  P.ProductNumber,
            P.Etcetera AS ProductEtcetera,
           PR.Amount   AS PriceAmount,
           PR.StartDate,
           (
                SELECT MIN(StartDate)
                      FROM Price InnerPR
                     WHERE P.ProductNumber   = InnerPR.ProductNumber
                       AND InnerPR.StartDate > PR.StartDate
           ) AS EndDate
        FROM Product P
        JOIN Price   PR
          ON P.ProductNumber = PR.ProductNumber;

Then the following operation that SELECTs directly from that view

  SELECT ProductNumber,
         ProductEtcetera,
         PriceAmount,
         StartDate,
         EndDate
    FROM PriceWithEndDate 
ORDER BY StartDate DESC;

supplies the next result set:

ProductNumber  ProductEtcetera     PriceAmount  StartDate   EndDate
-------------  ------------------  -----------  ----------  ----------
         1750  Price time series…         4000  2017-06-07  NULL      -- (*) 
         1750  Price time series…         3000  2017-06-05  2017-06-07
         1750  Price time series…         2000  2017-06-03  2017-06-05
         1750  Price time series…         1000  2017-06-01  2017-06-03

-- (*) A ‘sentinel’ value would be useful to avoid the NULL marks.

Now, let us assume that you are interested in getting the whole Price data for the Product primarily identified by ProductNumber 1750 on Date 2 June 2017. Seeing that a Price assertion (or row) is current or effective during the entire Interval that runs from (i) its StartDate to (ii) its EndDate, then this DML operation

 SELECT ProductNumber,
        ProductEtcetera,
        PriceAmount,
        StartDate,
        EndDate
   FROM PriceWithEndDate
  WHERE ProductNumber = 1750        -- (1) 
    AND StartDate    <= '20170602'  -- (2)
    AND EndDate      >= '20170602'; -- (3)

-- (1), (2) and (3): You can supply parameters in place of fixed values to make the query more versatile.

yields the result set that follows

ProductNumber  ProductEtcetera     PriceAmount  StartDate   EndDate
-------------  ------------------  -----------  ----------  ----------
         1750  Price time series…         1000  2017-06-01  2017-06-03

which addresses said requirement.

As shown, the PriceWithEndDate view plays a paramount role in obtaining most of the derivable data, and can be SELECTed FROM in a fairly ordinary way.

Taking into account that your platform of preference is PostgreSQL, this content from the official documentation site contains information about “materialized” views, which can help to optimize the execution speed by means of physical level mechanisms, if said aspect becomes problematic. Other SQL database management systems (DBMSs) offer physical instruments that are very alike, although different terminology may be applied, e.g., “indexed” views in Microsoft SQL Server.

You can see the discussed DDL and DML code samples in action in this db<>fiddle and in this SQL Fiddle.

Related resources

  • In this Q & A we discuss a business context that includes the changes of Product Prices but has a more extense scope, so you may find it of interest.

  • These Stack Overflow posts cover very relevant points regarding the type of a column that holds a currency datum in PostgreSQL.

Responses to comments

This looks similar to work that I did, but I found it much more convenient / efficient to work with a table where a price (in this case) has a startdate column and an enddate column - so you're just looking for rows with targetdate >= startdate and targetdate <= enddate. Of course, if the data isn't stored with those fields (including enddate 31st December 9999, not Null, where no actual end date exists), then you'd have to do work to produce it. I actually made it run every day, with end date = today's date by default. Also, my description requires enddate 1 = startdate 2 minus 1 day. – @Robert Carnegie, on 2017-06-22 20:56:01Z

The method I propose above addresses a business domain of the characteristics previously described, consequently applying your suggestion about declaring the EndDate as a column —which is different from a “field”— of the base table named Price would imply that the logical structure of the database would not be reflecting the conceptual schema correctly, and a conceptual schema must be defined and reflected with precision, including the differentiation of (1) base information from (2) derivable information.

Apart from that, such a course of action would introduce duplication, since the EndDate could then be obtained by virtue of (a) a derivable table and also by virtue of (b) the base table named Price, with the therefore duplicated EndDate column. While that is a possibility, if a practitioner decides to follow said approach, he or she should decidedly warn the database users about the inconveniences and inefficiencies it involves. One of those inconveniences and inefficiencies is, e.g., the urgent need to develop a mechanism that ensures, at all times, that each Price.EndDate value is equal to that of the Price.StartDate column of the immediately successive row for the Price.ProductNumber value at hand.

In contrast, the work to produce the derived data in question as I put forward is, honestly, not special at all, and is required to (i) guarantee the correct correspondence between the logical and conceptual levels of abstraction of the database and (ii) ensure data integrity, both aspects that as noted before are decidedly of great importance.

If the efficiency aspect you are talking about is related to the execution speed of some data manipulation operations, then it must be managed at the appropriate place, i.e., at the physical level, via, e.g., an advantageous indexing strategy, based on (1) the particular query tendencies and (2) the particular physical mechanisms provided by the DBMS of use. Otherwise, sacrificing the appropriate conceptual-logical mapping and compromising the integrity of the data involved easily turns a robust system (i.e., a valuable organizational asset) into a non-reliable resource.

Discontinuous or disjunct time series

On the other hand, there are circumstances where retaining the EndDate of each row in a time series table is not only more conenient and efficient but demanded, although that depends entirely on business-environment-specific requirements of course. One example of that kind of circumstances comes about when

  • both the StartDate and the EndDate pieces of information are held before (and retained via) every INSERTion, and
  • there can be Gaps in the middle of the distinct Periods during which the Prices are Current (i.e., the time series is discontinuous or disjunct).

I have represented said scenario in the IDEF1X diagram displayed in Figure 2.

Fig. 2 - Product Prices Simplified IDEF1X diagram - Scenario B

In that case, yes, the hypothetical Price table must be declared in a manner that is similar to this:

CREATE TABLE Price (
    ProductNumber INT  NOT NULL,
    StartDate     DATE NOT NULL,
    EndDate       DATE NOT NULL,
    Amount        INT  NOT NULL,
    --
    CONSTRAINT Price_PK            PRIMARY KEY (ProductNumber, StartDate, EndDate),
    CONSTRAINT Price_to_Product_FK FOREIGN KEY (ProductNumber)
        REFERENCES Product (ProductNumber),
    CONSTRAINT DatesOrder_CK       CHECK       (EndDate >= StartDate)
);

And, yes, that logical DDL design simplifies administration at the physical level, because you can put up an indexing strategy that encompasses the EndDate column (which, as shown, is declared in a base table) in relatively easier configurations.

Then, a SELECT operation like the one below

 SELECT  P.ProductNumber,
         P.Etcetera,
        PR.Amount,
        PR.StartDate,
        PR.EndDate
   FROM Price   PR
   JOIN Product P
  WHERE P.ProductNumber = 1750       
    AND StartDate      <= '20170602'  
    AND EndDate        >= '20170602';

may be used to derive the whole Price data for the Product primarily identified by ProductNumber 1750 on Date 2 June 2017.


I believe you're going to want to look at Temporal Tables. These provide functionality to do exactly what you're looking for and are available in Postgres with the proper extensions.

This concept looks to be pretty DB agnostic as well, as it's offered on a variety of RDBMS platforms.