Difference between asfreq and resample

resample is more general than asfreq. For example, using resample I can pass an arbitrary function to perform binning over a Series or DataFrame object in bins of arbitrary size. asfreq is a concise way of changing the frequency of a DatetimeIndex object. It also provides padding functionality.

As the pandas documentation says, asfreq is a thin wrapper around a call to date_range followed by a call to reindex.
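To make the "date_range + reindex" description concrete, here is a minimal sketch that builds the target index by hand and compares the result with asfreq. The dates and values are arbitrary, chosen only for illustration, and this shows the observable behaviour rather than the library's internal code path.

```python
import pandas as pd

# Three observations spaced three business days apart (values are arbitrary).
idx = pd.date_range("2010-01-01", periods=3, freq="3B")
ts = pd.Series([1.0, 2.0, 3.0], index=idx)

# Build the target index by hand, then reindex onto it.
new_index = pd.date_range(ts.index[0], ts.index[-1], freq="B")
manual = ts.reindex(new_index)

# asfreq gives the same result in one call.
auto = ts.asfreq("B")

print(manual.equals(auto))
```

The original three values stay where they were, and every newly created business day is filled with NaN.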

An example of resample that I use in my daily work is computing the number of spikes of a neuron in 1-second bins by resampling a large boolean array where True means "spike" and False means "no spike". I can do that as easily as large_bool.resample('1s').sum(). Kind of neat!
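A small, self-contained sketch of the spike-counting idea follows. The recording is made up: the 100 ms sampling interval, the 5-second duration, and the 20% spike probability are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical recording: 5 seconds sampled every 100 ms, True means "spike".
index = pd.date_range("2020-01-01", periods=50, freq="100ms")
large_bool = pd.Series(rng.random(50) < 0.2, index=index)

# Summing booleans within each 1-second bin counts the spikes in that bin.
spikes_per_second = large_bool.resample("1s").sum()
print(spikes_per_second)
```

Because True counts as 1 when summed, the per-bin totals add up to the total number of spikes in the recording.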

asfreq can be used when you want to change a DatetimeIndex to have a different frequency while retaining the same values at the current index.

Here's an example where they are equivalent:

In [5]: import numpy as np; import pandas as pd

In [6]: dr = pd.date_range('1/1/2010', periods=3, freq='3B')

In [7]: raw = np.random.randn(3)

In [8]: ts = pd.Series(raw, index=dr)

In [9]: ts
Out[9]:
2010-01-01   -1.948
2010-01-06    0.112
2010-01-11   -0.117
Freq: 3B, dtype: float64

In [10]: ts.asfreq('B')
Out[10]:
2010-01-01   -1.948
2010-01-04      NaN
2010-01-05      NaN
2010-01-06    0.112
2010-01-07      NaN
2010-01-08      NaN
2010-01-11   -0.117
Freq: B, dtype: float64

In [11]: ts.resample('B').mean()
Out[11]:
2010-01-01   -1.948
2010-01-04      NaN
2010-01-05      NaN
2010-01-06    0.112
2010-01-07      NaN
2010-01-08      NaN
2010-01-11   -0.117
Freq: B, dtype: float64

As far as when to use either: it depends on the problem you have in mind...care to share?


Let me use an example to illustrate:

# generate a series of 365 days
# index = 20190101, 20190102, ... 20191231
# values = [0,1,...364]
ts = pd.Series(range(365), index = pd.date_range(start='20190101', 
                                                end='20191231',
                                                freq = 'D'))
ts.head()

output:
2019-01-01    0
2019-01-02    1
2019-01-03    2
2019-01-04    3
2019-01-05    4
Freq: D, dtype: int64

Now, change the frequency to quarterly with asfreq():

ts.asfreq(freq='Q')

output:
2019-03-31     89
2019-06-30    180
2019-09-30    272
2019-12-31    364
Freq: Q-DEC, dtype: int64

asfreq() returns a Series that contains only the values at the last day of each quarter.

ts.resample('Q')

output:
DatetimeIndexResampler [freq=<QuarterEnd: startingMonth=12>, axis=0, closed=right, label=right, convention=start, base=0]

resample() returns a DatetimeIndexResampler, which does not show you what is actually inside. Think of it like the groupby method: it creates a set of bins (groups):

bins = ts.resample('Q')
bins.groups

output:
 {Timestamp('2019-03-31 00:00:00', freq='Q-DEC'): 90,
 Timestamp('2019-06-30 00:00:00', freq='Q-DEC'): 181,
 Timestamp('2019-09-30 00:00:00', freq='Q-DEC'): 273,
 Timestamp('2019-12-31 00:00:00', freq='Q-DEC'): 365}

Nothing seems different so far except for the return type. Let's calculate the average of each quarter:

# (89+180+272+364)/4 = 226.25
ts.asfreq(freq='Q').mean()

output:
226.25

When mean() is applied to the asfreq() result, it outputs the average of the four values that remain. Note that this is not the average of each quarter, but the average of the values on the last day of each quarter.

To calculate the average of each quarter:

ts.resample('Q').mean()

output:
2019-03-31     44.5
2019-06-30    135.0
2019-09-30    226.5
2019-12-31    318.5

You can perform more powerful operations with resample() than with asfreq().

Think of resample as groupby + every method that you can call after groupby (e.g. mean, sum, apply, you name it).
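The groupby analogy can be checked directly. The sketch below uses a weekly frequency and a made-up 28-day series (both assumptions for illustration) and compares resample with a groupby on pd.Grouper:

```python
import pandas as pd

# 28 daily values starting on Tuesday 2019-01-01.
ts = pd.Series(range(28), index=pd.date_range("2019-01-01", periods=28, freq="D"))

# The same weekly bins, expressed two ways.
via_resample = ts.resample("W").mean()
via_groupby = ts.groupby(pd.Grouper(freq="W")).mean()
print(via_resample.equals(via_groupby))

# Anything you can call after groupby also works after resample.
summary = ts.resample("W").agg(["min", "max", "sum"])
print(summary)
```

Both routes assign each day to the week ending on the following Sunday and then aggregate within each bin.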

Think of asfreq as a filter mechanism with limited fillna() capabilities (in fillna(), you can specify limit, but asfreq() does not support it).
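As a rough illustration of that limitation, the sketch below upsamples a made-up weekly series to daily. asfreq with method='ffill' fills every gap completely; to cap the fill, one option is to call asfreq first and then ffill with a limit. The data and the limit of 2 are arbitrary.

```python
import pandas as pd

# Three weekly observations (Sundays), values are arbitrary.
weekly = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2019-01-06", periods=3, freq="W"),
)

# asfreq can fill the new rows, but it fills all of them.
filled_all = weekly.asfreq("D", method="ffill")

# To fill at most 2 days after each observation, upsample first,
# then forward-fill with a limit.
filled_limited = weekly.asfreq("D").ffill(limit=2)

print(filled_all.isna().sum(), filled_limited.isna().sum())
```

With the limit in place, only the two days after each weekly observation are filled and the rest of each gap stays NaN.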
