pandas .at versus .loc

Since you asked about the limitations of .at, here is one I recently ran into (using pandas 0.22). Let's use the example from the documentation:

import pandas as pd

df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]], index=[4, 5, 6], columns=['A', 'B', 'C'])
df2 = df.copy()

    A   B   C
4   0   2   3
5   0   4   1
6  10  20  30

If I now do

df.at[4, 'B'] = 100

the result looks as expected

    A    B   C
4   0  100   3
5   0    4   1
6  10   20  30

However, when I try to do

df.at[4, 'C'] = 10.05

it seems that .at tries to preserve the column's dtype (here: int), so the value is truncated to 10:

    A    B   C
4   0  100  10
5   0    4   1
6  10   20  30

This appears to differ from .loc:

df2.loc[4, 'C'] = 10.05

yields the desired result, with column C upcast to float:

    A   B      C
4   0   2  10.05
5   0   4   1.00
6  10  20  30.00

The risky part of the example above is that the conversion from float to int happens silently. Trying the same with a string raises an error instead:

df.at[5, 'A'] = 'a_string'

ValueError: invalid literal for int() with base 10: 'a_string'

It will work, however, if the string is one that int() can parse, as noted by @n1k31t4 in the comments, e.g. (starting again from the original df):

df.at[5, 'A'] = '123'

     A   B   C
4    0   2   3
5  123   4   1
6   10  20  30
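
If a column is meant to hold non-integer values, one way to avoid relying on this behavior is to cast it explicitly before assigning with .at. This is only a sketch using the same example frame; the explicit astype call is my own suggested workaround, not something the source or the .at documentation prescribes:

```python
import pandas as pd

df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
                  index=[4, 5, 6], columns=['A', 'B', 'C'])

# Check the dtypes first: every column starts out with an integer dtype.
print(df.dtypes)

# Cast the target column to float explicitly, so the assignment below
# does not depend on how .at treats a float going into an int column.
df['C'] = df['C'].astype('float64')
df.at[4, 'C'] = 10.05

print(df.at[4, 'C'])
print(df['C'].dtype)
```

With the column already float, the stored value no longer depends on whether .at truncates or upcasts in the pandas version at hand.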

Update: df.get_value is deprecated as of version 0.21.0. Using df.at or df.iat is the recommended method going forward.


df.at can only access a single value at a time.

df.loc can select multiple rows and/or columns.
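
To make the difference concrete, here is a small sketch (reusing the example frame from the pandas documentation; the variable names are just for illustration):

```python
import pandas as pd

df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
                  index=[4, 5, 6], columns=['A', 'B', 'C'])

# .at takes exactly one row label and one column label
# and returns a single scalar value.
single = df.at[5, 'B']

# .loc also accepts lists (or slices) of labels and returns
# a DataFrame or Series for the selected block.
subset = df.loc[[4, 5], ['A', 'B']]

print(single)
print(subset)
```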

Note that there is also df.get_value, which may be even quicker at accessing single values:

In [25]: %timeit df.loc[('a', 'A'), ('c', 'C')]
10000 loops, best of 3: 187 µs per loop

In [26]: %timeit df.at[('a', 'A'), ('c', 'C')]
100000 loops, best of 3: 8.33 µs per loop

In [35]: %timeit df.get_value(('a', 'A'), ('c', 'C'))
100000 loops, best of 3: 3.62 µs per loop

Under the hood, df.at[...] calls df.get_value, but it also does some type checking on the keys.
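
Since df.get_value is deprecated (and, as far as I know, no longer available from pandas 1.0 on), a comparison today would only cover .loc and .at. The frame used for the timings above is not shown, so the MultiIndex frame below is a made-up stand-in, and the numbers it prints will differ from those quoted:

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical frame with MultiIndex rows and columns, chosen only so
# that the keys ('a', 'A') and ('c', 'C') from the timings exist.
index = pd.MultiIndex.from_product([['a', 'b'], ['A', 'B']])
columns = pd.MultiIndex.from_product([['c', 'd'], ['C', 'D']])
df = pd.DataFrame(np.arange(16).reshape(4, 4), index=index, columns=columns)

row, col = ('a', 'A'), ('c', 'C')

# Time 1000 scalar lookups with each accessor.
loc_time = timeit.timeit(lambda: df.loc[row, col], number=1000)
at_time = timeit.timeit(lambda: df.at[row, col], number=1000)

print(f".loc: {loc_time / 1000 * 1e6:.2f} µs per lookup")
print(f".at:  {at_time / 1000 * 1e6:.2f} µs per lookup")
```

Both accessors return the same scalar here; only the lookup overhead differs.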