ItemSize in DynamoDB

An item’s size is the sum of all its attributes’ sizes, including the hash and range key attributes. Attributes themselves have a name and a value. Both the name and value contribute to an attribute’s size. Names are sized the same way as string values. All values are sized differently based on their data type.

If you're interested in the nitty-gritty details, have a read of this blog post.

Otherwise, I've also created a DynamoDB Item Size and Consumed Capacity Calculator that accurately determines item sizes.

Numbers are easily DynamoDB's most complicated type. AWS does not publicly document how to determine how many bytes are in a number. They say this is so they can change the internal implementation without anyone being tied to it. What they do say, however, sounds simple but is more complicated in practice.

Very roughly, though, the formula is something like 1 byte for every 2 significant digits, plus 1 extra byte for positive numbers or 2 for negative numbers. Therefore, 27 is 2 bytes and -27 is 3 bytes. DynamoDB rounds up if there's an odd number of digits, so 461 uses 3 bytes (2 for the digits plus the extra byte). Leading and trailing zeros are trimmed before calculating the size.
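As a rough sketch of that rule of thumb in Python (this is not AWS's actual algorithm, which is undocumented; the function name `estimate_number_size` is made up for illustration, and the result for zero is simply whatever the formula yields, not a confirmed value):

```python
from decimal import Decimal


def estimate_number_size(value: str) -> int:
    """Rough byte estimate for a DynamoDB Number, following the rule of thumb above."""
    sign, digits, _exponent = Decimal(value).as_tuple()
    # Trim leading and trailing zeros; only the remaining digits are counted.
    significant = "".join(str(d) for d in digits).strip("0")
    # One byte per two significant digits, rounded up for an odd count.
    digit_bytes = (len(significant) + 1) // 2
    # One extra byte for positive numbers, two for negative ones.
    overhead = 2 if sign else 1
    return digit_bytes + overhead


for example in ("27", "-27", "461"):
    print(example, estimate_number_size(example))
```

This prints 2, 3 and 3 bytes for 27, -27 and 461, matching the figures above. Treat it as an estimate only; the real sizing may differ for some values.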


That's a non-trivial topic indeed. You already quoted the somewhat sloppy definition from the Amazon DynamoDB Data Model:

An item size is the sum of lengths of its attribute names and values (binary and UTF-8 lengths).

This is detailed a bit further down the page, under Amazon DynamoDB Data Types:

  • String - Strings are Unicode with UTF8 binary encoding.
  • Number - Numbers are positive or negative exact-value decimals and integers. A number can have up to 38 digits of precision after the decimal point, and can be between 10^-128 to 10^+126. The representation in Amazon DynamoDB is of variable length. Leading and trailing zeroes are trimmed.
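To illustrate the String bullet: because strings are stored as UTF-8, the size that counts is the encoded byte length, not the character count. A minimal Python sketch (the helper name `utf8_size` is just for illustration):

```python
def utf8_size(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# ASCII, one accented letter, and three CJK characters.
for sample in ("Riley", "caf\u00e9", "\u65e5\u672c\u8a9e"):
    print(sample, len(sample), utf8_size(sample))
```

The ASCII string has the same character and byte count (5 and 5), while the other two need 5 bytes for 4 characters and 9 bytes for 3 characters. The same applies to attribute names.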

A question similar to yours has been asked in the Amazon DynamoDB forum as well (see Curious nature of the "Number" type), and the answer from Stefano@AWS sheds more light on the issue:

  • The "Number" type has 38 digits of precision. These are actual decimal digits. So it can represent pretty large numbers, and there is no precision loss.
  • How much space does a Number value take up? Not too much. Our internal representation is variable length, so the size is correlated to the actual (vs. maximum) number of digits in the value. Leading and trailing zeroes are trimmed btw. [emphasis mine]

Christopher Smith's follow-up post offers more insight into what this means for storage consumption and how to calculate it. He concludes:

The existing API provides very little insight in to storage consumption, even though that is part (admittedly not that significant) of the billing. The only information is the aggregate table size, and even that data is potentially hours out of sync.

While Amazon does not expose its billing data via an API yet, they will hopefully add an option to retrieve some information regarding item size to the DynamoDB API at some point, as suggested by Christopher.


The DynamoDB Storage Backend for Titan contains an algorithm for computing DynamoDB item size in its DynamoDBDelegate class, which you can use.


I found this answer in the Amazon developer forum, posted by Clarence@AWS:

For example:

"Item": {
    "time": {"N": "300"},
    "feeling": {"S": "not surprised"},
    "user": {"S": "Riley"}
}

To calculate the size of the above item:

The item size is the sum of lengths of the attribute names and values, interpreted as UTF-8 characters. In the example, the number of bytes of the item is therefore the sum of

time: 4 + 3 = 7
feeling: 7 + 13 = 20
user: 4 + 5 = 9

Which is 36 bytes in total.

For the formal definition, refer to: http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/WorkingWithDDItems.html
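The arithmetic in the forum answer can be reproduced with a short Python sketch. It assumes, as that answer does, that every value (including the number) is counted by the UTF-8 length of its string form; this differs from the variable-length number sizing described elsewhere on this page, so real number sizes may come out smaller. The function name `forum_item_size` is made up for illustration, and it only handles scalar types such as S and N.

```python
def forum_item_size(item: dict) -> int:
    """Sum the UTF-8 lengths of attribute names and values, as in the forum answer."""
    total = 0
    for name, typed_value in item.items():
        # Each typed value looks like {"S": "Riley"} or {"N": "300"}.
        (value,) = typed_value.values()
        total += len(name.encode("utf-8")) + len(str(value).encode("utf-8"))
    return total


item = {
    "time": {"N": "300"},
    "feeling": {"S": "not surprised"},
    "user": {"S": "Riley"},
}
print(forum_item_size(item))  # 36
```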