AWS Athena: use "folder" name as partition

It is possible to do this now using partition projection with storage.location.template, which lets you partition by part of your S3 path. Be sure NOT to include the new column in the regular column list; declare it only under PARTITIONED BY, and it will be added automatically. There are many more projection options you can look up to adapt this to your date example. I used "id" to show the simplest version I could think of.

CREATE EXTERNAL TABLE `some_table`(
  `col1` bigint)
PARTITIONED BY (
  `id` string
  )
ROW FORMAT SERDE 
  'org.openx.data.jsonserde.JsonSerDe' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION
  's3://path/bucket/'
TBLPROPERTIES (
  'has_encrypted_data'='false',
  'projection.enabled'='true', 
  'projection.id.type' = 'injected',
  'storage.location.template'='s3://path/bucket/${id}/'
  )
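
With the 'injected' type Athena has no list of partition values to enumerate, so, per the AWS docs linked below, every query has to pin id with a static equality condition in the WHERE clause. A minimal sketch of such a query, assuming a folder s3://path/bucket/abc123/ exists (the value 'abc123' is made up for illustration):

```sql
-- 'abc123' is a hypothetical id value. Athena substitutes it into
-- storage.location.template and reads only s3://path/bucket/abc123/.
SELECT col1
FROM some_table
WHERE id = 'abc123';
```

A query with no equality condition on id should be rejected rather than scan the whole bucket.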

official docs: https://docs.amazonaws.cn/en_us/athena/latest/ug/partition-projection-dynamic-id-partitioning.html
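
For date-named folders such as s3://my-bucket/2017-01-01/, the same approach should work with the 'date' projection type instead of 'injected'. The following is only a sketch: the table name some_table_by_date, the column name dt, the bucket my-bucket, and the 2017-01-01 start of the range are all assumptions to adjust to your own layout.

```sql
-- Hypothetical table for folders named yyyy-MM-dd directly under the bucket.
-- The 'date' type lets Athena compute the partition values from the range,
-- so queries may use ranges on dt (e.g. BETWEEN), not only equality.
CREATE EXTERNAL TABLE `some_table_by_date`(
  `col1` bigint)
PARTITIONED BY (
  `dt` string)
ROW FORMAT SERDE
  'org.openx.data.jsonserde.JsonSerDe'
LOCATION
  's3://my-bucket/'
TBLPROPERTIES (
  'projection.enabled'='true',
  'projection.dt.type'='date',
  'projection.dt.format'='yyyy-MM-dd',
  'projection.dt.range'='2017-01-01,NOW',
  'projection.dt.interval'='1',
  'projection.dt.interval.unit'='DAYS',
  'storage.location.template'='s3://my-bucket/${dt}/'
  );
```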


Sadly, this is not supported directly in Athena. For partitioning to work automatically with folders, the folders must follow a specific naming scheme.

e.g. s3://my-bucket/{columnname}={columnvalue}/data.json

In your case, you can still use partitioning if you add those partitions manually to the table.

e.g. ALTER TABLE tablename ADD PARTITION (datecolumn='2017-01-01') LOCATION 's3://my-bucket/2017-01-01/';
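
Several folders can be registered in a single statement, and IF NOT EXISTS makes the statement safe to re-run. A short sketch along the same lines, where the second folder (2017-01-02) is an assumed example; note that Athena runs one statement per query, so the two statements below would be submitted separately:

```sql
-- Register two date folders as partitions in one statement.
ALTER TABLE tablename ADD IF NOT EXISTS
  PARTITION (datecolumn='2017-01-01') LOCATION 's3://my-bucket/2017-01-01/'
  PARTITION (datecolumn='2017-01-02') LOCATION 's3://my-bucket/2017-01-02/';

-- Filtering on the partition column limits the scan to the matching folder.
SELECT * FROM tablename WHERE datecolumn = '2017-01-01';
```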

The AWS docs have some good examples on that topic.

AWS Athena Partitioning