How to skip headers when we are reading data from a csv file in s3 and creating a table in aws athena.
This is what works in Redshift:
You want to use
table properties ('skip.header.line.count'='1')
Along with other properties if you want, e.g.
Here's a sample:
create external table exreddb1.test_table (ID BIGINT ,NAME VARCHAR ) row format delimited fields terminated by ',' stored as textfile location 's3://mybucket/myfolder/' table properties ('numRows'='100', 'skip.header.line.count'='1');
This is a known deficiency.
The best method I've seen was tweeted by Eric Hammond:
...WHERE date NOT LIKE '#%'
This appears to skip header lines during a Query. I'm not sure how it works, but it might be a method for skipping NULLs.