Reading specific partitions from a partitioned parquet dataset with pyarrow

As of pyarrow 0.10.0, you can use the filters kwarg to do this. In your case it would look something like this:

import pyarrow.parquet as pq

# The filter is applied to the partition key part2, so only matching partitions are read
dataset = pq.ParquetDataset('path-to-your-dataset', filters=[('part2', '=', 'True')])
table = dataset.read()
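
To illustrate what a partition filter does: in a directory-partitioned dataset (directories named key=value), a filter on a partition column can be decided from the directory names alone, without opening any file. Here is a rough plain-Python sketch of that idea. It is not pyarrow's implementation; the helpers parse_partitions and matches and the sample file listing are made up for illustration, and values are compared as plain strings.

```python
from pathlib import PurePosixPath


def parse_partitions(path):
    """Return {key: value} parsed from 'key=value' directory components."""
    parts = {}
    for component in PurePosixPath(path).parts[:-1]:  # skip the file name
        if "=" in component:
            key, value = component.split("=", 1)
            parts[key] = value
    return parts


def matches(path, filters):
    """True if every (column, op, value) filter holds for the path's partitions."""
    parts = parse_partitions(path)
    for column, op, value in filters:
        actual = parts.get(column)
        if op == "=":
            ok = actual == value
        elif op == "!=":
            ok = actual != value
        elif op == "in":
            ok = actual in value
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    return True


# Hypothetical listing of a dataset partitioned by part1 and part2
files = [
    "part1=a/part2=True/0.parquet",
    "part1=a/part2=False/0.parquet",
    "part1=b/part2=True/0.parquet",
]

# Same filter shape as in the pyarrow call above
selected = [f for f in files if matches(f, [("part2", "=", "True")])]
print(selected)
```

Only the two paths under part2=True end up in selected, which is why filtering on partition columns avoids reading the rest of the dataset.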



Question: How do I read specific partitions from a partitioned parquet dataset with pyarrow?

Answer: You can't right now.

Can you create an Apache Arrow JIRA requesting this feature on https://issues.apache.org/jira?

This is something we should be able to support in the pyarrow API, but it will require someone to implement it. Thank you.