# PR

## Morsel

- Issue: [Introduce morsel-driven Parquet scan #20529](https://github.com/apache/datafusion/issues/20529)
- Issue: [Support Morsel output for Parquet known to be non blocking #21598](https://github.com/apache/datafusion/issues/21598)
- PR: [Introduce Morselizer API, rewrite ParquetOpener to ParquetMorselizer](https://github.com/apache/datafusion/pull/21327)
- PR: [Rewrite FileStream in terms of Morsel API #21342](https://github.com/apache/datafusion/pull/21342)
- PR: [Refactor parquet datasource into an explicit state machine #21190](https://github.com/apache/datafusion/pull/21190)

# Others

- [Add io-uring based ObjectStore for local file I/O](https://github.com/apache/datafusion/pull/21673)
- [feat: Cache Parquet metadata in built in parquet reader #16971](https://github.com/apache/datafusion/pull/16971)

## Filter pushdown

- **DataFusion**
  - [Push down InList or hash table references from HashJoinExec depending on the size of the build side #18393](https://github.com/apache/datafusion/pull/18393)
    - dynamic filter
- **Arrow**
  - [Parquet: Adaptive Parquet Predicate Pushdown #8733](https://github.com/apache/arrow-rs/pull/8733)
  - [Speed up Parquet filter pushdown v4 Predicate evaluation cache for async_reader #7850](https://github.com/apache/arrow-rs/pull/7850)

----

# Issue

- [Fix performance regressions when enabling parquet filter pushdown](https://github.com/apache/datafusion/issues/20324)
- [EPIC: Sort pushdown / partially sorted scans #17348](https://github.com/apache/datafusion/issues/17348)

----

# Posts

- [Query Optimization - When Ordering Requirement is Satisfied?](https://akurmustafa.github.io/blogs/query_optimization_ordering_analysis/query_optimization_ordering_analysis.html)
- [Using Ordering for Better Plans in Apache DataFusion](https://datafusion.apache.org/blog/2025/03/11/ordering-analysis/)
- [Using Rust async for Query Execution and Cancelling Long-Running Queries](https://datafusion.apache.org/blog/2025/06/30/cancellation/)
- [x] [Using External Indexes, Metadata Stores, Catalogs and Caches to Accelerate Queries on Apache Parquet](https://datafusion.apache.org/blog/2025/08/15/external-parquet-indexes/)
- filter & pruning
  - [x] [Querying Parquet with Millisecond Latency](https://www.influxdata.com/blog/querying-parquet-millisecond-latency/)
    - [[query-parquet-low-latency-blog]]
  - [x] [Efficient Filter Pushdown in Parquet](https://datafusion.apache.org/blog/2025/03/21/parquet-pushdown/)
  - [x] [Parquet Pruning in DataFusion: Read Only What Matters](https://datafusion.apache.org/blog/2025/03/20/parquet-pruning/)
  - [x] [Dynamic Filters: Passing Information Between Operators During Execution for 25x Faster Queries](https://datafusion.apache.org/blog/2025/09/10/dynamic-filters/)
  - [Turning LIMIT into an I/O Optimization: Inside DataFusion’s Multi-Layer Pruning Stack](https://datafusion.apache.org/blog/2026/03/20/limit-pruning)
- string view
  - [Using StringView / German Style Strings to Make Queries Faster: Part 1 - Reading Parquet](https://datafusion.apache.org/blog/2024/09/13/string-view-german-style-strings-part-1/)
  - [Using StringView / German Style Strings to make Queries Faster: Part 2 - String Operations](https://datafusion.apache.org/blog/2024/09/13/string-view-german-style-strings-part-2/)
  - [Faster DataFusion with StringView - Xiangpeng Hao (Aug 15, 2024)](https://www.youtube.com/watch?v=RVLshX6fbds)

---

# Meetup

- [Video: The Spice Cluster-Sidecar Architecture with Apache DataFusion: Seattle DataFusion Meetup, April 2026](https://www.youtube.com/watch?v=3rvmlGyPrX8)
- [Apache Ballista at Spice AI: Distributed Query Execution Without the Operational Tax](https://spice.ai/blog/apache-ballista-at-spice-ai)

---

# Code

- Range-based pruning ([PruningPredicate](https://docs.rs/datafusion/latest/datafusion/physical_optimizer/pruning/struct.PruningPredicate.html)) for cases where your index stores min/max values.
- Interval-based range analysis for predicates ([cp_solver](https://docs.rs/datafusion/latest/datafusion/physical_expr/intervals/cp_solver/index.html)), e.g. `col > 5 AND col < 10`.
- [ParquetAccessPlan](https://docs.rs/datafusion/latest/datafusion/datasource/physical_plan/parquet/struct.ParquetAccessPlan.html)
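
Note to self: a minimal sketch of the idea behind min/max range pruning for a predicate like `col > 5 AND col < 10`, in plain Rust with no DataFusion dependency. `ColumnStats`, `may_match_open_range` and `prune` are hypothetical names made up for illustration. They are not DataFusion APIs, and the real `PruningPredicate` handles far more general expressions and statistics.

```rust
/// Min/max statistics for one column in one container (e.g. a row group).
/// Hypothetical type for illustration only; not a DataFusion API.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ColumnStats {
    min: i64,
    max: i64,
}

/// Conservative check for the predicate `lo < col AND col < hi`.
/// Returns `false` only when the statistics prove that no row can match,
/// so the container can be skipped. `true` means "maybe", not "definitely".
fn may_match_open_range(stats: ColumnStats, lo: i64, hi: i64) -> bool {
    // A value above `lo` requires max > lo; a value below `hi` requires min < hi.
    stats.max > lo && stats.min < hi
}

/// Returns the indices of the containers that still have to be read.
fn prune(containers: &[ColumnStats], lo: i64, hi: i64) -> Vec<usize> {
    containers
        .iter()
        .enumerate()
        .filter_map(|(i, s)| may_match_open_range(*s, lo, hi).then_some(i))
        .collect()
}
```

The check is deliberately one-sided: it may keep a container that turns out to hold no matching rows, but it never drops one that does, which is the property a pruning step needs.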