Project: UFFS - Ultra Fast File Search (Rust Implementation) Start Date: 2026-01-15 Target Completion: 2026-06-15 (21 weeks) Status: π’ Substantially Complete Architecture: Cargo Workspace with 6 crates (Polars-based)
crates/
βββ uffs-polars/ π§ Polars facade (compilation isolation) β
βββ uffs-mft/ π¦ MFT reading β Polars DataFrame β
βββ uffs-core/ π¦ Query engine using Polars lazy API β
βββ uffs-cli/ π§ Command-line interface β
βββ uffs-tui/ π₯οΈ Terminal UI β
βββ uffs-gui/ πͺ Graphical UI (placeholder)
| Phase | Milestone | Crate(s) | Target | Status | Progress |
|---|---|---|---|---|---|
| 0 | Workspace Setup | all | Week 1 | π’ Complete | 100% |
| 1 | MFT Foundation | uffs-mft | Week 4 | π’ Complete | 100% |
| 2 | MFT DataFrame | uffs-mft | Week 8 | π’ Complete | 100% |
| 2.5 | RAW MFT Persistence | uffs-mft | - | π’ Complete | 100% |
| 2.6 | Multi-Drive Parallel Reading | uffs-mft, uffs-cli | - | π’ Complete | 100% |
| 3 | Core Processing | uffs-core | Week 12 | π’ Complete | 100% |
| 3.5 | Directory Tree Structure | uffs-core | - | π’ Complete | 100% |
| 4 | CLI & Performance | uffs-cli | Week 16 | π’ Complete | 100% |
| 5 | TUI & Polish | uffs-tui | Week 21 | π’ Complete | 95% |
| 6 | Performance Optimization | uffs-mft, uffs-core | Week 31 | β¬ Not Started | 0% |
Legend: β¬ Not Started | π‘ In Progress | π’ Complete | π΄ Blocked
Phase 6 focuses on making the Rust implementation 2-5x faster than the reference binary through:
- Pipelined I/O β (2x speedup, 250x memory reduction) - overlap I/O and CPU work
- Arena-based name storage (eliminate allocations)
- Vector-based FRS lookup (O(1) direct indexing)
- In-place path building (append-reverse strategy)
- Direct-to-column parsing (skip intermediate structs)
- SIMD pattern matching (AVX2/AVX-512)
Key Insight: The C++ code uses Windows IOCP to pipeline I/O and parsing. When a read completes, it immediately queues the next read BEFORE processing the current buffer. This ensures disk and CPU are never idle, achieving near-perfect overlap.
See MFT Architecture Deep Dive for detailed analysis.
| Crate | Responsibility |
|---|---|
uffs-mft |
Pure MFT reading & storage (DataFrame, CSV, Parquet, RAW) |
uffs-core |
Post-processing (tree structure, queries, derived metrics) |
Design Principle:
uffs-mftdoes pure MFT reading and storage. No post-processing. Tree calculations and derived metrics belong inuffs-corequery engine.
Goal: Establish modern Rust workspace with Polars facade Target Date: Week 1 Status: π’ Complete
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 0.1 | Workspace Cargo.toml | - | π’ | [workspace] manifest with all crates |
| 0.2 | crates/ directory structure | - | π’ | 6 crate directories created |
| 0.3 | uffs-polars facade crate | - | π’ | Re-exports polars::prelude, column constants |
| 0.4 | uffs-mft crate skeleton | - | π’ | Full implementation |
| 0.5 | uffs-core crate skeleton | - | π’ | Full implementation |
| 0.6 | uffs-cli crate skeleton | - | π’ | Full implementation |
| 0.7 | Workspace dependencies | - | π’ | [workspace.dependencies] configured |
| 0.8 | rustfmt.toml | - | π’ | Code formatting configured |
| 0.9 | clippy.toml | - | π’ | Linting rules configured |
| 0.10 | GitHub Actions CI | - | π’ | CI pipeline configured |
| 0.11 | MSRV policy | - | π’ | rust-version = "1.85" (Edition 2024) |
-
cargo build --workspacesucceeds -
cargo test --workspaceruns (even if no tests yet) -
cargo clippy --workspacepasses - CI pipeline runs on push/PR
- All crates have proper Cargo.toml with workspace inheritance
Goal: Core NTFS structures and raw disk access
Crate: uffs-mft
Target Date: Week 4
Status: π’ Complete
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 1.1 | NtfsBootSector struct | - | π’ | ntfs.rs - Full boot sector parsing |
| 1.2 | FileRecordHeader struct | - | π’ | ntfs.rs - FileRecordSegmentHeader |
| 1.3 | AttributeHeader struct | - | π’ | ntfs.rs - AttributeRecordHeader |
| 1.4 | Resident attribute parsing | - | π’ | ntfs.rs - ResidentAttributeData |
| 1.5 | Non-resident attribute parsing | - | π’ | ntfs.rs - NonResidentAttributeData |
| 1.6 | Windows volume opening | - | π’ | platform.rs - VolumeHandle::open() |
| 1.7 | FSCTL_GET_NTFS_VOLUME_DATA | - | π’ | platform.rs - NtfsVolumeData |
| 1.8 | FSCTL_GET_RETRIEVAL_POINTERS | - | π’ | platform.rs - get_mft_extents(), MftExtent |
| 1.9 | Raw cluster reading | - | π’ | io.rs - AlignedBuffer, MftRecordReader |
| 1.10 | Error types with thiserror | - | π’ | error.rs - MftError enum |
| 1.11 | Unit tests | - | π’ | Tests in ntfs.rs, io.rs |
- Can open NTFS volume with admin privileges
- Can read boot sector and extract MFT location
- Can read raw MFT clusters
- Can parse MFT record headers
- All unit tests pass
-
cargo doc --package uffs-mftgenerates docs
Goal: Complete MFT parsing with Polars DataFrame output
Crate: uffs-mft
Target Date: Week 8
Status: π’ Complete
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 2.1 | $STANDARD_INFORMATION parsing | - | π’ | io.rs - parse_standard_info() |
| 2.2 | $FILE_NAME parsing | - | π’ | io.rs - parse_file_name() |
| 2.3 | $DATA parsing (resident) | - | π’ | io.rs - parse_record() |
| 2.4 | $DATA parsing (non-resident) | - | π’ | io.rs - size from non-resident header |
| 2.5 | Multi-sector fixup (unfixup) | - | π’ | io.rs - apply_fixup(), fixup_file_record() |
| 2.6 | $BITMAP parsing | - | π’ | platform.rs - MftBitmap, get_mft_bitmap() |
| 2.7 | $REPARSE_POINT parsing | - | π’ | ntfs.rs - ReparsePointHeader, ReparseMountPointBuffer |
| 2.8 | Run list (mapping pairs) | - | π’ | ntfs.rs - DataRun, parse_data_runs(), extract_data_runs_from_attribute() |
| 2.9 | Attribute iteration | - | π’ | ntfs.rs - AttributeIterator, AttributeRef |
| 2.10 | Attribute list support | - | π’ | ntfs.rs - AttributeListEntry (large files) |
| 2.11 | Index structures | - | π’ | ntfs.rs - IndexHeader, IndexRoot (directories) |
| 2.12 | DataFrame construction | - | π’ | reader.rs - build_dataframe() |
| 2.13 | Parquet persistence | - | π’ | reader.rs - save_parquet()/load_parquet() |
| 2.14 | Unit tests | - | π’ | Tests in io.rs, reader.rs, ntfs.rs |
- Can parse all standard NTFS attributes
- Multi-sector fixup correctly applied
- Can extract file names and parent references
- MFT data returned as Polars DataFrame
- DataFrame can be saved/loaded as Parquet
- All unit tests pass
Goal: Save/load complete raw MFT bytes for offline analysis
Crate: uffs-mft
Status: π’ Complete
Completed: 2026-01-16
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 2.5.1 | save_raw_mft() function |
- | π’ | raw.rs - Save complete MFT bytes to file |
| 2.5.2 | load_raw_mft() function |
- | π’ | raw.rs - Load saved MFT bytes |
| 2.5.3 | Handle fragmented MFT | - | π’ | reader.rs - read_raw() reassembles extents |
| 2.5.4 | Optional zstd compression | - | π’ | raw.rs - zstd feature flag |
| 2.5.5 | CLI commands save-raw / load-raw |
- | π’ | uffs-cli/commands.rs |
New raw.rs module with:
RawMftHeader- 64-byte header with magic, version, flags, sizesRawMftData- Loaded raw MFT with record iterationSaveRawOptions/LoadRawOptions- Configuration structssave_raw_mft()/load_raw_mft()- File I/O with optional zstd compressionMftReader::read_raw()- Read MFT as raw bytes (handles fragmented MFT)MftReader::save_raw_to_file()- Convenience method to read and saveMftReader::load_raw_to_dataframe()- Load saved MFT and parse to DataFrame
- Can save complete raw MFT to file (including fragmented)
- Can load saved raw MFT and parse it
- Compression reduces file size significantly (zstd feature)
- CLI commands work correctly (
uffs save-raw,uffs load-raw) - No automatic saving (user must explicitly request)
Goal: Read MFTs from multiple drives concurrently and merge into unified DataFrame
Crates: uffs-mft, uffs-cli
Status: π’ Complete
When searching across multiple NTFS volumes (C:, D:, E:, etc.), reading each MFT sequentially
is inefficient. Since each drive has independent I/O, we can read all MFTs in parallel and
merge the results into a single DataFrame with a drive column to distinguish sources.
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 2.6.1 | MultiDriveMftReader struct |
- | π’ | reader.rs - Orchestrates parallel drive reading |
| 2.6.2 | Async concurrent drive reading | - | π’ | tokio::spawn for each drive with JoinSet |
| 2.6.3 | DataFrame merging with drive column |
- | π’ | Polars vstack() with added column |
| 2.6.4 | CLI --drives flag |
- | π’ | Accept multiple drives: --drives C,D,E |
| 2.6.5 | Progress aggregation | - | π’ | Per-drive progress bars with MultiProgress |
| 2.6.6 | Error handling per drive | - | π’ | Continue on failure, report which drives failed |
| 2.6.7 | Unit tests | - | π’ | Tests for MultiDriveMftReader |
/// Reads MFTs from multiple drives in parallel.
pub struct MultiDriveMftReader {
drives: Vec<char>,
}
impl MultiDriveMftReader {
pub fn new(drives: Vec<char>) -> Self { ... }
/// Read all drives concurrently, merge into single DataFrame.
/// Adds a "drive" column (e.g., "C:", "D:") to distinguish sources.
pub async fn read_all(&self) -> Result<DataFrame> {
// Spawn async task for each drive
// Collect results
// Add "drive" column to each DataFrame
// Concat all DataFrames
}
/// Read with per-drive progress callbacks.
pub async fn read_with_progress<F>(&self, callback: F) -> Result<DataFrame>
where
F: Fn(char, MftProgress) + Send + Sync + 'static;
}# Read from multiple drives
uffs index --drives C,D,E --output all_drives.parquet
# Search across all indexed drives
uffs search --index all_drives.parquet "*.rs"
# Save raw MFT from multiple drives
uffs save-raw --drives C,D --output-dir ./raw_mfts/- Can read MFTs from multiple drives concurrently
- Merged DataFrame has
drivecolumn (e.g., "C:", "D:") - Progress shows per-drive and aggregate status
- Graceful handling when one drive fails (others continue)
- CLI accepts
--drives C,D,Esyntax - Performance scales with number of drives (parallel I/O)
Goal: Query engine using Polars lazy API
Crate: uffs-core
Target Date: Week 12
Status: π’ Complete
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 3.1 | MftQuery builder | - | π’ | query.rs - Fluent API wrapping LazyFrame |
| 3.2 | Polars filter predicates | - | π’ | size, date, type filters implemented |
| 3.3 | PathResolver struct | - | π’ | path_resolver.rs - FRS β full path |
| 3.4 | Glob to regex conversion | - | π’ | glob.rs - glob_to_regex() |
| 3.5 | Polars string matching | - | π’ | query.rs - glob(), regex(), contains() |
| 3.6 | Streaming mode support | - | π’ | query.rs - collect_streaming() |
| 3.7 | Table exporter | - | π’ | export.rs - export_table() |
| 3.8 | JSON exporter | - | π’ | export.rs - export_json() |
| 3.9 | CSV exporter | - | π’ | export.rs - export_csv() |
| 3.10 | Unit tests | - | π’ | Tests in query.rs, glob.rs, export.rs |
- MftQuery wraps Polars LazyFrame
- Polars lazy predicates work correctly
- Path resolution is accurate
- Export formats produce valid output
- Streaming mode handles large datasets
- All unit tests pass
Note: Tree structure is post-processing, belongs in
uffs-corenotuffs-mft.
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 3.5.1 | TreeIndex struct |
- | π’ | tree.rs - Build parentβchildren index with memoization |
| 3.5.2 | descendants calculation |
- | π’ | Count of all items under a directory |
| 3.5.3 | treesize calculation |
- | π’ | Sum of all file sizes in subtree |
| 3.5.4 | tree_allocated calculation |
- | π’ | Sum of allocated sizes in subtree |
| 3.5.5 | bulkiness calculation |
- | π’ | tree_allocated / treesize ratio (fragmentation metric) |
| 3.5.6 | Add tree columns to query results | - | π’ | On-demand via add_tree_columns() |
| 3.5.7 | Unit tests | - | π’ | 9 tests in tree.rs |
Goal: CLI tool and performance optimization
Crate: uffs-cli
Target Date: Week 16
Status: π’ Complete
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 4.1 | CLI argument parsing | - | π’ | main.rs - clap derive |
| 4.2 | search command |
- | π’ | commands.rs - Full search with filters |
| 4.3 | index command |
- | π’ | commands.rs - Build/save index |
| 4.4 | stats command |
- | π’ | commands.rs - Volume statistics |
| 4.5 | Progress indicators | - | π’ | commands.rs - indicatif progress bar |
| 4.6 | Error messages | - | π’ | anyhow + context |
| 4.7 | Batch I/O optimization | - | π’ | io.rs - BatchMftReader, 1MB chunks |
| 4.8 | Parallel MFT reading | - | π’ | io.rs - ParallelMftReader with Rayon |
| 4.9 | Cluster-level bitmap skip | - | π’ | platform.rs - calculate_skip_range(), in_use_cluster_ranges() |
| 4.10 | Fragmented MFT support | - | π’ | io.rs - MftExtentMap, VCN-to-LCN mapping |
| 4.11 | Benchmark suite | - | π‘ | Skeleton in benches/ |
| 4.12 | Performance profiling | - | β¬ | Future work |
| 4.13 | Integration tests | - | π‘ | Basic tests |
- CLI accepts all documented arguments
- Progress shown during indexing
- MFT read speed β₯500 MB/s (needs Windows testing)
- Index build β€2s for 1M files (needs Windows testing)
- Search latency <10ms (needs Windows testing)
- All benchmarks pass
| Metric | C++ Baseline | Current | Target | Status |
|---|---|---|---|---|
| MFT Read (MB/s) | 500 | TBD | β₯500 | π‘ |
| Index Build (1M files) | 2.0s | TBD | β€1.5s | π‘ |
| Search Latency | 8ms | TBD | <5ms (SIMD) | π‘ |
| Memory/File | 32B | TBD | ~45B (Polars) | π‘ |
| Parquet Size | N/A | TBD | ~60% of raw | π‘ |
The implementation matches the historical baseline for performance with these key components:
| Component | Location | Description |
|---|---|---|
MftExtentMap |
io.rs |
VCN-to-LCN mapping for fragmented MFT support |
MftBitmap |
platform.rs |
Tracks which MFT records are in use |
calculate_skip_range() |
platform.rs |
Cluster-level skip range calculation |
in_use_cluster_ranges() |
platform.rs |
Iterator over clusters with in-use records |
generate_read_chunks() |
io.rs |
Creates optimized 1MB read chunks |
ParallelMftReader |
io.rs |
Orchestrates parallel reading with Rayon |
BatchMftReader |
io.rs |
Reads multiple records per I/O operation |
ReadChunk |
io.rs |
Represents a contiguous read with skip info |
Performance Features Implemented:
- β
Fragmented MFT Support - MFT can be scattered across disk;
MftExtentMaphandles VCN-to-LCN translation - β Cluster-Level Bitmap Skipping - Skip entire clusters where all records are unused
- β Batch I/O (1MB chunks) - Reduce syscall overhead by reading multiple records per I/O
- β Parallel Record Processing - Use Rayon to parse records across all CPU cores
- β USA Fixup - Apply Update Sequence Array fixup to detect torn writes
Goal: Terminal UI and production readiness
Crate: uffs-tui
Target Date: Week 21
Status: π’ Complete (95%)
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 5.1 | TUI framework setup | - | π’ | main.rs - ratatui + crossterm |
| 5.2 | Search input widget | - | π’ | app.rs - Input handling |
| 5.3 | Results list widget | - | π’ | app.rs - Scrollable list |
| 5.4 | File details panel | - | π‘ | Basic (details in list) |
| 5.5 | Progress indicators | - | π’ | Status bar display |
| 5.6 | Keyboard navigation | - | π’ | Up/Down/Enter/Esc |
| 5.7 | Admin privilege check | - | π‘ | Windows-only (future) |
| 5.8 | User documentation | - | π‘ | Inline docs, --help |
| 5.9 | API documentation | - | π‘ | rustdoc for all crates |
| 5.10 | Release builds | - | β¬ | Future work |
| 5.11 | Cross-compilation | - | β¬ | Windows targets |
- TUI launches and displays search interface
- Real-time search updates as you type
- Keyboard navigation works smoothly
- Progress shown during indexing
- Documentation complete and accurate
- Release binaries tested on clean system
Goal: Make the Rust implementation 2-5x faster than the reference binary through architectural optimizations
Crates: uffs-mft, uffs-core
Target Date: Week 31
Status: π‘ In Progress
Reference: MFT Architecture Deep Dive
Before diving into optimizations, note that uffs-mft is already 55% faster than v0.1.30:
| Drive | v0.1.30 (Before) | v0.1.39 (Current) | Improvement |
|---|---|---|---|
| SSD C: | 11.3s | 3.1s | 73% faster |
| HDD S: | 160.6s | 45.9s | 71% faster |
| Total (7 drives) | 315s | 142s | 55% faster |
What's Already Done in uffs-mft:
- β Bitmap-based cluster skipping (skip free records)
- β Rayon fold/reduce parallel parsing
- β
SoA layout (
ParsedColumns) - β
PrefetchMftReaderfor HDDs (double-buffered I/O) - β
ParallelMftReaderfor SSDs (8MB chunks) - β Drive-type auto-detection
The Real Bottleneck is in uffs-core:
- β Path resolution FIXED (FastPathResolver with Vec-based O(1) lookup)
- β No SIMD pattern matching
- β No early termination for
--limit
Analysis of the historical baseline revealed key architectural differences that explain its speed:
- Contiguous memory: All data in vectors, not scattered heap allocations
- Direct indexing:
vector[frs]instead of HashMap lookups - Packed structures: Minimal memory footprint, cache-friendly
- In-place operations: Reuse buffers, avoid allocations
- Single-pass processing: Parse during read, not after
| Metric | C++ | Rust (Current) | Target |
|---|---|---|---|
*.txt search time |
~43.6s | ~56.1s | <20s |
| Files found | 735K | ~200K (bug) | 735K+ |
| Memory per file | ~64 bytes | ~200+ bytes | <50 bytes |
| Path resolution | O(1) array | O(1) HashMap* | O(1) array |
*HashMap has significant constant factor overhead vs direct array indexing.
Goal: Establish baseline measurements and automated comparison tools. Status: π‘ In Progress
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 6.0.1 | just bench-vs-cpp command |
- | π’ | Compare Rust vs C++ (~/bin/uffs.com) |
| 6.0.2 | just bench-micro command |
- | π’ | Run criterion micro-benchmarks |
| 6.0.3 | just bench-search command |
- | π’ | End-to-end search benchmark |
| 6.0.4 | Fill in MFT reading benchmarks | - | β¬ | Replace placeholder in mft_read.rs |
| 6.0.5 | Fill in query benchmarks | - | β¬ | Replace placeholder in query.rs |
| 6.0.6 | Add resolve_path() benchmark |
- | β¬ | Measure path resolution performance |
| 6.0.7 | Baseline tracking | - | π’ | benchmarks/baseline.json |
Goal: Fix correctness issues and establish baseline. Status: π’ Complete
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 6.1.1 | Fix PathResolver bug | - | π’ | Build from FULL MFT data before filtering |
| 6.1.2 | Add benchmark suite | - | π’ | criterion benchmarks for key operations |
| 6.1.3 | Profile current code | - | β¬ | flamegraph, heaptrack analysis |
| 6.1.4 | Establish baseline metrics | - | π’ | just bench-vs-cpp provides this |
Completed:
FastPathResolverwith Vec-based O(1) lookup (replaces HashMap)NameArenafor contiguous name storage- Search pipeline builds resolver from FULL MFT before filtering
- 10 unit tests for path resolution
- Benchmark comparison: HashMap vs Vec resolver
Goal: Eliminate allocation overhead. Status: π’ Complete (merged with Phase 6.1)
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 6.2.1 | Implement NameArena |
- | π’ | Single buffer for all names |
| 6.2.2 | Implement NameRef |
- | π’ | offset + length via FastEntry |
| 6.2.3 | Replace HashMap with Vec | - | π’ | Direct FRS indexing in FastPathResolver |
| 6.2.4 | In-place path building | - | π’ | Append-reverse strategy (like C++) |
| 6.2.5 | Add NameRef to ParsedColumns | - | β¬ | Future: Reference instead of String clone |
Expected Impact: 2-3x speedup from reduced allocations.
Goal: Reduce parsing overhead.
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 6.3.1 | Direct-to-column parsing | - | β¬ | Skip intermediate ParsedRecord |
| 6.3.2 | Lazy attribute parsing | - | β¬ | Only parse what's needed for query |
| 6.3.3 | Bit-packed attributes | - | β¬ | Single u32 for all 18 flags |
| 6.3.4 | Compact record format | - | β¬ | ~56 bytes vs ~200+ bytes |
Expected Impact: 1.3-1.5x speedup from reduced overhead.
Goal: Overlap I/O and CPU work for additional HDD speedup. Status: π’ Complete
Implementation:
PipelinedMftReaderusescrossbeam-channelbounded channels to overlap I/O and CPU work. Reader thread queues chunks while parser thread processes. Auto mode now selectsPipelinedfor HDDs instead ofPrefetch.
Architecture:
Reader Thread βββΆ [Bounded Channel] βββΆ Parser Thread
β β
βΌ βΌ
Read chunks Parse records
as fast as as they arrive
possible (with Rayon)
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 6.4.1 | Channel-based pipeline | - | π’ | crossbeam-channel bounded channels |
| 6.4.2 | Reader thread(s) | - | π’ | Dedicated thread queues reads |
| 6.4.3 | Parser thread(s) | - | π’ | Main thread parses with MftRecordMerger |
| 6.4.4 | Backpressure handling | - | π’ | Bounded channel (depth=3) prevents memory explosion |
| 6.4.5 | Multi-extent parallelism | - | β¬ | Future: One reader per MFT extent |
Expected Impact (HDDs only):
- 20-35% additional speedup on HDDs where I/O time β CPU time
- True overlap:
Time = max(I/O, CPU)instead ofTime = I/O + CPU
Goal: Maximize disk throughput.
Status: π’ Mostly Complete (via PrefetchMftReader and ParallelMftReader)
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 6.5.1 | Adaptive read sizing | - | π’ | MftReadMode::Auto detects SSD vs HDD |
| 6.5.2 | MFT bitmap optimization | - | π’ | Bitmap-based cluster skipping implemented |
| 6.5.3 | Prefetch optimization | - | π’ | PrefetchMftReader with double-buffering |
| 6.5.4 | Memory-mapped I/O option | - | β¬ | Optional: For SSDs with sufficient RAM |
Already Achieved: 55% faster than v0.1.30, 1,839 MB/s on SSDs.
Goal: Accelerate pattern matching. Status: π’ Complete (except SIMD which is deferred)
| ID | Deliverable | Owner | Status | Notes |
|---|---|---|---|---|
| 6.6.1 | SIMD pattern matching | - | β¬ | AVX2/AVX-512 for wildcards (deferred) |
| 6.6.2 | Early termination | - | π’ | Streaming mode has early termination |
| 6.6.3 | Parallel path resolution | - | π’ | add_path_column_parallel() with Rayon |
| 6.6.4 | Extension index | - | π’ | ExtensionIndex for fast *.ext queries |
Expected Impact: 1.2-1.5x speedup for pattern matching.
- PathResolver correctly resolves ALL files (matches reference output)
- Benchmark suite covers all critical paths
-
*.txtsearch completes in <20s (vs C++ ~43.6s) - Memory usage <50 bytes per file (vs current ~200+)
- All existing tests continue to pass (77 tests)
- Performance regression tests in CI
| Optimization | Expected Speedup | Cumulative |
|---|---|---|
| Fix PathResolver | 1.0x (correctness) | 1.0x |
| NameArena | 2-3x | 2-3x |
| Vec-based lookup | 1.5-2x | 3-6x |
| In-place path building | 1.5-2x | 4.5-12x |
| Direct parsing | 1.3-1.5x | 6-18x |
| Pipelined I/O β | 2x | 12-36x |
| SIMD matching | 1.2-1.5x | 14-54x |
Conservative estimate: 10-15x faster than current Rust Optimistic estimate: 25-50x faster than current Rust Target: 2-5x faster than the reference binary
| Approach | Memory for 1M files |
|---|---|
| Current Rust | ~1GB (entire MFT) |
| Pipelined Rust | ~4MB (4 Γ 1MB buffers) |
| Reduction | 250x |
// Arena-based name storage (like C++ std::tvstring)
pub struct NameArena {
buffer: String, // All names concatenated
}
pub struct NameRef {
offset: u32, // Offset into arena
length: u16, // Name length
is_ascii: bool, // ASCII compression flag
}
// Fast path resolver (like C++ RecordsLookup)
pub struct FastPathResolver {
entries: Vec<Option<(u32, NameRef)>>, // Index = FRS
names: NameArena,
volume: char,
}
// Compact record (like C++ Record)
pub struct CompactRecord {
pub frs: u64,
pub parent_frs: u32, // u32 sufficient (like C++)
pub name_ref: NameRef, // 8 bytes
pub size: u64,
pub attributes: u32, // Bit-packed flags
pub timestamps: [i64; 4], // Inline array
}
// Total: ~56 bytes vs current ~200+ bytes| ID | Risk | Impact | Probability | Mitigation | Status |
|---|---|---|---|---|---|
| R1 | Windows API complexity | High | Medium | Use windows crate, extensive testing |
β¬ Open |
| R2 | Performance regression | High | Low | Continuous benchmarking vs C++ | β¬ Open |
| R3 | Memory safety with raw I/O | Critical | Medium | Buffer management, fuzzing | β¬ Open |
| R4 | NTFS edge cases | Medium | Medium | Test on diverse volumes | β¬ Open |
| R5 | Admin privilege issues | Medium | Low | Clear error messages, docs | β¬ Open |
| R6 | Workspace complexity | Low | Low | Clear crate boundaries, docs | β¬ Open |
| Crate | Depends On | Key External Deps |
|---|---|---|
uffs-polars |
- | polars (all features) |
uffs-mft |
uffs-polars | windows, rayon, bitflags, thiserror |
uffs-core |
uffs-polars, uffs-mft | regex |
uffs-cli |
uffs-core | clap, indicatif, anyhow, tokio |
uffs-tui |
uffs-core | ratatui, crossterm |
uffs-gui |
uffs-core | egui (future) |
Phase 0 (Workspace Setup + uffs-polars facade)
β
Phase 1 (uffs-mft Foundation)
β
Phase 2 (uffs-mft DataFrame)
β
Phase 3 (uffs-core Processing with Polars)
β
Phase 4 (uffs-cli & Performance)
β
Phase 5 (uffs-tui & Polish)
β
Phase 6 (Performance Optimization - "Outperform the Reference Baseline")
- Analyzed C++ codebase
- Created implementation plan
- Created milestone document
- Refactored for workspace architecture
- Refactored for Polars-based architecture
- Set up workspace structure with uffs-polars facade
- Establish CI/CD pipeline
- Implemented uffs-polars facade crate
- Implemented uffs-mft with NTFS structures
- Implemented uffs-mft with MftReader and DataFrame
- Implemented uffs-core with MftQuery fluent API
- Implemented uffs-cli with search/index/stats commands
- Implemented uffs-tui with ratatui interface
- Workspace compiles successfully
| Date | Change | Reason |
|---|---|---|
| 2026-01-15 | Initial document creation | Project kickoff |
| 2026-01-15 | Refactored for workspace architecture | Modular crate design |
| 2026-01-15 | Refactored for Polars-based architecture | SIMD, parallelism, Parquet persistence |
| 2026-01-16 | Implementation complete | All core crates implemented |
| 2026-01-16 | High-performance MFT reading | Parallel processing (Rayon), batch I/O, cluster-level bitmap skipping, fragmented MFT support |
| 2026-01-16 | Phase 3.5: Directory Tree Structure | TreeIndex with memoized metrics: descendants, treesize, tree_allocated, bulkiness |
| 2026-01-19 | Phase 6: Performance Optimization | Deep dive analysis of the historical baseline vs Rust architecture; roadmap to make Rust 2-5x faster than the reference binary |
UltraFastFileSearch/
βββ Cargo.toml # Workspace manifest
βββ crates/
β βββ uffs-polars/ # π§ Polars facade (compiles ONCE)
β β βββ Cargo.toml # All Polars features here
β β βββ src/
β β βββ lib.rs # Re-exports polars::prelude::*
β β
β βββ uffs-mft/ # π¦ MFT reading β DataFrame
β β βββ Cargo.toml
β β βββ src/
β β βββ lib.rs # Public API
β β βββ reader.rs # MftReader
β β βββ dataframe.rs # DataFrame construction
β β βββ ntfs/ # NTFS structures
β β β βββ mod.rs
β β β βββ boot_sector.rs
β β β βββ file_record.rs
β β β βββ attributes.rs
β β β βββ run_list.rs
β β βββ io/ # Low-level I/O
β β β βββ mod.rs
β β β βββ volume.rs
β β β βββ async_read.rs
β β βββ platform/
β β βββ mod.rs
β β βββ windows.rs
β β
β βββ uffs-core/ # π¦ Query engine (Polars lazy)
β β βββ Cargo.toml
β β βββ src/
β β βββ lib.rs
β β βββ query.rs # MftQuery (wraps LazyFrame)
β β βββ path_resolver.rs # Path reconstruction
β β βββ glob.rs # Glob to regex
β β βββ export.rs # Table, JSON, CSV
β β
β βββ uffs-cli/ # π§ CLI binary
β β βββ Cargo.toml
β β βββ src/
β β βββ main.rs
β β βββ commands/
β β βββ mod.rs
β β βββ search.rs
β β βββ index.rs
β β βββ stats.rs
β β
β βββ uffs-tui/ # π₯οΈ TUI binary
β β βββ Cargo.toml
β β βββ src/
β β βββ main.rs
β β βββ widgets/
β β
β βββ uffs-gui/ # πͺ GUI binary (future)
β βββ Cargo.toml
β βββ src/
β βββ main.rs
β
βββ examples/ # Usage examples
βββ benches/ # Benchmarks
βββ docs/ # Documentation
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β PROJECT HEALTH DASHBOARD β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β Overall Progress: ββββββββββββββββββββ 5% β
β β
β Phase 0: ββββββββββββββββββββ 0% Phase 3: ββββββββββββ 0% β
β Phase 1: ββββββββββββββββββββ 0% Phase 4: ββββββββββββ 0% β
β Phase 2: ββββββββββββββββββββ 0% Phase 5: ββββββββββββ 0% β
β β
β Crates: 0/6 complete Tests: 0 passing / 0 total β
β Open Risks: 6 Blockers: 0 β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
- Architecture Doc:
docs/architecture/Suggested Rust Source Code Structure.docx