Skip to content

refactor the filter code in the garnet provider #808

Merged
hailangx merged 2 commits intometajack/diskann-garnetfrom
haiyang/garnetfilter
Mar 5, 2026
Merged

refactor the filter code in the garnet provider #808
hailangx merged 2 commits intometajack/diskann-garnetfrom
haiyang/garnetfilter

Conversation

@hailangx
Copy link
Member

@hailangx hailangx commented Mar 2, 2026

Label Provider Refactor

  • Changed GarnetQueryLabelProvider from a raw pointer-based struct to a lifetime-parameterized struct that borrows a slice.

Removal of Obsolete Wrapper

  • Removed the FilteredSearchResults wrapper, which previously handled conversion between internal and external IDs for search results. This is no longer needed with previous change to expose generic external id in the beta search strategy.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors Garnet’s filtered-search label handling by moving GarnetQueryLabelProvider from a raw-pointer-based implementation to a slice-borrowing design, and simplifies the filtered search path in the dynamic index wrapper.

Changes:

  • Refactored GarnetQueryLabelProvider to store a borrowed &[u8] and updated is_set() to use safe bounds-checked access.
  • Simplified DynIndex::search_vector by removing the FilteredSearchResults wrapper and passing the output buffer through directly.
  • Updated filter types to use GarnetQueryLabelProvider<'static> in DynIndex APIs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
diskann-garnet/src/labels.rs Refactors bitmap label provider to use slice-based storage and safer lookup logic; adds 'static raw-pointer constructor.
diskann-garnet/src/dyn_index.rs Removes the filtered output wrapper and simplifies filtered search call path; updates filter type signature.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines 44 to 67
@@ -73,49 +59,42 @@
/// of this struct.
pub unsafe fn from_raw(data: *const u8, len: usize) -> Self {
if data.is_null() || len == 0 {
return Self {
data: std::ptr::null(),
len: 0,
};
return Self { data: &[] };
}
Self {
data: unsafe { std::slice::from_raw_parts(data, len) },
}
Self { data, len }
}
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from_raw constructs a &'static [u8] via slice::from_raw_parts, but the pointed-to memory is owned by the FFI caller and is only guaranteed to live for the duration of the search call. Creating an 'static reference to non-'static memory is unsound (can enable UB via lifetime-based optimizations), even if the value is dropped shortly after.

Safer options here are (a) store *const u8 + len (as the old implementation did) and do checked reads without ever materializing a Rust reference with an inflated lifetime, or (b) copy into owned storage (e.g., Arc<[u8]>) so the data is truly 'static for the Arc<dyn QueryLabelProvider<_>> used by BetaFilter.

Copilot uses AI. Check for mistakes.
InplaceDeleteMethod, SearchOutputBuffer, glue::SearchStrategy, index::SearchStats, search,
},
graph::{InplaceDeleteMethod, glue::SearchStrategy, index::SearchStats, search},
provider::{Accessor, DataProvider},
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DataProvider is no longer used in this file after removing FilteredSearchResults, but it’s still imported. CI runs with -Dwarnings, so this unused import will fail the build; please remove it (and keep Accessor which is still needed for get_element).

Suggested change
provider::{Accessor, DataProvider},
provider::Accessor,

Copilot uses AI. Check for mistakes.
#[derive(Clone)]
pub struct GarnetQueryLabelProvider<'a> {
data: &'a [u8],
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, I don't think I was entirely clear on the feedback in #800, I meant that it's reasonable to have a lifetime in the backend use of Strategies, not that GarnetQueryLabelProvder needs a lifetime. Without that change, adding a lifetime bound to GarnetQueryLabelProvider does not really solve anything, especially since we need it to be 'static to use it anyways. And this actually feels worse since it forces the materialization of a &'static [u8]. At least when there is a raw pointer involved, programmers are primed to think "something is going on".

Without making changes to the core DiskANN to accept non-'static types, I suggest we keep this as-is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, to remove the static we need to remove it from the query computer trait.

I don't see much difference regarding the raw pointers when initially it we also have the unsafe convert from raw pointers in this version but just have to mark it under static trick. And when remove the static from the core DiskANN would it be a nature change here to remove the static

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot can you help revert this GarnetQueryLabelProvider to original version

Copy link
Contributor

Copilot AI commented Mar 4, 2026

@hailangx I've opened a new pull request, #814, to work on those changes. Once the pull request is ready, I'll request review from you.

@hailangx hailangx merged commit 9a18fb2 into metajack/diskann-garnet Mar 5, 2026
1 check passed
@hailangx hailangx deleted the haiyang/garnetfilter branch March 5, 2026 00:54
metajack pushed a commit that referenced this pull request Mar 5, 2026
### Removal of Obsolete Wrapper

* Removed the `FilteredSearchResults` wrapper, which previously handled
conversion between internal and external IDs for search results. This is
no longer needed with previous change to expose generic external id in
the beta search strategy.

---------

Co-authored-by: Haiyang Xu <haixu@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants