remove relationships with phantom entities #2261

Merged
dayesouza merged 2 commits into main from filter_graph on Mar 3, 2026

@dayesouza
Contributor

This pull request adds a utility that filters "phantom" (orphaned) relationships out of the graph extraction and incremental update workflows. Relationships whose source or target entity does not exist are removed, which prevents downstream errors caused by hallucinated or broken graph edges. The filter is applied in both pipelines and is covered by new unit tests.

Graph extraction and update pipeline improvements:

  • Added the filter_orphan_relationships utility function in utils.py, which removes relationships referencing non-existent entities after LLM graph extraction. This prevents phantom edges in the graph and logs warnings when such relationships are dropped.
  • Integrated filter_orphan_relationships into the graph extraction workflow (extract_graph.py) and the incremental update workflow (update_entities_relationships.py), ensuring orphan relationships are filtered out in both initial and update passes.
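
For readers skimming the description, the filtering step above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code from this PR: the signature, the `title`, `source`, and `target` field names, and the use of lists of dicts are all assumptions. The tests mention index resetting, which suggests the real utility operates on DataFrames.

```python
import logging

logger = logging.getLogger(__name__)


def filter_orphan_relationships(relationships, entities):
    """Drop relationships whose source or target is not a known entity.

    Assumed shapes (hypothetical, for illustration only):
      entities:      list of dicts with a "title" key
      relationships: list of dicts with "source" and "target" keys
    """
    known = {entity["title"] for entity in entities}
    kept = [
        rel
        for rel in relationships
        if rel["source"] in known and rel["target"] in known
    ]
    dropped = len(relationships) - len(kept)
    if dropped:
        # Warn so that hallucinated edges are visible in the logs.
        logger.warning(
            "Dropped %d relationship(s) referencing non-existent entities", dropped
        )
    return kept


entities = [{"title": "ALICE"}, {"title": "BOB"}]
relationships = [
    {"source": "ALICE", "target": "BOB"},
    {"source": "ALICE", "target": "CAROL"},  # CAROL was never extracted
    {"source": "DAVE", "target": "BOB"},  # DAVE was never extracted
]
filtered = filter_orphan_relationships(relationships, entities)
```

Only the ALICE to BOB edge survives here, because it is the only one whose endpoints both appear in the entity list.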

Testing and validation:

  • Added comprehensive unit tests for filter_orphan_relationships, _merge_entities, and _merge_relationships in test_extract_graph.py, covering edge cases such as missing sources/targets, empty inputs, and index resetting.
  • Added unit tests for orphan filtering in the update pipeline in test_update_relationships.py, validating that merged relationships with hallucinated endpoints are correctly removed after entity/relationship updates.
  • Added missing copyright headers to test files.
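
As a rough picture of what the update-pipeline tests check, the sketch below merges old and delta rows and then filters orphans against the merged entity set. The helpers `merge_by_key` and `drop_orphans`, the field names, and the data are hypothetical stand-ins, not the functions or fixtures in test_update_relationships.py.

```python
def merge_by_key(old, delta, key):
    """Hypothetical merge: delta rows override old rows that share a key."""
    merged = {key(row): row for row in old}
    merged.update({key(row): row for row in delta})
    return list(merged.values())


def drop_orphans(relationships, entities):
    """Keep only relationships whose endpoints both exist as entities."""
    known = {e["title"] for e in entities}
    return [
        r for r in relationships if r["source"] in known and r["target"] in known
    ]


def test_update_drops_hallucinated_endpoints():
    old_entities = [{"title": "A"}, {"title": "B"}]
    delta_entities = [{"title": "C"}]
    old_rels = [{"source": "A", "target": "B"}]
    delta_rels = [
        {"source": "B", "target": "C"},
        {"source": "C", "target": "GHOST"},  # endpoint with no matching entity
    ]

    entities = merge_by_key(old_entities, delta_entities, key=lambda e: e["title"])
    rels = merge_by_key(
        old_rels, delta_rels, key=lambda r: (r["source"], r["target"])
    )

    # Filtering runs after the merge, against the merged entity set.
    result = drop_orphans(rels, entities)
    assert result == [
        {"source": "A", "target": "B"},
        {"source": "B", "target": "C"},
    ]


test_update_drops_hallucinated_endpoints()
```

Filtering after the merge matters in this sketch: the B to C edge is valid only once the delta entity C has been merged in.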

Release and documentation:

  • Documented this feature as a patch update in .semversioner/next-release/patch-20260302221432185149.json.

@dayesouza dayesouza requested a review from a team as a code owner March 2, 2026 23:31
@dayesouza dayesouza merged commit bfd42c1 into main Mar 3, 2026
19 of 22 checks passed
@dayesouza dayesouza deleted the filter_graph branch March 3, 2026 13:18
2 participants