fix: normalize memory category filters by TurboTheTurtle · Pull Request #876 · CortexReach/memory-lancedb-pro

TurboTheTurtle · 2026-06-11T08:14:29Z

Summary

expose both smart-extractor and legacy memory categories in tool schemas
normalize category filters so legacy singular filters match smart category rows
apply category candidate matching to memory_list and add CI regression coverage

Fixes #873

Verification

node --test test/category-filter-normalization.test.mjs
npm run build
npm run test:packaging-and-workflow
npm run test:core-regression

rwmjhb

Requesting changes. The direction is right, but the current implementation only partially fixes the taxonomy mismatch.

Must fix

Smart-category filters still miss metadata-backed auto-captured rows.

The new retrieval/list filtering still compares only the storage category column: retriever uses matchesMemoryCategoryFilter(r.entry.category, category), and memory_list expands SQL predicates over category via resolveCategoryFilterCandidates(category).

That works for direct aliases like preference/preferences and entity/entities, but it does not handle the smart-extractor categories that are stored through legacy storage categories with the real smart category in metadata. In the normal auto-capture path, profile and cases are stored as fact, events as decision, and patterns as other, with metadata.memory_category carrying the smart category. Filters for profile, events, cases, and patterns can therefore still silently miss normal auto-captured rows.

Please make recall/list filtering metadata-aware, or over-fetch the relevant storage categories and then apply metadata.memory_category filtering in application code. Add regression coverage through the real smart-extractor/store path for the smart categories, not only direct row stubs.

Newly accepted smart category values are not normalized on write.

The tool schema now accepts smart category values such as profile, preferences, events, cases, and patterns for memory_store / memory_update. But the write path still passes the raw category into storage and buildSmartMetadata() without explicitly setting the normalized smart category. For non-legacy values, reverseMapLegacyCategory() cannot infer the intended category reliably and can default to patterns.

Please normalize tool input at the execution boundary into both:

the storage category to persist, and
metadata.memory_category / smart category metadata to preserve semantics.

Also add tests for manual memory_store and memory_update using schema-legal smart category values.

Also worth covering

The new test mainly exercises retriever/BM25-style filtering. Please add coverage for the SQL memory_list path and keep resolveCategoryFilterCandidates() / matchesMemoryCategoryFilter() behavior equivalent so these two filter surfaces do not drift.

rwmjhb

Approved. The new commit addresses the blocking issues from the previous review: category filters are now metadata-aware for recall/list paths, and manual store/update normalize accepted smart category inputs into storage category plus smart metadata.

Verification I checked:

orchestrator targeted tests passed
orchestrator full npm test passed
npm run build --if-present passed on head 3beb2781b2dbe0121e506d3eafb68d6c81601e5c
node --test test/category-filter-normalization.test.mjs passed, 7/7

Non-blocking follow-ups worth considering:

patterns filtering can still differ between recall and list for legacy reflection rows, because the in-memory predicate maps reflection -> patterns but the SQL candidate expansion for list does not fetch reflection.
Metadata-less legacy fact rows remain ambiguous and are treated as cases, so profile-like legacy facts without memory_category metadata may not match category=profile.
Category-only memory_update changes memory_category, but preserves category-derived fields such as fact_key and memory_layer; re-deriving or clearing those would avoid stale metadata after recategorization.

rickthomasjr

Agent Code Review: PR #876 — fix: normalize memory category filters

Verdict: APPROVE

Summary

Two commits introducing bidirectional smart/legacy category mapping. Core changes: memory-categories.ts adds LEGACY_MEMORY_CATEGORIES, TOOL_MEMORY_CATEGORIES, mapping tables, and helper functions (normalizeCategory, matchesMemoryCategoryFilter, resolveCategoryFilterCandidates, resolveToolMemoryCategory, extractMetadataMemoryCategory). Tools and retrieval now normalize incoming categories at the execution boundary, persisting both storageCategory and metadata.memory_category. SQL list() filtering uses candidate OR expansion + in-memory metadata-aware predicate. All retrieval paths use matchesMemoryCategoryFilter() instead of direct equality.

P1 — Supersede logic: APPROVED

Replacing hardcoded SUPERSEDE_ELIGIBLE with TEMPORAL_VERSIONED_CATEGORIES.has(memoryCategory) + metadata-aware matching prevents false-positive cross-supersedes between unrelated legacy categories. Correct.

P2 — Metadata-aware filtering: APPROVED

matchesMemoryCategoryFilter() three-tier approach (direct → normalize → metadata) handles the critical case: entries stored as fact with metadata.memory_category=profile are still found by category=profile filters. This was the core issue rwmjhb flagged.

P3 — SQL candidate expansion: APPROVED

resolveCategoryFilterCandidates() builds OR of all storage category values for a smart category. Set deduplication prevents duplication. Correct.

P4 — Write path normalization: APPROVED

Both memory_store and memory_update resolve storageCategory + memoryCategory at execution boundary. buildSmartMetadata() uses resolved categories. Category-only updates also update memory_category in metadata.

P5 — Response payloads: APPROVED

Adding rawCategory alongside category is useful for debugging.

P6 — Minor notes

isLegacyStableMemoryId regex in store.ts is minor scope creep in this PR — consider separate PR.
rwmjhb's non-blocking follow-ups (patterns/legacy reflection gap, ambiguous metadata-less legacy fact rows, stale fields on recategorization) are valid but out of scope.

Tests

7 test cases, 397 lines. Exercises smart-extractor rows through real store/tool paths. Covers normalization, metadata-aware retrieval, SQL list filtering, store tool writes, update tool writes. Good coverage.

Conclusion

Well-designed, well-tested fix for a real taxonomy mismatch. The dual-category normalization is the right approach. No blocking issues.

fix: normalize memory category filters

8891b54

rwmjhb requested changes Jun 12, 2026

View reviewed changes

fix: apply smart category filters via metadata

3beb278

rwmjhb approved these changes Jun 12, 2026

View reviewed changes

rickthomasjr approved these changes Jun 13, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: normalize memory category filters#876

fix: normalize memory category filters#876
TurboTheTurtle wants to merge 2 commits into
CortexReach:masterfrom
TurboTheTurtle:codex/issue-873-category-filter

TurboTheTurtle commented Jun 11, 2026

Uh oh!

rwmjhb left a comment

Uh oh!

rwmjhb left a comment

Uh oh!

rickthomasjr left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

TurboTheTurtle commented Jun 11, 2026

Summary

Verification

Uh oh!

rwmjhb left a comment

Choose a reason for hiding this comment

Must fix

Also worth covering

Uh oh!

rwmjhb left a comment

Choose a reason for hiding this comment

Uh oh!

rickthomasjr left a comment

Choose a reason for hiding this comment

Agent Code Review: PR #876 — fix: normalize memory category filters

Summary

P1 — Supersede logic: APPROVED

P2 — Metadata-aware filtering: APPROVED

P3 — SQL candidate expansion: APPROVED

P4 — Write path normalization: APPROVED

P5 — Response payloads: APPROVED

P6 — Minor notes

Tests

Conclusion

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants