Skip to content

feat: record messages from user in ~/.codex/history.jsonl #939

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 15, 2025
Merged

Conversation

bolinfest
Copy link
Collaborator

@bolinfest bolinfest commented May 15, 2025

This is a large change to support a "history" feature like you would expect in a shell like Bash.

History events are recorded in $CODEX_HOME/history.jsonl. Because it is a JSONL file, it is straightforward to append new entries (as opposed to the TypeScript file that uses $CODEX_HOME/history.json, so to be valid JSON, each new entry entails rewriting the entire file). Because it is possible for there to be multiple instances of Codex CLI writing to history.jsonl at once, we use advisory file locking when working with history.jsonl in codex-rs/core/src/message_history.rs.

Because we believe history is a sufficiently useful feature, we enable it by default. Though to provide some safety, we set the file permissions of history.jsonl to be o600 so that other users on the system cannot read the user's history. We do not yet support a default list of SENSITIVE_PATTERNS as the TypeScript CLI does:

// Regex patterns for sensitive commands that should not be saved.
const SENSITIVE_PATTERNS = [
/\b[A-Za-z0-9-_]{20,}\b/, // API keys and tokens
/\bpassword\b/i,
/\bsecret\b/i,
/\btoken\b/i,
/\bkey\b/i,
];

We are going to take a more conservative approach to this list in the Rust CLI. For example, while /\b[A-Za-z0-9-_]{20,}\b/ might exclude sensitive information like API tokens, it would also exclude valuable information such as references to Git commits.

As noted in the updated documentation, users can opt-out of history by adding the following to config.toml:

[history]
persistence = "none" 

Because history.jsonl could, in theory, be quite large, we take a[n arguably overly pedantic] approach in reading history entries into memory. Specifically, we start by telling the client the current number of entries in the history file (history_entry_count) as well as the inode (history_log_id) of history.jsonl (see the new fields on SessionConfiguredEvent).

The client is responsible for keeping new entries in memory to create a "local history," but if the user hits up enough times to go "past" the end of local history, then the client should use the new GetHistoryEntryRequest in the protocol to fetch older entries. Specifically, it should pass the history_log_id it was given originally and work backwards from history_entry_count. (It should really fetch history in batches rather than one-at-a-time, but that is something we can improve upon in subsequent PRs.)

The motivation behind this crazy scheme is that it is designed to defend against:

  • The history.jsonl being truncated during the session such that the index into the history is no longer consistent with what had been read up to that point. We do not yet have logic to enforce a max_bytes for history.jsonl, but once we do, we will aspire to implement it in a way that should result in a new inode for the file on most systems.
  • New items from concurrent Codex CLI sessions amending to the history. Because, in absence of truncation, history.jsonl is an append-only log, so long as the client reads backwards from history_entry_count, it should always get a consistent view of history. (That said, it will not be able to read new commands from concurrent sessions, but perhaps we will introduce a / command to reload latest history or something down the road.)

Admittedly, my testing of this feature thus far has been fairly light. I expect we will find bugs and introduce enhancements/fixes going forward.

@bolinfest bolinfest force-pushed the pr939 branch 5 times, most recently from 970e2cb to 2a3a06e Compare May 15, 2025 20:31
@bolinfest bolinfest force-pushed the pr939 branch 4 times, most recently from aca3779 to 19fa6a5 Compare May 15, 2025 22:19
@bolinfest bolinfest added the rust specific to the Rust CLI label May 15, 2025
@bolinfest bolinfest force-pushed the pr939 branch 3 times, most recently from 4b6d0a6 to e67ff23 Compare May 15, 2025 23:02
@bolinfest bolinfest marked this pull request as ready for review May 15, 2025 23:18
@bolinfest bolinfest merged commit ce2ecbe into main May 15, 2025
9 checks passed
@bolinfest bolinfest deleted the pr939 branch May 15, 2025 23:26
@github-actions github-actions bot locked and limited conversation to collaborators May 15, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
rust specific to the Rust CLI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant