gptel-plus manual

Overview

gptel-plus.el extends gptel, the Emacs package for interfacing with large language models. It adds cost awareness, context persistence, and quality-of-life improvements that are useful when working with paid LLM APIs on a regular basis.

The development repository is on GitHub.

The package provides four groups of functionality:

  • Cost estimation and tracking. Before sending a request, gptel-plus estimates its cost and displays it in the header line. After a request completes, it parses the response log to report the exact cost. When a request exceeds a configurable cost threshold, it asks for confirmation before proceeding (Cost estimation). A public API is also available for computing costs programmatically from gptel-request callbacks (Programmatic cost API).

  • Context persistence. The gptel context (files added via gptel-context-add-file) is ephemeral by default: it is lost when you kill the buffer. gptel-plus can save and restore context to and from the file itself, as an Org property or a Markdown file-local variable (Context persistence).

  • Context file management. A dedicated buffer lists all context files sorted by size, with a dired-like interface for flagging and removing files in bulk. This is especially helpful when you need to trim a large context to reduce costs (Context file management).

  • Automatic mode activation. gptel-plus can automatically enable gptel-mode when you open a file that contains gptel data, so that you can resume a conversation without manually toggling the mode (Automatic mode activation).

The package requires Emacs 29.1 or later and gptel 0.7.1 or later.

Installation

Manual installation

Clone the repository into a directory that is on your load-path, then load the package:

(require 'gptel-plus)

Installation with use-package

The following snippets show how to install gptel-plus with the most common package managers. Choose whichever one matches your setup.

;; with vc (built-in, Emacs 29+)
(use-package gptel-plus
  :vc (:url "https://github.com/benthamite/gptel-plus"))

;; with elpaca
(use-package gptel-plus
  :ensure (:host github :repo "benthamite/gptel-plus"))

;; with straight
(use-package gptel-plus
  :straight (:host github :repo "benthamite/gptel-plus"))

;; with quelpa
(use-package gptel-plus
  :quelpa (gptel-plus :fetcher github :repo "benthamite/gptel-plus"))

User options

All user options belong to the gptel-plus customization group, which is itself a child of the gptel group.

Cost estimation parameters

The user option gptel-plus-calculate-cost controls whether cost estimation and tracking are active. When set to nil, no cost information is computed or displayed. The default value is t.

The user option gptel-plus-tokens-per-word specifies the approximate number of tokens per word, used for converting word counts into token counts when estimating input costs. The default value is 1.5. If your typical prompts use a language other than English, or if you frequently include code (which tends to have a higher token-to-word ratio), you may want to increase this value.

The user option gptel-plus-tokens-in-output specifies the assumed number of tokens in the model’s response, used for estimating output costs. The default value is 100. If your typical interactions produce longer responses, increase this value to get more accurate cost estimates. For a conservative upper bound, you could set it to the value of gptel-max-tokens, though this will overestimate costs for most requests.

The user option gptel-plus-cost-warning-threshold sets the dollar amount above which gptel-plus prompts for confirmation before sending a request (Cost warning). The default value is 0.15 (fifteen cents). Set it to nil to disable the warning entirely. The type accepts either a number or nil.
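
As a rough illustration, the four options above could be set together in an init file. The option names come from this manual; the values are arbitrary examples rather than recommendations.

```elisp
;; Illustrative values only; adjust them to your own usage.
(setq gptel-plus-calculate-cost t             ; keep cost estimation and tracking on
      gptel-plus-tokens-per-word 1.8          ; raised from 1.5 for code-heavy prompts
      gptel-plus-tokens-in-output 500         ; expect longer responses than the default 100
      gptel-plus-cost-warning-threshold 0.50) ; confirm only above fifty cents
```

Setting the threshold to nil instead of a number turns the confirmation prompt off, as described above.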

Commands

Cost estimation

Cost estimation in gptel-plus has two complementary aspects: an ex ante estimate computed before you send a request, and an ex post calculation reported after the response arrives.

Header line display

When gptel-plus is loaded, it replaces the standard gptel header line with an extended version that includes a clickable [Cost: $X.XX] indicator. This indicator shows the estimated cost of sending the current prompt, updated dynamically as you type or modify the context. Clicking the cost indicator opens the gptel transient menu.

The header line uses pixel-level alignment when available (Emacs 29+), falling back to character-width alignment on older builds.

The estimate accounts for three components:

  1. The words in the current buffer (from the beginning up to point, or in the active region if one exists).
  2. The words in all context files and buffers.
  3. A fixed assumed output length.

Each word count is multiplied by gptel-plus-tokens-per-word and then by the model’s per-token pricing (read from the :input-cost and :output-cost properties that gptel stores on the model symbol). If no pricing information is available for the current model, the indicator shows [Cost: N/A].

Note that the cost estimate does not account for images or region-based context additions. The estimate is approximate; the ex post calculation (Ex post cost calculation) provides the precise figure.
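
The arithmetic behind the estimate can be sketched as follows. This is a minimal illustration based on the description above, not the package's actual code: my/estimate-request-cost is a hypothetical helper, and the prices are assumed values in dollars per million tokens.

```elisp
;; Hypothetical helper; not part of gptel-plus.
(defun my/estimate-request-cost (buffer-words context-words tokens-per-word
                                 output-tokens input-cost output-cost)
  "Return the estimated dollar cost of a request.
BUFFER-WORDS and CONTEXT-WORDS are word counts, converted to tokens
with TOKENS-PER-WORD.  OUTPUT-TOKENS is the assumed response length.
INPUT-COST and OUTPUT-COST are prices in dollars per million tokens."
  (let* ((input-tokens (* (+ buffer-words context-words) tokens-per-word))
         (raw-cost (+ (* input-tokens input-cost)
                      (* output-tokens output-cost))))
    ;; Prices are per million tokens, so divide to get dollars.
    (/ raw-cost 1000000.0)))

;; 2,000 words in the buffer, 8,000 words of context, the default
;; 1.5 tokens per word and 100 output tokens, with assumed prices of
;; $3 (input) and $15 (output) per million tokens.
(format "[Cost: $%.2f]" (my/estimate-request-cost 2000 8000 1.5 100 3.0 15.0))
;; => "[Cost: $0.05]"
```

In this example the input side (15,000 estimated tokens) accounts for almost all of the cost, which is why trimming the context is usually the most effective way to lower an estimate.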

Ex post cost calculation

After each gptel request completes, gptel-plus reports the exact cost in the echo area (e.g., Cost of request: $0.0042).

To obtain the exact token counts, gptel-plus temporarily sets gptel-log-level to info before the request is sent, which causes gptel to log the full request and response data to the *gptel-log* buffer. After the response arrives, gptel-plus parses this log to extract the input and output token counts, computes the cost using the model’s pricing, and then restores the original log level and kills the log buffer.

This mechanism handles concurrent requests correctly: if multiple requests are in flight simultaneously, the log level is only restored after the last one completes. Each request’s log data is parsed in isolation (scoped to the region of the log buffer written during that request), so concurrent requests from different buffers do not interfere with each other.
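
One way to picture the "restore only after the last request" behavior is a simple in-flight counter. The sketch below is an assumption about the general technique, not a copy of the package's implementation; all names are hypothetical, and my/log-level stands in for the real log-level variable.

```elisp
;; Hypothetical names throughout; `my/log-level' stands in for the real setting.
(defvar my/log-level nil "Stand-in for a global log level.")
(defvar my/saved-log-level nil "Log level to restore once all requests finish.")
(defvar my/requests-in-flight 0 "Number of requests that have not completed.")

(defun my/request-started ()
  "Raise the log level when the first request starts."
  (when (zerop my/requests-in-flight)
    (setq my/saved-log-level my/log-level
          my/log-level 'info))
  (setq my/requests-in-flight (1+ my/requests-in-flight)))

(defun my/request-finished ()
  "Restore the log level when the last request completes."
  (setq my/requests-in-flight (max 0 (1- my/requests-in-flight)))
  (when (zerop my/requests-in-flight)
    (setq my/log-level my/saved-log-level)))
```

With two overlapping requests, the level stays at info after the first one finishes and returns to its original value only after the second.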

The cost calculation accounts for prompt caching where applicable. For Anthropic, cache creation tokens are charged at 1.25× the input rate and cache read tokens at 0.1× the input rate. For OpenAI, cached tokens receive a 50% discount on the input rate. Both streaming and non-streaming responses are supported for all three supported providers: Anthropic, OpenAI, and Gemini.
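
The caching multipliers can be illustrated with a short worked example. The helpers below are hypothetical and the token counts and prices are invented for illustration. The OpenAI helper additionally assumes that cached tokens are reported as part of the prompt total, which this manual does not state.

```elisp
;; Hypothetical helpers; rates are in dollars per million tokens.
(defun my/anthropic-cost (input cache-write cache-read output in-rate out-rate)
  "Dollar cost of an Anthropic request with prompt caching.
INPUT, CACHE-WRITE, CACHE-READ and OUTPUT are token counts."
  (/ (+ (* input in-rate)
        (* cache-write 1.25 in-rate)   ; cache creation: 1.25x input rate
        (* cache-read 0.1 in-rate)     ; cache read: 0.1x input rate
        (* output out-rate))
     1000000.0))

(defun my/openai-cost (prompt cached output in-rate out-rate)
  "Dollar cost of an OpenAI request with prompt caching.
Assumes CACHED tokens are already included in PROMPT."
  (/ (+ (* (- prompt cached) in-rate)  ; uncached input at the full rate
        (* cached 0.5 in-rate)         ; cached input at a 50% discount
        (* output out-rate))
     1000000.0))

(format "$%.4f" (my/anthropic-cost 1000 2000 10000 500 3.0 15.0)) ; => "$0.0210"
(format "$%.4f" (my/openai-cost 10000 4000 500 2.5 10.0))         ; => "$0.0250"
```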

Cost warning

Before every call to gptel-send, gptel-plus checks whether the estimated cost exceeds gptel-plus-cost-warning-threshold (Cost estimation parameters). If it does, a y-or-n-p prompt asks you to confirm. If you decline, a second prompt offers to clear the context (via gptel-context-remove-all), since a large context is typically the reason for a high cost. Declining both prompts cancels the request with a user-error.

Programmatic cost API

The cost estimation features described above (Cost estimation) work automatically for interactive gptel-send calls. For programmatic use via gptel-request, gptel-plus provides a separate mechanism.

When gptel-plus is loaded, it advises gptel--parse-response to capture the raw token usage from each API response into the INFO plist under the key :token-usage. This happens transparently for all requests, whether interactive or programmatic.

The function gptel-plus-compute-cost takes the INFO plist passed to a gptel-request callback and returns the dollar cost as a float, or nil if usage data or model pricing is unavailable. It handles Anthropic, OpenAI, and Gemini usage formats, including prompt caching. An optional MODEL argument overrides the default gptel-model.

Example usage:

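(gptel-request prompt
  :callback (lambda (_response info)
              ;; The cost is nil when usage data or pricing is unavailable.
              (let ((cost (gptel-plus-compute-cost info)))
                (if cost
                    (message "Cost: $%.4f" cost)
                  (message "Cost: unavailable")))))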
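
Building on the same function, a callback could also keep a running total across requests. The following is a sketch under stated assumptions: my/gptel-session-cost and my/gptel-track-cost are hypothetical names, and the check for a string response is a simplification of what a gptel-request callback may receive.

```elisp
;; `my/gptel-session-cost' and `my/gptel-track-cost' are hypothetical names.
(defvar my/gptel-session-cost 0.0
  "Running total, in dollars, of the requests tracked so far.")

(defun my/gptel-track-cost (response info)
  "Add the cost of a completed request to `my/gptel-session-cost'.
Intended as a `gptel-request' callback taking RESPONSE and INFO."
  ;; Only count string responses; skip errors and other callback signals.
  (when (stringp response)
    (let ((cost (gptel-plus-compute-cost info)))
      (when cost                        ; nil means usage or pricing is missing
        (setq my/gptel-session-cost (+ my/gptel-session-cost cost))
        (message "Request: $%.4f, session total: $%.4f"
                 cost my/gptel-session-cost)))))

;; Usage, without streaming so the callback receives the full response:
;; (gptel-request "Summarize this paragraph."
;;   :stream nil
;;   :callback #'my/gptel-track-cost)
```

Because gptel-plus-compute-cost returns nil when usage data or pricing is missing, requests without a known cost leave the total unchanged.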

Context persistence

By default, the list of context files you add to a gptel session is lost when you kill the buffer. gptel-plus provides two commands to persist and restore this context.

The command gptel-plus-save-file-context (M-x gptel-plus-save-file-context) saves the current gptel-context to the file you are visiting. In Org mode buffers, the context is stored as a GPTEL_CONTEXT property on the first heading. In Markdown mode buffers, it is stored as a file-local variable. If a saved context already exists, you are asked to confirm the overwrite. The command signals an error if the buffer is neither Org nor Markdown.

The command gptel-plus-restore-file-context (M-x gptel-plus-restore-file-context) reads the saved context from the file and re-adds each file to the gptel context via gptel-context-add-file. If a context is already active, you are asked to confirm the overwrite. Files that no longer exist or are not readable are skipped, with a message listing the missing paths.
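
If you use these two commands often, you may want to bind them to keys. The key sequences below are arbitrary choices for illustration; gptel-plus is not documented as providing default bindings for them.

```elisp
;; The key sequences are arbitrary examples, not defaults provided by gptel-plus.
(with-eval-after-load 'gptel-plus
  (keymap-global-set "C-c g s" #'gptel-plus-save-file-context)
  (keymap-global-set "C-c g r" #'gptel-plus-restore-file-context))
```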

Context file management

The command gptel-plus-list-context-files (M-x gptel-plus-list-context-files) opens a dedicated *gptel context files* buffer listing all files in the current gptel context, sorted by size in descending order. Each entry shows the file’s size in kilobytes and its path (abbreviated with ~/ where applicable). If there are no files in the context, a message is displayed instead.

The buffer uses gptel-context-files-mode, a special read-only major mode with the following key bindings:

Key  Command                                  Description
x    gptel-plus-toggle-mark                   Toggle the removal flag on a file
D    gptel-plus-remove-flagged-context-files  Remove all flagged files
g    gptel-plus-refresh-context-files-buffer  Refresh the listing
q    kill-current-buffer                      Close the buffer

The workflow is similar to dired’s flagging mechanism: move to a file entry, press x to toggle its flag (the marker changes from [ ] to [X]), repeat for other files, then press D to remove all flagged files from the context at once. The cost estimate is updated automatically after removal.
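
The size-sorted listing described in this section can be approximated with standard file functions. This is a minimal sketch of the idea, assuming a plain list of file paths as input; my/files-by-size is a hypothetical helper and does not reflect how gptel-plus builds its buffer.

```elisp
;; Hypothetical helper; not part of gptel-plus.
(defun my/files-by-size (files)
  "Return FILES as (SIZE-IN-KB . ABBREVIATED-PATH) pairs, largest first.
Files that cannot be read are left out."
  (let (entries)
    (dolist (file files)
      (when (file-readable-p file)
        (let ((bytes (file-attribute-size (file-attributes file))))
          (push (cons (/ bytes 1024.0) (abbreviate-file-name file))
                entries))))
    ;; Sort on the size stored in the car of each pair, descending.
    (sort entries (lambda (a b) (> (car a) (car b))))))
```

Sorting largest first puts the files that contribute most to the input cost at the top, where they are easiest to flag.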

Automatic mode activation

When you save a gptel conversation, gptel stores session metadata in the file (as Org properties or as Markdown file-local variables). However, reopening that file does not automatically re-enable gptel-mode. gptel-plus provides hook functions that detect these markers and activate gptel-mode automatically.

The function gptel-plus-enable-gptel-in-org checks for gptel-related Org properties (such as GPTEL_SYSTEM, GPTEL_BACKEND, GPTEL_MODEL, and others) and enables gptel-mode if any are found.

The function gptel-plus-enable-gptel-in-markdown checks for gptel-related file-local variables (such as gptel-mode, gptel-model, gptel--backend-name, and gptel--bounds) and enables gptel-mode if any are present.

To use these, add the appropriate hooks to your configuration:

(add-hook 'org-mode-hook #'gptel-plus-enable-gptel-in-org)
(add-hook 'markdown-mode-hook #'gptel-plus-enable-gptel-in-markdown)

When activating gptel-mode automatically, gptel-plus takes care not to mark the buffer as modified (which would otherwise happen as a side effect of enabling the mode). It also disables breadcrumb-mode if active, since it conflicts with the gptel header line.
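
The general technique of enabling a mode without touching the modified flag can be sketched as follows. This is an illustration of the idea rather than the package's own code, and my/enable-mode-preserving-modified-flag is a hypothetical name.

```elisp
;; Hypothetical helper; MODE is any minor-mode function, such as `gptel-mode'.
(defun my/enable-mode-preserving-modified-flag (mode)
  "Enable MODE in the current buffer without changing its modified flag."
  (let ((was-modified (buffer-modified-p)))
    (funcall mode 1)
    ;; Put the flag back to whatever it was before MODE ran.
    (restore-buffer-modified-p was-modified)))
```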

Functions

Cost computation

The function gptel-plus-compute-cost computes the dollar cost of a gptel-request call from its INFO plist. It accepts an optional MODEL argument (defaulting to gptel-model) and returns a float, or nil if usage data or pricing is unavailable. It handles Anthropic, OpenAI, and Gemini usage formats, including prompt caching. See Programmatic cost API for details and a usage example.

The function gptel-plus-get-total-cost returns the estimated total cost (in dollars) of sending the current prompt, or nil if cost information is unavailable for the active model. It combines the input and output costs and normalizes the result. This is the function used by the header line display (Header line display).

The function gptel-plus-get-input-cost returns the estimated input cost, combining the buffer content cost with the cached context cost. The function gptel-plus-get-output-cost returns the estimated output cost based on gptel-plus-tokens-in-output (Cost estimation parameters).

The function gptel-plus-normalize-cost converts a raw cost figure (expressed in cost-per-million-tokens units) into a dollar amount by dividing by one million.

The function gptel-plus-update-context-cost recalculates and caches the cost of the current context files. It is called automatically (via advice) whenever files are added to or removed from the context, and whenever the model or backend changes. You generally do not need to call it directly, but it is available if you modify gptel-context programmatically.

Word counting

The function gptel-plus-count-words-in-buffer returns the word count for the portion of the buffer that will be sent to the model. If a region is active, it counts words in the region; otherwise, it counts from the beginning of the buffer to point (since gptel sends everything up to point).

The function gptel-plus-count-words-in-context iterates over all entries in gptel-context and sums their word counts. Buffer entries are counted directly; file entries are read into a temporary buffer for counting. Binary files and unreadable files are skipped. Dead buffers (killed since they were added to the context) are also skipped.
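
The two counting strategies described above can be approximated with the built-in count-words function. The helpers below are hypothetical sketches, not the package's definitions; in particular, the file helper does not try to detect binary files.

```elisp
;; Hypothetical helpers; not part of gptel-plus.
(defun my/count-words-to-send ()
  "Count words in the active region, or from buffer start to point."
  (if (use-region-p)
      (count-words (region-beginning) (region-end))
    (count-words (point-min) (point))))

(defun my/count-words-in-file (file)
  "Return the word count of FILE, or 0 if it cannot be read."
  (if (file-readable-p file)
      (with-temp-buffer
        (insert-file-contents file)
        (count-words (point-min) (point-max)))
    0))
```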

Context detection

The function gptel-plus-file-has-gptel-local-variable-p returns non-nil if the current buffer has any gptel-related file-local variable set. It checks for gptel-mode, gptel-model, gptel--backend-name, and gptel--bounds.

The function gptel-plus-file-has-gptel-org-property-p returns non-nil if the current Org buffer has any gptel-related property at the beginning of the file. It checks for GPTEL_SYSTEM, GPTEL_BACKEND, GPTEL_MODEL, GPTEL_TEMPERATURE, GPTEL_MAX_TOKENS, and GPTEL_NUM_MESSAGES_TO_SEND.

Both functions are used by the automatic mode activation hooks (Automatic mode activation) and can also be called independently to test whether a file contains gptel data.
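
For readers curious how such a check might work, Emacs records the file-local variables it has applied in file-local-variables-alist. The sketch below tests that alist for the variables listed above. It is an assumption about one possible approach, with a hypothetical function name, and may differ from what gptel-plus actually does.

```elisp
(require 'seq)

;; Hypothetical helper; the variable list is the one given in this manual.
(defun my/buffer-has-gptel-local-variable-p ()
  "Return non-nil if a gptel file-local variable is set in this buffer."
  (seq-some (lambda (var)
              (assq var file-local-variables-alist))
            '(gptel-mode gptel-model gptel--backend-name gptel--bounds)))
```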

Troubleshooting

Cost shows “N/A”

The header line displays [Cost: N/A] when the current model does not have :input-cost or :output-cost properties set. These properties are defined by gptel when you register a backend. If you are using a custom backend, ensure that the model symbols have these properties set (the values are in dollars per million tokens).
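
Since the pricing lives in properties on the model symbol, one way to supply it for a custom model is to set those properties directly. In this sketch, my-custom-model and both prices are placeholders; substitute your model's symbol and your provider's actual rates.

```elisp
;; `my-custom-model' and its prices are placeholders; substitute your own.
(put 'my-custom-model :input-cost 3.0)    ; dollars per million input tokens
(put 'my-custom-model :output-cost 15.0)  ; dollars per million output tokens

;; Check that both properties are now visible on the symbol.
(list (get 'my-custom-model :input-cost)
      (get 'my-custom-model :output-cost))
;; => (3.0 15.0)
```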

Ex post cost not reported

The ex post cost calculation supports Anthropic, OpenAI, and Gemini backends, in both streaming and non-streaming modes. If the echo area message does not appear, check that the model has :input-cost and :output-cost properties set and that gptel-plus-calculate-cost is non-nil.

If the cost calculation fails (e.g., due to unexpected log format changes in a new gptel version), a message of the form gptel-plus: failed to calculate cost: ... will appear in the echo area instead of silently failing.

Context restore skips files

When restoring a saved context, gptel-plus-restore-file-context checks that each file exists and is readable before adding it. If files have been moved, renamed, or deleted since the context was saved, they are skipped and a message lists the missing paths. Update the saved context by saving it again after correcting your file paths.