API reference

Module dom

This module provides functions for handling annotations in the context of an HTML DOM; in other words, a web page.

The module’s main functionality is matching (or ‘anchoring’) a Selector to the DOM, i.e. finding which piece of a web page it refers to; and, vice versa, describing a selection of the page as a Selector.

Functions

createCssSelectorMatcher

  • Find the elements corresponding to the given CssSelector.

    The matcher returns all elements within the scope that match the selector’s value.

    The function is curried, taking first the selector and then the scope.

    As there may be multiple matches for a given selector, the matcher will return an (async) iterable that produces each match in the order they are found in the document.

    Note that the Web Annotation specification does not say whether an ‘ambiguous’ CssSelector should match all elements that match the selector value, or only the first. This implementation returns all matches, to give users the freedom to follow either interpretation. This is also in line with the more clearly defined behaviour of the TextQuoteSelector:

    “If […] the user agent discovers multiple matching text sequences, then the selection SHOULD be treated as matching all of the matches.”

    Note that if scope is not a Document, the Web Annotation Data Model leaves the behaviour undefined. This implementation will, in such a case, evaluate the selector relative to the document containing the scope, but only return those matches that are fully enclosed within the scope. There might be edge cases where this is not a perfect inverse of describeCss.

    example
    // The function is curried: pass the selector first, then the scope.
    const matches = createCssSelectorMatcher({
      type: 'CssSelector',
      value: '#target',
    })(document);
    for await (const match of matches) {
      console.log(match);
    }
    // <div id="target" …>
    

    Parameters

    Returns Matcher<Node | Range, Element>

    A Matcher function that applies selector to a given scope.

createTextPositionSelectorMatcher

  • Find the range of text corresponding to the given TextPositionSelector.

    The start and end positions are measured relative to the first text character in the given scope.

    The function is curried, taking first the selector and then the scope.

    Its end result is an (async) generator producing a single Range to represent the match (unlike e.g. a TextQuoteSelector, a TextPositionSelector cannot have multiple matches).

    example
    const selector = { type: 'TextPositionSelector', start: 702, end: 736 };
    const scope = document.body;
    const matches = createTextPositionSelectorMatcher(selector)(scope);
    const match = (await matches.next()).value;
    // ⇒ Range { startContainer: #text, startOffset: 64, endContainer: #text,
    //   endOffset: 98, … }
    

    Parameters

    Returns Matcher<Node | Range, Range>

    A Matcher function that applies selector within a given scope.
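
To illustrate how a position measured from the first text character maps onto individual text nodes, the sketch below resolves a TextPositionSelector against an array of strings standing in for the Text nodes of a scope. It is not the library’s implementation: `locate` and `createChunkPositionMatcher` are hypothetical names, and the choice to resolve a position on a node boundary to the end of the earlier node is an assumption made only for this example.

```javascript
// Hypothetical sketch, not the library's code: plain strings stand in for Text nodes.

// Resolve a character position (counted from the first character of the
// first chunk) to a chunk index and an offset within that chunk.
function locate(chunks, position) {
  if (position < 0) {
    throw new RangeError(`Position ${position} is negative`);
  }
  let consumed = 0;
  for (let index = 0; index < chunks.length; index++) {
    const length = chunks[index].length;
    // Assumption: a position on a boundary resolves to the end of the earlier chunk.
    if (position <= consumed + length) {
      return { chunk: index, offset: position - consumed };
    }
    consumed += length;
  }
  throw new RangeError(`Position ${position} lies beyond the end of the text`);
}

// Curried like the matchers in this module: selector first, then scope.
function createChunkPositionMatcher(selector) {
  return async function* matchAll(chunks) {
    // A position selector has at most one match, so a single value is produced.
    yield {
      start: locate(chunks, selector.start),
      end: locate(chunks, selector.end),
    };
  };
}
```

With the chunks `['Hello, ', 'wide ', 'world!']` and a selector with `start: 9` and `end: 15`, the single match runs from offset 2 of the second chunk to offset 3 of the third.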

createTextQuoteSelectorMatcher

  • Find occurrences in a text matching the given TextQuoteSelector.

    This performs an exact search for the selector’s quote (including prefix and suffix) within the text contained in the given scope (a Range).

    Note the match is based on strict character-by-character equivalence, i.e. it is sensitive to whitespace, capitalisation, etc.

    The function is curried, taking first the selector and then the scope.

    As there may be multiple matches for a given selector (when its prefix and suffix attributes are not sufficient to disambiguate it), the matcher will return an (async) generator that produces each match in the order they are found in the text.

    Warning: modifying the DOM (e.g. to highlight the text) while the search is still running can result in an error or an infinite loop. See issue #112.

    example
    // Find the word ‘banana’.
    const selector = { type: 'TextQuoteSelector', exact: 'banana' };
    const scope = document.body;
    
    // Read all matches.
    const matches = createTextQuoteSelectorMatcher(selector)(scope);
    for await (const match of matches) console.log(match);
    // ⇒ Range { startContainer: #text, startOffset: 187, endContainer: #text,
    //   endOffset: 193, … }
    // ⇒ Range { startContainer: #text, startOffset: 631, endContainer: #text,
    //   endOffset: 637, … }
    

    Parameters

    Returns Matcher<Node | Range, Range>

    A Matcher function that applies selector within a given scope.
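
The currying and the one-match-at-a-time behaviour described above can be sketched without a DOM. The example below searches a plain string instead of a Range and produces character offsets instead of Range objects; `createStringQuoteMatcher` and `collect` are hypothetical names for this illustration and are not part of the module.

```javascript
// Hypothetical plain-string analogue of a text quote matcher; not the library's code.
function createStringQuoteMatcher(selector) {
  const prefix = selector.prefix ?? '';
  const suffix = selector.suffix ?? '';
  const pattern = prefix + selector.exact + suffix;

  // Curried: the selector is fixed first, the scope (here a string) comes second.
  return async function* matchAll(text) {
    let from = 0;
    while (true) {
      // indexOf compares character by character, so the search is sensitive
      // to whitespace and capitalisation.
      const index = text.indexOf(pattern, from);
      // The second check stops the loop for an empty pattern at the end of the text.
      if (index === -1 || index < from) return;
      const start = index + prefix.length;
      yield { start, end: start + selector.exact.length };
      from = index + 1;
    }
  };
}

// Small helper that drains an async iterable into an array.
async function collect(iterable) {
  const results = [];
  for await (const item of iterable) results.push(item);
  return results;
}
```

Applied to the string `'one banana, two bananas'`, a selector with `exact: 'banana'` produces two matches (offsets 4 to 10 and 16 to 22); adding `suffix: 's'` leaves only the second.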

describeCss

  • describeCss(element: HTMLElement, scope?: Element): Promise<CssSelector>
  • Returns a CssSelector that unambiguously describes the given element, within the given scope.

    example
    const target = document.getElementById('targetelement').firstElementChild;
    const selector = await describeCss(target);
    console.log(selector);
    // {
    //   type: 'CssSelector',
    //   value: '#targetelement > :nth-child(1)'
    // }
    

    Parameters

    • element: HTMLElement

      The element that the selector should describe.

    • Optional scope: Element

      The node that serves as the ‘document’ for purposes of finding an unambiguous selector. Defaults to the Document that contains element.

    Returns Promise<CssSelector>

    The selector unambiguously describing element within scope.

describeTextPosition

  • Returns a TextPositionSelector that points at the target text within the given scope.

    When no scope is given, the position is described relative to the document as a whole. Note that this means the characters in all Text nodes are counted to determine the target’s position, including whitespace and the text inside the <head>. Hence even a minor modification of the document could make the selector point to a different text than its original target.

    example
    const target = window.getSelection().getRangeAt(0);
    const selector = await describeTextPosition(target);
    console.log(selector);
    // {
    //   type: 'TextPositionSelector',
    //   start: 702,
    //   end: 736
    // }
    

    Parameters

    • range: Range

      The Range whose text content will be described.

    • Optional scope: Node | Range

      A Node or Range that serves as the ‘document’ for purposes of measuring the position: start and end are counted from the first text character in this scope. Defaults to the full Document that contains range.

    Returns Promise<TextPositionSelector>

    The selector describing range within scope.
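
The fragility noted above follows directly from how positions are counted. The sketch below models the Text nodes of a scope as an array of strings and a target as a pair of chunk and offset boundaries; `describeChunkPosition` is a hypothetical name and the code is an illustration under those assumptions, not the library’s implementation.

```javascript
// Hypothetical sketch, not the library's code: plain strings stand in for Text nodes.
function describeChunkPosition(chunks, target) {
  // Count every character in the chunks before the boundary, whitespace included.
  const positionOf = (boundary) =>
    chunks
      .slice(0, boundary.chunk)
      .reduce((sum, chunk) => sum + chunk.length, 0) + boundary.offset;

  return {
    type: 'TextPositionSelector',
    start: positionOf(target.start),
    end: positionOf(target.end),
  };
}

// The word 'ipsum' inside the third chunk.
const target = { start: { chunk: 2, offset: 6 }, end: { chunk: 2, offset: 11 } };

const original = ['Page title', '\n  ', 'Lorem ipsum'];
const selector = describeChunkPosition(original, target);

// Editing an unrelated chunk (here the title) shifts every later position.
const edited = ['New page title', '\n  ', 'Lorem ipsum'];
const staleText = edited.join('').slice(selector.start, selector.end);
```

Here `selector` has `start: 19` and `end: 24`, while in the edited text the same two positions cover `'rem i'` rather than `'ipsum'`.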

describeTextQuote

  • Returns a TextQuoteSelector that unambiguously describes the given range of text, within the given scope.

    The selector will contain the exact target quote. If this quote appears multiple times in the text, sufficient context around the quote will be included in the selector’s prefix and suffix attributes to disambiguate it. By default, more prefix and suffix are included than strictly required, both to be robust against slight modifications and in an attempt not to end halfway through a word (mainly for the sake of human readability).

    example
    const target = window.getSelection().getRangeAt(0);
    const selector = await describeTextQuote(target);
    console.log(selector);
    // {
    //   type: 'TextQuoteSelector',
    //   exact: 'ipsum',
    //   prefix: 'Lorem ',
    //   suffix: ' dolor'
    // }
    

    Parameters

    • range: Range

      The Range whose text content will be described.

    • Optional scope: Node | Range

      A Node or Range that serves as the ‘document’ for purposes of finding occurrences and determining prefix and suffix. Defaults to the full Document that contains range.

    • Optional options: DescribeTextQuoteOptions

      Options to fine-tune the function’s behaviour.

    Returns Promise<TextQuoteSelector>

    The selector unambiguously describing range within scope.
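
The idea of adding context until the quote becomes unambiguous can be sketched on a plain string. The example below grows the prefix and suffix one character at a time and stops at the first unambiguous combination, so unlike the default behaviour described above it adds no extra context and pays no attention to word boundaries. `describeStringQuote` and `countOccurrences` are hypothetical names; this is not the library’s algorithm.

```javascript
// Hypothetical sketch, not the library's code: operates on a string, not a Range.

// Count (possibly overlapping) occurrences of pattern in text.
function countOccurrences(text, pattern) {
  let count = 0;
  let from = 0;
  while (true) {
    const index = text.indexOf(pattern, from);
    // The second check stops the loop for an empty pattern at the end of the text.
    if (index === -1 || index < from) return count;
    count++;
    from = index + 1;
  }
}

// Describe text.slice(start, end) with the least context that makes it unique.
function describeStringQuote(text, start, end) {
  const exact = text.slice(start, end);
  let prefixLength = 0;
  let suffixLength = 0;
  while (true) {
    const prefix = text.slice(start - prefixLength, start);
    const suffix = text.slice(end, end + suffixLength);
    // The target itself always matches, so a count of 1 means it is unambiguous.
    if (countOccurrences(text, prefix + exact + suffix) <= 1) {
      return { type: 'TextQuoteSelector', exact, prefix, suffix };
    }
    // Grow the context on both sides, as far as the text allows. Once both
    // sides reach the edges the pattern equals the whole text, which is unique.
    if (start - prefixLength > 0) prefixLength++;
    if (end + suffixLength < text.length) suffixLength++;
  }
}
```

For the text `'Lorem ipsum dolor sit amet, ipsum est.'`, describing the first `'ipsum'` (offsets 6 to 11) yields `prefix: 'm '` and `suffix: ' d'`, whereas the unique word `'dolor'` needs no prefix or suffix at all.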

highlightText

  • highlightText(target: Node | Range, tagName?: string, attributes?: Record<string, string>): () => void
  • Wrap each text node in a given Node or Range with a <mark> or other element.

    If a Range is given that starts and/or ends within a Text node, that node will be split in order to only wrap the contained part in the mark element.

    The highlight can be removed again by calling the returned function, which cleans up the wrapper elements. Note that this might not perfectly restore the DOM to its previous state: text nodes that were split are not merged again. One could consider running range.commonAncestorContainer.normalize() afterwards to join all adjacent text nodes.

    Parameters

    • target: Node | Range

      The Node/Range containing the text. If it is a Range, note that as highlighting modifies the DOM, the Range may be unusable afterwards.

    • Optional tagName: string

      The element used to wrap text nodes. Defaults to 'mark'.

    • Optional attributes: Record<string, string>

      An object defining any attributes to be set on the wrapper elements, e.g. their class.

    Returns () => void

    A function that removes the created highlight.

      • (): void
      • Returns void

makeCreateRangeSelectorMatcher

  • makeCreateRangeSelectorMatcher(createMatcher: <T, TMatch>(selector: T) => Matcher<Node | Range, TMatch>): (selector: RangeSelector) => Matcher<Node | Range, Range>
  • Find the range(s) corresponding to the given RangeSelector.

    As a RangeSelector itself nests two further selectors, one needs to pass a createMatcher function that will be used to process those nested selectors.

    The function is curried, taking first the createMatcher function, then the selector, and then the scope.

    As there may be multiple matches for the start and end selectors, the resulting matcher will return an (async) iterable that produces a match for each possible pair of matches of the nested selectors (except those where its end would precede its start). Note that this behaviour is a rather free interpretation of the Web Annotation Data Model spec, which is silent about the possibility of multiple matches for RangeSelectors.

    example

    By using a matcher for TextQuoteSelectors, one could create a matcher for text quotes with ellipsis to select a phrase “ipsum … amet,”:

    const selector = {
      type: 'RangeSelector',
      startSelector: {
        type: 'TextQuoteSelector',
        exact: 'ipsum ',
      },
      endSelector: {
        type: 'TextQuoteSelector',
        // Because the end of a RangeSelector is *exclusive*, we will present the
        // latter part of the quote as the *prefix* so it will be part of the
        // match.
        exact: '',
        prefix: ' amet,',
      }
    };
    const createRangeSelectorMatcher =
      makeCreateRangeSelectorMatcher(createTextQuoteSelectorMatcher);
    // The matcher returns an (async) iterable, so iterate to read the matches.
    const matches = createRangeSelectorMatcher(selector)(document.body);
    for await (const match of matches) console.log(match);
    // ⇒ Range { startContainer: #text, startOffset: 6, endContainer: #text,
    //   endOffset: 27, … }
    
    example

    To support RangeSelectors that might themselves contain RangeSelectors, recursion can be created by supplying the resulting matcher creator function as the createMatcher parameter:

    const createWhicheverMatcher = (selector) => {
      const innerCreateMatcher = {
        TextQuoteSelector: createTextQuoteSelectorMatcher,
        TextPositionSelector: createTextPositionSelectorMatcher,
        RangeSelector: makeCreateRangeSelectorMatcher(createWhicheverMatcher),
      }[selector.type];
      return innerCreateMatcher(selector);
    };
    

    Parameters

    • createMatcher: <T, TMatch>(selector: T) => Matcher<Node | Range, TMatch>

      The function used to process nested selectors.

        • <T, TMatch>(selector: T): Matcher<Node | Range, TMatch>
        • Type parameters

          Parameters

          • selector: T

          Returns Matcher<Node | Range, TMatch>

    Returns (selector: RangeSelector) => Matcher<Node | Range, Range>

    A function that, given a RangeSelector selector, creates a Matcher function that can apply it to a given scope.
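
The pairing of start and end matches described above can be sketched on plain strings. In the example below each range runs from the start of a start match to the start of an end match, and pairs whose end would precede their start are skipped. All names (`makeCreateStringRangeMatcher`, `createStringQuoteMatcher`, `collect`) are hypothetical, and offsets stand in for Range objects; this is an illustration, not the library’s implementation.

```javascript
// Hypothetical sketch, not the library's code: operates on strings and offsets.

// Minimal string quote matcher used for the nested selectors.
function createStringQuoteMatcher(selector) {
  const prefix = selector.prefix ?? '';
  const suffix = selector.suffix ?? '';
  const pattern = prefix + selector.exact + suffix;
  return async function* matchAll(text) {
    let from = 0;
    while (true) {
      const index = text.indexOf(pattern, from);
      if (index === -1 || index < from) return;
      const start = index + prefix.length;
      yield { start, end: start + selector.exact.length };
      from = index + 1;
    }
  };
}

// Curried three times: createMatcher, then selector, then scope.
function makeCreateStringRangeMatcher(createMatcher) {
  return function createRangeMatcher(selector) {
    const matchStart = createMatcher(selector.startSelector);
    const matchEnd = createMatcher(selector.endSelector);
    return async function* matchAll(text) {
      // Collect the end matches once so they can be paired with every start match.
      const endMatches = [];
      for await (const endMatch of matchEnd(text)) endMatches.push(endMatch);
      for await (const startMatch of matchStart(text)) {
        for (const endMatch of endMatches) {
          // Skip pairs where the end would precede the start.
          if (endMatch.start < startMatch.start) continue;
          // The end is exclusive: the range stops where the end match begins.
          yield { start: startMatch.start, end: endMatch.start };
        }
      }
    };
  };
}

// Small helper that drains an async iterable into an array.
async function collect(iterable) {
  const results = [];
  for await (const item of iterable) results.push(item);
  return results;
}
```

With the start quote `'A'` and the end quote `'B'` in the text `'x A y B z A w B'`, the matcher produces three ranges (2 to 6, 2 to 14 and 10 to 14); the pair that would start at 10 and end at 6 is dropped.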