Apr 30, 2024

Inspecting the Clipboard

by Scott Mahr

The Clipboard is, in my opinion, a real unsung hero of the web. From its humble beginnings as a simple word-processor utility, the Clipboard has grown into a powerful tool for moving all sorts of content between disparate platforms.

For example, on a daily basis I might copy some rows of a Google Sheet into a Dropbox Paper doc, I might right-click-and-copy a GIF on the web and paste it into Slack, or I might use a macOS keyboard shortcut to save a portion of my screen to the Clipboard and paste it into a GitHub comment box. The Clipboard has become, in many ways, the de facto digital interface between... everything. You probably have something in your Clipboard as you're reading this. But do you know what is actually in there?

Below I'll share some debugging techniques and learnings about how content is stored in the Clipboard. While this isn't directly a post about how the Clipboard works, understanding how it stores data sheds some light on how to make the most of copying and pasting on the web.

A seamless paste

Firstly, some background: this all came about as we were migrating the underlying library that powers our email editor UI. At the beginning of 2024 we switched from an older, jQuery-based WYSIWYG library to the more powerful, but less opinionated, combination of ProseMirror and Tiptap. With this came a lot of work re-implementing (and hopefully improving upon) the experience of pasting text and HTML into the body of an email. Close is a sales tool focused on communication, and users should be able to assemble a cohesive, professional-looking body of HTML from whatever they want: snippets of previous emails, meeting notes, pricing tables (and yes, GIFs).

So, a lot of that work polishing the paste behavior involved a not-so-simple question: "What gets pasted when you paste?"

Which leads us to the next section...

What gets pasted when you paste?

Answering this question is not so simple for a few reasons:

Firstly, because of the magic of web APIs, what gets copied into the Clipboard is not necessarily the text/HTML/whatever that's highlighted when you Cmd+C. Anyone who thinks at least a bit like a web developer understands this instinctively: the Google Sheets UI is clearly not a plain HTML <table>, but selecting cells and copy/pasting them somewhere (like Dropbox Paper, Apple Notes...) usually results in one.

Secondly, if it's HTML that's copied, the browser will put both an HTML and a plain-text representation of that content into the Clipboard. A "vanilla" paste (say, into a <textarea> element) will output the plain-text version, not the HTML.
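To make those first two points concrete, here's a minimal sketch of how a page can decide for itself what lands in the Clipboard on copy, writing both a text/plain and a text/html representation. The sample cells and the buildClipboardPayload helper are made up for illustration; this is not how Google Sheets (or anyone else) actually does it.

```javascript
// Hypothetical sample data standing in for the cells a user has selected.
const selectedCells = [
  ['Plan', 'Price'],
  ['Basic', '$10'],
];

// Build two representations of the same selection (illustrative helper;
// HTML escaping of cell values is omitted to keep the sketch short).
const buildClipboardPayload = (rows) => ({
  'text/plain': rows.map((row) => row.join('\t')).join('\n'),
  'text/html':
    '<table>' +
    rows
      .map((row) => '<tr>' + row.map((cell) => `<td>${cell}</td>`).join('') + '</tr>')
      .join('') +
    '</table>',
});

// On copy, swap whatever is highlighted for our own payload.
const handleCopy = (evt) => {
  const payload = buildClipboardPayload(selectedCells);
  Object.entries(payload).forEach(([type, data]) => {
    evt.clipboardData.setData(type, data);
  });
  // Without preventDefault, the browser copies the current selection instead.
  evt.preventDefault();
};

// Only wire it up when running in a browser.
if (typeof document !== 'undefined') {
  document.addEventListener('copy', handleCopy);
}
```

Pasting after a copy like this into a <textarea> should give the tab-separated text, while a rich editor can pick up the <table>.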

Thirdly, for obvious security reasons, one cannot simply waltz into the Clipboard and look around. Most modern web browsers support a Clipboard API, but programmatically reading from the Clipboard using JavaScript requires all the hoopla of granting special permissions (and at time of writing is not supported in Firefox).
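For reference, reading through the Clipboard API looks roughly like the sketch below. The readClipboard name is mine, and the clipboard parameter exists only so the function can be exercised outside a browser; in a real page you'd call it with no arguments from a user gesture such as a button click.

```javascript
// Read every representation currently in the Clipboard via the async Clipboard API.
// `clipboard` defaults to navigator.clipboard (browser only, secure contexts).
const readClipboard = async (clipboard = navigator.clipboard) => {
  // read() is where the browser asks for permission or requires a user gesture.
  const items = await clipboard.read();
  const entries = [];

  for (const item of items) {
    for (const type of item.types) {
      const blob = await item.getType(type);
      // Decode textual types; just report a size for everything else.
      const data = type.startsWith('text/')
        ? await blob.text()
        : `[${blob.size} bytes]`;
      entries.push({ type, data });
    }
  }

  return entries;
};
```

The result is a list of { type, data } pairs, but the permission prompt makes this a clunky fit for quick debugging.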

And finally, in our case specifically, ProseMirror has its own HTML serialization/deserialization layer which transforms pasted HTML to match a strict schema we define. Debugging the schema/serialization is part of the process of making sure pasting works as expected, meaning we want to know the structure of the copied content both before and after it hits the editor.

Initially, whenever we ran into unexpected behavior when pasting into our editor, the debugging process usually went like this:

  1. Copy the offending content
  2. Reproduce the issue by pasting it into the editor
  3. Switch to the paste handler code, add some console logs, adjust business logic to debug or fix the issue
  4. Return to browser and paste again to test
  5. Oh, some snippet of code gets pasted instead?
  6. Right, of course, while coding, you copied something new to the Clipboard
  7. Rage
  8. Repeat

Solution: build a website

We found the easiest way to understand the contents of the Clipboard at any given time was to build a small interface that listens to and logs out data on paste.

We call this tool: Scott's Handy Paste Inspector.

It is an exceedingly simple one-page, framework-free utility. Here's the source code:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Scott's Handy Paste Inspector</title>
    <script src="https://unpkg.com/prettier@3.0.0/standalone.js"></script>
    <script src="https://unpkg.com/prettier@3.0.0/plugins/html.js"></script>
    <style>
      /* CSS omitted for brevity */
    </style>
  </head>
  <body>
    <h1>Scott's Handy Paste Inspector</h1>
    <p>
      Paste something into the textarea below to inspect the Clipboard contents.
    </p>
    <textarea id="drop"></textarea>

    <ul id="result"></ul>
    <script>
      const dropzone = document.getElementById('drop');
      const result = document.getElementById('result');

      const renderResult = ({ type, data }) => {
        const container = document.createElement('li');

        const heading = document.createElement('h2');
        heading.innerText = type;

        const content = document.createElement('pre');

        if (type === 'text/html') {
          prettier
            .format(data, {
              parser: 'html',
              plugins: prettierPlugins,
            })
            .then((formatted) => {
              // Named `formatted` so it doesn't shadow the outer `result` list.
              content.innerText = formatted;
            });
        } else {
          content.innerText = data;
        }

        container.appendChild(heading);
        container.appendChild(content);

        result.appendChild(container);
      };

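      const handlePaste = (evt) => {
        const { clipboardData } = evt;
        // Clear the previous results and the textarea before the new paste lands.
        result.innerHTML = '';
        dropzone.value = '';

        // Copy into a real array; older browsers exposed `types` as a DOMStringList.
        const mimeTypesToCheck = Array.from(clipboardData.types);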

        mimeTypesToCheck
          .map((type) => {
            const data = clipboardData.getData(type);

            if (!data) return;

            return { type, data };
          })
          .filter(Boolean)
          .forEach(renderResult);
      };

      dropzone.addEventListener('paste', handlePaste);
    </script>
  </body>
</html>

So, what's going on here?

Here's a breakdown:

  1. First, we set up a textarea and attach an event listener to it, to fire a callback whenever anything is pasted.
  2. Then, in our callback we pull the clipboardData off the paste event. It won't surprise you to learn that this is where Clipboard data is stored!
  3. Specifically, it's stored as a DataTransfer object (the same API that's used for drag and drop). DataTransfers can hold any number of items of different data types (generally MIME types like text/plain and text/html). We can tell what types currently exist in the Clipboard by looking at clipboardData.types.
  4. By pulling the list of data types from clipboardData.types, and looping through them, we can "log" the value of each by dropping the result of clipboardData.getData(type) into the browser with a bit of vanilla DOM manipulation.
  5. For text/html we're formatting the value with prettier for readability.

Boom! Instant visibility into the Clipboard contents, as requested!

Inspector results

The additional benefit of having a debugger like this hosted and accessible from anywhere is that it allows us to "remote debug" anyone's Clipboard. If someone non-technical on the team notices something unusual when pasting, we can instruct them to paste into the inspector and share the resulting HTML with us.

Things the Paste Inspector Taught Me

That wraps up our solution, but I wanted to share some general learnings about the Clipboard and the web, courtesy of the Paste Inspector:

Browser inconsistencies around CSS

When copying HTML, WebKit automatically inlines all CSS applied to any given element, including vendor styles (making for some verbose inline styles!). Firefox (Gecko), on the other hand, only includes inline styles if they're in the source HTML.
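If that difference matters to you, one blunt way to level the playing field is to drop inline styles from pasted HTML before doing anything else with it. The sketch below is only an illustration of that idea (the function names are mine, and it is not what our editor does; ProseMirror's schema handles this for us).

```javascript
// Remove inline `style` attributes from every element under `root`.
// `root` can be any node that supports querySelectorAll, e.g. a document body.
const stripInlineStyles = (root) => {
  root.querySelectorAll('[style]').forEach((el) => el.removeAttribute('style'));
  return root;
};

// Browser-only convenience: parse pasted HTML, strip styles, serialize again.
const normalizePastedHtml = (html) => {
  const doc = new DOMParser().parseFromString(html, 'text/html');
  return stripInlineStyles(doc.body).innerHTML;
};
```

In a paste handler you could run normalizePastedHtml(clipboardData.getData('text/html')) and compare the output across browsers.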

Platform-specific niceties

Sometimes, it's clear that web-based services have put effort and design into proprietary Clipboard behavior:

  • Google Sheets inlines a bunch of additional data into the HTML when copying pieces of a spreadsheet, including the range of the copied cells as a meta tag, and raw values for each cell as data attributes on the <td>.
  • Dropbox Paper and Notion (and others I'm sure) add the Markdown representation of copied content to the Clipboard, alongside the copied HTML.

(Slack, on the other hand, doesn't seem to do any Clipboard massaging, and uses a proprietary span-based paragraph structure, meaning paragraph breaks aren't preserved when pasting. Thanks for nothing, Slack!)

Platform-specific quirks

  • WebKit, in some cases, adds a mysterious <br class="Apple-interchange-newline" /> break tag to the end of the copied HTML. It seems to happen when you copy a whole paragraph (e.g. triple-click).
  • Apple Notes seems to be one of the rare applications that still adds text/rtf content to the Clipboard on copy... but only when pasting into Safari.
  • When copying some cells from Apple Numbers, Numbers actually generates an image of the selected cells and adds this to the Clipboard as a File.
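Files like that Numbers image don't come back from getData as far as I can tell (it returns an empty string for them), so spotting them means looking at clipboardData.items instead. Here's a small sketch of how that could work; listPastedFiles is a hypothetical helper, not part of the inspector source above.

```javascript
// List the file entries (images, documents...) attached to a paste event.
const listPastedFiles = (clipboardData) =>
  Array.from(clipboardData.items)
    // Each item is either a 'string' or a 'file'.
    .filter((item) => item.kind === 'file')
    .map((item) => item.getAsFile())
    // getAsFile() can return null, so drop anything empty.
    .filter(Boolean)
    .map((file) => ({ name: file.name, type: file.type, size: file.size }));
```

Hooked up to the inspector's textarea, that could be as simple as dropzone.addEventListener('paste', (evt) => console.log(listPastedFiles(evt.clipboardData))).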

In conclusion

Take the paste inspector for a spin! Which apps do you use on a daily basis that are also doing sugary things with Clipboard contents? What's in your Clipboard right now? Might it be a password? Is this all an elaborate phishing scheme? I guess the only way to know for sure is to apply for one of our open engineering positions.

Happy pasting!