This document proposes a new API aimed at grabbing a screenshot of either the current viewport, the current window or the current screen.
Extending this API beyond the current viewport/window/screen is possible, but we avoid it for the time being. The rationale for this decision is discussed further below.
This is an early draft of a proposal for an API for grabbing screenshots.
For the purposes of this document, when the viewport is discussed, we mean the viewport of the top-level document; essentially the entire user-visible part of the user agent tab.
This document uses the definitions of {{MediaStream}} and {{MediaStreamTrack}} from [[!MEDIACAPTURE-STREAMS]].
This document uses the definitions of display surfaces from [[!SCREEN-CAPTURE]], including the distinction between browser, window and monitor display surfaces.
The terms fulfilled, rejected and resolved used in the context of Promises are defined in [[ECMA-262]].
We define a "screenshot" as an image representing what the user sees on their screen, or part thereof. A screenshot is defined as a subset of the user's current browser, window or monitor display surface.
We avoid extended definitions for the time being. For the purposes of this document, the following are NOT considered a screenshot:
Applications often offer end users the option to file feedback. This feedback can be used to report bugs in the application itself, or in its interaction with either the user agent or the operating system. When reporting such defects, a picture is often worth a thousand words.
Current mechanisms exist for an application to grab a screenshot. They all have significant associated issues, necessitating the definition of a new API.
Applications sometimes use getDisplayMedia, then attempt to grab a single frame from the resulting video track. This workaround is problematic on several accounts:
Some libraries exist for redrawing a part of the DOM onto a Canvas.
Various browser extensions exist aiming to fill in the gap in the web platform. Reliance on extensions suffers from several issues:
It is possible to direct the user to take a screenshot manually, using whatever tools are available on the user's operating system, and then ask the user to hand the application that file.
Most modern browsers include a Developer Tools module which includes the ability to grab a screenshot. An application can direct the user to use one of these and to then share with the application the resulting image.
It should go without saying that this workaround is highly perilous for the user, as directions to use the Developer Tools module could easily lead to the user being asked to help the application exceed the permissions which it should have.
Reputable applications would never direct the user to make use of the Developer Tools module. Tech-savvy users would find such an instruction highly suspect, whereas less experienced users would often find it difficult to follow.
No good workaround exists, yet the capability is highly desired on the web, as evidenced by the many high-profile applications which provide this functionality using complicated workarounds. An API for accomplishing this is necessary.
partial interface MediaDevices { Promise<CaptureScreenshotResultType> captureScreenshot(ScreenshotSurfaceType surface); };
Prompts the user for permission to grab a screenshot of either the requested surface or any surface [=surface-size|contained=] within it. The screenshot is presented to the user for approval and editing before it is handed to the application.
When the {{captureScreenshot()}} method is invoked, the User Agent MUST run the following steps before proceeding:
If all of these validations pass, the user agent MUST then return promise p to the application, and present the user with a prompt containing a preview of the screenshot.
If surface was anything [=surface-size|larger=] than {{ScreenshotSurfaceType/viewport}}, the user agent nevertheless MUST offer all [=surface-size|smaller=] surfaces as well, so as to prevent the user being compelled into sharing more than they would like.
This preview MUST be large enough for the user to be able to inspect it reasonably. It is recommended that the user agent provide a mechanism to zoom in or magnify the preview.
The user agent SHOULD provide the user with mechanisms to crop the image or black out parts of it. Although the application is better positioned to perform such image manipulation than the user agent, the rationale here is that the user should be able to reduce the amount of information shared with an application to begin with.
The user agent SHOULD add a random delay between when {{captureScreenshot()}} is called and when the screenshot is taken and presented to the user. This makes it harder for a malicious application to flash new content to the screen exactly at the time that the preview is presented to the user, thereby escaping the user's notice and gaining their approval of the screenshot based on the content they saw before the preview was presented.
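The random delay could be sketched as follows. This is only an illustration from the user agent's side: the bounds `MIN_DELAY_MS` and `MAX_DELAY_MS`, the helper names, and the choice of a uniform distribution are all assumptions, since this document does not specify a range or distribution. A real user agent would likely prefer a source of randomness that the page cannot predict.

```typescript
// Hypothetical bounds; this document does not specify a delay range.
const MIN_DELAY_MS = 200;
const MAX_DELAY_MS = 1500;

/**
 * Picks a delay uniformly from [minMs, maxMs).
 * `rng` must return a value in [0, 1); it is injectable so that the
 * source of randomness can be swapped out.
 */
function pickCaptureDelayMs(
  minMs: number = MIN_DELAY_MS,
  maxMs: number = MAX_DELAY_MS,
  rng: () => number = Math.random,
): number {
  if (minMs < 0 || maxMs < minMs) {
    throw new RangeError("expected 0 <= minMs <= maxMs");
  }
  return minMs + rng() * (maxMs - minMs);
}

/** Resolves after `ms` milliseconds. */
function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => {
    setTimeout(() => resolve(), ms);
  });
}

/** Waits for an unpredictable interval, then invokes the capture callback. */
async function captureAfterRandomDelay<T>(capture: () => T): Promise<T> {
  await sleep(pickCaptureDelayMs());
  return capture();
}
```

Because the page cannot know when within the interval the capture happens, it cannot reliably time a content swap to coincide with it.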
If the user does not approve the screenshot, the user agent MUST [=reject=] p with a {{DOMException}} object whose {{DOMException/name}} attribute has the value {{AbortError}}.
If the user approves the screenshot, the user agent MUST resolve p with a {{CaptureScreenshotResultType}} constructed with the {{CaptureScreenshotResultType/surface}} the user ultimately chose and the {{CaptureScreenshotResultType/image}} resulting from the user's cropping and other editing of the initial screenshot.
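From the application's point of view, the approval and rejection paths above might be handled as in the following sketch. The `ScreenshotCapturer` interface and the `requestFeedbackScreenshot` helper are hypothetical names introduced only for illustration. The interface stands in for the proposed extension to `MediaDevices`, and the image type is left generic so that the sketch does not depend on browser-only types (in a browser it would be an `ImageBitmap`).

```typescript
type ScreenshotSurfaceType = "viewport" | "window";

// Mirrors CaptureScreenshotResultType; `image` is an ImageBitmap in a browser.
interface CaptureScreenshotResult<Image> {
  surface: ScreenshotSurfaceType;
  image: Image;
}

// Hypothetical minimal shape of the proposed MediaDevices extension.
interface ScreenshotCapturer<Image> {
  captureScreenshot(
    surface: ScreenshotSurfaceType,
  ): Promise<CaptureScreenshotResult<Image>>;
}

/** True if `err` is an object whose `name` property is "AbortError". */
function isAbortError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { name?: unknown }).name === "AbortError"
  );
}

/**
 * Requests a screenshot and returns the result, or null if the user
 * did not approve it. Any other rejection is rethrown to the caller.
 */
async function requestFeedbackScreenshot<Image>(
  devices: ScreenshotCapturer<Image>,
  surface: ScreenshotSurfaceType,
): Promise<CaptureScreenshotResult<Image> | null> {
  try {
    return await devices.captureScreenshot(surface);
  } catch (err) {
    if (isAbortError(err)) {
      return null; // The user did not approve the screenshot.
    }
    throw err;
  }
}
```

A caller should read `surface` from the result rather than assume it matches the request, since the user may have chosen a smaller surface than the one asked for.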
Describes the different surfaces which can be captured as a screenshot. As input to {{captureScreenshot()}}, it describes which surface the application is interested in capturing. When returned by {{captureScreenshot()}}, it describes which surface the user approved for capture.
Note that a type for monitor display surfaces is not currently defined. This is intentional. It avoids, for now, the complicated topic of calling {{captureScreenshot()}} from a document presented in a window that spans multiple monitors.
enum ScreenshotSurfaceType { "viewport", "window", };
Enumeration value | Description |
---|---|
viewport | The viewport associated with the document from which {{captureScreenshot}} was called. Note that this means that an embedded document captures its parent document as well as sibling documents. |
window | The window displaying the document from which {{captureScreenshot}} was called. |
With respect to surface-size, note that each surface in this enum is contained within the next one, allowing us to define "larger" surfaces as those that appear later in this enum.
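The ordering just described, together with the requirement that the prompt also offer every smaller surface, can be sketched as follows. The names `SURFACES_BY_SIZE`, `isLarger`, and `offeredSurfaces` are hypothetical and appear nowhere in the proposed API; they only illustrate the rule.

```typescript
type ScreenshotSurfaceType = "viewport" | "window";

// Ordered from smallest to largest, mirroring the enum declaration order.
const SURFACES_BY_SIZE: readonly ScreenshotSurfaceType[] = ["viewport", "window"];

/** True if `a` is strictly larger than `b`, i.e. appears later in the enum. */
function isLarger(a: ScreenshotSurfaceType, b: ScreenshotSurfaceType): boolean {
  return SURFACES_BY_SIZE.indexOf(a) > SURFACES_BY_SIZE.indexOf(b);
}

/** Surfaces the prompt offers: the requested one plus every smaller one. */
function offeredSurfaces(requested: ScreenshotSurfaceType): ScreenshotSurfaceType[] {
  return SURFACES_BY_SIZE.filter((s) => s === requested || isLarger(requested, s));
}
```

With the current two-value enum, a request for `"window"` yields both `"viewport"` and `"window"`, while a request for `"viewport"` yields only `"viewport"`. Adding a larger surface later would only require appending it to the ordered list.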
Successful invocations of {{captureScreenshot}}, where the user approves a capture, produce a result of type {{CaptureScreenshotResultType}}.
[Exposed=Window] interface CaptureScreenshotResultType { attribute ScreenshotSurfaceType surface; attribute ImageBitmap image; };
The surface which the user approved for capture. This may be a smaller surface than the one requested by the application.
The screenshot, after applying any editing that the user might have chosen to perform, such as cropping and/or blacking out parts of the image.
This document defines a [=policy-controlled feature=] identified by the string "screenshot". Its [=default allowlist=] is "self".
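As an illustration of what this default means for deployment, the sketch below builds a `Permissions-Policy` response header value for the feature. It assumes that "screenshot" is serialized like any other policy-controlled feature in the general Permissions Policy header syntax. The helper name `permissionsPolicyValue` and the example origin are hypothetical.

```typescript
/**
 * Serializes one feature and its allowlist as a Permissions-Policy
 * header value. Entries are "self", "*", or an origin string.
 */
function permissionsPolicyValue(feature: string, allowlist: readonly string[]): string {
  if (allowlist.indexOf("*") !== -1) {
    return `${feature}=*`; // Allowed for all origins.
  }
  const members = allowlist.map((entry) =>
    // `self` is a bare token; origins are quoted strings.
    entry === "self" ? "self" : JSON.stringify(entry),
  );
  return `${feature}=(${members.join(" ")})`;
}

// Equivalent to the default allowlist: same-origin documents only.
const defaultPolicy = permissionsPolicyValue("screenshot", ["self"]);

// Additionally delegating to a hypothetical cross-origin feedback widget.
const delegatedPolicy = permissionsPolicyValue("screenshot", [
  "self",
  "https://feedback.example",
]);
```

Under the default allowlist, a cross-origin embedded document cannot use the feature unless the embedder opts in, for instance through a header like `delegatedPolicy` together with an `allow` attribute on the embedding iframe.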