Handle video stream
The video socket from the Scrcpy server contains the encoded video frames. However, its format depends on the Scrcpy server version and the specified option values.
Raw mode
Since v1.22, if the sendDeviceMeta and sendFrameMeta options are false, and (since v2.0) the sendCodecMeta option is also false, the server returns the encoded video stream directly, without any encapsulation. In this mode, the video stream can be decoded or saved directly.
Otherwise, the video stream contains additional metadata and frame information. The following methods can be used to parse and handle the video stream.
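For example, in raw mode the socket can be piped straight into whatever sink you use for storage. A minimal sketch in TypeScript, assuming you already have the video socket as a ReadableStream<Uint8Array>; writeChunkToFile is a hypothetical placeholder for your own file-writing code:

import { ScrcpyOptions2_1 } from "@yume-chan/scrcpy";

// Raw mode: all metadata disabled, so the socket carries only encoded video.
// Use the same option values when starting the server.
const options = new ScrcpyOptions2_1({
  sendDeviceMeta: false,
  sendFrameMeta: false,
  sendCodecMeta: false,
});

declare const videoSocket: ReadableStream<Uint8Array>; // get the stream yourself
declare function writeChunkToFile(chunk: Uint8Array): void; // hypothetical placeholder

// No parsing needed: every chunk is raw encoded video data
videoSocket
  .pipeTo(
    new WritableStream({
      write(chunk) {
        writeChunkToFile(chunk);
      },
    }),
  )
  .catch((e) => {
    console.error(e);
  });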
With @yume-chan/scrcpy
If you have a ReadableStream<Uint8Array> that reads from the video socket, the corresponding ScrcpyOptionsX_YY class can be used to parse the metadata and create a packet stream.
parseVideoStreamMetadata
The first step is to parse the preceding metadata from the video stream. This method will return the metadata and a new stream that contains the remaining data.
- JavaScript
- TypeScript
import { ScrcpyOptions2_1 } from "@yume-chan/scrcpy";
const options = new ScrcpyOptions2_1({
// use the same version and options when starting the server
});
let videoSocket; // get the stream yourself
// Parse video socket metadata
const { metadata: videoMetadata, stream: videoStream } =
await options.parseVideoStreamMetadata(videoSocket);
import { ScrcpyOptions2_1, ScrcpyMediaStreamPacket } from "@yume-chan/scrcpy";
const options = new ScrcpyOptions2_1({
// use the same version and options when starting the server
});
declare const videoSocket: ReadableStream<Uint8Array>; // get the stream yourself
// Parse video socket metadata
const { metadata: videoMetadata, stream: videoStream } =
await options.parseVideoStreamMetadata(videoSocket);
The metadata object contains the following fields:
interface ScrcpyVideoStreamMetadata {
deviceName?: string | undefined;
width?: number | undefined;
height?: number | undefined;
codec: ScrcpyVideoCodecId;
}
- deviceName: The device's model name, or undefined if the sendDeviceMeta option is false (since v1.22).
- width and height: Size of the first video frame, or undefined if the sendDeviceMeta option is false (between v1.22 and v2.0), or undefined if the sendCodecMeta option is false (since v2.0).
- codec: The codec returned from the server, or the codec inferred from the options if the sendCodecMeta option is false (since v2.0).
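For example, continuing from the snippet above, the codec field can drive which decoder you set up. A minimal sketch using the ScrcpyVideoCodecId enum referenced in the interface above:

import { ScrcpyVideoCodecId } from "@yume-chan/scrcpy";

console.log(videoMetadata.deviceName); // device model name, or undefined

switch (videoMetadata.codec) {
  case ScrcpyVideoCodecId.H264:
    // set up an H.264 decoder
    break;
  case ScrcpyVideoCodecId.H265:
    // set up an H.265 decoder
    break;
  case ScrcpyVideoCodecId.AV1:
    // set up an AV1 decoder
    break;
}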
When the device's screen resolution changes (for example, when the device rotates, or when a foldable device folds/unfolds), the server restarts video capture and encoding with the new resolution. However, it won't send a new metadata packet with the updated size. The only way to keep track of the video resolution is to parse the video stream itself; the exact code depends on the video codec (a sketch is shown in the Configuration packet section below).
The returned videoStream is a ReadableStream<Uint8Array> that contains the video stream after the metadata. If the sendDeviceMeta option is false (between v1.22 and v2.0), or both the sendDeviceMeta and sendCodecMeta options are false (since v2.0), it is the input stream as-is.
createMediaStreamTransformer
The createMediaStreamTransformer method creates a TransformStream that parses the video stream into packets.
parseVideoStreamMetadata and createMediaStreamTransformer are separate methods because the video and audio streams have the same packet format but different metadata.
- JavaScript
- TypeScript
const videoPacketStream = videoStream.pipeThrough(options.createMediaStreamTransformer());
videoPacketStream
.pipeTo(
new WritableStream({
write(packet) {
switch (packet.type) {
case "configuration":
// Handle configuration packet
console.log(packet.data);
break;
case "data":
// Handle data packet
console.log(packet.keyframe, packet.pts, packet.data);
break;
}
},
}),
)
.catch((e) => {
console.error(e);
});
const videoPacketStream: ReadableStream<ScrcpyMediaStreamPacket> = videoStream.pipeThrough(
options.createMediaStreamTransformer(),
);
videoPacketStream
.pipeTo(
new WritableStream({
write(packet: ScrcpyMediaStreamPacket) {
switch (packet.type) {
case "configuration":
// Handle configuration packet
console.log(packet.data);
break;
case "data":
// Handle data packet
console.log(packet.keyframe, packet.pts, packet.data);
break;
}
},
}),
)
.catch((e) => {
console.error(e);
});
Similar to options.clipboard, don't await the pipeTo call. The returned Promise only resolves when videoSocket ends, and awaiting it here without handling the other streams will block videoSocket so that it never ends.
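If you need to stop consuming the video later (for example when the user closes the player), the standard Streams API lets you pass an AbortSignal to pipeTo instead. A minimal sketch of that variant; sink here is a stand-in for the WritableStream shown above:

const abortController = new AbortController();

const sink = new WritableStream({
  write(packet) {
    // handle the packet as shown above
  },
});

videoPacketStream
  .pipeTo(sink, { signal: abortController.signal })
  .catch((e) => {
    // aborting rejects the returned Promise; other errors can be logged here
    console.error(e);
  });

// Later, when mirroring should stop:
abortController.abort();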
videoPacketStream contains two types of packets:
interface ScrcpyMediaStreamConfigurationPacket {
type: "configuration";
data: Uint8Array;
}
interface ScrcpyMediaStreamDataPacket {
type: "data";
keyframe?: boolean;
pts?: bigint;
data: Uint8Array;
}
type ScrcpyMediaStreamPacket =
| ScrcpyMediaStreamConfigurationPacket
| ScrcpyMediaStreamDataPacket;
The behavior of the packets depends on the sendFrameMeta option:
| Value | sendFrameMeta: true | sendFrameMeta: false |
|---|---|---|
| ScrcpyMediaStreamConfigurationPacket | Always the first packet; may also appear again later (e.g. when the encoder restarts) | Doesn't exist |
| ScrcpyMediaStreamDataPacket.keyframe | boolean indicating whether the current packet is a keyframe | undefined |
| ScrcpyMediaStreamDataPacket.pts | bigint indicating the presentation timestamp | undefined |
Configuration packet
The first packet is always a configuration packet. When the encoder restarts (for example, after a resolution change), another configuration packet is generated.
The data field contains the codec-specific configuration information:
- H.264: Sequence Parameter Set (SPS) and Picture Parameter Set (PPS)
- H.265: Video Parameter Set (VPS), Sequence Parameter Set (SPS), and Picture Parameter Set (PPS)
- AV1: The first 3 bytes of AV1CodecConfigurationRecord (https://aomediacodec.github.io/av1-isobmff/#av1codecconfigurationbox). The remaining configuration OBUs are in the next data packet.
The client should handle this packet and update the decoder accordingly.
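This is also where resolution changes can be tracked (see the note in parseVideoStreamMetadata above): for H.264, the SPS in each configuration packet carries the updated size. A minimal sketch; parseH264Sps is a hypothetical placeholder for your own bitstream parsing code, not part of @yume-chan/scrcpy:

declare function parseH264Sps(data: Uint8Array): { width: number; height: number }; // hypothetical helper

// Inside the write(packet) handler shown earlier:
if (packet.type === "configuration") {
  // For H.264, `data` contains the SPS and PPS; the SPS carries the coded size
  const { width, height } = parseH264Sps(packet.data);
  console.log("Video size changed to", width, "x", height);
  // Re-create or re-configure your decoder with the new parameters here
}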
Data packet
Each data packet contains exactly one encoded frame.
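If you only want to record the stream rather than decode it, writing the data of every packet (configuration and data, in order) to a file reproduces the same raw encoded stream that raw mode would give you, as described above. A minimal sketch, with appendToFile as a hypothetical placeholder:

declare function appendToFile(data: Uint8Array): void; // hypothetical placeholder

videoPacketStream
  .pipeTo(
    new WritableStream({
      write(packet) {
        // Configuration and data packets, concatenated in order, reproduce
        // the raw encoded video stream
        appendToFile(packet.data);
      },
    }),
  )
  .catch((e) => {
    console.error(e);
  });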
With @yume-chan/adb-scrcpy
When the video option is not false, AdbScrcpyClient.videoStream is a Promise that resolves to an AdbScrcpyVideoStream.
interface AdbScrcpyVideoStream {
stream: ReadableStream<ScrcpyMediaStreamPacket>;
metadata: ScrcpyVideoStreamMetadata;
}
It's very similar to the videoMetadata and videoPacketStream values in the above section, because AdbScrcpyClient calls parseVideoStreamMetadata and createMediaStreamTransformer internally.
- JavaScript
- TypeScript
if (client.videoStream) {
const { metadata: videoMetadata, stream: videoPacketStream } = await client.videoStream;
videoPacketStream
.pipeTo(
new WritableStream({
write(packet) {
switch (packet.type) {
case "configuration":
// Handle configuration packet
console.log(packet.data);
break;
case "data":
// Handle data packet
console.log(packet.keyframe, packet.pts, packet.data);
break;
}
},
}),
)
.catch((e) => {
console.error(e);
});
}
import { ScrcpyMediaStreamPacket } from "@yume-chan/scrcpy";
if (client.videoStream) {
const { metadata: videoMetadata, stream: videoPacketStream } = await client.videoStream;
videoPacketStream
.pipeTo(
new WritableStream({
write(packet: ScrcpyMediaStreamPacket) {
switch (packet.type) {
case "configuration":
// Handle configuration packet
console.log(packet.data);
break;
case "data":
// Handle data packet
console.log(packet.keyframe, packet.pts, packet.data);
break;
}
},
}),
)
.catch((e) => {
console.error(e);
});
}
Check the With @yume-chan/scrcpy section above for how to handle different types of packets.
Decode and render video
Tango provides packages to decode and render video packets in Web browsers.
To use these decoders, the sendFrameMeta option must be true (the default value).
Decoding and playing video outside the browser is out of scope for this library. It will depend on the runtime environment, UI framework, and media library you use.