PRG File Format Specification
- MIME Type: application/vnd.project-graph
- File Extension: .prg
- Author: zty012 <[email protected]>
- Version: 2.3.0
1. Introduction
The PRG file format is a container-based format for storing diagrams or extensions used in the Project Graph application. It leverages the ubiquitous ZIP archive format to bundle the main content (stage.msgpack or extension.js) with its associated binary attachments (images, documents, etc.).
The design goals of this format are:
- Portability: To serve as a single, shareable file containing all project assets.
- Interoperability: To be based on well-established standards (ZIP, MessagePack) for ease of implementation.
- Extensibility: To allow for future evolution of the format while maintaining backward compatibility.
2. Overall Container Structure
A PRG file MUST be a valid ZIP archive. The structure within the ZIP filesystem is as follows:
- All paths within the ZIP archive MUST use the forward slash (/) as the directory separator.
- All filenames MUST be encoded using UTF-8.
NOTE: A valid PRG container MUST include metadata.msgpack at the root, and MUST include at least one of stage.msgpack or extension.js. Implementations that encounter a ZIP lacking these required files should reject the file as invalid.
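The required-file rule above can be checked before any content is parsed. The following is a minimal sketch using Python's standard zipfile module; the function name validate_container and the wording of the problem messages are illustrative assumptions, not part of this specification.

```python
import io
import zipfile

REQUIRED_METADATA = "metadata.msgpack"
MAIN_CONTENT = ("stage.msgpack", "extension.js")


def validate_container(data: bytes) -> list[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return ["not a valid ZIP archive"]
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = zf.namelist()
    problems = []
    if REQUIRED_METADATA not in names:
        problems.append("missing metadata.msgpack at the root")
    if not any(name in names for name in MAIN_CONTENT):
        problems.append("missing both stage.msgpack and extension.js")
    for name in names:
        # Paths MUST use "/" as the separator. Note that on Windows the
        # zipfile module may already normalize backslashes while reading.
        if "\\" in name:
            problems.append(f"backslash in entry name: {name!r}")
    return problems
```

A caller would typically reject the file when the returned list is non-empty and surface the messages to the user.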
3. The Metadata File (metadata.msgpack)
3.1. Format & Encoding
The metadata.msgpack file MUST exist at the root of the ZIP container. Its content MUST be a binary stream that is a valid MessagePack serialization of an object containing metadata about the project.
3.2. Content Schema
The deserialized content of metadata.msgpack MUST be an object with the following entries:
- version: A string value representing the version of the PRG format used, in Semantic Versioning form (e.g., "2.3.0"). This field MUST be present to allow for future format evolution and compatibility checks.
- extension?: Required if the PRG file is an extension. An object containing:
  - id: A string value representing the unique identifier of the extension (e.g., "com.example.myextension").
  - name: A string value representing the human-readable name of the extension (e.g., "My Extension").
  - description: A string value providing a brief description of the extension's functionality and purpose.
  - version: A string value representing the version of the extension in Semantic Versioning form (e.g., "1.0.0").
  - author: A string value representing the name and email of the extension's author (e.g., "Jane Doe <[email protected]>").
Compatibility and alternative formats:
- Implementations MUST support metadata.msgpack. Implementations MAY also support a human-readable metadata.json as an alternative. If both metadata.msgpack and metadata.json are present, implementations MUST prefer metadata.msgpack.
- When encountering a metadata.version with an unknown major version, implementations SHOULD fail gracefully (e.g., refuse to load or operate in read-only mode) and log or surface a clear error. For unknown minor/patch versions, implementations SHOULD ignore unknown fields while continuing processing when possible.
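The version rule above can be sketched as a small classifier. This is an illustration only: it assumes a reader that supports a single major version (the 2.x line described here), and the names parse_semver and compatibility, as well as the three result labels, are hypothetical.

```python
SUPPORTED_VERSION = "2.3.0"  # the format version this document describes


def parse_semver(version: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH', ignoring any pre-release or build suffix."""
    core = version.split("+", 1)[0].split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def compatibility(file_version: str, supported: str = SUPPORTED_VERSION) -> str:
    """Classify a file version as 'ok', 'newer-minor', or 'unsupported-major'."""
    file_v = parse_semver(file_version)
    ours = parse_semver(supported)
    if file_v[0] != ours[0]:
        # Assumption: any other major version counts as unknown to this reader.
        return "unsupported-major"  # refuse to load or fall back to read-only
    if file_v[1:] > ours[1:]:
        return "newer-minor"  # load, ignoring unknown fields
    return "ok"
```

An implementation that understands several major versions would replace the single equality check with a lookup in its own list of supported majors.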
Example (JSON form; equivalent MessagePack should be used in metadata.msgpack):
{
  "version": "2.3.0",
  "extension": {
    "id": "com.example.myextension",
    "name": "My Extension",
    "description": "Provides extra node types",
    "version": "1.0.0",
    "author": "Jane Doe <[email protected]>"
  }
}
4. Main Content
PRG files can represent either a diagram or an extension.
4.1. Diagram (stage.msgpack)
4.1.1. Format & Encoding
The stage.msgpack file MAY exist at the root of the ZIP container. Its content MUST be a binary stream that is a valid MessagePack serialization of an array produced by the Graphif Serializer.
4.1.2. Content Schema
The deserialized content of stage.msgpack MUST be an array. Each element in this array MUST be an object representing a Stage Object.
4.1.2.1. Stage Object
A Stage Object (an object) MUST contain the following entry:
- uuid (Key): A string value representing the unique identifier for this stage.
The Stage Object MAY contain any number of other entries to define the graph's nodes, edges, properties, and references to attachments. The specific schema for these entries is defined by the Graphif Serializer specification.
Any implementation that encounters a Stage Object with an unknown structure SHOULD ignore the unrecognized entries and continue processing.
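As a rough illustration of the schema rules in this section, the sketch below checks already-deserialized stage data (for example, the output of any MessagePack library) and collects the uuid of each Stage Object. The function name stage_uuids is an assumption, and entries other than uuid are deliberately left uninspected, in line with the guidance to ignore unrecognized entries.

```python
def stage_uuids(stage: object) -> list[str]:
    """Return the uuid of each Stage Object, raising ValueError on violations."""
    if not isinstance(stage, list):
        raise ValueError("stage.msgpack must deserialize to an array")
    uuids = []
    for index, obj in enumerate(stage):
        if not isinstance(obj, dict):
            raise ValueError(f"element {index} is not an object")
        uuid = obj.get("uuid")
        if not isinstance(uuid, str):
            raise ValueError(f"element {index} has no string 'uuid' entry")
        uuids.append(uuid)  # any other entries are ignored here, not rejected
    return uuids
```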
4.2. Extension (extension.js)
The extension.js file MAY exist at the root of the ZIP container. If present, it MUST be a JavaScript file that defines the behavior and functionality of an extension package. The content of this file is outside the scope of this specification and is determined by the extension's requirements.
5. The Attachments Directory (attachments/)
5.1. Purpose
The attachments/ directory is an optional container for any binary or text files referenced by the stage(s), such as images, documents, or other media.
5.2. Naming Convention
Files within the attachments/ directory MUST be named using a Universally Unique Identifier (UUID) followed by a file extension that implies the file's MIME type.
Example:
- attachments/b54c5f6c-6f28-4d65-bcc5-4c891c6dbd77.png
- attachments/f8a3d2b1-4e5c-6789-0123-456789abcdef.pdf
UUID format: Implementations SHOULD follow RFC 4122 (commonly UUID v4). A recommended validation regex is:
^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$
Implementations MUST NOT allow multiple different files in attachments/ that share the same UUID (even with different extensions); the presence of duplicate UUIDs is considered invalid, and implementations MUST report an error rather than silently choosing one.
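The two rules above (UUID validation and duplicate detection) might be implemented along the following lines. This is a sketch under assumptions: the helper name find_duplicate_uuids is hypothetical, and the UUID is taken to be everything before the first dot in the file name.

```python
import re
from collections import defaultdict

# The recommended validation regex from this section (lowercase hex only).
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$"
)

PREFIX = "attachments/"


def find_duplicate_uuids(entry_names: list[str]) -> dict[str, list[str]]:
    """Map each UUID used by more than one attachment to the clashing names."""
    by_uuid = defaultdict(list)
    for name in entry_names:
        if not name.startswith(PREFIX) or name.endswith("/"):
            continue  # not an attachment file (other entry or directory)
        stem = name[len(PREFIX):].split(".", 1)[0]
        by_uuid[stem].append(name)
    return {uuid: names for uuid, names in by_uuid.items() if len(names) > 1}
```

Note that the regex is stricter than the naming rule itself: the second example name above, f8a3d2b1-4e5c-6789-0123-456789abcdef, does not match it, because its version digit is 6 and its variant digit is 0. Since following RFC 4122 is only a SHOULD, a reader may prefer to warn rather than reject on a regex mismatch.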
5.3. Referencing Attachments
Stage Objects reference these files by their UUID filename (without the extension). For example, an ImageNode object would reference the attachment b54c5f6c-6f28-4d65-bcc5-4c891c6dbd77.png using the string "b54c5f6c-6f28-4d65-bcc5-4c891c6dbd77".
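Resolving such a reference amounts to finding the single attachment whose name starts with the UUID. A minimal sketch, assuming the list of ZIP entry names is already available; resolve_attachment is a hypothetical helper name and the choice of exceptions is illustrative.

```python
def resolve_attachment(entry_names: list[str], uuid: str) -> str:
    """Return the ZIP entry name of the attachment referenced by a UUID."""
    prefix = "attachments/"
    matches = [
        name
        for name in entry_names
        if name.startswith(prefix)
        and name[len(prefix):].split(".", 1)[0] == uuid
    ]
    if not matches:
        raise KeyError(f"no attachment found for {uuid}")
    if len(matches) > 1:
        # Duplicate UUIDs are invalid and must be reported, not resolved.
        raise ValueError(f"duplicate attachments for {uuid}: {matches}")
    return matches[0]
```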
6. Security Considerations
Implementors and users of this format should be aware of several security-related aspects:
- ZIP Container Risks:
  - Compression Bombs: A PRG file may contain a small ZIP that decompresses to an extremely large amount of data, causing denial of service. Implementations MUST impose reasonable limits on the number of extracted files and the total uncompressed size.
    - Recommended default limits (implementations MAY make these configurable):
      - Maximum number of entries: 1000
      - Maximum total uncompressed size: 500 * 1024 * 1024 bytes (500 MiB)
      - Maximum single file uncompressed size: 200 * 1024 * 1024 bytes (200 MiB)
  - Path Traversal: Maliciously crafted ZIP entries could have names like ../../../some_important_file. Implementations MUST NOT extract files to the filesystem; instead, read them directly from the ZIP stream into memory or a controlled environment.
    - Implementations MUST reject ZIP entries that contain absolute paths, drive letters, or any .. path segments.
- Attachment Risks: The attachments directory can contain any file type. The application processing the PRG file is responsible for handling each attachment securely (e.g., running script files in a sandbox or scanning attachments for malware).
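The ZIP-related checks above could be combined into a single pre-flight pass over the archive's entry list. The sketch below uses Python's standard zipfile module; the function names and the use of ValueError are assumptions, and the sizes inspected are the ones declared in the archive headers, so a cautious reader may additionally cap the number of bytes it actually reads per entry.

```python
import io
import zipfile
from pathlib import PureWindowsPath

MAX_ENTRIES = 1000
MAX_TOTAL_SIZE = 500 * 1024 * 1024  # bytes
MAX_FILE_SIZE = 200 * 1024 * 1024   # bytes


def is_safe_entry_name(name: str) -> bool:
    """Reject absolute paths, drive letters, and any '..' path segment."""
    if name.startswith("/") or PureWindowsPath(name).drive:
        return False
    return ".." not in name.replace("\\", "/").split("/")


def check_archive(
    zf: zipfile.ZipFile,
    max_entries: int = MAX_ENTRIES,
    max_total: int = MAX_TOTAL_SIZE,
    max_file: int = MAX_FILE_SIZE,
) -> None:
    """Raise ValueError if the archive violates a limit or has an unsafe name."""
    infos = zf.infolist()
    if len(infos) > max_entries:
        raise ValueError(f"too many entries: {len(infos)}")
    total = 0
    for info in infos:
        if not is_safe_entry_name(info.filename):
            raise ValueError(f"unsafe entry name: {info.filename!r}")
        if info.file_size > max_file:  # declared uncompressed size
            raise ValueError(f"entry too large: {info.filename!r}")
        total += info.file_size
        if total > max_total:
            raise ValueError("total uncompressed size exceeds the limit")
```

After the check passes, entries can be read in memory with zf.read(name) or zf.open(name), without extracting anything to disk.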
7. Future Considerations
This section outlines potential extensions to the format for future discussion and development.
7.1. Use JSON instead of MessagePack for Metadata
To enhance human readability and ease of debugging, the metadata.msgpack file could be supplemented by a metadata.json file. Implementations MUST continue to accept metadata.msgpack; support for metadata.json is optional and, if implemented, the priority rules in section 3.2 apply.
7.2. Versioning (versions/ directory)
A versions/ directory could be introduced to store historical snapshots of the stage.msgpack file, enabling built-in version control and audit trails. Each snapshot could be a copy of stage.msgpack named by a timestamp or commit hash.
7.3. Sub-Stages (sub/ directory)
To avoid the inefficiency of nested ZIP files (storing a .prg inside another .prg), a sub/ directory could store additional Stage files. This would allow for complex, multi-stage projects within a single container.
Sub-stages can be nested to arbitrary depth, with each sub-stage having its own stage.msgpack file. References between stages MUST use the UUID of the sub-stage.
Storage convention and lookup: sub-stages MUST be stored under sub/<uuid>/stage.msgpack. When resolving a reference to a sub-stage UUID, implementations SHOULD look for a matching folder name under sub/ and load that folder's stage.msgpack. Example layout:
sub/
  b54c5f6c-6f28-4d65-bcc5-4c891c6dbd77/
    stage.msgpack
A sub-stage reference such as b54c5f6c-6f28-4d65-bcc5-4c891c6dbd77 resolves to sub/b54c5f6c-6f28-4d65-bcc5-4c891c6dbd77/stage.msgpack.
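Since this layout is only a proposal, the lookup can be shown as a short, speculative sketch. The helper names sub_stage_path and resolve_sub_stage are hypothetical, and the list of ZIP entry names is assumed to be available already.

```python
from typing import Optional


def sub_stage_path(uuid: str) -> str:
    """Build the entry name where a sub-stage with this UUID would be stored."""
    return f"sub/{uuid}/stage.msgpack"


def resolve_sub_stage(entry_names: list[str], uuid: str) -> Optional[str]:
    """Return the entry name of the sub-stage, or None if it is not present."""
    path = sub_stage_path(uuid)
    return path if path in entry_names else None
```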
7.3.1 Representing Sub-Stages
Because each UUID is unique, a single UUID is sufficient to represent a sub-stage.
7.4. Workspace Settings (settings.msgpack)
Application-specific settings (e.g., default node styles, view preferences) could be stored in a root-level file such as settings.msgpack, making the PRG file a self-contained workspace.
7.5. Anchors
To represent a node or a region, we need to design an anchor syntax that can uniquely identify elements within the PRG file. The proposed syntax uses UUIDs to reference specific nodes or regions.
Node UUIDs MUST be separated by ;.
# wrapped for readability
file:///home/user/project.prg#
ec32d43d-7890-4e45-a28f-b31bca4dafea;
6c95db6b-b64e-49e8-a3b0-3a39ee2588c9;
49828d7f-7d05-48d9-bcb5-a699782f9880
This example references three nodes:
- ec32d43d-7890-4e45-a28f-b31bca4dafea
- 6c95db6b-b64e-49e8-a3b0-3a39ee2588c9
- 49828d7f-7d05-48d9-bcb5-a699782f9880
Anchor syntax (recommended): anchors SHOULD be a single-line, semicolon-separated list of UUIDs with no surrounding whitespace. Trailing semicolons are NOT allowed. Example single-line form:
file:///home/user/project.prg#ec32d43d-7890-4e45-a28f-b31bca4dafea;6c95db6b-b64e-49e8-a3b0-3a39ee2588c9;49828d7f-7d05-48d9-bcb5-a699782f9880
When embedded in URLs, implementations MUST apply URL-encoding as required by the surrounding context.
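A parser for the proposed anchor syntax could look like the sketch below, which uses Python's standard urllib.parse module. The function name parse_anchor is hypothetical, and the UUID check only tests the general 8-4-4-4-12 lowercase hex shape, which is an assumption rather than a rule stated here.

```python
import re
from urllib.parse import unquote, urlsplit

# Loose shape check only (assumption): 8-4-4-4-12 lowercase hex digits.
UUID_SHAPE = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$")


def parse_anchor(url: str) -> list[str]:
    """Return the node UUIDs listed in the fragment of a PRG anchor URL."""
    fragment = unquote(urlsplit(url).fragment)
    if not fragment:
        return []  # no anchor: the URL refers to the file as a whole
    uuids = fragment.split(";")
    for item in uuids:
        # A trailing or doubled semicolon yields an empty item, which fails
        # this check, so such anchors are rejected.
        if not UUID_SHAPE.match(item):
            raise ValueError(f"invalid anchor item: {item!r}")
    return uuids
```

Calling unquote first means a percent-encoded separator (%3B) is treated the same as a literal semicolon, which seems a reasonable reading of the URL-encoding requirement.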
7.6. Fractional indexing
Source: Realtime editing of ordered sequences | Figma Blog
Instead of OT, Figma uses a trick that’s often used to implement reordering on top of a database. Every object has a real number as an index and the order of the children for an element of the tree is determined by sorting all children by their index. To insert between two objects, just set the index for the new object to the average index of the two objects on either side. We use arbitrary-precision fractions instead of 64-bit doubles so that we can’t run out of precision after lots of edits.
A similar approach could be applied here: giving each node in a stage a numeric index would make it easier to represent the order of nodes (z-index).
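To make the idea concrete, here is a minimal sketch of the averaging rule from the quoted passage, using Python's fractions.Fraction for arbitrary precision. The function name index_between and the handling of the two ends of the sequence (subtracting or adding 1) are assumptions; they are not taken from the Figma article or from this format.

```python
from fractions import Fraction
from typing import Optional


def index_between(before: Optional[Fraction], after: Optional[Fraction]) -> Fraction:
    """Return an index that sorts strictly between two neighbouring indices.

    Pass None for a missing neighbour (inserting at either end of the order).
    """
    if before is None and after is None:
        return Fraction(0)  # first node in an empty stage
    if before is None:
        return after - 1    # insert below the lowest node
    if after is None:
        return before + 1   # insert above the highest node
    if not before < after:
        raise ValueError("'before' must sort strictly below 'after'")
    return (before + after) / 2  # exact average, no precision loss


# Usage: place a new node between two existing nodes in the z-order.
low, high = Fraction(1), Fraction(2)
middle = index_between(low, high)
assert sorted([high, middle, low]) == [low, middle, high]
```

Because Fraction is exact, repeated insertions between the same two neighbours never collide, although the numerators and denominators grow with each insertion.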