Introduction#
PFS is a proprietary archive format utilized by the Artemis visual novel engine. It functions as a container for game assets, enabling efficient packaging and distribution of multiple files within a single archive.
This document presents a comprehensive technical specification of the PFS file format, derived from reverse-engineering analysis. It aims to assist developers in building tools for extracting, inspecting, and modifying PFS archives.
Related Tools#
I have developed the following tools to facilitate working with PFS files:
- pf8: A Rust library offering robust encoding and decoding capabilities for PFS files.
- pfs-rs: A command-line utility for packing and unpacking PFS archives.
PFS File Format Structure#
The PFS format employs a straightforward structure to organize file metadata and content. It begins with a Header section that indexes all contained files, followed immediately by the File Data section. This layout supports efficient random access to any file within the archive.
Header Structure#
The header contains global archive information and the index of all files.
To represent the variable-length sections, we define the following symbols:
N: The number of files (value of File Count).S_Entries: The total size in bytes of the File Entries array.S_Offsets: The total size in bytes of the File Size Offsets array (equal to8 * N).
| Offset | Size | Field Name | Description |
|---|---|---|---|
0x00 | 3 | Magic Number | File signature: pf8 (or pf6). |
0x03 | 4 | Index Size | Size of the index data (from 0x07 to end of header). |
0x07 | 4 | File Count (N) | Total number of files in the archive. |
0x0B | S_Entries | File Entries | Array of File Entry structures. |
0x0B + S_Entries | 4 | File Size Count | Number of size offsets (typically N + 1). |
0x0F + S_Entries | S_Offsets | File Size Offsets | Array of 8-byte offsets pointing to file sizes. |
0x0F + S_Entries + S_Offsets | 8 | Padding | Zero-filled padding bytes. |
0x17 + S_Entries + S_Offsets | 4 | Index End Offset | Pointer to File Size Count (relative to 0x07). |
%%{init: {'flowchart': {'nodeSpacing': 15, 'rankSpacing': 15}, 'themeVariables': {'fontSize': '12px'}}}%%
graph TD
Magic[0x00: Magic Number] --> IndexSize[0x03: Index Size]
IndexSize --> FileCount[0x07: File Count]
FileCount --> FileEntries[0x0B: File Entries]
FileEntries -- Size: S_Entries --> FileSizeCount[File Size Count]
FileSizeCount --> FileSizeOffsets[File Size Offsets]
FileSizeOffsets -- Size: S_Offsets --> Padding[Padding]
Padding --> EndOffset[Index End Offset]
style FileEntries fill:#d9d,stroke:#333,stroke-width:2px
style FileSizeOffsets fill:#0bf,stroke:#333,stroke-width:2px
File Entry Structure#
Each file in the archive is described by a variable-length entry in the File Entries section.
Note: The offsets listed below are relative to the start of the specific file entry.
| Relative Offset | Size | Field Name | Description |
|---|---|---|---|
0x00 | 4 | Name Length | The length of the filename in bytes. |
0x04 | N | File Name | The filename string. Length is defined by Name Length. |
0x04 + N | 4 | Separator | A separator field, typically filled with null bytes (0x00). |
0x08 + N | 4 | Data Offset | The absolute offset of the file’s data from the beginning of the PFS file. See Understanding Data Offsets. |
0x0C + N | 4 | File Size | The size of the file’s data in bytes. |
Understanding Data Offsets#
The Data Offset field specifies the absolute position of the file’s content within the archive.
Example:
If a file entry has a Data Offset of 0xBD and a File Size of 0x100:
- The file’s data begins at byte
0xBDof the PFS file. - The data ends at byte
0x1BD(0xBD + 0x100). - In Rust slice notation:
archive_data[0xBD .. 0x1BD].
File Size Offset Structure#
%%{init: { "packet": { "bitsPerRow": 8 } } }%%
packet
+8: "File Size Offset"
+8: "File Size Offset"
+8: "..."
The File Size Offsets section provides a secondary way to navigate the file entries, specifically pointing to the size fields.
- Each entry is an 8-byte integer.
- The value is an offset relative to the start of the File Size Offsets array itself.
- It points to the
File Sizefield of a specific File Entry.
Example:
If a File Size Offset entry contains 0x25:
- Locate the start of the File Size Offsets array (let’s say it’s at absolute offset
0xFfor simplicity, though it varies). - Add
0x25to that start position. - The resulting address points to the 4-byte
File Sizevalue of a file entry.
Encryption#
PFS archives (specifically the pf8 variant) use a custom encryption scheme to protect file data. The encryption uses a stream cipher based on XOR operations with a dynamically generated key.
Key Generation#
The encryption key is derived from the archive’s header index data using the SHA-1 hashing algorithm.
Key Generation Process:
Source Data: Extract the index data from the archive header
- Start Offset:
0x07(immediately after the magic number and index size fields) - Length: The value specified in the Index Size field (at offset
0x03) - This includes the file count, all file entries, file size offsets, padding, and index end offset
- Start Offset:
Hashing: Apply SHA-1 hash to the index data
- Algorithm: SHA-1
- Output: 20-byte (160-bit) hash digest
Result: The 20-byte SHA-1 hash serves as the encryption key
// Key generation example
let index_size = read_u32_le(&archive_data[0x03..0x07]);
let index_data = &archive_data[0x07 .. 0x07 + index_size as usize];
let key = sha1(index_data); // 20 bytes
Encryption Algorithm#
The encryption uses a simple but effective XOR stream cipher with the generated key.
Algorithm Characteristics:
- Method: XOR (Exclusive OR) cipher
- Key Stream: The 20-byte key is repeated cyclically to match the data length
- Scope: Only file data is encrypted; headers and index structures remain in plaintext
- Per-File Encryption: Each file is encrypted independently, with the key stream starting from the beginning for each file
Encryption Formula:
$$C_i = P_i \oplus K_{i \bmod L}$$Where:
- \(C_i\) is the \(i\)-th byte of ciphertext (encrypted data)
- \(P_i\) is the \(i\)-th byte of plaintext (original data)
- \(K\) is the 20-byte encryption key
- \(L = 20\) (key length in bytes)
- \(i\) is the byte position within the current file (starting from 0 for each file)
Encryption Example:
// Each file is encrypted independently
for file in files {
let mut file_data = read_file_data(file);
// XOR with key starting from offset 0 for this file
for (i, byte) in file_data.iter_mut().enumerate() {
*byte ^= key[i % key.len()];
}
write_encrypted_data(file_data);
}
For large files that are processed in chunks, the offset is tracked within that file only:
let mut offset_in_file = 0;
while let Some(chunk) = read_chunk() {
for (i, byte) in chunk.iter_mut().enumerate() {
*byte ^= key[(offset_in_file + i) % key.len()];
}
offset_in_file += chunk.len();
}
Decryption#
Decryption is identical to encryption due to the XOR cipher’s symmetric nature:
$$P_i = C_i \oplus K_{i \bmod L}$$Each file is decrypted independently using the same key, with the key stream restarting from position 0 for each file.

