Archives

 Digital Preservation Rules*

RULE NUMBER
RECORD TYPE
PRESCRIBED ACTION
SUPPORT LEVEL
NOTES
1
Word processing and presentation application files in the following formats: .doc, .docx, .docm, .ppt, .pptx, .rtf, .wpd, .odt
The State Archives converts proprietary desktop productivity application files to the PDF/A file format. It may be necessary to first convert these files to an intermediate format prior to converting them to PDF/A.
Limited
Word processing documents and slideshows in proprietary file formats should, if possible, be converted to PDF/A before being transferred to the State Archives.
2
Spreadsheet files in the following formats: .xls, .xlsx, .xlsm, .wk2, .wk4, .ods, .qpw, .wb1, .wb2, .wb3, .wq1, .wq2
The State Archives converts these files to either the .csv or PDF/A file format, depending on whether the content of the information (.csv) or its appearance (PDF/A) is paramount. It may be necessary to first convert these files to an intermediate format prior to converting them to the chosen preservation format.
Limited
The State Archives prefers that producers who maintain spreadsheet files in proprietary formats (Excel, Lotus, Quattro, etc.) convert the files to open standards prior to donating the files to the archives.
3
Portable documents in the following format: PDF/A
The State Archives maintains these files in their native formats. No transformations are enacted on these files for preservation purposes.
Full
PDF/A is the preferred version of PDF for archival preservation. PDF/A-1 (ISO 19005-1:2005) and PDF/A-2 (ISO 19005-2:2011) are both supported.
4
Portable documents in the following format: PDF
The State Archives maintains these files in their native formats. No transformations are enacted on these files for preservation purposes.
Limited
PDF files may be converted to PDF/A as time and resources allow. It is preferred that producers perform this conversion prior to donating their records to the archives.
5
Electronic publishing files (excludes PDF): .epub, .mobi, .indd, .pub
The State Archives converts electronic publishing files to the PDF/A file format. It may be necessary to first convert these files to an intermediate format prior to converting them to PDF/A.
Limited
The State Archives prefers that producers who maintain electronic publishing files in proprietary formats convert the files to an open standard such as .epub or .pdf prior to donating the files to the archives.
6
Desktop database management system files (non-SQL): .accdb, .mdb, .fm, .fmp, .odb
The State Archives maintains these files in their native formats. No transformations are enacted on these files for preservation purposes.
Basic
The State Archives prefers that producers who maintain data in proprietary desktop database management systems like MS Access convert the files to open standards prior to donating the files to the archives.
7
Plain text files (.txt)
The State Archives maintains these files in their native formats. No transformations are enacted on these files for preservation purposes.
Full
Plain text is considered to be a long-term archival preservation format.
8
Text files containing data in the following formats: .csv, .tsv
The State Archives maintains these files in their native formats. No transformations are enacted on these files for preservation purposes.
Full
Plain text is considered to be a long-term archival preservation format.
9
Text files employing markup languages: .xml, .html, .xhtml, .css, .xsl, .tex
The State Archives maintains these files in their native formats. No transformations are enacted on these files for preservation purposes.
Full
Plain text is considered to be a long-term archival preservation format.
10
Geospatial (GIS) data files: .mxd, .shp
The State Archives maintains these files in their native formats. No transformations are enacted on these files for preservation purposes.
Basic
The State Archives prefers that producers who maintain GIS data in proprietary formats convert the files to open data and/or image standards such as .kml (for data) and .tif (for images) prior to donating the files to the archives.
11
Digital Images in the following formats: .jpg, .j2c, .gif, .png, .bmp
The State Archives maintains most digital images in their native file formats (see rule 13 for exceptions). No transformations are enacted on these files for preservation purposes.
Limited
Higher spatial resolution is preferred over lower resolution; uncompressed (TIFF) or losslessly compressed (JPEG 2000) images are preferred over those with lossy compression.
12
Digital Images in the following formats: .tif, .jp2
The State Archives maintains TIFF and JPEG 2000 images in their original format.
Full
Both TIFF and JPEG 2000 are considered long-term archival preservation formats.
13
Digital Images (Raw Image Formats): .cr2, .crw, .dcr, .kdc, .nef, .orf, .pef, .raf, .srf, .x3f, .dng
Proprietary raw image formats will be converted to a standard preservation format such as TIFF. It may be necessary to first convert these files to an intermediate format such as .dng prior to converting them to a preservation format.
Limited
Higher spatial resolution is preferred over lower resolution; uncompressed (TIFF) or losslessly compressed (JPEG 2000) images are preferred over those with lossy compression.
14
Digital Audio: .mp3, .mp4, .wma, .wav, .ogg, .flac
The State Archives maintains digital audio files in their native file formats. No transformations are enacted on these files for preservation purposes.
Limited
Higher sampling rate is preferred over lower sampling rate; 24-bit sample word length is preferred over shorter; uncompressed files (WAV/BWF) or losslessly compressed files (.flac) are preferred over those with lossy compression; if compressed, AAC compression is preferred over MPEG Audio Layer III (MP3).
15
Digital Video: .avi, .flv, .mov, .mp4, .mkv, .vob
The State Archives maintains digital video files in their native file formats. No transformations are enacted on these files for preservation purposes.
Limited
Video files usually contain multiple formats within a wrapper. These formats may include audio formats, text formats (for closed captions), graphical formats, and others. Typically they also use some form of compression. Higher bit rate is preferred over lower for the same compression scheme; uncompressed video, or video using lossless compression codecs in wrappers that allow for the addition of metadata, is preferred over video using lossy codecs.
16
Other formats
The State Archives is actively monitoring the development of preservation formats for other file types. Preservation decisions for formats not addressed in this document will be made on a case-by-case basis. If you have files that exist in formats other than those discussed here, please contact the State Archives to discuss preservation options.
TBD
TBD
Note: File extensions provided for each record type above are intended to serve as examples only and do not constitute a comprehensive list of all file formats accepted by the State Archives.
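A producer preparing a transfer could encode part of the table above as a lookup keyed by file extension. The Python sketch below is illustrative only and is not a State Archives tool: the `RULES` table and the `matching_rules` function are hypothetical names, only a few rules are transcribed, and the extensions are examples rather than a complete list. PDF and PDF/A are left out because both use the .pdf extension and cannot be told apart by file name alone.

```python
from pathlib import PurePath

# Hypothetical lookup: rule number -> (support level, prescribed action, example extensions).
# Transcribed from a subset of the rules above; not a comprehensive list.
RULES = {
    1: ("Limited", "convert to PDF/A",
        {".doc", ".docx", ".docm", ".ppt", ".pptx", ".rtf", ".wpd", ".odt"}),
    7: ("Full", "keep native format", {".txt"}),
    8: ("Full", "keep native format", {".csv", ".tsv"}),
    14: ("Limited", "keep native format",
         {".mp3", ".mp4", ".wma", ".wav", ".ogg", ".flac"}),
    15: ("Limited", "keep native format",
         {".avi", ".flv", ".mov", ".mp4", ".mkv", ".vob"}),
}


def matching_rules(filename):
    """Return the sorted rule numbers whose example extensions match the file name.

    An empty list means no transcribed rule applies, which corresponds to the
    case-by-case handling described in rule 16.
    """
    suffix = PurePath(filename).suffix.lower()
    return sorted(
        number
        for number, (_level, _action, extensions) in RULES.items()
        if suffix in extensions
    )
```

An extension can appear under more than one rule (.mp4 is listed for both audio and video), so the function returns a list rather than a single rule number.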

Note on compression and encryption
The State Archives may only be able to guarantee digital preservation support at the basic level in cases where files have been compressed prior to transfer to the archives. When selecting a compression type, it is important to remember that lossy compression is irreversible and some level of detail will be lost and unrecoverable whenever it is utilized.

The State Archives is unable to preserve and provide access to encrypted files unless the appropriate passwords and/or key files necessary to decrypt the files have also been transferred to the archives.
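
Producers who package records in ZIP files may want to check for encrypted entries before transfer. The Python sketch below is one possible approach, not a State Archives procedure: the function names are hypothetical, and the check relies on bit 0 of each entry's general-purpose flag, which the ZIP format uses to mark an encrypted entry. It does not detect password protection inside other formats such as PDF or office documents.

```python
import zipfile

# Bit 0 of the ZIP general-purpose bit flag marks an encrypted entry.
ENCRYPTED_FLAG = 0x1


def encrypted_members(infos):
    """Return the names of ZIP entries whose encryption flag is set."""
    return [info.filename for info in infos if info.flag_bits & ENCRYPTED_FLAG]


def check_zip(path_or_file):
    """Open a ZIP file (path or file-like object) and list its encrypted entries."""
    with zipfile.ZipFile(path_or_file) as archive:
        return encrypted_members(archive.infolist())
```

An empty result means no entry is flagged as encrypted; any names returned identify files that would need their passwords or key files supplied alongside the transfer.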

*Adapted from Tufts University Digital Collections & Archives Preservation Rules