2017-04-17

2017-04-17 AtomicityCodec Introduction

AtomicityCodec is an XML-based grammar for declaratively specifying the outline structure of a file.  I have recently been working on it to fill a number of roles within my OS.  This is a long post explaining the what and the why; I'm leaving the actual technical details of the grammar for a future post.

The grammar is designed to achieve the following goals:

  • Define a wide variety of file formats including filesystems.
  • Provide a first-level parsing of files, which an associated codec can then use as a reference for further parsing.
  • Provide a means of identifying a file based on its content.
  • Serve as a basis for an "intelligent" hex viewer, which will have applications beyond AtomicityOS.
  • Provide a means for a basic file format, or a minor variant of an existing format, to be added to the system without needing to write any code.
  • Design a binary file parsing grammar that will be useful to a wider audience than just myself.
  • Possibly make the format usable for saving files as well as loading them.
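
To make the first few goals a little more concrete, here is a minimal Python sketch of the general idea: a declarative description drives both a first-level parse and content-based identification.  It is an illustration only.  The element names, attribute names, and type names (format, field, expect, u32le) and the parse_outline helper are invented for this example and are not AtomicityCodec's actual grammar, which is left for a future post.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical description: every element, attribute, and type name here is
# invented for illustration and is NOT AtomicityCodec's real grammar.
DESCRIPTION = """
<format name="bmp-like">
  <field name="magic" type="bytes" size="2" expect="BM"/>
  <field name="file_size" type="u32le"/>
  <field name="reserved" type="u32le"/>
  <field name="data_offset" type="u32le"/>
</format>
"""

# Map the invented integer type names onto struct format strings.
STRUCT_CODES = {"u8": "<B", "u16le": "<H", "u32le": "<I"}


def parse_outline(description, data):
    """Return (format_name, fields) if data fits the description, else None.

    fields maps each field name to (offset, size, value), which is the kind
    of outline a hex viewer or a follow-on codec could work from.
    """
    root = ET.fromstring(description)
    offset = 0
    fields = {}
    for field in root.findall("field"):
        kind = field.get("type")
        if kind == "bytes":
            size = int(field.get("size"))
            if offset + size > len(data):
                return None  # file too short to match
            value = data[offset:offset + size]
            # In this sketch, "expect" is only supported on bytes fields.
            expect = field.get("expect")
            if expect is not None and value != expect.encode("ascii"):
                return None  # content does not identify as this format
        else:
            code = STRUCT_CODES[kind]
            size = struct.calcsize(code)
            if offset + size > len(data):
                return None
            (value,) = struct.unpack_from(code, data, offset)
        fields[field.get("name")] = (offset, size, value)
        offset += size
    return root.get("name"), fields


# A made-up 14-byte header followed by padding.
sample = b"BM" + struct.pack("<III", 70, 0, 54) + bytes(56)
result = parse_outline(DESCRIPTION, sample)
```

Because the description is data rather than code, adding a minor variant of a format means editing the XML only, which is the point of the "without needing to write any code" goal above.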

Background


I have long been interested in file formats, and have always wanted a way to define, modify, and play with files easily and responsively.  There are a few hex viewer tools on the market that allow a structure definition to be applied to the data to visualise or extract values, but these tend to be rather limited.  A number of groups have attempted to create a grammar to describe binary file formats, but these projects have either been abandoned or are commercial and of limited usefulness.  So I have defined my own grammar (as with everything else).  It is still evolving (I'm considering adding iterative loops), but it is sufficiently complete to describe a host of formats in a breadth-first manner.  The grammar lets me make ad-hoc changes to how the structure of a given file is parsed and see the results of those changes in real time.