Audiveris Design Manual

BEWARE: This documentation has not been updated for a while; please consider this file obsolete.

The purpose of this design manual is to present the rationale for key design decisions within the Audiveris application.

Pixel Handling

Pixel Runs

The most basic entity handled by a lag is a run, although a run is a stand-alone entity that can be used outside of any containing lag. A run is a sequence of foreground pixels (we don't consider runs of background pixels for this application, except for the initial scale computation). All the pixels of one run are contiguous pixels in the same orientation (all runs are horizontal for a horizontal lag, and similarly in a vertical lag, all runs are defined as vertical sequences of pixels).

Foreground vs. Background

In Audiveris, lag runs are created by processing a provided image, where pixels are defined as levels of gray. The only degree of freedom is the precise gray level value that separates a rather white pixel (thus assigned to background) from a rather black pixel (thus assigned to foreground). This value is a symbolic constant in class omr.sheet.Picture, and is assigned value 227 (pixel levels are in the range 0 .. 255). Any pixel whose level is higher than or equal to this value is considered background; any pixel with a lower level is considered foreground.

Runs are created by an instance of class RunsBuilder, which needs an adapter to access individual pixels and to do whatever processing is needed at the end of each run read:
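The builder/adapter split can be illustrated with a minimal sketch. It is not the actual RunsBuilder code: the names Run, RunAdapter and SimpleRunsBuilder, and their method signatures, are hypothetical and chosen only for this example. The adapter decides whether a pixel is foreground and receives each completed run; the builder only scans rows.

```java
/** One horizontal run: 'length' contiguous foreground pixels starting at 'start' in row 'row'. */
final class Run {
    final int row;
    final int start;
    final int length;

    Run(int row, int start, int length) {
        this.row = row;
        this.start = start;
        this.length = length;
    }
}

/** Hypothetical adapter: how the builder reads pixels and reports finished runs. */
interface RunAdapter {
    /** Returns true if pixel (x, y) belongs to the foreground. */
    boolean isFore(int x, int y);

    /** Called once for each completed foreground run. */
    void endOfRun(Run run);
}

/** Scans a width x height area row by row and emits horizontal foreground runs. */
final class SimpleRunsBuilder {
    private final RunAdapter adapter;

    SimpleRunsBuilder(RunAdapter adapter) {
        this.adapter = adapter;
    }

    void buildHorizontalRuns(int width, int height) {
        for (int y = 0; y < height; y++) {
            int start = -1; // -1 means "not currently inside a run"

            for (int x = 0; x < width; x++) {
                if (adapter.isFore(x, y)) {
                    if (start < 0) {
                        start = x; // first pixel of a new run
                    }
                } else if (start >= 0) {
                    // A background pixel terminates the current run
                    adapter.endOfRun(new Run(y, start, x - start));
                    start = -1;
                }
            }

            if (start >= 0) {
                // The run reaches the right border of the area
                adapter.endOfRun(new Run(y, start, width - start));
            }
        }
    }
}
```

A vertical variant would swap the two loops, scanning each column from top to bottom. Keeping the foreground test inside the adapter means the builder does not need to know about gray levels at all.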

Lag Sections

Sections are collections of runs. Within a given section, all runs are stuck side by side. In a horizontal lag, all sections are made of runs piled one on top of the other. A vertical lag is more like Manhattan.
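As an illustration only, a section can be modelled as a starting position plus a list of runs, one run per consecutive position. The class and method names below are assumptions made for this sketch and do not claim to match the real Section class.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal run: 'length' foreground pixels starting at coordinate 'start'. */
final class Run {
    final int start;
    final int length;

    Run(int start, int length) {
        this.start = start;
        this.length = length;
    }
}

/** A section: runs located at consecutive positions, beginning at 'firstPos'. */
final class Section {
    private final int firstPos; // row of the first run in a horizontal lag, column in a vertical lag
    private final List<Run> runs = new ArrayList<>();

    Section(int firstPos) {
        this.firstPos = firstPos;
    }

    /** Adds the run located right after the current last run. */
    void append(Run run) {
        runs.add(run);
    }

    int getFirstPos() {
        return firstPos;
    }

    /** Position of the last run (firstPos - 1 while the section is still empty). */
    int getLastPos() {
        return firstPos + runs.size() - 1;
    }

    /** Weight: total number of foreground pixels in the section. */
    int getWeight() {
        int weight = 0;
        for (Run run : runs) {
            weight += run.length;
        }
        return weight;
    }

    /** Smallest start coordinate over all runs. */
    int getMinStart() {
        int min = Integer.MAX_VALUE;
        for (Run run : runs) {
            min = Math.min(min, run.start);
        }
        return min;
    }

    /** Largest stop coordinate (exclusive) over all runs. */
    int getMaxStop() {
        int max = Integer.MIN_VALUE;
        for (Run run : runs) {
            max = Math.max(max, run.start + run.length);
        }
        return max;
    }
}
```

Because the runs occupy consecutive positions, the section does not need to store a position per run: the index of a run in the list, added to firstPos, gives its position.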

Junctions

The main question is to determine when a section ends. On both edges of a section, the termination can be triggered by:

Sections are created by an instance of SectionsBuilder, which thus populates its related lag, according to the provided JunctionPolicy.

Lag instances in Audiveris

Besides a tiny lag allocated for each synthetic SymbolIcon, the Audiveris application uses three lag instances based on pixels from the same sheet picture, in the following order:

  1. "sLag", a horizontal lag created by SkewBuilder, is used for computing the global skew angle of the sheet. Note that this lag cannot be reused for the next step for two reasons: First, only runs of significant length are built, since we are interested only in long sections to determine skew trends, and thus many pixels are discarded though they are present in the original image. Second, when a rotation is needed because the sheet must be deskewed, a new (rotated) image is generated and thus all pixels are impacted.
  2. "hLag", a horizontal lag created by LinesBuilder and reused by HorizontalsBuilder, is used to retrieve staff lines and horizontal sticks (ledgers for example). At the end of horizontal processing, certain pixels are erased because they were part of the removed staff lines. New ones are created to extend objects that were crossed by the former staff lines, so that these objects get their normal appearance back, as if no staff line had ever been drawn upon them.
  3. "vLag", a vertical lag created by BarsBuilder and reused by all subsequent steps, is used to retrieve vertical sticks (barlines, stems) but also glyphs with no particular orientation.

Glyphs and Sticks

Glyphs are collections of sections. These sections are actually instances of GlyphSection, a subclass of Section, which keeps a reference back to the containing glyph. A GlyphLag is a lag augmented with a collection of glyphs.
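The relationships described above can be sketched as follows. This is a simplified illustration, not the actual class hierarchy: fields, constructors and method names are assumptions, and run storage is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Plain section; its runs are omitted to keep the sketch short. */
class Section {
    final int id;

    Section(int id) {
        this.id = id;
    }
}

/** A section that remembers which glyph currently contains it. */
class GlyphSection extends Section {
    private Glyph glyph; // null while the section is not assigned to any glyph

    GlyphSection(int id) {
        super(id);
    }

    Glyph getGlyph() {
        return glyph;
    }

    void setGlyph(Glyph glyph) {
        this.glyph = glyph;
    }
}

/** A glyph is defined by the collection of its member sections. */
class Glyph {
    private final List<GlyphSection> members = new ArrayList<>();

    /** Adds a member and points the section back to this glyph. */
    void addSection(GlyphSection section) {
        members.add(section);
        section.setGlyph(this);
    }

    List<GlyphSection> getMembers() {
        return Collections.unmodifiableList(members);
    }
}

/** A lag augmented with a collection of glyphs. */
class GlyphLag {
    final List<GlyphSection> sections = new ArrayList<>();
    final List<Glyph> glyphs = new ArrayList<>();
}
```

The back reference lets any step that walks the sections of the lag find out, in constant time, whether a section is already part of a glyph.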

Sections can be assembled into a glyph using any arbitrary logic:

    1. In step LINES, there is one LineBuilder instance per candidate staff line area to build the actual staff line. Section filtering is based on the vertical position of the horizontal section (it must be located in the staff line area).
    2. Still in step LINES, other instances of LineBuilder are also used to scan areas where staff lines are interrupted due to missing pixels. They use the same source, and therefore the same section filtering, as the main staff line retrieval.
    3. In step HORIZONTALS, one single HorizontalArea instance is used to retrieve horizontal dashes (ledgers and endings) in the whole sheet.
    4. In step BARS, one single instance of VerticalArea is used to retrieve barlines in the whole sheet. A section predicate is used to keep only the sections that are not successfully assigned (their result derives from FailureResult). [TBD: Check whether there is duplication between "result" and "assigned glyph" criteria]
    5. In step VERTICALS, one single instance of VerticalArea is used to retrieve stems in the whole sheet. A section predicate keeps only sections that are either assigned to no glyph or assigned only to noise or structure glyphs.
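The section predicate of the VERTICALS step can be sketched with a standard java.util.function.Predicate. Everything here is illustrative: the Shape values, the field names and the SectionFilters helper are hypothetical and only mirror the rule stated in the last item of the list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical subset of glyph shapes, for illustration only. */
enum Shape { NOISE, STRUCTURE, NOTE_HEAD, BEAM }

final class Glyph {
    final Shape shape;

    Glyph(Shape shape) {
        this.shape = shape;
    }
}

final class GlyphSection {
    Glyph glyph; // null when the section is assigned to no glyph
}

final class SectionFilters {
    private SectionFilters() {
    }

    /** Keeps sections assigned to no glyph, or only to a noise or structure glyph. */
    static final Predicate<GlyphSection> AVAILABLE_FOR_STEMS = section -> {
        Glyph glyph = section.glyph;
        return glyph == null
                || glyph.shape == Shape.NOISE
                || glyph.shape == Shape.STRUCTURE;
    };

    /** Returns the sections accepted by the predicate, in their original order. */
    static List<GlyphSection> select(List<GlyphSection> sections,
                                     Predicate<GlyphSection> predicate) {
        List<GlyphSection> kept = new ArrayList<>();
        for (GlyphSection section : sections) {
            if (predicate.test(section)) {
                kept.add(section);
            }
        }
        return kept;
    }
}
```

Expressing the filter as a predicate keeps the area scanning code identical across steps; only the predicate passed to it changes.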

Whether we build mere glyphs or specific sticks, the assembly is made out of a population of "available" sections, that is, sections not yet assigned to a recognized glyph, or assigned to a recognized glyph that we want to break apart. An example of such breaking can be found with the Structure shape: in the early symbol recognition step, a Structure glyph (an assembly of notes, stems and beams all connected together) can be built and recognized as a Structure. Then, in a later step, some of its sections will be reused to build vertical sticks (the stems) while the remaining sections will be assembled into leaf symbols (the note heads, the beams).

Link between glyph and its member sections

While sections in a Lag (or GlyphLag) are built once and for all, the containment link between sections and glyph is more volatile:

What physically defines a glyph is the collection of its member sections. Throughout the various glyph and stick extractions, the same collection of sections may lead to new instances of glyphs/sticks which in fact represent already existing glyphs. And since some properties may be attached to a glyph (as of today, we keep a set of forbidden shapes per glyph), we want to keep these properties even if a "new" glyph has been created out of the very same member sections. To do this, we use a GlyphSignature, based on simple physical characteristics (weight and contour box), to detect identical glyphs. The set of forbidden shapes for a given glyph is thus not directly attached to the glyph but handled by a map using the glyph signature as key.
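The signature-keyed map can be sketched as below. This is a minimal illustration under stated assumptions: the field layout of the signature, the ForbiddenShapes helper and the Shape values are hypothetical. The only essential point is that the signature defines equals and hashCode on weight and contour box, so two glyph instances built from the same sections yield equal keys.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Hypothetical subset of glyph shapes, for illustration only. */
enum Shape { NOISE, STRUCTURE, NOTE_HEAD, BEAM }

/** Value object built from simple physical characteristics: weight and contour box. */
final class GlyphSignature {
    private final int weight;
    private final int x, y, width, height; // contour box

    GlyphSignature(int weight, int x, int y, int width, int height) {
        this.weight = weight;
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof GlyphSignature)) {
            return false;
        }
        GlyphSignature that = (GlyphSignature) obj;
        return weight == that.weight && x == that.x && y == that.y
                && width == that.width && height == that.height;
    }

    @Override
    public int hashCode() {
        return Objects.hash(weight, x, y, width, height);
    }
}

/** Forbidden shapes are stored per signature, not per glyph instance. */
final class ForbiddenShapes {
    private final Map<GlyphSignature, Set<Shape>> map = new HashMap<>();

    void forbid(GlyphSignature signature, Shape shape) {
        map.computeIfAbsent(signature, key -> EnumSet.noneOf(Shape.class)).add(shape);
    }

    boolean isForbidden(GlyphSignature signature, Shape shape) {
        Set<Shape> shapes = map.get(signature);
        return shapes != null && shapes.contains(shape);
    }
}
```

With this arrangement, a shape forbidden for one glyph instance is still found when a later extraction rebuilds a "new" glyph from the same sections, since both instances produce equal signatures. Two physically different glyphs that happen to share weight and contour box would also share the entry, which is the trade-off of using such a simple signature.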