
Serialized tone data

Tonal components are sine-shaped signals that get added after IMDCT on decoding.

Header info

Tonal components have their own stereo processing, which is independent of the stereo processing of the residual spectrum. For this reason the serialized tone data has a header that is not replicated per channel. The first bit in the tone info header selects the dynamic range. In high-dynamic-range (HDR) mode, the level of each tone is chosen on a 64-step exponential amplitude scale. In low-dynamic-range (LDR) mode, only an overall level per band is chosen on the exponential scale, and the individual tones have their level chosen on a 16-level linear scale.

The number of bands with tones encoded is independent of the number of bands with residual spectral data and is given after the choice of dynamic range. For stereo streams, flags for each band follow: whether tones in this band are shared between channels, whether the master is the left or the right channel, and whether the right channel should receive a 180° phase shift.

Finally, for each channel the tone info is stored. For the master channel, tone info for all bands is stored while for the slave channel, tone info for the shared bands is missing, as it was already encoded for the master channel. Each band with at least one tone has one common optional start and optional end time for all tones in that band, while each individual tone has pitch, level and phase info.

Start/End info

For the master channel, start/end info is always directly encoded: the start time and the end time are each encoded as one bit indicating whether there is a start or end time, followed (if that bit is set) by 5 bits for the time itself. On the slave channel, one mode bit indicates whether the start/end info is stored for the slave just as it is for the master channel, or whether the master channel info should be copied to the slave channel.

Tone count info

For each band, the count of tones in this band is encoded in one out of two modes for the master channel, or one out of four modes for the slave channel. The sum of all tone counts (shared master/slave bands have their tones counted only once) should not be bigger than 48.

Encoding Modes

0: plain

For each band with tones, a four-bit number tells the number of tones in that band.

1: variable-length encoded

For each band with tones, a VLC symbol is stored giving the number of tones in that band (maximum is 7 in this case).

2 (only in slave channel): variable-length encoded difference to master channel

For each band with tones, a VLC symbol is stored telling the difference of the tone count between master and slave channel. The difference is a signed 3-bit number, while the tone count is 4 bits wide, so wraparound occurs after 15.

3 (only in slave channel): clone master channel info

No data is stored in this case.
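As an illustration of mode 2 and of the limit of 48 tones, here is a minimal Python sketch. It assumes the VLC symbol has already been decoded into a signed delta (the VLC tables are not given on this page), and the function names are hypothetical.

```python
MAX_TONES_TOTAL = 48  # upper bound on the summed tone counts, as stated above


def slave_count_from_delta(master_count: int, delta: int) -> int:
    """Apply a signed delta (assumed already VLC-decoded) to the master count.

    The tone count is a 4-bit value, so the sum wraps around after 15.
    """
    return (master_count + delta) & 0xF


def total_tones(master_counts, slave_counts, shared) -> int:
    """Sum the tone counts, counting shared bands only once (via the master)."""
    total = 0
    for m, s, is_shared in zip(master_counts, slave_counts, shared):
        total += m if is_shared else m + s
    return total
```

A decoder following this sketch would compare the result of total_tones() against MAX_TONES_TOTAL before allocating tones.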

Tone pitches

For each tone, the pitch is stored, using either near-plain encoding or difference-to-master encoding (the latter is available only on the slave channel).

Encoding Modes

0: near-plain

If there is more than one tone in the band, a bit flag tells whether the tones are stored in ascending or descending pitch order (for a single tone, order doesn't matter). Tone pitches are numbers between 0 and 1023; duplicates are encodable (i.e. the same pitch twice in succession). The first tone is always encoded using ten bits. In ascending mode, each further tone is encoded with a bit count that depends on the pitch of the previous tone. If the previous tone was below 512, 10 bits are used. If it was at or above 512 but below 768, 9 bits are used and 512 is added to the number to obtain the pitch, and so on, down to 2 bits if the previous tone was between 1020 (inclusive) and 1022 (exclusive), where 1020 is added to obtain the pitch, and finally 1 bit if the previous tone was 1022 or 1023, using a base of 1022. Following the pattern, a zero-bit encoding would be expected if the previous tone was 1023; this is not the case.

In descending mode, tones are encoded with 10 bits if the previous tone has a pitch of 512 or above, with 9 bits if the previous tone has a pitch of 256 or above, and so on, down to a 1-bit encoding if the previous tone had a pitch of one or zero. The pitches are then reversed (to obtain increasing order again) before they are stored into the tones. This means that even with descending storage order, the tone with the lowest index is the one with the lowest pitch.
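The near-plain scheme can be sketched in Python as follows. This is only an illustration under assumptions: BitReader is a hypothetical helper working on a string of '0'/'1' characters, MSB-first bit order within a field is assumed, and the 1-bit case in ascending mode is taken to use a base of 1022 so that the pitch stays within 0..1023.

```python
class BitReader:
    """Hypothetical MSB-first bit reader over a string of '0'/'1' characters."""

    def __init__(self, bits: str):
        self.bits = bits
        self.pos = 0

    def read(self, n: int) -> int:
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) != n:
            raise EOFError("not enough bits")
        self.pos += n
        return int(chunk, 2)


def ascending_bits_and_base(prev: int):
    """Bit count and additive base for the next pitch in ascending mode."""
    if prev < 512:
        return 10, 0
    # One bit fewer for every leading 1 bit of prev, but never zero bits.
    nbits = max((1023 - prev).bit_length(), 1)
    return nbits, 1024 - (1 << nbits)


def descending_bits(prev: int) -> int:
    """Bit count for the next pitch in descending mode (at least 1 bit)."""
    return max(prev.bit_length(), 1)


def decode_band_pitches(reader: BitReader, tone_count: int):
    """Decode the pitches of one band; the result is in ascending order."""
    # The order flag is only present for more than one tone; set = descending.
    descending = tone_count > 1 and reader.read(1) == 1
    pitches = []
    for i in range(tone_count):
        if i == 0:
            pitches.append(reader.read(10))
        elif descending:
            pitches.append(reader.read(descending_bits(pitches[-1])))
        else:
            nbits, base = ascending_bits_and_base(pitches[-1])
            pitches.append(base + reader.read(nbits))
    if descending:
        pitches.reverse()  # lowest tone index gets the lowest pitch
    return pitches
```

For example, after a pitch of 600 the next ascending pitch takes 9 bits with a base of 512, so a pitch of 700 is stored as the 9-bit value 188.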

1: variable-length-encoded difference-to-master

In each band, the slave pitch of every tone is encoded as the difference to the master pitch with the same tone index in that band (if present), or to the master tone with the highest index that is present, or, if there are no master tones at all, as the difference to 0. Difference application wraps at 1024.
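The choice of reference pitch and the wraparound can be sketched like this in Python. The deltas are assumed to be already VLC-decoded signed integers, and the function name is hypothetical.

```python
def slave_pitches_from_deltas(master_pitches, deltas):
    """Rebuild the slave pitches of one band from deltas to the master pitches.

    master_pitches: pitches of the master tones in this band (may be empty).
    deltas: one signed, already VLC-decoded delta per slave tone (assumption).
    """
    pitches = []
    for index, delta in enumerate(deltas):
        if index < len(master_pitches):
            reference = master_pitches[index]   # same tone index exists
        elif master_pitches:
            reference = master_pitches[-1]      # highest master index present
        else:
            reference = 0                       # no master tones at all
        pitches.append((reference + delta) % 1024)  # wraps at 1024
    return pitches
```

With master pitches [100, 1020] and deltas [5, 10, -30], the third slave tone has no master tone with the same index and therefore uses 1020 as its reference.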

Tone Linking

For the compression of level information in the slave channel, tones in the slave channel are linked to tones in the master channel based on their pitch. The linking algorithm works as follows:

  • Look for the tone(s) in the master channel which have the lowest absolute deviation in pitch from the current slave tone. If more than one tone with the same absolute deviation is found, pick the first one. If the absolute deviation in pitch is less than 8, link to that tone.
  • Otherwise, link to the tone with the same index inside the band of the master channel, if that exists.
  • Otherwise, don't link the tone at all.
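The three linking steps above could be written as the following Python sketch. It assumes that the search is restricted to the master tones of the same band, which the text does not state explicitly; the function name is hypothetical.

```python
from typing import Optional, Sequence


def link_tone(slave_pitch: int, slave_index: int,
              master_pitches: Sequence[int]) -> Optional[int]:
    """Return the index of the linked master tone in this band, or None.

    master_pitches holds the master tone pitches of the same band
    (an assumption of this sketch).
    """
    if master_pitches:
        # min() returns the first index on ties, matching "pick the first one".
        best = min(range(len(master_pitches)),
                   key=lambda i: abs(master_pitches[i] - slave_pitch))
        if abs(master_pitches[best] - slave_pitch) < 8:
            return best
    if slave_index < len(master_pitches):
        return slave_index  # fall back to the tone with the same index
    return None             # no link at all
```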

Tone Levels (HDR mode)

In high-dynamic-range mode, the level of each tone is stored on a logarithmic scale. The value between 0 and 63 can be encoded in one of four modes, the last two being available only on the slave channel (as they refer to linked tones).

Encoding Modes

0: direct encoding

The level of each tone in each band is encoded as a plain 6-bit number.

1: variable-length encoding

The level of each tone in each band is stored using a variable-length code. The possible level values range between 20 and 51 in this case.

2 (only in slave channel): variable-length encoded difference to master

The level of each tone in each band is stored as the variable-length-encoded difference to the level of the linked tone in the master channel, which is assumed to be 34 if there is no linked tone. The difference application does not wrap.

3 (only in slave channel): clone master channel data

The level of each tone that is linked to a master tone is copied from the master tone level, unlinked tones get a default level of 32.
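The two slave-only HDR modes differ in their fallback value for unlinked tones (34 versus 32). A small Python sketch, assuming the delta is already VLC-decoded and using a hypothetical function name; it performs no clamping, because the text only says that the addition does not wrap and does not say what happens outside 0..63:

```python
from typing import Optional

HDR_DELTA_FALLBACK = 34  # reference level in mode 2 for an unlinked tone
HDR_CLONE_FALLBACK = 32  # level in mode 3 for an unlinked tone


def hdr_slave_level(mode: int, linked_master_level: Optional[int],
                    delta: int = 0) -> int:
    """Level of a slave tone in HDR mode 2 or 3.

    linked_master_level is None when the tone is not linked to a master tone.
    """
    if mode == 2:
        if linked_master_level is None:
            reference = HDR_DELTA_FALLBACK
        else:
            reference = linked_master_level
        return reference + delta  # plain addition, no wraparound
    if mode == 3:
        if linked_master_level is None:
            return HDR_CLONE_FALLBACK
        return linked_master_level
    raise ValueError("modes 0 and 1 store the level directly")
```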

Band Base Levels (LDR mode)

In low-dynamic-range mode, the base level of all tones in a band is described by one common value on a logarithmic scale, with each individual tone having an additional linearly scaled level. This allows more fine-grained level control if all tones in a band have approximately the same level. The encoding modes are similar, but not equivalent, to the level info of the HDR mode:

Encoding Modes

0: direct encoding

The base level of each band is encoded as a plain 6-bit number.

1: variable-length encoding

The base level of each band is stored using a variable-length code. The possible level values range between 24 and 55 in this case.

2 (only in slave channel): variable-length encoded difference to master

The base level of each band is stored as the variable-length-encoded difference to the base level of the corresponding band in the master channel, which is assumed to be 44 if the master channel has no tones. Difference application does not wrap.

3 (only in slave channel): clone master channel data

The base level of each band is copied from the corresponding master base level; bands without tones in the master channel get a base level of 49.

Tone Levels (LDR mode)

The individual linearly scaled tone levels (on a scale between 0 and 15) in LDR mode are stored similarly to the logarithmic tone levels in HDR mode:

Encoding Modes

0: direct encoding

The level of each tone is encoded as a plain 4-bit number.

1: variable-length encoding

The level of each tone is encoded using a variable-length code. Bands with just one tone use a different code than other bands, probably because with a single tone the logarithmically coded base level can be adjusted more precisely to match that tone's level.

2 (only in slave channel): variable-length encoded difference to master

The level of each tone in each band is stored as the variable-length-encoded difference to the level of the linked tone in the master channel, which is assumed to be 12 if there is no linked tone. Difference application wraps around.

3 (only in slave channel): clone master channel data

The level of each tone that is linked to a master tone is copied from the master tone level, unlinked tones get a default level of 14.
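The slave-only LDR modes can be sketched side by side to show where they differ: base levels use fallbacks of 44 and 49 and do not wrap, tone levels use fallbacks of 12 and 14 and do wrap. This Python sketch assumes already VLC-decoded deltas and assumes that the tone level wraps modulo 16 (the text only says that it wraps; 16 follows from the 0..15 scale). All names are hypothetical.

```python
from typing import Optional

BASE_DELTA_FALLBACK = 44  # mode 2, master band has no tones
BASE_CLONE_FALLBACK = 49  # mode 3, master band has no tones
TONE_DELTA_FALLBACK = 12  # mode 2, tone is unlinked
TONE_CLONE_FALLBACK = 14  # mode 3, tone is unlinked


def ldr_slave_base_level(mode: int, master_base: Optional[int],
                         delta: int = 0) -> int:
    """Band base level of the slave channel for mode 2 or 3 (no wraparound)."""
    if mode == 2:
        ref = BASE_DELTA_FALLBACK if master_base is None else master_base
        return ref + delta
    if mode == 3:
        return BASE_CLONE_FALLBACK if master_base is None else master_base
    raise ValueError("modes 0 and 1 store the base level directly")


def ldr_slave_tone_level(mode: int, linked_level: Optional[int],
                         delta: int = 0) -> int:
    """Linear tone level of a slave tone for mode 2 or 3."""
    if mode == 2:
        ref = TONE_DELTA_FALLBACK if linked_level is None else linked_level
        return (ref + delta) & 0xF  # assumed 4-bit wraparound
    if mode == 3:
        return TONE_CLONE_FALLBACK if linked_level is None else linked_level
    raise ValueError("modes 0 and 1 store the tone level directly")
```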

Tone Phase

The phase is encoded as plain 5-bit-number for each tone.

Encoding

Tone info header

  • 1 bit: Chooses high-dynamic-range mode if set, low-dynamic-range mode if clear
  • symbol from bands with tones tree (in the range 1..16 bands).
  • in two-channel substreams:
    • 1 bit: If clear, never clone information from master to slave for any band, otherwise
      • 1 bit: If clear, clone information from master to slave for all bands, otherwise
        • 1 bit per band with tone data: If set, clone data for this band
    • 1 bit: If clear, master channel is the left channel for all bands, otherwise
      • 1 bit: If clear, master channel is the right channel for all bands, otherwise
        • 1 bit per band with tone data: If set, master channel is the right channel
    • 1 bit: If clear, use right channel as-is for all bands, otherwise
      • 1 bit: If clear, apply 180° phase shift (=negation) to right channel for all bands, otherwise
        • 1 bit per band with tone data: If set, right channel is negated
  • for each channel:
    • per-channel tone info (see below)
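All three stereo flag groups in the header (clone, master selection, phase negation) follow the same three-level pattern: none, all, or one bit per band. A minimal Python sketch of that pattern, with a hypothetical BitReader helper and function name:

```python
class BitReader:
    """Hypothetical MSB-first bit reader over a string of '0'/'1' characters."""

    def __init__(self, bits: str):
        self.bits = bits
        self.pos = 0

    def read(self, n: int) -> int:
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) != n:
            raise EOFError("not enough bits")
        self.pos += n
        return int(chunk, 2)


def read_band_flags(reader: BitReader, num_bands: int):
    """Read one flag group and return one boolean per band with tone data."""
    if reader.read(1) == 0:
        return [False] * num_bands   # flag is clear for every band
    if reader.read(1) == 0:
        return [True] * num_bands    # flag is set for every band
    return [reader.read(1) == 1 for _ in range(num_bands)]  # explicit per band
```

Under this sketch, a two-channel header would call read_band_flags() three times in a row: once for the clone flags, once for the master-is-right flags and once for the negate-right flags.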

Per-channel tone info

  • per-band data:
    • start/end positions
    • counts of tones in the bands
  • per-tone data:
    • pitches
    • if in HDR mode
      • tone levels (HDR)
    • otherwise
      • band base levels (LDR)
      • tone levels (LDR)
    • phase

Start/End Positions

  • if on slave channel:
    • 1 bit to choose cloning mode if set, or plain encoding if clear
  • if in plain encoding mode (always on master channel)
    • for each band that might have tone data (all bands up to the highest band with tones in the master channel, all uncloned bands in the slave channel)
      • 1 bit: enable start ramp, if set:
        • 5 bits: start ramp position.
      • 1 bit: enable end ramp, if set:
        • 5 bits: end ramp position.
  • otherwise (in clone mode)
    • no data
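A Python sketch of the start/end layout listed above. It is simplified: the caller is assumed to pass the number of bands that might have tone data, and in clone mode the master info is copied as a whole. BitReader and the function names are hypothetical, and None stands for a disabled ramp.

```python
class BitReader:
    """Hypothetical MSB-first bit reader over a string of '0'/'1' characters."""

    def __init__(self, bits: str):
        self.bits = bits
        self.pos = 0

    def read(self, n: int) -> int:
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) != n:
            raise EOFError("not enough bits")
        self.pos += n
        return int(chunk, 2)


def read_band_envelope(reader: BitReader):
    """Read (start, end) for one band; each is a 5-bit position or None."""
    start = reader.read(5) if reader.read(1) else None
    end = reader.read(5) if reader.read(1) else None
    return start, end


def read_start_end(reader: BitReader, num_bands: int,
                   is_slave: bool, master_info=None):
    """Read the start/end info of one channel as a list of (start, end)."""
    if is_slave and reader.read(1) == 1:
        return list(master_info or [])  # clone mode: no further data
    return [read_band_envelope(reader) for _ in range(num_bands)]
```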

Band Tone Count

  • on master channel
    • 1 bit: coding mode
  • on slave channel
    • 2 bits: coding mode
  • encoded tone counts (see directly below)

Coding Mode 0: direct encoding

  • for each band that might have tone data:
    • 4 bit tone count

Coding Mode 1: variable-length encoding

Coding Mode 2 (slave only): variable-length delta-to-master encoding

  • for each band that might have tone data:

Coding Mode 3 (slave only): clone master

  • no data

Tone pitches

  • On slave channel only:
    • 1 bit: coding mode (master always uses mode 0)
  • encoded pitch info (see directly below)

Coding mode 0: near-direct encoding

  • for each band that has a non-zero tone count:
    • if the tone count is bigger than 1:
      • 1 bit, chooses "decreasing" mode if set, or "increasing" mode if clear. (for only one tone, order doesn't matter)
    • for each tone in that band:
      • up to ten bits: near-direct encoded pitch. In ascending mode, leading 1 bits of the current tone pitch that were already set in the previous pitch are omitted (note that the count of leading 1 bits is monotonically increasing). Analogously, in descending mode, leading 0 bits of the current tone pitch that were already clear in the previous pitch are omitted.

Coding mode 1: difference-to-master

Tone levels (HDR)

  • on master channel
    • 1 bit: coding mode
  • on slave channel
    • 2 bits: coding mode
  • encoded level info (see directly below)

Coding mode 0: direct encoding

  • for each tone in each band:
    • 6 bits: binary encoded tone level

Coding mode 1: variable-length encoding

Coding mode 2 (slave only): variable-length encoded difference to master

Coding mode 3 (slave only): clone master

  • no data

Band base levels (LDR)

  • on master channel
    • 1 bit: coding mode
  • on slave channel
    • 2 bits: coding mode
  • encoded level info (see directly below)

Coding mode 0: direct encoding

  • for each band:
    • 6 bits: binary encoded base level

Coding mode 1: variable-length encoding

  • for each band:

Coding mode 2 (slave only): variable-length encoded difference to master

Coding mode 3 (slave only): clone master

  • no data

Tone levels (LDR)

  • on master channel
    • 1 bit: coding mode
  • on slave channel
    • 2 bits: coding mode
  • encoded level info (see directly below)

Coding mode 0: direct encoding

  • for each tone in each band:
    • 4 bits: binary encoded tone level

Coding mode 1: variable-length encoding

Coding mode 2 (slave only): variable-length encoded difference to master

Coding mode 3 (slave only): clone master

  • no data

Tone phase

  • for each tone in each band:
    • 5 bit phase value
atrac3p/serialized_tone_data.txt · Last modified: 2010/10/25 22:28 by megadiscman
