This code belongs to the family of two-dimensional codes; it can encode up to 2335 alphanumeric characters on a very small surface. Encoding is done in two stages: first the data is converted into 8-bit "codewords" (high-level encoding), then the codewords are converted into small black and white squares (low-level encoding). An error correction system is also included; it makes it possible to reconstruct data that has been badly printed, erased, blurred or torn off. In the rest of this text, the word "codeword" is shortened to CW.
The symbol is a square or rectangular array of rows and columns. Each cell is a small square, black for a bit set to 1 and white for a bit set to 0. The side length of one cell is called the module.
In what follows we use these operators: + for addition, x for multiplication, \ for integer division, and MOD for the remainder of the integer division.
There are 24 sizes of square symbol and 6 sizes of rectangular symbol. The following table gives the basic values for each symbol size.
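Square symbols

| Symbol size | Number of data regions | Number of Reed-Solomon CWs | Number of blocks |
|---|---|---|---|
| 10x10 | 1 | 5 | 1 |
| 12x12 | 1 | 7 | 1 |
| 14x14 | 1 | 10 | 1 |
| 16x16 | 1 | 12 | 1 |
| 18x18 | 1 | 14 | 1 |
| 20x20 | 1 | 18 | 1 |
| 22x22 | 1 | 20 | 1 |
| 24x24 | 1 | 24 | 1 |
| 26x26 | 1 | 28 | 1 |
| 32x32 | 2x2 | 36 | 1 |
| 36x36 | 2x2 | 42 | 1 |
| 40x40 | 2x2 | 48 | 1 |
| 44x44 | 2x2 | 56 | 1 |
| 48x48 | 2x2 | 68 | 1 |
| 52x52 | 2x2 | 2 x 42 | 2 |
| 64x64 | 4x4 | 2 x 56 | 2 |
| 72x72 | 4x4 | 4 x 36 | 4 |
| 80x80 | 4x4 | 4 x 48 | 4 |
| 88x88 | 4x4 | 4 x 56 | 4 |
| 96x96 | 4x4 | 4 x 68 | 4 |
| 104x104 | 4x4 | 6 x 56 | 6 |
| 120x120 | 6x6 | 6 x 68 | 6 |
| 132x132 | 6x6 | 8 x 62 | 8 |
| 144x144 | 6x6 | 10 x 62 | 10 |

Rectangular symbols

| Symbol size | Number of data regions | Number of Reed-Solomon CWs | Number of blocks |
|---|---|---|---|
| 8x18 | 1 | 7 | 1 |
| 8x32 | 1x2 | 11 | 1 |
| 12x26 | 1 | 14 | 1 |
| 12x36 | 1x2 | 18 | 1 |
| 16x36 | 1x2 | 24 | 1 |
| 16x48 | 1x2 | 28 | 1 |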
Each region has a one-module-wide perimeter. The left and bottom sides are entirely black; the right and top sides are made up of alternating black and white squares.
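As an illustration of the perimeter rule, here is a minimal Python sketch that builds the border of a single region. The function name `region_border` is hypothetical, and the phase of the alternating sides (the top right corner module is white) is an assumption based on the usual ECC 200 convention, not something stated in this text. Interior cells are simply left at 0.

```python
def region_border(rows, cols):
    """Return a rows x cols grid (1 = black, 0 = white) holding only the perimeter."""
    grid = [[0] * cols for _ in range(rows)]
    # Top side: alternating, starting black at the left (assumed phase).
    for x in range(cols):
        grid[0][x] = 1 if x % 2 == 0 else 0
    # Right side: alternating, white at the top right corner (assumed phase).
    for y in range(rows):
        grid[y][cols - 1] = 1 if y % 2 == 1 else 0
    # Left and bottom sides: entirely black.
    for y in range(rows):
        grid[y][0] = 1
    for x in range(cols):
        grid[rows - 1][x] = 1
    return grid


if __name__ == "__main__":
    for row in region_border(10, 10):
        print("".join("#" if cell else "." for cell in row))
```

The interior of the grid is where the CWs are placed afterwards.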
Each CW is placed in the matrix (if there are several regions, they are assembled to form a single matrix) along parallel 45-degree diagonal lines, and the top left corner is always as shown below.
In this image, note that CWs 2, 5 and 6 have a regular shape. CWs 1, 3 and 4 are truncated, and the remainder of each of these CWs is carried over to the other side of the symbol. Here is the complete placement for the 8 x 8 matrix:
Note in this image that bit 8 of each CW sits under the parallel 45-degree diagonal lines. Corner and border conditions are very intricate and differ for each matrix size; fortunately the Data Matrix standard gives us an algorithm to perform the placement.
The high-level encoding supports 6 compaction modes; ASCII mode is divided into 3 submodes:

| Compaction mode | Data to encode | Compaction rate |
|---|---|---|
| ASCII | ASCII characters 0 to 127 | 1 byte per CW |
| ASCII extended | ASCII characters 128 to 255 | 0.5 byte per CW |
| ASCII numeric | ASCII digits | 2 bytes (digits) per CW |
| C40 | Uppercase alphanumeric | 1.5 bytes per CW |
| TEXT | Lowercase alphanumeric | 1.5 bytes per CW |
| X12 | ANSI X12 | 1.5 bytes per CW |
| EDIFACT | ASCII characters 32 to 94 | 1.33 bytes per CW |
| BASE 256 | ASCII characters 0 to 255 | 1 byte per CW |
The default encoding method is ASCII. Some special CWs make it possible to switch between the encoding methods:

| Codeword | Data or function |
|---|---|
| 1 to 128 | ASCII data |
| 129 | Padding |
| 130 to 229 | Pair of digits: 00 to 99 |
| 230 | Switch to C40 method |
| 231 | Switch to Base 256 method |
| 232 | FNC1 character |
| 233 | Structure of several symbols |
| 234 | Reader programming |
| 235 | Shift to extended ASCII for one character |
| 236 | Macro |
| 237 | Macro |
| 238 | Switch to ANSI X12 method |
| 239 | Switch to TEXT method |
| 240 | Switch to EDIFACT method |
| 241 | Extended Channel Interpretation character |
| 254 | If the ASCII method is in force: end of data, the next CWs are pad CWs. If another method is in force: switch back to the ASCII method, or indicate the end of data |
If the symbol is not full, pad CWs are required. After the last data CW, the 254 CW indicates the end of the data or the return to the ASCII method. The first padding CW is 129 and the following padding CWs are computed with the 253-state algorithm.
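The 253-state algorithm is not detailed in this text. As a sketch, assuming the usual formulation from the Data Matrix specification (a pseudo-random offset of 149 x P MOD 253 + 1, where P is the 1-based position of the pad CW in the data stream), padding could look like this in Python. The function names `pad_253_state` and `add_padding` are hypothetical.

```python
def pad_253_state(position):
    """Pad CW for the given 1-based position (assumed 253-state formulation)."""
    pseudo_random = (149 * position) % 253 + 1
    value = 129 + pseudo_random
    return value if value <= 254 else value - 254


def add_padding(data_cws, capacity):
    """Fill data_cws up to capacity: first pad is 129, the next ones are randomized."""
    cws = list(data_cws)
    if len(cws) < capacity:
        cws.append(129)
    while len(cws) < capacity:
        cws.append(pad_253_state(len(cws) + 1))
    return cws


if __name__ == "__main__":
    # Three data CWs placed in a symbol that holds five data CWs.
    print(add_padding([142, 164, 186], 5))  # [142, 164, 186, 129, 115]
```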
● ASCII character in the range 0 to 127
CW = "ASCII value" + 1
● Extended ASCII character in the range 128 to 255
A first CW with the value 235 and a second CW with the value "ASCII value" - 127
● Pair of digits 00, 01, 02 ..... 99
CW = "Pair of digits numerical value" + 130
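The three ASCII rules above can be sketched in Python as follows. This is a minimal illustration, not a complete encoder: the function name `encode_ascii` is hypothetical, and it greedily pairs digits from left to right without trying to choose the best mode.

```python
def encode_ascii(data):
    """Encode a bytes object into ASCII-mode CWs using the three rules above."""
    cws = []
    i = 0
    while i < len(data):
        byte = data[i]
        is_digit = 48 <= byte <= 57
        next_is_digit = i + 1 < len(data) and 48 <= data[i + 1] <= 57
        if is_digit and next_is_digit:
            # Pair of digits: numerical value + 130
            cws.append((byte - 48) * 10 + (data[i + 1] - 48) + 130)
            i += 2
        elif byte < 128:
            # Plain ASCII: value + 1
            cws.append(byte + 1)
            i += 1
        else:
            # Extended ASCII: 235, then value - 127
            cws.extend([235, byte - 127])
            i += 1
    return cws


if __name__ == "__main__":
    print(encode_ascii(b"123456"))  # [142, 164, 186]
```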
C40 and TEXT modes are similar: only the uppercase and lowercase characters are swapped.
In these modes 3 data characters are compacted into 2 CWs. In C40 and TEXT modes, 3 shift characters make it possible to select another character set for the next character.
The 16-bit value of a CW pair is computed as follows:
Value = C1 * 1600 + C2 * 40 + C3 + 1, where C1, C2 and C3 are the 3 character values to compact.
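The packing formula can be sketched in Python as below. Two points are assumptions that this text does not spell out: the 16-bit value is split into a high byte followed by a low byte, and the basic-set values used by the hypothetical helper `c40_basic_value` (space = 3, digits = 4 to 13, uppercase letters = 14 to 39) follow the C40 basic set as commonly documented.

```python
def pack_triplet(c1, c2, c3):
    """Pack three C40/TEXT character values into a pair of CWs."""
    value = c1 * 1600 + c2 * 40 + c3 + 1
    return value // 256, value % 256  # assumed order: high byte, then low byte


def c40_basic_value(ch):
    """Value of a character in the C40 basic set (assumed table, shifts not handled)."""
    if ch == " ":
        return 3
    if "0" <= ch <= "9":
        return ord(ch) - ord("0") + 4
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 14
    raise ValueError("character needs a shift set: %r" % ch)


if __name__ == "__main__":
    values = [c40_basic_value(ch) for ch in "AIM"]
    print(values, pack_triplet(*values))  # [14, 22, 26] (91, 11)
```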
The 254 CW indicates a return to the ASCII method, except if this mode fills the symbol completely.
In C40 and TEXT mode, a pad character with value 0 can be added to the last 2 characters in order to form a pair of CWs.
If only one character remains to be encoded in C40 or TEXT mode, or 2 characters in X12 mode, it (they) must be encoded with the ASCII method; but if a single free CW remains in the symbol before the error correction CWs, that CW is assumed to be encoded using the ASCII method without the 254 CW.
The "Upper Shift" character makes it possible to encode extended ASCII characters.
● Generate code "1" to switch to set 2, then code 30, which is the "upper shift" code.
● Subtract 128 from the ASCII value of the character to encode; this gives a non-extended character.
● Encode this character normally, changing the set if necessary.
In this mode 4 data characters are compacted into 3 CWs. Each EDIFACT character is coded with 6 bits, which are the last 6 bits of its ASCII value.

| EDIFACT value | ASCII value | Comment |
|---|---|---|
| 0 to 30 | 64 to 94 | EDIFACT value = ASCII value - 64 |
| 31 | | End of data, return to ASCII mode |
| 32 to 63 | 32 to 63 | EDIFACT value = ASCII value |
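Here is a minimal Python sketch of the 4-characters-into-3-CWs packing. It assumes the four 6-bit values are concatenated most significant first, and it only handles a complete group of 4 characters. The function names are hypothetical.

```python
def edifact_value(ch):
    """6-bit EDIFACT value: the last 6 bits of the ASCII value (32 to 94 only)."""
    code = ord(ch)
    if not 32 <= code <= 94:
        raise ValueError("not an EDIFACT character: %r" % ch)
    return code & 0x3F


def pack_edifact(chars):
    """Pack exactly 4 EDIFACT characters into 3 CWs (assumed bit order: first char highest)."""
    if len(chars) != 4:
        raise ValueError("this sketch only packs full groups of 4 characters")
    bits = 0
    for ch in chars:
        bits = (bits << 6) | edifact_value(ch)
    return [(bits >> 16) & 0xFF, (bits >> 8) & 0xFF, bits & 0xFF]


if __name__ == "__main__":
    print(pack_edifact("DATA"))  # [16, 21, 1]
```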
This mode can encode any byte.
After the 231 CW, which switches to "base 256" mode, there is a length field. This field is built with 1 or 2 bytes.
Let N be the number of data bytes to encode:
If N < 250, a single byte is used; its value is N (from 0 to 249).
If N >= 250, two bytes are used; the value of the first one is (N \ 250) + 249 (values from 250 to 255) and the value of the second one is N MOD 250 (values from 0 to 249).
If the N data bytes fill the symbol to its end, the value of the length byte is 0.
Moreover, each CW (including the length field) must be computed with the 255-state algorithm.
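The length field rules translate directly into code. The 255-state algorithm itself is not detailed in this text; the sketch below assumes the usual formulation from the Data Matrix specification (a pseudo-random offset of 149 x P MOD 255 + 1 added to the value, where P is the 1-based position of the CW in the data stream). Both function names are hypothetical.

```python
def base256_length_field(n):
    """Length field for n data bytes: one byte below 250, two bytes otherwise."""
    if n < 0:
        raise ValueError("length must not be negative")
    if n < 250:
        return [n]
    return [n // 250 + 249, n % 250]


def randomize_255_state(value, position):
    """Apply the assumed 255-state randomization to one base 256 CW."""
    pseudo_random = (149 * position) % 255 + 1
    total = value + pseudo_random
    return total if total <= 255 else total - 256


if __name__ == "__main__":
    print(base256_length_field(10))     # [10]
    print(base256_length_field(1555))   # [255, 55]
    print(randomize_255_state(250, 2))  # 38
```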
The correction system is based on Reed-Solomon codes, which delight math students and terrify everyone else ...
The number of correction CWs depends on the matrix size; more precisely, it depends on the block size.
Reed-Solomon codes are based on a polynomial whose degree is the number of error correction CWs used. For example, with the 8 x 8 matrix we use a polynomial like this: $x^5 + ax^4 + bx^3 + cx^2 + dx + e$. The numbers a, b, c, d and e are the coefficients of the polynomial (called "factors" in the rest of this text).
For information, the polynomial is $(x - 2)(x - 2^2)(x - 2^3) \cdots (x - 2^k)$. We expand it using Galois field arithmetic on each factor...
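To make the expansion concrete, here is a Python sketch that multiplies out the k factors. It assumes the Galois field GF(256) built on the polynomial $x^8 + x^5 + x^3 + x^2 + 1$ (decimal 301), which is the field I understand Data Matrix to use; this text does not name it. In this field subtraction is the same as XOR, so $(x - 2^i)$ is handled as $(x + 2^i)$. The names `gf_mul` and `generator_factors` are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(256), reducing by 301 (assumed field polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 301
        b >>= 1
    return result


def generator_factors(k):
    """Coefficients of (x - 2)(x - 2^2)...(x - 2^k), lowest power first, leading 1 dropped."""
    poly = [1]
    root = 1
    for _ in range(k):
        root = gf_mul(root, 2)  # 2^1, 2^2, 2^3, ...
        product = [0] * (len(poly) + 1)
        for i, coef in enumerate(poly):
            product[i] ^= gf_mul(coef, root)  # root * poly
            product[i + 1] ^= coef            # x * poly
        poly = product
    return poly[:-1]


if __name__ == "__main__":
    print(generator_factors(5))  # [228, 48, 15, 111, 62]
```

With k = 5 the result gives the five numbers e, d, c, b, a of the example polynomial, in that order.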
There are 16 Reed-Solomon block sizes (see table): 5, 7, 10, 11, 12, 14, 18, 20, 24, 28, 36, 42, 48, 56, 62, 68. The factors of these 16 polynomials have been precomputed. You can see the factors file.
Rather than drawing a diagram of the algorithm used to compute the correction CWs, I prefer to give it to you in Basic.
Let k be the number of correction CWs, a the array of factors, m the number of data CWs, d the array of data CWs and c the array of correction CWs. We use a temporary variable t.
c and t are initialized to 0. And now for the math fiddling:
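    ' Mult(x, y) is the Galois field multiplication of two CWs
    For i = 0 To m - 1
        t = d(i) Xor c(k - 1)
        ' Walk downwards so that c(j - 1) still holds its previous value
        For j = k - 1 To 0 Step -1
            If t = 0 Then
                c(j) = 0
            Else
                c(j) = Mult(t, a(j))
            End If
            If j > 0 Then c(j) = c(j - 1) Xor c(j)
        Next
    Next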
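For readers who want to run the loop, here is the same algorithm as a Python sketch. `Mult` is written out as `gf_mul`, assuming GF(256) reduced by the polynomial 301 (an assumption, since this text does not name the field). The factors for k = 5 are hard-coded lowest power first; that ordering is also an assumption, chosen because it is the one the loop needs. The function names are hypothetical.

```python
FACTORS_5 = [228, 48, 15, 111, 62]  # assumed factors for 5 correction CWs


def gf_mul(a, b):
    """Multiply two bytes in GF(256), reducing by 301 (assumed field polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 301
        b >>= 1
    return result


def correction_cws(d, a):
    """Return the array c of correction CWs for data CWs d and factors a."""
    k = len(a)
    c = [0] * k
    for data_cw in d:
        t = data_cw ^ c[k - 1]
        for j in range(k - 1, -1, -1):
            c[j] = gf_mul(t, a[j])  # gf_mul returns 0 when t is 0
            if j > 0:
                c[j] ^= c[j - 1]
    return c


if __name__ == "__main__":
    c = correction_cws([142, 164, 186], FACTORS_5)
    print(c)        # [102, 88, 5, 25, 114]
    print(c[::-1])  # [114, 25, 5, 88, 102]
```

The input is the three CWs for "123456" in a 10x10 symbol. In the commonly cited encoding of that example the correction CWs follow the data in the order c(k - 1) ... c(0), which is why the sketch also prints the reversed array.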