1 files changed, 209 insertions, 209 deletions
diff --git a/utils/zenutils/libraries/zlib123/zlib/algorithm.txt b/utils/zenutils/libraries/zlib123/zlib/algorithm.txt
index 9f6b06808c..b022dde312 100755..100644
--- a/utils/zenutils/libraries/zlib123/zlib/algorithm.txt
+++ b/utils/zenutils/libraries/zlib123/zlib/algorithm.txt
@@ -1,209 +1,209 @@
-1. Compression algorithm (deflate)
+1. Compression algorithm (deflate)
-The deflation algorithm used by gzip (also zip and zlib) is a variation of
+The deflation algorithm used by gzip (also zip and zlib) is a variation of
-LZ77 (Lempel-Ziv 1977, see reference below). It finds duplicated strings in
+LZ77 (Lempel-Ziv 1977, see reference below). It finds duplicated strings in
-the input data.  The second occurrence of a string is replaced by a
+the input data.  The second occurrence of a string is replaced by a
-pointer to the previous string, in the form of a pair (distance,
+pointer to the previous string, in the form of a pair (distance,
-length).  Distances are limited to 32K bytes, and lengths are limited
+length).  Distances are limited to 32K bytes, and lengths are limited
-to 258 bytes. When a string does not occur anywhere in the previous
+to 258 bytes. When a string does not occur anywhere in the previous
-32K bytes, it is emitted as a sequence of literal bytes.  (In this
+32K bytes, it is emitted as a sequence of literal bytes.  (In this
-description, `string' must be taken as an arbitrary sequence of bytes,
+description, `string' must be taken as an arbitrary sequence of bytes,
-and is not restricted to printable characters.)
+and is not restricted to printable characters.)
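A rough illustration of the (distance, length) idea described above. This is only a brute-force sketch, not how deflate() actually searches; the function names are made up for the example, and the minimum match length of 3 is an assumption based on the 3-byte hashing described later in the text.

```python
WINDOW = 32 * 1024   # maximum distance, per the text
MAX_LEN = 258        # maximum match length, per the text
MIN_LEN = 3          # assumption: shortest match worth emitting


def lz77_tokens(data: bytes):
    """Return a list of literals (ints) and (distance, length) pairs."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Brute force: try every earlier start position inside the window.
        for j in range(max(0, i - WINDOW), i):
            n = 0
            while n < MAX_LEN and i + n < len(data) and data[j + n] == data[i + n]:
                n += 1
            if n >= best_len and n >= MIN_LEN:   # ">=" prefers the closest match
                best_len, best_dist = n, i - j
        if best_len:
            tokens.append((best_dist, best_len))
            i += best_len
        else:
            tokens.append(data[i])
            i += 1
    return tokens


def lz77_expand(tokens) -> bytes:
    """Rebuild the input from the tokens produced above."""
    out = bytearray()
    for t in tokens:
        if isinstance(t, tuple):
            dist, length = t
            for _ in range(length):      # byte by byte, so overlapping copies work
                out.append(out[-dist])
        else:
            out.append(t)
    return bytes(out)


print(lz77_tokens(b"abcabcabc"))   # [97, 98, 99, (3, 6)]
```

Note that the match (3, 6) overlaps the bytes it is producing, which is why the expander copies one byte at a time.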
-Literals or match lengths are compressed with one Huffman tree, and
+Literals or match lengths are compressed with one Huffman tree, and
-match distances are compressed with another tree. The trees are stored
+match distances are compressed with another tree. The trees are stored
-in a compact form at the start of each block. The blocks can have any
+in a compact form at the start of each block. The blocks can have any
-size (except that the compressed data for one block must fit in
+size (except that the compressed data for one block must fit in
-available memory). A block is terminated when deflate() determines that
+available memory). A block is terminated when deflate() determines that
-it would be useful to start another block with fresh trees. (This is
+it would be useful to start another block with fresh trees. (This is
-somewhat similar to the behavior of LZW-based _compress_.)
+somewhat similar to the behavior of LZW-based _compress_.)
-Duplicated strings are found using a hash table. All input strings of
+Duplicated strings are found using a hash table. All input strings of
-length 3 are inserted in the hash table. A hash index is computed for
+length 3 are inserted in the hash table. A hash index is computed for
-the next 3 bytes. If the hash chain for this index is not empty, all
+the next 3 bytes. If the hash chain for this index is not empty, all
-strings in the chain are compared with the current input string, and
+strings in the chain are compared with the current input string, and
-the longest match is selected.
+the longest match is selected.
-The hash chains are searched starting with the most recent strings, to
+The hash chains are searched starting with the most recent strings, to
-favor small distances and thus take advantage of the Huffman encoding.
+favor small distances and thus take advantage of the Huffman encoding.
-The hash chains are singly linked. There are no deletions from the
+The hash chains are singly linked. There are no deletions from the
-hash chains; the algorithm simply discards matches that are too old.
+hash chains; the algorithm simply discards matches that are too old.
-To avoid a worst-case situation, very long hash chains are arbitrarily
+To avoid a worst-case situation, very long hash chains are arbitrarily
-truncated at a certain length, determined by a runtime option (level
+truncated at a certain length, determined by a runtime option (level
-parameter of deflateInit). So deflate() does not always find the longest
+parameter of deflateInit). So deflate() does not always find the longest
-possible match but generally finds a match which is long enough.
+possible match but generally finds a match which is long enough.
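The chain search and its truncation can be sketched roughly as follows. This is a simplification under stated assumptions: a Python dict keyed by the 3-byte string stands in for the real hash table and its linked chains, and the names build_chains, longest_match and max_chain are hypothetical, not zlib identifiers.

```python
from collections import defaultdict


def build_chains(data: bytes, upto: int):
    """Record every 3-byte string starting before position `upto`."""
    chains = defaultdict(list)
    for i in range(min(upto, len(data) - 2)):
        chains[data[i:i + 3]].append(i)
    return chains


def longest_match(data, pos, chains, max_chain, window=32 * 1024, max_len=258):
    """Return (length, distance), looking at no more than max_chain candidates."""
    key = data[pos:pos + 3]
    best_len, best_dist = 0, 0
    # The newest candidates sit at the end of the list, so walk it backwards.
    for tries, cand in enumerate(reversed(chains.get(key, []))):
        if tries >= max_chain or pos - cand > window:
            break   # chain truncated, or the remaining candidates are too old
        n = 0
        while n < max_len and pos + n < len(data) and data[cand + n] == data[pos + n]:
            n += 1
        if n > best_len:   # strict ">" keeps the most recent among equal lengths
            best_len, best_dist = n, pos - cand
    return best_len, best_dist


data = b"abcdefXabcdYabcdef"
chains = build_chains(data, 12)
print(longest_match(data, 12, chains, max_chain=1))   # (4, 5)
print(longest_match(data, 12, chains, max_chain=8))   # (6, 12)
```

With the chain cut to one candidate the search settles for the nearer 4-byte match; a longer chain finds the older 6-byte one. That is the "long enough" versus "longest possible" tradeoff in miniature.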
-deflate() also defers the selection of matches with a lazy evaluation
+deflate() also defers the selection of matches with a lazy evaluation
-mechanism. After a match of length N has been found, deflate() searches for
+mechanism. After a match of length N has been found, deflate() searches for
-a longer match at the next input byte. If a longer match is found, the
+a longer match at the next input byte. If a longer match is found, the
-previous match is truncated to a length of one (thus producing a single
+previous match is truncated to a length of one (thus producing a single
-literal byte) and the process of lazy evaluation begins again. Otherwise,
+literal byte) and the process of lazy evaluation begins again. Otherwise,
-the original match is kept, and the next match search is attempted only N
+the original match is kept, and the next match search is attempted only N
-steps later.
+steps later.
-The lazy match evaluation is also subject to a runtime parameter. If
+The lazy match evaluation is also subject to a runtime parameter. If
-the current match is long enough, deflate() reduces the search for a longer
+the current match is long enough, deflate() reduces the search for a longer
-match, thus speeding up the whole process. If compression ratio is more
+match, thus speeding up the whole process. If compression ratio is more
-important than speed, deflate() attempts a complete second search even if
+important than speed, deflate() attempts a complete second search even if
-the first match is already long enough.
+the first match is already long enough.
-The lazy match evaluation is not performed for the fastest compression
+The lazy match evaluation is not performed for the fastest compression
-modes (level parameter 1 to 3). For these fast modes, new strings
+modes (level parameter 1 to 3). For these fast modes, new strings
-are inserted in the hash table only when no match was found, or
+are inserted in the hash table only when no match was found, or
-when the match is not too long. This degrades the compression ratio
+when the match is not too long. This degrades the compression ratio
-but saves time since there are both fewer insertions and fewer searches.
+but saves time since there are both fewer insertions and fewer searches.
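To see the level parameter in action without writing C, Python's standard zlib module (a binding to the same library) accepts it directly. A small sketch; the sample data is arbitrary, and how much the sizes differ between levels depends on the input and on the zlib version in use.

```python
import zlib

data = b"the quick brown fox jumps over the lazy dog. " * 200

sizes = {}
for level in (1, 6, 9):
    packed = zlib.compress(data, level)
    assert zlib.decompress(packed) == data   # every level must round-trip
    sizes[level] = len(packed)
    print(level, len(packed))
```

The fast levels trade compression ratio for speed as described above; the output stays a valid deflate stream at every level.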
-2. Decompression algorithm (inflate)
+2. Decompression algorithm (inflate)
-2.1 Introduction
+2.1 Introduction
-The key question is how to represent a Huffman code (or any prefix code) so
+The key question is how to represent a Huffman code (or any prefix code) so
-that you can decode fast.  The most important characteristic is that shorter
+that you can decode fast.  The most important characteristic is that shorter
-codes are much more common than longer codes, so pay attention to decoding the
+codes are much more common than longer codes, so pay attention to decoding the
-short codes fast, and let the long codes take longer to decode.
+short codes fast, and let the long codes take longer to decode.
-inflate() sets up a first level table that covers some number of bits of
+inflate() sets up a first level table that covers some number of bits of
-input less than the length of the longest code.  It gets that many bits from the
+input less than the length of the longest code.  It gets that many bits from the
-stream, and looks it up in the table.  The table will tell if the next
+stream, and looks it up in the table.  The table will tell if the next
-code is that many bits or less and how many, and if it is, it will tell
+code is that many bits or less and how many, and if it is, it will tell
-the value, else it will point to the next level table for which inflate()
+the value, else it will point to the next level table for which inflate()
-grabs more bits and tries to decode a longer code.
+grabs more bits and tries to decode a longer code.
-How many bits to make the first lookup is a tradeoff between the time it
+How many bits to make the first lookup is a tradeoff between the time it
-takes to decode and the time it takes to build the table.  If building the
+takes to decode and the time it takes to build the table.  If building the
-table took no time (and if you had infinite memory), then there would only
+table took no time (and if you had infinite memory), then there would only
-be a first level table to cover all the way to the longest code.  However,
+be a first level table to cover all the way to the longest code.  However,
-building the table ends up taking a lot longer for more bits since short
+building the table ends up taking a lot longer for more bits since short
-codes are replicated many times in such a table.  What inflate() does is
+codes are replicated many times in such a table.  What inflate() does is
-simply to make the number of bits in the first table a variable, and  then
+simply to make the number of bits in the first table a variable, and  then
-to set that variable for the maximum speed.
+to set that variable for the maximum speed.
-For inflate, which has 286 possible codes for the literal/length tree, the size
+For inflate, which has 286 possible codes for the literal/length tree, the size
-of the first table is nine bits.  Also the distance trees have 30 possible
+of the first table is nine bits.  Also the distance trees have 30 possible
-values, and the size of the first table is six bits.  Note that for each of
+values, and the size of the first table is six bits.  Note that for each of
-those cases, the table ended up one bit longer than the ``average'' code
+those cases, the table ended up one bit longer than the ``average'' code
-length, i.e. the code length of an approximately flat code which would be a
+length, i.e. the code length of an approximately flat code which would be a
-little more than eight bits for 286 symbols and a little less than five bits
+little more than eight bits for 286 symbols and a little less than five bits
-for 30 symbols.
+for 30 symbols.
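The "one bit longer than the average" observation is easy to check: a flat code over n symbols needs log2(n) bits per symbol. A quick sketch of that arithmetic:

```python
import math

results = []
for symbols, first_table_bits in ((286, 9), (30, 6)):
    flat = math.log2(symbols)   # code length of a perfectly flat code
    results.append((symbols, flat, first_table_bits))
    print(symbols, round(flat, 2), first_table_bits)
```

Rounding the flat code length to the nearest bit and adding one gives nine and six bits respectively, matching the first-table sizes quoted above.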
-2.2 More details on the inflate table lookup
+2.2 More details on the inflate table lookup
-Ok, you want to know what this cleverly obfuscated inflate tree actually
+Ok, you want to know what this cleverly obfuscated inflate tree actually
-looks like.  You are correct that it's not a Huffman tree.  It is simply a
+looks like.  You are correct that it's not a Huffman tree.  It is simply a
-lookup table for the first, let's say, nine bits of a Huffman symbol.  The
+lookup table for the first, let's say, nine bits of a Huffman symbol.  The
-symbol could be as short as one bit or as long as 15 bits.  If a particular
+symbol could be as short as one bit or as long as 15 bits.  If a particular
-symbol is shorter than nine bits, then that symbol's translation is duplicated
+symbol is shorter than nine bits, then that symbol's translation is duplicated
-in all those entries that start with that symbol's bits.  For example, if the
+in all those entries that start with that symbol's bits.  For example, if the
-symbol is four bits, then it's duplicated 32 times in a nine-bit table.  If a
+symbol is four bits, then it's duplicated 32 times in a nine-bit table.  If a
-symbol is nine bits long, it appears in the table once.
+symbol is nine bits long, it appears in the table once.
-If the symbol is longer than nine bits, then that entry in the table points
+If the symbol is longer than nine bits, then that entry in the table points
-to another similar table for the remaining bits.  Again, there are duplicated
+to another similar table for the remaining bits.  Again, there are duplicated
-entries as needed.  The idea is that most of the time the symbol will be short
+entries as needed.  The idea is that most of the time the symbol will be short
-and there will only be one table look up.  (That's the whole idea behind data
+and there will only be one table look up.  (That's the whole idea behind data
-compression in the first place.)  For the less frequent long symbols, there
+compression in the first place.)  For the less frequent long symbols, there
-will be two lookups.  If you had a compression method with really long
+will be two lookups.  If you had a compression method with really long
-symbols, you could have as many levels of lookups as is efficient.  For
+symbols, you could have as many levels of lookups as is efficient.  For
-inflate, two is enough.
+inflate, two is enough.
-So a table entry either points to another table (in which case nine bits in
+So a table entry either points to another table (in which case nine bits in
-the above example are gobbled), or it contains the translation for the symbol
+the above example are gobbled), or it contains the translation for the symbol
-and the number of bits to gobble.  Then you start again with the next
+and the number of bits to gobble.  Then you start again with the next
-ungobbled bit.
+ungobbled bit.
-You may wonder: why not just have one lookup table for however many bits the
+You may wonder: why not just have one lookup table for however many bits the
-longest symbol is?  The reason is that if you do that, you end up spending
+longest symbol is?  The reason is that if you do that, you end up spending
-more time filling in duplicate symbol entries than you do actually decoding.
+more time filling in duplicate symbol entries than you do actually decoding.
-At least for deflate's output that generates new trees every several 10's of
+At least for deflate's output that generates new trees every several 10's of
-kbytes.  You can imagine that filling in a 2^15 entry table for a 15-bit code
+kbytes.  You can imagine that filling in a 2^15 entry table for a 15-bit code
-would take too long if you're only decoding several thousand symbols.  At the
+would take too long if you're only decoding several thousand symbols.  At the
-other extreme, you could make a new table for every bit in the code.  In fact,
+other extreme, you could make a new table for every bit in the code.  In fact,
-that's essentially a Huffman tree.  But then you spend too much time
+that's essentially a Huffman tree.  But then you spend too much time
-traversing the tree while decoding, even for short symbols.
+traversing the tree while decoding, even for short symbols.
-So the number of bits for the first lookup table is a tradeoff of the time to
+So the number of bits for the first lookup table is a tradeoff of the time to
-fill out the table vs. the time spent looking at the second level and above of
+fill out the table vs. the time spent looking at the second level and above of
-the table.
+the table.
-Here is an example, scaled down:
+Here is an example, scaled down:
-The code being decoded, with 10 symbols, from 1 to 6 bits long:
+The code being decoded, with 10 symbols, from 1 to 6 bits long:
-A: 0
+A: 0
-B: 10
+B: 10
-C: 1100
+C: 1100
-D: 11010
+D: 11010
-E: 11011
+E: 11011
-F: 11100
+F: 11100
-G: 11101
+G: 11101
-H: 11110
+H: 11110
-I: 111110
+I: 111110
-J: 111111
+J: 111111
-Let's make the first table three bits long (eight entries):
+Let's make the first table three bits long (eight entries):
-000: A,1
+000: A,1
-001: A,1
+001: A,1
-010: A,1
+010: A,1
-011: A,1
+011: A,1
-100: B,2
+100: B,2
-101: B,2
+101: B,2
-110: -> table X (gobble 3 bits)
+110: -> table X (gobble 3 bits)
-111: -> table Y (gobble 3 bits)
+111: -> table Y (gobble 3 bits)
-Each entry is what the bits decode as and how many bits that is, i.e. how
+Each entry is what the bits decode as and how many bits that is, i.e. how
-many bits to gobble.  Or the entry points to another table, with the number of
+many bits to gobble.  Or the entry points to another table, with the number of
-bits to gobble implicit in the size of the table.
+bits to gobble implicit in the size of the table.
-Table X is two bits long since the longest code starting with 110 is five bits
+Table X is two bits long since the longest code starting with 110 is five bits
-long:
+long:
-00: C,1
+00: C,1
-01: C,1
+01: C,1
-10: D,2
+10: D,2
-11: E,2
+11: E,2
-Table Y is three bits long since the longest code starting with 111 is six
+Table Y is three bits long since the longest code starting with 111 is six
-bits long:
+bits long:
-000: F,2
+000: F,2
-001: F,2
+001: F,2
-010: G,2
+010: G,2
-011: G,2
+011: G,2
-100: H,2
+100: H,2
-101: H,2
+101: H,2
-110: I,3
+110: I,3
-111: J,3
+111: J,3
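The three tables above can be generated mechanically from the code. The following is only a sketch of the idea, not the layout used in inftrees.c: it works on bit strings written most-significant bit first, as in the example, whereas real inflate pulls bits from the stream in a different order and packs its entries compactly. The names build_tables and decode are made up for the illustration.

```python
CODE = {"A": "0", "B": "10", "C": "1100", "D": "11010", "E": "11011",
        "F": "11100", "G": "11101", "H": "11110", "I": "111110", "J": "111111"}
ROOT_BITS = 3


def fill(table, table_bits, sym, bits):
    """Replicate (sym, len) into every entry that starts with `bits`."""
    pad = table_bits - len(bits)
    base = int(bits, 2) << pad
    for low in range(1 << pad):
        table[base + low] = (sym, len(bits))


def build_tables(code, root_bits):
    """Entries are (symbol, bits_to_gobble) or ("sub", table, table_bits)."""
    root = [None] * (1 << root_bits)
    long_codes = {}                      # root prefix -> {symbol: remaining bits}
    for sym, bits in code.items():
        if len(bits) <= root_bits:
            fill(root, root_bits, sym, bits)
        else:
            long_codes.setdefault(bits[:root_bits], {})[sym] = bits[root_bits:]
    for prefix, rest in long_codes.items():
        sub_bits = max(len(b) for b in rest.values())
        sub = [None] * (1 << sub_bits)
        for sym, bits in rest.items():
            fill(sub, sub_bits, sym, bits)
        root[int(prefix, 2)] = ("sub", sub, sub_bits)
    return root


def decode(bitstring, root, root_bits):
    out, pos = [], 0
    while pos < len(bitstring):
        # Pad with zeros so a short final code can still index the table.
        entry = root[int(bitstring[pos:pos + root_bits].ljust(root_bits, "0"), 2)]
        if entry[0] == "sub":
            _, sub, sub_bits = entry
            pos += root_bits             # gobble the root bits
            entry = sub[int(bitstring[pos:pos + sub_bits].ljust(sub_bits, "0"), 2)]
        sym, used = entry
        out.append(sym)
        pos += used
    return "".join(out)


root = build_tables(CODE, ROOT_BITS)
encoded = "".join(CODE[s] for s in "JABHCID")
print(decode(encoded, root, ROOT_BITS))   # JABHCID
```

Entries 110 and 111 of the root come out as two-bit and three-bit sub-tables, the same shapes as table X and table Y.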
-So what we have here are three tables with a total of 20 entries that had to
+So what we have here are three tables with a total of 20 entries that had to
-be constructed.  That's compared to 64 entries for a single table.  Or
+be constructed.  That's compared to 64 entries for a single table.  Or
-compared to 16 entries for a Huffman tree (six two entry tables and one four
+compared to 16 entries for a Huffman tree (six two entry tables and one four
-entry table).  Assuming that the code ideally represents the probability of
+entry table).  Assuming that the code ideally represents the probability of
-the symbols, it takes on the average 1.25 lookups per symbol.  That's compared
+the symbols, it takes on the average 1.25 lookups per symbol.  That's compared
-to one lookup for the single table, or 1.66 lookups per symbol for the
+to one lookup for the single table, or 1.66 lookups per symbol for the
-Huffman tree.
+Huffman tree.
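The 20-entry, 64-entry and 1.25-lookup figures can be rechecked with a few lines of arithmetic, assuming (as the text does) that a symbol with an n-bit code occurs with probability 2^-n. The Huffman tree figures are not recomputed here.

```python
from fractions import Fraction

CODE_LENGTHS = {"A": 1, "B": 2, "C": 4, "D": 5, "E": 5,
                "F": 5, "G": 5, "H": 5, "I": 6, "J": 6}
ROOT_BITS = 3

# Ideal probability of a symbol with an n-bit code is 2**-n.
prob = {s: Fraction(1, 2 ** n) for s, n in CODE_LENGTHS.items()}
assert sum(prob.values()) == 1

# One lookup if the code fits in the root table, two otherwise.
avg_lookups = sum(p * (1 if CODE_LENGTHS[s] <= ROOT_BITS else 2)
                  for s, p in prob.items())

two_level_entries = 2 ** ROOT_BITS + 2 ** 2 + 2 ** 3      # root + table X + table Y
single_table_entries = 2 ** max(CODE_LENGTHS.values())

print(float(avg_lookups), two_level_entries, single_table_entries)   # 1.25 20 64
```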
-There, I think that gives you a picture of what's going on.  For inflate, the
+There, I think that gives you a picture of what's going on.  For inflate, the
-meaning of a particular symbol is often more than just a letter.  It can be a
+meaning of a particular symbol is often more than just a letter.  It can be a
-byte (a "literal"), or it can be either a length or a distance which
+byte (a "literal"), or it can be either a length or a distance which
-indicates a base value and a number of bits to fetch after the code that is
+indicates a base value and a number of bits to fetch after the code that is
-added to the base value.  Or it might be the special end-of-block code.  The
+added to the base value.  Or it might be the special end-of-block code.  The
-data structures created in inftrees.c try to encode all that information
+data structures created in inftrees.c try to encode all that information
-compactly in the tables.
+compactly in the tables.
-Jean-loup Gailly        Mark Adler
+Jean-loup Gailly        Mark Adler
-jloup@gzip.org          madler@alumni.caltech.edu
+jloup@gzip.org          madler@alumni.caltech.edu
-References:
+References:
-[LZ77] Ziv J., Lempel A., ``A Universal Algorithm for Sequential Data
+[LZ77] Ziv J., Lempel A., ``A Universal Algorithm for Sequential Data
-Compression,'' IEEE Transactions on Information Theory, Vol. 23, No. 3,
+Compression,'' IEEE Transactions on Information Theory, Vol. 23, No. 3,
-pp. 337-343.
+pp. 337-343.
-``DEFLATE Compressed Data Format Specification'' available in
+``DEFLATE Compressed Data Format Specification'' available in
-http://www.ietf.org/rfc/rfc1951.txt
+http://www.ietf.org/rfc/rfc1951.txt

diff --git a/utils/zenutils/libraries/zlib123/zlib/algorithm.txt b/utils/zenutils/libraries/zlib123/zlib/algorithm.txt index 9f6b06808c..b022dde312 100755..100644 --- a/utils/zenutils/libraries/zlib123/zlib/algorithm.txt +++ b/utils/zenutils/libraries/zlib123/zlib/algorithm.txt
@@ -1,209 +1,209 @@
1	1. Compression algorithm (deflate)	1	1. Compression algorithm (deflate)
2		2
3	The deflation algorithm used by gzip (also zip and zlib) is a variation of	3	The deflation algorithm used by gzip (also zip and zlib) is a variation of
4	LZ77 (Lempel-Ziv 1977, see reference below). It finds duplicated strings in	4	LZ77 (Lempel-Ziv 1977, see reference below). It finds duplicated strings in
5	the input data. The second occurrence of a string is replaced by a	5	the input data. The second occurrence of a string is replaced by a
6	pointer to the previous string, in the form of a pair (distance,	6	pointer to the previous string, in the form of a pair (distance,
7	length). Distances are limited to 32K bytes, and lengths are limited	7	length). Distances are limited to 32K bytes, and lengths are limited
8	to 258 bytes. When a string does not occur anywhere in the previous	8	to 258 bytes. When a string does not occur anywhere in the previous
9	32K bytes, it is emitted as a sequence of literal bytes. (In this	9	32K bytes, it is emitted as a sequence of literal bytes. (In this
10	description, `string' must be taken as an arbitrary sequence of bytes,	10	description, `string' must be taken as an arbitrary sequence of bytes,
11	and is not restricted to printable characters.)	11	and is not restricted to printable characters.)
12		12
13	Literals or match lengths are compressed with one Huffman tree, and	13	Literals or match lengths are compressed with one Huffman tree, and
14	match distances are compressed with another tree. The trees are stored	14	match distances are compressed with another tree. The trees are stored
15	in a compact form at the start of each block. The blocks can have any	15	in a compact form at the start of each block. The blocks can have any
16	size (except that the compressed data for one block must fit in	16	size (except that the compressed data for one block must fit in
17	available memory). A block is terminated when deflate() determines that	17	available memory). A block is terminated when deflate() determines that
18	it would be useful to start another block with fresh trees. (This is	18	it would be useful to start another block with fresh trees. (This is
19	somewhat similar to the behavior of LZW-based _compress_.)	19	somewhat similar to the behavior of LZW-based _compress_.)
20		20
21	Duplicated strings are found using a hash table. All input strings of	21	Duplicated strings are found using a hash table. All input strings of
22	length 3 are inserted in the hash table. A hash index is computed for	22	length 3 are inserted in the hash table. A hash index is computed for
23	the next 3 bytes. If the hash chain for this index is not empty, all	23	the next 3 bytes. If the hash chain for this index is not empty, all
24	strings in the chain are compared with the current input string, and	24	strings in the chain are compared with the current input string, and
25	the longest match is selected.	25	the longest match is selected.
26		26
27	The hash chains are searched starting with the most recent strings, to	27	The hash chains are searched starting with the most recent strings, to
28	favor small distances and thus take advantage of the Huffman encoding.	28	favor small distances and thus take advantage of the Huffman encoding.
29	The hash chains are singly linked. There are no deletions from the	29	The hash chains are singly linked. There are no deletions from the
30	hash chains, the algorithm simply discards matches that are too old.	30	hash chains, the algorithm simply discards matches that are too old.
31		31
32	To avoid a worst-case situation, very long hash chains are arbitrarily	32	To avoid a worst-case situation, very long hash chains are arbitrarily
33	truncated at a certain length, determined by a runtime option (level	33	truncated at a certain length, determined by a runtime option (level
34	parameter of deflateInit). So deflate() does not always find the longest	34	parameter of deflateInit). So deflate() does not always find the longest
35	possible match but generally finds a match which is long enough.	35	possible match but generally finds a match which is long enough.
36		36
37	deflate() also defers the selection of matches with a lazy evaluation	37	deflate() also defers the selection of matches with a lazy evaluation
38	mechanism. After a match of length N has been found, deflate() searches for	38	mechanism. After a match of length N has been found, deflate() searches for
39	a longer match at the next input byte. If a longer match is found, the	39	a longer match at the next input byte. If a longer match is found, the
40	previous match is truncated to a length of one (thus producing a single	40	previous match is truncated to a length of one (thus producing a single
41	literal byte) and the process of lazy evaluation begins again. Otherwise,	41	literal byte) and the process of lazy evaluation begins again. Otherwise,
42	the original match is kept, and the next match search is attempted only N	42	the original match is kept, and the next match search is attempted only N
43	steps later.	43	steps later.
44		44
45	The lazy match evaluation is also subject to a runtime parameter. If	45	The lazy match evaluation is also subject to a runtime parameter. If
46	the current match is long enough, deflate() reduces the search for a longer	46	the current match is long enough, deflate() reduces the search for a longer
47	match, thus speeding up the whole process. If compression ratio is more	47	match, thus speeding up the whole process. If compression ratio is more
48	important than speed, deflate() attempts a complete second search even if	48	important than speed, deflate() attempts a complete second search even if
49	the first match is already long enough.	49	the first match is already long enough.
50		50
51	The lazy match evaluation is not performed for the fastest compression	51	The lazy match evaluation is not performed for the fastest compression
52	modes (level parameter 1 to 3). For these fast modes, new strings	52	modes (level parameter 1 to 3). For these fast modes, new strings
53	are inserted in the hash table only when no match was found, or	53	are inserted in the hash table only when no match was found, or
54	when the match is not too long. This degrades the compression ratio	54	when the match is not too long. This degrades the compression ratio
55	but saves time since there are both fewer insertions and fewer searches.	55	but saves time since there are both fewer insertions and fewer searches.
56		56
57		57
58	2. Decompression algorithm (inflate)	58	2. Decompression algorithm (inflate)
59		59
60	2.1 Introduction	60	2.1 Introduction
61		61
62	The key question is how to represent a Huffman code (or any prefix code) so	62	The key question is how to represent a Huffman code (or any prefix code) so
63	that you can decode fast. The most important characteristic is that shorter	63	that you can decode fast. The most important characteristic is that shorter
64	codes are much more common than longer codes, so pay attention to decoding the	64	codes are much more common than longer codes, so pay attention to decoding the
65	short codes fast, and let the long codes take longer to decode.	65	short codes fast, and let the long codes take longer to decode.
66		66
67	inflate() sets up a first level table that covers some number of bits of	67	inflate() sets up a first level table that covers some number of bits of
68	input less than the length of longest code. It gets that many bits from the	68	input less than the length of longest code. It gets that many bits from the
69	stream, and looks it up in the table. The table will tell if the next	69	stream, and looks it up in the table. The table will tell if the next
70	code is that many bits or less and how many, and if it is, it will tell	70	code is that many bits or less and how many, and if it is, it will tell
71	the value, else it will point to the next level table for which inflate()	71	the value, else it will point to the next level table for which inflate()
72	grabs more bits and tries to decode a longer code.	72	grabs more bits and tries to decode a longer code.
73		73
74	How many bits to make the first lookup is a tradeoff between the time it	74	How many bits to make the first lookup is a tradeoff between the time it
75	takes to decode and the time it takes to build the table. If building the	75	takes to decode and the time it takes to build the table. If building the
76	table took no time (and if you had infinite memory), then there would only	76	table took no time (and if you had infinite memory), then there would only
77	be a first level table to cover all the way to the longest code. However,	77	be a first level table to cover all the way to the longest code. However,
78	building the table ends up taking a lot longer for more bits since short	78	building the table ends up taking a lot longer for more bits since short
79	codes are replicated many times in such a table. What inflate() does is	79	codes are replicated many times in such a table. What inflate() does is
80	simply to make the number of bits in the first table a variable, and then	80	simply to make the number of bits in the first table a variable, and then
81	to set that variable for the maximum speed.	81	to set that variable for the maximum speed.
82		82
83	For inflate, which has 286 possible codes for the literal/length tree, the size	83	For inflate, which has 286 possible codes for the literal/length tree, the size
84	of the first table is nine bits. Also the distance trees have 30 possible	84	of the first table is nine bits. Also the distance trees have 30 possible
85	values, and the size of the first table is six bits. Note that for each of	85	values, and the size of the first table is six bits. Note that for each of
86	those cases, the table ended up one bit longer than the ``average'' code	86	those cases, the table ended up one bit longer than the ``average'' code
87	length, i.e. the code length of an approximately flat code which would be a	87	length, i.e. the code length of an approximately flat code which would be a
88	little more than eight bits for 286 symbols and a little less than five bits	88	little more than eight bits for 286 symbols and a little less than five bits
89	for 30 symbols.	89	for 30 symbols.
90		90
91		91
92	2.2 More details on the inflate table lookup	92	2.2 More details on the inflate table lookup
93		93
94	Ok, you want to know what this cleverly obfuscated inflate tree actually	94	Ok, you want to know what this cleverly obfuscated inflate tree actually
95	looks like. You are correct that it's not a Huffman tree. It is simply a	95	looks like. You are correct that it's not a Huffman tree. It is simply a
96	lookup table for the first, let's say, nine bits of a Huffman symbol. The	96	lookup table for the first, let's say, nine bits of a Huffman symbol. The
97	symbol could be as short as one bit or as long as 15 bits. If a particular	97	symbol could be as short as one bit or as long as 15 bits. If a particular
98	symbol is shorter than nine bits, then that symbol's translation is duplicated	98	symbol is shorter than nine bits, then that symbol's translation is duplicated
99	in all those entries that start with that symbol's bits. For example, if the	99	in all those entries that start with that symbol's bits. For example, if the
100	symbol is four bits, then it's duplicated 32 times in a nine-bit table. If a	100	symbol is four bits, then it's duplicated 32 times in a nine-bit table. If a
101	symbol is nine bits long, it appears in the table once.	101	symbol is nine bits long, it appears in the table once.
102		102
103	If the symbol is longer than nine bits, then that entry in the table points	103	If the symbol is longer than nine bits, then that entry in the table points
104	to another similar table for the remaining bits. Again, there are duplicated	104	to another similar table for the remaining bits. Again, there are duplicated
105	entries as needed. The idea is that most of the time the symbol will be short	105	entries as needed. The idea is that most of the time the symbol will be short
106	and there will only be one table look up. (That's whole idea behind data	106	and there will only be one table look up. (That's whole idea behind data
107	compression in the first place.) For the less frequent long symbols, there	107	compression in the first place.) For the less frequent long symbols, there
108	will be two lookups. If you had a compression method with really long	108	will be two lookups. If you had a compression method with really long
109	symbols, you could have as many levels of lookups as is efficient. For	109	symbols, you could have as many levels of lookups as is efficient. For
110	inflate, two is enough.	110	inflate, two is enough.
111		111
112	So a table entry either points to another table (in which case nine bits in	112	So a table entry either points to another table (in which case nine bits in
113	the above example are gobbled), or it contains the translation for the symbol	113	the above example are gobbled), or it contains the translation for the symbol
114	and the number of bits to gobble. Then you start again with the next	114	and the number of bits to gobble. Then you start again with the next
115	ungobbled bit.	115	ungobbled bit.
116		116
117	You may wonder: why not just have one lookup table for however many bits the	117	You may wonder: why not just have one lookup table for however many bits the
118	longest symbol is? The reason is that if you do that, you end up spending	118	longest symbol is? The reason is that if you do that, you end up spending
119	more time filling in duplicate symbol entries than you do actually decoding,	119	more time filling in duplicate symbol entries than you do actually decoding,
120	at least for deflate's output, which generates new trees every few tens of	120	at least for deflate's output, which generates new trees every few tens of
121	kbytes. You can imagine that filling in a 2^15 entry table for a 15-bit code	121	kbytes. You can imagine that filling in a 2^15 entry table for a 15-bit code
122	would take too long if you're only decoding several thousand symbols. At the	122	would take too long if you're only decoding several thousand symbols. At the
123	other extreme, you could make a new table for every bit in the code. In fact,	123	other extreme, you could make a new table for every bit in the code. In fact,
124	that's essentially a Huffman tree. But then you spend too much time	124	that's essentially a Huffman tree. But then you spend too much time
125	traversing the tree while decoding, even for short symbols.	125	traversing the tree while decoding, even for short symbols.
126		126
127	So the number of bits for the first lookup table is a tradeoff between the time	127	So the number of bits for the first lookup table is a tradeoff between the time
128	to fill out the table and the time spent looking at the second level and above	128	to fill out the table and the time spent looking at the second level and above
129	of the table.	129	of the table.
130		130
131	Here is an example, scaled down:	131	Here is an example, scaled down:
132		132
133	The code being decoded, with 10 symbols, from 1 to 6 bits long:	133	The code being decoded, with 10 symbols, from 1 to 6 bits long:
134		134
135	A: 0	135	A: 0
136	B: 10	136	B: 10
137	C: 1100	137	C: 1100
138	D: 11010	138	D: 11010
139	E: 11011	139	E: 11011
140	F: 11100	140	F: 11100
141	G: 11101	141	G: 11101
142	H: 11110	142	H: 11110
143	I: 111110	143	I: 111110
144	J: 111111	144	J: 111111
145		145
146	Let's make the first table three bits long (eight entries):	146	Let's make the first table three bits long (eight entries):
147		147
148	000: A,1	148	000: A,1
149	001: A,1	149	001: A,1
150	010: A,1	150	010: A,1
151	011: A,1	151	011: A,1
152	100: B,2	152	100: B,2
153	101: B,2	153	101: B,2
154	110: -> table X (gobble 3 bits)	154	110: -> table X (gobble 3 bits)
155	111: -> table Y (gobble 3 bits)	155	111: -> table Y (gobble 3 bits)
156		156
157	Each entry is what the bits decode as and how many bits that is, i.e. how	157	Each entry is what the bits decode as and how many bits that is, i.e. how
158	many bits to gobble. Or the entry points to another table, with the number of	158	many bits to gobble. Or the entry points to another table, with the number of
159	bits to gobble implicit in the size of the table.	159	bits to gobble implicit in the size of the table.
160		160
161	Table X is two bits long since the longest code starting with 110 is five bits	161	Table X is two bits long since the longest code starting with 110 is five bits
162	long:	162	long:
163		163
164	00: C,1	164	00: C,1
165	01: C,1	165	01: C,1
166	10: D,2	166	10: D,2
167	11: E,2	167	11: E,2
168		168
169	Table Y is three bits long since the longest code starting with 111 is six	169	Table Y is three bits long since the longest code starting with 111 is six
170	bits long:	170	bits long:
171		171
172	000: F,2	172	000: F,2
173	001: F,2	173	001: F,2
174	010: G,2	174	010: G,2
175	011: G,2	175	011: G,2
176	100: H,2	176	100: H,2
177	101: H,2	177	101: H,2
178	110: I,3	178	110: I,3
179	111: J,3	179	111: J,3
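
The three tables above can be written down directly and walked by a small decoder. The following C sketch is an illustration only: struct slot, peek and decode are hypothetical names, the input is a string of '0' and '1' characters rather than a real bit stream, and the first bit read is treated as the most significant index bit, as in the tables above.

```c
#include <stddef.h>
#include <string.h>

/* One table slot: either a decoded symbol (sub == NULL) or a link to a
   second-level table that is indexed with sub_bits bits. */
struct slot {
    char sym;                   /* decoded symbol, when sub == NULL */
    unsigned char bits;         /* number of bits to gobble */
    const struct slot *sub;     /* second-level table, or NULL */
    unsigned char sub_bits;     /* index width of the second-level table */
};

static const struct slot table_x[4] = {
    {'C', 1, NULL, 0}, {'C', 1, NULL, 0}, {'D', 2, NULL, 0}, {'E', 2, NULL, 0}
};
static const struct slot table_y[8] = {
    {'F', 2, NULL, 0}, {'F', 2, NULL, 0}, {'G', 2, NULL, 0}, {'G', 2, NULL, 0},
    {'H', 2, NULL, 0}, {'H', 2, NULL, 0}, {'I', 3, NULL, 0}, {'J', 3, NULL, 0}
};
static const struct slot root[8] = {
    {'A', 1, NULL, 0}, {'A', 1, NULL, 0}, {'A', 1, NULL, 0}, {'A', 1, NULL, 0},
    {'B', 2, NULL, 0}, {'B', 2, NULL, 0},
    {0, 3, table_x, 2}, {0, 3, table_y, 3}
};

/* Look at n bits starting at pos without consuming them.  Bits past the
   end of the input read as zero. */
static unsigned peek(const char *in, size_t len, size_t pos, unsigned n)
{
    unsigned v = 0;
    while (n--) {
        v = (v << 1) | (unsigned)(pos < len && in[pos] == '1');
        pos++;
    }
    return v;
}

/* Decode a string of '0'/'1' characters into out (NUL-terminated).
   Returns the number of table lookups performed, or -1 if the input ends
   in the middle of a code or out is too small. */
static int decode(const char *in, char *out, size_t outsz)
{
    size_t len = strlen(in), pos = 0, n = 0;
    int lookups = 0;

    if (outsz == 0)
        return -1;
    while (pos < len) {
        const struct slot *e = &root[peek(in, len, pos, 3)];
        lookups++;
        if (e->sub != NULL) {
            pos += e->bits;     /* gobble the three root bits */
            e = &e->sub[peek(in, len, pos, e->sub_bits)];
            lookups++;
        }
        pos += e->bits;         /* gobble the bits of the symbol itself */
        if (pos > len || n + 1 >= outsz)
            return -1;
        out[n++] = e->sym;
    }
    out[n] = '\0';
    return lookups;
}
```

For example, the bits 0 10 1100 11010 111111 (written without spaces) decode to "ABCDJ" in eight lookups: one each for A and B, two each for C, D and J.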
180		180
181	So what we have here are three tables with a total of 20 entries that had to	181	So what we have here are three tables with a total of 20 entries that had to
182	be constructed. That's compared to 64 entries for a single table. Or	182	be constructed. That's compared to 64 entries for a single table. Or
183	compared to 16 entries for a Huffman tree (six two-entry tables and one	183	compared to 16 entries for a Huffman tree (six two-entry tables and one
184	four-entry table). Assuming that the code ideally represents the probability of	184	four-entry table). Assuming that the code ideally represents the probability of
185	the symbols, it takes on the average 1.25 lookups per symbol. That's compared	185	the symbols, it takes on the average 1.25 lookups per symbol. That's compared
186	to one lookup for the single table, or 1.66 lookups per symbol for the	186	to one lookup for the single table, or 1.66 lookups per symbol for the
187	Huffman tree.	187	Huffman tree.
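
The entry counts and the 1.25 figure for the table scheme can be checked with a few lines of C. This is a sketch under the assumption stated above, namely that a code of length n occurs with probability 2^-n. The function names are hypothetical, and the Huffman tree figures are not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/* The ten codes from the example, written as strings of '0' and '1'. */
static const char *const codes[10] = {
    "0", "10", "1100", "11010", "11011",
    "11100", "11101", "11110", "111110", "111111"
};

/* Entries to construct: a root table of 2^root_bits slots plus, for each
   root prefix shared by longer codes, one sub-table sized for the longest
   code with that prefix. */
static unsigned total_entries(unsigned root_bits)
{
    unsigned total = 1u << root_bits;
    unsigned prefix;

    for (prefix = 0; prefix < (1u << root_bits); prefix++) {
        unsigned longest = 0;
        size_t i;
        for (i = 0; i < 10; i++) {
            size_t len = strlen(codes[i]);
            unsigned p = 0, k;
            if (len <= root_bits)
                continue;               /* fits in the root table */
            for (k = 0; k < root_bits; k++)
                p = (p << 1) | (unsigned)(codes[i][k] == '1');
            if (p == prefix && len > longest)
                longest = (unsigned)len;
        }
        if (longest)
            total += 1u << (longest - root_bits);
    }
    return total;
}

/* Average lookups per symbol: one for codes that fit in the root table,
   two for the rest, weighted by the assumed probability 2^-len. */
static double average_lookups(unsigned root_bits)
{
    double avg = 0.0;
    size_t i;

    for (i = 0; i < 10; i++) {
        size_t len = strlen(codes[i]);
        double p = 1.0 / (double)(1u << len);
        avg += p * (len <= root_bits ? 1.0 : 2.0);
    }
    return avg;
}
```

With a three-bit root table this gives 20 entries and 1.25 lookups per symbol. With a six-bit root table it gives 64 entries and exactly one lookup per symbol.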
188		188
189	There, I think that gives you a picture of what's going on. For inflate, the	189	There, I think that gives you a picture of what's going on. For inflate, the
190	meaning of a particular symbol is often more than just a letter. It can be a	190	meaning of a particular symbol is often more than just a letter. It can be a
191	byte (a "literal"), or it can be either a length or a distance which	191	byte (a "literal"), or it can be either a length or a distance which
192	indicates a base value and a number of bits to fetch after the code that is	192	indicates a base value and a number of bits to fetch after the code that is
193	added to the base value. Or it might be the special end-of-block code. The	193	added to the base value. Or it might be the special end-of-block code. The
194	data structures created in inftrees.c try to encode all that information	194	data structures created in inftrees.c try to encode all that information
195	compactly in the tables.	195	compactly in the tables.
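
As a rough picture of what such an entry has to carry, here is a C sketch. The layout and names are hypothetical and are not the structure used in inftrees.c. The sample values come from the DEFLATE specification, where length symbol 265 has base 11 with one extra bit and distance symbol 4 has base 5 with one extra bit.

```c
/* What a decoded symbol can mean. */
enum kind { LITERAL, LENGTH, DISTANCE, END_OF_BLOCK };

/* Hypothetical decoded-symbol record. */
struct meaning {
    enum kind kind;
    unsigned short base;    /* literal byte, or base length/distance */
    unsigned char extra;    /* number of extra bits that follow the code */
};

/* Add the extra bits fetched after the code to the base value.  Only the
   low m->extra bits of extra_bits are used. */
static unsigned resolve(const struct meaning *m, unsigned extra_bits)
{
    unsigned mask = (1u << m->extra) - 1u;
    return m->base + (extra_bits & mask);
}
```

A literal has no extra bits, so resolve returns the byte itself. For length symbol 265 it returns 11 or 12 depending on the single extra bit.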
196		196
197		197
198	Jean-loup Gailly Mark Adler	198	Jean-loup Gailly Mark Adler
199	jloup@gzip.org madler@alumni.caltech.edu	199	jloup@gzip.org madler@alumni.caltech.edu
200		200
201		201
202	References:	202	References:
203		203
204	[LZ77] Ziv J., Lempel A., ``A Universal Algorithm for Sequential Data	204	[LZ77] Ziv J., Lempel A., ``A Universal Algorithm for Sequential Data
205	Compression,'' IEEE Transactions on Information Theory, Vol. 23, No. 3,	205	Compression,'' IEEE Transactions on Information Theory, Vol. 23, No. 3,
206	pp. 337-343.	206	pp. 337-343.
207		207
208	``DEFLATE Compressed Data Format Specification'' available in	208	``DEFLATE Compressed Data Format Specification'' available in
209	http://www.ietf.org/rfc/rfc1951.txt	209	http://www.ietf.org/rfc/rfc1951.txt