commit61d58c760c776f5e0d56e569ab2ca68627938984parentbbb3066fd0708a6622063323c917ef8ecbfa0ebaAuthor:Sol <sol@plunder.tech>Date:Sat, 16 Sep 2023 12:46:20 -0400 doc: improve SEED docsDiffstat:

M | doc/seed/README.md | | | 239 | ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------- |

1 file changed, 182 insertions(+), 57 deletions(-)diff --git a/doc/seed/README.md b/doc/seed/README.md@@ -1,75 +1,163 @@ # PLAN Seeds -This is a simple format for serializing (normalized) PLAN trees. +Seed is a simple format for encoding binary trees of natural numbers. -This format is optimized for simplicity of decoding, not speed or size. +The format has a number off desirable properties: -We convert everything to atoms and cells via `(<p> -> (4 p))` and -`({n a b} -> (4 n a b))`. The original values can be recovered via -evaluation. +- It's very simple to write an efficient SEED loader. (See [this + demo](./demo.c) for an example of this). -In particular, this should only be used for "trustworthy" inputs. -Since decoders are stupid, and since complex values are reconstructed -via evaluation, it is possible to create seeds that contain computation, -and those computations may diverge. +- Naturals are stored word-aligned and word-padded. Multi-word naturals + can be loaded with zero copies by just storing big numbers as pointer + into the input buffer. -For example, you can encode the value `(0 0 0 0)`, which trivially -diverges. + This is especially useful when working with files that are loaded + with mmap() and contain large binary blobs. In this situation, + the OS will transparently page this data in-and-out as necessary. + +- Individual DAG nodes of a PLAN value can be serialized individually. + For example, by using hashes to encode an edge-list. + +- All repeated subtrees and written only once. + +When serializing PLAN values, we write out pins and laws by encoding +them as expressions that evaluate to pins and laws. For example `<3>` is +encoded as `(4 3)`. `{"_ToNat" 1 (0 (2 0 3) 1)}` is encoded as +`(0 127961276568671 1 (0 (2 0 3) 1))`. + +PLAN trees encoded with seed can contain *computations*, not just +values. If this is undesirable, the decoder can just verify that +everything it reads is in normal form. All atoms are de-duplicated and stored in ascending order. All cells are de-duplicated by serializing trees as a series of tree-fragments, each of which may refer to any earlier fragment. - # The Format -Seed files have the following layout: +At a high-level, the format is: + +- Size metadata. +- All of the numbers, in descending order. +- Tree data. + +- We order the atoms so that they can be binned by size bignats, words, + and bytes. This avoids the need for complex encoding of size metadata. + +- We use descending order so that bytes are written last, otherwise + we would need padding after the bytes to maintain word-alignment. + +For sharing, tree data is encoded as a series of tree "fragments" each +of which can reference fragments that came before it. For example, the +value ((0 1) (0 1)) is stored as: + + [0]: 1 + [1]: 0 + [2]: ($1 $0) + [3]: ($2 $2) + +More precisely, seed files have the following layout: + + u64 numHoles // number of external references + u64 numBigNats // number of nats bigger than one word + u64 numWords // number of nats bigger than one byte + u64 numBytes // number of nats that fit in a byte + u64 numTrees // number of tree fragments + u64[numBigNats] bigNatSizes // word-size of each multi-word nat + u64[sum(bigNatSizes)] bigNatData // actual data for each multi-word nat + u64[numWords] words // actual data for each word-sized nat + u8[numBytes] bytes // actual data for each byte-sized nat + u8[] trees // bit-encoded binary trees + u8[] zeros // zeros to make the size a multiple of 8 + +Each tree is bit-encoded with the following scheme: + + 1xy -> (x y) + 0(r{n}) -> where n is the bit-width of the maximum backref size. + +each element of the `trees` array omits the outermost 1 bit. Since each +atom is already in the table, the top over every `tree` is always an +interior node. + +For example, `((0 1) (0 1))` is this data: + + [0]: 1 + [1]: 0 + [2]: ($1 $0) + [3]: ($2 $2) + +Which is stored as: + + numHoles=0 + numBigNats=0 + numWords=0 + numBytes=2 + numTrees=2 + [bigNatSizes] + [bigNatData] + [words] + [bytes] + 0b00000001 + 0b00000000 + [trees] + {01}{00} // two possible references = 1 bit per leaf + {010}{010} // three possible references = 2 bits per leaf. + [zeros] + 00000000000000000000000000000000000000 - u64 numHoles - u64 numBigs - u64 numWords - u64 numBytes - u64 numFrags - u64[numBigs] bigSizes - u64[sum(bigSizes)] bigData - u64[numWords] words - u8[numBytes] bytes - u64[] fragData // a series of bit-encoded fragments +Or, in raw binary, least-significant bit-first: -Decode item-by-item, appending each to an accumulator. + 0000000000000000000000000000000000000000000000000000000000000000 + 0000000000000000000000000000000000000000000000000000000000000000 + 0000000000000000000000000000000000000000000000000000000000000000 + 0000000000000000000000000000000000000000000000000000000000000010 + 0000000000000000000000000000000000000000000000000000000000000010 + 1000000000000000010001001000000000000000000000000000000000000000 -Each fragment is bit-encoded with the following scheme: +# Decoding - 1xy -> (x y) - 0n -> accumulator[n] +- The head has a fixed size, so we can simply copy those bytes into + words. -Where the bit-width of `n` is the number of bits required to store the -biggest possible index. If there are four elements in the accumulator, -each index is two bits, etc. +- Once we have loaded the header, we can determine the total size of + the table used during load. + +- Load all of the big-nats, load all of the words, load all of the + bytes. + +- Read the remaining file bit-by-bit decoding the tree structure. indexBitWidth = ceil(log2(accumulator.length()) The atoms are stored in decreasing order so that we don't need padding after the bytes to maintain word-alignment. - # Encoding ## De-duplication -It's easy enough to de-duplicate atoms, just insert them all into a -hash table. +Deduplication for atoms is easy: just insert each atom into a hash table. + +However, using the same approach for trees is expensive. + +Urbit's `jam` solves this by putting a lazily-calculated hash value +on every node, that solves the O(n) cost of hashing each node, and it +also solves the O(n) cost of comparing trees, (since deep comparisons +of similar but non-equal trees will quickly find a non-matching hash). -However, using the same approach for trees is expensive, since all -operations are O(n) on the size of each tree, and trees can be quite -large. Hashing and equality are both O(n) for each node. +However, the `jam` approach increases the memory-footprint of each cell +by 50%. -So, we traverse the tree depth-first, and assign an index to each -unique node that we find. +Seed uses a different approach that avoids the per-node hash at the price +of needing to do a full depth-first traversal of the DAG. This doesn't +scale to DAGs with massive sharing, but that problem is already solved +with pins. -Then we represent each cell as an `(Idx, Idx)` pair and use that as the -value to inserted into the de-duplication table for cells. +The approach that seed uses, is to write each cell into a table, and to +assign each unique cell a unique key. Then we can just store each cell +as (key, key) in a de-duplication table. Pairs of keys are efficient +to hash and efficient to check for equality. For example, to de-duplicate the value `((1 2) (1 2))`. @@ -82,36 +170,73 @@ For example, to de-duplicate the value `((1 2) (1 2))`. (c c ) [a=1 b=2 c=(a b)] {1=a 2=b (a b)=c} d [a=1 b=2 c=(a b) d=(c c)] {1=a 2=b (a b)=c (c c)=d} -For the next thing, the shatter() algorithm, we also need to track the -number of references on each node. - ## Converting De-duplicated Trees into Fragment Sequences -See [shatter.txt]. +The process described above creates a table of unique DAG nodes, but we +still need to turn that into a series of fragments with back-references. + +The trick there is to track the "refcount" of each node during our +depth-first traversal. Whenever a reference b, but refcount(a) < +refcount(b), b should be written out as a separate tree fragment. + +See [shatter.txt](shatter.txt) for the details of that algorithm. ## Sorting Atoms -We have an array of unique atoms, in traversal order. However, the -numbers need to be serialized in descending order, and back-references -to numbers are also in sort-order. +We have an array of unique atoms, in traversal order. -We can't just sort our array of numbers, since our cells contain indexes -into the traversal-order array. +However, the atoms needs to be serialized in descending order (this is +what makes the (bignats, then words, then bytes) encoding possible. -So, we need to derive the sort-order for this array, and we need to -produce a mapping from traversal-order-index into sort-order-index -(for back-references into the arrays table). +We can't just sort our array of numbers, since our our cells table +contains references into that array by index. +So, we build a table of indexes into that array and we sort that. -# Single-Pin Output -The same format can be used with a header, to encode single pins. +## Encoding Individual Pins -Just add this header: +A single pin (a single DAG-node in the pin DAG) can be stored using a +scheme such as: u64 numPins - u256[numPins] pins (traversal order) + u256[numPins] pinRefs (in traversal order) + seed(with numHoles=numPins) + +To load this: + +- Read the pinRefs table to get the edgelist. +- Load each dependency. +- Decode the seed file, with the refs-table pre-populated with these pins. + + +## Seed Pods: Encoding Large pin DAGs + +The SEED de-duplication system does not scale to huge, highly duplicated +trees, so very large values must be broken into pins and stored separately. + +The seed-pod format works by traversing the pin DAG, saving each pin +with seed, and adding it to an array. + +For example, the DAG: + + [a] + / \ + [b] [c] + \ / + [d] + +Can be written with the following scheme: + + SEED( + [ ([], SEED(d)) + , ([0], SEED(b)) + , ([0], SEED(c)) + , ([0,1,2], SEED(d)) + ] + ) -And then increase each index in each fragment by `numPins`. +Where the array of numbers is an edge-list (each number being an index +into the same array).