From e41a24efca41dde645f22ac55a8df2c0d3950369 Mon Sep 17 00:00:00 2001
From: Daniel Olsen <daniel.olsen99@gmail.com>
Date: Wed, 28 Apr 2021 16:27:05 +0200
Subject: [PATCH] update unordered 2021-04-28

---
 content/2021-03-03-unordered.md | 129 ++++++++++++++++++++++++--------
 1 file changed, 96 insertions(+), 33 deletions(-)
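
Reviewer note (sits between the diffstat and the diff, so it is not part of the commit): the ranking idea from the 2021-04-26 update can be sketched in Python roughly as follows. This is a minimal sketch under assumptions, not the code behind the post: `count_from` and `rank` are hypothetical names, values are numbered from 0, and the nested loops are replaced by the closed-form count `comb(n + k - 1, k)` for sorted lists of length `k` over `n` values.

```python
from math import comb  # Python 3.8+


def count_from(length, low, base):
    """Count sorted lists of `length` values drawn from low..base-1."""
    n = base - low  # how many values are still allowed
    if n <= 0:
        return 0
    return comb(n + length - 1, length)


def rank(items, base):
    """Zero-based position of sorted(items) among all sorted lists."""
    items = sorted(items)
    length = len(items)
    above = 0
    for pos, value in enumerate(items):
        # Lists sharing the first `pos` entries but with a larger value here.
        above += count_from(length - pos, value + 1, base)
    # Total options, minus the list itself, minus everything above it.
    return count_from(length, 0, base) - 1 - above


print(rank([1, 0, 3, 2], 4))  # 14, i.e. 001110 for [One, Two, Three, Four]
```

For base 4 and four elements this gives the ranks 0 through 34 in the same order as the table in the post, so the lookup table could in principle be replaced by this computation.
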
diff --git a/content/2021-03-03-unordered.md b/content/2021-03-03-unordered.md
index 91166b8..7439621 100644
--- a/content/2021-03-03-unordered.md
+++ b/content/2021-03-03-unordered.md
@@ -1,7 +1,7 @@
 +++
 title = "Unordered Numbers"
 date = 2021-03-01
-updated = 2021-03-04
+updated = 2021-04-28
 
 slug = "unordered"
 [taxonomies]
@@ -17,6 +17,9 @@ The main idea is to create a number that represents a single uniqe combination o
 
 for example `u([1, 2, 3])` = `u([1, 2, 3])` = \\(65\\) if all the elements are base 10.
 
+<details>
+<summary>Note about A009994</summary>
+
 if you know there are three elements which can be 10 different values, then the number is the same as what's found in [A009994](https://oeis.org/A009994)
 
 {% python() %}
@@ -262,7 +265,7 @@ for n,i in enumerate(A009994generator(4), start=1):
 ```
 
 </details>
-
+</details>
 
 I believe this could be useful to compress bitfields:
 ```rust
@@ -275,36 +278,33 @@ enum Stuff {
   Four
 }
 ```
-if you put this into a bitfield you could store, `[One, Two, Three, Four]` as something like `00100111` (each element of the bitfield is LSB First for reasons that will become clear later)  
-But if you don't care about whether or not it's `[One, Two, Three, Four]` or `[Two, One, Four, Three]` you could sort the list so that it's `[One, Two, Three, Four]`  
-The first element could be any of the four variants so it "uses" 4 values: `(00)`, the second element can also be any of the 4 variants, it too "uses" 4 values. `(00)(10)`.  
-But now the third element can only be one of three variants, `Two`, `Three`, or `Four`. spending two whole bits on that would be a 25% waste of space! Can we use it later?  
-If we use the "index" of the available options as the bit value we might be able to do it `(00)(10)(10)`.  
-The fourth element can be one of only two variants, `Three`, or `Four`, that's only one bit. We did have one extra value to use from before.  
-`(00)(10)(11)(00)` could be `[One, Two, Three, Three]` and `(00)(10)(10)(10)` could be `[One, Two, Three, Four]`  
-Note here that the last bit isn't used in either of those encodings. Here's a list of all possible permutations:
-
+If you put this into a bitfield you could store `[One, Two, Three, Four]` as something like `00100111` (two bits per element, each written LSB first).  
+But if you don't care whether it's `[One, Two, Three, Four]` or `[Two, One, Four, Three]`, you could sort the list so that it's `[One, Two, Three, Four]` every time.  
+Then you can use the fact that the first element can be any of the four variants, and the second (coming after `One`) can also be any of the four,  
+but the third element comes after `Two`, so it can only be one of three variants: `Two`, `Three`, or `Four`. Spending two whole bits on that would be a 25% waste of space!  
+If we use the "index" among the available options as the bit value we might be able to do something about it.  
+The fourth element can be one of only two variants, `Three` or `Four`. Counting only the sorted lists leaves 35 possibilities, which fit in 6 bits instead of 8:
 ```
-000000  [One, One, One, One]        00000000
-000001  [One, One, One, Two]        00000001
-000010  [One, One, One, Three]      00000010
-000011  [One, One, One, Four]       00000011
-000100  [One, One, Two, Two]        00000100
-000101  [One, One, Two, Three]      00000101
-000110  [One, One, Two, Four]       00000110
-000111  [One, One, Three, Three]    00001000 !
-001000  [One, One, Three, Four]     00000110
-001001  [One, One, Four, Four]      00001100
-001010  [One, Two, Two, Two]        00100000
-001011  [One, Two, Two, Three]      00100010
-001100  [One, Two, Two, Four]       00100001
-001101  [One, Two, Three, Three]    00101000
-001110  [One, Two, Three, Four]     00101010
-001111  [One, Two, Four, Four]      00100100
-010000  [One, Three, Three, Three]  00010000
-010001  [One, Three, Three, Four]   00010010
-010010  [One, Three, Four, Four]    00011000
-010011  [One, Four, Four, Four]     00110000
+000000  [One, One, One, One]
+000001  [One, One, One, Two]
+000010  [One, One, One, Three]
+000011  [One, One, One, Four]
+000100  [One, One, Two, Two]
+000101  [One, One, Two, Three]
+000110  [One, One, Two, Four]
+000111  [One, One, Three, Three]
+001000  [One, One, Three, Four]
+001001  [One, One, Four, Four]
+001010  [One, Two, Two, Two]
+001011  [One, Two, Two, Three]
+001100  [One, Two, Two, Four]
+001101  [One, Two, Three, Three]
+001110  [One, Two, Three, Four]
+001111  [One, Two, Four, Four]
+010000  [One, Three, Three, Three]
+010001  [One, Three, Three, Four]
+010010  [One, Three, Four, Four]
+010011  [One, Four, Four, Four]
 010100  [Two, Two, Two, Two]        
 010101  [Two, Two, Two, Three]      
 010110  [Two, Two, Two, Four]       
@@ -322,9 +322,72 @@ Note here that the last bit isn't used in either of those encodings. Here's a li
 100010  [Four, Four, Four, Four]
 ```
 
-Unfortuanetly I've been unable to make a function to convert between them. Though I'm working on it...  
+Unfortunately I've been unable to make a function to convert between them without a map. Though I'm working on it...  
 Until that I guess sorting the elements and looking it up in a table will work 😕
 
+
+# Update 2021-04-26 Encoding!
+
+After a lot of attempts and this problem burning in the back of my mind, I've found a solution 2 months later.  
+The breakthrough was realizing that if you can count how many options are left, you can work out which option you're at.
+
+You could do this by starting a counting loop at some state:
+{% python() %}
+# Count every sorted 4-element list over 4 values (prints 35)
+count = 0
+for i in range(0, 4):
+  for j in range(i, 4):
+    for k in range(j, 4):
+      for l in range(k, 4):
+        count += 1
+print(count)
+{% end %}
+
+Which we knew, but for some reason it didn't click that we could easily count the states above our original number by just starting the loops at it.
+
+Expressing this as mafs would be:
+
+{% katex(block=true) %}
+\sum_{i = 1}^4 \sum_{j = i}^4 \sum_{k = j}^4 \sum_{l = k}^4 1
+{% end %}
+
+Similarly you can count just the options for the last two digits and remove those from the total.  
+This way you can find out which one of those options is the initial state.
+
+{% katex(block=true) %}
+\begin{alignedat}{2}
+&S_4(O, A) &&= \sum_{i = A}^O \sum_{j = i}^O \sum_{k = j}^O \sum_{l = k}^O 1 \\
+&S_3(O, B) &&= \sum_{i = B}^O \sum_{j = i}^O \sum_{k = j}^O 1 \\
+&S_2(O, C) &&= \sum_{i = C}^O \sum_{j = i}^O 1 \\
+&S_1(O, D) &&= O - D \\
+&E(O, A, B, C, D) &&= \underbrace{S_4(O, 1)}_{\text{All options}} - 
+\underbrace{(S_1(O,D) + S_2(O, C+2) + S_3(O, B+2) + S_4(O, A+2))}_{\text{All options above the initial state}}
+\end{alignedat}
+{% end %}
+
+Where O is the base (`4`), the set size is also 4 in this case, and A, B, C, D are the sorted elements, counted from 0.
+
+You could probably just, uh, count up instead, but I didn't really think of that at the time...
+
+In the end though you end up with some nice numbers:
+```
+E(4, 0,0,0,0) =  0
+E(4, 0,0,0,1) =  1
+E(4, 0,0,0,2) =  2
+E(4, 0,0,0,3) =  3
+E(4, 0,0,1,1) =  4
+E(4, 0,0,1,2) =  5
+...
+E(4, 3,3,3,3) = 34
+```
+
+# Update 2021-04-28
+
+What I am actually looking for is apparently something called "Arithmetic coding".  
+I can generate a statistical model for which "symbols" should be available in each step really easily.  
+
+Typical that you find the answer a couple of days after having made progress 🤣
+
 <details>
 <summary>2020-03-01 failed attempt at impementation</summary>
 
@@ -479,4 +542,4 @@ print(o(246))
 Unfortuanetly these functions do not give a perfect compression level.  
 _It is better_, just not perfect, and probably not worth it
 
-</details>
\ No newline at end of file
+</details>