0729692cc8
Finally. We're almost at parity for the template compiler.
Now we have a build option to use templating:
`./configure --enable-asn1-templating`
Tests fail if you build `rfc2459.asn1` with `--template`.
TBD: Figure out what differences remain between the two compilers, and
fix the templating compiler accordingly, adding tests along the
way.
Making IMPLICIT tags work in the templating compiler turned out to be a
simple fix: don't attempt to do anything clever about IMPLICIT tags in
the template generator in the compiler other than denoting them --
instead leave all the smarts about IMPLICIT tags to the interpreter.
This might be a very slight pessimization, but also a great
simplification.
The result is very elegant: when the interpreter finds an IMPLICIT
tag it then recurses to find the template for the body of the type
so-tagged, and evaluates that. Much more elegant than the code
generated by the non-template compiler, not least for not needing
any additional temporary memory allocation.
With this we finally have parity in basic testing of the template
compiler. Indeed, for IMPLICIT tags the template compiler and
interpreter might even be better because they support IMPLICIT tags
with BER lengths, whereas the non-template compiler doesn't (mostly
because `der_replace_tag()` needs to be changed to support it.
And, of course, the template compiler is simply superior in that it
produces smaller code and is *much* easier to work with because the
functions to interpret templates are small and simple. Which means we
can add more functions to deal with other encoding rules fairly
trivially. It should be possible to add all of these with very little
work, almost all of it localized to `lib/asn1/template.c`:
- PER Packed Encoding Rules [X.691]
- XER XML Encoding Rules [X.693]
- OER Octet Encoding Rules [X.696] (intended to replace PER)
- JER JSON Encoding Rules [X.697] (doubles as visual representation)
- GSER Generic String E.R.s [RFC3641] (a visual representation)
- XDR External Data Repr. [STD67][RFC4506]
(XDR is *not* an ASN.1 encoding rules specification, but it's a
*lot* like PER/OER but with 4-octet alignment, and is specified
for the syntax equivalent (XDR) of only a subset of ASN.1 syntax
and semantics.)
All we'd have to do is add variants of `_asn1_{length,encode,decode}()`
for each set of rules, then generate per-type stub functions that call
them (as we already do for DER).
We could then have an encoding rule transliteration program that takes a
`TypeName` and some representation of a value encoded by some encoding
rules, and outputs the same thing encoded by a different set of rules.
This would double as a pretty-printer and parser if we do add support
for JER and/or GSER. It would find the template for the given type
using `dlsym()` against some shared object (possibly `libasn1` itself).
Whereas generating source code for C (or whatever language) for
additional ERs requires much more work. Plus, templates are much
smaller, and the interpreter is tiny, which yields much smaller text and
much smaller CPU icache/dcache footprint, which yields better
performance in many cases.
As well, the template system should be much easier to port to other
languages. Though in the cases of, e.g., Rust, it would require use of
`unsafe` in the interpreter, so in fact the inverse might be true: that
it's easier to generate safe Rust code than to implement a template
interpreter in Rust. Similarly for Haskell, OCAML, etc. But wherever
the template interpreter is easy to implement, it's a huge win.
Note that implementing OER and PER using the templates as they are
currently would be a bit of a challenge, as the interpreter would have
to first do a pass of each SEQUENCE/SET to determine the size and
layout of the OER/PER sequence/set preamble by counting the number of
OPTIONAL/DEFAULT members, BOOLEAN members, and extensibility markers
with extensions present. We could always generate more entries to
encode precomputed preamble metadata. We would also need to add a
template entry type for extensibility markers, which currently we do
not.
#!/bin/sh
size .libs/libasn1.dylib
size .libs/libasn1base.a | awk '{sum += $1} END {print sum}' | sed 's/^/TEXT baselib: /'
size .libs/asn1_*.o | awk '{sum += $1} END {print sum}' | sed 's/^/generated code stubs: /'
size *_asn1-template.o | awk '{sum += $1} END {print sum}' | sed 's/^/TEXT stubs: /'
exit 0
Notes about the template parser:
- assumption: code is large, tables smaller
- how to generate template based stubs:
make check asn1_compile_FLAGS=--template > log
- pretty much the same as the generate code, except uses tables instead of code
TODO:
- Make hdb work
- Fuzzing tests
- Performance testing
- ASN1_MALLOC_ENCODE() as a function, replaces encode_ and length_
- Fix SIZE constraits
- Compact types that only contain on entry to not having a header.
SIZE - Futher down is later generations of the template parser
code:
==================
__TEXT __DATA __OBJC others dec hex
462848 12288 0 323584 798720 c3000 (O2)
trivial types:
==================
__TEXT __DATA __OBJC others dec hex
446464 12288 0 323584 782336 bf000 (O2)
OPTIONAL
==================
__TEXT __DATA __OBJC others dec hex
425984 16384 0 323584 765952 bb000 (O2)
SEQ OF
==================
__TEXT __DATA __OBJC others dec hex
368640 32768 0 327680 729088 b2000 (O2)
348160 32768 0 327680 708608 ad000 (Os)
BOOLEAN
==================
339968 32768 0 327680 700416 ab000 (Os)
TYPE_EXTERNAL:
==================
331776 32768 0 327680 692224 a9000 (Os)
SET OF
==================
327680 32768 0 327680 688128 a8000 (Os)
TYPE_EXTERNAL everywhere
==================
__TEXT __DATA __OBJC others dec hex
167936 69632 0 327680 565248 8a000 (Os)
TAG uses ->ptr (header and trailer)
==================
229376 102400 0 421888 753664 b8000 (O0)
TAG uses ->ptr (header only)
==================
221184 77824 0 421888 720896 b0000 (O0)
BER support for octet string (not working)
==================
180224 73728 0 417792 671744 a4000 (O2)
CHOICE and BIT STRING missign
==================
__TEXT __DATA __OBJC others dec hex
172032 73728 0 417792 663552 a2000 (Os)
No accessor functions to global variable
==================
__TEXT __DATA __OBJC others dec hex
159744 73728 0 393216 626688 99000 (Os)
All types tables (except choice) (id still objects)
==================
__TEXT __DATA __OBJC others dec hex
167936 77824 0 421888 667648 a3000
base lib: 22820
__TEXT __DATA __OBJC others dec hex
==================
167936 77824 0 421888 667648 a3000 (Os)
baselib: 22820
generated code stubs: 41472
TEXT stubs: 112560
All types, id still objects
==================
__TEXT __DATA __OBJC others dec hex
155648 81920 0 430080 667648 a3000 (Os)
TEXT baselib: 23166
generated code stubs: 20796
TEXT stubs: 119891
All types, id still objects, dup compression
==================
__TEXT __DATA __OBJC others dec hex
143360 65536 0 376832 585728 8f000 (Os)
TEXT baselib: 23166
generated code stubs: 20796
TEXT stubs: 107147
All types, dup compression, id vars
==================
__TEXT __DATA __OBJC others dec hex
131072 65536 0 352256 548864 86000
TEXT baselib: 23166
generated code stubs: 7536
TEXT stubs: 107147