The value we initialize a CHOICE's `element` enum field to, prior to
decoding the CHOICE value, needs to be one that does not cause
`asn1_free()` to do anything, since at that point nothing has been
decoded or allocated yet.
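A minimal sketch of the idea, assuming a hypothetical CHOICE type. The
names `choice_sketch`, `choice_sketch_init()`, and `choice_sketch_free()`
are illustrative only and are not what the compiler generates:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical CHOICE type; names are illustrative, not Heimdal's. */
enum choice_sketch_element {
    CHOICE_SKETCH_INVALID = 0,  /* safe "nothing decoded yet" value */
    CHOICE_SKETCH_NAME,
    CHOICE_SKETCH_NUMBER
};

struct choice_sketch {
    enum choice_sketch_element element;
    union {
        char *name;   /* heap-allocated when element is CHOICE_SKETCH_NAME */
        int number;
    } u;
};

/* Put the value in a state that is always safe to free. */
void choice_sketch_init(struct choice_sketch *c)
{
    memset(c, 0, sizeof(*c));
    c->element = CHOICE_SKETCH_INVALID;
}

/* Returns the number of allocations released (for illustration only). */
int choice_sketch_free(struct choice_sketch *c)
{
    int released = 0;

    switch (c->element) {
    case CHOICE_SKETCH_NAME:
        free(c->u.name);
        released = 1;
        break;
    case CHOICE_SKETCH_NUMBER:
    case CHOICE_SKETCH_INVALID:
        break;  /* nothing was allocated, so there is nothing to do */
    }
    choice_sketch_init(c);
    return released;
}
```

If decoding fails before any alternative is selected, freeing the value
then touches no union member at all.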
Should always run AFL before pushing changes to the ASN.1 compiler or
template interpreter!
Still fuzzing. There are crashers in the `_asn1_print()` path, though
right now they cannot affect anything in Heimdal other than
asn1_print, since so far that is the only caller.
Status:
- And it works!
- We have an extensive test based on decoding a rich EK certificate.
This test exercises all of:
- decoding
- encoding with and without decoded open types
- copying of decoded values with decoded open types
- freeing of decoded values with decoded open types
Valgrind finds no memory errors.
- Added a manual page for the compiler.
- rfc2459.asn1 now has all four primary PKIX types that we care about
defined as in RFC5912, with IOS constraints and parameterization:
- `Extension` (embeds open type in an `OCTET STRING`)
- `OtherName` (embeds open type in an `ANY`-like type)
- `SingleAttribute` (embeds open type in an `ANY`-like type)
- `AttributeSet` (embeds open type in a `SET OF ANY`-like type)
All of these use OIDs as the open type type ID field, but integer
open type type ID fields are also supported (and needed, for
Kerberos).
That will cover every typed hole pattern in all our ASN.1 modules.
With this we'll be able to automatically and recursively decode
through all subject DN attributes even when the subject DN is a
directoryName SAN, and subjectDirectoryAttributes, and all
extensions, and all SANs, and all authorization-data elements, and
PA-data, and...
We're not really using `SingleAttribute` and `AttributeSet` yet
because various changes are needed in `lib/hx509` for that.
- `asn1_compile` builds and recognizes the subset of X.681/682/683 that
we need for, and now use in, rfc2459.asn1. It builds the necessary
AST, generates the correct C types, and generates templating for
object sets and open types!
- See READMEs for details.
- Codegen backend not tested; I won't make it implement automatic open
type handling, but it should at least not crash by substituting
`heim_any` for open types not embedded in `OCTET STRING`.
- We're _really_ starting to have problems with the ITU-T ASN.1
grammar and our version of it...
Type names have to start with upper-case, value names with
lower-case, but it's not enough to disambiguate.
The fact that we've allowed value and type names to violate their
respective start-with case rules is causing us trouble now that we're
adding grammar from X.681/682/683, and we're going to have to undo
that.
In preparation for that I'm capitalizing the `heim_any` and
`heim_any_set` types, and doing some additional cleanup, which
requires changes to other parts of Heimdal (all in this same commit
for now).
Problems we have because of this:
- We cannot IMPORT values into modules: the only clue to whether an
imported symbol refers to a value or a type would be the symbol's
name, which we cannot rely on, so we assume IMPORTed symbols are
for types.
This means we can't import OIDs, for example, which is super
annoying.
One thing we might be able to do here is mark imported symbols as
being of an undetermined-but-not-undefined type, then coerce the
symbol's type the first time it's used in a context where its type
is inferred as type, value, object, object set, or class. (Though
since we don't generate C symbols for objects or classes, we won't
be able to import them, especially since we need to know them at
compile time and cannot defer their handling to link- or
run-time.)
- The `NULL` type name, and the `NULL` value name now cause two
reduce/reduce conflicts via the `FieldSetting` production.
- Various shift/reduce conflicts involving `NULL` values in
non-top-level contexts (in constraints, for example).
- Currently, to disambiguate the grammar, I have a CLASS_IDENTIFIER
token that must be all caps, while TYPE_IDENTIFIER must start with a
capital but not be all caps. That breaks Kerberos, whose type names
are all caps -- oof!
To fix this I made it so class names have to be all caps and
start with an underscore (ick).
TBD:
- Check all the XXX comments and address them
- Apply this treatment to Kerberos! Automatic handling of authz-data
sounds useful :)
- Apply this treatment to PKCS#10 (CSRs) and other ASN.1 modules too.
- Replace various bits of code in `lib/hx509/` with uses of this
feature.
- Add JER.
- Enhance `hxtool` and `asn1_print`.
Getting there!
The regular ASN.1 compiler does NOT sort SET { ... } types' members by
tag, though it should. It cannot because if a field is of an untagged
imported type, then the compiler won't know the field's tag because the
compiler does not read and parse IMPORTed modules. At least the regular
ASN.1 compiler does handle out-of-order encodings on decode.
The template ASN.1 compiler did not even support SET { ... } types at
all. With this commit the template ASN.1 compiler does, but still it
does not sort members on encode, and it does not decode out-of-
[definition-]order encodings.
A proper fix to these issues will require run-time sorting of SET
members on encode. An even better fix will require making the compiler
able to read and parse more than one module in one run, that way it can
know all the things about IMPORTed types that it currently leaves to
run-time.
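A rough sketch of what run-time sorting on encode could look like,
assuming each SET member has already been encoded and its tag is known.
The `set_member_sketch` record and `sort_set_members_sketch()` are
hypothetical, not existing `lib/asn1` names. The ordering follows the DER
rule for SET: by tag class (UNIVERSAL, APPLICATION, context-specific,
PRIVATE), then by ascending tag number:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record for one already-encoded SET member. */
struct set_member_sketch {
    unsigned tag_class;        /* 0 UNIVERSAL, 1 APPLICATION, 2 context, 3 PRIVATE */
    unsigned tag_number;
    const unsigned char *der;  /* the member's encoded TLV */
    size_t der_len;
};

/* Order by tag class first, then by tag number. */
static int cmp_member_sketch(const void *a, const void *b)
{
    const struct set_member_sketch *x = a;
    const struct set_member_sketch *y = b;

    if (x->tag_class != y->tag_class)
        return x->tag_class < y->tag_class ? -1 : 1;
    if (x->tag_number != y->tag_number)
        return x->tag_number < y->tag_number ? -1 : 1;
    return 0;
}

/* Sort members in place; the caller then emits them in array order. */
void sort_set_members_sketch(struct set_member_sketch *m, size_t n)
{
    qsort(m, n, sizeof(m[0]), cmp_member_sketch);
}
```

Sorting encoded members sidesteps the IMPORT problem, because by encode
time every member's tag is known regardless of where its type was
defined.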
Finally. We're almost at parity for the template compiler.
Now we have a build option to use templating:
`./configure --enable-asn1-templating`
Tests fail if you build `rfc2459.asn1` with `--template`.
TBD: Figure out what differences remain between the two compilers, and
fix the templating compiler accordingly, adding tests along the
way.
Making IMPLICIT tags work in the templating compiler turned out to be a
simple fix: don't attempt to do anything clever about IMPLICIT tags in
the template generator in the compiler other than denoting them --
instead leave all the smarts about IMPLICIT tags to the interpreter.
This might be a very slight pessimization, but also a great
simplification.
The result is very elegant: when the interpreter finds an IMPLICIT
tag it then recurses to find the template for the body of the type
so-tagged, and evaluates that. Much more elegant than the code
generated by the non-template compiler, not least for not needing
any additional temporary memory allocation.
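To illustrate the recursion only, here is a much-simplified sketch. The
`tmpl_sketch` node layout and both function names are hypothetical and do
not reflect the real template format in `lib/asn1/template.c`:

```c
#include <stddef.h>

/* Hypothetical, much-simplified template node; not Heimdal's layout. */
struct tmpl_sketch {
    int is_implicit;                 /* 1 if this node is an IMPLICIT tag */
    unsigned tag;                    /* tag number carried by this node */
    const struct tmpl_sketch *body;  /* the tagged type, or NULL for a leaf */
};

/* Recurse through IMPLICIT wrappers to the template describing the contents. */
const struct tmpl_sketch *implicit_body_sketch(const struct tmpl_sketch *t)
{
    if (t->is_implicit && t->body != NULL)
        return implicit_body_sketch(t->body);
    return t;
}

/* The tag on the wire is the outermost one; the contents come from the body. */
unsigned wire_tag_sketch(const struct tmpl_sketch *t)
{
    return t->tag;
}
```

The generator only has to mark a node as IMPLICIT; everything else is
resolved when the interpreter walks the nodes, with no temporary buffers.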
With this we finally have parity in basic testing of the template
compiler. Indeed, for IMPLICIT tags the template compiler and
interpreter might even be better because they support IMPLICIT tags
with BER lengths, whereas the non-template compiler doesn't (mostly
because `der_replace_tag()` needs to be changed to support it).
And, of course, the template compiler is simply superior in that it
produces smaller code and is *much* easier to work with because the
functions to interpret templates are small and simple. Which means we
can add more functions to deal with other encoding rules fairly
trivially. It should be possible to add all of these with very little
work, almost all of it localized to `lib/asn1/template.c`:
- PER   Packed Encoding Rules          [X.691]
- XER   XML Encoding Rules             [X.693]
- OER   Octet Encoding Rules           [X.696] (intended to replace PER)
- JER   JSON Encoding Rules            [X.697] (doubles as visual representation)
- GSER  Generic String Encoding Rules  [RFC3641] (a visual representation)
- XDR   External Data Representation   [STD67][RFC4506]
(XDR is *not* an ASN.1 encoding rules specification, but it's a
*lot* like PER/OER with 4-octet alignment, and the XDR syntax is
equivalent to only a subset of ASN.1 syntax and semantics.)
All we'd have to do is add variants of `_asn1_{length,encode,decode}()`
for each set of rules, then generate per-type stub functions that call
them (as we already do for DER).
We could then have an encoding rule transliteration program that takes a
`TypeName` and some representation of a value encoded by some encoding
rules, and outputs the same thing encoded by a different set of rules.
This would double as a pretty-printer and parser if we do add support
for JER and/or GSER. It would find the template for the given type
using `dlsym()` against some shared object (possibly `libasn1` itself).
Whereas generating source code for C (or whatever language) for
additional ERs requires much more work. Plus, templates are much
smaller, and the interpreter is tiny, which yields much smaller text and
much smaller CPU icache/dcache footprint, which yields better
performance in many cases.
As well, the template system should be much easier to port to other
languages. Though in the cases of, e.g., Rust, it would require use of
`unsafe` in the interpreter, so in fact the inverse might be true: that
it's easier to generate safe Rust code than to implement a template
interpreter in Rust. Similarly for Haskell, OCaml, etc. But wherever
the template interpreter is easy to implement, it's a huge win.
Note that implementing OER and PER using the templates as they are
currently would be a bit of a challenge, as the interpreter would have
to first do a pass of each SEQUENCE/SET to determine the size and
layout of the OER/PER sequence/set preamble by counting the number of
OPTIONAL/DEFAULT members, BOOLEAN members, and extensibility markers
with extensions present. We could always generate more entries to
encode precomputed preamble metadata. We would also need to add a
template entry type for extensibility markers, which currently we do
not.
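The counting pass described above might look roughly like the following
sketch. The flag names and the `member_sketch` record are hypothetical
(the current template entries have no extensibility-marker type), and the
sketch only counts; it does not lay out an actual OER/PER preamble:

```c
#include <stddef.h>

/* Hypothetical per-member flags; not the real template entry flags. */
enum {
    SK_OPTIONAL   = 1 << 0,
    SK_DEFAULT    = 1 << 1,
    SK_BOOLEAN    = 1 << 2,
    SK_EXT_MARKER = 1 << 3
};

struct member_sketch {
    unsigned flags;
};

struct preamble_counts_sketch {
    size_t optional_or_default;
    size_t booleans;
    size_t ext_markers;
};

/* One pass over a SEQUENCE/SET's members, counting what the preamble needs. */
struct preamble_counts_sketch
count_preamble_sketch(const struct member_sketch *m, size_t n)
{
    struct preamble_counts_sketch c = { 0, 0, 0 };
    size_t i;

    for (i = 0; i < n; i++) {
        if (m[i].flags & (SK_OPTIONAL | SK_DEFAULT))
            c.optional_or_default++;
        if (m[i].flags & SK_BOOLEAN)
            c.booleans++;
        if (m[i].flags & SK_EXT_MARKER)
            c.ext_markers++;
    }
    return c;
}
```

Precomputing these counts in the compiler and storing them in an extra
template entry would let the interpreter skip this pass.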
The earlier fixes to the ASN.1 compiler for IMPLICIT tags did not
include the template interpreter.
TBD:
- TESTImplicit encoding/decoding still fails due to a bug in the
template generator.
- There are missing cases in the template interpreter. See XXX
comments.
ASN.1 INTEGERs will now compile to C int64_t or uint64_t, depending
on whether the constraint ranges include numbers that cannot be
represented in 32-bit ints and whether they include negative
numbers.
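As a sketch of the selection rule, assuming the narrower fallbacks are
`int` and `unsigned int` (that detail is an assumption here) and using a
hypothetical helper name, not the compiler's own code:

```c
#include <stdint.h>

/* Hypothetical result codes naming the C type chosen for an INTEGER. */
enum int_ctype_sketch {
    CT_INT,       /* int */
    CT_UNSIGNED,  /* unsigned int */
    CT_INT64,     /* int64_t */
    CT_UINT64,    /* uint64_t */
    CT_NO_FIT     /* no 64-bit type covers the range */
};

/* Pick a C type for an INTEGER constrained to (min..max). */
enum int_ctype_sketch pick_int_ctype_sketch(int64_t min, uint64_t max)
{
    if (min < 0) {
        /* Negative values present, so the type must be signed. */
        if (max > (uint64_t)INT64_MAX)
            return CT_NO_FIT;
        if (min < INT32_MIN || max > (uint64_t)INT32_MAX)
            return CT_INT64;
        return CT_INT;
    }
    /* Non-negative range: widen only if 32 bits are not enough. */
    if (max > UINT32_MAX)
        return CT_UINT64;
    return CT_UNSIGNED;
}
```

For example, under these assumptions `INTEGER (0..4294967296)` would map
to `uint64_t`, while `INTEGER (-1..4294967295)` would map to `int64_t`.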
Template backend support included. check-template is now built with
--template, so we know we're testing it.
Tests included.
Having that symbol exported clobbers the namespace and makes other
apps fail, most notably pdftex. I don't believe that the symbol is in
fact intended for public use. Fixes http://bugs.gentoo.org/357235 .