Simple cmds_scanf() v2 example
|
|
All integer values (whether data or length values) should be encoded/entered into the serialized data as big-endian values.
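As a minimal sketch of what big-endian encoding means on the host side, the helpers below write 2-byte and 4-byte unsigned integers most significant byte first. The names `put_u2_be` and `put_u4_be` are hypothetical and are not part of any CryptoServer or libcsx API; they only illustrate the byte order a 'u2' or 'u4' token would expect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write a 16-bit value in big-endian order.
 * Returns the number of bytes written so calls can be chained. */
size_t put_u2_be(uint8_t *buf, uint16_t v)
{
    assert(buf != NULL);
    buf[0] = (uint8_t)(v >> 8);   /* most significant byte first */
    buf[1] = (uint8_t)(v & 0xFF);
    return 2;
}

/* Hypothetical helper: write a 32-bit value in big-endian order. */
size_t put_u4_be(uint8_t *buf, uint32_t v)
{
    assert(buf != NULL);
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)(v & 0xFF);
    return 4;
}
```

Writing the bytes by shifting, rather than copying the in-memory representation of the integer, gives the same result regardless of the host's native endianness.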
Here is a carefully crafted cmd struct, designed to demonstrate what can go wrong:
Simple example of a Problematic C Struct
|
|
A PATTERN for this structure (as commented) is u1u2u4. An equally correct pattern is u2u1u4. Or u4u2u1. Each of these patterns has the correct number of tokens, and the input buffer size will match what the tokens tell cmds_scanf to expect.
To the cmds_scanf command, these three patterns match the cmd struct given. What will differ is how the incoming buffer was serialized on the host, how p_cmd is interpreted, and what the effective result values in the cmd struct are. The call may return no error -- but the populated struct would have incorrect information in it.
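The effect can be illustrated with a small stand-alone decoder. This is only a sketch under stated assumptions: `get_be` and `decode3` are hypothetical names, not the cmds_scanf implementation, and the three output slots stand in for three integer fields of a cmd struct. The decoder accepts any buffer whose length equals the sum of the three field widths, which is why all three patterns "succeed" on the same 7 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read an n-byte (n <= 4) big-endian unsigned integer. */
uint32_t get_be(const uint8_t *p, size_t n)
{
    uint32_t v = 0;
    for (size_t i = 0; i < n; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical decoder for three consecutive unsigned fields.
 * widths holds the byte width of each field, e.g. {1, 2, 4} for u1u2u4.
 * Returns 0 on success, -1 if the buffer length does not match. */
int decode3(const uint8_t *buf, size_t len,
            const size_t widths[3], uint32_t out[3])
{
    assert(buf != NULL && widths != NULL && out != NULL);
    if (widths[0] + widths[1] + widths[2] != len)
        return -1;
    size_t off = 0;
    for (int i = 0; i < 3; i++) {
        out[i] = get_be(buf + off, widths[i]);
        off += widths[i];
    }
    return 0;
}
```

For the bytes `01 00 02 00 00 00 03`, widths {1, 2, 4} yield (1, 2, 3), widths {2, 1, 4} yield (256, 2, 3), and widths {4, 2, 1} yield (16777728, 0, 3). No call reports an error, yet only one result matches what the host intended.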
The cmds_scanf method is smart enough to identify when the p_cmd length differs from what the pattern tells cmds_scanf to expect. In the above example, if data is supplied for the terminal token ('v1' or '*'), the pattern is accepted. If no data is supplied (which is valid for '*'), the 'v1' pattern will generate a deserialization error (unexpected end of input).
If the host programmer is misinformed or doesn't understand how things are parsed on the CryptoServer, they could easily provide the correct data, but be responsible for an incorrect serialization of that data. An incorrect serialization would result in either the parse failing, or worse, the parse succeeding but the data being corrupted.
Common failure modes:
- Adding a length value into the data when the pattern expects a 'remainder' mark, '*'.
- Using an incorrect length mark size (serialized v4, expected v2).
- Two integers like `u4u4` that are encoded on the host as "flags, then specifier" but read on the HSM as "specifier, then flags".
- ...
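The length-mark mismatch can be sketched with a stand-alone parser for a single field that carries a 2-byte big-endian length prefix, which is what a 'v2' token describes. Everything here is hypothetical (the function name, the result codes, and the exact error behaviour); the real cmds_scanf may report these conditions differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes, for this sketch only. */
enum { PARSE_OK = 0, PARSE_TRUNCATED = -1, PARSE_TRAILING = -2 };

/* Parse a buffer expected to hold exactly one field with a 2-byte
 * big-endian length prefix. On PARSE_OK and PARSE_TRAILING, *p_data
 * points INTO buf (no copy) and *l_data holds the declared length. */
int parse_single_v2(const uint8_t *buf, size_t len,
                    const uint8_t **p_data, size_t *l_data)
{
    assert(buf != NULL && p_data != NULL && l_data != NULL);
    if (len < 2)
        return PARSE_TRUNCATED;          /* no room for the length mark */
    size_t n = ((size_t)buf[0] << 8) | buf[1];
    if (n > len - 2)
        return PARSE_TRUNCATED;          /* declared length overruns input */
    *p_data = buf + 2;
    *l_data = n;
    return (2 + n == len) ? PARSE_OK : PARSE_TRAILING;
}
```

If the host serializes the data `abc` with a 4-byte length (`00 00 00 03 61 62 63`), this parser reads a length of 0 and is left with five unexplained bytes. If the host sends the raw bytes with no length at all, the first two data bytes are taken as a length that overruns the input. A '*' token offers no such check: an unexpected length prefix simply becomes part of the data.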
The cmds_scanf method is a "limited copy" parser. It is limited copy, because integers (the length fields of 'v' tokens, or 'u' tokens) are copied by value, but data fields (for the 'v' and '*' tokens) are pointed to.
Do not attempt to modify data pointed to by a field in the cmd struct, as populated by cmds_scanf -- the sub-function code does not 'own' that data.
Because data fields are pointers, subsequent code can use the struct members for read-only access to data within the passed-in p_cmd data. However, if the code tries to modify the pointed-to memory, the results are indeterminate (the p_cmd data is 'owned' by the parent context, not by the sub-function method).
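One way to respect this ownership rule is to declare the data pointers as `const` and to take a private copy before changing anything. The sketch below assumes a hypothetical struct `demo_cmd_t` and helper `demo_copy_data`; it uses the standard `malloc` purely for illustration, and firmware code would use whatever allocator its environment provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cmd struct: the integer is copied by value, while the
 * data field only borrows memory that belongs to the caller's buffer. */
typedef struct {
    uint32_t       flags;   /* copied by value */
    const uint8_t *p_data;  /* borrowed: points into the command buffer */
    size_t         l_data;  /* number of bytes at p_data */
} demo_cmd_t;

/* Return a heap copy of the borrowed data that the caller owns and may
 * modify. Returns NULL if there is no data or the allocation fails.
 * The caller releases the copy with free(). */
uint8_t *demo_copy_data(const demo_cmd_t *cmd)
{
    assert(cmd != NULL);
    if (cmd->p_data == NULL || cmd->l_data == 0)
        return NULL;
    uint8_t *copy = malloc(cmd->l_data);
    if (copy != NULL)
        memcpy(copy, cmd->p_data, cmd->l_data);
    return copy;
}
```

Marking `p_data` as `const` lets the compiler flag accidental writes through the borrowed pointer, while the copy can be changed freely without touching the original command buffer.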
Again, the programmer is expected to understand how the pattern scanning works on the HSM, what the interface requires (based on each field within the pattern), and the order in which the fields are serialized. The programmer is also expected to know the best practice on the host for assembling the serialized data in a given language, taking into account endianness, length markers for 'v'-type fields, etc. Finally, the programmer is expected to understand that the CryptoServer is big-endian, that a command byte array (or response) is limited to 256Kb of data (less some overhead), etc.
There are, in short, several different failure modes that automation could check for, or ensure do not happen.
While cmds_scanf-like methodology is provided on the host, it is not available in the high-level (cryptosystem) APIs, only in the low-level libcsx (C) API.