PCP API Documentation and more changes - Updated 04.03.2014 23:10

During the last weeks I made lots of changes to the PCP code. Most of it was refactoring of internal stuff. First, I implemented a C Buffer "class", inspired by OpenSSH's buffer.c. I really like it. It can be used to fill a buffer incrementally; the buffer resizes automatically and does boundary checks on every call to avoid buffer overflows. I can also put numbers of different sizes directly into it, and multibyte numbers can be converted to big-endian automatically. The same principles work in reverse: it's possible to read from a buffer in little chunks. It remembers the last read offset and also supports conversion back to host byte order.
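
To make the idea concrete, here is a minimal sketch of such a buffer. It is not PCP's Buffer API: the xbuf type and all function names are made up for illustration, and the growth strategy (doubling, starting at 16 bytes) is just an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is not PCP's Buffer API. */
typedef struct {
    unsigned char *data;
    size_t len;    /* bytes written so far */
    size_t cap;    /* bytes allocated */
    size_t offset; /* next read position */
} xbuf;

/* Append n bytes, growing the allocation as needed. Returns 0 on success. */
int xbuf_add(xbuf *b, const void *src, size_t n) {
    assert(b != NULL);
    if (n > SIZE_MAX - b->len) return -1; /* total length would overflow */
    if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap : 16;
        while (cap < b->len + n) {
            if (cap > SIZE_MAX / 2) return -1;
            cap *= 2;
        }
        unsigned char *p = realloc(b->data, cap);
        if (p == NULL) return -1;
        b->data = p;
        b->cap = cap;
    }
    if (n > 0) memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/* Append a 32-bit number in big-endian order, whatever the host order is. */
int xbuf_add32be(xbuf *b, uint32_t v) {
    unsigned char be[4] = {
        (unsigned char)(v >> 24), (unsigned char)(v >> 16),
        (unsigned char)(v >> 8),  (unsigned char)v
    };
    return xbuf_add(b, be, sizeof be);
}

/* Read n bytes from the remembered offset; refuses to read past the end. */
int xbuf_get(xbuf *b, void *dst, size_t n) {
    assert(b != NULL);
    if (n > b->len - b->offset) return -1;
    if (n > 0) memcpy(dst, b->data + b->offset, n);
    b->offset += n;
    return 0;
}

/* Read a big-endian 32-bit number and return it in host byte order. */
int xbuf_get32be(xbuf *b, uint32_t *v) {
    unsigned char be[4];
    if (xbuf_get(b, be, sizeof be) != 0) return -1;
    *v = ((uint32_t)be[0] << 24) | ((uint32_t)be[1] << 16) |
         ((uint32_t)be[2] << 8)  |  (uint32_t)be[3];
    return 0;
}

void xbuf_free(xbuf *b) {
    free(b->data);
    memset(b, 0, sizeof *b);
}
```

A zero-initialized xbuf is an empty buffer. Every read goes through xbuf_get(), so there is exactly one place where the boundary check lives.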

On top of that I wrote a C stream "class". This one is even nicer, since it can be used to read/write from/to files OR buffers. It behaves almost like the FILE interface: you can use ps_write() to put data out or ps_read() to fetch data in. That way I can do blockwise encryption on files and memory buffers. In fact, the encryption API uses those streams. But the latest addition to the stream class is even better: it can transparently encode to Z85. And in read mode it can decode and determine automatically whether the input is encoded. I'm not using it in the API yet, but the sample program already uses it and it works like a charm.
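
The core trick of such a stream can be sketched in a few lines. This is only an illustration under my own assumptions, not the Pcpstream code: the xstream type and the xs_* functions are hypothetical and deliberately not named like ps_read()/ps_write(), and only the read side is shown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch; the names do not match PCP's Pcpstream API. */
typedef struct {
    FILE *fd;                 /* file backend, or NULL */
    const unsigned char *mem; /* memory backend, used when fd == NULL */
    size_t len;               /* size of the memory backend */
    size_t pos;               /* read position in the memory backend */
} xstream;

xstream xs_from_file(FILE *fd) {
    xstream s = { fd, NULL, 0, 0 };
    return s;
}

xstream xs_from_mem(const void *mem, size_t len) {
    xstream s = { NULL, mem, len, 0 };
    return s;
}

/* Read up to n bytes; returns the number of bytes actually read (0 at end). */
size_t xs_read(xstream *s, void *dst, size_t n) {
    assert(s != NULL && dst != NULL);
    if (s->fd != NULL)
        return fread(dst, 1, n, s->fd);
    size_t left = s->len - s->pos;
    if (n > left) n = left;
    if (n > 0) memcpy(dst, s->mem + s->pos, n);
    s->pos += n;
    return n;
}

/* Consume a stream in fixed-size blocks, the way a blockwise cipher would.
   Each block is only counted here to keep the sketch short. */
size_t xs_count_blocks(xstream *s, size_t blocksize) {
    unsigned char block[64];
    size_t blocks = 0;
    if (blocksize == 0 || blocksize > sizeof block) return 0;
    while (xs_read(s, block, blocksize) > 0)
        blocks++;
    return blocks;
}
```

The caller of xs_count_blocks() never learns whether the bytes came from a file or from memory, which is the whole point: code written against the stream works on both.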

As you can see from the various links in this post, I also added a fair amount of API documentation, which is available here for online reading. It's generated using doxygen and I love it.

What else? I changed the sign-and-crypt mode of pcp: it now signs the recipient list and a hash of the original content, and encrypts the signature. I refactored some of the Z85 code to be more reliable and fault tolerant. I changed the native format of PCP public key exports. Public keys are now exported in RFC4880 (OpenPGP) format. While my exports are incompatible with OpenPGP (compatibility is not intended anyway), the format is much more flexible than my old one. Previously I just dumped the pubkey C structure to disk. While that worked well, it's completely impractical if I ever change that structure (which I have done a lot so far). Now with the RFC4880 format, exported keys are independent of the internal structure, so I can change it as I wish without ever becoming incompatible with old exports.

While I was at it, I also changed the export format of secret keys. While I don't use RFC4880 for this, I use at least a much more formalized format and not just a structure dump. Also, the whole export is now encrypted, not just the secret key blobs inside.

Last but not least, you can now export public keys in a couple of other formats, such as a Perl structure, C code or YAML. That way an exported key can be used in small programs without the hassle of generating and maintaining keys. Just use PCP for key management.

As always, the latest source is on GitHub.

Update 28.02.2014 17:24:

During the last days I made another large change: I changed the header markers for Z85 encoded data. Until now I had something like
----- BEGIN blah...-----
just as it's used in PEM and elsewhere. This is no problem at all for small files like keys. But I wanted to have armored encryption as well, and there I fell flat on my face, including bleeding, crying and cursing of course.

The thing is that the hyphen is a legitimate Z85 character, and it may happen that ----- is legitimate Z85 encoded content. Proof:

$ perl -e 'foreach ((0xc6, 0x5a, 0x0b, 0x13)) { print chr($_) }' | pcp1 -z
And since I'm reading input in streaming mode, such a marker may also cross block boundaries. So I had to solve this. And my solution was quite ingenious, or so I thought *g*. I just used the tilde character as a marker, which is not part of the Z85 character set. One tilde starts a comment, another finishes it. Really easy. I didn't have much to change and it worked like a charm, even across block boundaries.
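
The hyphen collision can also be checked without pcp1. The sketch below is not PCP's encoder; it is a bare-bones encoder for a single 4-byte chunk, written from the alphabet in the ZeroMQ Z85 specification, and z85_encode_chunk() is a name I made up for it. Fed with the four bytes from the perl one-liner above, it yields five hyphens, because every base-85 digit of 0xc65a0b13 is 63 and the hyphen sits at position 63 of the alphabet.

```c
#include <assert.h>
#include <stdint.h>

/* The Z85 alphabet as given in the ZeroMQ Z85 specification. */
static const char z85_alphabet[] =
    "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    ".-:+=^!/*?&<>()[]{}@%$#";

/* Encode exactly 4 bytes into 5 Z85 characters plus a terminating NUL.
   Hypothetical helper for illustration, not a PCP function. */
void z85_encode_chunk(const unsigned char in[4], char out[6]) {
    assert(sizeof z85_alphabet == 86); /* 85 characters plus the NUL */
    /* The four bytes form one big-endian 32-bit number ... */
    uint32_t v = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                 ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    /* ... which is written as five base-85 digits, most significant first. */
    for (int i = 4; i >= 0; i--) {
        out[i] = z85_alphabet[v % 85];
        v /= 85;
    }
    out[5] = '\0';
}
```

Calling it with { 0xc6, 0x5a, 0x0b, 0x13 } fills out with "-----". The test vector from the specification works too: { 0x86, 0x4f, 0xd2, 0x6f } gives "Hello".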

However, I spoke with Pieter about the issue and basically he told me that my idea was bad (he said: "badly-designed parser"). Well. As of this writing the tilde-mode code is on GitHub, but I have already started to rewrite it. Again! Holy shit.

So in the next iteration (hopefully the last one, it's a boring business), I'll use hyphens again but read the streams in a totally different way. In fact, if an input stream is considered Z85 encoded, it will be read linewise by the decoder. The decoder then parses every line, and that way it's easily possible to detect headers, comments and footers. Once a Z85 block is complete, it will be decoded and put into the internal read cache. If there is something left on the line, it will be saved for the next iteration. So in fact, there are now 2 caches, one for decoded data (used by the caller) and one for undecoded data (kind of a read-ahead cache).
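
Here is a rough sketch of how such a linewise decoder with two caches could look. It is not the PCP implementation: xdecoder and xdecoder_feed() are hypothetical, the cache sizes are arbitrary, and the rule for telling markers from data is my own assumption, namely that a marker or comment line always contains at least one character outside the Z85 alphabet (the spaces in "----- BEGIN ... -----", for instance), while a data line never does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static const char z85_alphabet[] =
    "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    ".-:+=^!/*?&<>()[]{}@%$#";

/* Hypothetical decoder state, not PCP's. */
typedef struct {
    char pending[8];            /* undecoded characters (read-ahead cache) */
    size_t npending;            /* always below 5 between calls */
    unsigned char decoded[256]; /* decoded bytes waiting for the caller */
    size_t ndecoded;
} xdecoder;

/* Feed one input line. Returns 0 on success, -1 on a bad block or a full
   decoded cache. */
int xdecoder_feed(xdecoder *d, const char *line) {
    assert(d != NULL && line != NULL);
    size_t n = strcspn(line, "\r\n"); /* line length without the newline */
    /* Assumption: anything containing a non-Z85 character (a space, say) is
       a header, footer or comment and gets skipped, as do empty lines. */
    if (n == 0 || strspn(line, z85_alphabet) < n)
        return 0;
    for (size_t i = 0; i < n; i++) {
        d->pending[d->npending++] = line[i];
        if (d->npending < 5)
            continue;
        /* A block of 5 characters is complete: decode it to 4 bytes. */
        uint64_t v = 0;
        for (int k = 0; k < 5; k++)
            v = v * 85 + (uint64_t)(strchr(z85_alphabet, d->pending[k])
                                    - z85_alphabet);
        d->npending = 0;
        if (v > 0xFFFFFFFFu || d->ndecoded + 4 > sizeof d->decoded)
            return -1;
        d->decoded[d->ndecoded++] = (unsigned char)(v >> 24);
        d->decoded[d->ndecoded++] = (unsigned char)(v >> 16);
        d->decoded[d->ndecoded++] = (unsigned char)(v >> 8);
        d->decoded[d->ndecoded++] = (unsigned char)v;
    }
    return 0; /* leftover characters stay in pending for the next line */
}
```

With a zeroed xdecoder, feeding "HelloWo" decodes one block and keeps "Wo" in the read-ahead cache; feeding "rld" afterwards completes the second block. A line consisting of just ----- is treated as data under this rule, which is exactly the case that broke the naive marker detection.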

It's not done yet and I'm sure this stuff will cost me another couple of days. Damn. However, at least it's currently in a state where it can parse and decode a complete stream. But I haven't tried it with different block sizes and I didn't even dare to run the unit tests yet.

The new code isn't pushed to GitHub so far, because yesterday I had a major problem with git. Thank god I managed to solve it. "git merge", goddamnshit.

Update 02.03.2014 10:41:

So, finally I reverted the tilde stuff successfully. Now, PCP uses hyphens again. While I was at it, I enhanced the decoder and parser a lot. It's now more robust and parses the input linewise. Both the pcpstream decoder and the string decoder (z85_readstring()) now use the same framework. Previously they were independent from each other, so in fact I had two parsers. This was odd anyway so I generalized it.

The only remaining sore spot is clear signatures. They are parsed directly in ed.c and not by the stream decoder. I need to extend the stream decoder to be able to work on that stuff as well.

This GitHub commit was the last change needed to get into a stable state again. All unit tests pass again. Thank god (and my wife for her patience!)

Update 02.03.2014 23:11:

While I was at it, I fixed a couple of other encoding-related bugs, added unit tests for them, enhanced the command line a little and added a verbose key listing feature (someone on cypherpunks requested it). Usually a key listing looks like this:

pcp1 -l
Key ID               Type             Creation Time        Owner
0xB5B64D99AE73F3BE   primary secret   2014-03-02T23:14:23  Mallory 
0x629AFD2418EFA3BA   secret           2014-03-01T18:50:06  Alicia 
0x969D5931D7B409C6   valid public     2014-03-01T18:50:07  Bobby 
0x4EF5795E2874AD8D   valid public     2014-03-01T18:50:09  Bart 
Note the new validity flag for public keys.

Now, the new verbose listing:

pcp1 -L
Key ID               Type             Creation Time        Owner
0xB5B64D99AE73F3BE   primary secret   2014-03-02T23:14:23  Mallory 
    88b3a815 49c28236 7e6e3c31 17c286c5 7905c7a7 ec78911f 1fd76563 5688e4c0 
    encrypted: yes, serial: c940b8f0, version: 6

0x629AFD2418EFA3BA   secret           2014-03-01T18:50:06  Alicia 
    076f002c 37b39ab5 cb0818b7 1fe33168 38b4d7d6 1b6e52c2 25229159 5405ec86 
    encrypted: yes, serial: 5386733f, version: 6

0x969D5931D7B409C6   valid public     2014-03-01T18:50:07  Bobby 
    2d1efc28 ef294913 06a914be 986975d9 869d01e1 82ea026f a4c16c98 b6a2e2bb 
    signed: yes, serial: f7cb26b4, version: 6, signature fingerprint:
    324fde54 3f6725ee a8c74f67 998e5b61 10a6f2db cdb2f282 1a689be2 3af1e514 

0x4EF5795E2874AD8D   valid public     2014-03-01T18:50:09  Bart 
    9b660a1b a688d8fa 4a3b3a02 78b75363 70d01656 30045245 55d74944 f08bb5ab 
    signed: yes, serial: 1b4ed012, version: 6, signature fingerprint:
    5449b8f4 9f0fe50e 3e46c1be e9225e26 aa1354bf 6bd105c3 147a9870 8a531161 
Basically it displays the fingerprint of the keys, some flags and - if present - the key signature fingerprint. I store the signature anyway but didn't use or display it yet.

Update 04.03.2014 23:10:

I've got the last big change done: I removed -P and -S, and keys of any type are now imported with -K. The new importer uses the Pcpstream decoder as well, so the last remaining part that isn't using it is the clearsig reader. I also fixed more bugs in the decoder.