Newsgroup: mozilla.dev.tech.crypto
This document describes how the NSS code is organized, the libraries that get built from the NSS sources, and guidelines for writing NSS code. These guidelines will familiarize you with some of the ways things can be done in the NSS code. This will help you understand existing NSS code. It should also help you understand how to write new code, and where to place it.
Some of the guidelines in this document are more forward-looking than documentary. These rules are here to help us all immediately achieve more consistent and usable code, but some existing code won't follow all these rules.
This section explains the structure and relationships of the NSS libraries. The Layering section explains how the NSS code is layered, and how higher-level functions wrap low-level functions. The Libraries section describes the NSS libraries, the functionality each provides, and the layer in which the library (mostly) operates.
Each separate component of the API should live in its own layer. The functions in these APIs should never call API layers above them. In addition, some low-level APIs may be completely opaque to higher level layers. That is, access to these functions should only be provided by the API directly above them. The NSS APIs are layered, as shown in this diagram:
The boxes in the gray section, towards the center, are exported only through PKCS #11. PKCS #11 is only exported through the Wrappers. The areas which need the most work (both here and throughout the code) are:
NSS compiles into the libraries described below. The Layer indicates the main layer, seen in the previous diagram, in which the library operates. The Directory is the location of the library code in the NSS source tree. The Public Headers column lists the header files that contain types and functions that are publicly available to higher-level APIs.
Library | Description | Layer | Directory | Public Headers |
---|---|---|---|---|
certdb | Provides all certificate handling functions and types. The certdb library manipulates the certificate database (add, create, delete certificates and CRLs). It also provides general certificate-handling routines (create a certificate, verify, add/check certificate extensions). | Low Cert | lib/certdb | cdbhdl.h, certdb.h, cert.h, certt.h |
certhi | Provides high-level certificate-related functions that access neither the certificate database nor individual certificate data directly. Currently, OCSP checking settings are exported through certhi. | High Cert | lib/certhigh | ocsp.h, ocspt.h |
crmf | Provides functions and data types to handle Certificate Management Message Format (CMMF) and Certificate Request Message Format (CRMF, see RFC 2511) data. CMMF no longer exists as a proposed standard; CMMF functions have been incorporated into the proposal for Certificate Management Protocols (CMP). | Same Level as SSL | lib/crmf | cmmf.h, crmf.h, crmft.h, cmmft.h, crmffut.h |
cryptohi | Provides high-level cryptographic support operations, such as signing, verifying signatures, key generation, key manipulation, and hashing, along with the associated data types. This code is above the PKCS #11 layer. | Sign/Verify | lib/cryptohi | cryptohi.h, cryptoht.h, hasht.h, keyhi.h, keythi.h, key.h, keyt.h, sechash.h |
fort | Provides a PKCS #11 interface to Fortezza crypto services. Fortezza is a set of security algorithms used by the U.S. government. There is also a SWFT library that provides a software-only implementation of a PKCS #11 Fortezza token. | PKCS #11 | lib/fortcrypt | cryptint.h, fmutex.h, fortsock.h, fpkcs11.h, fpkcs11f.h, fpkcs11t.h, fpkmem.h, fpkstrs.h, genci.h, maci.h |
freebl | Provides the API to actual cryptographic operations. The freebl library is a wrapper API. You must supply a library that implements the cryptographic operations, such as BSAFE from RSA Security. This is also known as the "bottom layer" API, or BLAPI. | Within PKCS #11, wraps Crypto | lib/freebl | blapi.h, blapit.h |
jar | Provides support for reading and writing data in Java Archive (jar) format, including zlib compression. | Port | lib/jar | jar-ds.h, jar.h, jarfile.h |
nss | Provides high-level initialization and shutdown of security services. Specifically, this library provides NSS_Init() for establishing default certificate, key, and module databases, and for initializing a default random number generator. NSS_Shutdown() closes these databases to prevent further access by an application. | Above High Cert, High Key | lib/nss | nss.h |
pk11wrap | Provides access to PKCS #11 modules through a unified interface. The pk11wrap library provides functions for selecting and finding PKCS #11 modules and slots. It also provides functions that invoke operations in selected modules and slots, such as key selection and generation, signing, encryption, and decryption. | Crypto Wrapper | lib/pk11wrap | pk11func.h, secmod.h, secmodt.h |
pkcs12 | Provides functions and types for encoding and decoding PKCS #12 data. PKCS #12 can be used to encode keys and certificates for export or import into other applications. | PKCS #12 | lib/pkcs12 | pkcs12t.h, pkcs12.h, p12plcy.h, p12.h, p12t.h |
pkcs7 | Provides functions and types for encoding and decoding encrypted data in PKCS #7 format. For example, PKCS #7 is used to encrypt certificate data to exchange between applications, or to encrypt S/MIME message data. | PKCS #7 | lib/pkcs7 | secmime.h, secpkcs7.h, pkcs7t.h |
softoken | Provides a software implementation of a PKCS #11 module. | PKCS #11: implementation | lib/softoken | keydbt.h, keylow.h, keytboth.h, keytlow.h, secpkcs5.h, pkcs11.h, pkcs11f.h, pkcs11p.h, pkcs11t.h, pkcs11u.h |
ssl | Provides an implementation of the SSL protocol using NSS and NSPR. | SSL | lib/ssl | ssl.h, sslerr.h, sslproto.h, preenc.h |
secutil | Provides utility functions and data types used by other libraries. The library supports base-64 encoding/decoding, reader-writer locks, the SECItem data type, DER encoding/decoding, error types and numbers, OID handling, and secure random number generation. | Utility for any Layer | lib/util | base64.h, ciferfam.h, nssb64.h, nssb64t.h, nsslocks.h, nssrwlk.h, nssrwlkt.h, portreg.h, pqgutil.h, secasn1.h, secasn1t.h, seccomon.h, secder.h, secdert.h, secdig.h, secdigt.h, secitem.h, secoid.h, secoidt.h, secport.h, secrng.h, secrngt.h, secerr.h, watcomfx.h |
This section describes the rules that (ideally) should be followed for naming and identifying new files, functions, and data types.
Each file should include a CVS ID string for identification. The preferred format is:
"@(#) $RCSfile: nss-guidelines.html, v $ $Revision: 48936 $ $Date: 2009-08-11 07:45:57 -0700 (Tue, 11 Aug 2009) $ $Name$"
You can put the string in a comment or in a static char array. Use #ifdef DEBUG to include the array in debug builds only. The advantage of using an array is that you can use strings(1) to pull the ID tags out of a (debug) compiled library. You can even put them in header files; the header files are protected from double inclusion. The only catch is that you have to determine the name of the array.
Here is an example from lib/base/baset.h:
    #ifdef DEBUG
    static const char BASET_CVS_ID[] = "@(#) $RCSfile: nss-guidelines.html, v $ $Revision: 48936 $ $Date: 2009-08-11 07:45:57 -0700 (Tue, 11 Aug 2009) $ $Name$";
    #endif /* DEBUG */
The difference between this and Id is that Id has some useless information (every file is "experimental") and doesn't have Name. Name is the tag (if any) from which this file was pulled. If you're good with tagging your releases, and then checking out (or exporting!) from the tag for your build, this saves you from messing around with specific file revision numbers.
We have a preferred naming system for include files. We had been moving towards one for some time, but for the NSS 3.0 project we finally wrote it down.
Visibility | Data Types | Function Prototypes |
---|---|---|
Public | nss____t.h | nss____.h |
Friend (only if required) | nss____tf.h | nss____f.h |
NSS-private | ____t.h | ____.h |
Module-private | ____tm.h | ____m.h |
The files on the right include the files to their left; the files in a row include the files directly above them. Header files always include what they need; the files are protected against double inclusion (and even double opening by the compiler).
Note: It's not necessary that all eight files exist. Further, this is a simple ideal, and often reality is more complex.
We would like to keep names to 8.3, even if we no longer support win16. This usually gives us four characters to identify a module of NSS.
In short:
There are a number of ways of doing things in our API, as well as naming decisions for functions that can affect the usefulness of our library. If our library is self-consistent with how we accomplish these tasks, it makes it easier for the developer to learn how to use our functions. This section of the document should grow as we develop our API.
First some general rules. These rules are derived from existing coding practices inside the security library, since consistency is more important than debates about what might look nice.
There are many data structures in the security library whose definition is effectively private to the portion of the security library that defines and operates on those data structures. External code does not have access to these definitions. The goal here is to increase the opaqueness of these structures. This will allow us to modify the size, definition, and format of these data structures in future releases, without interfering with the operation of existing applications that use the security library.
The first task is to ensure the data structure definition lives in a private header file, while its declaration lives in the public header file. The current standard in the security library is to typedef the data structure name, so the easiest way to accomplish this is to add the typedef to the public header file.
For example, for the structure SECMyOpaqueData you would add:
typedef struct SECMyOpaqueDataStr SECMyOpaqueData;
and add the actual structure definition to the private header file. In this same example:
    struct SECMyOpaqueDataStr {
        unsigned long myPrivateData1;
        unsigned long myPrivateData2;
        char *myName;
    };
The second task is to determine whether individual data fields within the data structure are part of the API. One example may be the peerCert field in an SSL data structure. Accessor functions for these data elements should be added to the API.
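A minimal single-file sketch of the two tasks, built on the SECMyOpaqueData example above. The function names SEC_CreateMyOpaqueData, SEC_MyOpaqueDataGetName, and SEC_DestroyMyOpaqueData are hypothetical and are not part of NSS; in a real tree the three marked parts would live in the public header, the private header, and the implementation file respectively.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* --- public header: callers only ever see an incomplete type --- */
typedef struct SECMyOpaqueDataStr SECMyOpaqueData;

/* Hypothetical API, named after the conventions in this document. */
SECMyOpaqueData *SEC_CreateMyOpaqueData(const char *name);
const char *SEC_MyOpaqueDataGetName(const SECMyOpaqueData *data);
void SEC_DestroyMyOpaqueData(SECMyOpaqueData *data);

/* --- private header: the layout can change without breaking callers --- */
struct SECMyOpaqueDataStr {
    unsigned long myPrivateData1;
    unsigned long myPrivateData2;
    char *myName;
};

/* --- implementation file --- */
SECMyOpaqueData *SEC_CreateMyOpaqueData(const char *name)
{
    SECMyOpaqueData *data;
    size_t len;

    assert(name != NULL);
    data = calloc(1, sizeof *data);
    if (data == NULL)
        return NULL;
    len = strlen(name) + 1;
    data->myName = malloc(len);
    if (data->myName == NULL) {
        free(data);
        return NULL;
    }
    memcpy(data->myName, name, len);
    return data;
}

/* Accessor: the only way for external code to read myName. */
const char *SEC_MyOpaqueDataGetName(const SECMyOpaqueData *data)
{
    return data ? data->myName : NULL;
}

void SEC_DestroyMyOpaqueData(SECMyOpaqueData *data)
{
    if (data == NULL)
        return;
    free(data->myName);
    free(data);
}
```

Because external code never sees the structure body, fields can be added or reordered in a later release without recompiling callers.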
There can be legitimate exceptions to this 'make everything opaque' rule, for example container structures such as SECItem, or perhaps linked-list data structures. These data structures need to be examined on a case by case basis, to determine if
This section discusses memory allocation using arenas. NSS code uses arenas, and this section explains some of the improvements we are making.
NSS makes use of traditional memory allocation functions, wrapping NSPR's PR_Malloc in a util function called PORT_Alloc. NSS also makes use of an NSPR memory-allocation facility which uses 'Arenas' and 'ArenaPools'. This facility came to NSPR by way of JavaScript, as a fast, lightweight, non-thread-safe (though 'free-threaded') implementation.
Experience shows that users of the security library expect arenas to be threadsafe, so we added locking, and other useful changes.
The ARENA_THREADMARK preprocessor definition (default in debug builds), and the code it encloses, will add some checking for the following situation:
Threadmark code notes the thread ID whenever an arena is marked, and disallows any allocations or marks by any other thread. (Frees are allowed.)
The ARENA_DESTRUCTOR_LIST preprocessor definition, and the code it encloses, are an effort to make the following work together:
All these are useful, but they don't combine well. Now that some of the pointer-tracking pressure has eased off, we can drop its use when it becomes too difficult.
Many routines are defined to take an NSSArena *arenaOpt argument. This means if an arena is specified (non-null), it is used, otherwise (null) the routine uses the heap. You can think of the heap as a default arena you can't destroy.
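The arenaOpt convention can be sketched with a toy bump allocator. This is only an illustration under stated assumptions: ToyArena, toy_CreateArena, toy_Alloc, and toy_DestroyArena are hypothetical names, and the fixed-size, non-thread-safe arena here is far simpler than the NSS or NSPR arena code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy arena: one fixed buffer and a bump offset. */
typedef struct ToyArenaStr {
    unsigned char *base;
    size_t used;
    size_t capacity;
} ToyArena;

ToyArena *toy_CreateArena(size_t capacity)
{
    ToyArena *arena = malloc(sizeof *arena);
    if (arena == NULL)
        return NULL;
    arena->base = malloc(capacity);
    if (arena->base == NULL) {
        free(arena);
        return NULL;
    }
    arena->used = 0;
    arena->capacity = capacity;
    return arena;
}

/* Destroying the arena releases every allocation made from it at once. */
void toy_DestroyArena(ToyArena *arena)
{
    if (arena == NULL)
        return;
    free(arena->base);
    free(arena);
}

/*
 * arenaOpt non-null: carve the block out of the arena.
 * arenaOpt null:     fall back to the heap; the caller frees the block.
 */
void *toy_Alloc(ToyArena *arenaOpt, size_t size)
{
    size_t aligned;
    void *block;

    if (arenaOpt == NULL)
        return malloc(size);

    assert(arenaOpt->used <= arenaOpt->capacity);
    aligned = (size + 15u) & ~(size_t)15u;   /* round up to 16 bytes */
    if (aligned < size || aligned > arenaOpt->capacity - arenaOpt->used)
        return NULL;                         /* overflow or arena full */
    block = arenaOpt->base + arenaOpt->used;
    arenaOpt->used += aligned;
    return block;
}
```

A caller that passes an arena frees nothing individually and destroys the arena when done; a caller that passes NULL owns each returned block and releases it with free().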
NSS 3.0 introduces the concept of an error stack. When something goes wrong, the call stack unwinds, with routines returning an error indication. Each level that flags a problem adds its own error number to the stack. At the bottom of the stack is the fundamental error (for example, file not found), and on top is an error relating precisely to what you are doing.
Note: Error stacks are vertical, and never horizontal. If multiple things go wrong simultaneously, and you want to report them all, use another mechanism.
Errors, though integers, are done as external constants instead of preprocessor definitions. This is so that adding an error doesn't trigger a rebuild of the entire tree. Likewise, the external references to errors are made in the prototypes files, with the functions which can return them. Error stacks are thread-private.
The usual semantic is that public routines clear the stack first, private routines don't. Usually, every public routine has a private counterpart, and the implementation of the public routine looks like this:
    NSSImplement rv *
    NSSType_Method
    (
        NSSType *t,
        NSSFoo *arg1,
        NSSBar *arg2
    )
    {
        nss_ClearErrorStack();

    #ifdef DEBUG
        if( !nssFoo_verifyPointer(arg1) ) return (rv *)NULL;
        if( !nssBar_verifyPointer(arg2) ) return (rv *)NULL;
    #endif /* DEBUG */

        return nssType_Method(t, arg1, arg2);
    }
Aside from error cases, all documented entry points should check pointers in debug builds, wherever possible. Pointers to user-supplied buffers and templates should be checked against NULL. Pointers to context-style functions should be checked using special debug macros. These macros only define code when DEBUG is turned on, providing a way for systems to register, deregister, and check valid pointers.
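The error-stack behaviour described in this section can be sketched as follows. Everything here is a hypothetical stand-in (ToyError, the TOY_ERROR_ constants, and the toy_ and TOY_ functions are not NSS names), and the thread-private storage uses C11 `_Thread_local`, which is an assumption about the toolchain rather than what NSS does internally.

```c
#include <assert.h>
#include <stddef.h>

typedef int ToyError;   /* hypothetical stand-in for an error type */

/* A header would only declare these ("extern const ToyError ...;"), so
 * adding an error touches one source file, not a widely included header. */
const ToyError TOY_ERROR_FILE_NOT_FOUND = 1;
const ToyError TOY_ERROR_DB_OPEN_FAILED = 2;

#define TOY_ERROR_STACK_MAX 16

/* One stack per thread: errors from other threads never mix in. */
static _Thread_local ToyError toy_stack[TOY_ERROR_STACK_MAX];
static _Thread_local size_t toy_depth;

void toy_ClearErrorStack(void)
{
    toy_depth = 0;
}

void toy_SetError(ToyError error)
{
    if (toy_depth < TOY_ERROR_STACK_MAX)
        toy_stack[toy_depth++] = error;
}

/* Copies the stack into out, most recent error first; returns the count. */
size_t toy_GetErrorStack(ToyError *out, size_t max)
{
    size_t n = toy_depth < max ? toy_depth : max;
    size_t i;

    assert(out != NULL || max == 0);
    for (i = 0; i < n; i++)
        out[i] = toy_stack[toy_depth - 1 - i];
    return n;
}

/* Lowest level: records the fundamental error. */
static int toy_openFile(void)
{
    toy_SetError(TOY_ERROR_FILE_NOT_FOUND);
    return -1;
}

/* Private routine: adds its own error on top, never clears the stack. */
static int toy_OpenDB(void)
{
    if (toy_openFile() != 0) {
        toy_SetError(TOY_ERROR_DB_OPEN_FAILED);
        return -1;
    }
    return 0;
}

/* Public routine: clears the stack first, then calls its private twin. */
int TOY_OpenDB(void)
{
    toy_ClearErrorStack();
    return toy_OpenDB();
}
```

After a failed TOY_OpenDB() call the stack holds the database error on top of the file error, and a second call starts from an empty stack again because the public entry point clears it.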
SECPORT_DECL_PTR_CLASS(classname, size) - Declare a class of pointers (labelled classname) this object file needs to check. This class is local only to this object file. Size is the expected number of pointers of type classname.
SECPORT_DECL_GLOBAL_PTR_CLASS(classname, size) - Same as above, except classname can be used in other object files.
SECPORT_ADD_POINTER(classname, pointer) - Add pointer as a valid pointer for class classname. This is usually called by a Create function.
SECPORT_VERIFY_POINTER(classname, pointer, secError, returnValue) - Check if a given pointer really belongs to the requested class. If it doesn't, set the error secError and return the value returnValue.
SECPORT_REMOVE_POINTER(classname, pointer) - Remove a pointer from the valid list. This is usually called by a Destroy function.
Finally, error logging should be added and documented when debug is turned on. Interfaces for these are in NSPR.
Code developed using the NSS APIs needs to make use of thread safety features. First to examine is object creation and deletion.
Object creation is usually not a problem. No other threads have access to allocated memory just created. Exceptions to this include objects which are created on the fly, or as global objects.
Deletion, on the other hand, may be trickier. Threads may be referencing the object at the same time another thread tries to delete it. The semantics depend on the way the application uses the object, and on how and when the application wants to destroy it. For some data structures, this problem can be removed by protected reference counting. The object does not disappear until all users have released it.
Next we examine global data, including function-local static structures. Global data that is initialized once and never changed does not need protection from mutexes. We should also determine if global data should be moved to a session context (see session context and global effects below).
Note: Permanent objects, like data in files, databases, tokens, etc. should be treated as global data. Global data which is changed rarely should be protected by reader/writer locks.
Aside from global data, allocated data that gets modified needs to be examined. Data that has just been allocated within a function is safe to modify. No other code has access to that data pointer. Once that data pointer is made visible to the 'outside', either by returning the pointer or by attaching the pointer to an existing visible data structure, access to the data should be protected. Data structures that are read only, like SECKEYPublicKeys or PK11SymKeys, need not be protected.
Many of the data structures in the security code contain some sort of session state or session context. These data structures may be accessed without data protection as long as:
Examples of these data structures may include things like the PKCS #7 ContentInfo structure. Example code should be included in the documentation, to show how to safely use these data objects.
A major type of global and allocated data that should be examined is various data on lists. Queued, linked, and hash table stored objects should be examined with special care. Make sure adding, removing, accessing, and destroying these objects are all safe operations.
There are a number of strategies, and entire books about how to safely access data on lists. Some simple strategies and their issues:
Where possible use the NSPR list primitives. From these you can even set up SECUtil style thread-safe lists that use some combination of the above strategies.
In order to be fully thread safe, your code must understand the semantics of the service functions it calls, and whether they are thread safe. For now, we should internally document which service functions we call, and how we expect them to behave in a threaded environment.
Finally, from an API point of view, we should examine functions which have global effects. Functions like XXX_SetDefaultYYY() should not operate on global data, particularly if they may be called multiple times to provide different semantics for different operations. For example, the following should be avoided:
Instead, a context handle should be created, and the SEC_SetKey() call above should be made on that handle. Fortunately most of the existing API has the correct semantics.
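As an illustration of the context-handle approach only, here is a minimal sketch. The TOYSignContext type and the TOY_ functions are hypothetical; they are not NSS APIs and do not reproduce the example referred to above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical opaque context: per-operation state lives here, not in a
 * process-wide variable set by something like TOY_SetDefaultKey(). */
typedef struct TOYSignContextStr TOYSignContext;

struct TOYSignContextStr {
    int keyID;
};

TOYSignContext *TOY_CreateSignContext(void)
{
    TOYSignContext *cx = malloc(sizeof *cx);
    if (cx != NULL)
        cx->keyID = -1;   /* no key selected yet */
    return cx;
}

/* The setter affects only the handle it is given. */
void TOY_SignContextSetKey(TOYSignContext *cx, int keyID)
{
    assert(cx != NULL);
    cx->keyID = keyID;
}

int TOY_SignContextGetKey(const TOYSignContext *cx)
{
    assert(cx != NULL);
    return cx->keyID;
}

void TOY_DestroySignContext(TOYSignContext *cx)
{
    free(cx);
}
```

Two operations running side by side can each hold their own context with a different key, and neither call changes what the other sees.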
The exception to this global effects rule may be functions which set global state for an application at initialization time.
If a layer has some global initialization tasks, which need to be completed before the layer can be used, that layer should supply an initialization function of the form LAYER_Init(). If an initialization function is supplied, a corresponding LAYER_Shutdown() function should also be supplied. LAYER_Init() should increment a count of the number of times it is called, and LAYER_Shutdown() should decrement that count, and shut down when the count reaches '0'.
Open functions should have a corresponding close function. Unlike init and shutdown functions, open and close functions are not reference counted.
In general, data objects should all have functions which create them. These functions should have the form LAYER_CreateDataType[FromDataType](). For instance, generating a new key would change from PK11_KeyGen() to PK11_CreateSymKey().
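The counting rule for LAYER_Init() and LAYER_Shutdown() can be sketched like this, with TOY standing in for the layer name. The functions are hypothetical, and the sketch is single-threaded; real code would need to protect the counter with a lock.

```c
#include <assert.h>
#include <limits.h>

static int toy_initCount;   /* number of TOY_Init() calls not yet shut down */
static int toy_ready;       /* nonzero while the layer's global state exists */

/* Returns 0 on success. Only the first call does the real setup. */
int TOY_Init(void)
{
    assert(toy_initCount < INT_MAX);
    if (toy_initCount++ == 0) {
        /* global initialization tasks would go here */
        toy_ready = 1;
    }
    return 0;
}

/* Returns 0 on success, -1 if called without a matching TOY_Init().
 * Only the call that drops the count to zero tears the layer down. */
int TOY_Shutdown(void)
{
    if (toy_initCount == 0)
        return -1;
    if (--toy_initCount == 0) {
        /* global cleanup tasks would go here */
        toy_ready = 0;
    }
    return 0;
}

int TOY_IsInitialized(void)
{
    return toy_ready;
}
```

With this shape, two independent components can each call TOY_Init() and TOY_Shutdown() without one of them shutting the layer down under the other.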
In the security library we have 3 different ways of saying 'get rid of this data object': Free, Delete, and Destroy.
It turns out there are several different semantics of getting rid of a data object too:
Unfortunately, within the security library Free, Delete, and Destroy are all used interchangeably, for all sorts of object destruction. For instance, CERT_DestroyCertificate() is type 1, PK11_DestroySlot() is type 2, and PK11_DestroyTokenObject() is type 3.
Note: In non-reference counted functions, types 1 and 2 are the same.
We are standardizing on the following definitions:
Destroy - means #1 for reference counted objects, #2 for non reference counted objects.
Delete - means #3.
This has the advantage of not surfacing the reference countedness of a data object. If you own a pointer to an object, you must always destroy it. There is no way to destroy an object by bypassing its reference count. Also, the signatures of public destruction functions do not have the 'freeit' PRBool, since the structures being freed are opaque.
Functions that return a new reference or copy of a given object should have the form LAYER_DupDataType(). For instance, CERT_DupCertificate() will remain the same, but PK11_ReferenceSlot() will become PK11_DupSlot(), and PK11_CloneContext() will become PK11_DupContext().
There are several different kinds of searches done via the security library. The first is a search for exactly one object matching given criteria. These types of searches include CERT_FindCertByDERCert(), PK11_FindAnyCertFromDERCert(), PK11_FindKeyByCert(), and PK11_GetBestSlot(). These functions should all have the form LAYER_FindDataType[ByDataType]().
The second kind of search looks for all the objects that match given criteria. These functions operate on a variety of levels. Some return allocated arrays of data, some return linked lists of data, others use callbacks to return data elements one at a time. Unfortunately, there are good reasons to maintain all these types. So here are some guidelines to make them more manageable:
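A minimal sketch of the Dup and Destroy conventions on a reference-counted object. TOYSlot and its functions are hypothetical names, and the count is protected with C11 atomics as a stand-in for whatever locking a real implementation would use. The integer returned by TOY_DestroySlot() exists only so the behaviour is observable in the sketch; as described above, a public destroy function would not reveal whether the object was actually freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct TOYSlotStr TOYSlot;   /* hypothetical opaque type */

struct TOYSlotStr {
    atomic_int refCount;
    int slotID;
};

/* Create: the caller receives the first reference. */
TOYSlot *TOY_CreateSlot(int slotID)
{
    TOYSlot *slot = malloc(sizeof *slot);
    if (slot == NULL)
        return NULL;
    atomic_init(&slot->refCount, 1);
    slot->slotID = slotID;
    return slot;
}

/* Dup: hand out another reference to the same object. */
TOYSlot *TOY_DupSlot(TOYSlot *slot)
{
    assert(slot != NULL);
    atomic_fetch_add(&slot->refCount, 1);
    return slot;
}

/* Destroy: release one reference; the memory goes away with the last one.
 * Returns 1 if the object was freed, 0 otherwise (for illustration only). */
int TOY_DestroySlot(TOYSlot *slot)
{
    if (slot == NULL)
        return 0;
    if (atomic_fetch_sub(&slot->refCount, 1) == 1) {
        free(slot);
        return 1;
    }
    return 0;
}
```

Every pointer obtained from Create or Dup is paired with exactly one Destroy call, so callers never need to know whether other holders exist.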
All callback operating search functions should be in the low level of the API, if exposed at all. Developers dealing with SSL and PKCS #7 layers should not have to see any of these functions. These functions should have the form LAYER_TraverseStorageObjectOrList().
List and Array returning functions should be available at the higher layers of the API, most wrapping LAYER_Traverse() functions. They should have the form LAYER_LookupDataType{List|Array}[ByDataType]().
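The relationship between the two kinds of multi-object search can be sketched as a low-level callback traversal wrapped by a higher-level array-returning lookup. The certificate store, TOY_TraverseCertStore, and TOY_LookupCertArrayByPrefix are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory "store" of certificate nicknames. */
static const char *const toy_store[] = {
    "ca-root", "ca-intermediate", "server-www", "server-mail"
};
#define TOY_STORE_COUNT (sizeof toy_store / sizeof toy_store[0])

typedef int (*TOYCertCallback)(void *arg, const char *name);

/* Low level: call cb for each entry. A nonzero return from cb stops the
 * traversal and is passed back to the caller. */
int TOY_TraverseCertStore(TOYCertCallback cb, void *arg)
{
    size_t i;

    assert(cb != NULL);
    for (i = 0; i < TOY_STORE_COUNT; i++) {
        int rv = cb(arg, toy_store[i]);
        if (rv != 0)
            return rv;
    }
    return 0;
}

/* State shared between the lookup wrapper and its callback. */
struct toy_Collect {
    const char *prefix;
    const char **out;
    size_t max;
    size_t count;
};

static int toy_collect(void *arg, const char *name)
{
    struct toy_Collect *c = arg;

    if (strncmp(name, c->prefix, strlen(c->prefix)) != 0)
        return 0;               /* no match: keep going */
    if (c->count == c->max)
        return 1;               /* output array full: stop early */
    c->out[c->count++] = name;
    return 0;
}

/* High level: fill out[] with up to max matches and return the count. */
size_t TOY_LookupCertArrayByPrefix(const char *prefix,
                                   const char **out, size_t max)
{
    struct toy_Collect c;

    c.prefix = prefix;
    c.out = out;
    c.max = max;
    c.count = 0;
    (void)TOY_TraverseCertStore(toy_collect, &c);
    return c.count;
}
```

Code in the upper layers calls only the lookup function and never has to write a traversal callback itself.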
Accessor Functions should take the following formats:
LAYER_DataTypeGetElement() -- Get a specific element of a data structure.
LAYER_DataTypeSetElement() -- Set a specific element of a data structure.
LAYER_DataTypeExtractDataType() -- Get a pointer to the second data type, which is derived from elements of the first data type.
Examples: PK11_SlotGetSeries(), PK11_SymKeyGetSeries(), CERT_CertificateExtractPublicKey()
Most functions will have a 'natural' ordering for parameters. To keep the API consistent, we should follow some minimal parameter-ordering rules. Most functions can be seen as operating on a particular object. The object that the function is operating on should come first. For instance, in most SSL functions this is the NSPR socket or the SSL socket structure; update, final, encrypt, and decrypt style functions operate on their state contexts; and so on.
All encrypt and decrypt functions, which return data inline, should have a consistent signature:
SECStatus MY_FunctionName(MyContext *context, unsigned char *outBuf, SECBufferLen *outLen, SECBufferLen maxOutLength, unsigned char *inBuf, SECBufferLen inLen)
Encrypt and decrypt like functions which have different properties, additional parameters, callbacks, etc., should insert their additional parameters between the context (first parameter) and the output buffer.
All hashing update, MACing update, and encrypt/decrypt functions which act like filters should have a consistent signature:
SECStatus PK11_DigestOp(PK11Context *context, unsigned char *inBuf, SECBufferLen inLen)
Functions like these which have different properties, for example, additional parameters, callbacks, etc., should insert their additional parameters between the context (first parameter) and the input buffer.
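The two signature shapes, and the placement of an extra parameter, can be sketched as below. The typedefs are stand-ins so that the block compiles on its own (in particular, SECBufferLen is assumed to be an unsigned length type), the MY_ functions are hypothetical, and the XOR and byte-sum operations are placeholders rather than real cryptography.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for this sketch only; real code would use the NSS headers. */
typedef enum { SECFailure = -1, SECSuccess = 0 } SECStatus;
typedef unsigned int SECBufferLen;

typedef struct MyContextStr {
    unsigned char key;        /* toy "cipher" state */
    unsigned long checksum;   /* toy "digest" state */
} MyContext;

/* Inline-output shape: context, output buffer, output length,
 * output capacity, input buffer, input length. */
SECStatus MY_Encrypt(MyContext *context,
                     unsigned char *outBuf, SECBufferLen *outLen,
                     SECBufferLen maxOutLength,
                     unsigned char *inBuf, SECBufferLen inLen)
{
    SECBufferLen i;

    assert(context != NULL);
    if (outBuf == NULL || outLen == NULL || (inBuf == NULL && inLen != 0))
        return SECFailure;
    if (inLen > maxOutLength)
        return SECFailure;    /* the result would not fit in outBuf */
    for (i = 0; i < inLen; i++)
        outBuf[i] = (unsigned char)(inBuf[i] ^ context->key);
    *outLen = inLen;
    return SECSuccess;
}

/* Filter shape: context, input buffer, input length. */
SECStatus MY_DigestOp(MyContext *context,
                      unsigned char *inBuf, SECBufferLen inLen)
{
    SECBufferLen i;

    assert(context != NULL);
    if (inBuf == NULL && inLen != 0)
        return SECFailure;
    for (i = 0; i < inLen; i++)
        context->checksum += inBuf[i];
    return SECSuccess;
}

/* Extra parameter: placed between the context and the output buffer. */
SECStatus MY_EncryptWithTweak(MyContext *context, unsigned char tweak,
                              unsigned char *outBuf, SECBufferLen *outLen,
                              SECBufferLen maxOutLength,
                              unsigned char *inBuf, SECBufferLen inLen)
{
    MyContext tweaked;

    assert(context != NULL);
    tweaked = *context;
    tweaked.key = (unsigned char)(context->key ^ tweak);
    return MY_Encrypt(&tweaked, outBuf, outLen, maxOutLength, inBuf, inLen);
}
```

Keeping the buffer parameters in the same relative order means a caller who has used one of these functions can guess the argument order of the others.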
Within your layer, multiple similar functions should have consistent parameter order.
Callback functions should all contain an opaque parameter (void *) as their first argument, passed by the original caller. Callbacks which are set, like SSL callbacks, should have defaults which provide generally useful semantics.
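A small sketch of both callback rules: the opaque `void *` comes first and is handed back untouched, and a socket that never had a callback set still behaves sensibly. TOYSocket, its functions, and the choice of "reject every peer" as the default are assumptions made for this illustration, not NSS behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callback type: the caller's opaque argument is the first parameter. */
typedef int (*TOYAuthCallback)(void *arg, const char *peerName);

typedef struct TOYSocketStr {
    TOYAuthCallback authCallback;
    void *authArg;
} TOYSocket;

/* Default used when the application sets nothing: reject every peer. */
static int toy_DefaultAuth(void *arg, const char *peerName)
{
    (void)arg;
    (void)peerName;
    return 0;
}

void TOY_InitSocket(TOYSocket *sock)
{
    assert(sock != NULL);
    sock->authCallback = toy_DefaultAuth;
    sock->authArg = NULL;
}

/* Stores the callback together with the argument to pass back to it. */
void TOY_SocketSetAuthCallback(TOYSocket *sock, TOYAuthCallback cb, void *arg)
{
    assert(sock != NULL);
    sock->authCallback = (cb != NULL) ? cb : toy_DefaultAuth;
    sock->authArg = arg;
}

int TOY_SocketAuthenticate(TOYSocket *sock, const char *peerName)
{
    assert(sock != NULL);
    return sock->authCallback(sock->authArg, peerName);
}

/* Application side: the opaque argument carries the caller's own state. */
typedef struct AppStateStr {
    const char *trustedPeer;
    int calls;
} AppState;

int app_CheckPeer(void *arg, const char *peerName)
{
    AppState *state = arg;

    state->calls++;
    return strcmp(peerName, state->trustedPeer) == 0;
}
```

The library never interprets the opaque argument, so the application can pass any state it needs without resorting to global variables.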