<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Cryptech Project - FutureWork</title><link href="https://wiki.cryptech.is/" rel="alternate"></link><link href="https://wiki.cryptech.is/feeds/futurework.atom.xml" rel="self"></link><id>https://wiki.cryptech.is/</id><updated>2017-07-27T19:02:00+00:00</updated><entry><title>Secure Channel</title><link href="https://wiki.cryptech.is/SecureChannel" rel="alternate"></link><published>2017-07-27T00:24:00+00:00</published><updated>2017-07-27T19:02:00+00:00</updated><author><name>Rob Austein</name></author><id>tag:wiki.cryptech.is,2017-07-27:/SecureChannel</id><summary type="html">&lt;p&gt;This is a sketch of a design for the secure channel that we want to
have between the Cryptech HSM and the client libraries which talk to
it.  Work in progress, and not implemented yet because a few of the
pieces are still missing.&lt;/p&gt;
&lt;h2&gt;Design goals and constraints&lt;/h2&gt;
&lt;p&gt;Basic design …&lt;/p&gt;</summary><content type="html">&lt;p&gt;This is a sketch of a design for the secure channel that we want to
have between the Cryptech HSM and the client libraries which talk to
it.  Work in progress, and not implemented yet because a few of the
pieces are still missing.&lt;/p&gt;
&lt;h2&gt;Design goals and constraints&lt;/h2&gt;
&lt;p&gt;Basic design goals:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;End-to-end between client library and HSM.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Not require yet another presentation layer if we can avoid it (so,
    reuse XDR if possible, unless we have some strong desire to switch
    to something else).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Provide end-to-end message integrity between client library and HSM.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Provide end-to-end message confidentiality between client library
    and HSM.  We only need this for a few operations, but between PINs
    and private keys it would be simpler just to provide it all the
    time than to be selective.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Provide some form of mutual authentication between client library
    and HSM.  This is tricky, since it requires either configuration
    (of the other party's authenticator) or leap-of-faith.
    Leap-of-faith is probably good enough for most of what we really
    care about (ensuring that we're talking to the same dog now as we
    were earlier).&lt;/p&gt;
&lt;p&gt;Not 100% certain we need this at all, but if we're going to leave
ourselves wide open to monkey-in-the-middle attacks, there's not
much point in having a secure channel at all.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Use boring simple crypto that we already have (or almost have) and
    which runs fast.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Continue to support multiplexer.  Taken together with end-to-end
    message confidentiality, this may mean two layers of headers: an
    outer set which the multiplexer is allowed to mutate, then an
    inner set which is protected.  Better, though, would be if the
    multiplexer can work just by reading the outer headers without
    modifying anything.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Simple enough that we can implement it easily in HSM, PKCS #11
    library, and Python library.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Why not TLS?&lt;/h2&gt;
&lt;p&gt;We could, of course, Just Use TLS.  Might end up doing that, if it
turns out to be easier, but TLS is a complicated beast, with far more
options than we need, and doesn't provide all of what we want, so a
fair amount of the effort would be, not wasted exactly, but a giant
step sideways.  Absent sane alternatives, I'd just suck it up and do
this, with a greatly restricted ciphersuite, but I think we have a
better option.&lt;/p&gt;
&lt;h2&gt;Design&lt;/h2&gt;
&lt;p&gt;Basic design lifted from "Cryptography Engineering: Design Principles
and Practical Applications" (ISBN 978-0-470-47424-2,
http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470474246.html),
tweaked in places to fit tools we have readily available.&lt;/p&gt;
&lt;p&gt;Toolkit:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;AES&lt;/li&gt;
&lt;li&gt;SHA-2&lt;/li&gt;
&lt;li&gt;ECDH&lt;/li&gt;
&lt;li&gt;ECDSA&lt;/li&gt;
&lt;li&gt;XDR&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As in the book, there are two layers here: the basic secure channel,
moving encrypted-and-authenticated frames back and forth, and a higher
level which handles setup, key agreement, and endpoint authentication.&lt;/p&gt;
&lt;p&gt;Chapter 7 outlines a simple lower layer using AES-CTR and
HMAC-SHA-256.  I don't see any particular reason to change any of
this; AES-CTR is easy enough.  I suppose it might be worth looking
into AES-CCM and AES-GCM, but they're somewhat more complicated.
Section 7.5 ("Alternatives") discusses these briefly, and we also know
some of the authors.&lt;/p&gt;
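&lt;p&gt;As a rough illustration of the integrity half of such a lower layer, here is a minimal Python sketch of frames carrying a message number and an HMAC-SHA-256 tag.  It is not the book's exact frame layout and not Cryptech code: the names &lt;code&gt;seal&lt;/code&gt;, &lt;code&gt;open_frame&lt;/code&gt; and &lt;code&gt;FrameError&lt;/code&gt;, the 64-bit message number, and the strict in-order check are all assumptions made for the example.  AES-CTR encryption is left out entirely, because the Python standard library has no AES.&lt;/p&gt;

```python
import hashlib
import hmac
import struct

HEADER_LEN = 8   # 64-bit big-endian message number (assumed width)
TAG_LEN = 32     # HMAC-SHA-256 output size in bytes


class FrameError(Exception):
    """Raised when a frame is truncated, forged, or out of sequence."""


def seal(mac_key, seq, payload):
    """Build seq || payload || HMAC-SHA-256(mac_key, seq || payload)."""
    header = struct.pack("!Q", seq)
    tag = hmac.new(mac_key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


def open_frame(mac_key, expected_seq, frame):
    """Check the tag and the message number, then return the payload."""
    # min() equals the minimum frame size only if the frame is long enough.
    if min(len(frame), HEADER_LEN + TAG_LEN) != HEADER_LEN + TAG_LEN:
        raise FrameError("truncated frame")
    header = frame[:HEADER_LEN]
    payload = frame[HEADER_LEN:-TAG_LEN]
    tag = frame[-TAG_LEN:]
    expected_tag = hmac.new(mac_key, header + payload, hashlib.sha256).digest()
    # Constant-time comparison, done before trusting any field of the frame.
    if not hmac.compare_digest(tag, expected_tag):
        raise FrameError("bad authentication tag")
    (seq,) = struct.unpack("!Q", header)
    if seq != expected_seq:
        raise FrameError("unexpected message number (replay or loss)")
    return payload


# Hypothetical usage: an all-zero demo key stands in for a negotiated key.
demo_key = bytes(32)
demo_frame = seal(demo_key, 0, b"hello")
print(open_frame(demo_key, 0, demo_frame))
```

&lt;p&gt;A real channel would presumably use separate keys and counters for each direction, and would encrypt the payload before it ever reaches the wire.&lt;/p&gt;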
&lt;p&gt;For key agreement we probably want to use ECDH.  We don't quite have
that yet, but in theory it's relatively minor work to generalize our
existing ECDSA code to cover that too, and, again in theory, it should
be possible to generalize our existing ECDSA fast base point multiplier
Verilog cores into fast point multiplier cores (sic: limitation of the
current cores is that they only compute scalar times the base point,
not scalar times an arbitrary point, which is fine for ECDSA but
doesn't work for ECDH).&lt;/p&gt;
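&lt;p&gt;To make the base point versus arbitrary point distinction concrete, here is a toy Python sketch of ECDH on a tiny textbook curve (y^2 = x^3 + 2x + 2 over GF(17), base point (5, 1), group order 19).  The curve, the function names and the private scalars are illustrative assumptions and have nothing to do with the curves or the Verilog cores Cryptech actually uses; the double-and-add loop is neither constant-time nor secure.&lt;/p&gt;

```python
P_MOD = 17   # field prime of the toy curve
A = 2        # coefficient a in y^2 = x^3 + a*x + b
G = (5, 1)   # base point; None represents the point at infinity


def point_add(p1, p2):
    """Add two points on the toy curve."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p2 is the negation of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k, point):
    """Double-and-add: compute k * point for an arbitrary point."""
    result = None
    addend = point
    while k:
        if k % 2:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k //= 2
    return result


alice_priv, bob_priv = 3, 7              # made-up private scalars
alice_pub = scalar_mult(alice_priv, G)   # base point multiply, as in ECDSA
bob_pub = scalar_mult(bob_priv, G)
# ECDH multiplies by the peer's public point, which is not the base point.
alice_shared = scalar_mult(alice_priv, bob_pub)
bob_shared = scalar_mult(bob_priv, alice_pub)
print(alice_shared == bob_shared)
```

&lt;p&gt;The first two multiplications only ever use the fixed base point, which is all that key generation and ECDSA signing need; the last two take an arbitrary point as input, which is the operation a base-point-only core cannot perform.&lt;/p&gt;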
&lt;p&gt;For signature (mutual authentication) we probably want to use ECDSA,
again because we have it and it's fast.  The more interesting question
is the configuration vs leap-of-faith discussion, figuring out under
which circumstances we really care about the peer's identity, and
figuring out how to store state.&lt;/p&gt;
&lt;p&gt;Chapter 14 (key negotiation) of the same book covers the rest of the
protocol, substituting ECDH and ECDSA for DH and RSA, respectively.
As noted in the text, we could use a shared secret key and a MAC
function instead of public key based authentication.&lt;/p&gt;
&lt;p&gt;Alternatively, the Station-to-Station protocol described in 4.6.1 of
"Guide to Elliptic Curve Cryptography" (ISBN 978-0-387-95273-4,
https://link.springer.com/book/10.1007/b97644) appears to do what
we want, straight out of the box.&lt;/p&gt;
&lt;p&gt;Interaction with multiplexer is slightly interesting.  The multiplexer
really only cares about one thing: being able to match responses from
the HSM to queries sent into the HSM, so that the multiplexer can send
the responses back to the right client.  At the moment, it does this
by seizing control of the client_handle field in the RPC frame, which
it can get away with doing because there's no end-to-end integrity
check at all (yuck).  We could add an outer layer of headers for the
multiplexer, but would rather not.&lt;/p&gt;
&lt;p&gt;The obvious "real" identity for clients to use would be the public
keys (ECDSA in the above discussion) they use to authenticate to the
HSM, or a hash (perhaps truncated) thereof.  That's good as far as it
goes, and may suffice if we can assume that clients always have unique
keys, but if client keys are something over which the client has any
control (which includes selecting where they're stored, which we may
not be able to avoid), we have to consider the possibility of multiple
clients using the same key (yuck).  So a candidate replacement for the
client_handle for multiplexer purposes would be some combination of a
public key hash and a process ID, both things the client could provide
without the multiplexer needing to do anything.&lt;/p&gt;
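&lt;p&gt;A minimal Python sketch of such an identifier follows.  The function name, the 8-byte hash truncation, the 32-bit process ID encoding, and the placeholder key bytes are all assumptions for illustration; the actual width of the client_handle field and the public key encoding are not settled here.&lt;/p&gt;

```python
import hashlib
import os
import struct


def client_identifier(public_key_bytes, pid=None, hash_len=8):
    """Return truncated SHA-256(public key) followed by a 32-bit process ID."""
    if pid is None:
        pid = os.getpid()
    key_hash = hashlib.sha256(public_key_bytes).digest()[:hash_len]
    # Reduce modulo 2**32 in case the platform allows larger process IDs.
    return key_hash + struct.pack("!I", pid % 2**32)


# Placeholder bytes standing in for an encoded ECDSA public key.
demo_key = b"\x04" + bytes(64)
print(client_identifier(demo_key).hex())
```

&lt;p&gt;Two processes sharing one key would get identifiers that differ only in the trailing process ID bytes, which is the case the combination is meant to cover.&lt;/p&gt;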
&lt;p&gt;The one argument in favor of leaving control of this to the
multiplexer (rather than the endpoints) is that it would (sort of)
protect against one client trying to masquerade as another -- but
that's really just another reason why clients should have their own
keys to the extent possible.&lt;/p&gt;
&lt;p&gt;As a precaution, perhaps the multiplexer should check for duplicate
identifiers, then do, um, something? if it finds duplicates.  This
kind of violates Steinbach's Guideline for Systems Programming ("Never
test for an error condition you don't know how to handle").  Obvious
answer is to break all connections old and new using the duplicate
identity, minor questions about how to reset from that, whether worth
doing at all, etc.  Maybe clients just shouldn't do that.&lt;/p&gt;
&lt;h2&gt;Open issues&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Does the resulting design pass examination by clueful people?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Does this end up still being significantly simpler than TLS?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The Cryptography Engineering protocols include a hack to work
    around a length extension weakness in SHA-2 (see section 5.4.2).
    Do we need this?  Would we be better off using SHA-3 instead?  The
    book claims that SHA-3 was expected to fix this, but that was
    before NIST pissed away their reputation by getting too cosy with
    the NSA again.  Over my head, ask somebody with more clue.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;</content><category term="FutureWork"></category></entry><entry><title>Development of a Cryptech ASIC Implementation</title><link href="https://wiki.cryptech.is/ASICImplementations" rel="alternate"></link><published>2016-12-15T22:44:00+00:00</published><updated>2016-12-15T22:44:00+00:00</updated><author><name>Cryptech Core Team</name></author><id>tag:wiki.cryptech.is,2016-12-15:/ASICImplementations</id><summary type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;The aim of the Cryptech project is to develop an open, free, and
auditable HSM.  The Cryptech HSM includes both SW and HW parts.  In at
least the first iteration of the Cryptech HSM, the HW parts are
implemented using FPGA devices.  However, the ability to implement the
HW …&lt;/p&gt;</summary><content type="html">&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;The aim of the Cryptech project is to develop an open, free, and
auditable HSM.  The Cryptech HSM includes both SW and HW parts.  In at
least the first iteration of the Cryptech HSM, the HW parts are
implemented using FPGA devices.  However, the ability to implement the
HW parts in a Cryptech ASIC device in a future iteration is anticipated
in the design.  This text provides a short description of what the HW
part of the Cryptech HSM contains, the design style used, and what would
have to change in order to implement the HW part in an ASIC.&lt;/p&gt;
&lt;h2&gt;General digital functions and internal memories&lt;/h2&gt;
&lt;p&gt;The Cryptech digital functionality cores, such as the SHA-256 core, are
written in generic RTL (Register Transfer Level) Verilog code.  The code
is written in a fairly conservative coding style and uses language
features from IEEE 1364-2001 (aka Verilog 2001).&lt;/p&gt;
&lt;p&gt;All RTL code is divided into modules. Each module contains one process for register updates and reset (&lt;em&gt;reg_update&lt;/em&gt;) and one or more combinational processes for the datapath and for support logic such as counters. If needed, a module also has a separate process that implements the finite state machine controlling the behaviour of the module.&lt;/p&gt;
&lt;p&gt;Each core is divided into a core module, for example &lt;em&gt;sha256_core.v&lt;/em&gt;, and a number of submodules that the core instantiates. The core provides raw, wide ports (a 256-bit wide key for AES, for example) that are not suitable for use in a stand-alone system. Instead, each core comes with a top level wrapper, for example &lt;em&gt;sha256.v&lt;/em&gt;. This top level wrapper contains all registers and logic needed to provide all functionality of the core via a simple 32-bit memory-like interface. If the core is going to be used as a tightly integrated submodule, the wrapper can be discarded. Similarly, if the core is going to be used in a bus system that uses a specific bus standard such as AMBA AHB, CoreConnect or WISHBONE, only the top level wrapper needs to be replaced or modified to match the desired bus standard.&lt;/p&gt;
&lt;p&gt;The RTL code does not explicitly instantiate any hard macros such as
memories, multipliers, etc.  Instead all such functions are left to the
synthesis tool to infer based on the code. All memories are placed in separate modules to allow easy modification of the design. In an ASIC setting any memories not automatically mapped will be replaced by instantiation of specific macros.&lt;/p&gt;
&lt;p&gt;Some of the memories in the designs have combinational read (i.e. the read
data is not held in an output register, which would imply a one cycle read
latency). For some FPGA technologies these memories are not compatible with the available physical memories. The synthesis tools therefore implement these memories
using separate registers rather than selecting a memory instance.  In an ASIC
implementation these memories would likely become real memory macros to allow for a faster and more compact implementation.&lt;/p&gt;
&lt;h2&gt;Interfaces&lt;/h2&gt;
&lt;p&gt;External interfaces such as GPIO, Ethernet GMII, UART, etc., will always
require some modification for the Cryptech design to be implemented in a
given technology, whether it is a specific FPGA type or an ASIC.  The
important thing is that the Cryptech design does not use technology
specific macros to implement the interfaces.  But pin assignments,
timing, and electrical requirements will always require adjustment and
work.&lt;/p&gt;
&lt;h2&gt;Clocking and reset&lt;/h2&gt;
&lt;p&gt;The design style used in the Cryptech Verilog code currently follows the
guidelines from the FPGA vendors Altera and Xilinx.  This means that we
use synchronous reset.  For an ASIC implementation this will also work,
even though asynchronous reset is far more common in ASIC designs.  Changing
to asynchronous reset is not a very big undertaking however, as the
register reset and update clocking are separated into easily
identifiable processes (&lt;em&gt;reg_update&lt;/em&gt;) in the modules.&lt;/p&gt;
&lt;p&gt;Most if not all registers in the Cryptech Verilog code have a defined
reset state.  Most registers also have a write enable signal that
controls the update.  This corresponds well with the registers available
in FPGA technologies from Altera and Xilinx and with the design strategy recommended by those vendors. This is also in line with common
and good design styles for ASICs, which allows for compact code and low
power implementations. The design currently does not use any clock gating. In future revisions this might be added if power consumption needs to be reduced and it does not introduce side channel issues.&lt;/p&gt;
&lt;h2&gt;External memories&lt;/h2&gt;
&lt;p&gt;The Cryptech hardware design will use external persistent memories for
protected key storage as well as external SRAM for protected master key
storage.  In an ASIC implementation the master key memory would probably
be integrated to further enhance security.&lt;/p&gt;
&lt;p&gt;Just like other external interfaces (see above), the interfaces for the
external memories do not use any explicitly instantiated hard macros in
the FPGAs.&lt;/p&gt;
&lt;h2&gt;Entropy sources&lt;/h2&gt;
&lt;p&gt;The current Cryptech design contains two separate physical entropy
sources.&lt;/p&gt;
&lt;p&gt;1: An avalanche noise based entropy source placed outside the FPGA.  The
entropy source signal is sampled by the FPGA using an edge detection
mechanism.&lt;/p&gt;
&lt;p&gt;An ASIC implementation would be able to use the external entropy source just like the FPGA. Furthermore, depending on the process options, it might be
possible to have an internal avalanche diode based on ESD structures commonly used in I/O pin implementations. In a process with power management capabilities, circuitry available for step-up converters might also be usable as an internal avalanche noise source.&lt;/p&gt;
&lt;p&gt;Note that integrating the avalanche noise source does not mean that an off-chip noise source is excluded. The Cryptech RNG is modular and having both an internal and an external avalanche noise source is quite possible.&lt;/p&gt;
&lt;p&gt;2: A ring oscillator based entropy source placed inside the FPGA. The ring oscillator used in the FPGA is based on carry chain feedback through adders. An ASIC implementation of this ring oscillator should work and produce noise with similar characteristics. However, the specific circuit will have to be characterized with explicit layout and qualified for the given process.&lt;/p&gt;
&lt;h2&gt;Toolchain&lt;/h2&gt;
&lt;p&gt;Cryptech currently uses Verilog simulators for functional verification and commercial FPGA tools for implementation, including timing analysis.&lt;/p&gt;
&lt;p&gt;An ASIC implementation will require several new tools, including tools for synthesis, place &amp;amp; route, and static timing analysis that are acceptable as sign-off tools to the chip process vendor.&lt;/p&gt;
&lt;h2&gt;Conclusions&lt;/h2&gt;
&lt;p&gt;The HW designed for the first iteration of Cryptech is not specifically
designed for FPGA implementation, but is in fact designed in a generic
way to allow for easy implementation using different technologies such
as ASICs.&lt;/p&gt;
&lt;p&gt;There are however parts of the design that will have to be updated or
modified in order to create a good ASIC implementation.  The Cryptech
project is confident that we know what those parts are and what they
would entail.&lt;/p&gt;
&lt;p&gt;Developing an ASIC will however require new tools which will incur costs.&lt;/p&gt;</content><category term="FutureWork"></category></entry><entry><title>Issues of an Assured Tool-Chain</title><link href="https://wiki.cryptech.is/AssuredTooChain" rel="alternate"></link><published>2016-12-15T22:44:00+00:00</published><updated>2016-12-15T22:44:00+00:00</updated><author><name>Cryptech Core Team</name></author><id>tag:wiki.cryptech.is,2016-12-15:/AssuredTooChain</id><summary type="html">&lt;p&gt;We do not have any assurance that our basic tools are not compromised.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Compilers&lt;/li&gt;
&lt;li&gt;Operating Systems&lt;/li&gt;
&lt;li&gt;Hardware Platforms&lt;/li&gt;
&lt;li&gt;Verilog and Other Tools to Produce Chips&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At the base is the compiler.  The fear was first formally expressed in
Ken Thompson's 1984 Turing Award Lecture
&lt;a href="http://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf"&gt;Reflections on Trusting Trust&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;David A …&lt;/p&gt;</summary><content type="html">&lt;p&gt;We do not have any assurance that our basic tools are not compromised.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Compilers&lt;/li&gt;
&lt;li&gt;Operating Systems&lt;/li&gt;
&lt;li&gt;Hardware Platforms&lt;/li&gt;
&lt;li&gt;Verilog and Other Tools to Produce Chips&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At the base is the compiler.  The fear was first formally expressed in
Ken Thompson's 1984 Turing Award Lecture
&lt;a href="http://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf"&gt;Reflections on Trusting Trust&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;David A. Wheeler's PhD thesis, &lt;a href="http://www.dwheeler.com/trusting-trust/"&gt;Fully Countering Trusting Trust through Diverse Double-Compiling&lt;/a&gt;
outlines how we might deal with the compiler trust conundrum.&lt;/p&gt;</content><category term="FutureWork"></category></entry></feed>