Recent CPUs have special instructions that make 64-bit binary field operations efficient. The main requirement is a carryless multiplication instruction (for example, PCLMULQDQ on x86-64 or PMULL on ARMv8). With it, field multiplications no longer need logarithm lookup tables. A benchmark written by @kilic and @saitima shows this is about 4x faster than the 16-bit lookup table approach: a 64-bit operation takes about the same time as a 16-bit operation using lookup tables.
https://github.com/kilic/gf
This suggests we should switch to 64-bit field elements to improve performance.
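
To make the carryless multiplication concrete, here is a minimal software sketch of a multiplication in GF(2^64). It is not the code from the linked benchmark and does not use the hardware instruction; it only emulates what the instruction computes (a polynomial product over GF(2), where addition is XOR), followed by a reduction. The reduction polynomial x^64 + x^4 + x^3 + x + 1 and the names `clmul64`, `reduce128` and `gf_mul` are assumptions chosen for illustration.

```python
MASK64 = (1 << 64) - 1
# Assumption: the field is defined by x^64 + x^4 + x^3 + x + 1.
# POLY_LOW holds the terms below x^64, i.e. x^4 + x^3 + x + 1.
POLY_LOW = 0x1B


def clmul64(a: int, b: int) -> int:
    """Carryless product of two 64-bit values (up to 127 bits wide).

    This is the operation a hardware carryless multiply performs in one
    instruction: shifted copies of `a` are combined with XOR, not addition.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def reduce128(x: int) -> int:
    """Reduce a polynomial of degree < 128 modulo the assumed field polynomial.

    Since x^64 = x^4 + x^3 + x + 1 in the field, the high half can be folded
    into the low half by multiplying it with POLY_LOW. Two folds are enough:
    after the first, at most 3 bits remain above bit 63.
    """
    for _ in range(2):
        hi = x >> 64
        x = (x & MASK64) ^ clmul64(hi, POLY_LOW)
    return x


def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^64) without any lookup tables."""
    return reduce128(clmul64(a, b))


if __name__ == "__main__":
    print(hex(gf_mul(2, 3)))        # x * (x + 1) = x^2 + x -> 0x6
    print(hex(gf_mul(1 << 63, 2)))  # x^63 * x = x^64 -> 0x1b
```

With hardware support, `clmul64` becomes a single instruction and the two folds in `reduce128` become two more carryless multiplications by a constant, which is why no log/antilog tables are needed.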