Are the traditional Fletcher/Adler checksum algorithms optimal?
I am not referring to the common optimizations applied to these
algorithms. For example, the controlling of loop iterations to
avoid sum overflows then truncating outside of the main loop.
I am referring to the design itself. I understand the second sum
(sum2) was introduced to make the algorithm position-dependent but
it is truly sub-optimal? Sum2 is just a modification of sum1 (the
sum of the previous sum1 value added to the current sum1 value).
If we take the 16-bit version as our example, sum1 is a 8-bit
product of the input data, while sum2 is a product of sum1, so
therefore the final 16-bit checksum is in fact an 8-bit product
of the data appended to 16-bits to catch out-of-sequence input.
This means our final checksum, an 8-bit sum of our input data,
is a value of only 256 possible values. As we are passing on
a 16-bit value with our data we could have a checksum value that
is one of 65536 possible values.
It occurred to me that if we had a means of ensuring a position-
dependent check without sacrificing any bits of the checksum we
would have an exponentially better validation.
As I understand it, sum2 was introduced solely to identify out-
of-sequence transmission, and so can be discarded if we find an
alternative to producing a position-dependent checksum. And as
it turns out, we do have an alternative and it comes cheap.
It is cheap because it does not add extra coding to the process
compared to current 'sum2' designs - it is the index position
in the sequence that when hashed with each corresponding byte
ensures a position-dependent check.
One final note - the design below is free of overflow checks,
lossless reduction, and possible loop optimizations, just for
clarity, as this is
about a new out-of-sequence error check technique, not about
implementation details. Fletch64 is not demonstrated below as
it may require a more complicated implementation but the
byte/index hash applies the same.
Revision - because a checksum algorithm can process a large data
count the index position check value could itself cause premature
overflow and require a higher number of reduction operations with
lower inner loop count. The fix was to truncate the index position
check to 8 bits. Now the checksum can process a much greater data
count in the inner loop before overflow.
The only down-side to this change is if a contiguous string of the
data of exactly 256 byte is displaced any multiple of 256 positions
from its original position the error would go undetected.
uint8_t Fletch8( uint8_t *data, int count )
{
uint32_t sum = 0;
int index;
for ( index = 0; index < count; ++index )
sum = sum + ( data[index] xor index & ffh );
return sum & ffh;
}
uint16_t Fletch16( uint8_t *data, int count )
{
uint32_t sum = 0;
int index;
for ( index = 0; index < count; ++index )
sum = sum + ( data[index] xor index & ffh );
return sum & ffffh;
}
uint32_t Fletch32( uint8_t *data, int count )
{
uint64_t sum = 0;
int index;
for ( index = 0; index < count; ++index )
sum = sum + ( data[index] xor index & ffh );
return sum & ffffffffh;
}