Hi @todo, these are some interesting questions about the behavior of
SERIAL with an extremely large number of inserts per second. I’ll try to answer them to the best of my knowledge.
The limit comes from the number of bits used to represent the timestamp part of the return value of
unique_rowid(): 48 bits, at a resolution of 10 microseconds. The implementation uses
GenerateUniqueInt (implementation here: https://github.com/cockroachdb/cockroach/blob/280c15872575a775b4f5f7a63fe8fd3e29c3dfed/pkg/sql/sem/builtins/builtins.go#L3456) to produce unique integers. That method takes a lock to ensure the results are truly unique (though there is a TODO comment in that code that raises an interesting question):
// TODO(pmattis): Do we have to worry about persisting the milliseconds value
// periodically to avoid the clock ever going backwards (e.g. due to NTP)
Since the lock that
GenerateUniqueInt takes is global (within a single node's process), there doesn't seem to be a way to cause a collision just by increasing load. Per that TODO comment, though, I'm less sure about clock jumps.
Since the other part of the default value is the node ID, I think the limit is per node: rows created on different nodes can never collide (see https://www.cockroachlabs.com/docs/stable/create-table.html#create-a-table-with-auto-generated-unique-row-ids).
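One consequence of the 10-microsecond resolution is a per-node ceiling on how fast unique timestamps can be minted while staying in step with the wall clock. The arithmetic is simple:

```go
package main

import "fmt"

// At a 10µs timestamp resolution, each node can hand out at most one
// unique timestamp per tick without running ahead of real time:
// 100,000 rowids per second per node.
func main() {
	const tickMicros = 10
	const microsPerSecond = 1_000_000
	fmt.Println(microsPerSecond / tickMicros) // 100000
}
```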
That's an interesting question.
GenerateUniqueInt guarantees its results are monotonically increasing. If inserts outpace the 10-microsecond resolution, the timestamp clock effectively runs faster than real time, so the timestamps drift further and further ahead of the wall clock. The consequence is that you won't get the full 89 years of unique timestamps that method guarantees, but you shouldn't see a uniqueness violation as far as I can tell.
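To make the monotonicity point concrete, here's a toy version of that pattern (entirely my own sketch, not the real GenerateUniqueInt): a mutex-guarded generator that never hands out a value at or below the previous one, even when the clock it reads stalls or steps backward:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// A simplified sketch, not CockroachDB's implementation: if "now" is
// not strictly past the last timestamp handed out, advance the last
// timestamp by one tick instead of reusing or going below it.
var (
	mu            sync.Mutex
	lastTimestamp int64 // in 10µs ticks
)

func nextTimestamp(now func() int64) int64 {
	mu.Lock()
	defer mu.Unlock()
	ts := now()
	if ts <= lastTimestamp {
		// Clock stalled or jumped backward: keep counting forward anyway.
		ts = lastTimestamp + 1
	}
	lastTimestamp = ts
	return ts
}

func main() {
	realNow := func() int64 { return time.Now().UnixMicro() / 10 }
	a := nextTimestamp(realNow)

	// Simulate a clock that jumped backward by an hour (3.6e8 ticks).
	jumpedBack := func() int64 { return realNow() - 360_000_000 }
	b := nextTimestamp(jumpedBack)

	fmt.Println(b > a) // still strictly increasing
}
```

The price of the clamp is exactly the drift described above: during a backward jump (or a sustained burst faster than one ID per tick), the generator's notion of "now" runs ahead of real time.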
We'd definitely recommend using
UUID instead if that's possible for your application, since it's far more resistant to collisions. Performance will be better, too, since new row IDs scatter across the table's ranges rather than concentrating in one hot range.