I’ve been searching for information in the docs (the BYTES data type section), but I’m a bit confused.
- I’m planning to use CockroachDB to store files I receive from an FTP server. The biggest files are capped at 10 MB. Most files are XML, averaging about 15 KB. The biggest files are plain text (column-delimited, with lots of spaces inside). A few files are zip/rar/7zip/gz/tar archives.
- Most files are plain text and compress very well with any algorithm (better than 10:1 in some cases).
- After some processing, we record a log entry linked to each file.
- After 30–90 days, I delete the files and their logs.
- We process almost 25,000 files per day, 24×7.
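For context, here is a back-of-envelope estimate of the storage involved, using the numbers above (the replication factor of 3 matches the three datacenters mentioned below, but is my assumption):

```python
# Rough storage estimate from the numbers in this post.
files_per_day = 25_000
avg_file_bytes = 15 * 1024            # ~15 KB average XML file

daily_bytes = files_per_day * avg_file_bytes
retention_days = 90                   # worst case of the 30-90 day window
logical_bytes = daily_bytes * retention_days
replicated_bytes = logical_bytes * 3  # assumed replication factor of 3

print(f"daily ingest:       {daily_bytes / 1e6:.0f} MB")    # 384 MB
print(f"90-day logical:     {logical_bytes / 1e9:.1f} GB")  # 34.6 GB
print(f"90-day, 3 replicas: {replicated_bytes / 1e9:.1f} GB")  # 103.7 GB
```

So even before compression, the steady-state data set is in the tens of gigabytes, not terabytes.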
My questions are:
- Is 5 MB too much for geo-replicated data (3 datacenters, at most 10 ms network latency between them, 20 Mbps full-duplex links)? Should I expect problems?
- Does CockroachDB compress files by itself (easier for me), or should I compress them before sending (a bit harder, since I’d have to touch several systems)?
- Does CockroachDB support text indexes over bytea columns, for searching for certain blocks of text?
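In case it helps clarify the second question: if I compress client-side, it would just be a gzip step in front of the INSERT. A minimal sketch of what I have in mind (the table schema, connection string, and psycopg2 driver are my assumptions for illustration, nothing CockroachDB-specific):

```python
import gzip

def compress_for_storage(raw: bytes) -> bytes:
    """Gzip the payload before storing; column-delimited text with
    lots of repeated spaces compresses very well."""
    return gzip.compress(raw)

def restore(stored: bytes) -> bytes:
    """Inverse step when reading the file back out."""
    return gzip.decompress(stored)

# Simulate one of the column-delimited text files described above.
sample = b"000123    WIDGET-A      0001    ACTIVE  \n" * 4000  # ~160 KB

compressed = compress_for_storage(sample)
assert restore(compressed) == sample   # round trip is lossless
print(len(sample), len(compressed))    # expect well over 10:1 here

# Hypothetical insert (requires a running cluster; names are made up):
# import psycopg2
# conn = psycopg2.connect("postgresql://root@localhost:26257/files_db")
# with conn.cursor() as cur:
#     cur.execute("INSERT INTO files (name, data) VALUES (%s, %s)",
#                 ("example.txt", compressed))
# conn.commit()
```

The question is whether this step buys anything, or whether the storage layer already compresses BYTES values well enough on its own.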
Thanks in advance!