Storing IPs as numerics might be problematic for v6 IP addresses considering the 32-bit limit on integers.
I think we could use a new custom Composr type for SESSION (VARCHAR(15)) or even just call it TOKEN to also include things like CSRF. It might come in handy especially for the privacy system to auto-anonymise those in the serialise method.
Anywhere else where we could, we could perhaps add a custom DB-meta type as well... perhaps ID_CODE for VARCHAR(100) (if there would never be a case a code name would be > 100 characters... ideally this should never happen as code names are often used in URLs).
I think we should generally scrub DB-Meta and adjust current types to save on space, just making sure for instance we're not using longtext on something that should be short_text. I noticed for example in v11 there were several username fields that were short_text when they should have been ID_TEXT. So v11 could get a meta look-over at least, and v12 implementation of new types.
Upon further thought, I believe this should be tackled for v11. The stats table has been an ongoing issue for space in both v9 and v10. We should address it ASAP.
"I think we should generally scrub DB-Meta and adjust current types to save on space, just making sure for instance we're not using longtext on something that should be short_text. I noticed for example in v11 there were several username fields that were short_text when they should have been ID_TEXT"
SHORT_TEXT is varchar(255)
ID_TEXT is varchar(80)
The maximum_username_length option is configurable. It should be capped at 255, and is not, so I'll fix that. Regardless, ID_TEXT is not long enough.
Using LONG_TEXT when we could get away with SHORT_TEXT is not necessarily good. LONG_TEXT uses fewer bytes due to variable-length encoding. But there's a performance penalty, and indexing penalty, to it.
I don't think in general we're in a bad spot. You can buy a 1TB SSD for under $70 now. The stats table is a particular concern because it just grows and grows, stores a lot, and it needs to be very indexable.
I'm untagging as v11 though because it's too late now, and it's not an order-of-magnitude kind of fix.
A SESSION field type does make sense. Reducing length of pages and operating_system in there would require a bit more thought.
I stand corrected, f_members.m_username is actually an ID_TEXT. However, members from outside Composr could have longer usernames, so the inconsistency may be expected.
I think we could use a new custom Composr type for SESSION (VARCHAR(15)) or even just call it TOKEN to also include things like CSRF. It might come in handy especially for the privacy system to auto-anonymise those in the serialise method.
Anywhere else where we could, we could perhaps add a custom DB-meta type as well... perhaps ID_CODE for VARCHAR(100) (if there would never be a case a code name would be > 100 characters... ideally this should never happen as code names are often used in URLs).
I think we should generally scrub DB-Meta and adjust current types to save on space, just making sure for instance we're not using longtext on something that should be short_text. I noticed for example in v11 there were several username fields that were short_text when they should have been ID_TEXT. So v11 could get a meta look-over at least, and v12 implementation of new types.
SHORT_TEXT is varchar(255)
ID_TEXT is varchar(80)
The maximum_username_length option is configurable. It should be capped at 255, and is not, so I'll fix that. Regardless, ID_TEXT is not long enough.
Using LONG_TEXT when we could get away with SHORT_TEXT is not necessarily good. LONG_TEXT uses fewer bytes due to variable-length encoding. But there's a performance penalty, and indexing penalty, to it.
I don't think in general we're in a bad spot. You can buy a 1TB SSD for under $70 now. The stats table is a particular concern because it just grows and grows, stores a lot, and it needs to be very indexable.
I'm untagging as v11 though because it's too late now, and it's not an order-of-magnitude kind of fix.
A SESSION field type does make sense. Reducing length of pages and operating_system in there would require a bit more thought.
Good point about ipv6.