According to Wikipedia, Twitter currently has around 200 million users who together post 65 million tweets a day, with a limit of 140 characters per tweet. So I decided to do some math to find out how much storage space that number of daily tweets requires, and the first step was to learn how the 140 characters are counted.
Counting characters

Basically, Twitter uses the NFC form of the text to count the length of a tweet, which favors the combined character form. A special character such as an accented letter is then represented by one code point, which is encoded as two bytes in UTF-8. You can find a lot more detailed information in their tutorial on counting characters and the references in it.
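To make the counting rule concrete, here is a minimal Python sketch that normalizes a string to NFC and then counts code points and UTF-8 bytes. The helper names `tweet_length` and `utf8_size` are made up for this post, and the sketch only covers the normalization step, not any other rules Twitter may apply when counting.

```python
import unicodedata


def tweet_length(text: str) -> int:
    """Number of code points after NFC normalization (hypothetical helper)."""
    return len(unicodedata.normalize("NFC", text))


def utf8_size(text: str) -> int:
    """Number of bytes the NFC-normalized text takes in UTF-8 (hypothetical helper)."""
    return len(unicodedata.normalize("NFC", text).encode("utf-8"))


# "café" written in decomposed form: 'e' followed by U+0301 COMBINING ACUTE ACCENT
decomposed = "cafe\u0301"

print(len(decomposed))           # 5 code points before normalization
print(tweet_length(decomposed))  # 4 code points once 'e' + accent are combined
print(utf8_size(decomposed))     # 5 bytes: three ASCII letters plus a two-byte 'é'
```

The combined form counts as four characters instead of five, which is why NFC works in the user's favor.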
The math

Taking the worst case under that assumption, in which all 140 characters are used and every one is a two-byte special character, a single tweet would take 280 bytes of storage space. If 200 million users generate 65 million tweets a day, they require about 18 GB (almost 17 GiB) of disk space a day. If each user's timeline is limited to 1000 tweets, the 200 million users require at most 200 million × 1000 × 280 bytes, which is around 56 TB of data.
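The arithmetic is easy to check with a few lines of Python. This is only a back-of-the-envelope sketch: the constants are the figures quoted in this post (including the two-bytes-per-character worst case), and the variable names are my own.

```python
# Figures quoted in this post; all of them are assumptions, not measurements.
TWEETS_PER_DAY = 65_000_000
USERS = 200_000_000
MAX_CHARS = 140       # characters per tweet
BYTES_PER_CHAR = 2    # worst case assumed here: every character takes two bytes
TIMELINE_CAP = 1000   # tweets kept per user

bytes_per_tweet = MAX_CHARS * BYTES_PER_CHAR
daily_bytes = TWEETS_PER_DAY * bytes_per_tweet
total_bytes = USERS * TIMELINE_CAP * bytes_per_tweet

print(f"per tweet:     {bytes_per_tweet} bytes")
print(f"per day:       {daily_bytes / 1e9:.1f} GB ({daily_bytes / 2**30:.1f} GiB)")
print(f"all timelines: {total_bytes / 1e12:.1f} TB ({total_bytes / 2**40:.1f} TiB)")
```

With these inputs the script reports 280 bytes per tweet, about 18.2 GB (just under 17 GiB) per day, and 56 TB (about 51 TiB) for the capped timelines. Changing `BYTES_PER_CHAR` or `TIMELINE_CAP` shows how sensitive the totals are to those two assumptions.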
Conclusion

By this estimate, Twitter has to manage around 56 TB of tweet data, of which about 18 GB (almost 17 GiB) is written every day. Considering that 75% of the requests to twitter.com are API calls, that means a lot of work querying all that data. OK, leave my drive out of it!
Do you feel that this isn't much and they should raise the limits? Follow me on Twitter and make me post more! :)