Cloud databases and scalability (was Rodeo: 2 questions)

David Bovill david at vaudevillecourt.tv
Thu May 20 17:01:24 EDT 2010


On 20 May 2010 21:52, Mark Wieder <mwieder at ahsoftware.net> wrote:

>
> Twitter, Digg, Reddit, etc are indeed reaching the scalability limits
> of their databases, and are moving into the NOSQL world of Cassandra,
> Hadoop, CouchDB... but you need to be up at the level of shoveling around
> petabytes of data before you need to start dealing with those issues.
> And dealing with NOSQL databases means you need to know your data
> formats and how you're going to be using them before you start,
> because once you slice and denormalize your data into various storage
> compartments you lose things like the ability to do ad hoc searching.
>

Thanks for the input, Mark. Terabytes of data are not likely, and would also
be expensive. The biggest real-world problem is lots of concurrent users.
For a live event, for instance something promoted on TV, you might get a
lot of users in one go, then nothing. Amazon SimpleDB seems an interesting
way to cope with one-off, high-capacity surges like this at very low cost
(as long as the data fields are small).

Can anyone advise on how you would do a stress test to replicate, say,
100,000 concurrent users? Are there toolkits, or perhaps a company, set up
to do that sort of thing?
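
To make the question concrete, here is a rough sketch in Python of the kind
of burst test I have in mind. It is only an illustration, not a
recommendation of a particular tool: the names run_burst and fake_request
are made up, and fake_request just sleeps instead of talking to a real
server. For a real test it would have to be replaced with an actual HTTP
call, and I assume 100,000 truly concurrent users would need the load
spread over several machines rather than one script.

```python
import asyncio
import time


async def fake_request(user_id: int) -> bool:
    """Hypothetical stand-in for one user's request.

    Replace with a real HTTP call against the system under test.
    """
    await asyncio.sleep(0.01)  # pretend network plus server latency
    return True


async def run_burst(n_users: int, max_in_flight: int, request=fake_request) -> dict:
    """Fire n_users requests, with at most max_in_flight running at once."""
    gate = asyncio.Semaphore(max_in_flight)
    latencies = []
    failures = 0

    async def one_user(user_id: int) -> None:
        nonlocal failures
        async with gate:
            start = time.perf_counter()
            try:
                ok = await request(user_id)
            except Exception:
                ok = False  # count any error as a failed request
            elapsed = time.perf_counter() - start
        if ok:
            latencies.append(elapsed)
        else:
            failures += 1

    started = time.perf_counter()
    await asyncio.gather(*(one_user(i) for i in range(n_users)))
    total = time.perf_counter() - started

    latencies.sort()
    p95 = latencies[int(0.95 * (len(latencies) - 1))] if latencies else None
    return {"ok": len(latencies), "failed": failures, "seconds": total, "p95": p95}


if __name__ == "__main__":
    # Small numbers so the sketch runs quickly on one machine.
    print(asyncio.run(run_burst(n_users=1000, max_in_flight=200)))
```

The useful numbers are the failure count and the 95th-percentile latency as
n_users and max_in_flight are raised, which is where a surge like a TV
promotion would show up first.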


