Monday, February 05, 2007

before the flood....

We are in the process of updating our domain database: we just started inserting roughly 150.000.000 hostnames, which is pushing our db-system to its limit.

We have paused "regular" spidering and updates for a while to catch up with this bulk data. Currently we run at a rate of about 1.000.000 new domains per day, which is not bad, but still unsatisfying. We are testing a NAS array running on 6 SCSI disks (as an experiment for now; we will invest in more hardware if it proves faster than the current system).
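To put those two figures side by side, here is a rough back-of-the-envelope sketch. It assumes the rate stays constant at about 1.000.000 hostnames per day and that all 150.000.000 still need to be inserted; the function names are made up for illustration and are not part of our system.

```python
# Rough estimate only: assumes a constant insert rate and that the
# whole batch is still pending. Names here are hypothetical.

SECONDS_PER_DAY = 24 * 60 * 60


def days_to_import(total_hosts: int, hosts_per_day: int) -> float:
    """Days needed to insert total_hosts at a steady daily rate."""
    return total_hosts / hosts_per_day


def inserts_per_second(hosts_per_day: int) -> float:
    """Average sustained insert rate implied by a daily total."""
    return hosts_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    total = 150_000_000   # hostnames in the bulk import
    rate = 1_000_000      # new domains per day at the moment

    print(f"days needed: {days_to_import(total, rate):.0f}")
    print(f"avg inserts/sec: {inserts_per_second(rate):.1f}")
```

At the current rate that works out to about 150 days, or roughly five months, which is why the speed is still unsatisfying.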

thomas

4 comments:

Anonymous said...

oh man sounds good - where did you get 150 mio hostnames in one go??

thomas said...

there were several "gos".
we're not the only ones to collect hostnames. and we used and abused a few repositories - we should be pretty complete on .com/.net when all hosts are imported. but i guess this will take several months - even with our new, faster database-equipment.

tom

Anonymous said...

any hint where one can find such sites? used to look for these - but no luck... or maybe I'm looking in the wrong place or with the wrong keywords?

Anonymous said...

Yo me wondering too - spidering would take ages. So where can one find hosts presented in a "self service" form? Kim