Tuesday, February 28, 2006
Switched the hostonip-script to the new database - data should be dynamic again. The DB is under heavy load, so lookups might still not be as fast as they should be.
tom
Sunday, February 26, 2006
Database moved
The Domains-Database moved to a faster, bigger server. Queries should be faster now, and we hope to broaden our range of known hosts and domains quite fast.
Queries on Serversniff will continue to run against the old (now static) database-host until we've cleaned things up in the new one.
tom
Saturday, February 25, 2006
Webserver-Detection
Fixed various minor glitches in Serversniff's Webserver-Detection. We care about errors now and might link to more information about the various webservers and their modules. Check it out and mail us (or comment here) if we missed something. You are welcome to add text to the wiki!
Check it out:
http://www.serversniff.net/get_httpserver.php
tom
Friday, February 24, 2006
Hasshhhh....
Sshhhhh - don't tell anybody: we had nasty (and quite common) bugs in our hash-creator: while it worked well with "usual" strings, some hashes didn't work, especially for strings containing ', " or \. Since nobody complained, the bugs might have gone completely unnoticed.
They are fixed now; only the NTLM-/LM-hashes weren't completely fixable: strings containing both ' and " will give an empty hash and an error message for now. I'll try to fix this issue during the next week as well.
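For the curious, a minimal sketch of the class of bug and fix involved - this is not our actual hash-creator, and the external NTLM helper below is made up for the example:

<?php
// Quote bugs like ours typically come from splicing raw user input
// into a shell command line.
$input = isset($_GET['string']) ? $_GET['string'] : '';

// PHP's built-in hashes take any bytes directly - no shell, no quoting:
echo 'MD5:  ' . md5($input) . "\n";
echo 'SHA1: ' . sha1($input) . "\n";

// NTLM/LM need an external helper here (the tool name is hypothetical);
// escapeshellarg() keeps ', " and \ from breaking the command line:
$cmd = '/usr/local/bin/ntlmhash ' . escapeshellarg($input);
echo shell_exec($cmd);
?>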
Mapping the Net crossed 3 Million Domains, but is running out of Disk-Space. The new server is here and will replace the old one by the end of March.
tom
Friday, February 17, 2006
Roger Schwarz, again
Soultcer pointed in a comment to http://lists.suse.com/archive/suse-linux/2001-Oct/4473.html - that's what you find on Google, yup.
What I still ask myself is who he really was. Is compiling his memory into the serverstrings of many, many webservers some kind of running gag? Or is Roger's memory still present at T-Online? I doubt that there are many people, if anybody at all, working at T-Online who used to know Roger. The IT business and the new economy used to be: work fast, change often, don't really think about your past colleagues.
tom
Sunday, February 12, 2006
In memoriam Roger Schwarz
Did Bugfixing, Adding and Removing....
Twiddled the Domain-Reaper-Script to prefer the new domains from the queue. We should break 3 million unique Domains soon.
- Hardcore bugfixing: I disabled the DNS-Script until I've fixed it up. This may take a bit of time.
- Fixed the HTTP-Header-Script to support port numbers and to check hostnames with multiple IPs (try www.google.com) - see the sketch after this list.
- Added an HTTP-Server-Detection (for those who think the Header-Script is too complex).
- Fixed a few bugs in texts and links.
- Started customizing the English wiki - it's time to start doing this.
- Still bugs left to fix: HTTP-Header and Servertype don't support https.
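A rough sketch of the multi-IP lookup mentioned above - plain PHP, not the actual script's code:

<?php
// Resolve all IPs for a hostname, as the fixed HTTP-Header-Script
// now does for hosts like www.google.com.
$host = 'www.google.com';
$ips = gethostbynamel($host);   // array of IPv4 addresses, or false
if ($ips === false) {
    echo "Lookup failed for $host\n";
} else {
    foreach ($ips as $ip) {
        echo "$host resolves to $ip\n";
        // the real script would fetch the HTTP header from each IP here
    }
}
?>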
Tuesday, February 07, 2006
the good, the bad and the ugly...
or the f*cking second-level-domains.
And no - don't tell me anything about ISO and standards - it seems that some NICs set up standards for each and everything, while others just comply with "unwritten" standards.
Look at .pl with its "standard" SLDs:
agro.pl, aid.pl, atm.pl, auto.pl, biz.pl, com.pl, edu.pl, gmina.pl, gsm.pl, info.pl, mail.pl, media.pl, miasta.pl, mil.pl, net.pl, nom.pl, org.pl, pc.pl, priv.pl, realestate.pl, rel.pl, shop.pl, sklep.pl, sos.pl, targi.pl, tm.pl, tourism.pl, travel.pl, turystyka.pl
or .br with
adm.br, adv.br, am.br, arq.br, art.br, bio.br, cng.br, cnt.br, com.br, ecn.br, eng.br, esp.br, etc.br, eti.br, fm.br, fot.br, fst.br, g12.br, gov.br, ind.br, inf.br, jor.br, lel.br, med.br, mil.br, net.br, nom.br, ntr.br, odo.br, org.br, ppg.br, pro.br, psc.br, psi.br, rec.br, slg.br, tmp.br, tur.br, tv.br, vet.br, zlg.br
How many domains do you expect to find under second-level domains as fancy as "turystyka.pl" or "vet.br"? Try a zone transfer and see if you even get 100 domain names.
Others "just do it" and set up pseudo-SLDs like com.al.
Others do the same for every part of the country: stuff like all the Italian or American "ro.it", "bz.it", "bs.it", "ut.us", "ws.us" and so on...
Others just register a fancy domain and sell subdomains like gb.net, gb.com, us.com, ru.com, eu.com, de.vu and so on.
Hey, NS-Admins, Hey ICANN:
this is UGLY!
- at least for me, trying to get things sorted out in a way that makes queries simple and understandable for somebody who doesn't (want to) know about SLDs.
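For illustration: this is roughly how one can sort hostnames under such a zoo of suffixes - a minimal sketch in PHP, assuming a hand-maintained list of known SLDs (the function name and the list are made up for this example):

<?php
// Decide which part of a hostname is the "registrable" domain,
// given a list of known second-level domains.
function registrable_domain($host, $known_slds) {
    $labels = explode('.', strtolower($host));
    $n = count($labels);
    if ($n < 2) return $host;
    $last2 = implode('.', array_slice($labels, $n - 2));
    // If the last two labels form a known SLD (e.g. com.br),
    // the registrable domain is the last THREE labels.
    if ($n >= 3 && in_array($last2, $known_slds)) {
        return implode('.', array_slice($labels, $n - 3));
    }
    return $last2;   // plain TLD: the last two labels
}

$known_slds = array('com.pl', 'net.pl', 'com.br', 'vet.br', 'co.uk');
echo registrable_domain('www.example.com.br', $known_slds), "\n"; // example.com.br
echo registrable_domain('www.example.de', $known_slds), "\n";     // example.de
?>

The ugly part is exactly that the SLD list has to be maintained by hand, NIC by NIC.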
tom
Sunday, February 05, 2006
Cleaning the mess
Our queue is down to around 200.000 hostnames, and it seems we can slowly start filling it up again in a few days. There are still huge zone transfers every now and then. The net's huge.
Implemented part of the data in the Subdomains-Script, for this is one of the most-used scripts on Serversniff (I still can't imagine why).
tom
Friday, February 03, 2006
Twiddling the Database
I fixed the broken ICMP-Trace yesterday - when you call the script, it now shows a "we know XX domainnames for this host" note and links to our "hostonip"-Script.
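The note itself boils down to a simple count against our hostnames table. A sketch, with assumed table and column names (hosts, hostname, ip):

<?php
// Hypothetical lookup behind "we know XX domainnames for this host".
$db = pg_connect('dbname=mappingthenet');
$ip = pg_escape_string('192.0.2.1');   // the traced hop's IP
$res = pg_query($db, "SELECT count(*) FROM hosts WHERE ip = '$ip'");
$count = pg_fetch_result($res, 0, 0);
if ($count > 0) {
    echo "We know $count domainnames for this host.\n";
}
?>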
We finally seem to have passed the phase of the HUGE zone transfers - we constantly import new domains, but the queue of "hostnames to import" stays roughly the same - it has been hovering around 3.1 million hostnames for a few days now.
In the beginning, we transferred really huge zones, mostly .edu or other universities, also some second-level domains like com.br and a few "zone-spammers": they stuffed literally millions of hostnames into their zone files - judging by the content of the websites, they seem to use this simply for spamming. But then, I don't know of any search engine that does zone transfers.
Besides the spam domains, we still have a few million more exotic hostnames in a separate queue that we will process once we are up to date with the "important" TLDs.
The queue is becoming more and more the bottleneck of MappingTheNet - we have around 10 tasks stuffing hosts into the queue on one side, and 4 tasks checking the hostnames and sorting them into the database on the other. The sort-in tasks take a long time to query the DB for hosts to process, for they have to lock the database while they read and update an entry. I'm praying for a PostgreSQL guru, but I might end up having to solve this problem myself.
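Until that guru shows up, the usual workaround is row-level locking, so a worker claims a single row instead of locking the whole table. A minimal sketch - the table and column names (queue, hostname, taken) are assumptions, not our actual schema:

<?php
// Sketch of a sort-in worker claiming one hostname from the queue.
// SELECT ... FOR UPDATE locks only the selected row, so the other
// workers aren't blocked by a table-wide lock.
$db = pg_connect('dbname=mappingthenet');

pg_query($db, 'BEGIN');
$res = pg_query($db,
    "SELECT hostname FROM queue WHERE NOT taken LIMIT 1 FOR UPDATE");
$row = pg_fetch_assoc($res);
if ($row) {
    $name = pg_escape_string($row['hostname']);
    pg_query($db, "UPDATE queue SET taken = true WHERE hostname = '$name'");
}
pg_query($db, 'COMMIT');
// Caveat: two workers can briefly contend for the same row; the loser
// sees it already taken and should simply retry with the next row.
?>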
We are now aware of
4.913.386 unique Hostnames
5.302.751 unique IPs (that have at least 1 hostname assigned)
2.155.215 unique Domains
3.186.760 hostnames waiting to be sorted in from our queue
tom
Thursday, February 02, 2006
Adventures in diving through the web
We recently started a project called "Map the Net": we created a crawler script that's mangling its way through the global network and tries to collect as many domains, IPs and hostnames as it can.
The results get stored - you guessed it - in a huge database, and will make us very, very famous and rich somewhere in time.
Nobody but us has this information - aside, of course, from the major search engines, the NSA, Dan Kaminsky and maybe the lovely guys at Netcraft.
tom