
Re: Scanner



comments in-line:


Matt wrote:
So let me get this straight...

So what you want to do is go through someone's SOA (Start of
Authority) and search for just keywords that you choose in order to
find all sites containing those keywords?
-----------------
I just want to search for domain names, similar to what Netcraft is doing, but on locally downloaded zone files, parsing them for names that match keywords such as *sex.*, *hate*.*, *porn*.*, etc.
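
A rough sketch of that kind of keyword scan over a local zone file, for illustration only. It assumes a simplified zone file layout in which the owner name is the first field of each record line, and the pattern list and function names are made up for the example; real TLD zone files vary in format and are far larger.

```python
import fnmatch

# Hypothetical keyword patterns, modelled on the examples in the message above.
PATTERNS = ["*sex*", "*hate*", "*porn*"]

def owner_names(lines):
    """Yield the owner name (first field) of each record line in a zone file."""
    for line in lines:
        # Skip blanks, comments, directives such as $ORIGIN/$TTL, and
        # continuation lines that start with whitespace (no owner field).
        if not line.strip() or line[0] in ";$ \t":
            continue
        yield line.split()[0].rstrip(".").lower()

def matching_domains(lines, patterns=PATTERNS):
    """Return the sorted, de-duplicated owner names matching any pattern."""
    hits = set()
    for name in owner_names(lines):
        if any(fnmatch.fnmatchcase(name, p) for p in patterns):
            hits.add(name)
    return sorted(hits)
```

Usage would be something like `matching_domains(open("com.zone"))`, where the file name is a placeholder. Plain substring patterns like these also match innocent names (for example anything containing "essex"), so the output would need review before it goes into a filter database.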




I don't think that's gonna happen. There's no way you're getting the entire SOA for any registrar so that you can do that. You would be 100,000,000 times better off setting up your own proxy firewall with content filtering on it and using the same keywords to prevent people from accessing those sites. If you wanted to, over time, you could log the attempted traffic matching those keywords, along with the sites people are trying to reach, in order to build yourself a listing of prohibited sites and then drop the keyword filtering. But your strongest option is to stay with a proxy with content filtering.
----------
trying to build this list for a content filtering product :-)
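
As an illustration of the log-based approach Matt describes, here is a minimal sketch that tallies the hosts behind keyword-matching requests in a proxy log. The log layout (whitespace-separated fields, one of which is the full URL), the keyword list, and the function name are all assumptions for the example, not any particular proxy's format.

```python
from collections import Counter
from urllib.parse import urlsplit

KEYWORDS = ("sex", "hate", "porn")  # hypothetical keyword list

def blocked_hosts(log_lines, keywords=KEYWORDS):
    """Count requests per host whose URL contains any of the keywords."""
    counts = Counter()
    for line in log_lines:
        for field in line.split():
            if field.startswith(("http://", "https://")):
                host = urlsplit(field).hostname
                if host and any(k in field.lower() for k in keywords):
                    counts[host] += 1
                break  # only the first URL on each line is considered
    return counts
```

Hosts that show up repeatedly in the result are candidates for a static block list, after which the keyword filter itself could be relaxed.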


There's a reason why there are companies out there that make big money doing this kind of filtering: it's not that simple to do. Cosmin's idea is kinda close to a reasonable way to go out and get addresses, but it could take a long time of searching to pull down every possibility (e.g. Google search Results 1 - 10 of about 76,800,000 for inurl:porn. (0.12 seconds)). Good luck reading all 76 million results.
-----------------
I wish I could *grin*, but I can only get at up to 1000 sites even though it says 990,000 sites. Just wondering whether having an engine sitting locally (such as Google) will help overcome this limit, besides the other features it offers.



regards, /vicky

Just my .02


--


On Mon, 28 Mar 2005 12:36:50 -0800, Vicky Rode <vicky.rode@gmail.com> wrote:

We've already looked at Netcraft and it has been partially helpful.

What I'm looking at doing (besides data that I receive via peering) is
searching via keywords through synced DNS zone files and parsing the
output into a filter database, something similar to an update file if you will.

This is being done as a home-grown solution.

regards,
//vicky//

J. Oquendo wrote:

Actually Vicky, you're quite wrong. I'm sure this will be what you
specified, more or less: Netcraft's Search DNS
http://searchdns.netcraft.com/?host

However, I think it only finds sites that have either been checked on
Netcraft, or perhaps sites that have been queried or something. Not sure
of the parameters behind how they obtain the information.

On Fri, 25 Mar 2005, Vicky Rode wrote:



Absolutely NOT. It is in fact to search for offending sites (porn,
call-home, etc.) to be blocked at our filtering appliance.



regards,
/vicky

Alexander Chamandy wrote:


On Wed, 02 Mar 2005 17:42:24 -0800, Vicky Rode <vicky.rode@gmail.com> wrote:



Hi there,

Just wondering if there is any way I could use a scanner (I have a home
grown script for this) that would go through the DNS registries from some
public source and scan for keywords in the domain name.

Will appreciate if someone can point me in the right direction.

regards,
/vicky


You mean to scan whois records for particular domains for keywords in
the registration information or scan the registry for domain names
with certain keywords?  This wouldn't be used for gathering
information such as e-mail addresses to spam, would it?



=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
J. Oquendo
GPG Key ID 0x0D99C05C
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x0D99C05C

sil @ infiltrated . net http://www.infiltrated.net

"How a man plays the game shows something of his
character - how he loses shows all" - Mr. Luckey