TLD Whitelists to limit domain searches #11
Comments
To be clear, this is basically just a set of regexps to match the end of a domain. So for example, we know that any site that ends with |
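A minimal sketch of the "regexps that match the end of a domain" idea, for illustration only. The pattern list, the `.mil` entry, and the function name `is_whitelisted` are assumptions, not the project's actual code or whitelist format.

```python
import re

# Hypothetical whitelist: each entry is a regex anchored to the END of the host name.
# ".gov" comes from the discussion; ".mil" is only an illustrative extra entry.
TLD_WHITELIST = [r"\.gov$", r"\.mil$"]
_COMPILED = [re.compile(pattern, re.IGNORECASE) for pattern in TLD_WHITELIST]


def is_whitelisted(host: str) -> bool:
    """Return True if the host name ends with any whitelisted suffix."""
    # Drop surrounding whitespace and a trailing root dot ("usa.gov." -> "usa.gov").
    host = host.strip().rstrip(".")
    return any(pattern.search(host) for pattern in _COMPILED)
```

Because each pattern starts with an escaped dot and ends with `$`, a host such as `gov.example.com` or `fakegov` does not match, while `www.virginia.gov` does.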
Oh, let's get some checkboxes :)
|
I think it should be enough to simply leave a contact email somewhere on the bottom of our website that users will use to contact us in case they feel like a certain domain should get whitelisted. |
We can use gman as the whitelist. Here is the actual list. |
Oh man. If only I had known. I should have put my faith in Balter. |
I can't figure out where the whitelist is, though. A search for “gov” doesn't yield anything. |
Hi, Waldo,
|
Ah yep, @waldoj just set you up with a user/pass & fired the details your way. |
Oh, an admin section! Great. :) |
btw, the format for the TLDs is just |
Do you have any sense of at what point this will become bogged down? That is, will adding thousands of domain names be problematic? (Of course, I'll bulk load them directly into MySQL.) |
Waldo, this does not have to be a single domain.
|
Sure, I follow, but only 1% of the domains that I'm concerned with are .gov domains. |
Do you think we should remove the whitelist at all? The danger of this
|
Nope—I think it's a fine idea. I just want to expand it to include the domain names for all governments within the United States. |
Waldo, if there is such a list somewhere, it will be extremely easy to
|
This is the list, right here: https://github.com/benbalter/gman/blob/master/config/domains.txt |
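One possible way to turn a plain-text domain list like this into whitelist rows for a bulk load is sketched below. Several things here are assumptions: that the file has one domain per line with `//` comment lines, that the whitelist stores one regex per row, and the table and column names in `INSERT_SQL`, which are hypothetical.

```python
import re


def patterns_from_domain_list(text: str) -> list[tuple[str]]:
    """Convert a one-domain-per-line list into single-column rows of regex patterns."""
    rows = []
    seen = set()
    for line in text.splitlines():
        line = line.strip().lower()
        # Skip blank lines and "//" comment lines (assumed file format).
        if not line or line.startswith("//"):
            continue
        # Match the domain itself or any subdomain of it, anchored at the end.
        pattern = r"(^|\.)" + re.escape(line) + "$"
        if pattern not in seen:
            seen.add(pattern)
            rows.append((pattern,))
    return rows


# Hypothetical table and column; "%s" is the placeholder style used by common
# MySQL drivers for Python, e.g. cursor.executemany(INSERT_SQL, rows).
INSERT_SQL = "INSERT INTO tld_whitelist (pattern) VALUES (%s)"
```

Building the rows in memory first keeps the database step to a single `executemany` call, which fits a one-off load for a single installation.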
Do you want me to hardcode them or make a script that will add them to
|
Oh, it's OK—you don't need to do anything with it. I'm happy to take care of this. :) I'm just wondering if this number of domains is going to be problematic for the software. |
I don't think it is going to give you any kind of problems. 10000
|
Nah, that's fine—since it's MySQL, I'm happy to do it. I only need this for my own installation of this software—for the base package, there's no need to include this, so no standardized, replicable process is required. |
Got this. In case you need any sort of help, feel free to message me and
|
We're going to need two whitelists, one for file types (which we already have) and one for government domains.
Regrettably, the domain issue is going to be an inconsistent solution at best. There are plenty of government sites at .com domains, but we don't want people to be able to search google.com.
The proposed solution for this is to let any search that validates against the TLD whitelist go through automatically, and to throw up a CAPTCHA and a form asking for more information about the site when it doesn't match the whitelist.
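The proposed flow could look roughly like the sketch below. The `Decision` enum, the `decide` and `host_of` helpers, and the single `.gov` pattern are all hypothetical names and values chosen for illustration; the real whitelist and the captcha/form step would live elsewhere in the application.

```python
import re
from enum import Enum
from urllib.parse import urlsplit


class Decision(Enum):
    ALLOW = "allow"          # matches the whitelist: search goes through
    CHALLENGE = "challenge"  # no match: show the captcha and the information form


# Hypothetical whitelist with a single end-anchored pattern.
WHITELIST = [re.compile(r"\.gov$", re.IGNORECASE)]

_SCHEME = re.compile(r"^[a-z][a-z0-9+.-]*://", re.IGNORECASE)


def host_of(target: str) -> str:
    """Extract the host from a URL or a bare domain; return "" if there is none."""
    target = target.strip()
    if not _SCHEME.match(target):
        # Bare domains need a "//" prefix so urlsplit treats them as a network location.
        target = "//" + target.lstrip("/")
    return (urlsplit(target).hostname or "").rstrip(".")


def decide(target: str) -> Decision:
    """Allow whitelisted hosts automatically; challenge everything else."""
    host = host_of(target)
    if host and any(pattern.search(host) for pattern in WHITELIST):
        return Decision.ALLOW
    return Decision.CHALLENGE
```

Only the host is tested, so a URL such as `https://google.com/search?q=.gov` is still challenged even though `.gov` appears in its query string.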