This repository has been archived by the owner on May 5, 2020. It is now read-only.

Provide more than 300 results #26

Open
rebeccawilliams opened this issue Feb 19, 2015 · 9 comments

Comments

@rebeccawilliams

It looks like the maximum return of files for this tool is 300. It would be wonderful to have the return be higher than that so truly all of the data is surfaced through this tool.

If not in browser, as a downloadable file.

@waldoj
Member

waldoj commented Feb 19, 2015

One person typing in data.gov would give us a $1000 API bill from Bing. We're always going to need limitations on this, because it can theoretically cost an unlimited amount to run.

@rebeccawilliams
Author

Would caching results and searching for the next 300 datasets over and over be a possibility? If folks want more results, should they just use the Bing or Google UI (does that max out too)?

@waldoj
Member

waldoj commented Feb 19, 2015

No, we're charged for every query. We cap results at 300—Bing doesn't. 600 just costs us twice as much as 300. :) It's a setting in the config file—anybody can download LMGTDFY, set the cap at whatever they like, run it themselves, and pay Bing for that API access.

@waldoj
Member

waldoj commented Feb 20, 2015

Here's Bing's API pricing, incidentally. It's my understanding that a "transaction" provides up to 50 results. We're on the free tier right now, and that'll get us a maximum of 5,000 domains entered a month (assuming that every search returns 0–50 results). It'll get us a minimum of 833 domains a month (assuming that every search returns 300 or more results). How many results we're going to average, it's too soon to say—it hasn't even been 36 hours. But once we have some baseline usage data, it'll be possible to tweak the number of returns.

With project funding, I'd make the cap much higher—perhaps 1,000 results? But right now, U.S. Open Data is in no position to make a long-term financial commitment to keep the site operating—it's best to keep it on the free tier.

One more point, for the record: I'm just talking about our installation of this software. Anybody's free to install it on their own system, get a Bing API key, and set the cap at as many results as they want!
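The free-tier arithmetic in this comment can be sketched roughly as follows. This is only an illustration, not LMGTDFY's actual code: the function names are hypothetical, the 5,000-transaction monthly allowance is inferred from the 5,000-domain figure above, and it assumes a search with zero results still costs one transaction.

```python
import math

RESULTS_PER_TRANSACTION = 50    # per the comment: one "transaction" returns up to 50 results
FREE_TIER_TRANSACTIONS = 5000   # assumption, inferred from the 5,000-domain figure


def transactions_for_search(results_found: int, cap: int = 300) -> int:
    """Transactions billed for one domain search, given the result cap."""
    fetched = min(results_found, cap)
    # Assumption: a search that finds nothing still costs one transaction.
    return max(1, math.ceil(fetched / RESULTS_PER_TRANSACTION))


def domains_per_month(results_found: int, cap: int = 300) -> int:
    """Domains searchable per month if every search finds `results_found` results."""
    return FREE_TIER_TRANSACTIONS // transactions_for_search(results_found, cap)


print(domains_per_month(50))                   # best case: one transaction per domain
print(domains_per_month(300))                  # worst case at the 300-result cap
print(transactions_for_search(1000, cap=600))  # doubling the cap doubles the cost
```

Under these assumptions the best case is 5,000 domains a month and the worst case at the current cap is 833, matching the figures above. A cap of 600 needs 12 transactions per large domain instead of 6.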

@rebeccawilliams
Author

Ah thank you for providing that information here!

@emily878

So much fun! Also, as an FYI and not an issue (at least from my perspective), I'm getting 600+ results from my state .gov domain searches.

@waldoj
Member

waldoj commented Feb 24, 2015

Huh. That's odd. Looking through the records, I'm not seeing any searches yielding more than 300 results. I wonder if that's resulting from duplicates, per #18?

@emily878

That looks like it's probably it - at least for the most part. For Virginia.gov I got 650 records, of which 335 were duplicates.
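To check a downloaded result set for this kind of duplication, a minimal sketch along these lines could separate repeated rows from unique ones. The `url` field name and the sample data are hypothetical, and matching on the exact URL string is an assumption about what counts as a duplicate here.

```python
def split_duplicates(records, key="url"):
    """Return (unique, duplicates), keeping the first occurrence of each key value."""
    seen = set()
    unique, duplicates = [], []
    for record in records:
        value = record[key]
        if value in seen:
            duplicates.append(record)
        else:
            seen.add(value)
            unique.append(record)
    return unique, duplicates


# Hypothetical data shaped like the Virginia.gov case: 650 rows, 335 of them repeats.
urls = [f"http://example.gov/data/{i}.csv" for i in range(315)]
records = [{"url": u} for u in urls] + [{"url": urls[i % 315]} for i in range(335)]

unique, duplicates = split_duplicates(records)
print(len(records), len(unique), len(duplicates))
```

If the 335 duplicates are redundant rows in this sense, 315 unique records remain, which is still slightly above the 300 cap and may be why duplicates explain the count only "for the most part".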



@waldoj
Member

waldoj commented Feb 25, 2015

Ugh.
