Yelp Scraper to Get Business Info in a Geographic Area

I spent the past couple days on one of my first Python projects - using the Yelp API to compile a list of restaurants in a defined geographic area.

It's been a good project. Because of some limitations of the API, I had to do some interesting tricks to make it work. One problem with the API is that it returns at most 20 hits per query, so if you want to cover a big area, you have to divide it up into smaller queries that each return fewer than 20 hits.

To accomplish that, the program starts from two points that define the corners of a rectangle. If a query for that rectangle returns 20 hits, the program splits the rectangle in half along its longer dimension and runs a query on each of the two new rectangles. For each of those, if there are 20 hits, it splits again and runs two new queries, and so forth until a rectangle returns fewer than 20 hits. At that point the results for that rectangle are complete, and the data is entered into a database. Once all the rectangles have been processed, a comma-separated file is created and the program ends.
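
Here is a minimal sketch of that subdivision idea in Python. It is not the original script: the names Box, split_box and collect are made up for illustration, and query stands in for whatever function actually calls the Yelp API for a rectangle. It assumes each result is a dict with a unique "id" key, and it adds a minimum box size so the recursion stops even if 20 or more businesses share the same spot.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representation of a rectangle: (south, west, north, east).
Box = Tuple[float, float, float, float]

MAX_HITS = 20  # the per-query cap described in the post


def split_box(box: Box) -> Tuple[Box, Box]:
    """Cut the rectangle in half across its longer dimension."""
    south, west, north, east = box
    if (north - south) >= (east - west):
        mid = (south + north) / 2
        return (south, west, mid, east), (mid, west, north, east)
    mid = (west + east) / 2
    return (south, west, north, mid), (south, mid, north, east)


def collect(box: Box,
            query: Callable[[Box], List[dict]],
            min_size: float = 1e-6) -> Dict[object, dict]:
    """Return every business in the box, keyed by its "id".

    `query` is a placeholder for the real API call; it is assumed to
    return at most MAX_HITS results for the rectangle it is given.
    """
    hits = query(box)
    south, west, north, east = box
    too_small = max(north - south, east - west) < min_size

    # Fewer than MAX_HITS means the query was not truncated, so the
    # results for this rectangle are complete.
    if len(hits) < MAX_HITS or too_small:
        return {hit["id"]: hit for hit in hits}

    # Otherwise split and recurse. Keying by id drops duplicates that
    # sit exactly on the shared edge of the two halves.
    first, second = split_box(box)
    found = collect(first, query, min_size)
    found.update(collect(second, query, min_size))
    return found
```

Calling collect((south, west, north, east), my_query) with a real query function would return a dict of businesses that could then be written to a database or passed to csv.DictWriter.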

It was pretty incredible switching to Python for this project from my usual Java, and also using an official API for the first time. This project ended up being about 200 lines (half of which are comments). I can't imagine how long it would be with Java, since I used some rather powerful Python modules to accomplish this (namely, csv, urllib & json).

If anybody is interested in seeing/using the code, let me know. It should be useful if you need a list of restaurants or other businesses in a certain area. Worthy causes only please!

Hi,

Thanks for putting this out here. Does your script also grab the physical address and website address of the business? For example: I need a listing of all the businesses with their categories and addresses from this URL

http://www.yelp.com/search?find_loc=San+Francisco,+CA#attrs=Alcohol.beer...

Thanks

Can you make a script to grab all Yelp business emails?
What language?
Desktop or web script?

thanks

will pay ..sir :)

im the above.. forgot email

The script is written in Python, and cannot currently grab business emails. Please contact me directly if you're interested in the script.
