How high on Google search does a website rank for a specific search query ?
Instead of fixing the Perl script, I've decided to rewrite it in Python, and at the same time check what other options I have to complete this task. It turns out that in the time that has passed since my previous research on this topic, Google have added an option to use their AJAX Search API with external HTTP requests. Just build a GET request and the results will be resulted as a JSON string. Undoubtedly, this is more convenient than screen scraping.
However, this approach is still limited - because the API won't ever return more than 32 results (and even for that, 4 separate requests have to be issued) . To get more results, one still has to use screen scraping.
Anyway, I've coded both approaches as Python modules. The first one, google_api_search uses the Google search API to find results. The second, google_web_search issues normal HTTP search requests to Google with mechanize, and processes the results with BeautifulSoup. I've also whipped a simple CGI interface to these modules: google_search_rank.
|||This is the main disadvantage of "screen scraping" methods - a high sensitivity to the layout of the HTML code.|
|||There is room for optimism, however. When Google first made this option available, they limited the amount of results to 8. Due to many requests, they've increased it to 32, and perhaps they will increase it even more in the future.|