# Google Maps Scraper

This repository contains a customizable Google Maps scraper that extracts business information based on user-defined search criteria. It is ideal for collecting structured data such as software house details, restaurants, or other businesses in a specific location.
## Features

- Customizable Inputs: Specify search queries (e.g., "software houses in New York") to scrape relevant business details.
- Data Extraction: Scrapes the following details from Google Maps:
  - Business name
  - Website URL
  - Phone number
  - Address
  - Reviews
  - Social media links from the business website
- Output Formats: Choose between `CSV` or `JSON` for storing the scraped data.
- Review Scraping: Option to scrape business reviews by enabling `Review_Scrape` (default: `False`).
- Website Scraping: Optionally visit business websites and extract social media links by enabling `Website_Scrape` (default: `False`).
- Modular Design: Clean and modular code structure for easy updates and maintenance.
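The extracted fields map naturally onto one record per business. A minimal sketch of such a record (the `BusinessRecord` name and its defaults are illustrative, not taken from the repository):

```python
from dataclasses import dataclass, field, asdict

@dataclass
class BusinessRecord:
    """One scraped Google Maps listing (field names are illustrative)."""
    name: str
    website: str = ""
    phone: str = ""
    address: str = ""
    reviews: list = field(default_factory=list)              # filled when Review_Scrape is True
    social_media_links: list = field(default_factory=list)   # filled when Website_Scrape is True

record = BusinessRecord(name="Example Business", website="https://example.com")
print(asdict(record)["name"])  # -> Example Business
```

Using a single record type keeps the CSV and JSON writers symmetric: both serialize the same dictionary produced by `asdict`.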
## Project Structure

```
├── Main.py                # Entry point of the application
├── utils/
│   ├── Scraper.py         # Core scraping logic for Google Maps
│   ├── Pprints.py         # Utility for terminal-friendly progress printing
│   ├── DataHandler.py     # Handles data storage and format selection
│   └── Website_Scraper.py # Handles scraping of social media links from websites
```
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/<your-username>/google-maps-scraper.git
   cd google-maps-scraper
   ```

2. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

## Usage

1. Open `Main.py` and configure the desired settings:
   - Search Query: Define the business and location to scrape.
   - Output Format: Set to either `CSV` or `JSON`.
   - Review Scrape: Enable (`True`) or disable (`False`) review scraping.
   - Website Scrape: Enable (`True`) or disable (`False`) website scraping.

2. Run the scraper:

   ```bash
   python Main.py
   ```
To include reviews in the output, set the `Review_Scrape` flag to `True` in `Main.py`.
To scrape social media links from business websites, set the `Website_Scrape` flag to `True` in `Main.py`.
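The website pass can be sketched with the `requests` and `beautifulsoup4` packages listed in the requirements. This is an assumed implementation for illustration, not the actual code in `Website_Scraper.py`; the `extract_social_links` and `scrape_website` names are hypothetical:

```python
import re
import requests
from bs4 import BeautifulSoup

# Matches absolute URLs on the major social platforms (illustrative list).
SOCIAL_PATTERN = re.compile(
    r"https?://(?:www\.)?(?:twitter|x|facebook|instagram|linkedin)\.com/\S+",
    re.IGNORECASE,
)

def extract_social_links(html: str) -> list:
    """Return unique social media URLs found in the anchor tags of a page."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for a in soup.find_all("a", href=True):
        if SOCIAL_PATTERN.match(a["href"]) and a["href"] not in links:
            links.append(a["href"])
    return links

def scrape_website(url: str) -> list:
    """Fetch a business website and pull social links; empty list on failure."""
    try:
        resp = requests.get(url, timeout=10)
        resp.raise_for_status()
    except requests.RequestException:
        return []
    return extract_social_links(resp.text)
```

Wrapping the request in a `try`/`except` matters here: business websites are frequently down or slow, and one dead site should not abort the whole scrape.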
Select the desired output format (`CSV` or `JSON`) in `Main.py` for storing the scraped data.
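The format switch might look like the following; `save_results` is a hypothetical helper for illustration, not the repository's actual `DataHandler` API:

```python
import csv
import json

def save_results(records, output_format="CSV", path_stem="results"):
    """Write a list of dicts to <path_stem>.csv or <path_stem>.json based on the flag."""
    if output_format.upper() == "JSON":
        path = f"{path_stem}.json"
        with open(path, "w", encoding="utf-8") as f:
            json.dump(records, f, indent=2, ensure_ascii=False)
    elif output_format.upper() == "CSV":
        path = f"{path_stem}.csv"
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(records[0].keys()))
            writer.writeheader()
            writer.writerows(records)
    else:
        raise ValueError(f"Unsupported format: {output_format}")
    return path

# Example:
# save_results([{"name": "Example Business", "phone": "+1-234-567-890"}], "JSON")
```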
```python
from utils.Scraper import Scraper
from utils.DataHandler import DataHandler
from utils.Website_Scraper import WebsiteScraper

# Configuration
search_query = "software houses in New York"
output_format = "CSV"    # Options: "CSV", "JSON"
review_scrape = True     # Enable or disable review scraping
website_scrape = False   # Enable or disable website scraping

if __name__ == "__main__":
    # Instantiate the Main class with the configuration above and start the scraper.
    app = Main(
        headless_mode='--headless',
        reviews_scrape=review_scrape,
        output_format=output_format,
        website_scrape=website_scrape,
    )
    app.run()
```

## Output

The scraper generates structured data files based on the chosen format:
- CSV Output:

  ```
  Name, Website, Phone, Address, Reviews (if enabled), Social Media Links (if enabled)
  ```

- JSON Output (`reviews` and `social_media_links` appear only when the corresponding flags are enabled):

  ```json
  [
    {
      "name": "Example Business",
      "website": "https://example.com",
      "phone": "+1-234-567-890",
      "address": "123 Example Street, New York, NY",
      "reviews": ["Review 1", "Review 2", ...],
      "social_media_links": ["https://twitter.com/example", ...]
    }
  ]
  ```
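Either format loads directly for downstream analysis. A sketch using `pandas` (already listed in `requirements.txt`), with a record inlined in the JSON shape shown above; in a real run you would read the scraper's output file instead:

```python
import pandas as pd

# A record in the JSON output shape shown above (inlined here for illustration;
# a real run would use pd.read_csv or pd.read_json on the generated file).
records = [
    {
        "name": "Example Business",
        "website": "https://example.com",
        "phone": "+1-234-567-890",
        "address": "123 Example Street, New York, NY",
        "reviews": ["Review 1", "Review 2"],
    }
]

df = pd.DataFrame(records)
df["review_count"] = df["reviews"].apply(len)
print(df[["name", "review_count"]])
```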
## Requirements

- Python 3.8+
- Required Python packages (listed in `requirements.txt`):
  - `requests`
  - `beautifulsoup4`
  - `pandas`
  - Any other dependencies used in the project

Note: `json` is part of the Python standard library and does not need to be installed separately.
## Contributing

Contributions are welcome! Feel free to create a pull request or open an issue to report bugs or request features.
## License

This project is licensed under the MIT License. See the LICENSE file for details.

