Scrape a website for suburbs using Beautiful Soup and Python

Sometimes you just need a list: no HTML, no formatting, and you need it quickly. Here is a quick project to get a list of suburbs from the Brisbane City Council page located here:

https://www.brisbane.qld.gov.au/about-council/council-information-and-rates/brisbane-suburbs

Of course, you first need to install Python, then the dependencies: requests and Beautiful Soup. I use pip to install them (pip install requests beautifulsoup4).
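
Before running the script, it can help to confirm that both packages are importable. This is a minimal sketch, assuming the usual pip package names (requests, and beautifulsoup4, which is imported as bs4). The helper name missing_packages is made up for this example.

```python
import importlib.util

# import name -> pip package name (the two differ for Beautiful Soup)
REQUIRED = {'requests': 'requests', 'bs4': 'beautifulsoup4'}


def missing_packages(required=REQUIRED):
    """Return the pip names of any required packages that cannot be imported."""
    return [pip_name for import_name, pip_name in required.items()
            if importlib.util.find_spec(import_name) is None]


if __name__ == '__main__':
    missing = missing_packages()
    if missing:
        print('Install with: pip install ' + ' '.join(missing))
    else:
        print('All dependencies are installed.')
```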

import csv
import logging

import requests
from bs4 import BeautifulSoup

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# create a new csv file; every write below must stay inside this block,
# because the file is closed as soon as the block ends
with open('suburbs.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['Suburb'])

    # get the html; stop early on an HTTP error rather than parsing an error page
    response = requests.get('https://www.brisbane.qld.gov.au/about-council/council-information-and-rates/brisbane-suburbs', timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')

    # find all the tables
    tables = soup.find_all('table')
    for i, table in enumerate(tables, 1):
        logger.info(f'Processing table {i}')
        # find all the table rows
        rows = table.find_all('tr')
        for j, row in enumerate(rows, 1):
            logger.info(f'Processing row {j}')
            # the suburb is the first <td> in each row;
            # header rows that contain only <th> cells are skipped
            suburb_cell = row.find('td')
            if suburb_cell:
                suburb = suburb_cell.text.strip()
                logger.info(suburb)
                writer.writerow([suburb])
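
The row-parsing step can also be pulled into a function and tried against a small HTML string, without touching the network. This is only a sketch: the function name extract_suburbs and the sample markup are made up for illustration, and the real council page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not copied from the council page.
SAMPLE_HTML = """
<table>
  <tr><th>Suburb</th><th>Notes</th></tr>
  <tr><td> Albion </td><td>example</td></tr>
  <tr><td>Alderley</td><td>example</td></tr>
</table>
"""


def extract_suburbs(html):
    """Return the stripped text of the first <td> in every table row."""
    soup = BeautifulSoup(html, 'html.parser')
    suburbs = []
    for table in soup.find_all('table'):
        for row in table.find_all('tr'):
            cell = row.find('td')  # None for header rows that only contain <th>
            if cell:
                suburbs.append(cell.text.strip())
    return suburbs


if __name__ == '__main__':
    print(extract_suburbs(SAMPLE_HTML))
```

Keeping the parsing separate from the download and the CSV writing makes it easier to see which step broke if the page layout changes.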


This writes the following to suburbs.csv:

Suburb
Acacia Ridge
Albion
Alderley
Algester
Annerley
Anstead
Archerfield
Ascot
Ashgrove
Aspley
Auchenflower
Bald Hills
Balmoral
Banks Creek
Banyo
Bardon
Bellbowrie
Belmont
Boondall
Bowen Hills
Bracken Ridge
Bridgeman Downs
Brighton
Brisbane Airport
Brisbane City
Brookfield
Bulimba
Burbank
Calamvale
Camp Hill
Cannon Hill
Carina
Carina Heights
Carindale
Carseldine
Chandler
Chapel Hill
Chelmer
Chermside
Chermside West
Chuwar
Clayfield
Coopers Plains
Coorparoo
Corinda
Darra
Deagon
Doolandella
Drewvale
Durack
Dutton Park
Eagle Farm
East Brisbane
Eight Mile Plains
Ellen Grove
England Creek
Enoggera
Enoggera Reservoir
Everton Park
Fairfield
Ferny Grove
Fig Tree Pocket
Fitzgibbon
Forest Lake
Fortitude Valley
Gaythorne
Geebung
Gordon Park
Graceville
Grange
Greenslopes
Gumdale
Hamilton
Hawthorne
Heathwood
Hemmant
Hendra
Herston
Highgate Hill
Holland Park
Holland Park West
Inala
Indooroopilly
Jamboree Heights
Jindalee
Kalinga
Kangaroo Point
Karana Downs
Karawatha
Kedron
Kelvin Grove
Kenmore
Kenmore Hills
Keperra
Kholo
Kuraby
Lake Manchester
Larapinta
Lota
Lutwyche
Lytton
Macgregor
Mackenzie
Manly
Manly West
Mansfield
McDowall
Middle Park
Milton
Mitchelton
Moggill
Moorooka
Morningside
Mt Coot-tha
Mt Crosby
Mt Gravatt
Mt Gravatt East
Mt Ommaney
Murarrie
Nathan
New Farm
Newmarket
Newstead
Norman Park
Northgate
Nudgee
Nudgee Beach
Nundah
Oxley
Paddington
Pallara
Parkinson
Petrie Terrace
Pinjarra Hills
Pinkenba
Port of Brisbane
Pullenvale
Ransome
Red Hill
Richlands
Riverhills
Robertson
Rochedale
Rocklea
Runcorn
Salisbury
Sandgate
Seven Hills
Seventeen Mile Rocks
Sherwood
Shorncliffe
Sinnamon Park
South Brisbane
Spring Hill
St Lucia
Stafford
Stafford Heights
Stones Corner
Stretton
Sumner
Sunnybank
Sunnybank Hills
Taigum
Taringa
Tarragindi
Teneriffe
Tennyson
The Gap
Tingalpa
Toowong
Upper Brookfield
Upper Kedron
Upper Mt Gravatt
Virginia
Wacol
Wakerley
Wavell Heights
West End
Westlake
Willawong
Wilston
Windsor
Wishart
Woolloongabba
Wooloowin
Wynnum
Wynnum West
Yeerongpilly
Yeronga
Zillmere
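
Once the file exists, a quick sanity check can catch blank or repeated names. The sketch below uses only the standard library. The helper names load_suburbs and find_duplicates are made up for this example, and an in-memory string (with one deliberately repeated name) stands in for the real file.

```python
import csv
import io


def load_suburbs(csvfile):
    """Read the 'Suburb' column from an open CSV file, skipping blank names."""
    reader = csv.DictReader(csvfile)
    names = []
    for row in reader:
        name = (row.get('Suburb') or '').strip()
        if name:
            names.append(name)
    return names


def find_duplicates(names):
    """Return each name that appears more than once, in first-repeat order."""
    seen, dupes = set(), []
    for name in names:
        if name in seen and name not in dupes:
            dupes.append(name)
        seen.add(name)
    return dupes


if __name__ == '__main__':
    # Stand-in for open('suburbs.csv', newline=''); 'Albion' is repeated on purpose.
    sample = io.StringIO('Suburb\r\nAcacia Ridge\r\nAlbion\r\nAlderley\r\nAlbion\r\n')
    names = load_suburbs(sample)
    print(len(names), 'names, duplicates:', find_duplicates(names))
```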

