First commit

master
Adrian Short, 6 years ago
Commit 1665659418
5 files changed, with 66 insertions and 0 deletions
  1. .gitignore (+4, -0)
  2. Gemfile (+6, -0)
  3. README.md (+4, -0)
  4. councils.csv (+19, -0)
  5. scraper.rb (+33, -0)

+ 4
- 0
.gitignore

@@ -0,0 +1,4 @@
.ruby-*
*.db
*.sqlite
Gemfile.lock

+ 6
- 0
Gemfile

@@ -0,0 +1,6 @@
source "https://rubygems.org"

ruby '2.3.1'

gem 'uk_planning_scraper', :git => 'https://github.com/adrianshort/uk_planning_scraper/'
gem 'scraperwiki', :git => 'https://github.com/openaustralia/scraperwiki-ruby/', :branch => 'morph_defaults'

+ 4
- 0
README.md

@@ -0,0 +1,4 @@
# BT InLink planning applications scraper

Scrapes planning applications data for [BT InLink kiosks](https://www.adrianshort.org/tags/inlinkuk/) from UK council websites.


+ 19
- 0
councils.csv

@@ -0,0 +1,19 @@
City of London,http://www.planning2.cityoflondon.gov.uk/online-applications/search.do?action=advanced
Barking and Dagenham,http://paplan.lbbd.gov.uk/online-applications/search.do?action=advanced
Barnet,https://publicaccess.barnet.gov.uk/online-applications/search.do?action=advanced
#Bexley,http://pa.bexley.gov.uk/online-applications/search.do?action=advanced
Brent,https://pa.brent.gov.uk/online-applications/search.do?action=advanced&searchType=Application
Bromley,https://searchapplications.bromley.gov.uk/online-applications/search.do?action=advanced
#Croydon,http://publicaccess2.croydon.gov.uk/online-applications/search.do?action=advanced
Ealing,https://pam.ealing.gov.uk/online-applications/search.do?action=advanced
Enfield,https://planningandbuildingcontrol.enfield.gov.uk/online-applications/search.do?action=advanced
#Newham,https://pa.newham.gov.uk/online-applications/search.do?action=advanced
Sutton,https://planningregister.sutton.gov.uk/online-applications/search.do?action=advanced
#Greenwich,https://planning.royalgreenwich.gov.uk/online-applications/search.do?action=advanced
#Hammersmith and Fulham,http://public-access.lbhf.gov.uk/online-applications/search.do?action=advanced
Lambeth,https://planning.lambeth.gov.uk/online-applications/search.do?action=advanced
Lewisham,http://planning.lewisham.gov.uk/online-applications/search.do?action=advanced
Southwark,https://planning.southwark.gov.uk/online-applications/search.do?action=advanced
#Tower Hamlets,https://development.towerhamlets.gov.uk/online-applications/search.do?action=advanced
Westminster,http://idoxpa.westminster.gov.uk/online-applications/search.do?action=advanced
#Bristol,https://planningonline.bristol.gov.uk/online-applications/search.do?action=advanced
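Rows in councils.csv that begin with `#` are treated as disabled councils by scraper.rb. The following is a minimal sketch of that convention, not code from the commit: `load_councils` is a hypothetical helper name, and the extra guard for blank rows is an assumption added for illustration.

```ruby
require 'csv'

# Hypothetical helper (not part of the commit): parse council rows from CSV
# text, skipping blank rows and rows whose first field starts with '#'.
def load_councils(csv_text)
  CSV.parse(csv_text).each_with_object([]) do |row, councils|
    name, url = row
    next if name.nil? || name.start_with?('#')

    councils << { name: name, url: url }
  end
end

# Three rows copied from councils.csv; Bexley is commented out.
sample = <<~CSV
  Barnet,https://publicaccess.barnet.gov.uk/online-applications/search.do?action=advanced
  #Bexley,http://pa.bexley.gov.uk/online-applications/search.do?action=advanced
  Brent,https://pa.brent.gov.uk/online-applications/search.do?action=advanced&searchType=Application
CSV

councils = load_councils(sample)
puts councils.map { |c| c[:name] }.join(', ')
```

CSV has no native comment syntax, so the `#` prefix only works because the loader checks the first field itself.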

+ 33
- 0
scraper.rb 查看文件

@@ -0,0 +1,33 @@
require 'uk_planning_scraper'
require 'scraperwiki'
require 'date'
require 'time'
require 'csv'

councils = []

CSV.foreach('councils.csv') do |line|
  councils << { name: line[0], url: line[1] } unless line[0][0] == '#'
end

params = {
  validated_from: Date.today - ENV['MORPH_DAYS'].to_i,
  validated_to: Date.today,
  description: 'inlink',
}

councils.each do |council|
  apps = UKPlanningScraper.search(council[:url], params)
  apps.map! do |app|
    app.merge(
      {
        la_name: council[:name],
        updated_at: Time.now
      }
    )
  end
  ScraperWiki.save_sqlite([:council_reference, :la_name], apps, 'applications')
  puts "#{council[:name]}: #{apps.size}"
end
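
The date window and the per-record tagging in scraper.rb can be tried without any network access. The sketch below is an illustration under stated assumptions, not the commit's code: `search_params` and `tag_apps` are hypothetical helper names, the date is arbitrary, and the sample record is a stand-in because the commit does not show the fields returned by the search.

```ruby
require 'date'
require 'time'

# Hypothetical helper: build the search parameters used in scraper.rb.
# A missing or non-numeric MORPH_DAYS value converts to 0 via to_i, so
# validated_from and validated_to both become today.
def search_params(days_env, today = Date.today)
  {
    validated_from: today - days_env.to_i,
    validated_to: today,
    description: 'inlink'
  }
end

# Hypothetical helper: add the council name and a timestamp to each record,
# mirroring the merge step in scraper.rb.
def tag_apps(apps, council_name, now = Time.now)
  apps.map { |app| app.merge(la_name: council_name, updated_at: now) }
end

today = Date.new(2018, 6, 1) # arbitrary date chosen for the example
params = search_params('7', today)
puts params[:validated_from]

# Stand-in search result; only council_reference is known from the commit.
apps = tag_apps([{ council_reference: 'EXAMPLE/0001' }], 'Barnet', Time.utc(2018, 6, 1))
puts apps.first[:la_name]
```

Passing `today` and `now` as arguments keeps the helpers deterministic, which makes the zero-day fallback for an unset MORPH_DAYS easy to check.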
