A Ruby gem to get planning applications data from UK council websites.

# UK Planning Scraper

**PRE-ALPHA: Only works with Idox and Northgate sites and spews a lot of stuff
to STDOUT. Not for production use.**

This gem scrapes planning applications data from UK local planning authority
websites, eg Westminster City Council. Data is returned as an array of hashes,
one hash for each planning application.

This scraper gem doesn't use a database. Storing the output is up to you. It's
just a convenient way to get the data.

Currently this only works for Idox and Northgate sites. The ultimate aim is to
provide a consistent interface in a single gem for all variants of all planning
systems: Idox Public Access, Northgate Planning Explorer, OcellaWeb, Agile
Planning and all the one-off systems.

This project is not affiliated with any organisation.

## Installation

Add this line to your application's Gemfile:

```ruby
gem 'uk_planning_scraper', \
  git: 'https://github.com/adrianshort/uk_planning_scraper/'
```

And then execute:

    $ bundle install

Or install it yourself as:

    $ gem install specific_install
    $ gem specific_install adrianshort/uk_planning_scraper

## Usage

### First, require your stuff

```ruby
require 'uk_planning_scraper'
require 'pp'
```

### Scrape from a council

Applications in Westminster decided in the last seven days:

```ruby
pp UKPlanningScraper::Authority.named('Westminster').decided_days(7).scrape
```

### Scrape from a bunch of councils

Scrape the last week's planning decisions across the whole of
London (actually 23 of the 35 authorities right now):

```ruby
authorities = UKPlanningScraper::Authority.tagged('london')

authorities.each do |authority|
  applications = authority.decided_days(7).scrape
  pp applications
  # You'll probably want to save `applications` to your database here
end
```

### Satisfy your niche interests

Launderette applications validated in the last seven days in Scotland:

```ruby
authorities = UKPlanningScraper::Authority.tagged('scotland')

authorities.each do |authority|
  applications = authority.validated_days(7).keywords('launderette').scrape
  pp applications # You'll probably want to save `applications` to your database here
end
```

### More scrape parameter methods

Chain as many scrape parameter methods on a `UKPlanningScraper::Authority`
object as you like, making sure that `scrape` comes last.

```ruby
received_from(Date.parse("1 Jan 2016"))
received_to(Date.parse("31 Dec 2016"))

# Received in the last n days (including today)
# Use instead of received_to, received_from
received_days(7)

validated_to(Date.today)
validated_from(Date.today - 30)
validated_days(7) # instead of validated_to, validated_from

decided_to(Date.today)
decided_from(Date.today - 30)
decided_days(7) # instead of decided_to, decided_from

# Check that the systems you're scraping return the
# results you expect for multiple keywords (AND or OR?)
keywords("hip gable")

applicant_name("Mr and Mrs Smith") # Currently Idox only
application_type("Householder") # Currently Idox only
development_type("") # Currently Idox only

scrape # runs the scraper
```
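
Each `*_days(n)` method stands in for a from/to pair. As a rough illustration of the date arithmetic, the sketch below assumes that "the last n days (including today)" means a window that ends today and starts n - 1 days earlier. That boundary is an assumption, so check the gem's source if the exact start date matters to you. `days_window` is a hypothetical helper, not part of the gem.

```ruby
require 'date'

# Hypothetical helper, not part of uk_planning_scraper.
# Assumes "last n days (including today)" means [today - (n - 1), today].
def days_window(n, today = Date.today)
  raise ArgumentError, 'n must be a positive Integer' unless n.is_a?(Integer) && n > 0

  [today - (n - 1), today]
end

from, to = days_window(7, Date.new(2016, 12, 31))
puts "#{from} to #{to}" # 2016-12-25 to 2016-12-31
```

Under that assumption, `days_window(1)` covers today only.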

### Save to a SQLite database

This gem has no interest whatsoever in persistence. What you do with the data it
outputs is up to you: relational databases, document stores, VHS and clay
tablets are all blissfully none of its business. But using the
[ScraperWiki](https://github.com/openaustralia/scraperwiki-ruby) gem is a really
easy way to store your data:

```ruby
require 'scraperwiki' # Must be installed, of course
ScraperWiki.save_sqlite([:authority_name, :council_reference], applications)
```

That `applications` param can be a hash or an array of hashes, which is what
gets returned by our `Authority#scrape`.
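
The first argument to `save_sqlite` lists the fields that together identify a row, so saving the same application again should update it rather than duplicate it. The plain-Ruby sketch below imitates that behaviour with an in-memory hash, purely to show the idea. `upsert_by_key`, the sample records and the `status` field are hypothetical. Only `authority_name` and `council_reference` come from the example above.

```ruby
# Hypothetical helper: merge records into a store keyed on the given fields,
# so a later record with the same key replaces the earlier one.
def upsert_by_key(store, unique_keys, records)
  records = [records] if records.is_a?(Hash) # accept one hash or an array
  records.each do |record|
    store[record.values_at(*unique_keys)] = record
  end
  store
end

KEYS = [:authority_name, :council_reference].freeze

# Made-up sample data; `status` is an illustrative field name.
store = {}
upsert_by_key(store, KEYS, [
  { authority_name: 'Westminster', council_reference: '18/00001/FULL', status: 'Pending' },
  { authority_name: 'Merton', council_reference: '18/P0001', status: 'Pending' }
])
upsert_by_key(store, KEYS,
  { authority_name: 'Westminster', council_reference: '18/00001/FULL', status: 'Decided' })

puts store.size # 2
```

Keying on both fields matters because a council reference is only meaningful within the authority that issued it.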

### Find authorities by tag

Tags are always lowercase and one word.

```ruby
london_auths = UKPlanningScraper::Authority.tagged('london')
```

We've got tags for areas:

- london
- innerlondon
- outerlondon
- northlondon
- southlondon
- greatermanchester
- surrey
- wales

We also automatically add tags for software systems:

- idox
- northgate
- ocellaweb
- agileplanning
- unknownsystem -- for when we can't identify the system

and whatever you'd like to add that would be useful to others.

### WTF is up with London?

London has got 32 London Boroughs, tagged `londonboroughs`. These are the
councils under the authority of the Mayor of London and the Greater London
Authority.

It has 33 councils: the London Boroughs plus the City of London (named
`City of London`). We don't currently have a tag for this, but if you want to
add `londoncouncils` please go ahead.

And it's got 35 local planning authorities: the 33 councils plus the two
`londondevelopmentcorporations`, named `London Legacy Development Corporation`
and `Old Oak and Park Royal Development Corporation`. The tag `london` covers
all (and only) the 35 local planning authorities in London.

```ruby
UKPlanningScraper::Authority.tagged('londonboroughs').size
# => 32

UKPlanningScraper::Authority.tagged('londondevelopmentcorporations').size
# => 2

UKPlanningScraper::Authority.tagged('london').size
# => 35
```

### More fun with Authority tags

```ruby
UKPlanningScraper::Authority.named('Merton').tags
# => ["england", "london", "londonboroughs", "northgate", "outerlondon", "southlondon"]

UKPlanningScraper::Authority.not_tagged('london')
# => [...]

UKPlanningScraper::Authority.named('Islington').tagged?('southlondon')
# => false
```
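
To make the semantics of `tagged`, `not_tagged` and `tagged?` concrete, here is a toy model in plain Ruby. `ToyAuthority`, `TOY_AUTHORITIES`, the two helper methods and the Islington tag list are invented for illustration and say nothing about how the gem is implemented. Only the Merton tags are taken from the output above.

```ruby
# Toy stand-in for an authority; not the gem's class.
ToyAuthority = Struct.new(:name, :tags) do
  def tagged?(tag)
    tags.include?(tag)
  end
end

TOY_AUTHORITIES = [
  ToyAuthority.new('Merton', %w[england london londonboroughs northgate outerlondon southlondon]),
  ToyAuthority.new('Islington', %w[england london londonboroughs]) # illustrative tags
].freeze

# Authorities that carry the tag.
def toy_tagged(tag)
  TOY_AUTHORITIES.select { |a| a.tagged?(tag) }
end

# Authorities that do not carry the tag.
def toy_not_tagged(tag)
  TOY_AUTHORITIES.reject { |a| a.tagged?(tag) }
end

puts toy_tagged('southlondon').map(&:name).inspect     # ["Merton"]
puts toy_not_tagged('southlondon').map(&:name).inspect # ["Islington"]
```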

### List all authorities

```ruby
UKPlanningScraper::Authority.all.each { |a| puts a.name }
```

### List all tags

```ruby
pp UKPlanningScraper::Authority.tags
```

## Add your favourite local planning authorities

The list of authorities is in a CSV file in `/lib/uk_planning_scraper`:

https://github.com/adrianshort/uk_planning_scraper/blob/master/lib/uk_planning_scraper/authorities.csv

The easiest way to add to or edit this list is to edit within GitHub (use the
pencil icon) and create a new pull request for your changes. If accepted, your
changes will be available to everyone with the next version of the gem.

The file format is one line per authority, with comma-separated:

- Name (omit "the", "council", "borough of", "city of", etc. and write "and" not
  "&", except for `City of London` which is a special case)
- URL of the search form (use the advanced search URL if there is one)
- Tags (use as many comma-separated tags as is reasonable, lowercase and all one
  word.)

There's no need to manually add tags to the `authorities.csv` file for the
software systems like `idox`, `northgate` etc as these are added automatically.

Please check the tag list before you change anything:

```ruby
pp UKPlanningScraper::Authority.tags
```
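
As a sketch of the row format described above, the snippet below parses one line into a name, a URL and a list of tags using Ruby's standard `csv` library. The sample row and its `example.org` URL are made up, and `parse_authority_row` is a hypothetical helper rather than the gem's own loader.

```ruby
require 'csv'

# Hypothetical helper: the first field is the name, the second is the
# search form URL, and every remaining field is treated as a tag.
def parse_authority_row(line)
  name, url, *tags = CSV.parse_line(line).map { |field| field.to_s.strip }
  { name: name, url: url, tags: tags.reject(&:empty?) }
end

# Made-up example row; example.org is a placeholder, not a real search form.
row = 'Example Town,https://planning.example.org/search.do?action=advanced,england,surrey'
p parse_authority_row(row)
```

Because the tags share the same comma separator as the other fields, a row has a variable number of columns, which is why the sketch collects everything after the URL with a splat.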

## Development

After checking out the repo, run `bin/setup` to install dependencies. You can
also run `bin/console` for an interactive prompt that will allow you to
experiment.

To install this gem onto your local machine, run `bundle exec rake install`. To
release a new version, update the version number in `version.rb`, and then run
`bundle exec rake release`, which will create a git tag for the version, push
git commits and tags, and push the `.gem` file to
[rubygems.org](https://rubygems.org).

## Contributing

Bug reports and pull requests are welcome on GitHub at
https://github.com/adrianshort/uk_planning_scraper.