pip install researchpy bs4
Iowa Culture App
Brief Description
Iowa Culture App provides detailed information on various arts, history and cultural destinations in Iowa.
Source
Source: Iowa Culture
Sample code for extracting the data from Iowa Culture is shown below. The complete code is in the HTML file Code for Web Scraping.
import json
import re
from bs4 import BeautifulSoup
import pandas as pd
import warnings
import researchpy as rp
import os
warnings.filterwarnings('ignore')
The extract_info() function below extracts the fields of interest from the data returned by the Iowa Culture App website. It is tailored to the HTML format the site uses for each record, so it depends on that markup staying the same.
def extract_info(data):
    results = []
    for place in data['p']:
        info = {}
        info['title'] = place['title']
        info['latitude'] = place['lat']
        info['longitude'] = place['lon']
        info['image'] = f"/images/icons/markers/{place['icon']}"

        # Extracting distance
        distance_start = place['win'].find('<span class="distance">') + len('<span class="distance">')
        distance_end = place['win'].find('</span>', distance_start)
        info['distance'] = place['win'][distance_start:distance_end].strip()

        # Extracting full address
        address_start = place['win'].find('<strong>Address:</strong>') + len('<strong>Address:</strong>')
        address_end = place['win'].find('</p>', address_start)
        info['address'] = place['win'][address_start:address_end].strip().replace('<br />', ', ')

        # Extracting phone number
        phone_start = place['win'].find('<i class="fa fa-phone fa-fw"></i>')
        if phone_start != -1:
            phone_start += len('<i class="fa fa-phone fa-fw"></i>')
            phone_end = place['win'].find('</p>', phone_start)
            info['phone'] = place['win'][phone_start:phone_end].strip()
        else:
            info['phone'] = None

        # Extracting website
        website_start = place['win'].find('<i class="fa fa-globe fa-fw"></i><a href="')
        if website_start != -1:
            website_start += len('<i class="fa fa-globe fa-fw"></i><a href="')
            website_end = place['win'].find('"', website_start)
            info['website'] = place['win'][website_start:website_end].strip()
        else:
            info['website'] = None

        results.append(info)
    return results
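The string searches in extract_info() depend on the exact markup of each record. As a possible alternative, the same fields could be read with BeautifulSoup, which is already imported. The sketch below is not the original method: the helper name parse_window() is hypothetical, and the sample string is an abridged version of the second test record's 'win' field.

```python
from bs4 import BeautifulSoup


def parse_window(win_html):
    """Parse one 'win' HTML fragment into a dict (hypothetical helper)."""
    soup = BeautifulSoup(win_html, "html.parser")

    # class_ matches any one of the classes listed on a tag.
    distance_tag = soup.find("span", class_="distance")
    address_tag = soup.find("p", class_="loc-address")
    phone_icon = soup.find("i", class_="fa-phone")
    globe_icon = soup.find("i", class_="fa-globe")

    address = None
    if address_tag is not None:
        # Drop the "Address:" label, then join the <br />-separated lines.
        label = address_tag.find("strong")
        if label is not None:
            label.extract()
        address = ", ".join(address_tag.stripped_strings)

    phone = None
    if phone_icon is not None:
        # The number is the text of the paragraph that holds the phone icon.
        phone = phone_icon.parent.get_text(strip=True)

    website = None
    if globe_icon is not None:
        link = globe_icon.parent.find("a")
        website = link["href"] if link is not None else None

    distance = None
    if distance_tag is not None:
        distance = distance_tag.get_text(strip=True)

    return {"distance": distance, "address": address,
            "phone": phone, "website": website}


# Abridged sample based on the "Escalieta I" record.
sample = (
    '<div class="window-content">'
    '<span class="distance">0.1mi.</span>'
    '<p class="loc-address">\n\t<strong>Address:</strong>\n\t'
    'Wallace Rd.<br />Ames, IA 50011<br />Story County\t</p>'
    '<p class="iw-contact"><i class="fa fa-phone fa-fw"></i> 515.242.6195</p>'
    '<p class="iw-contact"><i class="fa fa-globe fa-fw"></i>'
    '<a href="http://www.publicartarchive.org/work/escalieta-i" '
    'target="_blank">Website</a></p>'
    '</div>'
)
print(parse_window(sample))
```

Looking tags up by class rather than by exact substring should tolerate small changes in whitespace or attribute order, at the cost of an extra parsing step per record.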
extract_city_county(): extracts the city and county from the address. The website's HTML does not give either one explicitly; both appear only as parts of a single address string.
extract_marker(): extracts the marker name from the image path, which identifies the location's category.
# Function to get city and county from the address
def extract_city_county(address):
    parts = address.split(',')
    city = None
    county = None
    for i, part in enumerate(parts):
        if 'IA' in part:
            city = parts[i-1].strip()
        if 'County' in part:
            county = part.strip().replace(' County', '')
    return city, county

# Function to extract image path as marker
def extract_marker(image_path):
    base_name = os.path.basename(image_path)
    marker_name = os.path.splitext(base_name)[0]
    return marker_name
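extract_city_county() treats any comma-separated part containing the letters 'IA' as the state part. A stricter variant may be useful if addresses turn out to contain 'IA' elsewhere. The sketch below assumes the state part always has the form "IA" followed by a ZIP code, as in the sample records. The name extract_city_county_strict() is hypothetical and not part of the original code.

```python
import re


def extract_city_county_strict(address):
    """Stricter city/county lookup (hypothetical variant).

    Assumes the state part looks like "IA 50010" or "IA 50010-1234".
    """
    parts = [part.strip() for part in address.split(",")]
    city = None
    county = None
    for i, part in enumerate(parts):
        # The city is taken from the part just before "IA <ZIP>".
        if i > 0 and re.fullmatch(r"IA\s+\d{5}(?:-\d{4})?", part):
            city = parts[i - 1]
        elif part.endswith(" County"):
            county = part[: -len(" County")]
    return city, county


print(extract_city_county_strict("Union Dr., Ames, IA 50010, Story County"))
# ('Ames', 'Story')
print(extract_city_county_strict("Main St."))
# (None, None)
```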
The chunk below tests extract_info() on three sample records.
arts = {
"s": 1,
"p": [
{"icon": "national-register-of-historic-places.png",
"title": "Christian Peterson Courtyard Sculptures and Dairy Industry Building",
"lat": "42.026731",
"lon": "-93.642860",
"win": "<div class=\"window-container scroll\">\n\t<div class=\"image-container\">\n\t\t<img src=\"/images/default/history.jpg\" class=\"scale\" />\n\t</div>\n\t<div class=\"window-content\">\n\t\t<div class=\"window-title\">\n\t\t\t<div class=\"pin\"><figure><img src=\"/images/icons/markers/national-register-of-historic-places.png\" alt=\"\" class=\"scale\" /></figure></div>\n\t\t\t<div class=\"rl-content\"><span class=\"title\">Christian Peterson Courtyard Sculptures and Dairy Industry Building</span><span class=\"city\"><a href=\"https://maps.google.com/maps?q=42.026731,-93.642860\" target=\"_blank\">Get Directions</a></span>\n\t\t\t<span class=\"distance\">0.1mi.</span></div>\n\t\t</div>\n\n\t\t<p>Christian Petersen was the only sculptor in the Public Works of Art Project in Iowa City in 1934, directed by Iowa Painter Grant Wood. His sculptures here are regarded as \"significant artistic statements on agriculture, technology, and higher education in mid-1930s America.\" The Christian Peterson Courtyard Sculptures and Dairy Industry Building was added to the National Register of Historic Places in 1987.</p>\t\t\t\t\t\t\t\t<p class=\"loc-address\">\n\t\t\t<strong>Address:</strong>\n\t\t\tUnion Dr.<br />Ames, IA 50010<br />Story County\t\t</p>\n\t\t\t\t\n\t\t</div>\n\t</div>"
},
{"icon": "public-art.png",
"title": "Escalieta I",
"lat": "42.025650",
"lon": "-93.644474",
"win": "<div class=\"window-container scroll\">\n\t<div class=\"image-container\">\n\t\t<img src=\"/images/default/arts.jpg\" class=\"scale\" />\n\t</div>\n\t<div class=\"window-content\">\n\t\t<div class=\"window-title\">\n\t\t\t<div class=\"pin\"><figure><img src=\"/images/icons/markers/public-art.png\" alt=\"\" class=\"scale\" /></figure></div>\n\t\t\t<div class=\"rl-content\"><span class=\"title\">Escalieta I</span><span class=\"city\"><a href=\"https://maps.google.com/maps?q=42.025650,-93.644474\" target=\"_blank\">Get Directions</a></span>\n\t\t\t<span class=\"distance\">0.1mi.</span></div>\n\t\t</div>\n\n\t\t<p>Manuel Neri created this marble sculpture in 2004. The sculpture references the female form, a goddess figure, in a state of transformation.</p>\t\t\t\t\t\t\t\t<p class=\"loc-address\">\n\t\t\t<strong>Address:</strong>\n\t\t\tWallace Rd.<br />Gerdin Business Building, Iowa State University<br />Ames, IA 50011<br />Story County\t\t</p>\n\t\t<p class=\"iw-contact\"><i class=\"fa fa-phone fa-fw\"></i> 515.242.6195</p>\t\t<p class=\"iw-contact\"><i class=\"fa fa-globe fa-fw\"></i><a href=\"http://www.publicartarchive.org/work/escalieta-i\" target=\"_blank\">Website</a></p>\n\t\t</div>\n\t</div>"
},
{"icon": "public-art.png",
"title": "History of Dairying",
"lat": "42.026875",
"lon": "-93.642983",
"win": "<div class=\"window-container scroll\">\n\t<div class=\"image-container\">\n\t\t<img src=\"https://www.iowacultureapp.com/images/story-artpublic-historyofdairying.jpg\" class=\"scale\" />\n\t</div>\n\t<div class=\"window-content\">\n\t\t<div class=\"window-title\">\n\t\t\t<div class=\"pin\"><figure><img src=\"/images/icons/markers/public-art.png\" alt=\"\" class=\"scale\" /></figure></div>\n\t\t\t<div class=\"rl-content\"><span class=\"title\">History of Dairying</span><span class=\"city\"><a href=\"https://maps.google.com/maps?q=42.026875,-93.642983\" target=\"_blank\">Get Directions</a></span>\n\t\t\t<span class=\"distance\">0.1mi.</span></div>\n\t\t</div>\n\n\t\t<p>Christian Peterson, an artist-in-residence at Iowa State University from 1934 to 1955, was commissioned to create this sculptural mural, depicting the history of \r\n the dairy industry in Iowa and America, as part of the Depression-era Public Works of Art Project. The bas relief, measuring 84 inches x 972 inches, took eight months to complete. This terra cotta work was the beginning of Peterson's 21 year career as campus sculptor-in-residence.</p>\t\t\t\t\t\t\t\t<p class=\"loc-address\">\n\t\t\t<strong>Address:</strong>\n\t\t\tFood Sciences Building Courtyard<br />Iowa State University<br />Ames, IA 50010<br />Story County\t\t</p>\n\t\t<p class=\"iw-contact\"><i class=\"fa fa-phone fa-fw\"></i> 515.294.3342</p>\t\t\n\t\t</div>\n\t</div>"
}
] }
arts_data = extract_info(arts)
arts_data = pd.DataFrame(arts_data)
arts_data[['city', 'county']] = arts_data['address'].apply(lambda x: pd.Series(extract_city_county(x)))
arts_data['marker'] = arts_data['image'].apply(extract_marker)
arts_data
|   | title | latitude | longitude | image | distance | address | phone | website | city | county | marker |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Christian Peterson Courtyard Sculptures and Da... | 42.026731 | -93.642860 | /images/icons/markers/national-register-of-his... | 0.1mi. | Union Dr., Ames, IA 50010, Story County | None | None | Ames | Story | national-register-of-historic-places |
| 1 | Escalieta I | 42.025650 | -93.644474 | /images/icons/markers/public-art.png | 0.1mi. | Wallace Rd., Gerdin Business Building, Iowa St... | 515.242.6195 | http://www.publicartarchive.org/work/escalieta-i | Ames | Story | public-art |
| 2 | History of Dairying | 42.026875 | -93.642983 | /images/icons/markers/public-art.png | 0.1mi. | Food Sciences Building Courtyard, Iowa State U... | 515.294.3342 | None | Ames | Story | public-art |
Cultural_Locations_per_10k
Measure Description
For our analysis, we focused on the total count of historic sites, museums, public art, monuments, and similar locations, collectively referred to as cultural locations. To assess the distribution of these cultural sites within each city or county, we computed the number of cultural locations per 10,000 people. This measure allows us to compare the availability and density of cultural resources across different areas.
Measure Calculation
After extracting the relevant data from the website, we tallied the total number of cultural locations for each city or county. To calculate our measure of interest (Cultural_Locations_per_10k), we used the following steps:
- Extracted city/county population data from the American Community Survey (ACS).
- Divided the total number of cultural locations in each city/county by the population of the corresponding city/county.
- Multiplied the results by 10,000 to express the number of cultural locations per 10,000 people.
Thus, the final measure Cultural_Locations_per_10k was obtained using this formula:
\[ \text{Cultural\_Locations\_per\_10k} = \frac{\text{Total Number of Cultural Locations}}{\text{Population}} \times 10{,}000 \]
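The calculation can be sketched in pandas as follows. This is an illustration under stated assumptions, not the original analysis code. The site names and population figures are made-up placeholders rather than ACS values, and the column names cultural_locations and population are hypothetical.

```python
import pandas as pd

# One row per cultural location, as produced by the extraction step.
# Titles and counties here are placeholders for illustration only.
locations = pd.DataFrame({
    "title": ["Site A", "Site B", "Site C", "Site D"],
    "county": ["Story", "Story", "Story", "Polk"],
})

# Placeholder populations (NOT real ACS figures).
population = pd.DataFrame({
    "county": ["Story", "Polk"],
    "population": [100_000, 500_000],
})

# Step 1: tally the cultural locations in each county.
counts = (
    locations.groupby("county")
    .size()
    .reset_index(name="cultural_locations")
)

# Step 2: attach the population of the corresponding county.
measure = counts.merge(population, on="county", how="left")

# Step 3: divide by population and scale to a rate per 10,000 people.
measure["Cultural_Locations_per_10k"] = (
    measure["cultural_locations"] / measure["population"] * 10_000
)
print(measure)
```

With these placeholder numbers, a county with 3 locations and 100,000 residents comes out at about 0.3 locations per 10,000 people. The same steps apply at the city level by grouping on a city column instead.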