pip install researchpy bs4
Iowa Culture App
Brief Description
Iowa Culture App provides detailed information on various arts, history and cultural destinations in Iowa.
Source
Source: Iowa Culture
Sample code for extracting the data from Iowa Culture is shown below. The complete code is in the HTML file Code for Web Scraping.
import json
import re
from bs4 import BeautifulSoup
import pandas as pd
import warnings
import researchpy as rp
import os
warnings.filterwarnings('ignore')
The extract_info() function below extracts the dataset from the Culture App website's response. It is specific to the HTML format that the website uses.
def extract_info(data):
    results = []
    # Each entry in data['p'] is one place; its 'win' field holds an HTML fragment
    for place in data['p']:
        info = {}
        info['title'] = place['title']
        info['latitude'] = place['lat']
        info['longitude'] = place['lon']
        info['image'] = f"/images/icons/markers/{place['icon']}"
        # Extracting distance
        distance_start = place['win'].find('<span class="distance">') + len('<span class="distance">')
        distance_end = place['win'].find('</span>', distance_start)
        info['distance'] = place['win'][distance_start:distance_end].strip()
        # Extracting full address
        address_start = place['win'].find('<strong>Address:</strong>') + len('<strong>Address:</strong>')
        address_end = place['win'].find('</p>', address_start)
        info['address'] = place['win'][address_start:address_end].strip().replace('<br />', ', ')
        # Extracting phone number (not every place has one)
        phone_start = place['win'].find('<i class="fa fa-phone fa-fw"></i>')
        if phone_start != -1:
            phone_start += len('<i class="fa fa-phone fa-fw"></i>')
            phone_end = place['win'].find('</p>', phone_start)
            info['phone'] = place['win'][phone_start:phone_end].strip()
        else:
            info['phone'] = None
        # Extracting website (not every place has one)
        website_start = place['win'].find('<i class="fa fa-globe fa-fw"></i><a href="')
        if website_start != -1:
            website_start += len('<i class="fa fa-globe fa-fw"></i><a href="')
            website_end = place['win'].find('"', website_start)
            info['website'] = place['win'][website_start:website_end].strip()
        else:
            info['website'] = None
        results.append(info)
    return results
extract_city_county(): extracts the city and county from the address. The website's HTML does not give either one explicitly; both appear inside a single address string.
extract_marker(): extracts the marker name from the image path in the HTML, which identifies the location's category.
# Function to get city and county from the address
def extract_city_county(address):
    parts = address.split(',')
    city = None
    county = None
    for i, part in enumerate(parts):
        # The city is the part just before the "IA <zip>" part
        # (i > 0 avoids wrapping around to the last part)
        if 'IA' in part and i > 0:
            city = parts[i-1].strip()
        if 'County' in part:
            county = part.strip().replace(' County', '')
    return city, county
# Function to extract image path as marker
def extract_marker(image_path):
    # Keep only the file name, then drop the extension
    base_name = os.path.basename(image_path)
    marker_name = os.path.splitext(base_name)[0]
    return marker_name
The chunk below tests extract_info(), together with the two helper functions, on a sample of three locations.
arts = {
"s": 1,
"p": [
{
"icon": "national-register-of-historic-places.png",
"title": "Christian Peterson Courtyard Sculptures and Dairy Industry Building",
"lat": "42.026731",
"lon": "-93.642860",
"win": "<div class=\"window-container scroll\">\n\t<div class=\"image-container\">\n\t\t<img src=\"/images/default/history.jpg\" class=\"scale\" />\n\t</div>\n\t<div class=\"window-content\">\n\t\t<div class=\"window-title\">\n\t\t\t<div class=\"pin\"><figure><img src=\"/images/icons/markers/national-register-of-historic-places.png\" alt=\"\" class=\"scale\" /></figure></div>\n\t\t\t<div class=\"rl-content\"><span class=\"title\">Christian Peterson Courtyard Sculptures and Dairy Industry Building</span><span class=\"city\"><a href=\"https://maps.google.com/maps?q=42.026731,-93.642860\" target=\"_blank\">Get Directions</a></span>\n\t\t\t<span class=\"distance\">0.1mi.</span></div>\n\t\t</div>\n\n\t\t<p>Christian Petersen was the only sculptor in the Public Works of Art Project in Iowa City in 1934, directed by Iowa Painter Grant Wood. His sculptures here are regarded as \"significant artistic statements on agriculture, technology, and higher education in mid-1930s America.\" The Christian Peterson Courtyard Sculptures and Dairy Industry Building was added to the National Register of Historic Places in 1987.</p>\t\t\t\t\t\t\t\t<p class=\"loc-address\">\n\t\t\t<strong>Address:</strong>\n\t\t\tUnion Dr.<br />Ames, IA 50010<br />Story County\t\t</p>\n\t\t\t\t\n\t\t</div>\n\t</div>"
},
{
"icon": "public-art.png",
"title": "Escalieta I",
"lat": "42.025650",
"lon": "-93.644474",
"win": "<div class=\"window-container scroll\">\n\t<div class=\"image-container\">\n\t\t<img src=\"/images/default/arts.jpg\" class=\"scale\" />\n\t</div>\n\t<div class=\"window-content\">\n\t\t<div class=\"window-title\">\n\t\t\t<div class=\"pin\"><figure><img src=\"/images/icons/markers/public-art.png\" alt=\"\" class=\"scale\" /></figure></div>\n\t\t\t<div class=\"rl-content\"><span class=\"title\">Escalieta I</span><span class=\"city\"><a href=\"https://maps.google.com/maps?q=42.025650,-93.644474\" target=\"_blank\">Get Directions</a></span>\n\t\t\t<span class=\"distance\">0.1mi.</span></div>\n\t\t</div>\n\n\t\t<p>Manuel Neri created this marble sculpture in 2004. The sculpture references the female form, a goddess figure, in a state of transformation.</p>\t\t\t\t\t\t\t\t<p class=\"loc-address\">\n\t\t\t<strong>Address:</strong>\n\t\t\tWallace Rd.<br />Gerdin Business Building, Iowa State University<br />Ames, IA 50011<br />Story County\t\t</p>\n\t\t<p class=\"iw-contact\"><i class=\"fa fa-phone fa-fw\"></i> 515.242.6195</p>\t\t<p class=\"iw-contact\"><i class=\"fa fa-globe fa-fw\"></i><a href=\"http://www.publicartarchive.org/work/escalieta-i\" target=\"_blank\">Website</a></p>\n\t\t</div>\n\t</div>"
},
{
"icon": "public-art.png",
"title": "History of Dairying",
"lat": "42.026875",
"lon": "-93.642983",
"win": "<div class=\"window-container scroll\">\n\t<div class=\"image-container\">\n\t\t<img src=\"https://www.iowacultureapp.com/images/story-artpublic-historyofdairying.jpg\" class=\"scale\" />\n\t</div>\n\t<div class=\"window-content\">\n\t\t<div class=\"window-title\">\n\t\t\t<div class=\"pin\"><figure><img src=\"/images/icons/markers/public-art.png\" alt=\"\" class=\"scale\" /></figure></div>\n\t\t\t<div class=\"rl-content\"><span class=\"title\">History of Dairying</span><span class=\"city\"><a href=\"https://maps.google.com/maps?q=42.026875,-93.642983\" target=\"_blank\">Get Directions</a></span>\n\t\t\t<span class=\"distance\">0.1mi.</span></div>\n\t\t</div>\n\n\t\t<p>Christian Peterson, an artist-in-residence at Iowa State University from 1934 to 1955, was commissioned to create this sculptural mural, depicting the history of \r\n the dairy industry in Iowa and America, as part of the Depression-era Public Works of Art Project. The bas relief, measuring 84 inches x 972 inches, took eight months to complete. This terra cotta work was the beginning of Peterson's 21 year career as campus sculptor-in-residence.</p>\t\t\t\t\t\t\t\t<p class=\"loc-address\">\n\t\t\t<strong>Address:</strong>\n\t\t\tFood Sciences Building Courtyard<br />Iowa State University<br />Ames, IA 50010<br />Story County\t\t</p>\n\t\t<p class=\"iw-contact\"><i class=\"fa fa-phone fa-fw\"></i> 515.294.3342</p>\t\t\n\t\t</div>\n\t</div>"
}
]
}
arts_data = extract_info(arts)
arts_data = pd.DataFrame(arts_data)
arts_data[['city', 'county']] = arts_data['address'].apply(lambda x: pd.Series(extract_city_county(x)))
arts_data['marker'] = arts_data['image'].apply(extract_marker)
arts_data
| | title | latitude | longitude | image | distance | address | phone | website | city | county | marker |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Christian Peterson Courtyard Sculptures and Da... | 42.026731 | -93.642860 | /images/icons/markers/national-register-of-his... | 0.1mi. | Union Dr., Ames, IA 50010, Story County | None | None | Ames | Story | national-register-of-historic-places |
| 1 | Escalieta I | 42.025650 | -93.644474 | /images/icons/markers/public-art.png | 0.1mi. | Wallace Rd., Gerdin Business Building, Iowa St... | 515.242.6195 | http://www.publicartarchive.org/work/escalieta-i | Ames | Story | public-art |
| 2 | History of Dairying | 42.026875 | -93.642983 | /images/icons/markers/public-art.png | 0.1mi. | Food Sciences Building Courtyard, Iowa State U... | 515.294.3342 | None | Ames | Story | public-art |
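
The string-offset parsing in extract_info() could also be written with BeautifulSoup, which the imports above already load. The sketch below is one possible alternative, not the code used to build the dataset. The helper name parse_window() and the shortened sample fragment are assumptions made for illustration; the fragment only mirrors the tags seen in the "win" field of the test data.

```python
from bs4 import BeautifulSoup


def parse_window(win_html):
    """Parse one 'win' HTML fragment into a dict (hypothetical helper)."""
    soup = BeautifulSoup(win_html, "html.parser")

    distance_tag = soup.find("span", class_="distance")
    address_tag = soup.find("p", class_="loc-address")
    phone_icon = soup.find("i", class_="fa-phone")
    globe_icon = soup.find("i", class_="fa-globe")

    address = None
    if address_tag is not None:
        # Remove the "Address:" label, then join the <br />-separated lines.
        label = address_tag.find("strong")
        if label is not None:
            label.extract()
        address = ", ".join(address_tag.stripped_strings)

    website = None
    if globe_icon is not None:
        # The link sits in the same <p> as the globe icon.
        link = globe_icon.parent.find("a")
        website = link.get("href") if link is not None else None

    return {
        "distance": distance_tag.get_text(strip=True) if distance_tag is not None else None,
        "address": address,
        # The phone number is the text of the <p> that holds the phone icon.
        "phone": phone_icon.parent.get_text(strip=True) if phone_icon is not None else None,
        "website": website,
    }


# Shortened, illustrative fragment modeled on the "Escalieta I" entry.
sample_win = (
    '<div class="window-content">'
    '<span class="distance">0.1mi.</span>'
    '<p class="loc-address">\n\t<strong>Address:</strong>\n\t'
    'Wallace Rd.<br />Gerdin Business Building, Iowa State University'
    '<br />Ames, IA 50011<br />Story County\t</p>'
    '<p class="iw-contact"><i class="fa fa-phone fa-fw"></i> 515.242.6195</p>'
    '<p class="iw-contact"><i class="fa fa-globe fa-fw"></i>'
    '<a href="http://www.publicartarchive.org/work/escalieta-i" target="_blank">Website</a></p>'
    '</div>'
)

parsed = parse_window(sample_win)
print(parsed)
```

Looking fields up by tag and class, rather than by character offset, means a missing element yields None instead of a mis-sliced string, at the cost of parsing each fragment.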
Cultural_Locations_per_10k
Measure Description
For our analysis, we focused on the total count of historic sites, museums, public art, monuments, and similar locations, collectively referred to as cultural locations. To assess the distribution of these cultural sites within each city or county, we computed the number of cultural locations per 10,000 people. This measure allows us to compare the availability and density of cultural resources across different areas.
Measure Calculation
After extracting the relevant data from the website, we tallied the total number of cultural locations for each city or county. To calculate our measure of interest (Cultural_Locations_per_10k), we used the following steps:
- Extracted city/county population data from the American Community Survey (ACS).
- Divided the total number of cultural locations in each city/county by the population of the corresponding city/county.
- Multiplied the results by 10,000 to express the number of cultural locations per 10,000 people.
Thus, the final measure Cultural_Locations_per_10k was obtained using this formula:
\[ \text{Cultural\_Locations\_per\_10k} = \frac{\text{Total Number of Cultural Locations}}{\text{Population}} \times 10{,}000 \]
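
The calculation steps above can be sketched in pandas as follows. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the location rows and the population figures are made-up placeholders (real values would come from the scraped data and the ACS), and the column names other than Cultural_Locations_per_10k are chosen for the example.

```python
import pandas as pd

# Illustrative rows only: one row per cultural location, with a county
# column like the one produced from the address.
locations = pd.DataFrame({
    "title": ["Site A", "Site B", "Site C", "Site D"],
    "county": ["Story", "Story", "Story", "Polk"],
})

# Hypothetical population figures standing in for ACS estimates.
population = pd.DataFrame({
    "county": ["Story", "Polk"],
    "population": [100_000, 500_000],
})

# Step 1: tally the cultural locations in each county.
counts = (
    locations.groupby("county")
    .size()
    .reset_index(name="cultural_locations")
)

# Step 2: attach the population, then divide and scale to per 10,000 people.
measure = counts.merge(population, on="county", how="left")
measure["Cultural_Locations_per_10k"] = (
    measure["cultural_locations"] / measure["population"] * 10_000
)
print(measure)
```

With these placeholder numbers, 3 locations among 100,000 people works out to 0.3 locations per 10,000 people. The same pattern applies at the city level by grouping on a city column instead.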