Iowa Culture App

Author

Solomon Eshun

Published

July 11, 2024

Brief Description

Iowa Culture App provides detailed information on various arts, history and cultural destinations in Iowa.

Source

Source: Iowa Culture

Sample code for extracting the data from Iowa Culture is shown below. The complete code is in the HTML file Code for Web Scraping.

# Install the dependencies first (run in a shell, not in Python):
#   pip install researchpy bs4 pandas
import json
import re
from bs4 import BeautifulSoup
import pandas as pd
import warnings
import researchpy as rp
import os
warnings.filterwarnings('ignore')

The extract_info() function below pulls each location's fields out of the data returned by the Culture App website. It relies on the specific HTML layout that the website uses for each location, so it would need updating if that layout changes.

def extract_info(data):
    results = []
    for place in data['p']:
        info = {}
        info['title'] = place['title']
        info['latitude'] = place['lat']
        info['longitude'] = place['lon']
        info['image'] = f"/images/icons/markers/{place['icon']}"

        # Extracting distance (None if the distance span is missing)
        distance_tag = '<span class="distance">'
        distance_start = place['win'].find(distance_tag)
        if distance_start != -1:
            distance_start += len(distance_tag)
            distance_end = place['win'].find('</span>', distance_start)
            info['distance'] = place['win'][distance_start:distance_end].strip()
        else:
            info['distance'] = None

        # Extracting full address (None if the address label is missing)
        address_tag = '<strong>Address:</strong>'
        address_start = place['win'].find(address_tag)
        if address_start != -1:
            address_start += len(address_tag)
            address_end = place['win'].find('</p>', address_start)
            info['address'] = place['win'][address_start:address_end].strip().replace('<br />', ', ')
        else:
            info['address'] = None

        # Extracting phone number
        phone_start = place['win'].find('<i class="fa fa-phone fa-fw"></i>')
        if phone_start != -1:
            phone_start += len('<i class="fa fa-phone fa-fw"></i>')
            phone_end = place['win'].find('</p>', phone_start)
            info['phone'] = place['win'][phone_start:phone_end].strip()
        else:
            info['phone'] = None

        # Extracting website
        website_start = place['win'].find('<i class="fa fa-globe fa-fw"></i><a href="')
        if website_start != -1:
            website_start += len('<i class="fa fa-globe fa-fw"></i><a href="')
            website_end = place['win'].find('"', website_start)
            info['website'] = place['win'][website_start:website_end].strip()
        else:
            info['website'] = None

        results.append(info)

    return results
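
The find-and-slice pattern that extract_info() repeats for each field can also be factored into one helper. The sketch below is illustrative only: text_between() is a hypothetical name that does not appear in the original code, and the HTML snippet is a shortened stand-in for the records shown in this document. The helper returns None when either marker is missing instead of slicing from an invalid index.

```python
def text_between(html, start_marker, end_marker):
    """Return the text between two markers, or None if either is missing.

    Hypothetical helper for illustration; not part of the original code.
    """
    start = html.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = html.find(end_marker, start)
    if end == -1:
        return None
    return html[start:end].strip()


# Shortened info-window HTML in the same layout as the website's records
sample_win = (
    '<span class="distance">0.1mi.</span>'
    '<p class="loc-address">\n\t<strong>Address:</strong>\n\t'
    'Union Dr.<br />Ames, IA 50010<br />Story County\t</p>'
)

distance = text_between(sample_win, '<span class="distance">', '</span>')
address = text_between(sample_win, '<strong>Address:</strong>', '</p>')
if address is not None:
    address = address.replace('<br />', ', ')
phone = text_between(sample_win, '<i class="fa fa-phone fa-fw"></i>', '</p>')

print(distance)  # 0.1mi.
print(address)   # Union Dr., Ames, IA 50010, Story County
print(phone)     # None
```

Because this sample has no phone entry, the phone lookup yields None, which matches how extract_info() treats missing phone numbers and websites.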

extract_city_county(): extracts the city and county from the address. The website's HTML does not give either one explicitly; both appear only as parts of a single address string.

extract_marker(): extracts the marker name from the image path in the HTML, which identifies the location's category.

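# Function to get city and county from the address
def extract_city_county(address):
    city = None
    county = None
    # Nothing to parse if the address is missing or empty
    if not address:
        return city, county
    parts = address.split(',')
    for i, part in enumerate(parts):
        # The city is the part just before the "IA <zip>" part;
        # i > 0 prevents parts[i-1] from wrapping around to the last part
        if i > 0 and 'IA' in part:
            city = parts[i-1].strip()
        # The county part looks like "Story County"
        if 'County' in part:
            county = part.strip().replace(' County', '')
    return city, county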


# Function to extract image path as marker
def extract_marker(image_path):
    base_name = os.path.basename(image_path)
    marker_name = os.path.splitext(base_name)[0]
    return marker_name
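
To see what these two helpers return, the sketch below applies the same logic to one address and one image path from the sample data. It is a standalone illustration, so it uses its own hypothetical names (split_city_county, marker_from_path). It assumes addresses follow the "street, city, IA zip, X County" layout seen in the sample.

```python
import os


def split_city_county(address):
    """Hypothetical stand-in mirroring extract_city_county() for this demo."""
    parts = [p.strip() for p in address.split(',')]
    city = None
    county = None
    for i, part in enumerate(parts):
        # The city is assumed to sit just before the "IA <zip>" part
        if i > 0 and part.startswith('IA '):
            city = parts[i - 1]
        # The county part is assumed to end with " County"
        if part.endswith(' County'):
            county = part[:-len(' County')]
    return city, county


def marker_from_path(image_path):
    """Hypothetical stand-in mirroring extract_marker(): file name without extension."""
    return os.path.splitext(os.path.basename(image_path))[0]


address = ('Wallace Rd., Gerdin Business Building, Iowa State University, '
           'Ames, IA 50011, Story County')
city, county = split_city_county(address)
marker = marker_from_path('/images/icons/markers/public-art.png')

print(city, county, marker)  # Ames Story public-art
```

Note that building names such as "Iowa State University" are skipped because only the part that begins with the state abbreviation marks where the city sits.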

The functionality of extract_info(), together with the two helper functions, is tested in the chunk below.

arts = {
    "s": 1,
    "p": [
        {
            "icon": "national-register-of-historic-places.png",
            "title": "Christian Peterson Courtyard Sculptures and Dairy Industry Building",
            "lat": "42.026731",
            "lon": "-93.642860",
            "win": "<div class=\"window-container scroll\">\n\t<div class=\"image-container\">\n\t\t<img src=\"/images/default/history.jpg\" class=\"scale\" />\n\t</div>\n\t<div class=\"window-content\">\n\t\t<div class=\"window-title\">\n\t\t\t<div class=\"pin\"><figure><img src=\"/images/icons/markers/national-register-of-historic-places.png\" alt=\"\" class=\"scale\" /></figure></div>\n\t\t\t<div class=\"rl-content\"><span class=\"title\">Christian Peterson Courtyard Sculptures and Dairy Industry Building</span><span class=\"city\"><a href=\"https://maps.google.com/maps?q=42.026731,-93.642860\" target=\"_blank\">Get Directions</a></span>\n\t\t\t<span class=\"distance\">0.1mi.</span></div>\n\t\t</div>\n\n\t\t<p>Christian Petersen was the only sculptor in the Public Works of Art Project in Iowa City in 1934, directed by Iowa Painter Grant Wood. His sculptures here are regarded as &quot;significant artistic statements on agriculture, technology, and higher education in mid-1930s America.&quot; The Christian Peterson Courtyard Sculptures and Dairy Industry Building was added to the National Register of Historic Places in 1987.</p>\t\t\t\t\t\t\t\t<p class=\"loc-address\">\n\t\t\t<strong>Address:</strong>\n\t\t\tUnion Dr.<br />Ames, IA 50010<br />Story County\t\t</p>\n\t\t\t\t\n\t\t</div>\n\t</div>"
        },
        {
            "icon": "public-art.png",
            "title": "Escalieta I",
            "lat": "42.025650",
            "lon": "-93.644474",
            "win": "<div class=\"window-container scroll\">\n\t<div class=\"image-container\">\n\t\t<img src=\"/images/default/arts.jpg\" class=\"scale\" />\n\t</div>\n\t<div class=\"window-content\">\n\t\t<div class=\"window-title\">\n\t\t\t<div class=\"pin\"><figure><img src=\"/images/icons/markers/public-art.png\" alt=\"\" class=\"scale\" /></figure></div>\n\t\t\t<div class=\"rl-content\"><span class=\"title\">Escalieta I</span><span class=\"city\"><a href=\"https://maps.google.com/maps?q=42.025650,-93.644474\" target=\"_blank\">Get Directions</a></span>\n\t\t\t<span class=\"distance\">0.1mi.</span></div>\n\t\t</div>\n\n\t\t<p>Manuel Neri created this marble sculpture in 2004. The sculpture references the female form, a goddess figure, in a state of transformation.</p>\t\t\t\t\t\t\t\t<p class=\"loc-address\">\n\t\t\t<strong>Address:</strong>\n\t\t\tWallace Rd.<br />Gerdin Business Building, Iowa State University<br />Ames, IA 50011<br />Story County\t\t</p>\n\t\t<p class=\"iw-contact\"><i class=\"fa fa-phone fa-fw\"></i> 515.242.6195</p>\t\t<p class=\"iw-contact\"><i class=\"fa fa-globe fa-fw\"></i><a href=\"http://www.publicartarchive.org/work/escalieta-i\" target=\"_blank\">Website</a></p>\n\t\t</div>\n\t</div>"
        },
        {
            "icon": "public-art.png",
            "title": "History of Dairying",
            "lat": "42.026875",
            "lon": "-93.642983",
            "win": "<div class=\"window-container scroll\">\n\t<div class=\"image-container\">\n\t\t<img src=\"https://www.iowacultureapp.com/images/story-artpublic-historyofdairying.jpg\" class=\"scale\" />\n\t</div>\n\t<div class=\"window-content\">\n\t\t<div class=\"window-title\">\n\t\t\t<div class=\"pin\"><figure><img src=\"/images/icons/markers/public-art.png\" alt=\"\" class=\"scale\" /></figure></div>\n\t\t\t<div class=\"rl-content\"><span class=\"title\">History of Dairying</span><span class=\"city\"><a href=\"https://maps.google.com/maps?q=42.026875,-93.642983\" target=\"_blank\">Get Directions</a></span>\n\t\t\t<span class=\"distance\">0.1mi.</span></div>\n\t\t</div>\n\n\t\t<p>Christian Peterson, an artist-in-residence at Iowa State University from 1934 to 1955, was commissioned to create this sculptural mural, depicting the history of \r\n the dairy industry in Iowa and America, as part of the Depression-era Public Works of Art Project.  The bas relief, measuring 84 inches x 972 inches, took eight months to complete.  This terra cotta work was the beginning of Peterson&#039;s 21 year career as campus sculptor-in-residence.</p>\t\t\t\t\t\t\t\t<p class=\"loc-address\">\n\t\t\t<strong>Address:</strong>\n\t\t\tFood Sciences Building Courtyard<br />Iowa State University<br />Ames, IA 50010<br />Story County\t\t</p>\n\t\t<p class=\"iw-contact\"><i class=\"fa fa-phone fa-fw\"></i> 515.294.3342</p>\t\t\n\t\t</div>\n\t</div>"
        }
    ]
}
arts_data = extract_info(arts)
arts_data = pd.DataFrame(arts_data)
arts_data[['city', 'county']] = arts_data['address'].apply(lambda x: pd.Series(extract_city_county(x)))
arts_data['marker'] = arts_data['image'].apply(extract_marker)
arts_data
  | title | latitude | longitude | image | distance | address | phone | website | city | county | marker
0 | Christian Peterson Courtyard Sculptures and Da... | 42.026731 | -93.642860 | /images/icons/markers/national-register-of-his... | 0.1mi. | Union Dr., Ames, IA 50010, Story County | None | None | Ames | Story | national-register-of-historic-places
1 | Escalieta I | 42.025650 | -93.644474 | /images/icons/markers/public-art.png | 0.1mi. | Wallace Rd., Gerdin Business Building, Iowa St... | 515.242.6195 | http://www.publicartarchive.org/work/escalieta-i | Ames | Story | public-art
2 | History of Dairying | 42.026875 | -93.642983 | /images/icons/markers/public-art.png | 0.1mi. | Food Sciences Building Courtyard, Iowa State U... | 515.294.3342 | None | Ames | Story | public-art

Cultural_Locations_per_10k

Measure Description

For our analysis, we focused on the total count of historic sites, museums, public art, monuments, and similar locations, collectively referred to as cultural locations. To assess the distribution of these cultural sites within each city or county, we computed the number of cultural locations per 10,000 people. This measure allows us to compare the availability and density of cultural resources across different areas.

Measure Calculation

After extracting the relevant data from the website, we tallied the total number of cultural locations for each city or county. To calculate our measure of interest (Cultural_Locations_per_10k), we used the following steps:

  • Extracted city/county population data from the American Community Survey (ACS).
  • Divided the total number of cultural locations in each city/county by the population of the corresponding city/county.
  • Multiplied the results by 10,000 to express the number of cultural locations per 10,000 people.

Thus, the final measure Cultural_Locations_per_10k was obtained using this formula:

\[ \text{Cultural\_Locations\_per\_10k} = \frac{\text{Total Number of Cultural Locations}}{\text{Population}} \times 10{,}000 \]
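
The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis: the function name locations_per_10k is hypothetical, and both the location list and the population figures are made-up values rather than ACS data.

```python
from collections import Counter

# County of each extracted location (the role played by the 'county' column).
# These values are made up for illustration.
location_counties = ['Story', 'Story', 'Story', 'Polk', 'Polk']

# Hypothetical populations; real values would come from the ACS.
population = {'Story': 100_000, 'Polk': 500_000}


def locations_per_10k(counties, population):
    """Return cultural locations per 10,000 people for each county."""
    counts = Counter(counties)  # tally locations per county
    return {
        county: counts[county] / people * 10_000
        for county, people in population.items()
        if people > 0  # skip areas with no recorded population
    }


rates = locations_per_10k(location_counties, population)
for county, rate in rates.items():
    print(f'{county}: {rate:.2f}')
# Story: 0.30
# Polk: 0.04
```

In this made-up example the smaller county has fewer people per cultural location, which is the kind of comparison the per-10,000 scaling is meant to support.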