Travel Iowa

Author

Solomon Eshun

Published

July 11, 2024

Brief Description

Travel Iowa is the official tourism website for the state of Iowa, offering a wealth of information for those looking to explore. It includes details on new events and attractions, birding spots, stargazing locations, nostalgic soda fountains, and more.

Source

Source: Travel Iowa

Sample code for extracting the event data from Travel Iowa is shown below. The complete code is located in the html file …… The process below is specific to the HTML structure we were able to retrieve from the Travel Iowa website, so it may need adjusting if the page layout changes.

pip install beautifulsoup4 pandas
html_doc = """

<div id="top">
    <div id="ctl00_phMainContent_div3" class="listControls">
        <div class="results">
        </div>
    </div>

    
    <div ID="pnlControls" style="display:none;">
        <div class="listControls" id="div4">

            <div class="pager">
                <ul>
                    <li>
                        <div id="pnlListingPager">
                            Page:
                            <span id="dpListing">
                                        <span class="activePage">1</span>
                                <a class="previous" data-pager-page="0">previous</a>
                                <a class="next" data-pager-page="0">next</a>
                            </span>
                            of 1
                        </div>
                    </li>
                </ul>
            </div>
        </div>
    </div>

    <div class="grid">
        <div class="item">
            <div class="item-text">
                <div class="item-date">Jul&nbsp;<span>4 -&nbsp;7</span></div>
                <h3 class="item-title">
                    <a href="/calendar/flea-market-under-the-bridge/1647514">Flea Market Under the Bridge</a>
                </h3>
                <div class="item-city">Marquette</div>
                <div class="item-venue">
                    <span>Venue:</span> <a href="/calendar/flea-market-under-the-bridge/1647514">Flea Market</a>
                </div>
            </div>
        </div>
        <div class="item">
            <div class="item-text">
                <div class="item-date">Jul&nbsp;<span>1 -&nbsp;</span>Oct&nbsp;<span>31</span></div>
                <h3 class="item-title">
                    <a href="/calendar/historic-hills-scenic-byway-bale-trail/1643930">Historic Hills Scenic Byway Bale Trail</a>
                </h3>
                <div class="item-city">Fairfield</div>
                <div class="item-venue">
                    <span>Venue:</span> <a href="/calendar/historic-hills-scenic-byway-bale-trail/1643930">Historic Hills Scenic Byway</a>
                </div>
            </div>
        </div>
        <div class="end-grid-action">
            <button class="button" id="btnShowMoreEvents">Show More Events</button>
        </div>
    </div>
    <div ID="pnlControls2" style="display:none;">
        <div class="listControls" id="div1">

            <div class="pager">
                <ul>
                    <li>
                        <div id="pnlListingPager">
                            Page:
                            <span id="dpListing">
                                        <span class="activePage">1</span>
                                <a class="previous" data-pager-page="0">previous</a>
                                <a class="next" data-pager-page="0">next</a>
                            </span>
                            of 1
                        </div>
                    </li>
                </ul>
            </div>
        </div>
    </div>



</div>


"""
import pandas as pd
from bs4 import BeautifulSoup
pd.set_option('display.max_colwidth', None)
import warnings
warnings.filterwarnings('ignore')

# Base URL
base_url = "https://www.traveliowa.com"

# Parse the HTML content
soup = BeautifulSoup(html_doc, 'html.parser')

# Find all divs with class 'item'
items = soup.find_all('div', class_='item')

# Initialize a list to store the extracted data
data = []

# Extract data from each item
for item in items:
    date = item.find('div', class_='item-date').get_text(strip=True)
    title_tag = item.find('h3', class_='item-title').find('a')
    title = title_tag.get_text(strip=True)
    link = base_url + title_tag['href']
    city = item.find('div', class_='item-city').get_text(strip=True)

    # Check if the venue div and its a tag exist
    venue_div = item.find('div', class_='item-venue')
    if venue_div and venue_div.find('a'):
        venue = venue_div.find('a').get_text(strip=True)
    else:
        venue = 'Venue information not available'

    # Append the extracted data to the list
    data.append({
        'Date': date,
        'Title': title,
        'Link': link,
        'City': city,
        'Venue': venue
    })

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(data)

df
   Date         Title                                   Link                                                                                City       Venue
0  Jul4 - 7     Flea Market Under the Bridge            https://www.traveliowa.com/calendar/flea-market-under-the-bridge/1647514            Marquette  Flea Market
1  Jul1 -Oct31  Historic Hills Scenic Byway Bale Trail  https://www.traveliowa.com/calendar/historic-hills-scenic-byway-bale-trail/1643930  Fairfield  Historic Hills Scenic Byway
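
The Date column comes out as "Jul4 - 7" and "Jul1 -Oct31" because get_text(strip=True) strips each text fragment and joins the fragments with no separator. If cleaner dates are wanted, one option is to join the fragments with a space and then collapse runs of whitespace, including the non-breaking spaces in the markup. The sketch below is a suggestion rather than part of the original pipeline; the helper name clean_date is hypothetical.

```python
import re
from bs4 import BeautifulSoup


def clean_date(date_div):
    """Return the text of an item-date div with single spaces between parts."""
    # Join the text fragments with a space instead of an empty string.
    text = date_div.get_text(" ", strip=True)
    # Collapse any whitespace run (\s also matches non-breaking spaces) into one space.
    return re.sub(r"\s+", " ", text).strip()


# The two item-date divs from the sample HTML above.
snippets = [
    '<div class="item-date">Jul&nbsp;<span>4 -&nbsp;7</span></div>',
    '<div class="item-date">Jul&nbsp;<span>1 -&nbsp;</span>Oct&nbsp;<span>31</span></div>',
]

cleaned = []
for snippet in snippets:
    div = BeautifulSoup(snippet, "html.parser").find("div", class_="item-date")
    cleaned.append(clean_date(div))

print(cleaned)
```

In the extraction loop, this would amount to replacing the date line with a call such as clean_date(item.find('div', class_='item-date')).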

Events_per_10k

Measure Description

In our analysis, we focused on the total number of past and upcoming events (2021-2025) listed for each city. We expressed this count as events per 10,000 people so that the distribution of events can be compared across cities of different sizes.

Measure Calculation

After extracting the relevant data from the website, we tallied the total number of events for each city. To calculate our measure of interest (Events_per_10k), we used the following steps:

  • Extracted city population data from the American Community Survey (ACS).
  • Divided the total number of events in each city by the population of the corresponding city.
  • Multiplied the results by 10,000 to express the number of events per 10,000 people.

Thus, the final measure Events_per_10k was obtained using this formula:

\[ \text{Events\_per\_10k} = \frac{\text{Total Number of Events}}{\text{Population}} \times 10{,}000 \]
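
The steps above can be sketched in pandas as follows. This is a minimal illustration, not the code used for the analysis. The event rows and population figures are made-up placeholders (they are not ACS values), and the column names Total_Events and Population are assumptions.

```python
import pandas as pd

# One row per scraped event; only the City column is needed for the tally.
# These rows are illustrative placeholders, not real scraped results.
events = pd.DataFrame({"City": ["Marquette", "Fairfield", "Fairfield"]})

# Hypothetical population figures for illustration only (not ACS values).
population = pd.DataFrame(
    {"City": ["Marquette", "Fairfield"], "Population": [500, 10000]}
)

# Step 1: tally the total number of events for each city.
counts = events.groupby("City").size().reset_index(name="Total_Events")

# Step 2: attach each city's population to its event count.
result = counts.merge(population, on="City", how="left")

# Step 3: divide events by population and scale to events per 10,000 people.
result["Events_per_10k"] = result["Total_Events"] / result["Population"] * 10_000

print(result)
```

A left merge keeps every city that has events, so a city missing from the population table shows up with a missing Events_per_10k value instead of silently disappearing.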