Methodology - Urban Reports

Data Sources

Urban Reports uses publicly available data from the City of Seattle's Open Data Portal. Our primary data source is the Customer Service Request (CSR) dataset, which contains records of all service requests submitted to the city since 2013.

Primary Dataset

Dataset	Customer Service Requests
Source	Seattle Open Data Portal
Records	2.2+ million service requests
Date Range	2013 – Present
Update Frequency	Daily (automated)

Data Fields Used

Service Request Number — Unique identifier for each request
Service Request Type — Category of issue (47+ types tracked)
Created Date — When the request was submitted
Status — Current state (Open, Closed, etc.)
Location — Street address or intersection
Latitude/Longitude — Geographic coordinates

Current Issues

When you view a zip code report, we display all open service requests for that zip code. To allow for fair comparison between zip codes of different sizes, we calculate the number of reports per acre based on each zip code's total area.

How It Works

All open reports for the selected zip code are retrieved
Reports are categorized by type (encampments, dumping/needles, litter)
Per-acre metrics are calculated by dividing the number of open reports by the zip code's area in acres
This normalization allows meaningful comparison across neighborhoods of different sizes
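
The per-acre normalization above can be sketched in a few lines of Python. This is a minimal illustration, not our production code: the function name reports_per_acre, the "category" key, and the sample area of 2,000 acres are all assumptions made for the example.

```python
from collections import Counter


def reports_per_acre(open_reports, area_acres):
    """Return open-report density per acre, overall and by category.

    `open_reports` is a list of dicts with a "category" key (a hypothetical
    record shape); `area_acres` is the zip code's total area in acres.
    """
    if area_acres <= 0:
        raise ValueError("area_acres must be positive")
    by_category = Counter(r["category"] for r in open_reports)
    density = {cat: count / area_acres for cat, count in by_category.items()}
    density["total"] = len(open_reports) / area_acres
    return density


# Hypothetical zip code: 4 open reports spread over 2,000 acres.
sample = [
    {"category": "encampments"},
    {"category": "encampments"},
    {"category": "litter"},
    {"category": "dumping/needles"},
]
density = reports_per_acre(sample, area_acres=2000.0)
```

A zip code with twice the area and the same number of open reports would get half the density, which is what makes the figures comparable across neighborhoods.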

Response Score

The Response Score shows the percentage of reports created in the past 28 days that are still open. A lower percentage means a larger share of recent reports has already been resolved, which suggests faster response times; a higher percentage suggests a backlog of unresolved issues.

Calculation Method

We look at all reports created within the past 28 days for each category
We count how many of those reports are still open (not yet resolved)
The percentage is calculated as: (open reports / total reports) × 100
Each category (encampments, dumping/needles, litter) is calculated separately

Example: If 15 encampment reports were created in the past 28 days and 6 are still open, the Response Score would be 40%.
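
The calculation above can be sketched as follows. This is a minimal sketch under stated assumptions: the record keys ("category", "created", "status"), the function name response_score, and the rule that only a status of exactly "Open" counts as unresolved are illustrative choices, not a description of our actual code.

```python
from datetime import datetime, timedelta


def response_score(reports, category, now, window_days=28):
    """Percentage of reports in `category` created in the window that are still open.

    Returns None when no reports were created in the window, since the
    percentage is undefined in that case.
    """
    cutoff = now - timedelta(days=window_days)
    recent = [
        r for r in reports
        if r["category"] == category and cutoff <= r["created"] <= now
    ]
    if not recent:
        return None
    # Assumption: a report is unresolved only when its status is "Open".
    still_open = sum(1 for r in recent if r["status"] == "Open")
    return 100.0 * still_open / len(recent)


# Reproduces the example above: 15 recent encampment reports, 6 still open.
now = datetime(2024, 6, 30)
sample = [
    {
        "category": "encampments",
        "created": now - timedelta(days=i + 1),
        "status": "Open" if i < 6 else "Closed",
    }
    for i in range(15)
]
score = response_score(sample, "encampments", now)
print(score)  # 40.0
```

Returning None for an empty window, rather than 0, avoids presenting a category with no recent reports as if it had a perfect score.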

Risk Forecast

Our risk prediction model estimates the probability of new encampment-related issues appearing in a given area over the next 28 days. The estimate comes from a machine learning model trained on historical report patterns.

Model Details

Algorithm	Gradient Boosting Classifier
Grid Size	200m × 200m cells
Prediction Window	28 days
Training Data	Historical encampment reports (2019–2024)
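
To illustrate how a report's latitude and longitude can be assigned to a 200m × 200m cell, here is a minimal sketch. It assumes a simple equirectangular approximation anchored at a hypothetical origin southwest of Seattle; the origin coordinates, the meters-per-degree constant, and the function name grid_cell are illustrative assumptions, and a production system may well use a projected coordinate system instead.

```python
import math

CELL_SIZE_M = 200.0
METERS_PER_DEG_LAT = 111_320.0  # approximate length of one degree of latitude


def grid_cell(lat, lon, origin_lat=47.4, origin_lon=-122.5):
    """Map a coordinate to a (row, col) index on a 200 m grid.

    Uses an equirectangular approximation around a hypothetical origin,
    which is reasonable over a city-sized area but is not an exact projection.
    """
    north_m = (lat - origin_lat) * METERS_PER_DEG_LAT
    east_m = (
        (lon - origin_lon)
        * METERS_PER_DEG_LAT
        * math.cos(math.radians(origin_lat))
    )
    return (math.floor(north_m / CELL_SIZE_M), math.floor(east_m / CELL_SIZE_M))


# Two points a few meters apart fall in the same cell;
# a point roughly 1 km further north falls in a different row.
a = grid_cell(47.6000, -122.33)
b = grid_cell(47.60005, -122.33)
c = grid_cell(47.6090, -122.33)
```

Reports that share a cell index can then be grouped together to build per-cell counts.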

Features Used

Historical Count — Number of past reports in this cell
Recent Activity — Reports in the last 30/60/90 days
Temporal Patterns — Day of week, month, seasonal trends
Geographic Context — Proximity to highways, parks, commercial areas
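
As a rough illustration of the count-based and temporal features listed above, the sketch below derives them from the report dates of a single grid cell. The function name cell_features and the feature key names are hypothetical, and the geographic context features are omitted because they depend on external map data.

```python
from datetime import date, timedelta


def cell_features(report_dates, as_of):
    """Build count and calendar features for one grid cell as of a given date.

    `report_dates` holds the creation dates of past reports in the cell.
    Reports dated after `as_of` are ignored so no future data leaks in.
    """
    past = [d for d in report_dates if d <= as_of]
    features = {"historical_count": len(past)}
    for days in (30, 60, 90):
        cutoff = as_of - timedelta(days=days)
        features[f"reports_last_{days}d"] = sum(1 for d in past if d > cutoff)
    features["day_of_week"] = as_of.weekday()  # Monday is 0, Sunday is 6
    features["month"] = as_of.month
    return features


# Hypothetical cell with reports 10, 46, and 81 days before the as-of date,
# one much older report, and one future report that is ignored.
as_of = date(2024, 6, 30)
dates = [
    date(2024, 6, 20),
    date(2024, 5, 15),
    date(2024, 4, 10),
    date(2023, 1, 1),
    date(2024, 7, 4),
]
features = cell_features(dates, as_of)
```

A feature dictionary like this, one per cell, is the kind of input a gradient boosting classifier could be trained on.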

Risk Categories

Low Risk (0–30%) — Below average probability of new issues
Medium Risk (30–60%) — Moderate probability, consistent with city average
High Risk (60%+) — Elevated probability based on historical patterns
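
The mapping from a predicted probability to a category can be sketched as below. The ranges above share their endpoints, so this example assumes that exactly 30% counts as Medium Risk and exactly 60% counts as High Risk; that boundary handling and the function name risk_category are assumptions made for illustration.

```python
def risk_category(probability):
    """Map a predicted probability in [0, 1] to a risk category label.

    Assumption: boundary values go to the higher category
    (0.30 is Medium Risk, 0.60 is High Risk).
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < 0.30:
        return "Low Risk"
    if probability < 0.60:
        return "Medium Risk"
    return "High Risk"


labels = [risk_category(p) for p in (0.10, 0.30, 0.45, 0.60, 0.95)]
```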

Limitations

We believe in transparency about what our data can and cannot tell you:

Reporting Bias — Areas with more engaged residents may have higher report counts regardless of actual issue prevalence
Data Freshness — While we update daily, there may be a 24–48 hour lag between city data updates and our display
Prediction Uncertainty — Risk forecasts are probabilistic estimates, not guarantees
Geographic Coverage — Currently limited to Seattle city limits
Issue Types — Detailed analysis currently focuses on encampment reports; other issue types will be added

Data Updates

Our data pipeline runs automatically every day; the risk model is the one exception and refreshes weekly:

8:00 AM UTC — Full dataset download from Seattle Open Data Portal
Processing — Data is cleaned, geocoded, and aggregated
Caching — Results are cached for fast retrieval
Model Refresh — Risk predictions are updated weekly