Data Sources
Urban Reports uses publicly available data from the City of Seattle's Open Data Portal. Our primary data source is the Customer Service Request (CSR) dataset, which contains records of all service requests submitted to the city since 2013.
Primary Dataset
| Dataset | Customer Service Requests |
|---|---|
| Source | Seattle Open Data Portal |
| Records | 2.2+ million service requests |
| Date Range | 2013 - Present |
| Update Frequency | Daily (automated) |
Data Fields Used
- Service Request Number - Unique identifier for each request
- Service Request Type - Category of issue (47+ types tracked)
- Created Date - When the request was submitted
- Status - Current state (Open, Closed, etc.)
- Location - Street address or intersection
- Latitude/Longitude - Geographic coordinates
Current Issues
When you view a zip code report, we display all open service requests for that zip code. To allow for fair comparison between zip codes of different sizes, we calculate the number of reports per acre based on each zip code's total area.
How It Works
- All open reports for the selected zip code are retrieved
- Reports are categorized by type (encampments, dumping/needles, litter)
- Per-acre metrics are calculated by dividing the number of open reports by the zip code's area in acres
- This normalization allows meaningful comparison across neighborhoods of different sizes
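The per-acre normalization described above can be sketched as follows. This is a minimal illustration rather than the production code: the `reports_per_acre` function, the `category` key, and the sample area are assumptions made for the example.

```python
from collections import Counter


def reports_per_acre(open_reports, area_acres):
    """Return open reports per acre for each category, plus a 'total' entry.

    `open_reports` is a list of dicts with a hypothetical 'category' key;
    `area_acres` is the zip code's total area in acres.
    """
    if area_acres <= 0:
        raise ValueError("area_acres must be positive")
    counts = Counter(report["category"] for report in open_reports)
    per_acre = {category: n / area_acres for category, n in counts.items()}
    per_acre["total"] = len(open_reports) / area_acres
    return per_acre


# Illustrative data only: three open reports in a 1,000-acre zip code.
sample = [
    {"category": "encampments"},
    {"category": "encampments"},
    {"category": "litter"},
]
density = reports_per_acre(sample, area_acres=1000.0)
```

Dividing by area means a large zip code with many reports and a small zip code with few reports can be compared on the same scale.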
Response Score
The Response Score shows the percentage of reports created in the past 28 days that are still open. A lower percentage indicates faster response times, while a higher percentage suggests a backlog of unresolved issues.
Calculation Method
- We look at all reports created within the past 28 days for each category
- We count how many of those reports are still open (not yet resolved)
- The percentage is calculated as: (open reports / total reports) × 100
- Each category (encampments, dumping/needles, litter) is calculated separately
Example: If 15 encampment reports were created in the past 28 days and 6 are still open, the Response Score would be 40%.
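The calculation can be expressed as a short sketch that reproduces the example above. The record layout (`created`, `status`) and the assumption that an open report has the status string `"Open"` are illustrative; the real dataset may use other values. The function is meant to be called once per category.

```python
from datetime import datetime, timedelta


def response_score(reports, now, window_days=28):
    """Percentage of reports created in the window that are still open.

    Returns None when no reports were created in the window, since the
    percentage is undefined in that case.
    """
    cutoff = now - timedelta(days=window_days)
    recent = [r for r in reports if cutoff <= r["created"] <= now]
    if not recent:
        return None
    still_open = sum(1 for r in recent if r["status"] == "Open")
    return 100.0 * still_open / len(recent)


# Illustrative data: 15 reports inside the window (6 open, 9 closed)
# and one older open report that must be ignored.
now = datetime(2024, 6, 1)
reports = (
    [{"created": now - timedelta(days=3), "status": "Open"}] * 6
    + [{"created": now - timedelta(days=10), "status": "Closed"}] * 9
    + [{"created": now - timedelta(days=40), "status": "Open"}]
)
score = response_score(reports, now)  # 6 of 15 still open
```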
Risk Forecast
Our risk prediction model estimates the probability of new encampment-related issues appearing in a given grid cell over the next 28 days. The estimate comes from a machine learning model trained on historical report patterns.
Model Details
| Algorithm | Gradient Boosting Classifier |
|---|---|
| Grid Size | 200m x 200m cells |
| Prediction Window | 28 days |
| Training Data | Historical encampment reports (2019-2024) |
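To illustrate how a report's coordinates might be assigned to a 200m x 200m cell, here is a rough sketch. It assumes an equirectangular approximation around a hypothetical origin south-west of Seattle; the actual projection and grid origin used by the model are not stated here, and a production system would more likely use a projected coordinate system.

```python
import math

CELL_SIZE_M = 200.0
EARTH_RADIUS_M = 6_371_000.0
# Hypothetical grid origin, chosen only so Seattle cells get positive indices.
ORIGIN_LAT, ORIGIN_LON = 47.0, -123.0


def grid_cell(lat, lon):
    """Map a latitude/longitude pair to a (column, row) cell index.

    Uses an equirectangular approximation: east-west distances are scaled
    by the cosine of the origin latitude, which is adequate over a
    city-sized area but not exact.
    """
    x = math.radians(lon - ORIGIN_LON) * EARTH_RADIUS_M * math.cos(math.radians(ORIGIN_LAT))
    y = math.radians(lat - ORIGIN_LAT) * EARTH_RADIUS_M
    return (math.floor(x / CELL_SIZE_M), math.floor(y / CELL_SIZE_M))


origin_cell = grid_cell(ORIGIN_LAT, ORIGIN_LON)
# 0.01 degrees of latitude is roughly 1,112 m, so this lands five rows up.
north_cell = grid_cell(47.01, -123.0)
```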
Features Used
- Historical Count - Number of past reports in this cell
- Recent Activity - Reports in the last 30/60/90 days
- Temporal Patterns - Day of week, month, seasonal trends
- Geographic Context - Proximity to highways, parks, commercial areas
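As a rough sketch of how the count-based and temporal features could be derived for one grid cell, consider the function below. The feature names and the `cell_features` helper are hypothetical, and the geographic context features are omitted because they depend on external map data.

```python
from datetime import date, timedelta


def cell_features(report_dates, as_of):
    """Build count and calendar features for one cell as of a given date."""
    # Only reports on or before the reference date may be used,
    # otherwise the features would leak future information.
    past = [d for d in report_dates if d <= as_of]
    features = {"historical_count": len(past)}
    for days in (30, 60, 90):
        cutoff = as_of - timedelta(days=days)
        features[f"reports_last_{days}d"] = sum(1 for d in past if d > cutoff)
    features["month"] = as_of.month
    features["day_of_week"] = as_of.weekday()  # Monday is 0
    return features


# Illustrative data: reports 5, 45, 75, and 200 days before the reference date.
as_of = date(2024, 6, 30)
dates = [as_of - timedelta(days=n) for n in (5, 45, 75, 200)]
features = cell_features(dates, as_of)
```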
Risk Categories
- Low Risk (0-30%) - Below average probability of new issues
- Medium Risk (30-60%) - Moderate probability, consistent with city average
- High Risk (60%+) - Elevated probability based on historical patterns
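The mapping from a predicted probability to a category could look like the sketch below. The listed ranges share their boundary values, so this example assumes that exactly 30% counts as Medium Risk and exactly 60% counts as High Risk; that tie-breaking rule is an assumption, not something stated above.

```python
def risk_category(probability):
    """Map a probability in [0, 1] to a risk label.

    Assumes boundary values fall into the higher band.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < 0.30:
        return "Low Risk"
    if probability < 0.60:
        return "Medium Risk"
    return "High Risk"


labels = [risk_category(p) for p in (0.10, 0.30, 0.59, 0.60)]
```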
Limitations
We believe in transparency about what our data can and cannot tell you:
- Reporting Bias - Areas with more engaged residents may have higher report counts regardless of actual issue prevalence
- Data Freshness - While we update daily, there may be a 24-48 hour lag between city data updates and our display
- Prediction Uncertainty - Risk forecasts are probabilistic estimates, not guarantees
- Geographic Coverage - Currently limited to Seattle city limits
- Issue Types - Detailed analysis currently focuses on encampment reports; other issue types will be added
Data Updates
Our data pipeline runs automatically every day, with the risk model refreshed on a weekly schedule:
- 8:00 AM UTC - Full dataset download from Seattle Open Data Portal
- Processing - Data is cleaned, geocoded, and aggregated
- Caching - Results are cached for fast retrieval
- Model Refresh - Risk predictions are updated weekly
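For readers who want to know when fresh data is likely to be available, the next scheduled download time can be worked out as in this sketch. The `next_daily_run` helper is hypothetical; only the 8:00 AM UTC start time comes from the schedule above.

```python
from datetime import datetime, time, timedelta, timezone


def next_daily_run(now, run_at=time(8, 0)):
    """Return the next run time in UTC for a job scheduled daily at `run_at`.

    `now` must be a timezone-aware datetime.
    """
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), run_at, tzinfo=timezone.utc)
    if candidate <= now_utc:
        # Today's run has already started, so the next one is tomorrow.
        candidate += timedelta(days=1)
    return candidate


before = next_daily_run(datetime(2024, 6, 1, 7, 0, tzinfo=timezone.utc))
after = next_daily_run(datetime(2024, 6, 1, 9, 0, tzinfo=timezone.utc))
```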