Berlin & Munich Analysis
Authored by Ana
What makes an Airbnb listing successful in Berlin and Munich, and what can hosts do to improve their performance?
To answer this, we explored the following sub-questions:
Setting up an Airbnb listing involves a lot of guesswork: what to write, which amenities to highlight, how to price your space. This dashboard provides a view into the characteristics, text patterns, and amenities that help hosts like you earn more revenue and better ratings.
How To Use This Dashboard
Success Factors
What are the characteristics that determine whether a given listing is successful?
Review volume drives success
Metric: estimated annual revenue
Regional Observation
What are the top characteristics for listings in different regions?
Top amenities vary by region
Metric: avg rating >= 4.8, ranked by avg revenue
Interesting Trend
How do top characteristics vary between Berlin neighborhoods?
Priorities range from family-oriented to entertainment
Metric: avg rating >= 4.8, ranked by avg revenue
Only includes amenities found in highly rated listings (4.8+), sorted by highest earning
Success Factors
What are the characteristics that determine whether a given listing is successful?
Implication of Result
New hosts should prioritize getting early bookings and reviews over perfecting their rating. Consider competitive pricing or promotions initially to build review volume, since review count showed the strongest association with estimated annual revenue in this analysis.
Regional Observation
What are the top amenity characteristics for listings in different regions?
Implication of Result
Amenity investment should be city-specific. Berlin guests value home comforts (laundry, family items), while Munich guests lean toward independence and comfort (private entrance, climate control). A one-size-fits-all approach across markets may underperform.
Interesting Trend
How do top characteristics vary between specific neighborhoods in Berlin?
Implication of Result
Hosts should research their specific neighborhood's top performers before investing in amenities. Entertainment amenities (HDTV) work in some areas while safety features (noise monitors) work in others. Copying a successful listing from a different neighborhood may not translate.
Wording Impact
How does the wording of a listing impact the number of visitors to that listing?
Detailed descriptions (100+ words) and descriptive neighborhood overviews are associated with higher estimated annual revenue.
Implication of Result
Hosts with short or empty descriptions are leaving revenue on the table. Investing time in writing a thorough description (at least 100 words) and a neighborhood overview is one of the lowest-cost, highest-impact improvements a host can make.
Satisfaction Trend
Is there a correlation between neighborhood detail and guest ratings?
Yes. Brief or minimal neighborhood overviews (0-10 words) correlate with lower average review scores than moderately detailed overviews.
Implication of Result
Guests who feel informed about the neighborhood before arrival tend to rate higher, likely because expectations are set accurately. Hosts should include local tips, transit info, and nearby attractions to reduce surprises and improve satisfaction.
Optimal Length
What is the ideal word count for listing titles?
Titles with 11-15 words have the highest average rating, at 4.9 or above.
Implication of Result
Very short titles (1-3 words) miss the opportunity to set expectations. Hosts should aim for descriptive titles that include the property type, key feature, and location. Going beyond 15 words shows diminishing returns.
Success on Airbnb in Berlin and Munich is about doing the small things well. It's not one big differentiator. It's the combination of building a review base over time, investing in practical home amenities (washer, dryer), writing detailed descriptions and neighborhood overviews, and tailoring your listing to your specific neighborhood. Hosts who put effort into every touchpoint of the guest experience consistently outperform those who don't.
Business Question
What makes an Airbnb listing successful in Berlin and Munich, and what can hosts do to improve their performance?
Findings
1. Reviews Drive Revenue
While ratings stay flat (~4.77) regardless of review count, revenue increases with more reviews.
2. Practical Amenities Win
Free washer and dryer appear in the highest-earning, highly rated listings.
3. Neighborhoods Have Different Needs
Top amenities vary by neighborhood. What works in one area doesn't necessarily work in another.
4. Detailed Descriptions Earn More
Listings with 100+ word descriptions have higher estimated annual revenue.
5. Neighborhood Detail Matters
Brief or minimal neighborhood overviews (0-10 words) correlate with lower average review scores.
6. Longer Titles Rate Higher
Titles with 11-15 words have the highest average rating, at 4.9 or above.
Conclusion
Success on Airbnb is the sum of many small efforts: building reviews, choosing practical amenities, writing detailed text, and tailoring to your neighborhood. There is no single shortcut.
Popular vs Profitable
The most popular amenities are not necessarily the ones associated with high revenue or high ratings.
Money Makers
Amenities like "Free Dryer" and "Free Washer" appear in highly rated, high revenue listings.
Guest Expectations
Wifi is the most popular amenity but doesn't appear in the revenue table, suggesting guests expect it as a baseline.
| # | Amenity | Rating | Listings | % |
|---|---|---|---|---|
Avg rating >= 4.8, ranked by avg revenue
| # | Amenity | Rating | Listings | Avg Rev |
|---|---|---|---|---|
Listings allow smoking: 613
Neighborhoods: 115
Avg rating vs non-smoking: ~Same
Guests who book smoking-allowed listings appear to do so intentionally. Ratings are nearly identical to non-smoking listings.
Only includes amenities found in highly rated listings (4.8+), sorted by highest earning
Avg Rating: the average rating of all listings that include this amenity.
Avg Revenue: the average annual revenue of all listings that include this amenity.
Defining "Success"
Challenge
Ranking amenities by pure rating produced unintuitive results (BBQ grill at #1 with only 5 listings). Ranking by pure revenue ignored quality entirely.
Solution
Filter for amenities with avg rating >= 4.8 first, then rank by avg revenue. This balances quality with business impact and keeps small-sample outliers from topping the list on rating alone.
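The filter-then-rank idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the dashboard's actual code: the sample rows, the function name `rank_amenities`, and the `min_rating` parameter are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample: (amenity, listing rating, listing annual revenue)
rows = [
    ("Free washer", 4.9, 30000.0),
    ("Free washer", 4.8, 26000.0),
    ("BBQ grill", 5.0, 9000.0),
    ("Wifi", 4.7, 20000.0),
    ("Wifi", 4.8, 22000.0),
]

def rank_amenities(rows, min_rating=4.8):
    """Keep amenities whose average rating is >= min_rating, sorted by average revenue."""
    ratings, revenues = defaultdict(list), defaultdict(list)
    for amenity, rating, revenue in rows:
        ratings[amenity].append(rating)
        revenues[amenity].append(revenue)
    kept = [
        (amenity, mean(ratings[amenity]), mean(revenues[amenity]))
        for amenity in ratings
        if mean(ratings[amenity]) >= min_rating
    ]
    # Highest average revenue first
    return sorted(kept, key=lambda item: item[2], reverse=True)

ranked = rank_amenities(rows)
for amenity, avg_rating, avg_revenue in ranked:
    print(f"{amenity}: rating {avg_rating:.2f}, revenue {avg_revenue:,.0f}")
```

In this sample the BBQ grill still passes the rating filter, but ranking by revenue places it below the washer, which is the behavior the solution relies on.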
Excel Limitations for Amenity Analysis
Challenge
Each listing's amenities are stored as a single text field containing a list (["Wifi","Kitchen","Dryer"]). Opening this in Excel shows the entire list crammed into one cell, making individual amenity analysis impossible with standard pivot tables.
Solution
Used Python to read each list, break it apart into separate items, and handle formatting inconsistencies like extra quotes. This produced 3,092 unique amenities across all listings. Simpler analyses like review count binning worked well in Excel, but amenity analysis required Python (pandas).
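A minimal sketch of the parsing step, assuming each amenities cell is a JSON-style list as in the example above. It uses only the standard library rather than pandas, and the function name `parse_amenities` and the sample cells are hypothetical.

```python
import json
from collections import Counter

def parse_amenities(cell):
    """Turn one amenities cell such as '["Wifi","Kitchen"]' into a list of names."""
    if not cell:
        return []
    try:
        items = json.loads(cell)
    except json.JSONDecodeError:
        # Fallback for malformed cells: strip brackets and split on commas.
        items = cell.strip("[]").split(",")
    # Remove stray quotes and whitespace left over from inconsistent formatting.
    cleaned = [str(item).strip().strip('"').strip("'").strip() for item in items]
    return [name for name in cleaned if name]

# Hypothetical cells: a clean list, one with extra quotes, and an empty cell
cells = ['["Wifi","Kitchen","Dryer"]', '["Wifi", "\\"Dryer\\""]', ""]
counts = Counter(name for cell in cells for name in parse_amenities(cell))
print(counts.most_common())
```

Counting the parsed names this way is also how a figure like the number of unique amenities could be obtained (`len(counts)`).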
Unrated Listings Stored as Zero
Challenge
3,465 listings had no rating data. These were initially stored as 0, which dragged down all average rating calculations across the dashboard.
Solution
Stored missing ratings as -1 and excluded them from all rating computations. This required regenerating the entire embedded dataset.
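The sentinel approach can be illustrated as follows. This is a sketch under assumptions: the helper names `encode_rating` and `average_rating` and the sample values are hypothetical, and only the -1 convention comes from the text above.

```python
from statistics import mean

MISSING = -1  # sentinel for "no rating", as described above

def encode_rating(raw):
    """Map an empty or unparseable rating to the sentinel instead of 0."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        return MISSING

def average_rating(ratings):
    """Average only real ratings; return None when there are none."""
    rated = [r for r in ratings if r != MISSING]
    return mean(rated) if rated else None

raw_values = ["4.8", "", None, "5.0"]  # hypothetical review_scores_rating cells
encoded = [encode_rating(v) for v in raw_values]
print(encoded, average_rating(encoded))
```

Had the two missing values been stored as 0, the same four listings would average 2.45 instead of roughly 4.9, which is the distortion the fix removes.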
Bathroom Data Split Across Two Columns
Challenge
Bathroom information was split between "bathrooms" (66.3% fill rate, numeric) and "bathrooms_text" (99.9% fill rate, descriptive like "1.5 baths"). Neither column alone covered all listings.
Solution
Used bathrooms_text as the primary source and bathrooms as a fallback, recovering 10 additional listings that would otherwise have no bathroom data.
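A rough sketch of the primary/fallback logic, assuming bathrooms_text values look like "1.5 baths". The function name `bathroom_count` is hypothetical, and treating "half" wording as 0.5 is an assumption, not something stated in the source.

```python
import re

def bathroom_count(bathrooms_text, bathrooms):
    """Prefer the descriptive column; fall back to the numeric one."""
    if bathrooms_text:
        text = bathrooms_text.lower()
        match = re.search(r"\d+(?:\.\d+)?", text)
        if match:
            return float(match.group())
        if "half" in text:
            # Assumption: wording like "Half-bath" means 0.5 bathrooms.
            return 0.5
    if bathrooms is not None:
        return float(bathrooms)
    return None  # no bathroom data in either column

print(bathroom_count("1.5 baths", None))  # value from bathrooms_text
print(bathroom_count(None, 2))            # value recovered from bathrooms
print(bathroom_count(None, None))         # nothing available
```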
Mixed Language Content
Challenge
The data contains a mix of English and German text. Amenities are almost entirely English (only 2.7% German), but host-written text has much more: 22.5% of titles, 19.3% of descriptions, and 33.5% of neighborhood overviews contain German characters.
Solution
Word count analysis captures all text equally regardless of language, so no special handling was needed for the current dashboard. Any future keyword or sentiment analysis would need multilingual support.
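One plausible way to compute "contains German characters" shares is sketched below. It assumes that German characters means umlauts and ß, and that empty texts are left out of the denominator; both are assumptions, and the function names are hypothetical.

```python
GERMAN_CHARS = set("äöüÄÖÜß")

def has_german_chars(text):
    """True if the text contains at least one umlaut or ß."""
    return any(ch in GERMAN_CHARS for ch in (text or ""))

def share_with_german(texts):
    """Share of non-empty texts containing at least one German-specific character."""
    filled = [t for t in texts if t]
    if not filled:
        return 0.0
    return sum(has_german_chars(t) for t in filled) / len(filled)

# Hypothetical titles; empty and missing values are ignored
titles = ["Gemütliche Wohnung", "Cozy flat", "", None]
print(f"{share_with_german(titles):.1%}")
```

A character check like this misses German text written without umlauts, so it should be read as a lower bound rather than a language classifier.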
React to Vanilla JS Rewrite
Challenge
The original codebase was a React/Vite app with duplicated code (the entire component was copy-pasted twice), causing build failures.
Solution
Rewrote the entire dashboard as a single HTML file with vanilla JS, eliminating the need for any build step or npm dependencies.
44 MB of CSV Data for a Static Site
Challenge
The raw CSV files totaled 44 MB, far too large to embed directly in a static HTML page.
Solution
Built a Python preprocessing pipeline that extracted only the needed fields, replaced text with word counts, and used dictionary encoding for amenities (3,092 unique) and neighborhoods (163 unique), compressing the data to 2.5 MB (0.7 MB gzipped).
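Dictionary encoding replaces each repeated string with a small integer and stores the strings once in a lookup table. The sketch below shows the general technique; the function name `dictionary_encode` and the exact output layout are assumptions, not the pipeline's actual format.

```python
def dictionary_encode(listings):
    """Replace repeated strings with integer IDs plus one shared lookup table."""
    vocab = {}    # amenity name -> integer id, assigned in order of first appearance
    encoded = []
    for amenities in listings:
        encoded.append([vocab.setdefault(name, len(vocab)) for name in amenities])
    lookup = list(vocab)  # id -> name (dicts preserve insertion order)
    return lookup, encoded

# Hypothetical amenity lists for two listings
listings = [["Wifi", "Kitchen"], ["Wifi", "Dryer"]]
lookup, encoded = dictionary_encode(listings)
print(lookup)   # shared table of names
print(encoded)  # per-listing lists of IDs

# Decoding restores the original names
decoded = [[lookup[i] for i in ids] for ids in encoded]
```

The saving comes from storing a long name such as an amenity description once, however many listings reference it.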
GitHub Pages Jekyll Failures
Challenge
GitHub Pages runs Jekyll by default, which choked on the 2.5 MB data.js file. Deployments failed silently.
Solution
Added a .nojekyll file to bypass Jekyll processing entirely. Also disconnected a conflicting Vercel integration.
Analysis by Ana
| # | Analysis | Software | Method |
|---|---|---|---|
| 1 | Success Factors | Excel | Grouped listings by number of reviews and calculated the average revenue for each group |
| 2 | Regional Observation | Python | Split amenity lists apart, then found which amenities appear in the highest-earning listings per city |
| 3 | Interesting Trend | Python | Same as above, but compared top amenities across different neighborhoods |
| 4 | Wording Impact | Python | Counted words in each description, grouped by length, and compared average revenue across groups |
| 5 | Satisfaction Trend | Python | Checked whether listings with longer neighborhood descriptions tend to have higher ratings |
| 6 | Optimal Length | Excel | Grouped listing titles by word count and found which length gets the best ratings |
HTML Dashboard File
Access: Required downloading a file and opening it locally
Sharing: Difficult to share with others (must send the file)
Inflexible: Unable to update the data source
Online Dashboard
Access: Accessible via URL, no download needed
Sharing: Shareable link that works on any device
Flexible: Source data can be updated with a simple file upload, and visuals are designed to answer specific business questions
The following files were available from Inside Airbnb for each city. This dashboard uses the listings files (highlighted).
Source: insideairbnb.com
| File | Berlin | Munich | Description |
|---|---|---|---|
| listings.csv | ✓ | ✓ | Detailed listing data with 79 columns (used in this dashboard) |
| calendar.csv | ✓ | ✓ | Daily availability and pricing for each listing |
| reviews.csv | ✓ | ✓ | Individual guest reviews with text and date |
| listings_summary.csv | ✓ | ✓ | Condensed listing data with fewer columns |
| reviews_summary.csv | ✓ | ✓ | Condensed review data with date and listing ID |
| neighbourhoods.csv | ✓ | ✓ | List of neighbourhood names and groupings |
This dashboard analyzes Airbnb listing data from two German cities, sourced from Inside Airbnb CSV exports. Both files share an identical schema of 79 columns.
| | Berlin | Munich |
|---|---|---|
| File | zip_berlin_listings.csv | zip_munich_listings.csv |
| Rows | 14,274 | 8,274 |
| Columns | 79 | 79 |
| Schema Match | Identical (all 79 columns shared) | |
Preliminary Analysis by Ana
| # | Column Name | Data Available | Used in Dashboard |
|---|---|---|---|
Data Cleanup by Ana
Ordered from easiest to most difficult analytical technique.
| # | Analysis | Software | Difficulty | Method |
|---|---|---|---|---|
| 1 | Success Factors | Excel | Easy | Pivot table with review count bins |
| 2 | Optimal Length | Excel | Easy | Binned title word counts, computed avg rating per bin |
| 3 | Satisfaction Trend | Python | Medium | Correlation between word count and rating |
| 4 | Wording Impact | Python | Medium | Word count binning + linear regression |
| 5 | Regional Observation | Python | Hard | Parsed amenities from single cell, grouped by city, ranked by revenue |
| 6 | Interesting Trend | Python | Hard | Same amenity parsing, filtered per neighborhood, compared across areas |
| # | File | Analysis | City | Download |
|---|---|---|---|---|
| 1 | success_factors.xlsx | Success Factors | Berlin & Munich | Download |
| 2 | optimal_length.xlsx | Optimal Length | Berlin & Munich | Download |
| 3 | satisfaction_trend.ipynb | Satisfaction Trend | Berlin & Munich | Download |
| 4 | wording_impact.ipynb | Wording Impact | Berlin & Munich | Download |
| 5 | regional_observation.ipynb | Regional Observation | Berlin & Munich | Download |
| 6 | interesting_trend.ipynb | Interesting Trend | Berlin & Munich | Download |
Bathroom Data
Two columns contain bathroom information: bathrooms_text (99.9% fill rate) is the primary source, with bathrooms (66.3% fill rate) used as a fallback. Of 22,548 total listings, 22 are missing bathrooms_text. Of those, 10 have data in the bathrooms column, recovering those listings. Only 12 listings (0.05%) have no bathroom data at all.
Ratings
The review_scores_rating column has a 76.8% fill rate. Listings without ratings (typically new listings with no reviews) are excluded from all rating calculations in the dashboard to avoid skewing averages.
Revenue
The estimated_revenue_l365d column has a 65.9% fill rate. Only listings with a valid price (and therefore revenue estimate) are included in the dashboard, resulting in 14,867 listings used for analysis.
1. Success Factors
What are the characteristics that determine whether a given listing is successful?
Analytical Method
Software: Excel (Pivot Table)
Setup
Create a helper column to bin review counts into ranges: 0-10, 11-25, 26-50, 51-100, 101-200, 201+. Use this as the Row field in the pivot table.
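For readers who prefer code to a pivot table, the same binning can be sketched in Python. This is an equivalent illustration rather than the Excel workbook itself: the sample data and function names are hypothetical, and the open-ended top bin is treated as counts above 200 so the ranges do not overlap.

```python
from collections import defaultdict
from statistics import mean

# Inclusive upper bound and label for each closed bin
BINS = [(10, "0-10"), (25, "11-25"), (50, "26-50"), (100, "51-100"), (200, "101-200")]
LAST_BIN = "201+"  # counts above 200

def review_bin(count):
    """Return the label of the bin a review count falls into."""
    for upper, label in BINS:
        if count <= upper:
            return label
    return LAST_BIN

def avg_revenue_by_bin(listings):
    """Average revenue per review-count bin, like a pivot table with bins as rows."""
    groups = defaultdict(list)
    for reviews, revenue in listings:
        groups[review_bin(reviews)].append(revenue)
    return {label: mean(values) for label, values in groups.items()}

# Hypothetical (number_of_reviews, estimated annual revenue) pairs
sample = [(3, 5000.0), (8, 7000.0), (40, 15000.0), (250, 30000.0)]
result = avg_revenue_by_bin(sample)
print(result)
```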
Steps
2. Regional Observation
What are the top amenity characteristics for listings in different regions?
Analytical Method
Software: Python (pandas)
Setup
Each listing stores amenities as a single text field containing a list (["Wifi","Kitchen","Dryer"]). Parse each listing's amenity list and split it into individual items so each amenity can be analyzed separately.
Steps
3. Interesting Trend
How do top characteristics vary between specific neighborhoods in Berlin?
Analytical Method
Software: Python (pandas) or Tableau
Setup
Filter the dataset to a specific neighborhood, then parse each listing's amenity list into individual items for per-neighborhood analysis.
Steps (Python)
Alternative (Tableau)
Create a cross-tab with neighborhood and amenity as dimensions, avg rating and avg revenue as measures, filtered by rating >= 4.8. Compare neighborhoods side by side.
4. Wording Impact
How does the wording of a listing impact the number of visitors to that listing?
Analytical Method
Software: Python (pandas, scipy/statsmodels)
Setup
Count the number of words in each listing's description and neighborhood overview to create word count columns for analysis.
Steps
5. Satisfaction Trend
Is there a correlation between neighborhood detail and guest ratings?
Analytical Method
Software: Python (scipy/statsmodels) or Excel
Setup
Count words in each listing's neighborhood_overview to create a word count column. Pair with review_scores_rating for each listing.
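The setup above can be sketched with a whitespace word count and a hand-written Pearson correlation. The sample overviews and ratings are hypothetical, and the source does not say which correlation measure was used, so Pearson is an assumption here.

```python
from math import sqrt

def word_count(text):
    """Count whitespace-separated words; empty or missing text counts as 0."""
    return len(text.split()) if text else 0

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical neighborhood_overview texts paired with review_scores_rating
overviews = [
    "",
    "Quiet street near the U-Bahn",
    "Quiet street near the U-Bahn with cafes bakeries and a weekly market",
]
ratings = [4.5, 4.7, 4.9]

counts = [word_count(text) for text in overviews]
r = pearson(counts, ratings)
print(counts, round(r, 3))
```

Only listings that actually have a rating should be paired this way. The function also divides by zero if either column is constant, so a real analysis would guard against that.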
Steps
6. Optimal Length
What is the ideal word count for listing titles?
Analytical Method
Software: Excel or Python
Setup
Count the number of words in each listing title to create a word count column. Define bin ranges: 1-3 (Short), 4-6 (Standard), 7-10 (Long), 11-15 (Descriptive), 16+ (Max Detail).
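A small Python sketch of the title binning, treating the open-ended bin as 16 or more words so that the ranges do not overlap. The sample titles, ratings, and function names are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict
from statistics import mean

UPPER_BOUNDS = [3, 6, 10, 15]  # inclusive upper edge of each closed bin
LABELS = [
    "1-3 (Short)", "4-6 (Standard)", "7-10 (Long)",
    "11-15 (Descriptive)", "16+ (Max Detail)",
]

def title_bin(title):
    """Map a title to its word-count bin label."""
    words = len(title.split())
    return LABELS[bisect_left(UPPER_BOUNDS, words)]

def best_bin(listings):
    """Return the bin with the highest average rating, plus all bin averages."""
    groups = defaultdict(list)
    for title, rating in listings:
        groups[title_bin(title)].append(rating)
    averages = {label: mean(values) for label, values in groups.items()}
    return max(averages, key=averages.get), averages

# Hypothetical (title, rating) pairs
sample = [
    ("Cozy flat", 4.6),
    ("Bright studio near Tiergarten", 4.8),
    ("Sunny two room apartment with balcony close to the English Garden in Schwabing", 4.9),
]
top, averages = best_bin(sample)
print(top, averages)
```

An empty title would land in the first bin here, so a real analysis may want to drop empty titles before binning.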
Steps
Alternative (Machine Learning)
A random forest or gradient boosting model could use title word count as a feature to predict ratings. Feature importance scores would confirm whether title length is a meaningful predictor.
| # | Analysis | Software | Method |
|---|---|---|---|
| 1 | Success Factors | Excel | Binned review counts into ranges and computed average revenue per bin using a pivot table |
| 2 | Regional Observation | Python | Parsed amenities stored in a single cell into individual items, then grouped by city to rank by revenue |
| 3 | Interesting Trend | Python | Same amenity parsing as above, filtered per neighborhood to compare top amenities across areas |
| 4 | Wording Impact | Python | Counted words in descriptions, binned into ranges, and computed average revenue per bin |
| 5 | Satisfaction Trend | Python | Correlated neighborhood overview word count with average rating across bins |
| 6 | Optimal Length | Excel | Binned title word counts into ranges and identified the bin with the highest average rating |
Wording Impact
How does the wording of a listing impact the number of visitors to that listing?
Ratings flat, but revenue rises with detail
Satisfaction Trend
Is there a correlation between neighborhood detail & performance?
Detail drives revenue, not ratings
Optimal Length
What is the ideal word count for listing titles?
Keep titles concise
Title: The name of the Airbnb listing as it appears in search results.
Though German is used across listings, the majority of text is written in English.
Titles with German: 22.5% (5,083 of 22,548)
Descriptions with German: 19.3% (4,177 of 21,659)
Neighborhood Overviews with German: 33.5% (2,977 of 8,886)
Words are ranked by how frequently they appear across all listings in the selected text field.
Only the top 50 most frequent words are shown.
| # | Word | Count |
|---|---|---|