Airbnb Dashboard

Business Question

Authored by Ana

What makes an Airbnb listing successful in Berlin and Munich, and what can hosts do to improve their performance?

To answer this, we explored the following sub-questions:

What drives revenue? Is it ratings, reviews, amenities, or something else?
Does listing quality matter? Do better descriptions and titles translate to higher earnings?
What should hosts invest in? Which amenities are associated with top-performing listings?

Insights for Hosts

Success Factors

What are the characteristics that determine whether a given listing is successful?

Review volume drives success

Listings with more reviews earn higher annual revenue
Ratings stay consistent (~4.77) regardless of review count, but revenue increases

Metric: estimated annual revenue

Regional Observation

What are the top characteristics for listings in different regions?

Top amenities vary by region

Berlin top 3: Free washer, Children's dinnerware, Free dryer
Munich top 3: Free washer, Private entrance, Portable fans

Metric: avg rating >= 4.8, ranked by avg revenue

Interesting Trend

How do top characteristics vary between Berlin neighborhoods?

Priorities range from family-oriented to entertainment

Tempelhofer Vorstadt #1: "55 inch HDTV"
Südliche Luisenstadt #1: "Noise decibel monitors"

Metric: avg rating >= 4.8, ranked by avg revenue

Revenue & Rating by Number of Reviews

Top 5 Amenities by City

Only includes amenities found in highly rated listings (4.8+), sorted by highest earning

Top 5 Amenities by Neighborhood

Only includes amenities found in highly rated listings (4.8+), sorted by highest earning

Questions About Characteristics

Authored by Ana

Success Factors

What are the characteristics that determine whether a given listing is successful?

Success is driven by review volume: listings with more reviews earn higher annual revenue
Ratings stay consistent (~4.77) regardless of review count, but revenue increases
Metric: estimated annual revenue

Implication of Result

New hosts should prioritize getting early bookings and reviews over perfecting their rating. Consider competitive pricing or promotions initially to build review volume, as this is the strongest predictor of long-term revenue growth.

Regional Observation

What are the top amenity characteristics for listings in different regions?

Berlin top 3: Free washer, Children's dinnerware, Free dryer
Munich top 3: Free washer, Private entrance, Portable fans
Metric: avg rating >= 4.8, ranked by avg revenue

Implication of Result

Amenity investment should be city-specific. Berlin guests value home comforts (laundry, family items), while Munich guests lean toward independence and comfort (private entrance, climate control). A one-size-fits-all approach across markets may underperform.

Interesting Trend

How do top characteristics vary between specific neighborhoods in Berlin?

Tempelhofer Vorstadt #1: "55 inch HDTV"
Südliche Luisenstadt #1: "Noise decibel monitors"
Top characteristics vary by neighborhood
Metric: avg rating >= 4.8, ranked by avg revenue

Implication of Result

Hosts should research their specific neighborhood's top performers before investing in amenities. Entertainment amenities (HDTV) work in some areas while safety features (noise monitors) work in others. Copying a successful listing from a different neighborhood may not translate.

Text Analysis Questions

Authored by Ana

Wording Impact

How does the wording of a listing impact the number of visitors to that listing?

Detailed descriptions (100+ words) and descriptive neighborhood overviews are associated with higher estimated annual revenue.

Implication of Result

Hosts with short or empty descriptions are leaving revenue on the table. Investing time in writing a thorough description (at least 100 words) and a neighborhood overview is one of the lowest-cost, highest-impact improvements a host can make.

Satisfaction Trend

Is there a correlation between neighborhood detail and guest ratings?

Yes, brief or minimal neighborhood overviews (0-10 words) correlate with lower average review scores compared to moderate detail.

Implication of Result

Guests who feel informed about the neighborhood before arrival tend to rate higher, likely because expectations are set accurately. Hosts should include local tips, transit info, and nearby attractions to reduce surprises and improve satisfaction.

Optimal Length

What is the ideal word count for listing titles?

Titles with 11-15 words achieve peak satisfaction ratings with an average rating of 4.9 or higher.

Implication of Result

Very short titles (1-3 words) miss the opportunity to set expectations. Hosts should aim for descriptive titles that include the property type, key feature, and location. Going beyond 15 words shows diminishing returns.

Conclusion

Authored by Ana

Success on Airbnb in Berlin and Munich is about doing the small things well. It's not one big differentiator. It's the combination of building a review base over time, investing in practical home amenities (washer, dryer), writing detailed descriptions and neighborhood overviews, and tailoring your listing to your specific neighborhood. Hosts who put effort into every touchpoint of the guest experience consistently outperform those who don't.

Summary

Authored by Ana

Business Question

What makes an Airbnb listing successful in Berlin and Munich, and what can hosts do to improve their performance?

Findings

1. Reviews Drive Revenue

While ratings stay flat (~4.77) regardless of review count, revenue increases with more reviews.

2. Practical Amenities Win

Free washer and dryer appear in the highest-earning, highly-rated listings

3. Neighborhoods Have Different Needs

Top amenities vary by neighborhood. What works in one area doesn't necessarily work in another

4. Detailed Descriptions Earn More

Listings with 100+ word descriptions earn higher estimated annual revenue

5. Neighborhood Detail Matters

Brief or minimal neighborhood overviews (0-10 words) correlate with lower average review scores

6. Longer Titles Rate Higher

Titles with 11-15 words achieve peak satisfaction ratings with an average rating of 4.9 or higher

Conclusion

Success on Airbnb is the sum of many small efforts: building reviews, choosing practical amenities, writing detailed text, and tailoring to your neighborhood. There is no single shortcut.

Insights for Hosts

Popular vs Profitable

Most popular amenities are not necessarily the ones associated with high revenue or high ratings.

Money Makers

Amenities like "Free Dryer" and "Free Washer" appear in highly rated, high revenue listings.

Guest Expectations

Wifi is the most popular amenity but doesn't appear in the revenue table, suggesting guests expect it as a baseline.

Amenities By Popularity

#	Amenity	Rating	Listings	%

Amenities By Revenue (Highly Rated)

Avg rating >= 4.8, ranked by avg revenue

#	Amenity	Rating	Listings	Avg Rev

Interesting Findings About Amenities

Smoking Allowed

Listings allow smoking: 613
Neighborhoods: 115
Avg rating vs non-smoking: ~Same

Guests who book smoking-allowed listings appear to do so intentionally. Ratings are nearly identical to non-smoking listings.

Top Amenities: Berlin vs. Munich

Only includes amenities found in highly rated listings (4.8+), sorted by highest earning

Avg Rating: the average rating of all listings that include this amenity.

Avg Revenue: the average annual revenue of all listings that include this amenity.

Key Challenges

Authored by Ana

Defining "Success"

Challenge

Ranking amenities by pure rating produced unintuitive results (BBQ grill at #1 with only 5 listings).
Ranking by pure revenue ignored quality entirely.

Solution

Filter for amenities with avg rating >= 4.8 first, then rank by revenue.

Excel Limitations for Amenity Analysis

Challenge

Each listing's amenities are stored as a single text field containing a list (["Wifi","Kitchen","Dryer"]).
Opening this in Excel shows the entire list crammed into one cell, making individual amenity analysis impossible with standard pivot tables.

Solution

Used Python to read each list, break it apart into separate items, and handle formatting inconsistencies like extra quotes.
This produced 3,092 unique amenities across all listings.

Unrated Listings Processed as Zero

Challenge

3,465 listings had no rating data.
These were initially processed as 0, which dragged down all average rating calculations across the dashboard.

Solution

Excluded missing ratings from all rating computations.

Analytical Decisions

Defining "Success"

Challenge

Ranking amenities by pure rating produced unintuitive results (BBQ grill at #1 with only 5 listings). Ranking by pure revenue ignored quality entirely.

Solution

Filter for amenities with avg rating >= 4.8 first, then rank by revenue. This balances quality with business impact and filters out low-sample outliers.

Excel Limitations for Amenity Analysis

Challenge

Each listing's amenities are stored as a single text field containing a list (["Wifi","Kitchen","Dryer"]). Opening this in Excel shows the entire list crammed into one cell, making individual amenity analysis impossible with standard pivot tables.

Solution

Used Python to read each list, break it apart into separate items, and handle formatting inconsistencies like extra quotes. This produced 3,092 unique amenities across all listings. Simpler analyses like review count binning worked well in Excel, but amenity analysis required Python (pandas).
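The parsing step can be sketched roughly as follows. This is a minimal illustration that assumes the amenities field is JSON-like text; the function name parse_amenities, the fallback rule, and the sample cells are hypothetical and not taken from the project's notebooks.

```python
import json
from collections import Counter


def parse_amenities(raw):
    """Turn one amenities cell into a list of clean amenity names."""
    if not raw:
        return []
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        # Assumed fallback for malformed cells: drop brackets, split on commas.
        items = raw.strip("[]").split(",")
    # Remove surrounding whitespace and leftover quote characters.
    cleaned = [str(item).strip().strip('"\'') for item in items]
    return [name for name in cleaned if name]


# Made-up sample cells, including one with extra quotes and one empty cell.
cells = ['["Wifi","Kitchen","Dryer"]', '["Wifi", "\\"Free washer\\""]', ""]
counts = Counter(name for cell in cells for name in parse_amenities(cell))
unique_amenities = sorted(counts)
```

Counting the keys of counts over the full dataset is how a figure like the number of unique amenities would be obtained under these assumptions.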

Data Cleaning & Transformation

Unrated Listings Stored as Zero

Challenge

3,465 listings had no rating data. These were initially stored as 0, which dragged down all average rating calculations across the dashboard.

Solution

Stored missing ratings as -1 and excluded them from all rating computations. This required regenerating the entire embedded dataset.
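A small sketch of why the sentinel matters. The helper name average_rating and the sample ratings are made up for illustration; only the -1 sentinel comes from the description above.

```python
from statistics import mean

MISSING = -1  # sentinel meaning "no rating data"


def average_rating(ratings):
    """Average only the listings that actually have a rating."""
    rated = [r for r in ratings if r is not None and r != MISSING]
    return mean(rated) if rated else None


# Made-up sample: three rated listings and two unrated ones.
ratings = [4.8, 5.0, MISSING, 4.6, MISSING]

# Treating unrated listings as 0 drags the average down.
naive = mean(0 if r == MISSING else r for r in ratings)

# Excluding them gives the average of the rated listings only.
clean = average_rating(ratings)
```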

Bathroom Data Split Across Two Columns

Challenge

Bathroom information was split between "bathrooms" (66.3% fill rate, numeric) and "bathrooms_text" (99.9% fill rate, descriptive like "1.5 baths"). Neither column alone covered all listings.

Solution

Used bathrooms_text as the primary source and bathrooms as a fallback, recovering 10 additional listings that would otherwise have no bathroom data.
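The primary-plus-fallback logic might look like the sketch below. The function names and the regular expression are illustrative, and the treatment of "half-bath" wording as 0.5 is an assumption, not something stated in the source.

```python
import re


def bathrooms_from_text(text):
    """Extract a number from strings such as '1.5 baths' or '1 shared bath'."""
    if not text:
        return None
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match:
        return float(match.group())
    # Assumption: wording like 'Half-bath' has no digit and means 0.5.
    if "half" in text.lower():
        return 0.5
    return None


def resolve_bathrooms(bathrooms_text, bathrooms):
    """Prefer the descriptive column; fall back to the numeric one."""
    parsed = bathrooms_from_text(bathrooms_text)
    return parsed if parsed is not None else bathrooms
```

For example, resolve_bathrooms(None, 2.0) falls back to the numeric column, while a listing with neither value stays None and would count among those with no bathroom data.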

Mixed Language Content

Challenge

The data contains a mix of English and German text. Amenities are almost entirely English (only 2.7% German), but host-written text has much more: 22.5% of titles, 19.3% of descriptions, and 33.5% of neighborhood overviews contain German characters.

Solution

Word count analysis captures all text equally regardless of language, so no special handling was needed for the current dashboard. Any future keyword or sentiment analysis would need multilingual support.
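As a rough illustration, both checks fit in a few lines. This sketch assumes "German characters" means umlauts and ß, which is a guess at how the percentages were measured; the function names and sample strings are hypothetical.

```python
# Assumption: a text "contains German" if it has an umlaut or eszett.
GERMAN_CHARS = set("äöüÄÖÜß")


def word_count(text):
    """Whitespace-based word count; works the same for any language."""
    return len(text.split()) if text else 0


def has_german_chars(text):
    """True if the text contains at least one German-specific character."""
    return bool(text) and any(ch in GERMAN_CHARS for ch in text)


# Made-up sample titles.
titles = ["Cozy flat near Tempelhof", "Gemütliche Wohnung in Kreuzberg", None]
german_share = sum(has_german_chars(t) for t in titles) / len(titles)
```

A check like this misses German text that happens to contain no special characters, which is one reason keyword or sentiment work would need proper language detection.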

Technical & Deployment

React to Vanilla JS Rewrite

Challenge

The original codebase was a React/Vite app with duplicated code (the entire component was copy-pasted twice), causing build failures.

Solution

Rewrote the entire dashboard as a single HTML file with vanilla JS, eliminating the need for any build step or npm dependencies.

44 MB of CSV Data for a Static Site

Challenge

The raw CSV files totaled 44 MB, far too large to embed directly in a static HTML page.

Solution

Built a Python preprocessing pipeline that extracted only the needed fields, replaced text with word counts, and used dictionary encoding for amenities (3,092 unique) and neighborhoods (163 unique), compressing the data to 2.5 MB (0.7 MB gzipped).
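Dictionary encoding, in its simplest form, stores each distinct string once and replaces every occurrence with a small integer. The sketch below shows the idea only; the function name and sample rows are made up and this is not the pipeline's actual code.

```python
def dictionary_encode(rows):
    """Replace repeated strings with integer IDs plus one shared lookup table."""
    lookup = {}  # name -> integer ID, assigned in order of first appearance
    encoded = []
    for names in rows:
        encoded.append([lookup.setdefault(name, len(lookup)) for name in names])
    table = list(lookup)  # position in this list is the ID
    return table, encoded


def dictionary_decode(table, encoded):
    """Rebuild the original rows from the lookup table and the IDs."""
    return [[table[i] for i in ids] for ids in encoded]


# Made-up sample: amenity lists for three listings.
rows = [["Wifi", "Kitchen"], ["Wifi", "Dryer"], ["Kitchen", "Dryer", "Wifi"]]
table, encoded = dictionary_encode(rows)
```

The saving comes from repetition: a frequently occurring amenity name is written once in the table and referenced by a short number everywhere else.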

GitHub Pages Jekyll Failures

Challenge

GitHub Pages runs Jekyll by default, which choked on the 2.5 MB data.js file. Deployments failed silently.

Solution

Added a .nojekyll file to bypass Jekyll processing entirely. Also disconnected a conflicting Vercel integration.

Analytical Techniques Used

Analysis by Ana

#	Analysis	Software	Method
1	Success Factors	Excel	Grouped listings by number of reviews and calculated the average revenue for each group
2	Regional Observation	Python	Split amenity lists apart, then found which amenities appear in the highest-earning listings per city
3	Interesting Trend	Python	Same as above, but compared top amenities across different neighborhoods
4	Wording Impact	Python	Counted words in each description, grouped by length, and compared average revenue across groups
5	Satisfaction Trend	Python	Checked whether listings with longer neighborhood descriptions tend to have higher ratings
6	Optimal Length	Excel	Grouped listing titles by word count and found which length gets the best ratings

New Fine Tunes to Project Scope

Authored by Ana

Original Scope: The original project scope included an HTML file dashboard that had to be downloaded and opened locally
Shared Capability: Both dashboards were interactive, with filters to explore the data by city, property type, bedrooms, and bathrooms
Data Issue: When we discovered the wrong data file was used (a smaller subset of the full data), the HTML dashboard could not be corrected due to hardcoding
Task Handoff: The original team members were transitioned off the dashboard task due to limited availability and unwillingness to investigate and correct the data source issue. They were unresponsive for days, which left the team without visibility into progress until the deadline. The issue could have been caught earlier had they been transparent about the shortcuts they were taking. A team member stepped in to build the online dashboard from scratch to meet the project's needs

HTML Dashboard File

✗ Access: Required downloading a file and opening it locally
✗ Sharing: Difficult to share with others (must send the file)
✗ Inflexible: Unable to update the data source

Online Dashboard

✓ Access: Accessible via URL, no download needed
✓ Sharing: Shareable link that works on any device
✓ Flexible: Source data can be updated with a simple file upload. Visuals designed to answer specific business questions

Data Sources Available

The following files were available from Inside Airbnb for each city. This dashboard uses the listings files (highlighted).

Source: insideairbnb.com

File	Berlin	Munich	Description
listings.csv	✓	✓	Detailed listing data with 79 columns (used in this dashboard)
calendar.csv	✓	✓	Daily availability and pricing for each listing
reviews.csv	✓	✓	Individual guest reviews with text and date
listings_summary.csv	✓	✓	Condensed listing data with fewer columns
reviews_summary.csv	✓	✓	Condensed review data with date and listing ID
neighbourhoods.csv	✓	✓	List of neighbourhood names and groupings

Data Sources Used

This dashboard analyzes Airbnb listing data from two German cities, sourced from Inside Airbnb CSV exports. Both files share an identical schema of 79 columns.

Source: insideairbnb.com

	Berlin	Munich
File	zip_berlin_listings.csv	zip_munich_listings.csv
Rows	14,274	8,274
Columns	79	79
Schema Match	Identical (all 79 columns shared)

All 79 Columns & Their Usage

Preliminary Analysis by Ana

#	Column Name	Data Available	Used in Dashboard

Data Quality & Coverage

Data Cleanup by Ana

Bathroom Data

Two columns contain bathroom information: bathrooms_text (99.9% fill rate) is the primary source, with bathrooms (66.3% fill rate) used as a fallback.
Of 22,548 total listings, 22 are missing bathrooms_text.
Of those, 10 have data in the bathrooms column, recovering those listings.
Only 12 listings (0.05%) have no bathroom data at all.

Ratings

The review_scores_rating column has a 76.8% fill rate.
Listings without ratings (typically new listings with no reviews) are excluded from all rating calculations in the dashboard to avoid skewing averages.

Revenue

The estimated_revenue_l365d column has a 65.9% fill rate.
Only listings with a valid price (and therefore revenue estimate) are included in the dashboard, resulting in 14,867 listings used for analysis.

Analysis Difficulty

Ordered from easiest to most difficult analytical technique.

#	Analysis	Software	Difficulty	Method
1	Success Factors	Excel	Easy	Pivot table with review count bins
2	Optimal Length	Excel	Easy	Binned title word counts, computed avg rating per bin
3	Satisfaction Trend	Python	Medium	Correlation between word count and rating
4	Wording Impact	Python	Medium	Word count binning + linear regression
5	Regional Observation	Python	Hard	Parsed amenities from single cell, grouped by city, ranked by revenue
6	Interesting Trend	Python	Hard	Same amenity parsing, filtered per neighborhood, compared across areas

List of Files

#	File	Analysis	City	Download
1	success_factors.xlsx	Success Factors	Berlin & Munich	Download
2	optimal_length.xlsx	Optimal Length	Berlin & Munich	Download
3	satisfaction_trend.ipynb	Satisfaction Trend	Berlin & Munich	Download
4	wording_impact.ipynb	Wording Impact	Berlin & Munich	Download
5	regional_observation.ipynb	Regional Observation	Berlin & Munich	Download
6	interesting_trend.ipynb	Interesting Trend	Berlin & Munich	Download

Columns Used Per Analysis Question

1. Success Factors

What are the characteristics that determine whether a given listing is successful?

number_of_reviews estimated_revenue_l365d review_scores_rating

Analytical Method

Software: Excel (Pivot Table)

Setup

Create a helper column to bin review counts into non-overlapping ranges: 0-10, 11-25, 26-50, 51-100, 101-200, 201+. Use this as the Row field in the pivot table.

Steps

Add the review count bin column as the Row field
Add estimated_revenue_l365d as a Value field, set to "Average"
Add review_scores_rating as a second Value field, set to "Average"
Read the resulting table to see avg revenue and avg rating per review count bin
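For readers who prefer code to a pivot table, the same steps can be approximated in pandas. This is an equivalent sketch, not the Excel workbook itself; the sample values are made up, and the last bin covers review counts above 200.

```python
import pandas as pd

# Made-up sample rows using the three columns named above.
listings = pd.DataFrame({
    "number_of_reviews": [3, 8, 18, 40, 75, 150, 320],
    "estimated_revenue_l365d": [1000, 3000, 5000, 8000, 12000, 15000, 20000],
    "review_scores_rating": [4.7, 4.8, 4.8, 4.75, 4.8, 4.77, 4.76],
})

# Helper column: assign each listing to a review count bin.
edges = [-1, 10, 25, 50, 100, 200, float("inf")]
labels = ["0-10", "11-25", "26-50", "51-100", "101-200", "201+"]
listings["review_bin"] = pd.cut(
    listings["number_of_reviews"], bins=edges, labels=labels
)

# Pivot equivalent: bin as the row field, two averaged value fields.
summary = listings.groupby("review_bin", observed=False)[
    ["estimated_revenue_l365d", "review_scores_rating"]
].mean()
```

Each row of summary corresponds to one row of the pivot table: average revenue and average rating for that review count bin.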

2. Regional Observation

What are the top amenity characteristics for listings in different regions?

amenities review_scores_rating estimated_revenue_l365d neighbourhood_cleansed

Analytical Method

Software: Python (pandas)

Setup

Each listing stores amenities as a list inside a single text field (["Wifi","Kitchen","Dryer"]). Parse each listing's amenity list and split it into individual items so each amenity can be analyzed separately.

Steps

Split each listing's amenity list into individual amenity rows
Group by city and amenity
Calculate average rating and average revenue per amenity per city
Filter for amenities with avg rating >= 4.8
Rank by average revenue to find top performers
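The steps above can be sketched in pandas as follows. The sample rows are made up, a city column is assumed to have been added when the two city files were combined, and the amenities field is assumed to parse as JSON.

```python
import json

import pandas as pd

# Made-up sample rows; "city" is an assumed column added when combining files.
listings = pd.DataFrame({
    "city": ["Berlin", "Berlin", "Berlin", "Munich"],
    "amenities": [
        '["Wifi","Free washer"]',
        '["Wifi"]',
        '["Free washer"]',
        '["Wifi","Private entrance"]',
    ],
    "review_scores_rating": [4.9, 4.5, 4.85, 4.9],
    "estimated_revenue_l365d": [30000, 10000, 20000, 25000],
})

# Step 1: one row per (listing, amenity).
listings["amenity"] = listings["amenities"].apply(json.loads)
per_amenity = listings.explode("amenity")

# Steps 2-3: average rating and revenue per amenity per city.
stats = (
    per_amenity.groupby(["city", "amenity"])
    .agg(
        avg_rating=("review_scores_rating", "mean"),
        avg_revenue=("estimated_revenue_l365d", "mean"),
        n_listings=("review_scores_rating", "size"),
    )
    .reset_index()
)

# Steps 4-5: keep highly rated amenities, then rank by revenue within city.
top = stats[stats["avg_rating"] >= 4.8].sort_values(
    ["city", "avg_revenue"], ascending=[True, False]
)
```

Keeping n_listings in the output makes it easy to spot amenities that rank highly on the strength of only a handful of listings.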

3. Interesting Trend

How do top characteristics vary between specific neighborhoods in Berlin?

amenities neighbourhood_cleansed review_scores_rating estimated_revenue_l365d

Analytical Method

Software: Python (pandas) or Tableau

Setup

Filter the dataset to a specific neighborhood, then parse each listing's amenity list into individual items for per-neighborhood analysis.

Steps (Python)

Filter listings to a single neighborhood
Split each listing's amenity list into individual amenity rows
Calculate average rating and average revenue per amenity
Filter for amenities with avg rating >= 4.8
Rank by average revenue
Repeat for other neighborhoods and compare results

Alternative (Tableau)

Create a cross-tab with neighborhood and amenity as dimensions, avg rating and avg revenue as measures, filtered by rating >= 4.8. Compare neighborhoods side by side.

4. Wording Impact

How does the wording of a listing impact the number of visitors to that listing?

description neighborhood_overview estimated_revenue_l365d

Analytical Method

Software: Python (pandas, scipy/statsmodels)

Setup

Count the number of words in each listing's description and neighborhood overview to create word count columns for analysis.

Steps

Create word count columns for description and neighborhood_overview
Binned aggregation: group word counts into non-overlapping ranges (0-10, 11-50, 51-100, 101-200, 201+), compute average revenue per bin
Linear regression: use word count as the predictor and revenue as the target to quantify the relationship
Check statistical significance (p-value) to confirm the relationship is not due to chance
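The regression and significance steps might look like this with scipy. The numbers are made up for illustration, so the fitted values say nothing about the real data.

```python
from scipy import stats

# Made-up sample: description word counts and estimated annual revenue.
word_counts = [4, 9, 30, 45, 80, 95, 150, 180, 240, 300]
revenue = [8000, 9000, 11000, 12500, 14000, 15500, 17000, 16500, 19000, 21000]

# Ordinary least squares with word count as predictor, revenue as target.
fit = stats.linregress(word_counts, revenue)

slope = fit.slope        # estimated revenue change per additional word
r_squared = fit.rvalue ** 2  # share of variance explained by word count
p_value = fit.pvalue     # two-sided test of the null hypothesis slope == 0
```

A small p-value only indicates that the slope is unlikely to be zero in this sample. It does not show that longer descriptions cause higher revenue, since better-run listings may simply tend to have longer text.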

5. Satisfaction Trend

Is there a correlation between neighborhood detail and guest ratings?

neighborhood_overview review_scores_rating

Analytical Method

Software: Python (scipy/statsmodels) or Excel

Setup

Count words in each listing's neighborhood_overview to create a word count column. Pair with review_scores_rating for each listing.

Steps

Correlation analysis (Pearson/Spearman) between word count and rating
Regression analysis to quantify how much each additional word predicts rating changes
Excel scatter plot with trendline to visually confirm the relationship
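To make the difference between the two correlation measures concrete, here is a dependency-free sketch. In practice scipy provides both; the function names and sample data below are illustrative only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up sample: overview word counts and ratings.
overview_words = [0, 5, 40, 80, 120, 200]
ratings = [4.5, 4.6, 4.8, 4.8, 4.85, 4.8]
linear_r = pearson(overview_words, ratings)
rank_r = spearman(overview_words, ratings)
```

Spearman is often the safer choice here because ratings cluster near the top of the scale and the relationship with word count need not be linear.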

6. Optimal Length

What is the ideal word count for listing titles?

name review_scores_rating

Analytical Method

Software: Excel or Python

Setup

Count the number of words in each listing title to create a word count column. Define non-overlapping bin ranges: 1-3 (Short), 4-6 (Standard), 7-10 (Long), 11-15 (Descriptive), 16+ (Max Detail).

Steps

Create a word count column for each listing title
Assign each listing to a bin based on its word count
Calculate the average rating per bin
Identify which bin has the highest average rating

Alternative (Machine Learning)

A random forest or gradient boosting model could use title word count as a feature to predict ratings. Feature importance scores would confirm whether title length is a meaningful predictor.

Analytical Techniques Used

#	Analysis	Software	Method
1	Success Factors	Excel	Binned review counts into ranges and computed average revenue per bin using a pivot table
2	Regional Observation	Python	Parsed amenities stored in a single cell into individual items, then grouped by city to rank by revenue
3	Interesting Trend	Python	Same amenity parsing as above, filtered per neighborhood to compare top amenities across areas
4	Wording Impact	Python	Counted words in descriptions, binned into ranges, and computed average revenue per bin
5	Satisfaction Trend	Python	Correlated neighborhood overview word count with average rating across bins
6	Optimal Length	Excel	Binned title word counts into ranges and identified the bin with the highest average rating

Word Count vs Revenue & Rating

Insights for Hosts

Wording Impact

How does the wording of a listing impact the number of visitors to that listing?

Ratings flat, but revenue rises with detail

Description length does not significantly impact guest ratings
Listings with more detailed descriptions do tend to earn higher revenue

Satisfaction Trend

Is there a correlation between neighborhood detail & performance?

Detail drives revenue, not ratings

Neighborhood overview length does not significantly impact guest ratings
Listings with more detailed overviews do tend to earn higher revenue

Optimal Length

What is the ideal word count for listing titles?

Keep titles concise

Ratings stay flat across most title lengths
Titles over 15 words show diminishing returns, with ratings beginning to drop off

Revenue & Rating vs. Title Word Count

Title: The name of the Airbnb listing as it appears in search results.

Language Used in Listings

Though German is used across listings, the majority of text is written in English.

Titles with German: 22.5% (5,083 of 22,548)
Descriptions with German: 19.3% (4,177 of 21,659)
Neighborhood Overviews with German: 33.5% (2,977 of 8,886)

Most Common Words

Words are ranked by how frequently they appear across all listings in the selected text field.

Only the top 50 most frequent words are shown.

English Words

#	Word	Count

German Words

#	Word	Count
