Blog Web Scraper built by Felix Hildebrandt as a final thesis for Web Analytics in 2019. The fetched data was further analysed by Lukas Brueggemann as an extended group project for a Big Data science course.
NOTE: Code Commentary appears in German.
By default, the web scraper is adapted to the blog of Kuechenchaotin, a well-known German food and travel website. As this project is just a showcase, it can be customized for any other domain on demand.
The extended analysis based on the sample blog can be found within the `/metrics` folder of this repository. The structure is similar to the table of contents in the main description file and includes subfolders for internal and external analytics.
- Link to General Metrics Documentation
- Link to Internal Metrics Analysis
- Link to External Metrics Analysis
As stated, the tool can be used to measure the value, success, and outcome of different web blogs. Based on the script, the following core metrics can be fetched:
- Conversation Rate in Comments per Post
- Outcome in Posts per Month
- Content Created in Words per Post
- Blog Value
- Applied Business Models
- Social Media Communities
- Communication Pillars
- Alignment and Media Design
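The three rate-style metrics in the list above (comments per post, posts per month, words per post) can be sketched in a few lines of Python. This is only an illustration and not the code from this repository: the `Post` record, its field names, and the sample values are hypothetical, and averaging posts over active months only is an assumption about how "per month" is meant.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Post:
    # Hypothetical record layout; the real scraper output may differ.
    published: date
    text: str
    comment_count: int


def core_metrics(posts):
    """Return comments per post, posts per month, and words per post."""
    if not posts:
        raise ValueError("need at least one post")
    n = len(posts)
    # Count posts per (year, month); only months with at least one post
    # enter the average (an assumption, see the note above).
    per_month = Counter((p.published.year, p.published.month) for p in posts)
    return {
        "comments_per_post": sum(p.comment_count for p in posts) / n,
        "posts_per_month": n / len(per_month),
        # Words are approximated by splitting the text on whitespace.
        "words_per_post": sum(len(p.text.split()) for p in posts) / n,
    }


# Made-up sample data for illustration only.
sample = [
    Post(date(2019, 1, 5), "Pasta with tomato sauce", 4),
    Post(date(2019, 1, 20), "Weekend trip to Hamburg", 2),
    Post(date(2019, 2, 3), "Quick apple cake", 6),
]
metrics = core_metrics(sample)
```

The remaining items in the list (blog value, business models, communities, and so on) are qualitative and are not covered by this sketch.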
Based on the sample data of over 600 posts, various predictions, evaluations, and assessments can be made:
- Comment Count Predictions
- Comment Frequency
- Interaction Trendline
- Publication Date Measures
- External Link Extraction
- Post Category Analysis
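As a rough illustration of what an interaction trendline or comment count prediction could look like, the sketch below fits an ordinary least-squares line through the comment counts of consecutive posts and extrapolates one step ahead. The method, the function name `fit_trendline`, and the sample numbers are assumptions made for this example; the analysis in this repository may use a different approach.

```python
def fit_trendline(values):
    """Fit a least-squares line through the points (0, v0), (1, v1), ...

    Returns (slope, intercept).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up comment counts of four consecutive posts, oldest first.
comments = [2, 4, 6, 8]
slope, intercept = fit_trendline(comments)

# Extrapolate the line to the next (fifth) post.
predicted_next = slope * len(comments) + intercept
```

A positive slope indicates growing interaction per post. A straight line is a deliberately simple model, so seasonal effects or outliers in real data would call for something more robust.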
Further, external sources like SimilarWeb were used to combine the internal metrics with external traffic and search data from social media listings or referrals:
- Referring Traffic
- Search Visits
- Search Engagement
- Channel Analytics
- Demographics
- Geographics
- Browsing Categories
- Total Visits
- Lukas Brueggemann

