Robust Multithreaded Web Crawler
University project: a multithreaded web crawler with a responsive WPF UI and MSSQL-backed persistence.
Description
A robust WPF-based crawler that traverses websites, extracts anchor links, and stores them in MSSQL. Crawling runs on worker threads so the UI stays responsive.
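The anchor-extraction step described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the project is WPF-based, and Python is used here only for brevity. The names `AnchorCollector` and `extract_links` are hypothetical, and the choices to drop URL fragments and to keep only http/https links are assumptions of this sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class AnchorCollector(HTMLParser):
    """Collects absolute http(s) link targets from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative hrefs against the page URL.
                absolute = urljoin(self.base_url, value)
                # Assumption: "#fragment" variants count as the same page.
                absolute, _fragment = urldefrag(absolute)
                # Assumption: skip mailto:, javascript:, and similar schemes.
                if urlparse(absolute).scheme in ("http", "https"):
                    self.links.append(absolute)


def extract_links(base_url, html):
    """Return the absolute links found in one HTML document, in page order."""
    collector = AnchorCollector(base_url)
    collector.feed(html)
    return collector.links
```

Calling `extract_links(page_url, page_html)` returns the links in document order; deduplication and persistence would happen in a later step.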
Objective / Aim
Demonstrate scalable crawling, concurrency, and structured persistence with solid OOP design.
Role & Responsibilities
Engineered the crawler core, integrated MSSQL, built the WPF UI, implemented thread management, and added domain filters.
Features / Implementation
Threaded crawl workers; HTML parsing for anchors; domain allowlist; MSSQL schema for results; progress and control via WPF.
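As a rough illustration of how threaded crawl workers and a domain allowlist can fit together, the sketch below uses a shared work queue and a lock-protected visited set. It is one possible arrangement under stated assumptions, not the project's implementation: `crawl`, `is_allowed`, and the injected `fetch_links` callback are hypothetical names, subdomains of an allowed domain are assumed to be allowed, and persistence to MSSQL and WPF progress reporting are left out.

```python
import threading
from queue import Queue
from urllib.parse import urlparse


def is_allowed(url, allowed_domains):
    """True if the URL's host is an allowed domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


def crawl(seed_urls, allowed_domains, fetch_links, worker_count=4):
    """Crawl from the seeds and return the set of visited URLs.

    fetch_links(url) is a caller-supplied function (hypothetical here) that
    downloads a page and returns the links found on it.
    """
    frontier = Queue()
    visited = set()
    lock = threading.Lock()

    def enqueue(url):
        if not is_allowed(url, allowed_domains):
            return
        with lock:  # check-and-add must be atomic across workers
            if url in visited:
                return
            visited.add(url)
        frontier.put(url)

    def worker():
        while True:
            url = frontier.get()
            if url is None:  # sentinel: no more work
                frontier.task_done()
                return
            try:
                for link in fetch_links(url):
                    enqueue(link)
            except Exception:
                pass  # sketch only: a page that fails to load is skipped
            finally:
                frontier.task_done()

    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(worker_count)]
    for t in threads:
        t.start()
    for url in seed_urls:
        enqueue(url)
    frontier.join()  # returns once every queued URL has been processed
    for _ in threads:
        frontier.put(None)
    for t in threads:
        t.join()
    return visited
```

Passing `fetch_links` in as a parameter keeps the threading logic independent of networking and parsing, so it can be exercised against an in-memory link graph. New links are enqueued before the current page is marked done, so `frontier.join()` cannot return while work is still being discovered.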
Impact / Results
Delivered a reusable crawling framework showcasing advanced OOP and concurrency in a desktop setting.