Welcome

In this project, we'll build a web crawler in Python! To rank well in Google Search, a website's pages need to link to one another internally. For example, a blog post about the benefits of haircuts should probably link to my post about the best places to get haircuts.

We're going to write a Python CLI application that generates an "internal links" report for any website on the internet by crawling each page of the site.
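To make the idea of an "internal link" concrete, here is a minimal sketch that uses only the standard library. It is not the course's implementation: the names LinkCollector and internal_links, the sample HTML, and the blog.example.com URL are all made up for illustration, and "internal" is assumed to mean "same host as the page being crawled".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(html, page_url):
    """Return absolute URLs found in html that share page_url's host.

    Assumption: a link counts as internal when its host matches the host
    of the page it was found on.
    """
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    links = []
    for href in collector.hrefs:
        # Resolve relative links such as "/best-barbers" against the page URL.
        absolute = urljoin(page_url, href)
        if urlparse(absolute).netloc == host:
            links.append(absolute)
    return links


# Made-up sample page with one internal and one external link.
sample = (
    '<a href="/best-barbers">Barbers</a> '
    '<a href="https://other.example/x">Other</a>'
)
print(internal_links(sample, "https://blog.example.com/haircut-benefits"))
```

A real crawler would also fetch each page over HTTP and keep track of which pages it has already visited; this sketch only covers the link-filtering step.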

For Windows Users

If you try to complete this course without WSL 2 installed on Windows, you're gonna have a bad time.

Learning Goals

  • Get hands-on practice with local Python development and tooling
  • Practice making HTTP requests in Python
  • Learn how to parse HTML
  • Practice unit testing

Python Setup

Before we dive into the project, let's make sure you are all set up properly. You will need:

  1. Python 3.10+ installed (see the bookbot project for help if you don't already have it)
  2. The uv project and package manager
  3. The Python extension, if you're in VS Code. It's not required, but it makes working with Python a lot easier.
  4. The Boot.dev CLI, installed and logged in

Assignment

Type python --version to make sure that you have Python 3.10 or newer installed.

Also, confirm that you have uv installed by typing uv --version.
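If you'd rather check both requirements from Python itself, the following is one possible sketch. The helper name meets_minimum is made up for this example, and the script assumes that uv, when installed, is reachable on your PATH.

```python
import shutil
import sys


def meets_minimum(version_info, minimum=(3, 10)):
    """Hypothetical helper: compare (major, minor) against the minimum."""
    return tuple(version_info[:2]) >= minimum


# Report the running interpreter's version and whether it is new enough.
status = "OK" if meets_minimum(sys.version_info) else "too old, need 3.10+"
print("Python", sys.version.split()[0], status)

# shutil.which returns the executable's path, or None if it is not on PATH.
uv_path = shutil.which("uv")
print("uv:", uv_path if uv_path else "not found on PATH")
```

This is only a convenience; running the two commands above in your terminal tells you the same thing.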

Submit the CLI tests. There's no penalty for failing this lesson.