This chapter gives a brief introduction to several modules in Python's urllib package and to an open-source Python library called BeautifulSoup. The former is used to build URLs and read data from a web server; the latter is used to parse the results and extract data from the HTML.

Objectives

Upon completion of this chapter’s exercises, you should be able to:

  • Open a request and make a connection to a remote web server.
  • Read a stream of bytes from a request.
  • Build requests that send data to the remote server using the GET method.
  • Build requests that send data to the remote server using the POST method.
  • Use a library to select HTML elements using CSS selectors.
  • Get and display attributes and inner text of HTML tags.
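
As a preview of the request-building objectives, here is a minimal sketch of how the same form data could be attached to a GET request and to a POST request using only urllib. The endpoint `https://example.com/search` and the field names `q` and `page` are hypothetical placeholders, not part of the chapter's exercises, and nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and form fields, used only for illustration.
BASE_URL = "https://example.com/search"
params = {"q": "web scraping", "page": 2}

# urlencode turns the dict into a percent-encoded query string.
query = urlencode(params)

# GET: the data travels in the URL, after a question mark.
get_request = Request(BASE_URL + "?" + query)

# POST: the same data travels in the request body, which must be bytes.
post_request = Request(BASE_URL, data=query.encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)

# Sending a request and reading the reply as a stream of bytes would look
# roughly like this (commented out so the sketch runs offline):
# from urllib.request import urlopen
# with urlopen(get_request) as response:
#     raw_bytes = response.read()
#     html = raw_bytes.decode("utf-8")
```

Note that urllib picks the HTTP method from whether `data` is supplied: a `Request` with no body defaults to GET, and one with a body defaults to POST.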
