Warehouse codebase

Warehouse uses the Pyramid web framework, the SQLAlchemy ORM, and Postgres for its database. Warehouse’s front end uses Jinja2 templates.

The production deployment for Warehouse is in progress and currently does not use any containers, although that may change in the future. In the development environment, we use several Docker containers, and use Docker Compose to manage the containers and the connections between them. In the future we will probably reduce that number to two containers: one containing the static files for the website, and the other containing the database and the Python web application code running in a virtual environment.

Because Warehouse was built on top of an existing database, developers had to fit our ORM to the existing tables, so some of the ORM code may not look like code from the SQLAlchemy documentation. In some places, joins are done using name-based logic instead of a foreign key (but this may change in the future).
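To illustrate what a name-based join looks like, here is a minimal sketch that uses plain SQL through Python's built-in sqlite3 module rather than SQLAlchemy or Postgres, so that it runs without extra dependencies. The table and column names (projects, releases, project_name) and the versions_for helper are hypothetical and are not taken from Warehouse's schema. In SQLAlchemy, this kind of join is typically expressed by passing an explicit primaryjoin condition to relationship().

```python
import sqlite3

# In-memory database with two hypothetical tables. Note that "releases"
# declares no FOREIGN KEY: it refers to a project only by a matching name.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE projects (name TEXT PRIMARY KEY);
    CREATE TABLE releases (project_name TEXT, version TEXT);
    INSERT INTO projects VALUES ('demo');
    INSERT INTO releases VALUES ('demo', '1.0'), ('demo', '1.1');
    """
)


def versions_for(project_name):
    """Return a project's release versions, joined on the shared name."""
    rows = conn.execute(
        "SELECT r.version FROM projects AS p "
        "JOIN releases AS r ON r.project_name = p.name "
        "WHERE p.name = ? ORDER BY r.version",
        (project_name,),
    )
    return [version for (version,) in rows]


print(versions_for("demo"))
```

Because the database does not enforce the link, nothing stops a release row from naming a project that does not exist, which is one reason such joins need to be spelled out explicitly in the ORM.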

Warehouse also uses hybrid URL traversal and dispatch. Factory classes provide resources directly to the views based on the URL pattern. This method of handling URLs may be unfamiliar to developers used to other web frameworks, such as Django or Flask. This article has a helpful discussion of the differences between URL dispatch and traversal in Pyramid.
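The idea can be sketched without any framework. The example below is not Pyramid's API and not Warehouse's code: the names Project, ProjectFactory, project_detail, ROUTES, and handle are all hypothetical, and the dictionary stands in for a database lookup. It shows the two steps of the hybrid approach: a URL pattern selects a route (dispatch), and the route's factory then resolves the matched segment to a resource (traversal), which is passed to the view.

```python
import re


class Project:
    def __init__(self, name):
        self.name = name


class ProjectFactory:
    """Root resource for a route; item lookup is the traversal step."""

    _projects = {"demo": Project("demo")}  # stand-in for a database query

    def __init__(self, request):
        self.request = request

    def __getitem__(self, name):
        return self._projects[name]  # KeyError means "no such resource"


def project_detail(project, request):
    """View: receives the resolved resource, not just raw URL parameters."""
    return f"Project page for {project.name}"


# (pattern, factory, view) triples; matching the pattern is the dispatch step.
ROUTES = [
    (re.compile(r"^/project/(?P<name>[^/]+)/$"), ProjectFactory, project_detail),
]


def handle(path):
    request = {"path": path}
    for pattern, factory, view in ROUTES:
        match = pattern.match(path)
        if match is None:
            continue
        try:
            resource = factory(request)[match.group("name")]
        except KeyError:
            return "404 Not Found"
        return view(resource, request)
    return "404 Not Found"


print(handle("/project/demo/"))
```

The view never parses the URL or queries for the project itself; a missing resource is handled before the view is called.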

Usage assumptions and concepts

See PyPI help to understand projects, releases, and packages.

Warehouse is specifically the codebase for the official Python Package Index, and thus focuses on architecture and features for PyPI and Test PyPI. People and groups who want to run their own package indexes usually use other tools, like devpi.

Warehouse serves three main classes of users:

  1. People who are not logged in. This accounts for the majority of browser traffic and all API download traffic.
  2. Owners/maintainers of one or more projects. This accounts for almost all writes. A user must create and use a PyPI account to maintain or own a project, and there is no particular functionality available to a logged-in user other than to manage projects they own/maintain. As of March 2018, PyPI had about 270,000 users, and Test PyPI had about 30,000 users.
  3. PyPI application administrators, e.g., Ernest W. Durbin III, Dustin Ingram, and Donald Stufft, who add classifiers, ban spam/malware projects, help users with account recovery, and so on. There are under ten such admins.

Since reads are much more common than writes (much more goes out than goes in), we try to cache as much as possible. This is a big reason why we currently do not support localization, although we have in the past.
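One way to see the tension between caching and localization is the following toy sketch. It is an assumption about the general reasoning, not a description of how Warehouse or its cache actually works; PageCache and render are hypothetical names. When a page is the same for every visitor, one cached copy serves all requests; when the page varies by language, the cache needs one entry per language, so more requests miss.

```python
class PageCache:
    """Toy cache keyed on (path, language) that counts cache misses."""

    def __init__(self, render):
        self._render = render
        self._store = {}
        self.misses = 0

    def get(self, path, language=None):
        key = (path, language)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._render(path, language)
        return self._store[key]


def render(path, language):
    # Stand-in for an expensive page render.
    return f"<html lang={language or 'en'}>{path}</html>"


# Same page for everyone: three requests, one render.
shared = PageCache(render)
for _ in range(3):
    shared.get("/project/demo/")

# Page varies by language: three requests, three renders.
localized = PageCache(render)
for language in ("en", "fr", "de"):
    localized.get("/project/demo/", language)

print(shared.misses, localized.misses)
```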

File and directory structure

The top-level directory of the Warehouse repo contains files including:

  • CONTRIBUTING.rst (the contribution guide)
  • README.rst
  • requirements.txt for the Warehouse virtual environment
  • Dockerfile: creates the Docker containers that Warehouse runs in
  • docker-compose.yml: configures Docker Compose
  • setup.cfg for test configuration
  • runtime.txt for Heroku
  • Makefile: commands to spin up Docker Compose and the Docker containers, run the linter and other tests, etc.
  • files associated with Warehouse’s front end, e.g., Gulpfile.babel.js

Directories within the repository:

  • bin/ - high-level scripts for Docker, Travis, and Makefile commands
  • dev/ - assets for developer environment
  • tests/ - tests
  • warehouse/ - code in modules
    • legacy/ - most of the read-only APIs are implemented here
    • forklift/ - Upload API
    • accounts/ - user accounts
    • admin/ - application-administrator-specific functionality
    • cache/ - caching
    • classifiers/ - trove classifiers
    • cli/ - entry scripts and the interactive shell
    • i18n/ - internationalization
    • locales/ - internationalization
    • manage/ - logged-in user functionality (i.e., manage account & owned/maintained projects)
    • migrations/ - changes to the database schema
    • packaging/ - models
    • rate_limiting/ - rate limiting to prevent abuse
    • rss/ - RSS feeds (see Feeds)
    • sitemap/ - site maps
    • utils/ - various utilities Warehouse uses