Project Structure
A well-organized project is easier to navigate, test, and maintain. Python gives you flexible tools for structuring code — from single scripts to large packages with dozens of modules. This guide covers the patterns that work best at every scale.
Single Script vs Module vs Package
Python code can be organized at three levels of complexity:
Single Script
A standalone .py file. Perfect for small utilities, automation tasks, and scripts under a few hundred lines:
backup_files.py
Module
A single .py file that's designed to be imported by other code. The difference from a script is intent — a module exposes functions and classes for reuse:
utils.py # import utils
validators.py # import validators
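As a sketch, a minimal importable module might look like this (the file name matches the example above; the slugify function and its behavior are invented for illustration):

```python
# utils.py — a hypothetical module meant to be imported, not run
def slugify(text: str) -> str:
    """Lowercase text and replace spaces with hyphens."""
    return text.strip().lower().replace(" ", "-")
```

Other code can then do `import utils` and call `utils.slugify("Hello World")` to get `"hello-world"`.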
Package
A directory containing an __init__.py file and one or more modules. Packages let you group related functionality and create namespaces:
mypackage/
    __init__.py
    models.py
    views.py
    utils.py
__init__.py and What It Does
The __init__.py file serves several purposes:
- Marks a directory as a Python package — without it (in older Python), the directory isn't importable
- Runs initialization code when the package is first imported
- Controls the public API by defining what gets exported
# mypackage/__init__.py
# Import key items so users can do: from mypackage import User
from mypackage.models import User, Admin
from mypackage.utils import validate_email
# Define what "from mypackage import *" exports
__all__ = ["User", "Admin", "validate_email"]
# Package-level constants
__version__ = "1.0.0"
Empty __init__.py
For simple packages, an empty __init__.py is fine. It just marks the directory as a package:
# mypackage/__init__.py
# (empty file)
Users then import directly from submodules:
from mypackage.models import User
from mypackage.utils import validate_email
Note: Since Python 3.3, "namespace packages" allow packages without __init__.py. But an explicit __init__.py is still recommended for clarity.
The src/ Layout
The src/ layout is the recommended structure for installable Python packages. It prevents a common bug where tests accidentally import the local source instead of the installed package:
my-project/
    src/
        mypackage/
            __init__.py
            models.py
            services.py
            utils.py
    tests/
        test_models.py
        test_services.py
        test_utils.py
    pyproject.toml
    README.md
Why src/?
Without src/, when you run tests from the project root, Python might import the local mypackage/ directory instead of the installed version. The src/ directory prevents this because src/ isn't automatically on sys.path.
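One quick way to see which copy of a package Python actually imported is to check the module's __file__ attribute. A small sketch, using the stdlib json module as a stand-in (with your own project you would import mypackage instead):

```python
# Inspect where Python loaded a module from, and what it searched.
import json
import sys

print(json.__file__)   # path to the file Python actually loaded
print(sys.path[:3])    # earliest entries win the import search
```

If __file__ points into your working tree rather than site-packages, you are testing the local copy, which is exactly the situation the src/ layout prevents.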
Flat Layout (Alternative)
Some projects skip src/ and put the package at the root. This is simpler but has the import pitfall described above:
my-project/
    mypackage/
        __init__.py
        models.py
    tests/
        test_models.py
    pyproject.toml
Recommendation: Use src/ for libraries and packages you'll distribute. The flat layout is fine for applications that won't be installed as packages.
Tests Folder Structure
Mirror your source structure in your tests directory. This makes it easy to find the test for any module:
src/
    mypackage/
        __init__.py
        models/
            __init__.py
            user.py
            product.py
        services/
            __init__.py
            auth.py
            payment.py
        utils.py
tests/
    __init__.py
    models/
        __init__.py
        test_user.py
        test_product.py
    services/
        __init__.py
        test_auth.py
        test_payment.py
    test_utils.py
    conftest.py          # Shared pytest fixtures
Key conventions:
- Test files are named test_<module>.py (pytest discovers them automatically)
- Test classes are named Test<ClassName>
- Test functions are named test_<behavior>
- conftest.py holds shared fixtures and test configuration
# tests/models/test_user.py
from mypackage.models.user import User

class TestUser:
    def test_create_user(self):
        user = User("Alice", "alice@example.com")
        assert user.name == "Alice"

    def test_user_display_name(self):
        user = User("alice bob", "alice@example.com")
        assert user.display_name() == "Alice Bob"
pyproject.toml Basics
pyproject.toml is the modern standard for Python project configuration. It replaces setup.py, setup.cfg, and various tool-specific config files:
[project]
name = "mypackage"
version = "1.0.0"
description = "A short description of my package"
readme = "README.md"
requires-python = ">=3.10"
license = {text = "MIT"}
authors = [
{name = "Your Name", email = "you@example.com"},
]
dependencies = [
"requests>=2.28",
"pydantic>=2.0",
]
[project.optional-dependencies]
dev = [
"pytest>=7.0",
"mypy>=1.0",
"black>=23.0",
]
[build-system]
requires = ["setuptools>=68.0"]
build-backend = "setuptools.build_meta"
[tool.setuptools.packages.find]
where = ["src"]
# Tool configuration
[tool.black]
line-length = 88
[tool.mypy]
strict = true
[tool.pytest.ini_options]
testpaths = ["tests"]
Key Sections
| Section | Purpose |
|---|---|
| [project] | Package metadata, dependencies |
| [build-system] | How to build the package |
| [tool.*] | Configuration for tools (Black, mypy, pytest, etc.) |
| [project.scripts] | CLI entry points |
| [project.optional-dependencies] | Extra dependency groups |
Common Project Layouts
CLI Application
weather-cli/
    src/
        weather/
            __init__.py
            cli.py           # Click/argparse entry point
            api.py           # External API client
            models.py        # Data models
            formatters.py    # Output formatting
            config.py        # Configuration handling
    tests/
        test_cli.py
        test_api.py
        test_formatters.py
    pyproject.toml
    README.md
# In pyproject.toml
[project.scripts]
weather = "weather.cli:main"
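A minimal sketch of what the main function behind that entry point might look like (argparse-based; the city argument and the printed message are invented for illustration):

```python
# cli.py — hypothetical entry point wired up via [project.scripts]
import argparse

def main(argv=None):
    """Runs as `weather <city>` once the package is installed."""
    parser = argparse.ArgumentParser(prog="weather")
    parser.add_argument("city", help="City to look up")
    args = parser.parse_args(argv)
    # A real implementation would call into api.py here
    print(f"Fetching weather for {args.city}...")
    return 0
```

Because the console script calls main() directly, accepting an optional argv parameter also makes the function easy to test without touching sys.argv.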
Library / Reusable Package
python-slugify/
    src/
        slugify/
            __init__.py      # Public API exports
            core.py          # Main slugification logic
            special.py       # Language-specific rules
            unicode_data.py  # Unicode lookup tables
    tests/
        test_core.py
        test_special.py
    docs/
        index.md
        api.md
    pyproject.toml
    README.md
    LICENSE
Web Application (Flask/FastAPI)
mywebapp/
    src/
        app/
            __init__.py      # App factory
            models/
                __init__.py
                user.py
                post.py
            routes/
                __init__.py
                auth.py
                api.py
            services/
                __init__.py
                email.py
                storage.py
            templates/
                base.html
                index.html
            static/
                style.css
            config.py
    tests/
        conftest.py
        test_auth.py
        test_api.py
    migrations/
        001_initial.sql
    pyproject.toml
    README.md
Organizing Class Hierarchies Across Files
When a project uses inheritance and polymorphism heavily, how you organize classes across files has a significant impact on readability and maintainability.
Where to Put Base Classes
Base classes and abstract classes should live in their own module, separate from their implementations. This prevents circular imports and makes the hierarchy clear:
src/
    mypackage/
        base.py              # Abstract base classes
        mixins.py            # Reusable mixin classes
        implementations/
            __init__.py
            csv_processor.py
            json_processor.py
            xml_processor.py
# base.py
from abc import ABC, abstractmethod

class BaseProcessor(ABC):
    """Abstract base for all data processors."""

    @abstractmethod
    def parse(self, raw_data: str) -> dict:
        pass

    @abstractmethod
    def validate(self, data: dict) -> bool:
        pass

    def process(self, raw_data: str) -> dict:
        """Template method: parse, validate, then return."""
        data = self.parse(raw_data)
        if not self.validate(data):
            raise ValueError("Validation failed")
        return data
# implementations/json_processor.py
import json

from mypackage.base import BaseProcessor

class JsonProcessor(BaseProcessor):
    """Process JSON data files."""

    def parse(self, raw_data: str) -> dict:
        return json.loads(raw_data)

    def validate(self, data: dict) -> bool:
        return "id" in data and "name" in data
Where to Put Mixins
Mixins — small classes designed for multiple inheritance — belong in a dedicated mixins.py module or a mixins/ package:
# mixins.py
import logging

class LoggingMixin:
    """Add structured logging to any class."""

    def get_logger(self):
        return logging.getLogger(self.__class__.__name__)

    def log_info(self, message):
        self.get_logger().info(message)

class CachingMixin:
    """Add simple in-memory caching."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._cache = {}

    def cache_get(self, key):
        return self._cache.get(key)

    def cache_set(self, key, value):
        self._cache[key] = value
Then combine them in implementation classes:
# implementations/cached_json_processor.py
import json

from mypackage.base import BaseProcessor
from mypackage.mixins import LoggingMixin, CachingMixin

class CachedJsonProcessor(LoggingMixin, CachingMixin, BaseProcessor):
    """JSON processor with logging and caching.

    MRO: CachedJsonProcessor -> LoggingMixin -> CachingMixin
         -> BaseProcessor -> ABC -> object
    """

    def parse(self, raw_data):
        self.log_info(f"Parsing {len(raw_data)} bytes")
        cached = self.cache_get(raw_data)
        if cached is not None:
            self.log_info("Cache hit")
            return cached
        result = json.loads(raw_data)
        self.cache_set(raw_data, result)
        return result

    def validate(self, data):
        # Must override the abstract method, or the class stays abstract
        return "id" in data and "name" in data
Deep Inheritance vs Composition
A common question is whether to use inheritance or composition. Both have trade-offs:
Deep inheritance (A -> B -> C -> D) creates tight coupling. Changes to A can break B, C, and D. The MRO becomes harder to reason about with each level, especially with multiple inheritance.
Composition (A has a B, C, D) is more flexible. Each component can be swapped independently.
# Deep inheritance — tightly coupled
class Animal:
    pass

class Mammal(Animal):
    pass

class DomesticMammal(Mammal):
    pass

class Dog(DomesticMammal):  # 4 levels deep — hard to change
    pass

# Composition — loosely coupled
class Dog:
    def __init__(self):
        self.movement = WalkingMovement()
        self.sound = BarkingSound()
        self.diet = OmnivoreDiet()
General guidance:
- Prefer composition when objects have a "has-a" relationship
- Use inheritance when objects have a clear "is-a" relationship
- Keep inheritance hierarchies shallow (2-3 levels max)
- Use mixins for cross-cutting concerns (logging, caching, serialization)
- Use ABCs to define interfaces that multiple unrelated classes must implement
- Consider Protocols (structural subtyping) when you want duck typing with type safety
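As a sketch of the Protocol approach, using a hypothetical rendering interface (the class and method names are invented for illustration):

```python
from typing import Protocol

class SupportsRender(Protocol):
    """Structural interface: any class with a matching render() conforms."""
    def render(self) -> str: ...

class HtmlPage:
    # No inheritance needed — the matching method signature alone
    # makes this class conform to SupportsRender for type checkers
    def render(self) -> str:
        return "<html></html>"

def show(component: SupportsRender) -> str:
    # Accepts any object whose shape matches the Protocol
    return component.render()

print(show(HtmlPage()))  # <html></html>
```

This gives you duck typing that tools like mypy can still verify, without coupling implementations to a shared base class.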
Virtual Environments
Always use a virtual environment to isolate your project's dependencies from the system Python and from other projects:
# Create a virtual environment
python -m venv .venv
# Activate it
# On macOS/Linux:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Or with pyproject.toml:
pip install -e ".[dev]"
# Deactivate when done
deactivate
Your project's .gitignore should exclude the virtual environment directory:
# .gitignore
.venv/
__pycache__/
*.pyc
dist/
*.egg-info/
Tip: Modern tools like uv, pdm, and poetry manage virtual environments automatically. They create and activate environments as part of their workflow, so you don't have to manage them manually.
The if __name__ == "__main__" Pattern
This pattern lets a file work both as an importable module and a runnable script:
# my_module.py
def greet(name):
    return f"Hello, {name}!"

def main():
    """Entry point when run as a script."""
    print(greet("World"))

if __name__ == "__main__":
    main()
When you run python my_module.py, Python sets __name__ to "__main__", so main() executes. When you import my_module, __name__ is "my_module", so main() doesn't run.
Putting It All Together
Here's a complete, well-structured project skeleton:
my-awesome-project/
    src/
        mypackage/
            __init__.py      # Package API, version
            base.py          # Abstract base classes
            mixins.py        # Reusable mixins
            models/
                __init__.py
                user.py
                product.py
            services/
                __init__.py
                auth.py
                payment.py
            utils/
                __init__.py
                validation.py
                formatting.py
            config.py        # Settings and configuration
    tests/
        __init__.py
        conftest.py          # Shared fixtures
        models/
            test_user.py
            test_product.py
        services/
            test_auth.py
            test_payment.py
        utils/
            test_validation.py
    docs/
        index.md
    pyproject.toml
    README.md
    LICENSE
    .gitignore
Start simple — a single file is fine for small projects. As your code grows, split it into modules. When you have enough modules, organize them into packages. The structure should serve the code, not the other way around.