Modules in Python are files that contain reusable code such as functions, classes, and variables. Each module is an ordinary file whose name ends in .py. Instead of putting everything in one notebook or one script, you can organise related code into modules so your work becomes cleaner and easier to maintain. In data analysis, modules help you reuse logic across projects, keep notebooks simpler, and avoid copying the same code repeatedly.
A module can come from one of three places: the Python standard library, a third-party package, or your own code. Standard-library modules (often loosely called built-in modules) come with Python, such as math (for mathematical functions), datetime (for dates and times), and os (for interacting with the operating system, including files and folders). Third-party modules are installed separately, typically with a package manager such as pip or conda, and include tools like pandas, numpy, and matplotlib, which are widely used for data analysis.
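As a quick illustration of the modules that come with Python, the sketch below uses math, datetime, and os without installing anything. The date and the file path are made-up values chosen only for the example.

```python
import datetime
import math
import os

# math: mathematical functions and constants
root = math.sqrt(16)
print(root)

# datetime: dates and times (the date here is an arbitrary example value)
report_date = datetime.date(2024, 1, 31)
print(report_date.isoformat())

# os: operating-system helpers, including path handling
# ("data" and "sales.csv" are hypothetical names; no file is opened)
csv_path = os.path.join("data", "sales.csv")
file_name = os.path.basename(csv_path)
print(file_name)
```

Third-party modules such as pandas are used in the same way once they have been installed.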
To use a module, you import it. Importing runs the module once and makes its functions, classes, and variables available in your current program. For example, you might import pandas to work with tables, or import matplotlib.pyplot to create charts. You can import a module under its full name, or give it an alias, as in import pandas as pd, to make your code shorter and easier to type. If you do not need everything, you can also import only specific names from a module, as in from math import sqrt.
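The three import styles described above can be sketched side by side. This example uses the standard-library statistics module so that it runs without any installation; the numbers are arbitrary sample values.

```python
# 1. Import the whole module and refer to it by its full name.
import statistics

mean_full = statistics.mean([2, 4, 6])

# 2. Import the module under a shorter alias.
#    This is the same pattern as: import pandas as pd
import statistics as st

median_alias = st.median([1, 3, 5, 7])

# 3. Import only the specific functions you need.
from statistics import mean, median

mean_direct = mean([10, 20, 30])
median_direct = median([5, 1, 3])

print(mean_full, median_alias, mean_direct, median_direct)
```

With the first two styles, every call shows which module it comes from. The third style is shorter, but it can hide where a function was defined, so it is usually best kept to a handful of names.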
Modules improve clarity. For example, you can create one module that contains all your data cleaning functions, and another module for visualisation functions. Then, in each new analysis, you import only what you need. This also helps with consistency, because the same logic is applied across different datasets and reports.
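A minimal sketch of what such a cleaning module might look like is shown here. The file name cleaning.py and the function names drop_missing, strip_whitespace, and clean_names are hypothetical, chosen only for illustration, and the functions work on plain Python lists to keep the example free of third-party dependencies.

```python
# cleaning.py (hypothetical file name for a module of data cleaning helpers)


def drop_missing(values):
    """Return a new list without None entries."""
    return [v for v in values if v is not None]


def strip_whitespace(values):
    """Return a new list with surrounding whitespace removed from each string."""
    return [v.strip() for v in values]


def clean_names(values):
    """Drop None entries, strip whitespace, then drop strings that end up empty."""
    stripped = strip_whitespace(drop_missing(values))
    return [v for v in stripped if v]


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    raw = ["  Alice ", None, "Bob", "   "]
    print(clean_names(raw))
```

Assuming this file is saved as cleaning.py in the same folder as your notebook or script, a new analysis could reuse it with a single line such as from cleaning import clean_names, rather than copying the functions into every project.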
Learning modules is important because real-world data work often involves multiple files and repeated workflows. Once you understand how to import and organise code into modules, your projects become more structured, reusable, and professional.

