Line Plot
This module provides flexible functionality for creating line plots from pandas DataFrames.
It focuses on visualizing sequences that are ordered or sequential but not necessarily categorical, such as "days since an event" (e.g., -2, -1, 0, 1, 2) or "months since a competitor store opened." While datetime values on the x-axis can be plotted, this module is not intended for them. If a datetime or datetime-like column is passed as x_col, a warning is triggered suggesting the plots.time_line module, which has additional features that make working with datetimes easier, such as resampling the data to alternate time frames.
Core Features
- Plotting Sequences or Indexes: Plot one or more value columns (value_col) with support for sequences like -2, -1, 0, 1, 2 (e.g., months since an event), using either the index or a specified x-axis column (x_col).
- Custom X-Axis or Index: Use any column as the x-axis (x_col) or plot based on the index if no x-axis column is specified.
- Multiple Lines: Create separate lines for each unique value in group_col (e.g., categories or product types).
- Comprehensive Customization: Easily customize plot titles, axis labels, and legends, with the option to move the legend outside the plot.
- Pre-Aggregated Data: The data must be pre-aggregated before plotting, as no aggregation occurs within the module.
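Because no aggregation happens inside the module, that step has to be done before plotting. The following is a minimal sketch using pandas, assuming a hypothetical transaction table whose column names (days_since_event, category, revenue) are illustrative and not part of the module. The import path of the plot function is not shown in this page, so the final call is left as a comment.

```python
import pandas as pd

# Hypothetical raw transactions: several rows per (day, category) pair.
transactions = pd.DataFrame(
    {
        "days_since_event": [-1, -1, -1, 0, 0, 1, 1, 1],
        "category": ["A", "A", "B", "A", "B", "A", "B", "B"],
        "revenue": [10.0, 5.0, 7.0, 12.0, 8.0, 9.0, 4.0, 6.0],
    }
)

# Aggregate to exactly one row per x-value and group before plotting.
aggregated = transactions.groupby(
    ["days_since_event", "category"], as_index=False
)["revenue"].sum()

# The aggregated frame could then be handed to the module's plot function, e.g.
# plot(df=aggregated, value_col="revenue", x_col="days_since_event", group_col="category")
```

After this step each (days_since_event, category) pair appears once, which is the shape a line per group needs.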
Use Cases
- Daily Trends: Plot trends such as daily revenue or user activity, for example, tracking revenue since the start of the year.
- Event Impact: Visualize how metrics (e.g., revenue, sales, or traffic) change before and after an important event, such as a competitor store opening or a product launch.
- Category Comparison: Compare metrics across multiple categories over time, for example, tracking total revenue for the top categories before and after an event like the introduction of a new competitor.
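For the event-impact use case, calendar dates usually need to be converted into a relative sequence first. The sketch below shows one way this might be done with pandas; the column names and the event date are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical daily revenue around a competitor store opening.
daily = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2024-03-08", "2024-03-09", "2024-03-10", "2024-03-11", "2024-03-12"]
        ),
        "revenue": [100.0, 110.0, 90.0, 80.0, 85.0],
    }
)
event_date = pd.Timestamp("2024-03-10")  # assumed event date

# Convert calendar dates to a signed integer offset from the event.
daily["days_since_event"] = (daily["date"] - event_date).dt.days

# Keep the relative sequence and the metric; the datetime column is dropped
# so that the x-axis is a plain integer sequence.
event_window = daily[["days_since_event", "revenue"]]
```

The resulting days_since_event column is an integer sequence centered on zero, suitable for use as x_col.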
Limitations and Handling of Temporal Data
- Limited Handling of Temporal Data: This module can plot simple time-based sequences, such as "days since an event," but it cannot manipulate or directly handle datetime or date-like columns. It is not optimized for actual datetime values. If a datetime column is passed or more complex temporal plotting is needed, a warning will suggest using the plots.time_line module, which is specifically designed for working with temporal data and performing time-based manipulation.
- Pre-Aggregated Data Required: The module does not perform any data aggregation, so all data must be pre-aggregated before being passed in for plotting.
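To illustrate the kind of check that could produce such a warning, here is a hedged sketch. It is not the module's actual implementation; warn_if_datetime is a hypothetical helper, and the warning text is invented for the example.

```python
import warnings
from typing import Optional

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype


def warn_if_datetime(df: pd.DataFrame, x_col: Optional[str] = None) -> bool:
    """Hypothetical helper: warn when the x-axis data is datetime-like."""
    # Fall back to the index when no x-axis column is given.
    x_values = df.index if x_col is None else df[x_col]
    if is_datetime64_any_dtype(x_values):
        warnings.warn(
            "x-axis data is datetime-like; consider the plots.time_line module.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False


# Example frames (illustrative only).
dated = pd.DataFrame(
    {"date": pd.to_datetime(["2024-01-01", "2024-01-02"]), "revenue": [1.0, 2.0]}
)
relative = pd.DataFrame({"days_since_event": [-1, 0], "revenue": [1.0, 2.0]})
```

A check like this lets a caller decide early whether to convert the dates to a relative sequence or switch to a time-oriented plot.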
plot(df, value_col, x_label=None, y_label=None, title=None, x_col=None, group_col=None, ax=None, source_text=None, legend_title=None, move_legend_outside=False, **kwargs)
Plots the value_col over the specified x_col or index, creating a separate line for each unique value in group_col.
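One way to picture "a separate line for each unique value in group_col" is as a reshape from long to wide format, where each group becomes its own column. The sketch below shows that reshape with pandas; it is an assumption about how the behavior can be reproduced, not a statement of how the module is implemented, and the column names are illustrative.

```python
import pandas as pd

# Hypothetical pre-aggregated data in long format: one row per (x, group).
long_df = pd.DataFrame(
    {
        "months_since_opening": [-1, -1, 0, 0, 1, 1],
        "category": ["A", "B", "A", "B", "A", "B"],
        "revenue": [15.0, 7.0, 12.0, 8.0, 9.0, 10.0],
    }
)

# Reshape so the x-values form the index and each group is a column.
wide_df = long_df.pivot(
    index="months_since_opening", columns="category", values="revenue"
)

# Calling wide_df.plot() (pandas' plotting API, requires matplotlib) would
# then draw one line per column, i.e. one line per category.
```

The pivot raises an error if any (x, group) pair occurs more than once, which is another reason the data must be aggregated first.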
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | The dataframe to plot. | required |
value_col | str or list of str | The column(s) to plot. | required |
x_label | str | The x-axis label. | None |
y_label | str | The y-axis label. | None |
title | str | The title of the plot. | None |
x_col | str | The column to be used as the x-axis. If None, the index is used. | None |
group_col | str | The column used to define different lines. | None |
legend_title | str | The title of the legend. | None |
ax | Axes | Matplotlib axes object to plot on. | None |
source_text | str | The source text to add to the plot. | None |
move_legend_outside | bool | Move the legend outside the plot. | False |
**kwargs | dict[str, any] | Additional keyword arguments for Pandas' | {} |
Returns:
Name | Type | Description |
---|---|---|
SubplotBase | SubplotBase | The matplotlib axes object. |
Raises:
Type | Description |
---|---|
ValueError | If |