Venn Plot
This module provides functionality for creating Venn and Euler diagrams from pandas DataFrames.
It is designed to visualize relationships between sets, highlighting intersections and differences between them.
Core Features
- Supports 2-set and 3-set Diagrams: Visualizes two or three overlapping sets.
- Venn and Euler Diagrams: Uses Venn diagrams by default; switches to Euler diagrams when vary_size=True.
- Customizable Colors and Labels: Automatically assigns colors and labels for subset representation.
- Dynamic Sizing: Adjusts circle sizes for Euler diagrams to reflect proportions.
- Title and Source Attribution: Optionally adds a title and source text.
Use Cases
- Set Comparisons: Identify shared and unique elements across two or three sets.
- Proportional Representation: Euler diagrams size the circles in proportion to the subset sizes.
- Data Overlap Visualization: Helps in understanding relationships within categorical data.
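To illustrate what proportional sizing means for circles, here is a minimal sketch that derives a radius from a target area. It is not the module's implementation; the function name radius_for_area and the set sizes are hypothetical, chosen only to show that area (not radius) scales with set size.

```python
import math


def radius_for_area(area: float) -> float:
    """Return the radius of a circle with the given area (area = pi * r**2)."""
    if area < 0:
        raise ValueError("area must be non-negative")
    return math.sqrt(area / math.pi)


# Hypothetical set sizes (shares of a population), not taken from real data.
set_sizes = {"A": 0.60, "B": 0.15}
radii = {name: radius_for_area(size) for name, size in set_sizes.items()}

# A set four times as large gets a circle with twice the radius,
# because area grows with the square of the radius.
ratio = radii["A"] / radii["B"]
```

Scaling the radius directly by set size would exaggerate differences, which is why the square root appears here.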
Limitations and Warnings
- Only Supports 2 or 3 Sets: Does not extend to Venn diagrams with more than three sets.
- Pre-Aggregated Data Required: The module does not perform data aggregation; input data should already be structured correctly.
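Since aggregation is left to the caller, the sketch below shows one way the input rows might be prepared from raw set memberships. The tuple-of-flags encoding for 'groups' and the use of fractions (0 to 1) for 'percent' are assumptions, as this page does not specify either; the customer data is hypothetical.

```python
from collections import Counter

# Hypothetical raw data: which of two sets each customer belongs to.
memberships = {
    "c1": {"A"},
    "c2": {"A", "B"},
    "c3": {"B"},
    "c4": {"A", "B"},
}
labels = ["A", "B"]

# Encode each customer's membership as a tuple of 0/1 flags, one per label.
# This encoding is an assumption; check what the module actually expects.
counts = Counter(
    tuple(int(label in sets) for label in labels)
    for sets in memberships.values()
)

# Convert counts to shares of the total (assumed here to be fractions).
total = sum(counts.values())
rows = [
    {"groups": group, "percent": count / total}
    for group, count in sorted(counts.items())
]
```

The resulting list of dictionaries can be passed to pandas.DataFrame(rows) to obtain a frame with 'groups' and 'percent' columns.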
plot(df, labels, title=None, source_text=None, vary_size=False, figsize=None, ax=None, subset_label_formatter=None, **kwargs)
Plots a Venn or Euler diagram using subset sizes extracted from a DataFrame.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | DataFrame with 'groups' and 'percent' columns. | required |
labels | list[str] | Labels for the sets in the diagram. | required |
title | str | Title of the plot. Defaults to None. | None |
source_text | str | Source text for attribution. Defaults to None. | None |
vary_size | bool | Whether to vary circle size based on subset sizes. Defaults to False. | False |
figsize | tuple[int, int] | Size of the plot. Defaults to None. | None |
ax | Axes | Matplotlib axes object to plot on. Defaults to None. | None |
subset_label_formatter | callable | Function to format subset labels. Defaults to None. | None |
**kwargs | dict[str, any] | Additional keyword arguments. | {} |
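As an illustration of the subset_label_formatter parameter, the sketch below defines two candidate formatters. It assumes the callable receives a single numeric subset value and returns a string; this page does not state the exact signature, and both function names are hypothetical.

```python
def percent_formatter(value: float) -> str:
    """Format a fractional subset size as a percentage with one decimal place."""
    return f"{value:.1%}"


def count_formatter(value: float) -> str:
    """Format a subset size as a whole number with thousands separators."""
    return f"{value:,.0f}"
```

Under that assumption, either function could be supplied as subset_label_formatter=percent_formatter when calling plot.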
Returns:
Name | Type | Description |
---|---|---|
SubplotBase | SubplotBase | The matplotlib axes object with the plotted diagram. |
Raises:
Type | Description |
---|---|
ValueError | If the number of sets is not 2 or 3. |
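Callers who want to fail early can run a similar check before plotting. The sketch below assumes the number of sets is taken from the length of labels, which this page does not confirm; check_set_count is a hypothetical helper and not part of the module.

```python
def check_set_count(labels: list[str]) -> int:
    """Return the number of sets, raising ValueError unless it is 2 or 3."""
    n = len(labels)
    if n not in (2, 3):
        raise ValueError(f"Expected 2 or 3 sets, got {n}.")
    return n
```

Running this before building the DataFrame keeps the error close to the code that chose the labels.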