Table¶

The Table class is used to represent and work on a table or dataframe as a list of dict. It derives from the pytups.Tuplist class and provides powerful data manipulation capabilities similar to pandas DataFrames but with a more lightweight and flexible approach.

Key Features¶

Lightweight data structure based on Python dictionaries
Powerful data manipulation methods inspired by R’s dplyr
Easy conversion to/from JSON format
Integration with pandas DataFrames
Support for various join operations
Flexible data reshaping capabilities

Basic Example¶

For example the following table:

a	b
1	False
2	True
3	False

will be represented as:

[{"a": 1, "b": False}, {"a": 2, "b": True}, {"a": 3, "b": False}]

Common Operations¶

Data Manipulation¶

The Table class provides various methods for data manipulation:

mutate(): Add or modify columns
select(): Choose specific columns
drop(): Remove columns
filter(): Filter rows based on conditions
group_by(): Group data by columns
summarise(): Aggregate data after grouping

# Example of data manipulation
table = Table([{"a": 1, "b": 2}, {"a": 3, "b": 4}])

# Add a new column 'c' as sum of 'a' and 'b'
result = table.mutate(c=lambda v: v["a"] + v["b"])
# Result: [{"a": 1, "b": 2, "c": 3}, {"a": 3, "b": 4, "c": 7}]

Join Operations¶

The class supports various types of joins similar to SQL:

left_join(): Keep all rows from the left table
right_join(): Keep all rows from the right table
inner_join(): Keep only matching rows
full_join(): Keep all rows from both tables
auto_join(): Join a table with itself

# Example of join operation
table1 = Table([{"id": 1, "value": "A"}, {"id": 2, "value": "B"}])
table2 = Table([{"id": 1, "data": 100}, {"id": 3, "data": 300}])

result = table1.left_join(table2, by="id")
# Result: [{"id": 1, "value": "A", "data": 100}, {"id": 2, "value": "B", "data": None}]

Data Reshaping¶

Support for reshaping data between wide and long formats:

pivot_longer(): Convert wide format to long format
pivot_wider(): Convert long format to wide format

# Example of pivot operations
table = Table([{"id": 1, "x": 10, "y": 20}, {"id": 2, "x": 30, "y": 40}])

long = table.pivot_longer(["x", "y"], names_to="variable", value_to="value")
# Result: [{"id": 1, "variable": "x", "value": 10}, {"id": 1, "variable": "y", "value": 20},
#          {"id": 2, "variable": "x", "value": 30}, {"id": 2, "variable": "y", "value": 40}]

Import/Export¶

The Table class supports various formats:

JSON files (to_json/from_json)
Excel files (to_excel/from_excel)
CSV files (to_csv/from_csv)
Pandas DataFrames (to_pandas/from_pandas)

For more information on the underlying Tuplist class, see the pytups documentation.

Table class API¶

class mango.table.Table(*args: Any, **kwargs: Any)¶

Bases: TupList

Enhanced table class extending TupList with data manipulation capabilities.

A Table is a list of dictionaries that provides a rich set of methods for data manipulation, filtering, joining, and transformation operations. Inspired by R’s dplyr and data.table packages.

Parameters:

data (list[dict], optional) – Initial data as a list of dictionaries or None for empty table
check (bool) – Whether to validate that all rows are dictionaries

Example:

>>> # Create empty table
>>> table = Table()
>>>
>>> # Create table from list of dicts
>>> data = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]
>>> table = Table(data)
>>> print(table)
Table (2 rows, 2 columns):
0 {'name': 'Alice', 'age': 30}
1 {'name': 'Bob', 'age': 25}

vapply(func: Callable, *args, **kwargs) → Table¶

Apply a function to each element (row) of the table.

Maps the given function to each row in the table, returning a new table with the transformed values.

Parameters:

func (Callable) – Function to apply to each row
args – Additional positional arguments for the function
kwargs – Additional keyword arguments for the function

Returns:

New table with transformed values

Return type:

Table

Example:

>>> table = Table([{"x": 1, "y": 2}, {"x": 3, "y": 4}])
>>> result = table.vapply(lambda row: {"sum": row["x"] + row["y"]})
>>> print(result)
[{'sum': 3}, {'sum': 7}]

kapply(func, *args, **kwargs) → Table¶

Apply a function to each key (column name) of the table.

Maps the given function to each column name in the table, returning a new table with transformed column names.

Parameters:

func (Callable) – Function to apply to each column name
args – Additional positional arguments for the function
kwargs – Additional keyword arguments for the function

Returns:

New table with transformed column names

Return type:

Table

Example:

>>> table = Table([{"name": "Alice", "age": 30}])
>>> result = table.kapply(lambda key: key.upper())
>>> print(result)
[{'NAME': 'Alice', 'AGE': 30}]

kvapply(func, *args, **kwargs) → Table¶

Apply a function to each element with its index.

Maps the given function to each row in the table along with its index, returning a new table with transformed values.

Parameters:

func (Callable) – Function to apply to each (index, row) pair
args – Additional positional arguments for the function
kwargs – Additional keyword arguments for the function

Returns:

New table with transformed values

Return type:

Table

Example:

>>> table = Table([{"x": 1}, {"x": 2}])
>>> result = table.kvapply(lambda idx, row: {"index": idx, "value": row["x"]})
>>> print(result)
[{'index': 0, 'value': 1}, {'index': 1, 'value': 2}]

copy_shallow() → Table¶

Create a shallow copy of the table.

Creates a new table with the same structure but shares references to the same row objects. Changes to row contents will affect both tables.

Returns:: New table with shallow copy of the data
Return type:: Table

Example:

>>> table = Table([{"name": "Alice", "age": 30}])
>>> copy = table.copy_shallow()
>>> copy[0]["age"] = 31
>>> print(table[0]["age"])  # Also changed to 31
31

copy_deep() → Table¶

Create a deep copy of the table.

Creates a new table with completely independent copies of all data. Changes to the original table will not affect the copy.

Returns:: New table with deep copy of the data
Return type:: Table

Example:

>>> table = Table([{"name": "Alice", "age": 30}])
>>> copy = table.copy_deep()
>>> copy[0]["age"] = 31
>>> print(table[0]["age"])  # Still 30
30

vfilter(function: Callable) → Table¶

Filter rows based on a condition (internal method).

Keeps only the rows for which the provided function returns True. This is the internal implementation used by the public filter method.

Parameters:: function (Callable[[dict], bool]) – Function that takes a row (dict) and returns a boolean
Returns:: New table containing only rows that satisfy the condition
Return type:: Table

unique() → Table¶

Remove duplicate rows from the table.

Eliminates rows with identical values across all columns. When duplicates are found, the first occurrence is kept.

Returns:: New table with duplicate rows removed
Return type:: Table
Raises:: NotImplementedError – If table structure is not compatible

Example:

>>> table = Table([
...     {"name": "Alice", "age": 30},
...     {"name": "Bob", "age": 25},
...     {"name": "Alice", "age": 30}  # Duplicate
... ])
>>> result = table.unique()
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]

unique2() → Table¶

Remove duplicate rows from the table (alternative implementation).

Alternative implementation of unique() that eliminates rows with identical values across all columns. When duplicates are found, the first occurrence is kept.

Returns:: New table with duplicate rows removed
Return type:: Table
Raises:: NotImplementedError – If table structure is not compatible

Example:

>>> table = Table([
...     {"name": "Alice", "age": 30},
...     {"name": "Bob", "age": 25},
...     {"name": "Alice", "age": 30}  # Duplicate
... ])
>>> result = table.unique2()
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]

sorted(**kwargs) → Table¶

Sort the table according to a custom key function.

Sorts the table using a custom key function. This method is limited to simple data types and may not work with complex table structures.

Parameters:: kwargs (dict) – Arguments for the sorted function
Returns:: New table with sorted rows
Return type:: Table
Raises:: NotImplementedError – If table structure is not compatible

Example:

>>> table = Table([{"value": 3}, {"value": 1}, {"value": 2}])
>>> result = table.sorted(key=lambda x: x["value"])
>>> print(result)
[{'value': 1}, {'value': 2}, {'value': 3}]

take(*args, use_numpy=False) → pytups.TupList¶

Extract values from specified columns of the table.

Returns a TupList containing tuples of values from the specified columns. Each tuple represents the values from one row for the selected columns.

Parameters:

args (str) – Names of columns to extract
use_numpy (bool) – Whether to use numpy methods for extraction

Returns:

TupList of tuples containing column values

Return type:

TupList

Example:

>>> table = Table([{"a": 1, "b": 2}, {"a": 3, "b": 4}])
>>> result = table.take("a", "b")
>>> print(result)
[(1, 2), (3, 4)]

show_rows(n1, n2=None) → str¶

Display specific rows of the table as a formatted string.

Shows the specified row or range of rows with their indices. Used internally by the string representation of the table.

Parameters:

n1 (int) – Starting row index
n2 (int, optional) – Ending row index (if None, shows only row n1)

Returns:

Formatted string showing the specified rows

Return type:

str

Example:

>>> table = Table([{"name": "Alice"}, {"name": "Bob"}, {"name": "Charlie"}])
>>> result = table.show_rows(0, 2)
>>> print(result)
0 {'name': 'Alice'}
1 {'name': 'Bob'}

head(n=10) → Table¶

Return the first n rows of the table.

Creates a new table containing only the first n rows. If the table has fewer than n rows, returns the entire table.

Parameters:: n (int) – Number of rows to return (default: 10)
Returns:: New table with the first n rows
Return type:: Table

Example:

>>> table = Table([
...     {"name": "Alice"}, {"name": "Bob"}, {"name": "Charlie"},
...     {"name": "David"}, {"name": "Eve"}
... ])
>>> result = table.head(3)
>>> print(result)
[{'name': 'Alice'}, {'name': 'Bob'}, {'name': 'Charlie'}]

peek(n=3, name=None) → Table¶

Display a preview of the table showing first, middle, and last rows.

Shows the first n rows, middle n rows, and last n rows of the table. Useful for getting an overview of large tables without printing everything.

Parameters:

n (int) – Number of rows to show in each section (default: 3)
name (str, optional) – Optional name or message to display with the table

Returns:

The original table (unchanged)

Return type:

Table

Example:

>>> table = Table([{"id": i, "value": i*2} for i in range(10)])
>>> table.peek(2, "Sample Data")
Sample Data: Table (10 rows, 2 columns):
0 {'id': 0, 'value': 0}
1 {'id': 1, 'value': 2}
...
4 {'id': 4, 'value': 8}
5 {'id': 5, 'value': 10}
...
8 {'id': 8, 'value': 16}
9 {'id': 9, 'value': 18}

mutate(**kwargs) → Table¶

Add or modify columns in the table.

Creates new columns or modifies existing ones using various methods: - Single values applied to all rows - Lists of values for each row - Functions that operate on row data

Note: All changes are applied to the original table structure and do not take into account other changes made in the same call.

Parameters:: kwargs (dict) – Named arguments with column names and their values
Returns:: New table with modified columns
Return type:: Table

Example:

>>> table = Table([{"x": 1, "y": 2}, {"x": 3, "y": 4}])
>>> result = table.mutate(
...     z=10,  # Constant value
...     sum=lambda row: row["x"] + row["y"],  # Function
...     product=[2, 12]  # List of values
... )
>>> print(result)
[{'x': 1, 'y': 2, 'z': 10, 'sum': 3, 'product': 2},
 {'x': 3, 'y': 4, 'z': 10, 'sum': 7, 'product': 12}]

group_by(col) → pytups.SuperDict¶

Group rows of the table by specified column values.

Groups the table rows based on the values in the specified column(s). Returns a SuperDict where keys are the unique values and values are lists of rows that have that value.

Parameters:: col (str or list[str]) – Column name or list of column names to group by
Returns:: SuperDict with grouped data
Return type:: SuperDict

Example:

>>> table = Table([
...     {"name": "Alice", "city": "Madrid"},
...     {"name": "Bob", "city": "Barcelona"},
...     {"name": "Charlie", "city": "Madrid"}
... ])
>>> result = table.group_by("city")
>>> print(result)
{'Madrid': [{'name': 'Alice', 'city': 'Madrid'}, {'name': 'Charlie', 'city': 'Madrid'}],
 'Barcelona': [{'name': 'Bob', 'city': 'Barcelona'}]}

group_mutate(group_by, **func) → Table¶

Group by specified columns and apply functions to other columns.

Groups the table by the specified columns and applies aggregation functions to the remaining columns. Similar to SQL GROUP BY with aggregate functions.

Parameters:

group_by (str or list[str]) – Column name or list of column names to group by
func (dict[str, Callable]) – Functions to apply to columns (e.g., a=sum, b=mean)

Returns:

New table with grouped and aggregated data

Return type:

Table

Example:

>>> table = Table([
...     {"city": "Madrid", "sales": 100},
...     {"city": "Madrid", "sales": 150},
...     {"city": "Barcelona", "sales": 200}
... ])
>>> result = table.group_mutate("city", sales=sum)
>>> print(result)
[{'city': 'Madrid', 'sales': 250}, {'city': 'Barcelona', 'sales': 200}]

sum_all(group_by=None) → Table¶

Group by specified columns and sum all numeric columns.

Groups the table by the specified columns and sums all numeric columns in each group. Non-numeric columns are ignored.

Parameters:: group_by (str or list[str], optional) – Column name or list of column names to group by
Returns:: New table with grouped and summed data
Return type:: Table

Example:

>>> table = Table([
...     {"category": "A", "value1": 10, "value2": 20},
...     {"category": "A", "value1": 15, "value2": 25},
...     {"category": "B", "value1": 5, "value2": 10}
... ])
>>> result = table.sum_all("category")
>>> print(result)
[{'category': 'A', 'value1': 25, 'value2': 45},
 {'category': 'B', 'value1': 5, 'value2': 10}]

summarise(group_by=None, default: Callable = None, **func) → Table¶

Group by specified columns and apply aggregation functions.

Groups the table by specified columns and applies custom aggregation functions to other columns. More flexible than group_mutate as it allows specifying a default function for non-explicitly handled columns.

Parameters:

group_by (str or list[str], optional) – Column name or list of column names to group by
default (Callable, optional) – Default function to apply to columns not explicitly specified
func (dict[str, Callable]) – Functions to apply to specific columns

Returns:

New table with grouped and summarized data

Return type:

Table

Example:

>>> table = Table([
...     {"category": "A", "value": 10, "count": 1},
...     {"category": "A", "value": 15, "count": 2},
...     {"category": "B", "value": 5, "count": 1}
... ])
>>> result = table.summarise("category", value=sum, count=sum)
>>> print(result)
[{'category': 'A', 'value': 25, 'count': 3},
 {'category': 'B', 'value': 5, 'count': 1}]

join(table2, by=None, suffix=None, jtype='full', empty=None, if_empty_table_1=None, if_empty_table_2=None) → Table¶

Join two tables using various join types.

Performs table joins inspired by R’s dplyr join functions. Supports different join types and flexible column matching strategies.

Parameters:

table2 (Table or list[dict]) – Second table to join with
by (list, dict, or None) – Column specification for joining
suffix (list[str], optional) – Suffixes for disambiguating column names
jtype (str) – Type of join (“full”, “left”, “right”, “inner”)
empty – Value to use for empty cells created by the join
if_empty_table_1 – Replacement if table 1 is empty
if_empty_table_2 – Replacement if table 2 is empty

Returns:

New table containing the joined data

Return type:

Table

Example:

>>> table1 = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> table2 = Table([{"id": 1, "age": 30}, {"id": 3, "age": 25}])
>>> result = table1.join(table2, by="id", jtype="left")
>>> print(result)
[{'id': 1, 'name': 'Alice', 'age': 30}, {'id': 2, 'name': 'Bob', 'age': None}]

left_join(table2, by=None, suffix=None, empty=None, if_empty_table_1=None, if_empty_table_2=None) → Table¶

Perform a left join with another table.

Returns all rows from the left table (self) and matching rows from the right table (table2). Rows from the left table without matches will have None values for columns from the right table.

Parameters:

table2 (Table or list[dict]) – Second table to join with
by (list, dict, or None) – Column specification for joining
suffix (list[str], optional) – Suffixes for disambiguating column names
empty – Value to use for empty cells created by the join
if_empty_table_1 – Replacement if table 1 is empty
if_empty_table_2 – Replacement if table 2 is empty

Returns:

New table containing the left join result

Return type:

Table

Example:

>>> table1 = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> table2 = Table([{"id": 1, "age": 30}, {"id": 3, "age": 25}])
>>> result = table1.left_join(table2, by="id")
>>> print(result)
[{'id': 1, 'name': 'Alice', 'age': 30}, {'id': 2, 'name': 'Bob', 'age': None}]

right_join(table2, by=None, suffix=None, empty=None, if_empty_table_1=None, if_empty_table_2=None) → Table¶

Perform a right join with another table.

Returns all rows from the right table (table2) and matching rows from the left table (self). Rows from the right table without matches will have None values for columns from the left table.

Parameters:

table2 (Table or list[dict]) – Second table to join with
by (list, dict, or None) – Column specification for joining
suffix (list[str], optional) – Suffixes for disambiguating column names
empty – Value to use for empty cells created by the join
if_empty_table_1 – Replacement if table 1 is empty
if_empty_table_2 – Replacement if table 2 is empty

Returns:

New table containing the right join result

Return type:

Table

Example:

>>> table1 = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> table2 = Table([{"id": 1, "age": 30}, {"id": 3, "age": 25}])
>>> result = table1.right_join(table2, by="id")
>>> print(result)
[{'id': 1, 'name': 'Alice', 'age': 30}, {'id': 3, 'name': None, 'age': 25}]

full_join(table2, by=None, suffix=None, empty=None, if_empty_table_1=None, if_empty_table_2=None) → Table¶

Perform a full outer join with another table.

Returns all rows from both tables, with None values where there are no matches. This is the default join type and combines left and right joins.

Parameters:

table2 (Table or list[dict]) – Second table to join with
by (list, dict, or None) – Column specification for joining
suffix (list[str], optional) – Suffixes for disambiguating column names
empty – Value to use for empty cells created by the join
if_empty_table_1 – Replacement if table 1 is empty
if_empty_table_2 – Replacement if table 2 is empty

Returns:

New table containing the full join result

Return type:

Table

Example:

>>> table1 = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> table2 = Table([{"id": 1, "age": 30}, {"id": 3, "age": 25}])
>>> result = table1.full_join(table2, by="id")
>>> print(result)
[{'id': 1, 'name': 'Alice', 'age': 30},
 {'id': 2, 'name': 'Bob', 'age': None},
 {'id': 3, 'name': None, 'age': 25}]

inner_join(table2, by=None, suffix=None, empty=None, if_empty_table_1=None, if_empty_table_2=None) → Table¶

Perform an inner join with another table.

Returns only rows that have matching values in both tables. Rows without matches in either table are excluded from the result.

Parameters:

table2 (Table or list[dict]) – Second table to join with
by (list, dict, or None) – Column specification for joining
suffix (list[str], optional) – Suffixes for disambiguating column names
empty – Value to use for empty cells created by the join
if_empty_table_1 – Replacement if table 1 is empty
if_empty_table_2 – Replacement if table 2 is empty

Returns:

New table containing only matching rows

Return type:

Table

Example:

>>> table1 = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> table2 = Table([{"id": 1, "age": 30}, {"id": 3, "age": 25}])
>>> result = table1.inner_join(table2, by="id")
>>> print(result)
[{'id': 1, 'name': 'Alice', 'age': 30}]

auto_join(by=None, suffix=None, empty=None) → Table¶

Join a table with itself to create combinations.

Performs a self-join to create all possible combinations of rows. Useful for creating Cartesian products or finding relationships within the same table.

Parameters:

by (list, dict, or None) – Column specification for the self-join
suffix (list[str], optional) – Suffixes to add to column names to distinguish them
empty – Value to use for empty cells created by the join

Returns:

New table with all combinations

Return type:

Table

Example:

>>> table = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> result = table.auto_join()
>>> print(result)
[{'id': 1, 'name': 'Alice', 'id_2': 1, 'name_2': 'Alice'},
 {'id': 1, 'name': 'Alice', 'id_2': 2, 'name_2': 'Bob'},
 {'id': 2, 'name': 'Bob', 'id_2': 1, 'name_2': 'Alice'},
 {'id': 2, 'name': 'Bob', 'id_2': 2, 'name_2': 'Bob'}]

select(*args) → Table¶

Select specific columns from the table.

Creates a new table containing only the specified columns. Maintains the original row order.

Parameters:: args (str) – Names of columns to select
Returns:: New table with only the selected columns
Return type:: Table
Raises:: KeyError – If any specified column doesn’t exist

Example:

>>> table = Table([
...     {"id": 1, "name": "Alice", "age": 30, "city": "Madrid"},
...     {"id": 2, "name": "Bob", "age": 25, "city": "Barcelona"}
... ])
>>> result = table.select("name", "age")
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]

drop(*args) → Table¶

Remove specific columns from the table.

Creates a new table with the specified columns removed. Maintains the original row order.

Parameters:: args (str) – Names of columns to remove
Returns:: New table without the specified columns
Return type:: Table

Example:

>>> table = Table([
...     {"id": 1, "name": "Alice", "age": 30, "city": "Madrid"},
...     {"id": 2, "name": "Bob", "age": 25, "city": "Barcelona"}
... ])
>>> result = table.drop("id", "city")
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]

rename(**kwargs) → Table¶

Rename columns in the table.

Changes column names using a mapping of old names to new names. Maintains the original row order and data.

Parameters:: kwargs (dict[str, str]) – Mapping of old column names to new names
Returns:: New table with renamed columns
Return type:: Table
Raises:: KeyError – If any old column name doesn’t exist

Example:

>>> table = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> result = table.rename(id="user_id", name="full_name")
>>> print(result)
[{'user_id': 1, 'full_name': 'Alice'}, {'user_id': 2, 'full_name': 'Bob'}]

filter(func) → Table¶

Filter rows based on a condition.

Keeps only the rows for which the provided function returns True. Returns an empty table if the original table is empty.

Parameters:: func (Callable[[dict], bool]) – Function that takes a row (dict) and returns a boolean
Returns:: New table containing only rows that satisfy the condition
Return type:: Table

Example:

>>> table = Table([
...     {"name": "Alice", "age": 30},
...     {"name": "Bob", "age": 25},
...     {"name": "Charlie", "age": 35}
... ])
>>> result = table.filter(lambda row: row["age"] > 28)
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Charlie', 'age': 35}]

get_col_names(fast=False) → list¶

Get the names of all columns in the table.

Returns a list of column names. By default, scans all rows to ensure all possible columns are included. Use fast=True for better performance if you’re certain the first row contains all columns.

Parameters:: fast (bool) – If True, only check the first row for column names
Returns:: List of column names
Return type:: list[str]
Raises:: IndexError – If table is empty

Example:

>>> table = Table([{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}])
>>> columns = table.get_col_names()
>>> print(columns)
['name', 'age']

to_columns() → pytups.SuperDict¶

Convert table to column-oriented format.

Transforms the table from row-oriented (list of dicts) to column-oriented (dict of lists) format. Each column becomes a key with a list of values.

Returns:: SuperDict with column names as keys and lists of values
Return type:: SuperDict

Example:

>>> table = Table([
...     {"name": "Alice", "age": 30},
...     {"name": "Bob", "age": 25}
... ])
>>> result = table.to_columns()
>>> print(result)
{'name': ['Alice', 'Bob'], 'age': [30, 25]}

classmethod from_columns(dct) → Table¶

Create a table from a column-oriented dictionary.

Transforms a dictionary of lists (column-oriented) into a table (row-oriented list of dictionaries).

Parameters:: dct (dict[str, list]) – Dictionary with column names as keys and lists of values
Returns:: New table with row-oriented data
Return type:: Table

Example:

>>> data = {"name": ["Alice", "Bob"], "age": [30, 25]}
>>> result = Table.from_columns(data)
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]

get_index(cond) → list¶

Get indices of rows that satisfy a condition.

Returns a list of row indices where the condition function returns True. Useful for identifying which rows match specific criteria.

Parameters:: cond (Callable[[dict], bool]) – Function that takes a row (dict) and returns a boolean
Returns:: List of row indices that satisfy the condition
Return type:: list[int]

Example:

>>> table = Table([
...     {"name": "Alice", "age": 30},
...     {"name": "Bob", "age": 25},
...     {"name": "Charlie", "age": 35}
... ])
>>> indices = table.get_index(lambda row: row["age"] > 28)
>>> print(indices)
[0, 2]

replace(replacement=None, to_replace=None, fast=False) → Table¶

Replace specific values in the table.

Replaces specified values with new values in the table. Can replace values across all columns or target specific columns.

Parameters:

replacement (any or dict[str, any]) – New values to use as replacements
to_replace (any or dict[str, any]) – Values to be replaced
fast (bool) – If True, assume first row contains all columns

Returns:

New table with replaced values

Return type:

Table

Example:

>>> table = Table([
...     {"age": 25, "city": "Madrid"},
...     {"age": 30, "city": "Barcelona"}
... ])
>>> result = table.replace(
...     replacement={"age": 35, "city": "Paris"},
...     to_replace={"age": 25, "city": "Madrid"}
... )
>>> print(result)
[{'age': 35, 'city': 'Paris'}, {'age': 30, 'city': 'Barcelona'}]

replace_empty(replacement=None, fast=False) → Table¶

Replace empty values in the table.

Replaces empty values (None, empty strings, etc.) with specified replacement values. Can use different replacements for different columns.

Parameters:

replacement (any or dict[str, any]) – Values to use for replacing empty values
fast (bool) – If True, assume first row contains all columns

Returns:

New table with empty values replaced

Return type:

Table

Example:

>>> table = Table([
...     {"name": "Alice", "age": None},
...     {"name": "", "age": 25}
... ])
>>> result = table.replace_empty(replacement={"name": "Unknown", "age": 0})
>>> print(result)
[{'name': 'Alice', 'age': 0}, {'name': 'Unknown', 'age': 25}]

replace_nan(replacement=None) → Table¶

Replace NaN values in the table.

Replaces NaN (Not a Number) values with specified replacement values. Useful for cleaning numeric data with missing values.

Parameters:: replacement (any or dict[str, any]) – Values to use for replacing NaN values
Returns:: New table with NaN values replaced
Return type:: Table

Example:

>>> import math
>>> table = Table([
...     {"value": 10.5, "score": math.nan},
...     {"value": math.nan, "score": 85.0}
... ])
>>> result = table.replace_nan(replacement=0)
>>> print(result)
[{'value': 10.5, 'score': 0}, {'value': 0, 'score': 85.0}]

pivot_longer(cols, names_to='variable', value_to='value') → Table¶

Transform table from wide to long format.

“Lengthens” data by increasing the number of rows and decreasing the number of columns. The inverse transformation of pivot_wider().

Parameters:

cols (list[str]) – List of column names to pivot
names_to (str) – Name for the new column containing variable names
value_to (str) – Name for the new column containing values

Returns:

New table in long format

Return type:

Table

Example:

>>> table = Table([
...     {"id": 1, "Q1": 100, "Q2": 150, "Q3": 200},
...     {"id": 2, "Q1": 120, "Q2": 180, "Q3": 220}
... ])
>>> result = table.pivot_longer(["Q1", "Q2", "Q3"], "quarter", "sales")
>>> print(result)
[{'id': 1, 'quarter': 'Q1', 'sales': 100},
 {'id': 1, 'quarter': 'Q2', 'sales': 150},
 {'id': 1, 'quarter': 'Q3', 'sales': 200},
 {'id': 2, 'quarter': 'Q1', 'sales': 120},
 {'id': 2, 'quarter': 'Q2', 'sales': 180},
 {'id': 2, 'quarter': 'Q3', 'sales': 220}]

pivot_wider(names_from='variable', value_from='value', id_cols=None, values_fill=None) → Table¶

Transform table from long to wide format.

“Widens” data by increasing the number of columns and decreasing the number of rows. The inverse transformation of pivot_longer().

Parameters:

names_from (str) – Name of the column containing variable names
value_from (str) – Name of the column containing values
id_cols (list[str], optional) – Columns that uniquely identify each observation
values_fill (any or dict, optional) – Value or dict to fill missing values

Returns:

New table in wide format

Return type:

Table

Example:

>>> table = Table([
...     {"id": 1, "quarter": "Q1", "sales": 100},
...     {"id": 1, "quarter": "Q2", "sales": 150},
...     {"id": 2, "quarter": "Q1", "sales": 120}
... ])
>>> result = table.pivot_wider("quarter", "sales", "id")
>>> print(result)
[{'id': 1, 'Q1': 100, 'Q2': 150}, {'id': 2, 'Q1': 120, 'Q2': None}]

drop_empty(cols=None, fast=False) → Table¶

Remove rows with empty values in specified columns.

Drops rows where the specified columns contain empty values (None, empty strings, etc.).

Parameters:

cols (str or list[str], optional) – Column name(s) to check for empty values
fast (bool) – If True, assume first row contains all columns

Returns:

New table with empty rows removed

Return type:

Table

Example:

>>> table = Table([
...     {"name": "Alice", "age": 30},
...     {"name": "", "age": 25},
...     {"name": "Bob", "age": None}
... ])
>>> result = table.drop_empty("name")
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': None}]

lag_col(col, i=1, replace=False) → Table¶

Create a lagged version of a column.

Shifts the values of a column by a specified number of steps. Useful for time series analysis and creating features from previous values.

Parameters:

col (str) – Name of the column to lag
i (int) – Number of steps to lag (default: 1)
replace (bool) – If True, replace original column; if False, create new column

Returns:

New table with lagged column

Return type:

Table

Example:

>>> table = Table([
...     {"date": "2023-01", "sales": 100},
...     {"date": "2023-02", "sales": 150},
...     {"date": "2023-03", "sales": 200}
... ])
>>> result = table.lag_col("sales", 1)
>>> print(result)
[{'date': '2023-01', 'sales': 100, 'lag_sales_1': None},
 {'date': '2023-02', 'sales': 150, 'lag_sales_1': 100},
 {'date': '2023-03', 'sales': 200, 'lag_sales_1': 150}]

distinct(columns) → Table¶

Keep only unique combinations of values in specified columns.

Removes duplicate rows based on the values in the specified columns. When duplicates are found, the first occurrence is kept.

Parameters:: columns (str or list[str]) – Column name(s) to check for uniqueness
Returns:: New table with duplicate rows removed
Return type:: Table

Example:

>>> table = Table([
...     {"name": "Alice", "city": "Madrid"},
...     {"name": "Bob", "city": "Barcelona"},
...     {"name": "Alice", "city": "Madrid"}  # Duplicate
... ])
>>> result = table.distinct("name")
>>> print(result)
[{'name': 'Alice', 'city': 'Madrid'}, {'name': 'Bob', 'city': 'Barcelona'}]

order_by(columns, reverse=False) → Table¶

Sort the table by specified columns.

Reorders the table rows based on the values in the specified columns. Supports both ascending and descending order.

Parameters:

columns (str or list[str]) – Column name(s) to sort by
reverse (bool) – If True, sort in descending order

Returns:

New table with sorted rows

Return type:

Table

Example:

>>> table = Table([
...     {"name": "Charlie", "age": 35},
...     {"name": "Alice", "age": 30},
...     {"name": "Bob", "age": 25}
... ])
>>> result = table.order_by("age")
>>> print(result)
[{'name': 'Bob', 'age': 25},
 {'name': 'Alice', 'age': 30},
 {'name': 'Charlie', 'age': 35}]

drop_nested() → Table¶

Remove columns containing nested data structures.

Drops columns that contain nested values (dictionaries, lists, or tuples). Assumes homogeneous table structure and checks only the first row.

Returns:: New table with nested columns removed
Return type:: Table

Example:

>>> table = Table([
...     {"name": "Alice", "age": 30, "hobbies": ["reading", "swimming"]},
...     {"name": "Bob", "age": 25, "hobbies": ["gaming"]}
... ])
>>> result = table.drop_nested()
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]

check_type()¶

Validate that the table is a list of dictionaries.

Checks that the table structure is correct (list of dictionaries). Raises TypeError if the structure is invalid.

Returns:: None
Return type:: None
Raises:: TypeError – If table is not a list of dictionaries

Example:

>>> table = Table([{"name": "Alice"}, {"name": "Bob"}])
>>> table.check_type()  # No error
>>>
>>> invalid_table = Table([1, 2, 3])
>>> invalid_table.check_type()  # Raises TypeError

to_set2(columns) → pytups.TupList¶

Create a list of unique combinations from specified columns.

Extracts unique combinations of values from the specified columns and returns them as a TupList.

Parameters:: columns (str or list[str]) – Column name(s) to extract unique combinations from
Returns:: TupList containing unique combinations
Return type:: TupList
Raises:: KeyError – If specified columns don’t exist in the table

Example:

>>> table = Table([
...     {"name": "Alice", "city": "Madrid"},
...     {"name": "Bob", "city": "Barcelona"},
...     {"name": "Alice", "city": "Madrid"}  # Duplicate
... ])
>>> result = table.to_set2(["name", "city"])
>>> print(result)
[('Alice', 'Madrid'), ('Bob', 'Barcelona')]

to_param(keys, value, is_list=False) → pytups.SuperDict¶

Create a parameter dictionary from specified columns.

Creates a SuperDict using specified columns as keys and values. Useful for creating lookup dictionaries or parameter mappings.

Parameters:

keys (str or list[str]) – Column name(s) to use as dictionary keys
value (str) – Column name to use as dictionary values
is_list (bool) – If True, values can be lists; if False, expects unique keys

Returns:

SuperDict with keys and values from specified columns

Return type:

SuperDict

Raises:

KeyError – If specified columns don’t exist
ValueError – If keys are not unique and is_list=False

Example:

>>> table = Table([
...     {"id": 1, "name": "Alice", "age": 30},
...     {"id": 2, "name": "Bob", "age": 25}
... ])
>>> result = table.to_param("id", "name")
>>> print(result)
{1: 'Alice', 2: 'Bob'}

is_unique(columns) → bool¶

Check if combinations of values in specified columns are unique.

Determines whether the combination of values in the specified columns forms a unique key for each row.

Parameters:: columns (str or list[str]) – Column name(s) to check for uniqueness
Returns:: True if all combinations are unique
Return type:: bool

Example:

>>> table = Table([
...     {"id": 1, "name": "Alice"},
...     {"id": 2, "name": "Bob"},
...     {"id": 1, "name": "Charlie"}  # Duplicate id
... ])
>>> print(table.is_unique("id"))
False
>>> print(table.is_unique("name"))
True

add_row(**kwargs) → Table¶

Add a new row to the table.

Adds a new row with the specified values. Missing columns are filled with None values to maintain table structure.

Parameters:: kwargs (dict) – Column values in the format column_name=value
Returns:: New table with the added row
Return type:: Table

Example:

>>> table = Table([{"name": "Alice", "age": 30}])
>>> result = table.add_row(name="Bob", age=25)
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]

rbind(table: list) → Table¶

Bind another table by rows (row bind).

Combines the current table with another table by stacking rows. Missing columns are filled with None values.

Parameters:: table (list[dict] or Table) – Another table to bind (list of dicts or Table)
Returns:: New table with combined rows
Return type:: Table

Example:

>>> table1 = Table([{"name": "Alice", "age": 30}])
>>> table2 = [{"name": "Bob", "city": "Madrid"}]
>>> result = table1.rbind(table2)
>>> print(result)
[{'name': 'Alice', 'age': 30, 'city': None},
 {'name': 'Bob', 'age': None, 'city': 'Madrid'}]

col_apply(columns, func: Callable, **kwargs) → Table¶

Apply a function to specified columns.

Applies a function to the values in specified columns, transforming the data according to the function logic.

Parameters:

columns (str or list[str]) – Column name(s) to apply the function to
func (Callable) – Function to apply to column values
kwargs – Additional keyword arguments for the function

Returns:

New table with transformed columns

Return type:

Table

Example:

>>> table = Table([{"value": 10}, {"value": 20}])
>>> result = table.col_apply("value", lambda x: x * 2)
>>> print(result)
[{'value': 20}, {'value': 40}]

classmethod format_dataset(dataset) → dict¶

Convert dataset dictionary to use Table objects for list values.

Processes a dataset dictionary, converting any list values to Table objects while keeping other values unchanged.

Parameters:: dataset (dict) – Dictionary containing data in various formats
Returns:: Dictionary with list values converted to Table objects
Return type:: dict

Example:

>>> data = {
...     "users": [{"name": "Alice"}, {"name": "Bob"}],
...     "config": {"setting": "value"}
... }
>>> result = Table.format_dataset(data)
>>> print(type(result["users"]))
<class 'mango.table.pytups_table.Table'>

classmethod dataset_from_json(path, **kwargs) → dict¶

Load a JSON file and convert list values to Table objects.

Loads a JSON file and processes it using format_dataset to convert any list values to Table objects.

Parameters:

path (str) – Path to the JSON file
kwargs – Additional arguments for load_json

Returns:

Dictionary with list values converted to Table objects

Return type:

dict

Example:

>>> # Assuming data.json contains: {"users": [{"name": "Alice"}]}
>>> result = Table.dataset_from_json("data.json")
>>> print(type(result["users"]))
<class 'mango.table.pytups_table.Table'>

classmethod from_pandas(df) → Table¶

Create a Table from a pandas DataFrame.

Converts a pandas DataFrame to a Table object, preserving all data and column names. Requires pandas to be installed.

Parameters:: df (pandas.DataFrame) – Pandas DataFrame to convert
Returns:: New Table object with the DataFrame data
Return type:: Table
Raises:: ImportError – If pandas is not installed

Example:

>>> import pandas as pd
>>> df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [30, 25]})
>>> table = Table.from_pandas(df)
>>> print(table)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]

to_pandas() → pandas.DataFrame¶

Convert the Table to a pandas DataFrame.

Creates a pandas DataFrame from the Table data, preserving all columns and rows. Requires pandas to be installed.

Returns:: Pandas DataFrame with the table data
Return type:: pandas.DataFrame
Raises:: ImportError – If pandas is not installed

Example:

>>> table = Table([{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}])
>>> df = table.to_pandas()
>>> print(df)
   name  age
0  Alice   30
1    Bob   25

pk_save(file)¶

Save the table to a pickle file.

Serializes the table object and saves it to a pickle file. Automatically adds .pickle extension if not present.

Parameters:: file (str) – Path to the pickle file
Returns:: None
Return type:: None

Example:

>>> table = Table([{"name": "Alice", "age": 30}])
>>> table.pk_save("my_table")
# Creates my_table.pickle file

classmethod pk_load(file) → Table¶

Load a Table from a pickle file.

Deserializes a Table object from a pickle file. Automatically adds .pickle extension if not present.

Parameters:: file (str) – Path to the pickle file
Returns:: Table object loaded from the file
Return type:: Table

Example:

>>> table = Table.pk_load("my_table")
>>> print(table)
[{'name': 'Alice', 'age': 30}]

to_json(path)¶

Export the table to a JSON file.

Saves the table data to a JSON file. Numeric column names are automatically converted to strings for JSON compatibility.

Parameters:: path (str) – Path to the JSON file
Returns:: None
Return type:: None

Example:

>>> table = Table([{"name": "Alice", "age": 30}])
>>> table.to_json("output.json")
# Creates output.json with table data

classmethod from_json(path, **kwargs) → Table¶

Create a table from a JSON file.

Loads data from a JSON file and creates a Table object. The JSON file should contain a list of dictionaries.

Parameters:

path (str) – Path to the JSON file
kwargs – Additional arguments for load_json

Returns:

New table with data from the JSON file

Return type:

Table

Example:

>>> # Assuming data.json contains: [{"name": "Alice", "age": 30}]
>>> table = Table.from_json("data.json")
>>> print(table)
[{'name': 'Alice', 'age': 30}]

apply(func: Callable, *args, **kwargs)¶

Apply a function to the entire table.

Passes the entire table as the first argument to the specified function. Useful for chaining custom functions that operate on the whole table.

Parameters:

func (Callable) – Function that takes the table as its first argument
args – Additional positional arguments for the function
kwargs – Additional keyword arguments for the function

Returns:

Result of the function call

Return type:

any

Example:

>>> table = Table([{"value": 10}, {"value": 20}])
>>> def double_table(t):
...     return t.mutate(value=lambda row: row["value"] * 2)
>>> result = table.apply(double_table)
>>> print(result)
[{'value': 20}, {'value': 40}]

classmethod dataset_from_excel(path, sheets=None) → dict¶

Load an Excel file and return a dictionary of Table objects.

Reads an Excel file and converts each sheet to a Table object. Returns a dictionary with sheet names as keys and Table objects as values.

Parameters:

path (str) – Path to the Excel file
sheets (list[str], optional) – List of sheet names to read (all sheets if None)

Returns:

Dictionary with sheet names and Table objects

Return type:

dict

Example:

>>> # Assuming data.xlsx has sheets "users" and "orders"
>>> result = Table.dataset_from_excel("data.xlsx")
>>> print(list(result.keys()))
['users', 'orders']

to_excel(path, sheet_name=None)¶

Export the table to an Excel file.

Saves the table data to an Excel file with the specified sheet name. If no sheet name is provided, uses “Sheet1” as default.

Parameters:

path (str) – Path to the Excel file
sheet_name (str, optional) – Name of the Excel sheet

Returns:

None

Return type:

None

Example:

>>> table = Table([{"name": "Alice", "age": 30}])
>>> table.to_excel("output.xlsx", "Users")
# Creates output.xlsx with "Users" sheet

classmethod from_csv(path, sep=',', encoding=None) → Table¶

Load a table from a CSV file.

Reads a CSV file and creates a Table object from the data. Supports custom separators and encodings.

Parameters:

path (str) – Path to the CSV file
sep (str) – Column separator (detected automatically if None)
encoding (str, optional) – File encoding

Returns:

New table with data from the CSV file

Return type:

Table

Example:

>>> # Assuming data.csv contains: name,age\nAlice,30\nBob,25
>>> table = Table.from_csv("data.csv")
>>> print(table)
[{'name': 'Alice', 'age': '30'}, {'name': 'Bob', 'age': '25'}]

to_csv(path, sep=',', encoding=None)¶

Export the table to a CSV file.

Saves the table data to a CSV file with the specified separator and encoding.

Parameters:

path (str) – Path to the CSV file
sep (str) – Column separator for the CSV file
encoding (str, optional) – File encoding

Returns:

None

Return type:

None

Example:

>>> table = Table([{"name": "Alice", "age": 30}])
>>> table.to_csv("output.csv", sep=";")
# Creates output.csv with semicolon separator