Table

The Table class is used to represent and work on a table or dataframe as a list of dict. It derives from the pytups.Tuplist class and provides powerful data manipulation capabilities similar to pandas DataFrames but with a more lightweight and flexible approach.

Key Features

  • Lightweight data structure based on Python dictionaries

  • Powerful data manipulation methods inspired by R’s dplyr

  • Easy conversion to/from JSON format

  • Integration with pandas DataFrames

  • Support for various join operations

  • Flexible data reshaping capabilities

Basic Example

For example the following table:

a

b

1

False

2

True

3

False

will be represented as:

[{"a": 1, "b": False}, {"a": 2, "b": True}, {"a": 3, "b": False}]

Common Operations

Data Manipulation

The Table class provides various methods for data manipulation:

  • mutate(): Add or modify columns

  • select(): Choose specific columns

  • drop(): Remove columns

  • filter(): Filter rows based on conditions

  • group_by(): Group data by columns

  • summarise(): Aggregate data after grouping

# Example of data manipulation
table = Table([{"a": 1, "b": 2}, {"a": 3, "b": 4}])

# Add a new column 'c' as sum of 'a' and 'b'
result = table.mutate(c=lambda v: v["a"] + v["b"])
# Result: [{"a": 1, "b": 2, "c": 3}, {"a": 3, "b": 4, "c": 7}]

Join Operations

The class supports various types of joins similar to SQL:

  • left_join(): Keep all rows from the left table

  • right_join(): Keep all rows from the right table

  • inner_join(): Keep only matching rows

  • full_join(): Keep all rows from both tables

  • auto_join(): Join a table with itself

# Example of join operation
table1 = Table([{"id": 1, "value": "A"}, {"id": 2, "value": "B"}])
table2 = Table([{"id": 1, "data": 100}, {"id": 3, "data": 300}])

result = table1.left_join(table2, by="id")
# Result: [{"id": 1, "value": "A", "data": 100}, {"id": 2, "value": "B", "data": None}]

Data Reshaping

Support for reshaping data between wide and long formats:

  • pivot_longer(): Convert wide format to long format

  • pivot_wider(): Convert long format to wide format

# Example of pivot operations
table = Table([{"id": 1, "x": 10, "y": 20}, {"id": 2, "x": 30, "y": 40}])

long = table.pivot_longer(["x", "y"], names_to="variable", value_to="value")
# Result: [{"id": 1, "variable": "x", "value": 10}, {"id": 1, "variable": "y", "value": 20},
#          {"id": 2, "variable": "x", "value": 30}, {"id": 2, "variable": "y", "value": 40}]

Import/Export

The Table class supports various formats:

  • JSON files (to_json/from_json)

  • Excel files (to_excel/from_excel)

  • CSV files (to_csv/from_csv)

  • Pandas DataFrames (to_pandas/from_pandas)

For more information on the underlying Tuplist class, see the pytups documentation.

Table class API

class mango.table.Table(*args: Any, **kwargs: Any)

Bases: TupList

Enhanced table class extending TupList with data manipulation capabilities.

A Table is a list of dictionaries that provides a rich set of methods for data manipulation, filtering, joining, and transformation operations. Inspired by R’s dplyr and data.table packages.

Parameters:
  • data (list[dict], optional) – Initial data as a list of dictionaries or None for empty table

  • check (bool) – Whether to validate that all rows are dictionaries

Example:
>>> # Create empty table
>>> table = Table()
>>>
>>> # Create table from list of dicts
>>> data = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]
>>> table = Table(data)
>>> print(table)
Table (2 rows, 2 columns):
0 {'name': 'Alice', 'age': 30}
1 {'name': 'Bob', 'age': 25}
vapply(func: Callable, *args, **kwargs) Table

Apply a function to each element (row) of the table.

Maps the given function to each row in the table, returning a new table with the transformed values.

Parameters:
  • func (Callable) – Function to apply to each row

  • args – Additional positional arguments for the function

  • kwargs – Additional keyword arguments for the function

Returns:

New table with transformed values

Return type:

Table

Example:
>>> table = Table([{"x": 1, "y": 2}, {"x": 3, "y": 4}])
>>> result = table.vapply(lambda row: {"sum": row["x"] + row["y"]})
>>> print(result)
[{'sum': 3}, {'sum': 7}]
kapply(func, *args, **kwargs) Table

Apply a function to each key (column name) of the table.

Maps the given function to each column name in the table, returning a new table with transformed column names.

Parameters:
  • func (Callable) – Function to apply to each column name

  • args – Additional positional arguments for the function

  • kwargs – Additional keyword arguments for the function

Returns:

New table with transformed column names

Return type:

Table

Example:
>>> table = Table([{"name": "Alice", "age": 30}])
>>> result = table.kapply(lambda key: key.upper())
>>> print(result)
[{'NAME': 'Alice', 'AGE': 30}]
kvapply(func, *args, **kwargs) Table

Apply a function to each element with its index.

Maps the given function to each row in the table along with its index, returning a new table with transformed values.

Parameters:
  • func (Callable) – Function to apply to each (index, row) pair

  • args – Additional positional arguments for the function

  • kwargs – Additional keyword arguments for the function

Returns:

New table with transformed values

Return type:

Table

Example:
>>> table = Table([{"x": 1}, {"x": 2}])
>>> result = table.kvapply(lambda idx, row: {"index": idx, "value": row["x"]})
>>> print(result)
[{'index': 0, 'value': 1}, {'index': 1, 'value': 2}]
copy_shallow() Table

Create a shallow copy of the table.

Creates a new table with the same structure but shares references to the same row objects. Changes to row contents will affect both tables.

Returns:

New table with shallow copy of the data

Return type:

Table

Example:
>>> table = Table([{"name": "Alice", "age": 30}])
>>> copy = table.copy_shallow()
>>> copy[0]["age"] = 31
>>> print(table[0]["age"])  # Also changed to 31
31
copy_deep() Table

Create a deep copy of the table.

Creates a new table with completely independent copies of all data. Changes to the original table will not affect the copy.

Returns:

New table with deep copy of the data

Return type:

Table

Example:
>>> table = Table([{"name": "Alice", "age": 30}])
>>> copy = table.copy_deep()
>>> copy[0]["age"] = 31
>>> print(table[0]["age"])  # Still 30
30
vfilter(function: Callable) Table

Filter rows based on a condition (internal method).

Keeps only the rows for which the provided function returns True. This is the internal implementation used by the public filter method.

Parameters:

function (Callable[[dict], bool]) – Function that takes a row (dict) and returns a boolean

Returns:

New table containing only rows that satisfy the condition

Return type:

Table

unique() Table

Remove duplicate rows from the table.

Eliminates rows with identical values across all columns. When duplicates are found, the first occurrence is kept.

Returns:

New table with duplicate rows removed

Return type:

Table

Raises:

NotImplementedError – If table structure is not compatible

Example:
>>> table = Table([
...     {"name": "Alice", "age": 30},
...     {"name": "Bob", "age": 25},
...     {"name": "Alice", "age": 30}  # Duplicate
... ])
>>> result = table.unique()
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]
unique2() Table

Remove duplicate rows from the table (alternative implementation).

Alternative implementation of unique() that eliminates rows with identical values across all columns. When duplicates are found, the first occurrence is kept.

Returns:

New table with duplicate rows removed

Return type:

Table

Raises:

NotImplementedError – If table structure is not compatible

Example:
>>> table = Table([
...     {"name": "Alice", "age": 30},
...     {"name": "Bob", "age": 25},
...     {"name": "Alice", "age": 30}  # Duplicate
... ])
>>> result = table.unique2()
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]
sorted(**kwargs) Table

Sort the table according to a custom key function.

Sorts the table using a custom key function. This method is limited to simple data types and may not work with complex table structures.

Parameters:

kwargs (dict) – Arguments for the sorted function

Returns:

New table with sorted rows

Return type:

Table

Raises:

NotImplementedError – If table structure is not compatible

Example:
>>> table = Table([{"value": 3}, {"value": 1}, {"value": 2}])
>>> result = table.sorted(key=lambda x: x["value"])
>>> print(result)
[{'value': 1}, {'value': 2}, {'value': 3}]
take(*args, use_numpy=False) pytups.TupList

Extract values from specified columns of the table.

Returns a TupList containing tuples of values from the specified columns. Each tuple represents the values from one row for the selected columns.

Parameters:
  • args (str) – Names of columns to extract

  • use_numpy (bool) – Whether to use numpy methods for extraction

Returns:

TupList of tuples containing column values

Return type:

TupList

Example:
>>> table = Table([{"a": 1, "b": 2}, {"a": 3, "b": 4}])
>>> result = table.take("a", "b")
>>> print(result)
[(1, 2), (3, 4)]
show_rows(n1, n2=None) str

Display specific rows of the table as a formatted string.

Shows the specified row or range of rows with their indices. Used internally by the string representation of the table.

Parameters:
  • n1 (int) – Starting row index

  • n2 (int, optional) – Ending row index (if None, shows only row n1)

Returns:

Formatted string showing the specified rows

Return type:

str

Example:
>>> table = Table([{"name": "Alice"}, {"name": "Bob"}, {"name": "Charlie"}])
>>> result = table.show_rows(0, 2)
>>> print(result)
0 {'name': 'Alice'}
1 {'name': 'Bob'}
head(n=10) Table

Return the first n rows of the table.

Creates a new table containing only the first n rows. If the table has fewer than n rows, returns the entire table.

Parameters:

n (int) – Number of rows to return (default: 10)

Returns:

New table with the first n rows

Return type:

Table

Example:
>>> table = Table([
...     {"name": "Alice"}, {"name": "Bob"}, {"name": "Charlie"},
...     {"name": "David"}, {"name": "Eve"}
... ])
>>> result = table.head(3)
>>> print(result)
[{'name': 'Alice'}, {'name': 'Bob'}, {'name': 'Charlie'}]
peek(n=3, name=None) Table

Display a preview of the table showing first, middle, and last rows.

Shows the first n rows, middle n rows, and last n rows of the table. Useful for getting an overview of large tables without printing everything.

Parameters:
  • n (int) – Number of rows to show in each section (default: 3)

  • name (str, optional) – Optional name or message to display with the table

Returns:

The original table (unchanged)

Return type:

Table

Example:
>>> table = Table([{"id": i, "value": i*2} for i in range(10)])
>>> table.peek(2, "Sample Data")
Sample Data: Table (10 rows, 2 columns):
0 {'id': 0, 'value': 0}
1 {'id': 1, 'value': 2}
...
4 {'id': 4, 'value': 8}
5 {'id': 5, 'value': 10}
...
8 {'id': 8, 'value': 16}
9 {'id': 9, 'value': 18}
mutate(**kwargs) Table

Add or modify columns in the table.

Creates new columns or modifies existing ones using various methods: - Single values applied to all rows - Lists of values for each row - Functions that operate on row data

Note: All changes are applied to the original table structure and do not take into account other changes made in the same call.

Parameters:

kwargs (dict) – Named arguments with column names and their values

Returns:

New table with modified columns

Return type:

Table

Example:
>>> table = Table([{"x": 1, "y": 2}, {"x": 3, "y": 4}])
>>> result = table.mutate(
...     z=10,  # Constant value
...     sum=lambda row: row["x"] + row["y"],  # Function
...     product=[2, 12]  # List of values
... )
>>> print(result)
[{'x': 1, 'y': 2, 'z': 10, 'sum': 3, 'product': 2},
 {'x': 3, 'y': 4, 'z': 10, 'sum': 7, 'product': 12}]
group_by(col) pytups.SuperDict

Group rows of the table by specified column values.

Groups the table rows based on the values in the specified column(s). Returns a SuperDict where keys are the unique values and values are lists of rows that have that value.

Parameters:

col (str or list[str]) – Column name or list of column names to group by

Returns:

SuperDict with grouped data

Return type:

SuperDict

Example:
>>> table = Table([
...     {"name": "Alice", "city": "Madrid"},
...     {"name": "Bob", "city": "Barcelona"},
...     {"name": "Charlie", "city": "Madrid"}
... ])
>>> result = table.group_by("city")
>>> print(result)
{'Madrid': [{'name': 'Alice', 'city': 'Madrid'}, {'name': 'Charlie', 'city': 'Madrid'}],
 'Barcelona': [{'name': 'Bob', 'city': 'Barcelona'}]}
group_mutate(group_by, **func) Table

Group by specified columns and apply functions to other columns.

Groups the table by the specified columns and applies aggregation functions to the remaining columns. Similar to SQL GROUP BY with aggregate functions.

Parameters:
  • group_by (str or list[str]) – Column name or list of column names to group by

  • func (dict[str, Callable]) – Functions to apply to columns (e.g., a=sum, b=mean)

Returns:

New table with grouped and aggregated data

Return type:

Table

Example:
>>> table = Table([
...     {"city": "Madrid", "sales": 100},
...     {"city": "Madrid", "sales": 150},
...     {"city": "Barcelona", "sales": 200}
... ])
>>> result = table.group_mutate("city", sales=sum)
>>> print(result)
[{'city': 'Madrid', 'sales': 250}, {'city': 'Barcelona', 'sales': 200}]
sum_all(group_by=None) Table

Group by specified columns and sum all numeric columns.

Groups the table by the specified columns and sums all numeric columns in each group. Non-numeric columns are ignored.

Parameters:

group_by (str or list[str], optional) – Column name or list of column names to group by

Returns:

New table with grouped and summed data

Return type:

Table

Example:
>>> table = Table([
...     {"category": "A", "value1": 10, "value2": 20},
...     {"category": "A", "value1": 15, "value2": 25},
...     {"category": "B", "value1": 5, "value2": 10}
... ])
>>> result = table.sum_all("category")
>>> print(result)
[{'category': 'A', 'value1': 25, 'value2': 45},
 {'category': 'B', 'value1': 5, 'value2': 10}]
summarise(group_by=None, default: Callable = None, **func) Table

Group by specified columns and apply aggregation functions.

Groups the table by specified columns and applies custom aggregation functions to other columns. More flexible than group_mutate as it allows specifying a default function for non-explicitly handled columns.

Parameters:
  • group_by (str or list[str], optional) – Column name or list of column names to group by

  • default (Callable, optional) – Default function to apply to columns not explicitly specified

  • func (dict[str, Callable]) – Functions to apply to specific columns

Returns:

New table with grouped and summarized data

Return type:

Table

Example:
>>> table = Table([
...     {"category": "A", "value": 10, "count": 1},
...     {"category": "A", "value": 15, "count": 2},
...     {"category": "B", "value": 5, "count": 1}
... ])
>>> result = table.summarise("category", value=sum, count=sum)
>>> print(result)
[{'category': 'A', 'value': 25, 'count': 3},
 {'category': 'B', 'value': 5, 'count': 1}]
join(table2, by=None, suffix=None, jtype='full', empty=None, if_empty_table_1=None, if_empty_table_2=None) Table

Join two tables using various join types.

Performs table joins inspired by R’s dplyr join functions. Supports different join types and flexible column matching strategies.

Parameters:
  • table2 (Table or list[dict]) – Second table to join with

  • by (list, dict, or None) – Column specification for joining

  • suffix (list[str], optional) – Suffixes for disambiguating column names

  • jtype (str) – Type of join (“full”, “left”, “right”, “inner”)

  • empty – Value to use for empty cells created by the join

  • if_empty_table_1 – Replacement if table 1 is empty

  • if_empty_table_2 – Replacement if table 2 is empty

Returns:

New table containing the joined data

Return type:

Table

Example:
>>> table1 = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> table2 = Table([{"id": 1, "age": 30}, {"id": 3, "age": 25}])
>>> result = table1.join(table2, by="id", jtype="left")
>>> print(result)
[{'id': 1, 'name': 'Alice', 'age': 30}, {'id': 2, 'name': 'Bob', 'age': None}]
left_join(table2, by=None, suffix=None, empty=None, if_empty_table_1=None, if_empty_table_2=None) Table

Perform a left join with another table.

Returns all rows from the left table (self) and matching rows from the right table (table2). Rows from the left table without matches will have None values for columns from the right table.

Parameters:
  • table2 (Table or list[dict]) – Second table to join with

  • by (list, dict, or None) – Column specification for joining

  • suffix (list[str], optional) – Suffixes for disambiguating column names

  • empty – Value to use for empty cells created by the join

  • if_empty_table_1 – Replacement if table 1 is empty

  • if_empty_table_2 – Replacement if table 2 is empty

Returns:

New table containing the left join result

Return type:

Table

Example:
>>> table1 = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> table2 = Table([{"id": 1, "age": 30}, {"id": 3, "age": 25}])
>>> result = table1.left_join(table2, by="id")
>>> print(result)
[{'id': 1, 'name': 'Alice', 'age': 30}, {'id': 2, 'name': 'Bob', 'age': None}]
right_join(table2, by=None, suffix=None, empty=None, if_empty_table_1=None, if_empty_table_2=None) Table

Perform a right join with another table.

Returns all rows from the right table (table2) and matching rows from the left table (self). Rows from the right table without matches will have None values for columns from the left table.

Parameters:
  • table2 (Table or list[dict]) – Second table to join with

  • by (list, dict, or None) – Column specification for joining

  • suffix (list[str], optional) – Suffixes for disambiguating column names

  • empty – Value to use for empty cells created by the join

  • if_empty_table_1 – Replacement if table 1 is empty

  • if_empty_table_2 – Replacement if table 2 is empty

Returns:

New table containing the right join result

Return type:

Table

Example:
>>> table1 = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> table2 = Table([{"id": 1, "age": 30}, {"id": 3, "age": 25}])
>>> result = table1.right_join(table2, by="id")
>>> print(result)
[{'id': 1, 'name': 'Alice', 'age': 30}, {'id': 3, 'name': None, 'age': 25}]
full_join(table2, by=None, suffix=None, empty=None, if_empty_table_1=None, if_empty_table_2=None) Table

Perform a full outer join with another table.

Returns all rows from both tables, with None values where there are no matches. This is the default join type and combines left and right joins.

Parameters:
  • table2 (Table or list[dict]) – Second table to join with

  • by (list, dict, or None) – Column specification for joining

  • suffix (list[str], optional) – Suffixes for disambiguating column names

  • empty – Value to use for empty cells created by the join

  • if_empty_table_1 – Replacement if table 1 is empty

  • if_empty_table_2 – Replacement if table 2 is empty

Returns:

New table containing the full join result

Return type:

Table

Example:
>>> table1 = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> table2 = Table([{"id": 1, "age": 30}, {"id": 3, "age": 25}])
>>> result = table1.full_join(table2, by="id")
>>> print(result)
[{'id': 1, 'name': 'Alice', 'age': 30},
 {'id': 2, 'name': 'Bob', 'age': None},
 {'id': 3, 'name': None, 'age': 25}]
inner_join(table2, by=None, suffix=None, empty=None, if_empty_table_1=None, if_empty_table_2=None) Table

Perform an inner join with another table.

Returns only rows that have matching values in both tables. Rows without matches in either table are excluded from the result.

Parameters:
  • table2 (Table or list[dict]) – Second table to join with

  • by (list, dict, or None) – Column specification for joining

  • suffix (list[str], optional) – Suffixes for disambiguating column names

  • empty – Value to use for empty cells created by the join

  • if_empty_table_1 – Replacement if table 1 is empty

  • if_empty_table_2 – Replacement if table 2 is empty

Returns:

New table containing only matching rows

Return type:

Table

Example:
>>> table1 = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> table2 = Table([{"id": 1, "age": 30}, {"id": 3, "age": 25}])
>>> result = table1.inner_join(table2, by="id")
>>> print(result)
[{'id': 1, 'name': 'Alice', 'age': 30}]
auto_join(by=None, suffix=None, empty=None) Table

Join a table with itself to create combinations.

Performs a self-join to create all possible combinations of rows. Useful for creating Cartesian products or finding relationships within the same table.

Parameters:
  • by (list, dict, or None) – Column specification for the self-join

  • suffix (list[str], optional) – Suffixes to add to column names to distinguish them

  • empty – Value to use for empty cells created by the join

Returns:

New table with all combinations

Return type:

Table

Example:
>>> table = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> result = table.auto_join()
>>> print(result)
[{'id': 1, 'name': 'Alice', 'id_2': 1, 'name_2': 'Alice'},
 {'id': 1, 'name': 'Alice', 'id_2': 2, 'name_2': 'Bob'},
 {'id': 2, 'name': 'Bob', 'id_2': 1, 'name_2': 'Alice'},
 {'id': 2, 'name': 'Bob', 'id_2': 2, 'name_2': 'Bob'}]
select(*args) Table

Select specific columns from the table.

Creates a new table containing only the specified columns. Maintains the original row order.

Parameters:

args (str) – Names of columns to select

Returns:

New table with only the selected columns

Return type:

Table

Raises:

KeyError – If any specified column doesn’t exist

Example:
>>> table = Table([
...     {"id": 1, "name": "Alice", "age": 30, "city": "Madrid"},
...     {"id": 2, "name": "Bob", "age": 25, "city": "Barcelona"}
... ])
>>> result = table.select("name", "age")
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]
drop(*args) Table

Remove specific columns from the table.

Creates a new table with the specified columns removed. Maintains the original row order.

Parameters:

args (str) – Names of columns to remove

Returns:

New table without the specified columns

Return type:

Table

Example:
>>> table = Table([
...     {"id": 1, "name": "Alice", "age": 30, "city": "Madrid"},
...     {"id": 2, "name": "Bob", "age": 25, "city": "Barcelona"}
... ])
>>> result = table.drop("id", "city")
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]
rename(**kwargs) Table

Rename columns in the table.

Changes column names using a mapping of old names to new names. Maintains the original row order and data.

Parameters:

kwargs (dict[str, str]) – Mapping of old column names to new names

Returns:

New table with renamed columns

Return type:

Table

Raises:

KeyError – If any old column name doesn’t exist

Example:
>>> table = Table([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
>>> result = table.rename(id="user_id", name="full_name")
>>> print(result)
[{'user_id': 1, 'full_name': 'Alice'}, {'user_id': 2, 'full_name': 'Bob'}]
filter(func) Table

Filter rows based on a condition.

Keeps only the rows for which the provided function returns True. Returns an empty table if the original table is empty.

Parameters:

func (Callable[[dict], bool]) – Function that takes a row (dict) and returns a boolean

Returns:

New table containing only rows that satisfy the condition

Return type:

Table

Example:
>>> table = Table([
...     {"name": "Alice", "age": 30},
...     {"name": "Bob", "age": 25},
...     {"name": "Charlie", "age": 35}
... ])
>>> result = table.filter(lambda row: row["age"] > 28)
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Charlie', 'age': 35}]
get_col_names(fast=False) list

Get the names of all columns in the table.

Returns a list of column names. By default, scans all rows to ensure all possible columns are included. Use fast=True for better performance if you’re certain the first row contains all columns.

Parameters:

fast (bool) – If True, only check the first row for column names

Returns:

List of column names

Return type:

list[str]

Raises:

IndexError – If table is empty

Example:
>>> table = Table([{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}])
>>> columns = table.get_col_names()
>>> print(columns)
['name', 'age']
to_columns() pytups.SuperDict

Convert table to column-oriented format.

Transforms the table from row-oriented (list of dicts) to column-oriented (dict of lists) format. Each column becomes a key with a list of values.

Returns:

SuperDict with column names as keys and lists of values

Return type:

SuperDict

Example:
>>> table = Table([
...     {"name": "Alice", "age": 30},
...     {"name": "Bob", "age": 25}
... ])
>>> result = table.to_columns()
>>> print(result)
{'name': ['Alice', 'Bob'], 'age': [30, 25]}
classmethod from_columns(dct) Table

Create a table from a column-oriented dictionary.

Transforms a dictionary of lists (column-oriented) into a table (row-oriented list of dictionaries).

Parameters:

dct (dict[str, list]) – Dictionary with column names as keys and lists of values

Returns:

New table with row-oriented data

Return type:

Table

Example:
>>> data = {"name": ["Alice", "Bob"], "age": [30, 25]}
>>> result = Table.from_columns(data)
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]
get_index(cond) list

Get indices of rows that satisfy a condition.

Returns a list of row indices where the condition function returns True. Useful for identifying which rows match specific criteria.

Parameters:

cond (Callable[[dict], bool]) – Function that takes a row (dict) and returns a boolean

Returns:

List of row indices that satisfy the condition

Return type:

list[int]

Example:
>>> table = Table([
...     {"name": "Alice", "age": 30},
...     {"name": "Bob", "age": 25},
...     {"name": "Charlie", "age": 35}
... ])
>>> indices = table.get_index(lambda row: row["age"] > 28)
>>> print(indices)
[0, 2]
replace(replacement=None, to_replace=None, fast=False) Table

Replace specific values in the table.

Replaces specified values with new values in the table. Can replace values across all columns or target specific columns.

Parameters:
  • replacement (any or dict[str, any]) – New values to use as replacements

  • to_replace (any or dict[str, any]) – Values to be replaced

  • fast (bool) – If True, assume first row contains all columns

Returns:

New table with replaced values

Return type:

Table

Example:
>>> table = Table([
...     {"age": 25, "city": "Madrid"},
...     {"age": 30, "city": "Barcelona"}
... ])
>>> result = table.replace(
...     replacement={"age": 35, "city": "Paris"},
...     to_replace={"age": 25, "city": "Madrid"}
... )
>>> print(result)
[{'age': 35, 'city': 'Paris'}, {'age': 30, 'city': 'Barcelona'}]
replace_empty(replacement=None, fast=False) Table

Replace empty values in the table.

Replaces empty values (None, empty strings, etc.) with specified replacement values. Can use different replacements for different columns.

Parameters:
  • replacement (any or dict[str, any]) – Values to use for replacing empty values

  • fast (bool) – If True, assume first row contains all columns

Returns:

New table with empty values replaced

Return type:

Table

Example:
>>> table = Table([
...     {"name": "Alice", "age": None},
...     {"name": "", "age": 25}
... ])
>>> result = table.replace_empty(replacement={"name": "Unknown", "age": 0})
>>> print(result)
[{'name': 'Alice', 'age': 0}, {'name': 'Unknown', 'age': 25}]
replace_nan(replacement=None) Table

Replace NaN values in the table.

Replaces NaN (Not a Number) values with specified replacement values. Useful for cleaning numeric data with missing values.

Parameters:

replacement (any or dict[str, any]) – Values to use for replacing NaN values

Returns:

New table with NaN values replaced

Return type:

Table

Example:
>>> import math
>>> table = Table([
...     {"value": 10.5, "score": math.nan},
...     {"value": math.nan, "score": 85.0}
... ])
>>> result = table.replace_nan(replacement=0)
>>> print(result)
[{'value': 10.5, 'score': 0}, {'value': 0, 'score': 85.0}]
pivot_longer(cols, names_to='variable', value_to='value') Table

Transform table from wide to long format.

“Lengthens” data by increasing the number of rows and decreasing the number of columns. The inverse transformation of pivot_wider().

Parameters:
  • cols (list[str]) – List of column names to pivot

  • names_to (str) – Name for the new column containing variable names

  • value_to (str) – Name for the new column containing values

Returns:

New table in long format

Return type:

Table

Example:
>>> table = Table([
...     {"id": 1, "Q1": 100, "Q2": 150, "Q3": 200},
...     {"id": 2, "Q1": 120, "Q2": 180, "Q3": 220}
... ])
>>> result = table.pivot_longer(["Q1", "Q2", "Q3"], "quarter", "sales")
>>> print(result)
[{'id': 1, 'quarter': 'Q1', 'sales': 100},
 {'id': 1, 'quarter': 'Q2', 'sales': 150},
 {'id': 1, 'quarter': 'Q3', 'sales': 200},
 {'id': 2, 'quarter': 'Q1', 'sales': 120},
 {'id': 2, 'quarter': 'Q2', 'sales': 180},
 {'id': 2, 'quarter': 'Q3', 'sales': 220}]
pivot_wider(names_from='variable', value_from='value', id_cols=None, values_fill=None) Table

Transform table from long to wide format.

“Widens” data by increasing the number of columns and decreasing the number of rows. The inverse transformation of pivot_longer().

Parameters:
  • names_from (str) – Name of the column containing variable names

  • value_from (str) – Name of the column containing values

  • id_cols (list[str], optional) – Columns that uniquely identify each observation

  • values_fill (any or dict, optional) – Value or dict to fill missing values

Returns:

New table in wide format

Return type:

Table

Example:
>>> table = Table([
...     {"id": 1, "quarter": "Q1", "sales": 100},
...     {"id": 1, "quarter": "Q2", "sales": 150},
...     {"id": 2, "quarter": "Q1", "sales": 120}
... ])
>>> result = table.pivot_wider("quarter", "sales", "id")
>>> print(result)
[{'id': 1, 'Q1': 100, 'Q2': 150}, {'id': 2, 'Q1': 120, 'Q2': None}]
drop_empty(cols=None, fast=False) Table

Remove rows with empty values in specified columns.

Drops rows where the specified columns contain empty values (None, empty strings, etc.).

Parameters:
  • cols (str or list[str], optional) – Column name(s) to check for empty values

  • fast (bool) – If True, assume first row contains all columns

Returns:

New table with empty rows removed

Return type:

Table

Example:
>>> table = Table([
...     {"name": "Alice", "age": 30},
...     {"name": "", "age": 25},
...     {"name": "Bob", "age": None}
... ])
>>> result = table.drop_empty("name")
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': None}]
lag_col(col, i=1, replace=False) Table

Create a lagged version of a column.

Shifts the values of a column by a specified number of steps. Useful for time series analysis and creating features from previous values.

Parameters:
  • col (str) – Name of the column to lag

  • i (int) – Number of steps to lag (default: 1)

  • replace (bool) – If True, replace original column; if False, create new column

Returns:

New table with lagged column

Return type:

Table

Example:
>>> table = Table([
...     {"date": "2023-01", "sales": 100},
...     {"date": "2023-02", "sales": 150},
...     {"date": "2023-03", "sales": 200}
... ])
>>> result = table.lag_col("sales", 1)
>>> print(result)
[{'date': '2023-01', 'sales': 100, 'lag_sales_1': None},
 {'date': '2023-02', 'sales': 150, 'lag_sales_1': 100},
 {'date': '2023-03', 'sales': 200, 'lag_sales_1': 150}]
distinct(columns) Table

Keep only unique combinations of values in specified columns.

Removes duplicate rows based on the values in the specified columns. When duplicates are found, the first occurrence is kept.

Parameters:

columns (str or list[str]) – Column name(s) to check for uniqueness

Returns:

New table with duplicate rows removed

Return type:

Table

Example:
>>> table = Table([
...     {"name": "Alice", "city": "Madrid"},
...     {"name": "Bob", "city": "Barcelona"},
...     {"name": "Alice", "city": "Madrid"}  # Duplicate
... ])
>>> result = table.distinct("name")
>>> print(result)
[{'name': 'Alice', 'city': 'Madrid'}, {'name': 'Bob', 'city': 'Barcelona'}]
order_by(columns, reverse=False) Table

Sort the table by specified columns.

Reorders the table rows based on the values in the specified columns. Supports both ascending and descending order.

Parameters:
  • columns (str or list[str]) – Column name(s) to sort by

  • reverse (bool) – If True, sort in descending order

Returns:

New table with sorted rows

Return type:

Table

Example:
>>> table = Table([
...     {"name": "Charlie", "age": 35},
...     {"name": "Alice", "age": 30},
...     {"name": "Bob", "age": 25}
... ])
>>> result = table.order_by("age")
>>> print(result)
[{'name': 'Bob', 'age': 25},
 {'name': 'Alice', 'age': 30},
 {'name': 'Charlie', 'age': 35}]
drop_nested() Table

Remove columns containing nested data structures.

Drops columns that contain nested values (dictionaries, lists, or tuples). Assumes homogeneous table structure and checks only the first row.

Returns:

New table with nested columns removed

Return type:

Table

Example:
>>> table = Table([
...     {"name": "Alice", "age": 30, "hobbies": ["reading", "swimming"]},
...     {"name": "Bob", "age": 25, "hobbies": ["gaming"]}
... ])
>>> result = table.drop_nested()
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]
check_type()

Validate that the table is a list of dictionaries.

Checks that the table structure is correct (list of dictionaries). Raises TypeError if the structure is invalid.

Returns:

None

Return type:

None

Raises:

TypeError – If table is not a list of dictionaries

Example:
>>> table = Table([{"name": "Alice"}, {"name": "Bob"}])
>>> table.check_type()  # No error
>>>
>>> invalid_table = Table([1, 2, 3])
>>> invalid_table.check_type()  # Raises TypeError
to_set2(columns) pytups.TupList

Create a list of unique combinations from specified columns.

Extracts unique combinations of values from the specified columns and returns them as a TupList.

Parameters:

columns (str or list[str]) – Column name(s) to extract unique combinations from

Returns:

TupList containing unique combinations

Return type:

TupList

Raises:

KeyError – If specified columns don’t exist in the table

Example:
>>> table = Table([
...     {"name": "Alice", "city": "Madrid"},
...     {"name": "Bob", "city": "Barcelona"},
...     {"name": "Alice", "city": "Madrid"}  # Duplicate
... ])
>>> result = table.to_set2(["name", "city"])
>>> print(result)
[('Alice', 'Madrid'), ('Bob', 'Barcelona')]
to_param(keys, value, is_list=False) pytups.SuperDict

Create a parameter dictionary from specified columns.

Creates a SuperDict using specified columns as keys and values. Useful for creating lookup dictionaries or parameter mappings.

Parameters:
  • keys (str or list[str]) – Column name(s) to use as dictionary keys

  • value (str) – Column name to use as dictionary values

  • is_list (bool) – If True, values can be lists; if False, expects unique keys

Returns:

SuperDict with keys and values from specified columns

Return type:

SuperDict

Raises:
  • KeyError – If specified columns don’t exist

  • ValueError – If keys are not unique and is_list=False

Example:
>>> table = Table([
...     {"id": 1, "name": "Alice", "age": 30},
...     {"id": 2, "name": "Bob", "age": 25}
... ])
>>> result = table.to_param("id", "name")
>>> print(result)
{1: 'Alice', 2: 'Bob'}
is_unique(columns) bool

Check if combinations of values in specified columns are unique.

Determines whether the combination of values in the specified columns forms a unique key for each row.

Parameters:

columns (str or list[str]) – Column name(s) to check for uniqueness

Returns:

True if all combinations are unique

Return type:

bool

Example:
>>> table = Table([
...     {"id": 1, "name": "Alice"},
...     {"id": 2, "name": "Bob"},
...     {"id": 1, "name": "Charlie"}  # Duplicate id
... ])
>>> print(table.is_unique("id"))
False
>>> print(table.is_unique("name"))
True
add_row(**kwargs) Table

Add a new row to the table.

Adds a new row with the specified values. Missing columns are filled with None values to maintain table structure.

Parameters:

kwargs (dict) – Column values in the format column_name=value

Returns:

New table with the added row

Return type:

Table

Example:
>>> table = Table([{"name": "Alice", "age": 30}])
>>> result = table.add_row(name="Bob", age=25)
>>> print(result)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]
rbind(table: list) Table

Bind another table by rows (row bind).

Combines the current table with another table by stacking rows. Missing columns are filled with None values.

Parameters:

table (list[dict] or Table) – Another table to bind (list of dicts or Table)

Returns:

New table with combined rows

Return type:

Table

Example:
>>> table1 = Table([{"name": "Alice", "age": 30}])
>>> table2 = [{"name": "Bob", "city": "Madrid"}]
>>> result = table1.rbind(table2)
>>> print(result)
[{'name': 'Alice', 'age': 30, 'city': None},
 {'name': 'Bob', 'age': None, 'city': 'Madrid'}]
col_apply(columns, func: Callable, **kwargs) Table

Apply a function to specified columns.

Applies a function to the values in specified columns, transforming the data according to the function logic.

Parameters:
  • columns (str or list[str]) – Column name(s) to apply the function to

  • func (Callable) – Function to apply to column values

  • kwargs – Additional keyword arguments for the function

Returns:

New table with transformed columns

Return type:

Table

Example:
>>> table = Table([{"value": 10}, {"value": 20}])
>>> result = table.col_apply("value", lambda x: x * 2)
>>> print(result)
[{'value': 20}, {'value': 40}]
classmethod format_dataset(dataset) dict

Convert dataset dictionary to use Table objects for list values.

Processes a dataset dictionary, converting any list values to Table objects while keeping other values unchanged.

Parameters:

dataset (dict) – Dictionary containing data in various formats

Returns:

Dictionary with list values converted to Table objects

Return type:

dict

Example:
>>> data = {
...     "users": [{"name": "Alice"}, {"name": "Bob"}],
...     "config": {"setting": "value"}
... }
>>> result = Table.format_dataset(data)
>>> print(type(result["users"]))
<class 'mango.table.pytups_table.Table'>
classmethod dataset_from_json(path, **kwargs) dict

Load a JSON file and convert list values to Table objects.

Loads a JSON file and processes it using format_dataset to convert any list values to Table objects.

Parameters:
  • path (str) – Path to the JSON file

  • kwargs – Additional arguments for load_json

Returns:

Dictionary with list values converted to Table objects

Return type:

dict

Example:
>>> # Assuming data.json contains: {"users": [{"name": "Alice"}]}
>>> result = Table.dataset_from_json("data.json")
>>> print(type(result["users"]))
<class 'mango.table.pytups_table.Table'>
classmethod from_pandas(df) Table

Create a Table from a pandas DataFrame.

Converts a pandas DataFrame to a Table object, preserving all data and column names. Requires pandas to be installed.

Parameters:

df (pandas.DataFrame) – Pandas DataFrame to convert

Returns:

New Table object with the DataFrame data

Return type:

Table

Raises:

ImportError – If pandas is not installed

Example:
>>> import pandas as pd
>>> df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [30, 25]})
>>> table = Table.from_pandas(df)
>>> print(table)
[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]
to_pandas() pandas.DataFrame

Convert the Table to a pandas DataFrame.

Creates a pandas DataFrame from the Table data, preserving all columns and rows. Requires pandas to be installed.

Returns:

Pandas DataFrame with the table data

Return type:

pandas.DataFrame

Raises:

ImportError – If pandas is not installed

Example:
>>> table = Table([{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}])
>>> df = table.to_pandas()
>>> print(df)
   name  age
0  Alice   30
1    Bob   25
pk_save(file)

Save the table to a pickle file.

Serializes the table object and saves it to a pickle file. Automatically adds .pickle extension if not present.

Parameters:

file (str) – Path to the pickle file

Returns:

None

Return type:

None

Example:
>>> table = Table([{"name": "Alice", "age": 30}])
>>> table.pk_save("my_table")
# Creates my_table.pickle file
classmethod pk_load(file) Table

Load a Table from a pickle file.

Deserializes a Table object from a pickle file. Automatically adds .pickle extension if not present.

Parameters:

file (str) – Path to the pickle file

Returns:

Table object loaded from the file

Return type:

Table

Example:
>>> table = Table.pk_load("my_table")
>>> print(table)
[{'name': 'Alice', 'age': 30}]
to_json(path)

Export the table to a JSON file.

Saves the table data to a JSON file. Numeric column names are automatically converted to strings for JSON compatibility.

Parameters:

path (str) – Path to the JSON file

Returns:

None

Return type:

None

Example:
>>> table = Table([{"name": "Alice", "age": 30}])
>>> table.to_json("output.json")
# Creates output.json with table data
classmethod from_json(path, **kwargs) Table

Create a table from a JSON file.

Loads data from a JSON file and creates a Table object. The JSON file should contain a list of dictionaries.

Parameters:
  • path (str) – Path to the JSON file

  • kwargs – Additional arguments for load_json

Returns:

New table with data from the JSON file

Return type:

Table

Example:
>>> # Assuming data.json contains: [{"name": "Alice", "age": 30}]
>>> table = Table.from_json("data.json")
>>> print(table)
[{'name': 'Alice', 'age': 30}]
apply(func: Callable, *args, **kwargs)

Apply a function to the entire table.

Passes the entire table as the first argument to the specified function. Useful for chaining custom functions that operate on the whole table.

Parameters:
  • func (Callable) – Function that takes the table as its first argument

  • args – Additional positional arguments for the function

  • kwargs – Additional keyword arguments for the function

Returns:

Result of the function call

Return type:

any

Example:
>>> table = Table([{"value": 10}, {"value": 20}])
>>> def double_table(t):
...     return t.mutate(value=lambda row: row["value"] * 2)
>>> result = table.apply(double_table)
>>> print(result)
[{'value': 20}, {'value': 40}]
classmethod dataset_from_excel(path, sheets=None) dict

Load an Excel file and return a dictionary of Table objects.

Reads an Excel file and converts each sheet to a Table object. Returns a dictionary with sheet names as keys and Table objects as values.

Parameters:
  • path (str) – Path to the Excel file

  • sheets (list[str], optional) – List of sheet names to read (all sheets if None)

Returns:

Dictionary with sheet names and Table objects

Return type:

dict

Example:
>>> # Assuming data.xlsx has sheets "users" and "orders"
>>> result = Table.dataset_from_excel("data.xlsx")
>>> print(list(result.keys()))
['users', 'orders']
to_excel(path, sheet_name=None)

Export the table to an Excel file.

Saves the table data to an Excel file with the specified sheet name. If no sheet name is provided, uses “Sheet1” as default.

Parameters:
  • path (str) – Path to the Excel file

  • sheet_name (str, optional) – Name of the Excel sheet

Returns:

None

Return type:

None

Example:
>>> table = Table([{"name": "Alice", "age": 30}])
>>> table.to_excel("output.xlsx", "Users")
# Creates output.xlsx with "Users" sheet
classmethod from_csv(path, sep=',', encoding=None) Table

Load a table from a CSV file.

Reads a CSV file and creates a Table object from the data. Supports custom separators and encodings.

Parameters:
  • path (str) – Path to the CSV file

  • sep (str) – Column separator (detected automatically if None)

  • encoding (str, optional) – File encoding

Returns:

New table with data from the CSV file

Return type:

Table

Example:
>>> # Assuming data.csv contains: name,age\nAlice,30\nBob,25
>>> table = Table.from_csv("data.csv")
>>> print(table)
[{'name': 'Alice', 'age': '30'}, {'name': 'Bob', 'age': '25'}]
to_csv(path, sep=',', encoding=None)

Export the table to a CSV file.

Saves the table data to a CSV file with the specified separator and encoding.

Parameters:
  • path (str) – Path to the CSV file

  • sep (str) – Column separator for the CSV file

  • encoding (str, optional) – File encoding

Returns:

None

Return type:

None

Example:
>>> table = Table([{"name": "Alice", "age": 30}])
>>> table.to_csv("output.csv", sep=";")
# Creates output.csv with semicolon separator