Implement a Jupyter Widget for ITables #319

Merged
merged 36 commits into from
Sep 22, 2024

Conversation

mwouts
Owner

@mwouts mwouts commented Sep 12, 2024

I have used AnyWidget to provide the widget, as suggested at #267 (comment)

Closes #267
Closes #250

TODO

  • Make sure that the selected rows are consistent with the original number of rows (before downsampling)
  • The row selection could be preserved when update is called with selected_rows=None (the default)
  • Decide which traits should be public/private; use traits and setter/getters instead of update
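For the first TODO item, here is a rough sketch of how a row position in a downsampled table could be mapped back to the original table. It assumes, purely for illustration, that downsampling keeps the first `max_rows // 2` rows and the remaining rows from the end of the table; the function names are hypothetical and are not taken from ITables.

```python
from typing import Optional


def displayed_to_original(displayed_index: int, n_rows: int, max_rows: int) -> int:
    """Map a row position in the downsampled table to the original table.

    Assumption (hypothetical): downsampling keeps the first max_rows // 2 rows
    and the last max_rows - max_rows // 2 rows.
    """
    if n_rows <= max_rows:
        return displayed_index  # nothing was dropped
    head = max_rows // 2
    if displayed_index < head:
        return displayed_index
    # Rows after the head are shifted by the number of dropped rows
    return displayed_index + (n_rows - max_rows)


def original_to_displayed(original_index: int, n_rows: int, max_rows: int) -> Optional[int]:
    """Inverse mapping; returns None when the original row is not displayed."""
    if n_rows <= max_rows:
        return original_index
    head = max_rows // 2
    tail = max_rows - head
    if original_index < head:
        return original_index
    if original_index >= n_rows - tail:
        return original_index - (n_rows - max_rows)
    return None
```

With such a mapping, a selection could be stored in terms of the original row numbers and translated only when talking to the displayed table.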


Thank you for making this pull request.

Did you know? You can try it on Binder.

Also, the version of ITables developed in this PR can be installed with pip:

pip install git+https://github.com/mwouts/itables.git@try_anywidget

(this requires nodejs, see more at Developing ITables)

@codecov-commenter

codecov-commenter commented Sep 12, 2024

Codecov Report

Attention: Patch coverage is 80.43478% with 36 lines in your changes missing coverage. Please review.

Project coverage is 93.65%. Comparing base (20546b2) to head (a177e4a).
Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/itables/javascript.py | 67.39% | 15 Missing ⚠️ |
| src/itables/widget/__init__.py | 90.47% | 8 Missing ⚠️ |
| tests/sample_python_apps/itables_in_a_shiny_app.py | 41.66% | 7 Missing ⚠️ |
| src/itables/shiny.py | 73.91% | 6 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #319      +/-   ##
==========================================
- Coverage   95.88%   93.65%   -2.24%     
==========================================
  Files          27       28       +1     
  Lines        1191     1339     +148     
==========================================
+ Hits         1142     1254     +112     
- Misses         49       85      +36     
Flag Coverage Δ
93.65% <80.43%> (-2.24%) ⬇️


@mwouts mwouts force-pushed the try_anywidget branch 2 times, most recently from 4f0f7c8 to d22ac3c Compare September 16, 2024 22:40
@jgunstone

jgunstone commented Sep 17, 2024

just tried to run through the docs (on binder and locally) and got this error:

import ipywidgets as widgets

from itables import show
from itables.sample_dfs import get_dict_of_test_dfs

sample_dfs = get_dict_of_test_dfs()


def use_show_in_interactive_output(table_name: str):
    show(
        sample_dfs[table_name],
        caption=table_name,
        style="table-layout:auto;width:auto;float:left;caption-side:bottom",
    )


table_selector = widgets.Dropdown(options=sample_dfs.keys(), value="int_float_str")
out = widgets.interactive_output(
    use_show_in_interactive_output, {"table_name": table_selector}
)

widgets.VBox([table_selector, out])
full stack trace
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In[1], line 6
      3 from itables import show
      4 from itables.sample_dfs import get_dict_of_test_dfs
----> 6 sample_dfs = get_dict_of_test_dfs()
      9 def use_show_in_interactive_output(table_name: str):
     10     show(
     11         sample_dfs[table_name],
     12         caption=table_name,
     13         style="table-layout:auto;width:auto;float:left;caption-side:bottom",
     14     )

File ~/miniforge3/envs/complexapps-2024/lib/python3.11/site-packages/itables/sample_dfs.py:202, in get_dict_of_test_dfs(N, M, polars)
    103 def get_dict_of_test_dfs(N=100, M=100, polars=False):
    104     NM_values = np.reshape(np.linspace(start=0.0, stop=1.0, num=N * M), (N, M))
    106     test_dfs = {
    107         "empty": pd.DataFrame(dtype=float),
    108         "no_rows": pd.DataFrame(dtype=float, columns=["a"]),
    109         "no_columns": pd.DataFrame(dtype=float, index=["a"]),
    110         "no_rows_one_column": pd.DataFrame([1.0], index=["a"], columns=["a"]).iloc[:0],
    111         "no_columns_one_row": pd.DataFrame([1.0], index=["a"], columns=["a"]).iloc[
    112             :, :0
    113         ],
    114         "bool": pd.DataFrame(
    115             [[True, True, False, False], [True, False, True, False]],
    116             columns=list("abcd"),
    117         ),
    118         "nullable_boolean": pd.DataFrame(
    119             [
    120                 [True, True, False, None],
    121                 [True, False, None, False],
    122                 [None, False, True, False],
    123             ],
    124             columns=list("abcd"),
    125             dtype="bool" if PANDAS_VERSION_MAJOR == 0 else "boolean",
    126         ),
    127         "int": pd.DataFrame(
    128             [[-1, 2, -3, 4, -5], [6, -7, 8, -9, 10]], columns=list("abcde")
    129         ),
    130         "nullable_int": pd.DataFrame(
    131             [[-1, 2, -3], [4, -5, 6], [None, 7, None]],
    132             columns=list("abc"),
    133             dtype="Int64",
    134         ),
    135         "float": pd.DataFrame(
    136             {
    137                 "int": [0.0, 1],
    138                 "inf": [np.inf, -np.inf],
    139                 "nan": [np.nan, -np.nan],
    140                 "math": [math.pi, math.e],
    141             }
    142         ),
    143         "str": pd.DataFrame(
    144             {
    145                 "text_column": ["some", "text"],
    146                 "very_long_text_column": ["a " + "very " * 12 + "long text"] * 2,
    147             }
    148         ),
    149         "time": pd.DataFrame(
    150             {
    151                 "datetime": [datetime(2000, 1, 1), datetime(2001, 1, 1), pd.NaT],
    152                 "timestamp": [
    153                     pd.NaT,
    154                     datetime(2000, 1, 1, 18, 55, 33),
    155                     datetime(
    156                         2001,
    157                         1,
    158                         1,
    159                         18,
    160                         55,
    161                         55,
    162                         456654,
    163                         tzinfo=None if pytz is None else pytz.timezone("US/Eastern"),
    164                     ),
    165                 ],
    166                 "timedelta": [
    167                     timedelta(days=2),
    168                     timedelta(seconds=50),
    169                     pd.NaT - datetime(2000, 1, 1),
    170                 ],
    171             }
    172         ),
    173         "date_range": pd.DataFrame(
    174             {"timestamps": pd.date_range("now", periods=5, freq="s")}
    175         ),
    176         "ordered_categories": pd.DataFrame(
    177             {"int": np.arange(4)},
    178             index=pd.CategoricalIndex(
    179                 ["first", "second", "third", "fourth"],
    180                 categories=["first", "second", "third", "fourth"],
    181                 ordered=True,
    182                 name="categorical_index",
    183             ),
    184         ),
    185         "ordered_categories_in_multiindex": pd.DataFrame(
    186             {"int": np.arange(4), "integer_index": np.arange(4)},
    187             index=pd.CategoricalIndex(
    188                 ["first", "second", "third", "fourth"],
    189                 categories=["first", "second", "third", "fourth"],
    190                 ordered=True,
    191                 name="categorical_index",
    192             ),
    193         ).set_index("integer_index", append=True),
    194         "object": pd.DataFrame(
    195             {"dict": [{"a": 1}, {"b": 2, "c": 3}], "list": [["a"], [1, 2]]}
    196         ),
    197         "multiindex": pd.DataFrame(
    198             np.arange(16).reshape((4, 4)),
    199             columns=pd.MultiIndex.from_product((["A", "B"], [1, 2])),
    200             index=pd.MultiIndex.from_product((["C", "D"], [3, 4])),
    201         ),
--> 202         "countries": get_countries(),
    203         "capital": get_countries().set_index(["region", "country"])[["capital"]],
    204         "complex_index": get_df_complex_index(),
    205         "int_float_str": pd.DataFrame(
    206             {
    207                 "int": range(N),
    208                 "float": np.linspace(5.0, 0.0, N),
    209                 "str": [
    210                     letter for letter, _ in zip(cycle(string.ascii_lowercase), range(N))
    211                 ],
    212             }
    213         ),
    214         "wide": pd.DataFrame(
    215             NM_values,
    216             index=["row_{}".format(i) for i in range(N)],
    217             columns=["column_{}".format(j) for j in range(M)],
    218         ),
    219         "long_column_names": pd.DataFrame(
    220             {
    221                 "short name": [0] * 5,
    222                 "very " * 5 + "long name": [0] * 5,
    223                 "very " * 10 + "long name": [1] * 5,
    224                 "very " * 20 + "long name": [2] * 5,
    225                 "nospacein" + "very" * 50 + "longname": [3] * 5,
    226                 "nospacein" + "very" * 100 + "longname": [3] * 5,
    227             }
    228         ),
    229         "sorted_index": pd.DataFrame(
    230             {"i": [0, 1, 2], "x": [0.0, 1.0, 2.0], "y": [0.0, 0.1, 0.2]}
    231         ).set_index(["i"]),
    232         "reverse_sorted_index": pd.DataFrame(
    233             {"i": [2, 1, 0], "x": [0.0, 1.0, 2.0], "y": [0.0, 0.1, 0.2]}
    234         ).set_index(["i"]),
    235         "sorted_multiindex": pd.DataFrame(
    236             {"i": [0, 1, 2], "j": [3, 4, 5], "x": [0.0, 1.0, 2.0], "y": [0.0, 0.1, 0.2]}
    237         ).set_index(["i", "j"]),
    238         "unsorted_index": pd.DataFrame(
    239             {"i": [0, 2, 1], "x": [0.0, 1.0, 2.0], "y": [0.0, 0.1, 0.2]}
    240         ).set_index(["i"]),
    241         "duplicated_columns": pd.DataFrame(
    242             np.arange(4, 8).reshape((2, 2)),
    243             columns=pd.Index(["A", "A"]),
    244             index=pd.MultiIndex.from_arrays(
    245                 np.arange(4).reshape((2, 2)), names=["A", "A"]
    246             ),
    247         ),
    248         "named_column_index": pd.DataFrame({"a": [1]}).rename_axis("columns", axis=1),
    249         "big_integers": pd.DataFrame(
    250             {
    251                 "bigint": [
    252                     1234567890123456789,
    253                     2345678901234567890,
    254                     3456789012345678901,
    255                 ],
    256                 "expected": [
    257                     "1234567890123456789",
    258                     "2345678901234567890",
    259                     "3456789012345678901",
    260                 ],
    261             }
    262         ),
    263     }
    265     if polars:
    266         import polars as pl

File ~/miniforge3/envs/complexapps-2024/lib/python3.11/site-packages/itables/sample_dfs.py:43, in get_countries(html)
     40 def get_countries(html=True):
     41     """A Pandas DataFrame with the world countries (from the world bank data)
     42     Flags are loaded from https://flagpedia.net/"""
---> 43     df = pd.read_csv(find_package_file("samples/countries.csv"))
     44     df = df.rename(columns={"capitalCity": "capital", "name": "country"})
     45     df["iso2Code"] = df["iso2Code"].fillna("NA")  # Namibia

File ~/miniforge3/envs/complexapps-2024/lib/python3.11/site-packages/pandas/io/parsers/readers.py:1026, in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, date_format, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options, dtype_backend)
   1013 kwds_defaults = _refine_defaults_read(
   1014     dialect,
   1015     delimiter,
   (...)
   1022     dtype_backend=dtype_backend,
   1023 )
   1024 kwds.update(kwds_defaults)
-> 1026 return _read(filepath_or_buffer, kwds)

File ~/miniforge3/envs/complexapps-2024/lib/python3.11/site-packages/pandas/io/parsers/readers.py:620, in _read(filepath_or_buffer, kwds)
    617 _validate_names(kwds.get("names", None))
    619 # Create the parser.
--> 620 parser = TextFileReader(filepath_or_buffer, **kwds)
    622 if chunksize or iterator:
    623     return parser

File ~/miniforge3/envs/complexapps-2024/lib/python3.11/site-packages/pandas/io/parsers/readers.py:1620, in TextFileReader.__init__(self, f, engine, **kwds)
   1617     self.options["has_index_names"] = kwds["has_index_names"]
   1619 self.handles: IOHandles | None = None
-> 1620 self._engine = self._make_engine(f, self.engine)

File ~/miniforge3/envs/complexapps-2024/lib/python3.11/site-packages/pandas/io/parsers/readers.py:1880, in TextFileReader._make_engine(self, f, engine)
   1878     if "b" not in mode:
   1879         mode += "b"
-> 1880 self.handles = get_handle(
   1881     f,
   1882     mode,
   1883     encoding=self.options.get("encoding", None),
   1884     compression=self.options.get("compression", None),
   1885     memory_map=self.options.get("memory_map", False),
   1886     is_text=is_text,
   1887     errors=self.options.get("encoding_errors", "strict"),
   1888     storage_options=self.options.get("storage_options", None),
   1889 )
   1890 assert self.handles is not None
   1891 f = self.handles.handle

File ~/miniforge3/envs/complexapps-2024/lib/python3.11/site-packages/pandas/io/common.py:873, in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
    868 elif isinstance(handle, str):
    869     # Check whether the filename is to be opened in binary mode.
    870     # Binary mode does not support 'encoding' and 'newline'.
    871     if ioargs.encoding and "b" not in ioargs.mode:
    872         # Encoding
--> 873         handle = open(
    874             handle,
    875             ioargs.mode,
    876             encoding=ioargs.encoding,
    877             errors=errors,
    878             newline="",
    879         )
    880     else:
    881         # Binary mode
    882         handle = open(handle, ioargs.mode)

FileNotFoundError: [Errno 2] No such file or directory: '/home/jovyan/miniforge3/envs/complexapps-2024/lib/python3.11/site-packages/itables/samples/countries.csv'

@mwouts
Owner Author

mwouts commented Sep 17, 2024

> just tried to run through the docs (on binder and locally) and got this error:
>
> (...)
> FileNotFoundError: [Errno 2] No such file or directory: '/home/jovyan/miniforge3/envs/complexapps-2024/lib/python3.11/site-packages/itables/samples/countries.csv'

Thanks for giving it a go, and sorry about that - My attempt to simplify the pyproject.toml didn't go as expected...

This should be fixed now; at least I have seen this notebook run on Binder: https://mybinder.org/v2/gh/mwouts/itables/try_anywidget?urlpath=lab/tree/docs/ipywidgets.md. Let me know what you think! Thanks
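For anyone hitting a similar error, one way to check whether a data file actually ships with an installed package is sketched below. It only uses the standard library (`importlib.resources.files`, Python 3.9+); the helper name `package_file_exists` is made up for this example and is not part of ITables.

```python
from importlib.resources import files


def package_file_exists(package: str, relative_path: str) -> bool:
    """Return True if `relative_path` exists as a file inside the installed package."""
    resource = files(package).joinpath(*relative_path.split("/"))
    return resource.is_file()


# A file that certainly exists in the standard library:
print(package_file_exists("json", "__init__.py"))

# With ITables installed, the file from the traceback could be checked with
# package_file_exists("itables", "samples/countries.csv")
```

A `False` result for a file that the code expects points at a packaging problem (for instance package data missing from the build configuration) rather than at the calling code.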

@jgunstone

jgunstone commented Sep 19, 2024

thanks @mwouts for the fix -
generally looks and works great, I can see the ability to provide bits of interaction whilst keeping the look and feel of itables could be v useful.

just had a quick play and have a few comments to address as you see fit:

  • I personally think that value would be a better trait name than data as it matches the ipywidgets lib. (that said, ipydatagrid uses data for dataframes so there is already a precedent for that)
  • as a general comment, might be worth reviewing the trait names and data structures from ipydatagrid as that would make it easy for users to switch between the two based on their use case (though I acknowledge the mapping is unlikely to be clean)
  • the traits don't seem to support bidirectional communication, i.e. once an ITable has been instantiated the data trait cannot be updated from python by going table.data = [...], or table.dt_args = {...}. This would feel natural for an ipywidget. Note, I developed the ipyautoui library which is a pure python lib built on ipywidgets. I struggled a little with this and arrived at a solution whereby the trait is _value and then I have a value setter and getter, enabling me to do stuff on set and get whilst the widget still feels like an ipywidget.
  • confirmed that the selected_rows is independent of search which is great
  • wondered if saving data as records (i.e. list of dicts) would be better?
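To illustrate the last point, here is a small sketch of the "records" layout (a list of dicts) next to the list-of-rows layout. The helper `rows_to_records` and the sample values are invented for the example; the column names mimic the "ordered_categories" sample table.

```python
def rows_to_records(columns, rows):
    """Turn a list of rows into a list of {column: value} dicts."""
    return [dict(zip(columns, row)) for row in rows]


# Made-up sample content, for illustration only
columns = ["categorical_index", "int"]
rows = [["first", 0], ["second", 1]]

records = rows_to_records(columns, rows)
print(records)
```

The records form repeats the column names in every row, which makes each row self-describing at the cost of a larger payload.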

you can copy the markdown below into your ipywidgets.md file for more info

## JG Comments

```{code-cell} ipython3


import ipywidgets as widgets
from itables import show
from itables.sample_dfs import get_dict_of_test_dfs
from itables.widget import ITable


sample_dfs = get_dict_of_test_dfs()
name = "ordered_categories"
df = sample_dfs[name]

table = ITable()  

table = ITable(
    df,
    caption=name,
    select=True,
    style="table-layout:auto;width:auto;float:left",
)
table
# view traits

table.traits()
columns = [c["title"] for c in table.dt_args["columns"]]
columns
# `data` is the trait name for the widget value
# as a big ipywidgets user, I'd advocate using `value` instead 
# as that then becomes consistent with all the other ipywidgets
table.data  # would a list of records be possible?
columns = [c["title"] for c in table.dt_args["columns"]]
[dict(zip(columns, d)) for d in table.data]
# it's not possible to set the table trait value... which would be nice
table.data = [['a', 0], ['b', 1], ['c', 2], ['d', 3]]
table = ITable(df)  
table
# used this to check that the `selected_rows` doesn't care about search - it doesn't - which is great 
name1 = "countries"
df1 = sample_dfs[name1]
table1 = ITable(
    df1,
    caption=name1,
    select=True,
    style="table-layout:auto;width:auto;float:left",
)
table1
table1.selected_rows
```

@mwouts
Owner Author

mwouts commented Sep 19, 2024

Hi @jgunstone , thank you so much for your feedback, that's really helpful!

> I personally think that value would be a better trait name than data as it matches the ipywidgets lib. (that said, ipydatagrid uses data for dataframes so there is already a precedent for that)

Well, interesting that you mention that! I was seeing data and dt_args as internal traits, and I don't really expect users to modify them. Instead, I was thinking that you would use the update method to update the data and the dt args by passing the dataframe and the usual options directly (if you don't mind, can you give it a try and let me know what you think?)

Internally the update method transforms df into the appropriate list of rows, defines the columns, and increases destroy_and_recreate to refresh the table (refreshing on data or dt_args separately causes issues when the column definitions don't match the row length).

model.on('change:destroy_and_recreate', () => {
    create_table(true);
});
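The Python side of that idea could look roughly like the sketch below. It is not the actual ITables code: `TableState` is a hypothetical stand-in for the widget, and it takes a list of dicts instead of a DataFrame so that the example has no dependencies. The point is that the rows and the column definitions change together, and the counter is bumped once so the front end rebuilds the table in a single step.

```python
class TableState:
    """Hypothetical stand-in for the widget state (not the ITables class)."""

    def __init__(self):
        self.columns = []              # column definitions, e.g. [{"title": "a"}]
        self.data = []                 # list of rows, each a list of cell values
        self.destroy_and_recreate = 0  # bumped whenever the table must be rebuilt

    def update(self, records):
        """Replace the rows and columns together, then request one refresh."""
        names = list(records[0]) if records else []
        self.columns = [{"title": name} for name in names]
        self.data = [[record[name] for name in names] for record in records]
        # A single increment signals "rows and columns are consistent again"
        self.destroy_and_recreate += 1


state = TableState()
state.update([{"a": 1, "b": 2}, {"a": 3, "b": 4}])
print(state.columns, state.data, state.destroy_and_recreate)
```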

At the very least I should make that more explicit in the documentation. I can also move the traits that I don't think people should use to underscore names as you suggest, that's a good point! Actually, the traits that I would like to expose are the following:

  • selected_rows, to set or retrieve a selection (I think that's the most useful one!)
  • caption, style and classes

At some point I plan to make the tables editable (#243, will require a subscription to datatables' editor), but until then I don't want to expose data directly (it's not a one-to-one conversion of df, etc). I'd be curious to take a look at how you define setters and getters: setting df for instance would be ideal and possibly more idiomatic than update.

Also thanks for pointing out ipydatagrid! At first sight we seem to have the same approach re passing the data as a DataFrame through the first argument of the widget. Do you see people interacting directly with the data attribute maybe?

@jgunstone

jgunstone commented Sep 20, 2024

Hi hi -
this is how they do the setter / getter in ipydatagrid:
https://github.com/jupyter-widgets/ipydatagrid/blob/f7fab2945d89063eaa647fb7e9f94cc1c140d7bb/ipydatagrid/datagrid.py#L465-L492

and then the trait is _data - maybe you could do a similar thing by putting the code in your update method into the setter? I think it would be nice to interact with data in this way.
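A stripped-down sketch of that pattern, in plain Python rather than traitlets so that it runs on its own. `TableModel` is a hypothetical class, and the plain `_value` attribute stands in for the private synced trait; this is not the ipydatagrid or ITables implementation.

```python
class TableModel:
    """Hypothetical widget-like class with a private attribute and a public property."""

    def __init__(self, value=None):
        self._value = []  # stands in for the private, synced trait
        if value is not None:
            self.value = value  # goes through the setter below

    @property
    def value(self):
        # Return copies so that callers cannot mutate the internal state by accident
        return [list(row) for row in self._value]

    @value.setter
    def value(self, rows):
        rows = [list(row) for row in rows]
        # "Do stuff on set": here, check that all rows have the same length
        if len({len(row) for row in rows}) > 1:
            raise ValueError("all rows must have the same number of cells")
        self._value = rows


model = TableModel()
model.value = [["a", 0], ["b", 1]]
print(model.value)
```

From the user's point of view `model.value = ...` reads like setting an ordinary ipywidgets trait, while the class keeps control over validation and conversion.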

just playing with dt_args and getting a little confused (though tbh I haven't done loads of customisation stuff with itables so not super familiar generally) -

what vars would typically be passed to dt_args, and how are they distinct from what would just be passed as **kwargs?

import pandas as pd
import itables.options as opt
from itables import init_notebook_mode, show
from itables.sample_dfs import get_countries
from itables.widget import ITable  # needed for the ITable calls below

df = get_countries(html=False)
init_notebook_mode(all_interactive=True)

show(df, classes="display nowrap compact")
ITable(df, classes="display nowrap compact")
# ^ this works the same as show which is nice from a user perspective. 


ITable(df, dt_args=dict(classes="display nowrap compact"))
# ^ this doesn't do anything.... 

@mwouts mwouts merged commit ef5cd58 into main Sep 22, 2024
14 checks passed
@mwouts mwouts deleted the try_anywidget branch September 22, 2024 13:12
mwouts added a commit that referenced this pull request Sep 22, 2024
Selected rows in the widget, in Streamlit and in Shiny (#250)
Version 2.2.0
@mwouts
Owner Author

mwouts commented Sep 22, 2024

The Jupyter Widget is now part of ITables v2.2. See https://mwouts.github.io/itables/ipywidgets.html for the documentation.

Thank you @jgunstone for your feedback on the widget, it has been very helpful. Since our last chat I have made sure that only the traits that the user can modify directly are public. I have also added a df property and setter to let the user modify the underlying dataframe more easily - examples are available in the documentation.

Re your last question about dt_args, that's an internal distinction that I make between the arguments that are passed to the JavaScript DataTable constructor, and the other ones (e.g. caption, style, classes, selected rows...). As a user you don't need to make that distinction.
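As a rough illustration of that distinction (not the actual ITables code), the keyword arguments could be split as in the sketch below. The function `split_kwargs` and the set `NON_DT_ARGS` are hypothetical names, and the set only lists the examples mentioned above.

```python
# Arguments handled on the Python/HTML side (assumed list, for illustration)
NON_DT_ARGS = {"caption", "style", "classes", "selected_rows"}


def split_kwargs(**kwargs):
    """Separate the DataTable constructor arguments from the other ones."""
    other_args = {k: v for k, v in kwargs.items() if k in NON_DT_ARGS}
    dt_args = {k: v for k, v in kwargs.items() if k not in NON_DT_ARGS}
    return dt_args, other_args


dt_args, other_args = split_kwargs(classes="display nowrap compact", paging=False)
print(dt_args)     # arguments for the JavaScript DataTable constructor
print(other_args)  # arguments handled elsewhere
```

Under this reading, an argument like classes passed inside dt_args would be forwarded to the DataTable constructor as an unknown option instead of being applied to the table element, which would be consistent with it having no visible effect.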

Successfully merging this pull request may close these issues.

Implement a Jupyter Widget for ITables Support for Selecting rows?
3 participants