Intro example not working #695

Closed
XenonLamb opened this issue Jun 11, 2024 · 3 comments

@XenonLamb

I tried to run the introduction notebook, but encountered the following error:

`---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[3], line 1
----> 1 import mlcroissant as mlc
3 # FileObjects and FileSets define the resources of the dataset.
4 distribution = [
5 # gpt-3 is hosted on a GitHub repository:
6 mlc.FileObject(
(...)
22 ),
23 ]

File ~/.local/lib/python3.9/site-packages/mlcroissant/__init__.py:3
1 """Defines the public interface to the mlcroissant package."""
----> 3 from mlcroissant._src import torch
4 from mlcroissant._src.core import constants
5 from mlcroissant._src.core.constants import DataType

File ~/.local/lib/python3.9/site-packages/mlcroissant/_src/torch/__init__.py:3
1 """PyTorch utilities public API."""
----> 3 from mlcroissant._src.torch.torch_adapter import LoaderFactory
4 from mlcroissant._src.torch.torch_adapter import LoaderSpecificationDataType
6 __all__ = ["LoaderFactory", "LoaderSpecificationDataType"]

File ~/.local/lib/python3.9/site-packages/mlcroissant/_src/torch/torch_adapter/__init__.py:3
1 """PyTorch dataloader-based public API."""
----> 3 from mlcroissant._src.torch.torch_adapter.dataloader import LoaderFactory
4 from mlcroissant._src.torch.torch_adapter.dataloader import LoaderSpecificationDataType
6 __all__ = ["LoaderFactory", "LoaderSpecificationDataType"]

File ~/.local/lib/python3.9/site-packages/mlcroissant/_src/torch/torch_adapter/dataloader.py:11
8 from typing import Any, Dict, Optional
10 from mlcroissant._src.core.optional import deps
---> 11 from mlcroissant._src.datasets import Dataset
13 try:
14 dp = deps.torchdata_datapipes

File ~/.local/lib/python3.9/site-packages/mlcroissant/_src/datasets.py:11
8 from absl import logging
9 from etils import epath
---> 11 from mlcroissant._src.core.context import Context
12 from mlcroissant._src.core.graphs import utils as graphs_utils
13 from mlcroissant._src.core.issues import ValidationError

File ~/.local/lib/python3.9/site-packages/mlcroissant/_src/core/context.py:12
9 from etils import epath
10 import networkx as nx
---> 12 from mlcroissant._src.core.issues import Issues
13 from mlcroissant._src.core.rdf import Rdf
16 class CroissantVersion(enum.Enum):

File ~/.local/lib/python3.9/site-packages/mlcroissant/_src/core/issues.py:16
11 class GenerationError(Exception):
12 """Error during the generation of the dataset."""
15 @dataclasses.dataclass(frozen=True)
---> 16 class Issues:
17 """Issues during the validation of the format.
18
19 Issues can either be errors (blocking) or warnings (informative).
20
21 We use sets to represent errors and warnings to avoid repeated strings.
22 """
24 _errors: set[tuple[str, Any]] = dataclasses.field(default_factory=set, hash=False)

File ~/.local/lib/python3.9/site-packages/mlcroissant/_src/core/issues.py:27, in Issues()
24 _errors: set[tuple[str, Any]] = dataclasses.field(default_factory=set, hash=False)
25 _warnings: set[tuple[str, Any]] = dataclasses.field(default_factory=set, hash=False)
---> 27 def _wrap_in_context(self, context: str | None, issue: str) -> str:
28 if context is None:
29 return issue

TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'`

@luisoala
Contributor

luisoala commented Jun 13, 2024

Hi @XenonLamb, thanks for posting.

I just reran all cells of the notebook on Colab and it worked fine on my end.

Can you reproduce the error on Colab? https://githubtocolab.com/mlcommons/croissant/blob/main/python/mlcroissant/recipes/introduction.ipynb

cc #694

@brendon-boldt

Python version issue, most likely. X | Y typing syntax is supported from 3.10 onward (https://peps.python.org/pep-0604/), and the stack trace is from Python 3.9.
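
To make the point above concrete, here is a minimal sketch (not taken from mlcroissant) that probes whether the running interpreter can evaluate `str | None`, alongside a 3.9-compatible spelling using `typing.Optional`. The function `wrap_in_context` is a hypothetical stand-in that only mirrors the signature visible in the traceback; the formatting in its non-`None` branch is an assumption for illustration.

```python
import sys
from typing import Optional


def union_syntax_works() -> bool:
    """Return True if `str | None` can be evaluated at runtime (PEP 604)."""
    try:
        _ = str | None  # raises TypeError on Python 3.9 and earlier
        return True
    except TypeError:
        return False


def wrap_in_context(context: Optional[str], issue: str) -> str:
    """Hypothetical stand-in mirroring the signature in the traceback.

    Only the `context is None` branch appears in the traceback; the
    formatting used in the other branch is an assumption.
    """
    if context is None:
        return issue
    return f"[{context}] {issue}"


print(sys.version_info[:2], "supports X | Y:", union_syntax_works())
print(wrap_in_context(None, "missing field"))
```

`Optional[str]` means the same thing as `str | None` and is accepted on 3.9. Another option on older interpreters is `from __future__ import annotations`, which defers evaluation of annotations so the `|` form in a signature is not executed at import time. Whether either approach suits the library is a call for the maintainers.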

@XenonLamb
Author

> Python version issue, most likely. X | Y typing syntax is supported from 3.10 onward (https://peps.python.org/pep-0604/), and the stack trace is from Python 3.9.

Yes, after running on a Python 3.10 kernel, the example seems to work. Thank you for the clarification!
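
For anyone else hitting this in a notebook, a small guard cell run before the import can surface the kernel version with a clearer message than the `TypeError` above. This is only a sketch: `check_python` and `MIN_PYTHON` are hypothetical names, and the 3.10 minimum is an assumption drawn from this thread, not from mlcroissant's package metadata.

```python
import sys

# Assumption based on this thread, not on mlcroissant's declared requirements.
MIN_PYTHON = (3, 10)


def check_python(version_info=sys.version_info, minimum=MIN_PYTHON) -> None:
    """Raise RuntimeError if the interpreter is older than `minimum`."""
    current = tuple(version_info[:2])
    if current < minimum:
        raise RuntimeError(
            f"Python {current[0]}.{current[1]} detected; "
            f"this notebook appears to need {minimum[0]}.{minimum[1]} or newer."
        )


# Show which interpreter the kernel is actually using.
print("kernel interpreter:", sys.executable)
print("kernel version:", ".".join(map(str, sys.version_info[:3])))
```

Calling `check_python()` at the top of the notebook, before `import mlcroissant as mlc`, would stop early on a 3.9 kernel.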
