Designing an extension for TFRecords #3505

Open
maxbendick opened this issue Dec 31, 2017 · 5 comments

Comments

@maxbendick

Generally, I want to use a parser written in Python to render a binary file in a JupyterLab extension.

I'd like to create an extension that can display data from TensorFlow TFRecord files. I can see one way I might do this by following the pdf-extension, but I think creating a mime-renderer extension would require writing a TFRecord parser in JavaScript or TypeScript.

A parser for TFRecords already exists in TensorFlow, but it's written in C++/Python.

Here are two approaches to reusing TensorFlow's parser, both very naive:

  1. The user launches a Python server that reads the TFRecord file. The server is registered in the TypeScript extension. The extension queries the server for the relevant data, then renders it.

  2. In the TypeScript extension, generate Python code to read the record, and execute the code in a kernel. Render the output.
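
To make option 1 more concrete, here is a minimal sketch of the server side, assuming only the Python standard library. It walks the TFRecord framing (a little-endian uint64 length, a uint32 CRC of the length, the payload, then a uint32 CRC of the payload) and serves per-record metadata as JSON. The names `iter_tfrecord_payloads`, `summarize`, `make_handler`, and `serve`, and the port number, are hypothetical. The sketch skips CRC verification and does not decode the payloads. A real server would more likely hand the file to TensorFlow's own reader, which is the point of reusing the existing parser.

```python
import json
import struct
from http.server import BaseHTTPRequestHandler, HTTPServer


def iter_tfrecord_payloads(path):
    """Yield the raw payload bytes of each record in a TFRecord file.

    Per-record framing, all little-endian: uint64 length, uint32 CRC of
    the length, `length` payload bytes, uint32 CRC of the payload.
    The CRCs are skipped here rather than verified.
    """
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte length + 4-byte length CRC
            if not header:
                return  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header[:8])
            payload = f.read(length)
            footer = f.read(4)  # 4-byte payload CRC, ignored
            if len(payload) < length or len(footer) < 4:
                raise ValueError("truncated record")
            yield payload


def summarize(path, limit=100):
    """Return JSON-serializable metadata for up to `limit` records."""
    summary = []
    for index, payload in enumerate(iter_tfrecord_payloads(path)):
        if index >= limit:
            break
        summary.append({"index": index, "size": len(payload)})
    return summary


def make_handler(path):
    """Build a handler that answers every GET with the summary of `path`."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = json.dumps(summarize(path)).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return Handler


def serve(path, port=8765):
    """Serve record summaries on localhost until interrupted (hypothetical port)."""
    HTTPServer(("127.0.0.1", port), make_handler(path)).serve_forever()
```

Calling `serve("data.tfrecord")` would start the server, and the extension would then fetch the JSON and render it. A browser front end talking to a separate port would also need CORS headers or a proxy through the notebook server, which this sketch leaves out.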

Any thoughts? Maybe there are more approaches I'm not seeing.

@ellisonbg
Contributor

ellisonbg commented Jan 1, 2018 via email

@jasongrout
Contributor

jasongrout commented Jan 3, 2018

@danielballan wrote an HDF5 viewer that seems similar to what you want to do in option (1): a server-side library parses the file, a server extension handles data requests, and a front-end renderer displays the result. @danielballan - do you have your code somewhere?

@danielballan
Contributor

Yes. It's not pushed anywhere and I'll have to go digging, but I'll try to find time this weekend.

@danielballan
Contributor

danielballan commented Jan 7, 2018

@maxbendick Here's my work so far:

  • a modification to the notebook server that adds a new configuration option, whereby new data formats ("external formats") such as HDF5 can be mapped to a separate server -- outside of the notebook server's purview -- such as h5serv (this piece was developed with some input from @ellisonbg and @jasongrout)
  • a PR into phosphor (check out the GIF screencap demo) with a lazy-loading data grid that calls out to an h5serv endpoint to load chunks of data as needed
  • a new JupyterLab extension -- the important file is this one -- which is not quite finished
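
The lazy-loading idea in the second bullet can be illustrated with a small sketch. This is not the code from the phosphor PR. It is a generic chunk cache under stated assumptions: `ChunkedLoader` is a hypothetical helper, and `fetch_chunk` is a hypothetical callable that in a real setup would issue a request to the data server and return the rows in `[start, stop)`.

```python
class ChunkedLoader:
    """Fetch fixed-size row chunks on demand and cache them (hypothetical helper)."""

    def __init__(self, fetch_chunk, chunk_size=100):
        self._fetch_chunk = fetch_chunk  # callable: (start, stop) -> list of rows
        self._chunk_size = chunk_size
        self._cache = {}  # chunk index -> list of rows

    def row(self, index):
        """Return one row, fetching its chunk only if it is not cached yet."""
        chunk_id = index // self._chunk_size
        if chunk_id not in self._cache:
            start = chunk_id * self._chunk_size
            self._cache[chunk_id] = self._fetch_chunk(start, start + self._chunk_size)
        return self._cache[chunk_id][index % self._chunk_size]


# Usage with a stand-in for the remote call, recording each fetch.
calls = []


def fake_fetch(start, stop):
    calls.append((start, stop))
    return list(range(start, stop))


loader = ChunkedLoader(fake_fetch, chunk_size=10)
loader.row(3)
loader.row(7)   # same chunk, served from the cache
loader.row(25)  # different chunk, triggers a second fetch
```

The cache here grows without bound. A viewer for large files would probably evict old chunks, which is omitted to keep the sketch short.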

The phosphor PR is the only piece that fully works, but I'll take this opportunity to revisit the work and connect the rest of the pieces....

@dnelson86

See also #448
