# Session 1: Data and simple neural networks
The first lesson explores data structures in TensorFlow (tf), focusing on how to represent multichannel microelectrode time series. We will load data into tf structures, manipulate them, and visualize them.
Before you proceed, ensure that you have a functional Jupyter notebook server running and that you can open notebooks in your web browser. The main repository README has instructions under Getting Started. Please also ensure that the example datasets are available in your environment.
Run the notebook titled "Lesson1-TestEnvironment.ipynb".
TODO: Create that notebook, with some blocks to test keras, tensorflow, gpu, pytorch, fast.ai
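The environment-test notebook could start with a block like the following, which reports which of the course libraries can be imported (a sketch; in the notebook itself you would also check for a GPU, e.g. with `tf.config.list_physical_devices('GPU')`):

```python
import importlib.util

def check_available(*modules):
    """Return {module_name: True/False} for whether each module can be imported."""
    return {m: importlib.util.find_spec(m) is not None for m in modules}

# Libraries the environment-test notebook is expected to cover:
status = check_available("tensorflow", "keras", "torch", "fastai")
for name, ok in status.items():
    print(f"{name}: {'OK' if ok else 'MISSING'}")
```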
- Other notebook tips
- Tab completion
- Shift+Tab completion (inline help)
- `?` (show an object's docstring)
- `??` (show an object's source, where available)
- Press "h" for a list of keyboard shortcuts
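The `?`/`??` shortcuts are IPython features, but the same introspection is available in plain Python through the standard `inspect` module (a sketch, using `json` as an example module):

```python
import inspect
import json

# `json.dumps?` in a notebook shows the docstring; programmatically:
print(inspect.getdoc(json.dumps))

# `json.loads??` additionally shows the source, where available:
print(inspect.getsource(json.loads).splitlines()[0])
```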
For your reference, we have provided the scripts that we use to download datasets from the internet, import them, parse out the associated behaviour/labels, and save them to an intermediate data format. Even though importing your own data will likely be an entirely different process, let's walk through the data import process together to cover the general concepts.
Run the notebook named "Lesson1-DataImport.ipynb". TODO: Make this notebook. Import raw data using python-neo. Inspect the data structures. Convert to a common data format.
The common format will include the following variables:

- `data`: a numpy array of the raw data with shape (n_channels, n_samples). David's data will also need another dimension for segments, or, if segments are of unequal length, `data` will be a list of arrays.
- `timestamps`: a numpy array of shape (n_samples,), or a list of arrays for multiple segments.
- `channels`: a pandas dataframe of length n_channels. It must have columns for `name`, `pos_x`, `pos_y`, `pos_z`, and any other relevant information.
- `events`: a pandas dataframe of length n_events. It must have columns for `timestamp`, `type`, `value`, and any other relevant information.
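The common format above can be sketched with numpy and pandas as follows (all sizes and values here are hypothetical placeholders):

```python
import numpy as np
import pandas as pd

n_channels, n_samples, fs = 4, 1000, 1000.0  # hypothetical sizes and sampling rate

# data: raw signal, shape (n_channels, n_samples)
data = np.random.randn(n_channels, n_samples).astype(np.float32)

# timestamps: one per sample, shape (n_samples,)
timestamps = np.arange(n_samples) / fs

# channels: one row per channel, with name and 3-D electrode position
channels = pd.DataFrame({
    "name": [f"ch{i}" for i in range(n_channels)],
    "pos_x": np.zeros(n_channels),
    "pos_y": np.zeros(n_channels),
    "pos_z": np.zeros(n_channels),
})

# events: one row per behavioural event
events = pd.DataFrame({
    "timestamp": [0.25, 0.75],
    "type": ["cue", "response"],
    "value": [1, 1],
})

print(data.shape, timestamps.shape, len(channels), len(events))
```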
Run the notebook named "Lesson1-ExploreData.ipynb". TODO: Make this notebook.
* Arrays: matrices and tensors. For each of our example datasets:
* Explain the experiment if applicable. Describe the recording setup (electrodes, amps, other measures).
* Print data shape
* Print some of the contents
* Look at scale, precision. Data should be standardized.
* FP16 vs FP32 on GPU.
* Print additional structure (labels, kinematics, etc.)
* Visualize individual trials, colour coded by condition
* Visualize condition-average (much information lost)
* Visualize covariance structure.
* Tensor decomposition
* Domain expertise and feature engineering
* Become experts in the neurophysiology of Parkinson's disease (PD) --> beta burst length and phase-amplitude coupling (PAC)
* Basal ganglia (BG)-thalamocortical network has oscillatory activity --> time-frequency transform to spectrogram
* Beta band-pass filter --> Hilbert transform --> Instantaneous amplitude/phase
* Become experts in intracortical array neurophysiology --> "Neural modes"
* High-pass filter
* Threshold
* Spike-sorting
* Demultiplexing?
* Binned spike counts
* Counts to rates
* Dimensionality reduction (tensor decomp; factor analysis)
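The beta-band pipeline above (band-pass filter --> Hilbert transform --> instantaneous amplitude/phase) can be sketched with scipy on a synthetic signal; the sampling rate and band edges here are assumptions for illustration:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000.0  # assumed sampling rate in Hz
t = np.arange(0, 2.0, 1 / fs)
rng = np.random.default_rng(0)

# synthetic LFP: a 20 Hz (beta) oscillation plus noise
x = np.sin(2 * np.pi * 20 * t) + 0.5 * rng.standard_normal(t.size)

# beta band-pass (13-30 Hz), zero-phase filtering
b, a = butter(4, [13, 30], btype="bandpass", fs=fs)
beta = filtfilt(b, a, x)

# Hilbert transform -> analytic signal -> instantaneous amplitude and phase
analytic = hilbert(beta)
amplitude = np.abs(analytic)   # envelope; thresholding this yields beta bursts
phase = np.angle(analytic)     # instantaneous phase, used for PAC
```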
We will now try to classify data using a simple 1-layer neural network with linear activations. Run "Lesson1-SimpleClassification.ipynb". TODO: Make this notebook.
* Features can then be used in 'simpler' ML algorithm
* Describe Linear Discriminant Analysis using engineered features
* Analytical solution
* Show e.g. LDA in neural network parlance. (https://www.jstor.org/stable/2584434)
* Loss function
* Regularization
* Loss gradient
* Learning rate
* Why log(p) instead of accuracy?
* In some cases, neural networks eliminate much of the need for feature engineering.
* Indeed, with enough data and enough parameters, a network can learn the relevant features itself (cf. universal approximation results), making explicit feature engineering unnecessary.
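As a sketch of the ideas above (a one-layer network with linear weights and softmax output, trained by gradient descent on the log(p) cross-entropy loss with L2 regularization and a fixed learning rate), here is a minimal numpy example on hypothetical synthetic features:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical engineered features for two classes
n, d = 200, 5
X = np.vstack([rng.normal(-1, 1, (n // 2, d)), rng.normal(1, 1, (n // 2, d))])
y = np.repeat([0, 1], n // 2)
Y = np.eye(2)[y]  # one-hot labels

W = np.zeros((d, 2))
b = np.zeros(2)
lr, lam = 0.1, 1e-3  # learning rate and L2 regularization strength

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

losses = []
for _ in range(200):
    P = softmax(X @ W + b)
    # cross-entropy (negative log-likelihood) plus L2 penalty
    loss = -np.mean(np.sum(Y * np.log(P + 1e-12), axis=1)) + lam * np.sum(W**2)
    losses.append(loss)
    # gradient of the loss w.r.t. weights and bias
    G = (P - Y) / n
    W -= lr * (X.T @ G + 2 * lam * W)
    b -= lr * G.sum(axis=0)

acc = np.mean(P.argmax(axis=1) == y)
print(f"loss {losses[0]:.3f} -> {losses[-1]:.3f}, train accuracy {acc:.2f}")
```

Note that cross-entropy, unlike raw accuracy, is smooth in the parameters, which is what makes the gradient step above well defined.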