diff --git a/notebooks/02 sklearn Pipeline.ipynb b/notebooks/02 sklearn Pipeline.ipynb
index b205f720..138ca562 100644
--- a/notebooks/02 sklearn Pipeline.ipynb
+++ b/notebooks/02 sklearn Pipeline.ipynb
@@ -116,9 +116,9 @@
     " \n",
     "Here comes the tricky part!\n",
     " \n",
-    "The input to the pipeline will be our dataframe `X`, which one row per identifier.\n",
+    "The input to the pipeline will be our dataframe `X`, with one row per identifier.\n",
     "It is currently empty.\n",
-    "But which time series data should the `RelevantFeatureAugmenter` to actually extract the features from?\n",
+    "But from which time series data should the `RelevantFeatureAugmenter` actually extract the features?\n",
     "\n",
     "We need to pass the time series data (stored in `df_ts`) to the transformer.\n",
     " \n",
@@ -179,7 +179,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "During interference, the augmentor does only extract the relevant features it has found out in the training phase and the classifier predicts the target using these features."
+    "During inference, the augmenter extracts only those features it found to be relevant during the training phase, and the classifier predicts the target using these features."
    ]
   },
   {
@@ -211,7 +211,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "You can also find out, which columns the augmenter has selected"
+    "You can also find out which columns the augmenter has selected"
    ]
   },
   {
@@ -248,11 +248,11 @@
    "metadata": {},
    "source": [
     "In the example above we passed in a single `df_ts` into the `RelevantFeatureAugmenter`, which was used both for training and predicting.\n",
-    "During training, only the data with the `id`s from `X_train` where extracted and during prediction the rest.\n",
+    "During training, only the data with the `id`s from `X_train` were extracted; the rest was extracted during prediction.\n",
     "\n",
     "However, it is perfectly fine to call `set_params` twice: once before training and once before prediction. \n",
     "This can be handy if you for example dump the trained pipeline to disk and re-use it only later for prediction.\n",
-    "You only need to make sure that the `id`s of the enteties you use during training/prediction are actually present in the passed time series data."
+    "You only need to make sure that the `id`s of the entities you use during training/prediction are actually present in the passed time series data."
    ]
   },
   {