
Commit

Update documentation
arturtoshev committed Jan 8, 2024
1 parent d2b3e9e commit 8a82a81
Showing 8 changed files with 335 additions and 9 deletions.
302 changes: 302 additions & 0 deletions _images/kernel_trick_idea.svg
2 changes: 1 addition & 1 deletion _sources/lecture/gradients.md
@@ -23,7 +23,7 @@ $$
* ...
2. 2000s: Rise of Python begins
3. 2015: Autograd for the automatic differentiation of Python & NumPy is released
4. 2016/2017: PyTorch & Tensorflow are introduced with automatic differentiation at their core
4. 2016/2017: PyTorch & Tensorflow/JAX are introduced with automatic differentiation at their core. See [this Tweet](https://twitter.com/soumithchintala/status/1736555740448362890) for the history of PyTorch and its connection to JAX.
5. 2018: JAX is introduced with its very thin Python layer on top of Tensorflow's compilation stack, where it performs automatic differentiation on the highest representation level
6. 2020-2022: Forward-mode estimators to replace the costly and difficult-to-implement backpropagation are being introduced

19 changes: 17 additions & 2 deletions _sources/lecture/svm.md
@@ -592,7 +592,7 @@ In general, $x$ and $\varphi$ are vectors where $\varphi$ has the entire $x$ as
$$h(x)= g(\omega^T \varphi(x)+b).$$ (classifier_with_feature_map)
**Example**
**Example XNOR**
The following classification problem is non-linear: no linear decision boundary can separate the two classes.
@@ -616,7 +616,22 @@ name: xnor_example_embedded
XNOR after feature mapping.
```
But of course, this is constructed, as here we could immediately guess $\varphi(x_1,x_2)$. In general, this is not possible.
**Example Circular Region**
We are given a set of points $x \in \mathbb{R}^{2}$ with two possible labels: purple ($-1$) and orange ($1$), as shown in the left panel of the figure below. The task is to find a feature map such that a linear classifier can perfectly separate the two sets.
```{figure} ../imgs/kernel_trick_idea.svg
---
width: 600px
align: center
name: kernel_trick_idea
---
Binary classification of circular region (Source: [Wikipedia](https://en.wikipedia.org/wiki/Kernel_method)).
```
Here, it is again easy to see that if we embed the inputs in a 3D space by appending the squared radius, i.e., $\varphi((x_1, x_2)) = (x_1, x_2, x_1^2+x_2^2)$, we can draw a hyperplane separating the two subsets.
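As an illustration of this lifting (a minimal sketch assuming NumPy and scikit-learn, with synthetic data rather than the figure's dataset, and not part of the lecture code), we can apply $\varphi$ and fit a linear SVM in the lifted space:

```python
# Minimal sketch (assumes NumPy and scikit-learn; hypothetical data, not the
# figure's dataset): lift 2D points to 3D and separate them with a linear SVM.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Synthetic circular-region data: label 1 inside the unit circle, -1 outside.
X = rng.uniform(-2.0, 2.0, size=(400, 2))
y = np.where(X[:, 0] ** 2 + X[:, 1] ** 2 < 1.0, 1, -1)

def phi(X):
    """Feature map (x1, x2) -> (x1, x2, x1^2 + x2^2)."""
    return np.column_stack([X, (X ** 2).sum(axis=1)])

# In the lifted 3D space the two classes become linearly separable.
clf = LinearSVC(C=10.0).fit(phi(X), y)
print("training accuracy:", clf.score(phi(X), y))  # close to 1.0
```

In the lifted space, a separating hyperplane of the form $x_1^2 + x_2^2 = r^2$ maps back to a circle in the original coordinates.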
But of course, these examples are constructed, as here we could immediately guess $\varphi(x_1,x_2)$. In general, this is not possible.
> Recall: the dual problem of the SVM involves only scalar products $x^{(i)\top}x^{(j)}$ of the feature vectors.
$\Rightarrow$ This motivates the general formulation of the dual problem with feature maps.
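Because only such scalar products appear, one can evaluate $\varphi(x^{(i)})^{\top}\varphi(x^{(j)})$ without ever forming $\varphi$ explicitly. As a small numerical sanity check (an illustrative sketch, not part of the lecture materials), the quadratic kernel $k(x,z) = (x^{\top}z)^2$ reproduces the scalar product of the explicit quadratic monomial features:

```python
# Illustrative check (assumes NumPy): the quadratic kernel (x^T z)^2 equals the
# scalar product of the explicit feature map phi(x) = (x_i * x_j for all i, j).
import numpy as np

def phi_quad(x):
    """Explicit quadratic monomial feature map: all products x_i * x_j."""
    return np.outer(x, x).ravel()

def k_quad(x, z):
    """Quadratic kernel, evaluated without forming phi explicitly."""
    return float(x @ z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

print(phi_quad(x) @ phi_quad(z))  # 1.0
print(k_quad(x, z))               # 1.0
```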
4 changes: 2 additions & 2 deletions exercise/bayes.html
@@ -664,8 +664,8 @@ <h3><span class="section-number">2.1.4. </span>Bayesian Linear Regression Model<
<p>The model below essentially makes the following prior assumptions:</p>
<div class="math notranslate nohighlight">
\[y \approx h(x) = wx + b + \epsilon, \quad \text{with:}\]</div>
<div class="amsmath math notranslate nohighlight" id="equation-fcf9e785-ead1-4ac0-ab6d-ff7f91f07108">
<span class="eqno">(2.28)<a class="headerlink" href="#equation-fcf9e785-ead1-4ac0-ab6d-ff7f91f07108" title="Permalink to this equation">#</a></span>\[\begin{align}
<div class="amsmath math notranslate nohighlight" id="equation-e621f765-b16d-4d57-9fc1-839550ac21c7">
<span class="eqno">(2.28)<a class="headerlink" href="#equation-e621f765-b16d-4d57-9fc1-839550ac21c7" title="Permalink to this equation">#</a></span>\[\begin{align}
y_i &amp;\sim \mathcal{N}(\mu, \sigma^2)\\
\mu &amp;= w \cdot x_i + b\\
w &amp;\sim \mathcal{N}(0,1^2)\\
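To make the prior assumptions above concrete, here is a minimal prior-predictive sketch (illustrative only, not part of the exercise notebook; the hunk is truncated before the priors on $b$ and $\sigma$, so the choices below for those are placeholders):

```python
# Prior-predictive sketch for the Bayesian linear regression model above.
# The w prior matches the hunk (w ~ N(0, 1^2)); the b and sigma priors are
# placeholders, since the hunk is truncated before they are shown.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 50)

def sample_prior_predictive(n_draws=5):
    """Draw lines y = w*x + b + eps under the (partly assumed) priors."""
    ys = []
    for _ in range(n_draws):
        w = rng.normal(0.0, 1.0)            # w ~ N(0, 1^2), as in the model
        b = rng.normal(0.0, 1.0)            # placeholder prior for b
        sigma = abs(rng.normal(0.0, 1.0))   # placeholder (half-normal) prior for sigma
        y = rng.normal(w * x + b, sigma)    # y_i ~ N(w*x_i + b, sigma^2)
        ys.append(y)
    return np.stack(ys)

print(sample_prior_predictive().shape)  # (5, 50)
```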
2 changes: 1 addition & 1 deletion lecture/gradients.html
@@ -453,7 +453,7 @@ <h2><span class="section-number">8.1. </span>A Brief Incomplete History<a class=
</li>
<li><p>2000s: Rise of Python begins</p></li>
<li><p>2015: Autograd for the automatic differentiation of Python &amp; NumPy is released</p></li>
<li><p>2016/2017: PyTorch &amp; Tensorflow are introduced with automatic differentiation at their core</p></li>
<li><p>2016/2017: PyTorch &amp; Tensorflow/JAX are introduced with automatic differentiation at their core. See <a class="reference external" href="https://twitter.com/soumithchintala/status/1736555740448362890">this Tweet</a> for the history of PyTorch and its connection to JAX.</p></li>
<li><p>2018: JAX is introduced with its very thin Python layer on top of Tensorflow’s compilation stack, where it performs automatic differentiation on the highest representation level</p></li>
<li><p>2020-2022: Forward-mode estimators to replace the costly and difficult-to-implement backpropagation are being introduced</p></li>
</ol>

0 comments on commit 8a82a81
