<h1 id="k-nearest-neighbours">K Nearest Neighbours<a aria-hidden="true" class="anchor-heading icon-link" href="#k-nearest-neighbours"></a></h1>
<ul>
<li><a href="/notes/bnjp85xg014r1adgmadmfye">non-parametric</a> classifier</li>
<li>Looks at the K points in the training set that are closest to the input, counts how many members of each class are in this set, and returns that fraction as the estimated probability of each class</li>
</ul>
<div class="math math-display"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><mi>p</mi><mo stretchy="false">(</mo><mi>y</mi><mo>=</mo><mi>c</mi><mi mathvariant="normal">∣</mi><mi>x</mi><mo separator="true">,</mo><mi>D</mi><mo separator="true">,</mo><mi>K</mi><mo stretchy="false">)</mo><mo>=</mo><mfrac><mn>1</mn><mi>K</mi></mfrac><munder><mo>∑</mo><mrow><mi>i</mi><mo>∈</mo><msub><mi>N</mi><mi>K</mi></msub><mo stretchy="false">(</mo><mi>x</mi><mo separator="true">,</mo><mi>D</mi><mo stretchy="false">)</mo></mrow></munder><mrow><mi>I</mi><mo stretchy="false">(</mo><msub><mi>y</mi><mi>i</mi></msub><mo>=</mo><mi>c</mi><mo stretchy="false">)</mo></mrow></mrow><annotation encoding="application/x-tex">p(y=c \mid x, D, K) = \frac{1}{K}\sum_{i \in N_K(x, D)} I(y_i = c)</annotation></semantics></math></div>
Here
<math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>N</mi><mi>K</mi></msub><mo stretchy="false">(</mo><mi>x</mi><mo separator="true">,</mo><mi>D</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">N_K(x, D)</annotation></semantics></math> are the indices of the K nearest points to <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math> in <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>D</mi></mrow><annotation encoding="application/x-tex">D</annotation></semantics></math>.
<math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>I</mi><mo stretchy="false">(</mo><mi>e</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">I(e)</annotation></semantics></math> is the indicator function, defined as follows:
<div class="math math-display"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><mi>I</mi><mo stretchy="false">(</mo><mi>e</mi><mo stretchy="false">)</mo><mo>=</mo><mo stretchy="false">{</mo><mtable rowspacing="0.16em" columnalign="left right" columnspacing="1em"><mtr><mtd><mstyle scriptlevel="0" displaystyle="false"><mrow><mn>1</mn><mtext> if </mtext><mi>e</mi><mtext> is true</mtext></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle scriptlevel="0" displaystyle="false"><mrow><mn>0</mn><mtext> if </mtext><mi>e</mi><mtext> is false</mtext></mrow></mstyle></mtd></mtr></mtable></mrow><annotation encoding="application/x-tex">I(e) = \{\begin{array}{lr} 1 \text{ if } e \text{ is true} \\ 0 \text{ if } e \text{ is false} \end{array}</annotation></semantics></math></div>
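<p>The class-probability estimate above can be sketched in a few lines of Python. This is a minimal illustration and not a tuned implementation: the function name <code>knn_class_probabilities</code>, the toy data, and the choice of Euclidean distance are all assumptions made for the example.</p>

```python
import math
from collections import Counter


def knn_class_probabilities(x, points, labels, k):
    """Estimate p(y = c | x, D, K) as the fraction of the k nearest
    training points whose label is c."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of training points")
    # Indices of the k training points closest to x.
    # Euclidean distance is an assumption; any distance metric could be used.
    nearest = sorted(range(len(points)), key=lambda i: math.dist(x, points[i]))[:k]
    # Counting labels among the neighbours is the sum of indicator terms I(y_i = c).
    counts = Counter(labels[i] for i in nearest)
    return {c: counts[c] / k for c in set(labels)}


# Hypothetical toy dataset: two well-separated clusters.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (6.5, 7.0), (7.0, 6.0)]
labels = ["a", "a", "a", "b", "b", "b"]

probs_k3 = knn_class_probabilities((2.5, 2.0), points, labels, k=3)
probs_k4 = knn_class_probabilities((2.5, 2.0), points, labels, k=4)
print(probs_k3)
print(probs_k4)
```

<p>With <code>k=3</code> all three neighbours of the query come from class <code>a</code>; with <code>k=4</code> one point from class <code>b</code> is pulled in, so the estimate for <code>b</code> becomes 1/4. Sorting every training point costs O(N log N) per query, which is fine for a sketch but is usually replaced by a spatial index on larger datasets.</p>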
<ul>
<li>Generally works quite well when the distance metric is good and there is enough labeled training data.</li>
<li>The main problem is that it does not work well with high-dimensional inputs, because it needs densely sampled data and suffers from the <a href="/notes/28qy8rhhyvu73wzcx7vx3xz">Curse of Dimensionality</a>.</li>
</ul>
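<p>One way to see why high-dimensional inputs hurt is a small calculation, sketched here under the assumption that the data is spread uniformly over a unit hypercube. To capture a fraction <code>f</code> of the data, a cube around the query needs edge length <code>f ** (1 / d)</code> in <code>d</code> dimensions. The helper name <code>edge_length</code> is hypothetical.</p>

```python
def edge_length(fraction, dims):
    """Edge length of a hypercube that contains `fraction` of data spread
    uniformly over the unit hypercube in `dims` dimensions."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if dims < 1:
        raise ValueError("dims must be at least 1")
    return fraction ** (1.0 / dims)


# How far along each axis must we reach to enclose 10% of the data?
for dims in (1, 2, 10, 100):
    print(dims, round(edge_length(0.1, dims), 3))
```

<p>In 10 dimensions the cube already has to span roughly 80% of the range along every axis to enclose 10% of the data, so the "nearest" neighbours are no longer local to the query.</p>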
<hr>
Backlinks
<ul>
<li><a href="/notes/28qy8rhhyvu73wzcx7vx3xz">Curse of Dimensionality</a></li>
<li><a href="/notes/d168udza8utz7wz8oae3u7x">Titanic</a></li>
</ul>
