{"id":31142,"date":"2025-05-21T05:00:00","date_gmt":"2025-05-21T03:00:00","guid":{"rendered":"https:\/\/sii.pl\/blog\/?p=31142"},"modified":"2025-05-22T15:27:40","modified_gmt":"2025-05-22T13:27:40","slug":"kan-a-revolution-in-neurons-a-new-generation-of-deep-learning-networks","status":"publish","type":"post","link":"https:\/\/sii.pl\/blog\/en\/kan-a-revolution-in-neurons-a-new-generation-of-deep-learning-networks\/","title":{"rendered":"KAN: a revolution in neurons \u2013 a new generation of deep learning networks"},"content":{"rendered":"\n<p>Artificial intelligence has seen rapid advancements, but the core unit of neural networks \u2013 the artificial neuron \u2013 has remained largely unchanged. The newly introduced Kolmogorov-Arnold Network (KAN) challenges this norm by rethinking how activation functions are used and learned. With its innovative structure, KAN offers improved efficiency, interpretability, and performance, potentially reshaping the future of deep learning.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Fundamentals of classical neuron operation<\/strong><\/strong><\/h2>\n\n\n\n<p>The role of AI has rapidly expanded in our daily lives, influencing everything from personal assistants and recommendation systems to more advanced fields like healthcare diagnostics, autonomous vehicles, and natural language processing. Numerous architectures and layer types, such as convolutional layers and embedding layers, have been developed. However, despite these groundbreaking advancements, the fundamental unit that powers these networks \u2013 the artificial neuron \u2013 has remained largely unchanged since its inception.<\/p>\n\n\n\n<p>The core concept of the neuron, which is inspired by the biological neuron&#8217;s in the human brain, is still defined by a relatively simple mathematical equation <strong>that sums the input values, applies weights, adds a bias, and then passes the result through an activation function<\/strong>.<\/p>\n\n\n\n<p>This simplicity is key to the flexibility and power of neural networks, allowing them to be scaled up into deep architectures without changing the foundational building block. 
\n\n\n\n<p>Below is a mathematical representation of an artificial neuron:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"251\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/1-1-1024x251.jpg\" alt=\"A mathematical representation of an artificial neuron\" class=\"wp-image-31144\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/1-1-1024x251.jpg 1024w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/1-1-300x74.jpg 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/1-1-768x188.jpg 768w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/1-1.jpg 1223w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">A mathematical representation of an artificial neuron<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"520\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/ryc.-1-1024x520.jpg\" alt=\"The graphical representation of an artificial neuron with n inputs (source: Wikipedia)\" class=\"wp-image-31115\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/ryc.-1-1024x520.jpg 1024w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/ryc.-1-300x152.jpg 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/ryc.-1-768x390.jpg 768w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/ryc.-1.jpg 1220w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">Fig. 1 The graphical representation of an artificial neuron with n inputs (source: <a href=\"https:\/\/pl.wikipedia.org\/wiki\/Sztuczny_neuron\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >Wikipedia<\/a>)<\/figcaption><\/figure>\n\n\n\n<p>Neural networks rely on activation functions to introduce nonlinearity, allowing them to solve complex problems. While weights and biases are adjusted during training, the activation functions themselves typically remain fixed. According to the Universal Approximation Theorem, even a single hidden layer with enough neurons can approximate any continuous function.<\/p>\n\n\n\n<p>However, complex tasks often require deep neural networks \u2013 multi-layer perceptrons (MLPs) \u2013 which stack multiple layers of neurons to handle challenges like image recognition or language translation.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Kolmogorov-Arnold theorem as inspiration<\/strong><\/strong><\/h2>\n\n\n\n<p>Recently, a new type of algorithm known as the <strong>Kolmogorov-Arnold Network<\/strong> was introduced, marking a significant advancement in neural network architecture. This novel approach shows strong potential to outperform traditional multi-layer perceptrons, offering new opportunities for improved performance and efficiency in solving complex tasks. Its innovative structure could help overcome some of the limitations of current deep learning models, potentially transforming the landscape of neural network research and applications.<\/p>\n\n\n\n<p>This new algorithm is inspired by the Kolmogorov-Arnold Representation Theorem, which states that any continuous function of many variables can be written as a composition of addition and simpler continuous functions, each depending on a single variable. This decomposition makes it easier to tackle complicated problems by focusing on one factor at a time.<\/p>
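\n\n\n\n<p>For reference, the standard statement of the theorem can be written in LaTeX as follows (every continuous function f of n variables on a bounded domain admits such an exact representation):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>f(x_1, \\dots, x_n) = \\sum_{q=0}^{2n} \\Phi_q \\Big( \\sum_{p=1}^{n} \\phi_{q,p}(x_p) \\Big)<\/code><\/pre>\n\n\n\n<p>Here every \u03d5<sub>q,p<\/sub> and \u03a6<sub>q<\/sub> is a continuous function of a single variable \u2013 exactly the property that KAN exploits.<\/p>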
\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>KAN \u2013 a new neural network architecture<\/strong><\/strong><\/h2>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img decoding=\"async\" width=\"963\" height=\"596\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-2.png\" alt=\"KAN \u2013 simplification of the complex relationship between feature and label (source: KAN: Kolmogorov-Arnold Networks)\" class=\"wp-image-31117\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-2.png 963w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-2-300x186.png 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-2-768x475.png 768w\" sizes=\"(max-width: 963px) 100vw, 963px\" \/><figcaption class=\"wp-element-caption\">Fig. 2 KAN \u2013 simplification of the complex relationship between feature and label (source: <a href=\"https:\/\/arxiv.org\/abs\/2404.19756\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >KAN: Kolmogorov-Arnold Networks<\/a>)<\/figcaption><\/figure>\n\n\n\n<p>In this algorithm, this concept is applied by breaking down a complex, nonlinear machine learning problem into smaller, more manageable components. Each custom activation function simplifies the complex relationship between a feature and the label, as shown in Figure 2 above.<\/p>\n\n\n\n<p>By summing the results of all these simplified functions, we obtain the final prediction for the overall problem. Mathematically, this can be expressed as:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"350\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/2-1-1024x350.jpg\" alt=\"the final prediction\" class=\"wp-image-31146\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/2-1-1024x350.jpg 1024w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/2-1-300x102.jpg 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/2-1-768x262.jpg 768w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/2-1.jpg 1204w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">The final prediction<\/figcaption><\/figure>\n\n\n\n<p>In the KAN implementation, the outer function \u03c8 is the identity, \u03c8(x)=x.<\/p>
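\n\n\n\n<p>As a toy sketch of this summation (the two univariate functions below are arbitrary stand-ins for the learned splines, and \u03c8 is the identity as stated above):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import numpy as np\n\n# arbitrary stand-ins for the learned univariate functions, one per feature\nphi_1 = np.sin\nphi_2 = np.square\n\ndef kan_predict(x1, x2):\n    # with psi(x) = x, the prediction is simply the sum of the\n    # per-feature univariate function outputs\n    return phi_1(x1) + phi_2(x2)\n\nprint(kan_predict(0.5, -1.0))<\/code><\/pre>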
\n\n\n\n<p>A more intuitive representation for a neural network would be in matrix form. The matrix below illustrates how a layer with n inputs would look in the Kolmogorov-Arnold Network (KAN) algorithm.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"136\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/3-1024x136.jpg\" alt=\"matrix\" class=\"wp-image-31121\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/3-1024x136.jpg 1024w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/3-300x40.jpg 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/3-768x102.jpg 768w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/3.jpg 1218w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">Matrix<\/figcaption><\/figure>\n\n\n\n<p>Here, an activation function \u03d5 is specified for every input.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Activation functions serve as learning parameters<\/strong><\/strong><\/h2>\n\n\n\n<p>In comparison to a standard MLP, where activation functions are fixed at the nodes and weights are the only learnable parameters on the edges, the Kolmogorov-Arnold Network (KAN) takes a different approach.<\/p>\n\n\n\n<p>In KAN, the activation functions themselves are learnable and are placed on the edges, while the nodes perform a sum operation on the outputs of these activation functions. This shift allows for more flexibility in how the network models complex relationships.<\/p>\n\n\n\n<p>The paper illustrates this concept using the graph shown in Figure 3.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"221\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-3-1024x221.jpg\" alt=\"Fig. 3 KAN \u2013 activation functions and nodes (source: KAN: Kolmogorov-Arnold Networks)\" class=\"wp-image-31125\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-3-1024x221.jpg 1024w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-3-300x65.jpg 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-3-768x165.jpg 768w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-3.jpg 1221w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">Fig. 3 KAN \u2013 activation functions and nodes (source: <a href=\"https:\/\/arxiv.org\/abs\/2404.19756\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >KAN: Kolmogorov-Arnold Networks<\/a>)<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>How to make a custom activation function?<\/strong><\/strong><\/h2>\n\n\n\n<p>Everything about this algorithm sounds logical and intuitive, but how can we estimate an activation function for every input and apply it multiple times in each layer? The answer lies in B-splines. A B-spline (or basis spline) is a flexible curve composed of several connected segments, which allows it to form smooth and complex shapes.<\/p>\n\n\n\n<p>In simple terms, a B-spline is a curve defined by control points, allowing it to be shaped in various ways to approximate different functions. One of its key advantages is that B-splines are differentiable, which makes them well suited to gradient-based training (backpropagation) of a Kolmogorov-Arnold Network (KAN).<\/p>\n\n\n\n<p>A notable feature of B-splines is that adjusting a single control point only affects the local portion of the curve, leaving the rest unchanged. By manipulating these control points to fit the B-spline to the desired shape, we effectively train the network. These control points act as the primary trainable parameters, forming the core idea behind the Kolmogorov-Arnold Network.<\/p>
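\n\n\n\n<p>This local-control property is easy to verify with a short sketch using SciPy\u2019s BSpline (the knot vector and coefficient values below are arbitrary illustrative choices):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import numpy as np\nfrom scipy.interpolate import BSpline\n\nk = 3  # cubic pieces\n# clamped knot vector; len(t) must equal len(c) + k + 1\nt = np.concatenate([np.zeros(3), np.linspace(0, 1, 8), np.ones(3)])\nc = np.array([0.0, 1.0, -1.0, 0.5, 2.0, 0.0, 1.0, -0.5, 0.3, 0.8])\nspl = BSpline(t, c, k)\n\n# nudge a single control coefficient and compare the two curves\nc2 = c.copy()\nc2[2] += 1.0\nspl2 = BSpline(t, c2, k)\n\nx = np.linspace(0, 1, 11)\nprint(np.round(spl2(x) - spl(x), 3))  # non-zero only near the start of the curve<\/code><\/pre>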
\n\n\n\n<p>Figure 4 below shows an example of a curve created by a few control points. More detailed information can be found <a href=\"https:\/\/en.wikipedia.org\/wiki\/B-spline\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >on Wikipedia<\/a>.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"565\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-4-1024x565.jpg\" alt=\"A curve formed by several control points (source: Wikipedia)\" class=\"wp-image-31127\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-4-1024x565.jpg 1024w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-4-300x165.jpg 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-4-768x423.jpg 768w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/Ryc.-4.jpg 1197w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">Fig. 4 A curve formed by several control points (source: <a href=\"https:\/\/en.wikipedia.org\/wiki\/B-spline\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >Wikipedia<\/a>)<\/figcaption><\/figure>\n\n\n\n<p>So, with an understanding of the backbone idea of KAN, our artificial neural network now has a slightly different shape. In an MLP, the trainable parameters are the weights; in KAN, the trainable parameters are the control points of the B-splines.<\/p>\n\n\n\n<p>A single layer of a <strong>multi-layer perceptron (MLP)<\/strong> can be described using vectors and matrices (a code sketch follows the list below):<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"133\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/4-1-1024x133.jpg\" alt=\"\" class=\"wp-image-31129\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/4-1-1024x133.jpg 1024w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/4-1-300x39.jpg 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/4-1-768x100.jpg 768w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/4-1.jpg 1208w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">A <strong>multi-layer perceptron (MLP)<\/strong> can be described using vectors and matrices<\/figcaption><\/figure>\n\n\n\n<ol class=\"wp-block-list\">\n<li><code>Input Vector: The input features form a vector x<\/code><\/li>\n\n\n\n<li><code>Weight Matrix: The weights connecting the inputs to the neurons are represented by a matrix W<\/code><\/li>\n\n\n\n<li><code>Bias Vector: Each neuron has a bias, represented as a vector b<\/code><\/li>\n\n\n\n<li><code>Linear Transformation: The input undergoes a linear transformation z = Wx + b<\/code><\/li>\n\n\n\n<li><code>Activation Function: An activation function is applied to z, producing the output vector a = f(z)<\/code><\/li>\n<\/ol>
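\n\n\n\n<p>The same five steps as a minimal runnable sketch (ReLU is an arbitrary choice for f):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import numpy as np\n\ndef mlp_layer(x, W, b):\n    z = W @ x + b            # 4. linear transformation z = Wx + b\n    return np.maximum(z, 0)  # 5. activation a = f(z), here ReLU\n\nx = np.array([1.0, 2.0])         # 1. input vector\nW = np.array([[0.5, -0.3],\n              [0.1, 0.8],\n              [-0.2, 0.4]])      # 2. weight matrix (3 neurons, 2 inputs)\nb = np.array([0.1, 0.0, -0.1])   # 3. bias vector\nprint(mlp_layer(x, W, b))<\/code><\/pre>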
srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/5-1024x147.jpg 1024w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/5-300x43.jpg 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/5-768x110.jpg 768w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/5.jpg 1199w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">Jedna warstwa dla KAN<\/figcaption><\/figure>\n\n\n\n<p>Where \u03d5_mn are an activation function\u2019s. <a href=\"https:\/\/arxiv.org\/abs\/2404.19756\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >In paper[1]<\/a> the activation function has next definition<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"475\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/6-1-1024x475.jpg\" alt=\"an activation function\u2019s\" class=\"wp-image-31148\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/6-1-1024x475.jpg 1024w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/6-1-300x139.jpg 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/6-1-768x356.jpg 768w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/6-1.jpg 1216w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">An activation function\u2019s<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Comparison of KAN and MLP: performance and interpretability<\/strong><\/strong><\/h2>\n\n\n\n<p>From the formulas above, you can see that MLPs have fewer trainable parameters than KAN, which means KAN require more computational power and time to train.<\/p>\n\n\n\n<p>However, the innovation in this new algorithm is that we can use fewer neurons and layers by optimizing not just the weights in a neuron, but also the activation function. During training, gradient-based optimization is used to adjust the positions of spline control points \u2013 just like weights in traditional networks. The paper describes how a KAN model with fewer layers outperforms standard MLP networks.<\/p>\n\n\n\n<p>In addition to the smaller size and optimized activation functions, KAN offers a major advantage in interpretability. By inspecting the learned B-splines, we can gain insight into how the model is working, unlike MLPs, which operate as a complete black box. Another strong feature of KAN is its support for continual learning \u2013 during fine-tuning, the model retains knowledge from the original task. This is due to the property of B-splines, where adjusting one control point only affects the local area of the curve.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Application and practical potential<\/strong><\/strong><\/h2>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/abs\/2404.19756\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >The original KAN paper [1]<\/a> presents several examples across different tasks where the KAN model achieved higher test accuracy with fewer trainable parameters compared to traditional MLP models. 
\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Comparison of KAN and MLP: performance and interpretability<\/strong><\/strong><\/h2>\n\n\n\n<p>From the formulas above, you can see that, layer for layer, MLPs have fewer trainable parameters than KANs, which means KANs require more computational power and time to train.<\/p>\n\n\n\n<p>However, the innovation in this new algorithm is that we can use fewer neurons and layers by optimizing not just the weights in a neuron, but also the activation function. During training, gradient-based optimization is used to adjust the positions of spline control points \u2013 just like weights in traditional networks. The paper describes how a KAN model with fewer layers outperforms standard MLP networks.<\/p>\n\n\n\n<p>In addition to the smaller size and optimized activation functions, KAN offers a major advantage in interpretability. By inspecting the learned B-splines, we can gain insight into how the model is working, unlike MLPs, which operate as a complete black box. Another strong feature of KAN is its support for continual learning \u2013 during fine-tuning, the model retains knowledge from the original task. This is due to the property of B-splines, where adjusting one control point only affects the local area of the curve.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Application and practical potential<\/strong><\/strong><\/h2>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/abs\/2404.19756\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >The original KAN paper [1]<\/a> presents several examples across different tasks where the KAN model achieved higher test accuracy with fewer trainable parameters compared to traditional MLP models. For example, in the task of signature classification, the KAN model achieved an accuracy of 81.6%, while the traditional MLP model achieved 78% (<a href=\"https:\/\/arxiv.org\/abs\/2404.19756\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >Table 3 from [1]<\/a>).<\/p>\n\n\n\n<p>Additionally, the paper demonstrates that in tasks requiring symbolic regression and the discovery of higher-order polynomial relationships (such as fitting data to complex functions), KAN networks consistently outperformed MLPs by better capturing these relationships without the need for deep architectures or excessive training time.<\/p>\n\n\n\n<p>However, publications generally conclude that while KAN is not universally superior, it can outperform classical models in specific tasks and domains. A more comprehensive comparison was conducted in the <a href=\"https:\/\/arxiv.org\/pdf\/2407.16674\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >KAN or MLP: A Fairer Comparison paper [4]<\/a>, where the KAN model&#8217;s performance was evaluated across machine learning (on eight datasets), computer vision, NLP, and audio processing tasks. The study showed that standard MLP models slightly outperformed KAN models, with accuracy differences ranging from 0.2% (machine learning tasks) up to 8% (computer vision tasks).<\/p>\n\n\n\n<p>Furthermore, several follow-up studies have explored adaptations of KAN in specialized domains. One of them is the paper <a href=\"https:\/\/arxiv.org\/abs\/2408.08803\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >FourierKAN outperforms MLP on Text Classification Head Fine-tuning [5]<\/a>, in which a modified version of KAN called Fourier KAN (FR-KAN) was proposed. In this study, FR-KAN was used as an alternative to MLP-based text classification heads and demonstrated significant improvements, achieving an average accuracy increase of 10% and an F1-score improvement of 11% across seven pre-trained transformer models and four text classification tasks.<\/p>\n\n\n\n<p>Currently, there are no known production-level deployments of KAN, but the model has sparked significant interest in the research community. Numerous studies have proposed modifications and adaptations of KAN, demonstrating its potential to achieve higher performance in specific use cases. 
However, these works also emphasize that <strong>KAN is not a one-size-fits-all solution \u2013 it requires careful tuning and thoughtful application to deliver superior results.<\/strong><\/p>\n\n\n\n<p>So far, KAN has shown the most promise in tasks involving symbolic reasoning, time series analysis, and function approximation.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Conclusions and the future of KAN<\/strong><\/strong><\/h2>\n\n\n\n<p>The Kolmogorov-Arnold Network (KAN) introduces a fresh perspective on long-standing principles in neural networks, particularly the activation function, which has remained mostly unchanged for years. While the basic structure of the artificial neuron remains intact, KAN enables us to finely tune the activation function, allowing the network to better model complex relationships between features and labels. This flexibility could lead to improved performance in capturing intricate data patterns.<\/p>\n\n\n\n<p>However, several important questions arise:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Will this algorithm prove as effective in real-world applications as it has in toy examples or controlled tests outlined in the paper?<\/li>\n\n\n\n<li>Can KAN layers be effectively combined with traditional MLP layers to reduce the number of trainable parameters, and what impact would this have on the model&#8217;s overall accuracy and efficiency?<\/li>\n<\/ul>\n\n\n\n<p>These open questions highlight the potential of KAN, while also pointing to the need for further research and experimentation to fully understand its practical benefits.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Sources<\/strong><\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Liu, Z., Wang, Y., Vaidya, S., Ruehle, F., Halverson, J., Solja\u010di\u0107, M., Hou, T. Y., &amp; Tegmark, M. (2024). <a href=\"https:\/\/arxiv.org\/abs\/2404.19756\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >KAN: Kolmogorov-Arnold Networks<\/a>. arXiv preprint arXiv:2404.19756<\/li>\n\n\n\n<li>Schoenberg, I. J. (1946). Contributions to the problem of approximation of equidistant data by analytic functions. Quarterly of Applied Mathematics, <strong>4<\/strong>, 45\u201399 and 112\u2013141.<\/li>\n\n\n\n<li><a href=\"https:\/\/medium.com\/@mryasinusif\/why-is-the-kan-kolmogorov-arnold-networks-so-promising-8494242a8bdd\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >Why is the (KAN) Kolmogorov-Arnold Networks so promising<\/a><\/li>\n\n\n\n<li>Yu, R., Yu, W., &amp; Wang, X. (2024). <a href=\"https:\/\/arxiv.org\/pdf\/2407.16674\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >KAN or MLP: A Fairer Comparison<\/a>. arXiv preprint arXiv:2407.16674<\/li>\n\n\n\n<li>Imran, A. A., &amp; Ishmam, M. F. (2024). 
<a href=\"https:\/\/arxiv.org\/abs\/2408.08803\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >FourierKAN outperforms MLP on Text Classification Head Fine-tuning<\/a><\/li>\n<\/ol>\n\n\n\n<p>***<\/p>\n\n\n\n<p>If you are interested in the topic of neural networks, be sure to also take a look <a href=\"https:\/\/sii.pl\/blog\/wyszukiwarka\/neuron\/\" target=\"_blank\" rel=\"noopener\" title=\"\">at other articles by our experts<\/a>.<\/p>\n\n\n<div class=\"kk-star-ratings kksr-auto kksr-align-left kksr-valign-bottom\"\n    data-payload='{&quot;align&quot;:&quot;left&quot;,&quot;id&quot;:&quot;31142&quot;,&quot;slug&quot;:&quot;default&quot;,&quot;valign&quot;:&quot;bottom&quot;,&quot;ignore&quot;:&quot;&quot;,&quot;reference&quot;:&quot;auto&quot;,&quot;class&quot;:&quot;&quot;,&quot;count&quot;:&quot;2&quot;,&quot;legendonly&quot;:&quot;&quot;,&quot;readonly&quot;:&quot;&quot;,&quot;score&quot;:&quot;5&quot;,&quot;starsonly&quot;:&quot;&quot;,&quot;best&quot;:&quot;5&quot;,&quot;gap&quot;:&quot;11&quot;,&quot;greet&quot;:&quot;&quot;,&quot;legend&quot;:&quot;5\\\/5 ( votes: 2)&quot;,&quot;size&quot;:&quot;18&quot;,&quot;title&quot;:&quot;KAN: a revolution in neurons \u2013 a new generation of deep learning networks&quot;,&quot;width&quot;:&quot;139.5&quot;,&quot;_legend&quot;:&quot;{score}\\\/{best} ( {votes}: {count})&quot;,&quot;font_factor&quot;:&quot;1.25&quot;}'>\n            \n<div class=\"kksr-stars\">\n    \n<div class=\"kksr-stars-inactive\">\n            <div class=\"kksr-star\" data-star=\"1\" style=\"padding-right: 11px\">\n            \n\n<div class=\"kksr-icon\" style=\"width: 18px; height: 18px;\"><\/div>\n        <\/div>\n            <div class=\"kksr-star\" data-star=\"2\" style=\"padding-right: 11px\">\n            \n\n<div class=\"kksr-icon\" style=\"width: 18px; height: 18px;\"><\/div>\n        <\/div>\n            <div class=\"kksr-star\" data-star=\"3\" style=\"padding-right: 11px\">\n            \n\n<div class=\"kksr-icon\" style=\"width: 18px; height: 18px;\"><\/div>\n        <\/div>\n            <div class=\"kksr-star\" data-star=\"4\" style=\"padding-right: 11px\">\n            \n\n<div class=\"kksr-icon\" style=\"width: 18px; height: 18px;\"><\/div>\n        <\/div>\n            <div class=\"kksr-star\" data-star=\"5\" style=\"padding-right: 11px\">\n            \n\n<div class=\"kksr-icon\" style=\"width: 18px; height: 18px;\"><\/div>\n        <\/div>\n    <\/div>\n    \n<div class=\"kksr-stars-active\" style=\"width: 139.5px;\">\n            <div class=\"kksr-star\" style=\"padding-right: 11px\">\n            \n\n<div class=\"kksr-icon\" style=\"width: 18px; height: 18px;\"><\/div>\n        <\/div>\n            <div class=\"kksr-star\" style=\"padding-right: 11px\">\n            \n\n<div class=\"kksr-icon\" style=\"width: 18px; height: 18px;\"><\/div>\n        <\/div>\n            <div class=\"kksr-star\" style=\"padding-right: 11px\">\n            \n\n<div class=\"kksr-icon\" style=\"width: 18px; height: 18px;\"><\/div>\n        <\/div>\n            <div class=\"kksr-star\" style=\"padding-right: 11px\">\n            \n\n<div class=\"kksr-icon\" style=\"width: 18px; height: 18px;\"><\/div>\n        <\/div>\n            <div class=\"kksr-star\" style=\"padding-right: 11px\">\n            \n\n<div class=\"kksr-icon\" style=\"width: 18px; height: 18px;\"><\/div>\n        <\/div>\n    <\/div>\n<\/div>\n                \n\n<div class=\"kksr-legend\" style=\"font-size: 14.4px;\">\n            5\/5 ( votes: 2)    <\/div>\n    
<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Artificial intelligence has seen rapid advancements, but the core unit of neural networks \u2013 the artificial neuron \u2013 has remained &hellip; <a class=\"continued-btn\" href=\"https:\/\/sii.pl\/blog\/en\/kan-a-revolution-in-neurons-a-new-generation-of-deep-learning-networks\/\">Continued<\/a><\/p>\n","protected":false},"author":714,"featured_media":31140,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_editorskit_title_hidden":false,"_editorskit_reading_time":0,"_editorskit_is_block_options_detached":false,"_editorskit_block_options_position":"{}","inline_featured_image":false,"footnotes":""},"categories":[1320],"tags":[2822,2819,2820,1526,1442],"class_list":["post-31142","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-hard-development","tag-neural-networks","tag-kan-en","tag-da-en","tag-guidebook","tag-ai-en"],"acf":[],"aioseo_notices":[],"republish_history":[],"featured_media_url":"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2025\/05\/AI_2-1.jpg","category_names":["Hard development"],"_links":{"self":[{"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/posts\/31142"}],"collection":[{"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/users\/714"}],"replies":[{"embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/comments?post=31142"}],"version-history":[{"count":3,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/posts\/31142\/revisions"}],"predecessor-version":[{"id":31220,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/posts\/31142\/revisions\/31220"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/media\/31140"}],"wp:attachment":[{"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/media?parent=31142"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/categories?post=31142"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/tags?post=31142"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}