Hi, I am a Deep Learning / Machine Learning expert and Data Scientist, currently working as a deep learning specialist researcher at a mathematics research center, alongside my freelance work.
The site demonstrated above uses NLP (Natural Language Processing) models to achieve the categorization. I have rich experience with NLP models, mainly with Deep Learning (neural networks), but also with other ML models (such as nearest neighbor, SVM, Random Forest, Logistic Regression, etc.).
Basically, I'll need a dataset of texts and/or keywords that are already categorised, and then I can build a model that predicts the categorisation of new samples. The easiest way to obtain such a dataset is probably Wikipedia, which is freely available and whose English article corpus is huge. However, there are also other text-based datasets available that I could use if they fit the goal better.
As many of these (including Wiki articles) already have categories, there is no need for unsupervised learning: we can train the model on the existing mapping (i.e., supervised learning). The weights would be determined by the certainty of the model's predictions.
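To illustrate the supervised idea on a toy scale, here is a rough sketch using a simple nearest-centroid classifier over bag-of-words counts (one of the "other ML models" mentioned above, used here purely for illustration instead of a neural network; the category names and texts are made up):

```python
from collections import Counter
import math

# Toy labelled corpus (made-up examples standing in for categorised articles).
train = [
    ("the striker scored a goal in the final match", "sports"),
    ("the team won the championship game", "sports"),
    ("the neural network was trained on labelled data", "technology"),
    ("the processor executes machine instructions", "technology"),
]

def bow(text):
    """Bag-of-words representation: word -> count."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Build one bag-of-words "centroid" per category from the labelled data.
centroids = {}
for text, label in train:
    centroids.setdefault(label, Counter()).update(bow(text))

def predict(text):
    """Return (best category, similarity score used as the certainty weight)."""
    scores = {label: cosine(bow(text), c) for label, c in centroids.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

label, weight = predict("the player kicked the ball into the goal")
print(label, round(weight, 2))
```

A real model would of course use a proper tokenizer and learned representations rather than raw word counts, but the shape of the pipeline — train on the existing text-to-category mapping, then output a category plus a certainty score per new sample — is the same.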
For Deep Learning models I use TensorFlow/Keras and build TFLite-compatible models, which means that once the model is trained, very few resources are needed to actually run it. It can also be run from a JavaScript environment, for example (no need for Python), so it can easily be integrated into a web server if that's needed. I can also create an executable for any OS.
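For a sense of what the conversion step looks like: once a Keras model is trained, exporting it to TFLite is only a few lines (a sketch — the tiny stand-in model and the layer sizes here are placeholders, not the actual classifier):

```python
import tensorflow as tf

# Stand-in model; in practice this would be the trained text classifier.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(100,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(5, activation="softmax"),  # 5 hypothetical categories
])

# Convert the Keras model to a compact TFLite flatbuffer.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# The resulting bytes can be saved and served to a lightweight runtime.
with open("classifier.tflite", "wb") as f:
    f.write(tflite_model)
```

The saved `.tflite` file can then be loaded by a lightweight interpreter outside of a full Python/TensorFlow installation, which is what keeps the serving footprint small.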