Hybrid models

Semi-supervised learning

Semi-supervised learning is a hybrid of supervised and unsupervised learning, used when only a few samples are labeled and a large number are unlabeled. It makes efficient use of all the available data, including the unlabeled samples.
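One common way to use unlabeled samples is self-training: a model fitted on the few labeled samples assigns provisional labels to the unlabeled ones it is confident about, then refits. The sketch below is only an illustration under stated assumptions. The nearest-centroid classifier, the one-dimensional toy data, and the `min_margin` confidence rule are all hypothetical choices, not something prescribed by the text.

```python
def centroids(points, labels):
    """Mean of the points in each class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(cents, x):
    """Return (label, margin) for x; assumes at least two classes.

    margin = distance to the runner-up centroid minus distance to the nearest one.
    """
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    best, second = ranked[0], ranked[1]
    return best, abs(x - cents[second]) - abs(x - cents[best])


def self_train(labeled_x, labeled_y, unlabeled_x, min_margin=1.0, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled points and refit."""
    xs, ys = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        cents = centroids(xs, ys)
        confident, remaining = [], []
        for x in pool:
            label, margin = predict(cents, x)
            if margin >= min_margin:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing left that the model is sure about
        for x, label in confident:
            xs.append(x)
            ys.append(label)
        pool = remaining
    return centroids(xs, ys), pool


# Two labeled samples, seven unlabeled ones (toy data)
final_centroids, still_unlabeled = self_train(
    [1.0, 9.0], ["a", "b"], [0.5, 1.5, 2.0, 8.0, 8.5, 9.5, 5.2]
)
```

In this toy run the ambiguous point near the middle stays unlabeled, while the other unlabeled points are absorbed into the two classes and refine their centroids.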

Self-supervised learning

Self-supervised learning problems are unsupervised learning problems in which the data is not labeled. They are reframed as supervised learning problems, with the supervision signal derived from the data itself, so that supervised learning algorithms can be applied to them.

Usually, self-supervised algorithms solve an alternate (pretext) task in which they supervise themselves to solve the problem or generate an output. One commonly cited example is Generative Adversarial Networks (GANs), which are often used to generate synthetic data by training on labeled and/or unlabeled data.
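To make the idea of an alternate task concrete, the sketch below turns an unlabeled sequence into input/target pairs by asking a model to predict each item from the items before it. The function name `make_pretext_pairs` and the `context_size` parameter are hypothetical, and next-item prediction is just one possible pretext task.

```python
def make_pretext_pairs(sequence, context_size=2):
    """Build (context, target) training pairs from an unlabeled sequence.

    The "label" for each pair is simply the next item in the data itself.
    """
    pairs = []
    for i in range(context_size, len(sequence)):
        context = tuple(sequence[i - context_size:i])
        pairs.append((context, sequence[i]))
    return pairs


unlabeled = ["the", "cat", "sat", "on", "the", "mat"]
pairs = make_pretext_pairs(unlabeled)
for context, target in pairs:
    print(context, "->", target)
```

Any supervised learning algorithm could then be trained on these pairs, even though no human ever labeled the data.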

Multi-instance learning

Multi-instance learning is a supervised learning problem in which labels are assigned not to individual data samples but to groups (bags) of samples. In typical supervised learning, every sample is labeled; for example, each news article is tagged as politics, science, or sports. In multi-instance learning, only the group as a whole carries a label, and supervised learning algorithms make predictions from these collectively labeled groups.
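A minimal sketch of learning from collectively labeled groups follows. It assumes the widely used convention that a group (bag) is positive if at least one of its instances is positive, so each bag is summarized by its largest instance value. That assumption, the one-dimensional toy data, and the function names are illustrative, and the threshold rule only works when the training bags are separable.

```python
def bag_feature(bag):
    """Summarize a bag by its largest instance value (assumed convention)."""
    return max(bag)


def fit_threshold(bags, labels):
    """Place a threshold midway between positive and negative bag features."""
    positives = [bag_feature(b) for b, y in zip(bags, labels) if y]
    negatives = [bag_feature(b) for b, y in zip(bags, labels) if not y]
    return (min(positives) + max(negatives)) / 2


def predict_bag(bag, threshold):
    """A bag is predicted positive if any instance exceeds the threshold."""
    return bag_feature(bag) > threshold


# Labels exist only per bag, never per instance (toy data)
train_bags = [[0.1, 0.2, 0.9], [0.2, 0.3], [0.8, 0.1], [0.4, 0.1, 0.3]]
train_labels = [True, False, True, False]

threshold = fit_threshold(train_bags, train_labels)
print(predict_bag([0.05, 0.7], threshold))
print(predict_bag([0.5, 0.2], threshold))
```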

Multitask learning

Multitask learning is a form of supervised learning in which a single model is trained on one dataset and then used to solve multiple tasks or problems.
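One way to picture a single model serving several tasks is a shared representation feeding one output head per task, trained on the sum of the task losses. The sketch below assumes exactly that design with a deliberately tiny model (one shared weight `w` and two head weights `a` and `b`). The toy targets and hyperparameters are hypothetical.

```python
def train_multitask(xs, y1s, y2s, lr=0.1, steps=200):
    """Gradient descent on the summed squared error of two tasks."""
    w, a, b = 1.0, 0.0, 0.0  # shared weight, task-1 head, task-2 head
    n = len(xs)
    history = []
    for _ in range(steps):
        loss = grad_w = grad_a = grad_b = 0.0
        for x, y1, y2 in zip(xs, y1s, y2s):
            h = w * x            # shared representation used by both heads
            r1 = a * h - y1      # task-1 residual
            r2 = b * h - y2      # task-2 residual
            loss += (r1 * r1 + r2 * r2) / n
            grad_a += 2 * r1 * h / n
            grad_b += 2 * r2 * h / n
            # The shared weight receives gradient from both tasks
            grad_w += 2 * (r1 * a + r2 * b) * x / n
        history.append(loss)
        w -= lr * grad_w
        a -= lr * grad_a
        b -= lr * grad_b
    return (w, a, b), history


xs = [-1.0, 0.0, 1.0]
task1_targets = [2.0 * x for x in xs]   # toy task 1
task2_targets = [-1.0 * x for x in xs]  # toy task 2

(w, a, b), history = train_multitask(xs, task1_targets, task2_targets)
print(history[0], history[-1])
```

The point of the design is that the shared part is updated by every task, so what it learns for one task is available to the others.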

In clinical settings, for example, ML models are often based on single-task learning, where they predict only one specific adverse event, such as organ dysfunction or a life support intervention. It would be far more beneficial to train multitask models that consider multiple competing risks and the interdependencies between organ systems for outcome prediction in realistic settings, as was done in the study “Multitask prediction of organ dysfunction in the intensive care unit using sequential subnetwork routing”.

Reinforcement learning

Reinforcement learning is a type of learning in which an agent, such as a robot system, learns to operate in a defined environment to perform sequential decision-making tasks or achieve a predefined goal. The agent learns from continuously evaluated feedback and rewards from the environment, both of which shape its behavior.
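The agent, environment, and reward loop can be sketched with tabular Q-learning, one of many reinforcement learning algorithms. Everything concrete here is an assumption made for illustration: a five-state corridor with the goal at the right end, a reward of 1.0 for reaching it, and the particular hyperparameter values.

```python
import random

N_STATES = 5      # states 0..4; state 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Environment: deterministic move, reward 1.0 only on reaching the goal."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore at random sometimes, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(100):  # cap on episode length
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate toward reward + discounted best future value
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal states
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

After training, the greedy policy moves right in every non-terminal state, which is the shortest route to the reward.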

An example is Google DeepMind’s AlphaGo, which defeated the world’s leading human Go players. Its successor, AlphaGo Zero, surpassed all earlier versions after 40 days of self-play training driven by feedback and rewards.

Ensemble learning

Ensemble learning involves two or more models trained on the same data or on resampled subsets of it. Each model makes its own prediction, and the outputs are combined, typically by averaging or majority voting, to determine the final prediction. An example of this is the random forest algorithm, which combines many decision trees.
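The combining step can be sketched in a few lines. The lambdas below are hypothetical stand-ins for already trained models; in practice they would be fitted classifiers or regressors.

```python
from collections import Counter


def majority_vote(classifiers, x):
    """Classification: each model votes and the most common label wins."""
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


def average_prediction(regressors, x):
    """Regression: the ensemble output is the mean of the individual outputs."""
    outputs = [reg(x) for reg in regressors]
    return sum(outputs) / len(outputs)


# Hypothetical stand-ins for trained models
classifiers = [lambda x: int(x > 2), lambda x: int(x > 4), lambda x: int(x > 6)]
regressors = [lambda x: 2 * x, lambda x: 2 * x + 1, lambda x: 2 * x - 4]

label = majority_vote(classifiers, 5)        # votes are 1, 1, 0
estimate = average_prediction(regressors, 3)  # outputs are 6, 7, 2
print(label, estimate)
```

Using an odd number of binary classifiers avoids tied votes in this simple scheme.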

Transfer learning

In transfer learning, a model trained to perform one task is used as the starting point for training or fine-tuning a model that performs another task.

This type of learning is popular in deep learning, where pre-trained models are fine-tuned to solve computer vision or natural language processing problems. Starting from a pre-trained model gives a huge head start because the model does not need to be trained from scratch, which saves large amounts of training time and data.
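The head start can be illustrated with a deliberately small stand-in for a deep network: a linear model fitted by gradient descent. It is pre-trained on a source task, then given the same small training budget on a related target task as a model started from zero. The two toy tasks, the step counts, and the learning rate are assumptions chosen for illustration.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the line y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(w, b, xs, ys, steps, lr=0.05):
    """Plain gradient descent starting from the given weights."""
    n = len(xs)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
source_ys = [2.0 * x + 1.0 for x in xs]  # source task (toy)
target_ys = [2.1 * x + 1.2 for x in xs]  # related target task (toy)

# Pre-train on the source task with a generous budget
pretrained = train(0.0, 0.0, xs, source_ys, steps=2000)

# Same small budget on the target task, two different starting points
fine_tuned = train(*pretrained, xs, target_ys, steps=5)
from_scratch = train(0.0, 0.0, xs, target_ys, steps=5)

loss_fine_tuned = mse(*fine_tuned, xs, target_ys)
loss_from_scratch = mse(*from_scratch, xs, target_ys)
print(loss_fine_tuned, loss_from_scratch)
```

With the same five update steps, the fine-tuned model ends with a lower target-task error than the one trained from scratch, because it starts close to a good solution.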

Federated learning

Federated learning is learning in a collaborative way (synergy between cloud and edge). The training process is distributed across multiple devices, each storing only a local sample of the data. To maintain data privacy and security, data is neither exchanged between devices nor transferred to the cloud. Instead of sharing data, devices share their locally trained models, which are combined to train a global model.
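A minimal single-process simulation of this idea is sketched below, in the style of federated averaging: each client improves the current global model on its own data, and the server combines only the returned weights, weighted by each client's sample count. The one-parameter model `y = w * x`, the toy client data, and the hyperparameters are assumptions made for illustration.

```python
def local_update(w, xs, ys, lr=0.05, steps=5):
    """Runs on a client: a few gradient steps on local data only."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


def federated_average(client_data, rounds=20, w=0.0):
    """Runs on the server: combines client weights, never their raw data.

    In this simulation the loop touches client_data directly, but only the
    weight returned by local_update would cross the network in practice.
    """
    total = sum(len(xs) for xs, _ in client_data)
    for _ in range(rounds):
        updates = [(local_update(w, xs, ys), len(xs)) for xs, ys in client_data]
        # Weighted average: clients with more samples count for more
        w = sum(w_k * n_k for w_k, n_k in updates) / total
    return w


# Three clients, each holding private samples of y = 3 * x (toy data)
clients = [
    ([1.0, 2.0], [3.0, 6.0]),
    ([3.0], [9.0]),
    ([1.0, 2.0, 3.0], [3.0, 6.0, 9.0]),
]

global_w = federated_average(clients)
print(global_w)
```

The global weight approaches 3, the value consistent with every client's data, even though no client ever shares its samples.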

Federated learning is likely to remain an active research topic. Studies on it can expand as the need grows for new learning processes and architectures in the machine learning domain.

A study, “The future of digital health with federated learning”, claims that federated learning can help solve challenges in data privacy and data governance by enabling machine learning models to be trained on non-co-located data.

Note that even though only models, not raw data, are communicated to a central server, the models could be reverse engineered to identify client data. There may also be situations where one member of a federation attacks the others by inserting hidden backdoors into the joint global model.

On top of that, federated learning may require frequent communication between nodes, which means that high bandwidth and sufficient storage capacity are needed.