Reinforcement learning differs from supervised and unsupervised learning, the two most common forms of machine learning. Supervised learning trains on labelled datasets to produce a model that generalizes to new examples of the same kind (for example, predicting the price of a new house from the known prices of other houses in the same location). Unsupervised learning finds structure in unlabelled data, typically by clustering it (think of a collection of unlabelled images described by features such as colour and size, which the program groups into clusters that might turn out to correspond to fruits and animals).
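
To make the contrast concrete, here is a minimal sketch in plain Python. The numbers, the `fit_line` helper, and the `two_means` helper are all made up for illustration: the first part fits a line to labelled (size, price) pairs and predicts the price of an unseen house, and the second part groups unlabelled values into two clusters without ever being told what the groups mean.

```python
# Supervised: labelled examples (size in square metres -> price in thousands).
# The data is invented for illustration only.
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90.0 + intercept  # estimate for an unseen 90 m^2 house

# Unsupervised: unlabelled values (say, one image feature), also invented.
features = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]

def two_means(values, iterations=10):
    """Split unlabelled 1-D values into two clusters (k-means with k = 2)."""
    centers = [min(values), max(values)]  # simple starting guess
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups

groups = two_means(features)
print(predicted_price, groups)
```

The supervised part needs the prices (the labels) to learn anything; the clustering part only sees the raw values and decides for itself which ones belong together.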

Reinforcement learning takes a different approach: rather than recognizing patterns in a fixed dataset, it relies on experience-driven sequential decision-making. An agent interacts with an environment, receives rewards for the actions it takes, and learns to choose actions that move it towards a goal. Many game-playing programs use reinforcement learning to determine the moves the computer should make to win. Not all the rules of the game need to be specified in advance; the algorithm learns by playing the game repeatedly and exploring the available options.
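
The interaction loop described above can be sketched with tabular Q-learning, one common reinforcement learning algorithm (chosen here purely for illustration; the text does not name a specific method). The environment is a hypothetical five-cell corridor in which the agent starts at the left end and is rewarded only on reaching the right end. The names `step` and `train` and the values of `alpha`, `gamma`, and `epsilon` are assumptions made for this example.

```python
import random

N_STATES = 5          # positions 0..4 in a corridor; position 4 is the goal
ACTIONS = [-1, +1]    # step left or step right

def step(state, action):
    """Environment: move, clip to the corridor, reward 1.0 only at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values purely from repeated play; no rules are hard-coded."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)  # explore, or break a tie at random
            else:
                a = 0 if q[state][0] > q[state][1] else 1  # exploit best known action
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: nudge the estimate towards reward + discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy for the non-terminal cells 0..3.
policy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
print(policy)
```

The agent is never told that moving right is the way to win. It discovers this from the rewards alone, and with these settings the learned greedy policy should be to move right in every cell.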

The use of reinforcement learning is currently very limited (AlphaGo and some robots use it), but many industries are exploring its uses and will continue experimenting with it in 2019.

Some of the industrial use cases that are under consideration are:

Higher education: using reinforcement learning for personalized teaching and learning systems.

Healthcare: determining treatment policies for chronic illnesses such as schizophrenia and diabetes.

Finance: although many models are still under study, one live use case is JPMorgan Chase's RL program, “LOXM”, which executes trades in the stock market. It is reported to have increased the speed at which client orders are executed at the best possible price.

