This course introduces Reinforcement Learning, a subfield of Machine Learning. It covers Markov Decision Processes, bandit algorithms, dynamic programming, and temporal-difference (TD) methods. You will learn about value functions, the Bellman equation, and value iteration, as well as policy gradient techniques. You will also gain experience making decisions under uncertainty.