Prashanth L. A.
Person information
- affiliation: University of Maryland
- affiliation: INRIA Lille - Nord Europe
- affiliation: Indian Institute of Science, Department of Computer Science and Automation
2020 – today
- 2024
- [j17] Akash Mondal, Prashanth L. A., Shalabh Bhatnagar: Truncated Cauchy random perturbations for smoothed functional-based stochastic optimization. Autom. 162: 111528 (2024)
- [c28] Mizhaan Prajit Maniyar, Prashanth L. A., Akash Mondal, Shalabh Bhatnagar: A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning. AISTATS 2024: 4708-4716
- [c27] Shubhada Agrawal, Prashanth L. A., Siva Theja Maguluri: Policy Evaluation for Variance in Average Reward Reinforcement Learning. ICML 2024
- [c26] Gugan Thoppe, Prashanth L. A., Sanjay P. Bhat: Risk Estimation in a Markov Cost Process: Lower and Upper Bounds. ICML 2024
- [i36] Ayon Ghosh, Prashanth L. A., Krishna P. Jagannathan: Concentration Bounds for Optimized Certainty Equivalent Risk Estimation. CoRR abs/2405.20933 (2024)
- [i35] Tejaram Sangadi, Prashanth L. A., Krishna P. Jagannathan: Finite Time Analysis of Temporal Difference Learning for Mean-Variance in a Discounted MDP. CoRR abs/2406.07892 (2024)
- 2023
- [j16] Nirav Bhavsar, Prashanth L. A.: Nonasymptotic Bounds for Stochastic Optimization With Biased Noisy Gradient Oracles. IEEE Trans. Autom. Control. 68(3): 1628-1641 (2023)
- [c25] Gandharv Patil, Prashanth L. A., Dheeraj Nagaraj, Doina Precup: Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation. AISTATS 2023: 5438-5448
- [c24] Shalabh Bhatnagar, Prashanth L. A.: Generalized Simultaneous Perturbation Stochastic Approximation with Reduced Estimator Bias. CISS 2023: 1-6
- [c23] Nithia Vijayan, Prashanth L. A.: A policy gradient approach for optimization of smooth risk measures. UAI 2023: 2168-2178
- [i34] Mizhaan Prajit Maniyar, Akash Mondal, Prashanth L. A., Shalabh Bhatnagar: A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning. CoRR abs/2304.10951 (2023)
- [i33] Sanjay Bhat, Prashanth L. A., Gugan Thoppe: VaR and CVaR Estimation in a Markov Cost Process: Lower and Upper Bounds. CoRR abs/2310.11389 (2023)
- [i32] Sumedh Gupte, Prashanth L. A., Sanjay P. Bhat: Optimization of utility-based shortfall risk: A non-asymptotic viewpoint. CoRR abs/2310.18743 (2023)
- 2022
- [j15] Prashanth L. A., Michael C. Fu: Risk-Sensitive Reinforcement Learning via Policy Gradient Search. Found. Trends Mach. Learn. 15(5): 537-693 (2022)
- [j14] Prashanth L. A., Sanjay P. Bhat: A Wasserstein Distance Approach for Concentration of Empirical Risk Estimates. J. Mach. Learn. Res. 23: 238:1-238:61 (2022)
- [c22] Vincent Y. F. Tan, Prashanth L. A., Krishna P. Jagannathan: A Survey of Risk-Aware Multi-Armed Bandits. IJCAI 2022: 5623-5629
- [i31] Nithia Vijayan, Prashanth L. A.: Approximate gradient ascent methods for distortion risk measures. CoRR abs/2202.11046 (2022)
- [i30] Dipayan Sen, Prashanth L. A., Aditya Gopalan: Adaptive Estimation of Random Vectors with Bandit Feedback. CoRR abs/2203.16810 (2022)
- [i29] Vincent Y. F. Tan, Prashanth L. A., Krishna P. Jagannathan: A Survey of Risk-Aware Multi-Armed Bandits. CoRR abs/2205.05843 (2022)
- [i28] Akash Mondal, Prashanth L. A., Shalabh Bhatnagar: A Gradient Smoothed Functional Algorithm with Truncated Cauchy Random Perturbations for Stochastic Optimization. CoRR abs/2208.00290 (2022)
- [i27] Gandharv Patil, Prashanth L. A., Dheeraj Nagaraj, Doina Precup: Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation. CoRR abs/2210.05918 (2022)
- [i26] Shalabh Bhatnagar, Prashanth L. A.: Generalized Simultaneous Perturbation Stochastic Approximation with Reduced Estimator Bias. CoRR abs/2212.10477 (2022)
- 2021
- [j13] Prashanth L. A., Nathaniel Korda, Rémi Munos: Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling. Mach. Learn. 110(3): 559-618 (2021)
- [j12] Nithia Vijayan, Prashanth L. A.: Smoothed functional-based gradient algorithms for off-policy reinforcement learning: A non-asymptotic viewpoint. Syst. Control. Lett. 155: 104988 (2021)
- [c21] Ajay Kumar Pandey, Prashanth L. A., Sanjay P. Bhat: Estimation of Spectral Risk Measures. AAAI 2021: 12166-12173
- [i25] Nithia Vijayan, Prashanth L. A.: Smoothed functional-based gradient algorithms for off-policy reinforcement learning. CoRR abs/2101.02137 (2021)
- [i24] Nithia Vijayan, Prashanth L. A.: Likelihood ratio-based policy gradient methods for distorted risk measures: A non-asymptotic analysis. CoRR abs/2107.04422 (2021)
- [i23] Arvind S. Menon, Prashanth L. A., Krishna P. Jagannathan: Online Estimation and Optimization of Utility-Based Shortfall Risk. CoRR abs/2111.08805 (2021)
- 2020
- [j11] Prashanth L. A., Shalabh Bhatnagar, Nirav Bhavsar, Michael C. Fu, Steven I. Marcus: Random Directions Stochastic Approximation With Deterministic Perturbations. IEEE Trans. Autom. Control. 65(6): 2450-2465 (2020)
- [c20] Prashanth L. A., Krishna P. Jagannathan, Ravi Kumar Kolla: Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions. ICML 2020: 5577-5586
- [i22] Nirav Bhavsar, Prashanth L. A.: Non-Asymptotic Bounds for Zeroth-Order Stochastic Optimization. CoRR abs/2002.11440 (2020)
2010 – 2019
- 2019
- [j10] Ravi Kumar Kolla, Prashanth L. A., Sanjay P. Bhat, Krishna P. Jagannathan: Concentration bounds for empirical conditional value-at-risk: The unbounded case. Oper. Res. Lett. 47(1): 16-20 (2019)
- [c19] Vinay Praneeth Boda, Prashanth L. A.: Correlated bandits or: How to minimize mean-squared error online. ICML 2019: 686-694
- [c18] Sanjay P. Bhat, Prashanth L. A.: Concentration of risk measures: A Wasserstein distance approach. NeurIPS 2019: 11739-11748
- [i21] Ravi Kumar Kolla, Prashanth L. A., Krishna P. Jagannathan: Risk-aware Multi-armed Bandits Using Conditional Value-at-Risk. CoRR abs/1901.00997 (2019)
- [i20] Vinay Praneeth Boda, Prashanth L. A.: Correlated bandits or: How to minimize mean-squared error online. CoRR abs/1902.02953 (2019)
- [i19] Sanjay P. Bhat, Prashanth L. A.: Improved Concentration Bounds for Conditional Value-at-Risk and Cumulative Prospect Theory using Wasserstein distance. CoRR abs/1902.10709 (2019)
- [i18] Ajay Kumar Pandey, Prashanth L. A., Sanjay P. Bhat: Estimation of Spectral Risk Measures. CoRR abs/1912.10398 (2019)
- 2018
- [j9] Cheng Jie, Prashanth L. A., Michael C. Fu, Steven I. Marcus, Csaba Szepesvári: Stochastic Optimization in a Cumulative Prospect Theory Framework. IEEE Trans. Autom. Control. 63(9): 2867-2882 (2018)
- [i17] Ravi Kumar Kolla, Prashanth L. A., Sanjay P. Bhat, Krishna P. Jagannathan: Concentration bounds for empirical conditional value-at-risk: The unbounded case. CoRR abs/1808.01739 (2018)
- [i16] Prashanth L. A., Shalabh Bhatnagar, Nirav Bhavsar, Michael C. Fu, Steven I. Marcus: Random directions stochastic approximation with deterministic perturbations. CoRR abs/1808.02871 (2018)
- [i15] Prashanth L. A., Michael C. Fu: Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint. CoRR abs/1810.09126 (2018)
- 2017
- [j8] Prashanth L. A., Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus: Adaptive System Optimization Using Random Directions Stochastic Approximation. IEEE Trans. Autom. Control. 62(5): 2223-2238 (2017)
- [c17] Aditya Gopalan, Prashanth L. A., Michael C. Fu, Steven I. Marcus: Weighted Bandits or: How Bandits Learn Distorted Values That Are Not Expected. AAAI 2017: 1941-1947
- 2016
- [j7] Prashanth L. A., Mohammad Ghavamzadeh: Variance-constrained actor-critic algorithms for discounted and average reward MDPs. Mach. Learn. 105(3): 367-417 (2016)
- [j6] Prashanth L. A., H. L. Prasad, Shalabh Bhatnagar, Prakash Chandra: A constrained optimization perspective on actor-critic algorithms and application to network routing. Syst. Control. Lett. 92: 46-51 (2016)
- [c16] Xiaowei Hu, Prashanth L. A., András György, Csaba Szepesvári: (Bandit) Convex Optimization with Biased Noisy Gradient Oracles. AISTATS 2016: 819-828
- [c15] Sai Koti Reddy Danda, Prashanth L. A., Shalabh Bhatnagar: Improved Hessian estimation for adaptive random directions stochastic approximation. CDC 2016: 3682-3687
- [c14] Prashanth L. A., Cheng Jie, Michael C. Fu, Steven I. Marcus, Csaba Szepesvári: Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control. ICML 2016: 1406-1415
- [i14] Xiaowei Hu, Prashanth L. A., András György, Csaba Szepesvári: (Bandit) Convex Optimization with Biased Noisy Gradient Oracles. CoRR abs/1609.07087 (2016)
- [i13] Aditya Gopalan, Prashanth L. A., Michael C. Fu, Steven I. Marcus: Weighted bandits or: How bandits learn distorted values that are not expected. CoRR abs/1611.10283 (2016)
- 2015
- [j5] Shalabh Bhatnagar, Prashanth L. A.: Simultaneous Perturbation Newton Algorithms for Simulation Optimization. J. Optim. Theory Appl. 164(2): 621-643 (2015)
- [j4] Prashanth L. A., H. L. Prasad, Nirmit Desai, Shalabh Bhatnagar, Gargi Dasgupta: Simultaneous perturbation methods for adaptive labor staffing in service systems. Simul. 91(5): 432-455 (2015)
- [c13] Nathaniel Korda, Prashanth L. A., Rémi Munos: Fast Gradient Descent for Drifting Least Squares Regression, with Application to Bandits. AAAI 2015: 2708-2714
- [c12] H. L. Prasad, Prashanth L. A., Shalabh Bhatnagar: Two-Timescale Algorithms for Learning Nash Equilibria in General-Sum Stochastic Games. AAMAS 2015: 1371-1379
- [c11] Nathaniel Korda, Prashanth L. A.: On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence. ICML 2015: 626-634
- [i12] Prashanth L. A., Shalabh Bhatnagar: Adaptive system optimization using (simultaneous) random directions stochastic approximation. CoRR abs/1502.05577 (2015)
- [i11] Prashanth L. A., Cheng Jie, Michael C. Fu, Steven I. Marcus: Cumulative Prospect Theory Meets Reinforcement Learning: Estimation and Control. CoRR abs/1506.02632 (2015)
- [i10] Prashanth L. A., H. L. Prasad, Shalabh Bhatnagar, Prakash Chandra: A constrained optimization perspective on actor critic algorithms and application to network routing. CoRR abs/1507.07984 (2015)
- 2014
- [j3] Prashanth L. A., Abhranil Chatterjee, Shalabh Bhatnagar: Two timescale convergent Q-learning for sleep-scheduling in wireless sensor networks. Wirel. Networks 20(8): 2589-2604 (2014)
- [c10] Prashanth L. A.: Policy Gradients for CVaR-Constrained MDPs. ALT 2014: 155-169
- [c9] Raphael Fonteneau, Prashanth L. A.: Simultaneous perturbation algorithms for batch off-policy search. CDC 2014: 2622-2627
- [c8] Prashanth L. A., Abhranil Chatterjee, Shalabh Bhatnagar: Adaptive sleep-wake control using reinforcement learning in sensor networks. COMSNETS 2014: 1-8
- [c7] Prashanth L. A., Nathaniel Korda, Rémi Munos: Fast LSTD Using Stochastic Approximation: Finite Time Analysis and Application to Traffic Control. ECML/PKDD (2) 2014: 66-81
- [i9] H. L. Prasad, Prashanth L. A., Shalabh Bhatnagar: Algorithms for Nash Equilibria in General-Sum Stochastic Games. CoRR abs/1401.2086 (2014)
- [i8] Raphael Fonteneau, Prashanth L. A.: Simultaneous Perturbation Algorithms for Batch Off-Policy Search. CoRR abs/1403.4514 (2014)
- [i7] Prashanth L. A., Mohammad Ghavamzadeh: Actor-Critic Algorithms for Risk-Sensitive Reinforcement Learning. CoRR abs/1403.6530 (2014)
- [i6] Prashanth L. A.: Policy Gradients for CVaR-Constrained MDPs. CoRR abs/1405.2690 (2014)
- [i5] Nathaniel Korda, Prashanth L. A.: On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence. CoRR abs/1411.3224 (2014)
- 2013
- [c6] Prashanth Lakshmanrao Ananthapadmanabharao, Horabailu Laxminarayana Prasad, Nirmit Desai, Shalabh Bhatnagar: Mechanisms for hostile agents with capacity constraints. AAMAS 2013: 659-666
- [c5] Prashanth L. A., Mohammad Ghavamzadeh: Actor-Critic Algorithms for Risk-Sensitive MDPs. NIPS 2013: 252-260
- [i4] Prashanth L. A., Nathaniel Korda, Rémi Munos: Analysis of stochastic approximation for efficient least squares regression and LSTD. CoRR abs/1306.2557 (2013)
- [i3] Nathaniel Korda, Prashanth L. A., Rémi Munos: Online gradient descent for least squares regression: Non-asymptotic bounds and application to bandits. CoRR abs/1307.3176 (2013)
- [i2] Prashanth Lakshmanrao Ananthapadmanabharao, Abhranil Chatterjee, Shalabh Bhatnagar: Reinforcement Learning for Sleep-Wake Scheduling in Sensor Networks. CoRR abs/1312.7292 (2013)
- [i1] Prashanth L. A., H. L. Prasad, Nirmit Desai, Shalabh Bhatnagar, Gargi Dasgupta: Simultaneous Perturbation Methods for Adaptive Labor Staffing in Service Systems. CoRR abs/1312.7430 (2013)
- 2012
- [j2] Prashanth L. A., Shalabh Bhatnagar: Threshold Tuning Using Stochastic Optimization for Graded Signal Control. IEEE Trans. Veh. Technol. 61(9): 3865-3880 (2012)
- 2011
- [j1] Prashanth L. A., Shalabh Bhatnagar: Reinforcement Learning With Function Approximation for Traffic Signal Control. IEEE Trans. Intell. Transp. Syst. 12(2): 412-421 (2011)
- [c4] Prashanth L. A., H. L. Prasad, Nirmit Desai, Shalabh Bhatnagar, Gargi Banerjee Dasgupta: Stochastic Optimization for Adaptive Labor Staffing in Service Systems. ICSOC 2011: 487-494
- [c3] Prashanth L. A., Shalabh Bhatnagar: Reinforcement learning with average cost for adaptive control of traffic lights at intersections. ITSC 2011: 1640-1645
2000 – 2009
- 2008
- [c2] Prashanth L. A., Sajal Kumar Das, K. Gopinath: MAC Design for Heterogeneous Application Support in OFDM Based Wireless Systems. CCNC 2008: 412-413
- [c1] Prashanth L. A., K. Gopinath: OFDM-MAC algorithms and their impact on TCP performance in next generation mobile networks. COMSWARE 2008: 133-140
last updated on 2024-09-13 00:43 CEST by the dblp team
all metadata released as open data under CC0 1.0 license