Maximize Producer Rewards in Distributed Windmill Environments: A Q-Learning Approach

1 Google Inc., 1600 Amphitheatre Pkwy Mountain View, CA 94043, USA;
2 Department of Electrical Engineering and Computer Science, University of Kansas, 1520 West 15th Street 2001 Eaton Hall, KS 66045, USA;
3 Department of Telecommunication Engineering, University of Oklahoma, 4502 E41st ST #4403, Tulsa, OK 74105, USA;
4 College of Electronics and Information Engineering, Tongji University, 4800 Cao'an Road, 201804, Shanghai, China

Special Issues: Wind Power Implementation Challenges

In Smart Grid environments, homes equipped with windmills are encouraged to generate energy and sell it back to utilities. Time-of-use pricing and the introduction of storage devices greatly influence a user's decision about when to sell back energy and how much to sell. A study of sequential decision-making algorithms that can optimize the user's total payoff is therefore necessary. In this paper, reinforcement learning is used to tackle this optimization problem. The problem of determining when to sell back energy is formulated as a Markov decision process, and the model is learned adaptively using Q-learning. Experiments are run with varying storage capacities and under periodic energy generation rates with different levels of fluctuation. The results show a notable increase in discounted total rewards from selling back energy with the proposed approach.
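The formulation summarized above can be illustrated with a minimal tabular Q-learning sketch. It is not the paper's model: the price schedule, generation pattern, storage capacity, learning parameters, and the names `step`, `train`, and `greedy_action` are all hypothetical choices made only to show the shape of the technique. The state is assumed to be the pair (time slot, stored energy) and the action is the number of units sold.

```python
import random

# All numbers below are illustrative assumptions, not values from the paper.
PRICES = [1.0, 1.0, 3.0, 3.0]   # hypothetical time-of-use price per unit, by slot
GENERATION = [1, 2, 1, 0]       # hypothetical periodic wind generation per slot
CAPACITY = 3                    # hypothetical storage capacity in energy units
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration


def step(slot, stored, sell):
    """Sell `sell` units at the current price, then store the new generation."""
    reward = PRICES[slot] * sell
    next_stored = min(CAPACITY, stored - sell + GENERATION[slot])
    return (slot + 1) % len(PRICES), next_stored, reward


def train(steps=50_000, seed=0):
    """Learn Q[(slot, stored)][sell] with epsilon-greedy exploration."""
    rng = random.Random(seed)
    # One value per feasible action: selling 0..stored units.
    q = {(t, s): [0.0] * (s + 1)
         for t in range(len(PRICES)) for s in range(CAPACITY + 1)}
    slot, stored = 0, 0
    for _ in range(steps):
        values = q[(slot, stored)]
        if rng.random() < EPSILON:
            sell = rng.randrange(len(values))  # explore
        else:
            sell = max(range(len(values)), key=values.__getitem__)  # exploit
        next_slot, next_stored, reward = step(slot, stored, sell)
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
        target = reward + GAMMA * max(q[(next_slot, next_stored)])
        values[sell] += ALPHA * (target - values[sell])
        slot, stored = next_slot, next_stored
    return q


def greedy_action(q, slot, stored):
    """Return the sell amount with the highest learned value in this state."""
    values = q[(slot, stored)]
    return max(range(len(values)), key=values.__getitem__)
```

Calling `q = train()` and then `greedy_action(q, slot, stored)` gives the learned sell amount for a given state. The discount factor `GAMMA` is what makes the agent weigh selling now against storing energy for a later, possibly higher-priced slot, which corresponds to the discounted total reward mentioned above.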



Copyright Info: © 2015, Samuel Cheng, et al., licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
