Feng Wu
Multi-Agent Systems Lab
School of Computer Science
University of Science and Technology of China
Jinzhai Road 96, Hefei, Anhui 230027, P.R.China
Phone: +86-551-360-6724
E-Mail: wufeng@mail.ustc.edu.cn
Address: P.O.Box 4, Hefei, Anhui 230026, China
Curriculum Vitae
Research Statement
I am a PhD candidate in the School of Computer Science and a member of the Multi-Agent Systems Lab at the University of Science and Technology of China (USTC). My advisors are Professor Xiaoping Chen and Professor Shlomo Zilberstein.
From September 2008 to September 2010, I visited the Resource-Bounded Reasoning Lab at the University of Massachusetts Amherst (UMass) and worked on multi-agent planning with decision-theoretic models such as decentralized POMDPs (DEC-POMDPs).
Before that, I worked for the WrightEagle robot soccer team on multi-agent decision-making and real-time game play.
In the summer of 2010, I worked with Bhaskara Marthi using the ROS system and the PR2 robot.
My main research interests are in planning under uncertainty, multi-agent planning and learning, resource-bounded reasoning, decision theory, game theory, autonomous robots, human-robot interaction and reinforcement learning.
Projects
Multi-Agent Online Planning | Memory-Bounded Planning | Monte-Carlo Planning
- Planning for Decentralized POMDPs with Many Agents
To the best of our knowledge, state-of-the-art approaches for general decentralized POMDP settings have only been tested on problems with 2 agents. In theory, all of these algorithms can be applied to domains with many agents, but scalability is a major challenge. For large problems with many agents, even modeling them as DEC-POMDPs is nontrivial because the state, joint action, and joint observation spaces all grow exponentially with the number of agents (e.g., a problem with 100 agents and 2 actions per agent has 2^100 joint actions). Technically, even storing the transition, observation, and reward tables is challenging. On the other hand, simulators exist for many real-world applications (e.g., robot soccer). Our preliminary idea is to learn the decentralized policies directly by interacting with a simulator, so we compute the decentralized policies without explicitly representing the problem as a DEC-POMDP. The input of our learning algorithm is a simulated environment and the output is a set of policies, one for each agent. For more details, please read our paper [UAI'10].
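The contrast between tabulating the joint model and sampling from a simulator can be sketched in a few lines of Python. This is only an illustration, not the algorithm from the paper: ToySimulator, its reward, and the two hand-written policies are hypothetical stand-ins. It shows that the joint action space is exponential in the number of agents, while evaluating a set of local policies by rollouts only needs one call per agent per step.

```python
import random


def joint_action_count(n_agents, actions_per_agent):
    """Size of the joint action space when each agent chooses independently."""
    return actions_per_agent ** n_agents


class ToySimulator:
    """Hypothetical toy domain, used only for illustration.

    A hidden binary state is redrawn every step. Each agent receives its own
    noisy observation of the state and the team is rewarded for matching it.
    """

    def __init__(self, n_agents, obs_accuracy=0.8, seed=0):
        self.n_agents = n_agents
        self.obs_accuracy = obs_accuracy
        self.rng = random.Random(seed)
        self.state = 0

    def _observe(self):
        # One private, noisy observation per agent.
        return [
            self.state if self.rng.random() < self.obs_accuracy else 1 - self.state
            for _ in range(self.n_agents)
        ]

    def reset(self):
        self.state = self.rng.randint(0, 1)
        return self._observe()

    def step(self, joint_action):
        # Team reward: fraction of agents whose action matches the hidden state.
        reward = sum(a == self.state for a in joint_action) / self.n_agents
        self.state = self.rng.randint(0, 1)
        return self._observe(), reward


def estimate_value(sim, policies, horizon, n_rollouts):
    """Monte-Carlo estimate of the team return of a set of local policies."""
    total = 0.0
    for _ in range(n_rollouts):
        observations = sim.reset()
        for _ in range(horizon):
            # Decentralized execution: each agent sees only its own observation.
            joint_action = [pi(o) for pi, o in zip(policies, observations)]
            observations, reward = sim.step(joint_action)
            total += reward
    return total / n_rollouts


def follow_observation(obs):
    return obs


def always_zero(obs):
    return 0


n_agents = 20
print(joint_action_count(100, 2))  # number of joint actions for 100 agents
sim = ToySimulator(n_agents, seed=1)
print(estimate_value(sim, [follow_observation] * n_agents, horizon=5, n_rollouts=200))
print(estimate_value(sim, [always_zero] * n_agents, horizon=5, n_rollouts=200))
```

The simulator is only ever queried for samples, so no transition, observation, or reward table is built. A learning method would use such rollout estimates to improve the local policies rather than compare two fixed ones.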
- Memory-Bounded Planning for Multi-Agent Systems
In multi-agent systems, each agent must consider all possible outcomes of the other agents' behavior. In partially observable domains, agents have only their own partial and noisy view of the environment. It is very difficult to maintain a belief (as is done in single-agent POMDPs) about the states as well as the other agents' strategies (see I-POMDP for more details). One possible solution is to plan offline and build a set of policy (decision) trees, one for each agent, so that each agent can execute its own policy using only its local information. While execution is decentralized, planning can be done in a centralized way. The main challenge for this approach is that the number of candidate policy trees grows doubly exponentially with the horizon, so planning runs out of memory very quickly even for very small problems. The memory-bounded technique builds decentralized policies using only a fixed amount of memory. The goal of this project is to explore different memory-bounded policy representations, design methods to construct them efficiently, and solve large problems. For more details, please read our papers [AAMAS'10, AAAI'10].
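The growth described above can be made concrete with a small counting sketch. It assumes deterministic policy trees with one action per node and one branch per observation; the function names and the max_trees parameter are illustrative, and the sketch only counts trees. It does not implement the selection step a memory-bounded planner uses to decide which trees to keep.

```python
def policy_tree_nodes(n_obs, horizon):
    """Nodes in one policy tree: 1 + |O| + |O|^2 + ... + |O|^(horizon-1)."""
    return sum(n_obs ** depth for depth in range(horizon))


def policy_tree_count(n_actions, n_obs, horizon):
    """Distinct deterministic policy trees for one agent: one action per node."""
    return n_actions ** policy_tree_nodes(n_obs, horizon)


def exhaustive_backup_count(n_actions, n_obs, n_subtrees):
    """Trees produced by one backup: a root action plus one subtree per observation."""
    return n_actions * n_subtrees ** n_obs


def trees_per_step(n_actions, n_obs, horizon, max_trees=None):
    """Trees kept per agent after each backup, optionally capped at max_trees."""
    kept = n_actions if max_trees is None else min(n_actions, max_trees)
    counts = [kept]
    for _ in range(horizon - 1):
        candidates = exhaustive_backup_count(n_actions, n_obs, kept)
        kept = candidates if max_trees is None else min(candidates, max_trees)
        counts.append(kept)
    return counts


print(trees_per_step(2, 2, 4))               # [2, 8, 128, 32768]
print(trees_per_step(2, 2, 4, max_trees=3))  # [2, 3, 3, 3]
```

With the cap in place, each backup generates a constant number of candidates (here 2 * 3^2 = 18) regardless of the horizon, which is what keeps memory use fixed.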
- Online Planning for Multi-Agent Systems
The basic idea of online planning is to interleave planning with execution. Instead of computing a complete plan for the entire problem, online planning focuses only on the current step and chooses an action for the current situation. In many applications, the set of reachable states is very limited. Online planning can take advantage of this and make better use of computational resources. Moreover, other methods that are very useful in practice, such as communication, failure recovery, and re-planning, can be combined with online planning in a very natural way. The main challenge of online planning is the time constraint: when executing, agents have very limited time for planning. Multi-agent systems present another challenge: agents must reason about all possible strategies of the others. For general DEC-POMDP settings, it is insufficient to maintain only a single belief for the team given local information. One possible solution is to keep track of all possible histories (action-observation sequences) of the team, the so-called belief pool. The key challenge for this approach is that the number of beliefs in the pool grows exponentially with the number of time steps. To overcome this, efficient belief clustering techniques are required. However, any approximate solution introduces errors. Our current solution is to monitor the belief pool and use communication to refresh the pool when any inconsistencies are detected. For more details, please read our paper [ICAPS'09].
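A minimal sketch of the pool bookkeeping described above, under simplifying assumptions: histories contain observations only, bound_pool uses plain truncation as a crude stand-in for belief clustering, and all function names are hypothetical. It is not the MAOP algorithm; it only illustrates why the pool grows exponentially and how bounding it can make an agent's own history inconsistent with the pool, which is the trigger for communication.

```python
from itertools import product


def expand_pool(pool, observations, n_agents):
    """Extend every joint history by every possible joint observation."""
    joint_observations = list(product(observations, repeat=n_agents))
    return [history + (jo,) for history in pool for jo in joint_observations]


def bound_pool(pool, max_size):
    """Keep at most max_size histories (truncation stands in for clustering)."""
    return pool[:max_size]


def is_consistent(pool, agent, local_history):
    """True if some pooled joint history matches the agent's own observations."""
    local = tuple(local_history)
    return any(
        tuple(step[agent] for step in history) == local for history in pool
    )


# Two agents, two observations each: the unbounded pool has 4^t histories.
pool = [()]
pool = bound_pool(expand_pool(pool, "ab", n_agents=2), max_size=2)

# Agent 0 actually observed "b", which the bounded pool no longer covers.
if not is_consistent(pool, agent=0, local_history=["b"]):
    # Communicate: agents exchange local histories and the pool is reset
    # to the single joint history they agree on (values are illustrative).
    pool = [(("b", "a"),)]

print(pool)
```

Communication is used only when the check fails, so agents that stay consistent with the bounded pool never need to exchange messages.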
- Online Planning for Sensing Objects
What's a Multi-Agent System? [AAAI Website, MAS Blog, MAS Planning]
Papers
- Online Planning for Multi-Agent Systems with Bounded Communication, Feng Wu, Shlomo Zilberstein, Xiaoping Chen,
Artificial Intelligence (AIJ), Volume 175, Issue 2, Pages 487-511, February 2011.
[online version]
An extended version of our ICAPS-09 paper about multi-agent online planning: online coordination, history pool, and belief inconsistency.
- Online Planning for Ad Hoc Autonomous Agent Teams, Feng Wu, Shlomo Zilberstein, Xiaoping Chen,
Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI-11), Barcelona, Spain, July 2011.
(to appear)
- Rollout Sampling Policy Iteration for Decentralized POMDPs, Feng Wu, Shlomo Zilberstein, and Xiaoping Chen,
Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI-10), Pages 666-673, Catalina, CA, USA, July 2010.
[abs]
[bib]
[pdf]
We propose the idea of learning decentralized policies from rollout samples (DecRSPI) and solve a general DEC-POMDP problem with 20 agents.
- Trial-Based Dynamic Programming for Multi-Agent Planning, Feng Wu, Shlomo Zilberstein, and Xiaoping Chen,
Proceedings of the 24th AAAI Conference on Artificial Intelligence (AAAI-10), Pages 908-914, Atlanta, GA, USA, July 2010.
[abs]
[bib]
[pdf]
We address the second challenge of MBDP, policy evaluation, and introduce TBDP, which can solve a navigation problem with 37824 states.
- Point-Based Policy Generation for Decentralized POMDPs, Feng Wu, Shlomo Zilberstein, and Xiaoping Chen,
Proceedings of the 9th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS-10), Pages 1307-1314, Toronto, Canada, May 2010.
Also appears in the 5th Workshop on Multi-agent Sequential Decision-Making in Uncertain Domains (MSDM-10).
[abs]
[bib]
[pdf]
We tackle the major challenge of MBDP, policy backup, and introduce PBPG, which can solve the box-pushing problem in about 11 seconds.
- Multi-Agent Online Planning with Communication, Feng Wu, Shlomo Zilberstein, and Xiaoping Chen,
Proceedings of the 19th International Conference on Automated Planning and Scheduling (ICAPS-09), Pages 321-329, Thessaloniki, Greece, September 2009.
Fast-tracked to the Artificial Intelligence Journal (AIJ) as one of six selected papers, and nominated for the best paper award.
[abs]
[bib]
[pdf]
We present MAOP, an online planning algorithm for DEC-POMDPs with a new communication strategy based on belief inconsistency.
- Solving Large-Scale and Sparse-Reward DEC-POMDPs with Correlation-MDPs, Feng Wu and Xiaoping Chen,
Proceedings of the Robot Soccer World Cup XI Symposium (RoboCup-07), Pages 208-219, Atlanta, GA, USA, July 2007.
[online version]
We extend the correlation device to have a more complex structure and use it for computing a joint policy in robot soccer domains.
What's a DEC-POMDP? [IJCAI'09 Tutorial, AAMAS'10 Tutorial, Teamcore Page, Problem Repository]
Awards
- WrightEagle Microsoft Robotics Studio Team, Feng Wu, Limin Zhao, Xufeng Han and Xiaoping Chen,
World Champion of the RoboCup MSRS Challenge, Atlanta, GA, USA, July 2007.
[RoboCup 2007]
[MSRS]
[Photo]
- WE2006: WrightEagle Simulation 2D Team, Feng Wu, Changjie Fan and Xiaoping Chen,
World Champion of the RoboCup Simulation 2D Competition, Bremen, Germany, June 2006.
[RoboCup 2006]
[Video]
[Photo]
- WE2005: WrightEagle Simulation 2D Team, Changjie Fan, Feng Wu and Xiaoping Chen,
Second Place in the RoboCup Simulation 2D Competition, Osaka, Japan, July 2005.
[RoboCup 2005]
[Video]
[Photo]
What's RoboCup? [OSAKA Video, Official Site]
Links
- Programme Committee:
IJCAI'2011,
AAAI'2011
- Conference Deadlines:
NIPS'11(06/02),
AAMAS'12(10/07)
- Online Courses:
Machine Learning,
Multiagent Systems,
Optimization,
Graphical Models,
Machine Learning Theory
- Academic Advice:
Advice collection,
Ernst Advice,
Academic Research,
Paper Writing,
Research Inspiration,
Job Interview
- Open Source:
ROS Wiki,
ROS API,
ROS SVN,
Stage/Player/Gazebo,
WG Events,
PR Wiki,
OpenCV Wiki,
OpenRAVE Wiki,
OpenMPI Doc
I like Wikimedia! [Wikipedia, Wikibooks, Wiktionary]