What is MARO?
===============

.. figure:: ./images/logo.svg
   :width: 666px
   :align: center
   :alt: MARO
   :target: https://github.com/microsoft/maro

The Multi-Agent Resource Optimization (MARO) platform is an instance of
Reinforcement Learning as a Service (RaaS) for real-world resource optimization.
It can be applied to many important industrial domains, such as container
inventory management in logistics, bike repositioning in transportation,
virtual machine provisioning in data centers, and asset management in finance.
Besides Reinforcement Learning (RL), it also supports other planning and
decision mechanisms, such as Operations Research.

Key Components
---------------

.. figure:: ./images/maro_overview.svg
   :width: 1000px

- Simulation toolkit: provides predefined scenarios and reusable building
  blocks for creating new scenarios.
- RL toolkit: provides a full-stack abstraction for RL, including the agent
  manager, agent, RL algorithms, learner, actor, and various shapers.
- Distributed toolkit: provides distributed communication components, an
  interface for user-defined functions that handle messages automatically,
  cluster provisioning, and job orchestration.

Quick Start
-------------

.. code-block:: python

    from maro.simulator import Env
    from maro.simulator.scenarios.cim.common import Action

    # Initialize an environment with a specific scenario and topology.
    # In Container Inventory Management, 1 tick is 1 day, so durations=100 covers 100 days.
    env = Env(scenario="cim", topology="toy.5p_ssddd_l0.0", start_tick=0, durations=100)

    # Query the environment summary, which includes business instances,
    # intra-instance attributes, etc.
    print(env.summary)

    for ep in range(2):
        # Gym-like step function.
        metrics, decision_event, is_done = env.step(None)

        while not is_done:
            past_week_ticks = [
                x for x in range(decision_event.tick - 7, decision_event.tick)
            ]
            decision_port_idx = decision_event.port_idx
            intr_port_infos = ["booking", "empty", "shortage"]

            # Query the snapshot list of the environment for the booking,
            # empty container inventory, and shortage of the decision port
            # over the past week.
            past_week_info = env.snapshot_list["ports"][
                past_week_ticks : decision_port_idx : intr_port_infos
            ]

            dummy_action = Action(
                vessel_idx=decision_event.vessel_idx,
                port_idx=decision_event.port_idx,
                quantity=0
            )

            # Drive the environment with a dummy action (no repositioning).
            metrics, decision_event, is_done = env.step(dummy_action)

        # Query the environment business metrics at the end of an episode.
        # These are the objectives to optimize (usually more than one target).
        print(f"ep: {ep}, environment metrics: {env.metrics}")
        env.reset()

Contents
----------

.. toctree::
   :maxdepth: 2
   :caption: Installation

   installation/pip_install.rst
   installation/playground.rst
   installation/grass_cluster_provisioning_on_azure.rst
   installation/k8s_cluster_provisioning_on_azure.rst

.. toctree::
   :maxdepth: 2
   :caption: Scenarios

   scenarios/container_inventory_management.rst
   scenarios/citi_bike.rst

.. toctree::
   :maxdepth: 2
   :caption: Examples

   examples/multi_agent_dqn_cim.rst
   examples/greedy_policy_citi_bike.rst

.. toctree::
   :maxdepth: 2
   :caption: Key Components

   key_components/simulation_toolkit.rst
   key_components/data_model.rst
   key_components/event_buffer.rst
   key_components/business_engine.rst
   key_components/rl_toolkit.rst
   key_components/distributed_toolkit.rst
   key_components/communication.rst
   key_components/orchestration.rst

.. toctree::
   :maxdepth: 2
   :caption: API Documents

   apidoc/maro.rst