Commit e8f2980

Initial release. Hello world :).
0 parents  commit e8f2980

97 files changed

+6500
-0
lines changed

.gitignore

+31
@@ -0,0 +1,31 @@
1+
*.swp
2+
*.pyc
3+
*.py~
4+
.DS_Store
5+
6+
# Setuptools distribution and build folders.
7+
/dist/
8+
/build
9+
10+
# Virtualenv
11+
/env
12+
13+
# Python egg metadata, regenerated from source files by setuptools.
14+
/*.egg-info
15+
16+
*.sublime-project
17+
*.sublime-workspace
18+
19+
logs/
20+
21+
.ipynb_checkpoints
22+
ghostdriver.log
23+
24+
junk
25+
MUJOCO_LOG.txt
26+
mujoco-bundle
27+
28+
29+
rllab_mujoco
30+
31+
tutorial/*.html

.travis.yml

+32
@@ -0,0 +1,32 @@
1+
dist: trusty
2+
sudo: required
3+
cache:
4+
apt: true
5+
pip: false
6+
language: python
7+
python:
8+
- "2.7"
9+
# - "3.2"
10+
11+
# Install numpy and scipy so we don't need to compile them
12+
addons:
13+
apt:
14+
packages:
15+
- python-numpy
16+
- python-matplotlib
17+
- python-tk
18+
19+
before_install:
20+
- Xvfb :12 -screen 0 800x600x24 +extension RANDR &
21+
- mkdir -p ~/.mujoco
22+
- curl https://openai-public.s3-us-west-2.amazonaws.com/mujoco/$MUJOCO_KEY_BUNDLE.tar.gz | tar xz -C ~/.mujoco
23+
env:
24+
- DISPLAY=:12
25+
26+
install: pip install -r requirements.txt
27+
script: nose2
28+
29+
notifications:
30+
slack:
31+
secure: h/Mxm8K+avH/2W0818zCHmLloRPMFN4NJL01+VShvAkH80/acfjeq/+mMdWXXPL/oOB6kSHDk+GDhwR6+s03ZcPMn5INTFvFYqUc6UWmT+NXtOPxGTN0xda6MdYUkWQUKaMyjFrweZQOMOASFBIzPOq4XeVbM5aB8s4EJhnfAcYZhp/idwKbToVihN4KZgxlvZIFc8iEp1o9uSl5qrsaeYYYXRkb6mauacAwOo4/Chu+cOnoLUOnvhBFE3rV3doDNrbnoalO8XiExtgx5CIAYWrlMni7r2Q+LlzgwdyTH19ZtybPxJTZIIWSBQ2UtcoYdIEDcc36GcUwz1VUGg32mLJJnY2xw80CWR4ixFPpLwwP5Y99WTn8v094B4nmFTWOwNWXp3EkqtTN9XcJoRBqXB5ArucIPqrx57dOCljSKx22gL6WaF2p3stSAxIGFektGyGnisaELrFZG1C63aHoUPicj3gUlijmAoUmYaDRf6P1wnpXqBpKDAWWhAMSatvx1ekmEJgR7OQklQnnfjx9kENDUygNUWS4IQwN2qYieuzHFL3of7/30mTM43+Vt/vWN8GI7j01BXu6FNGGloHxjH1pt3bLP/+uj5BJsT2HWF+Z8XR4VE6cyVuKsQAFgCXwOkoDHALbcwsspONDIt/9ixkesgh1oFt4CzU3UuU5wYs=
32+
on_success: change

CODE_OF_CONDUCT.rst

+13
@@ -0,0 +1,13 @@
1+
OpenAI Gym is dedicated to providing a harassment-free experience for
2+
everyone, regardless of gender, gender identity and expression, sexual
3+
orientation, disability, physical appearance, body size, age, race, or
4+
religion. We do not tolerate harassment of participants in any form.
5+
6+
This code of conduct applies to all OpenAI Gym spaces (including Gist
7+
comments) both online and off. Anyone who violates this code of
8+
conduct may be sanctioned or expelled from these spaces at the
9+
discretion of the OpenAI team.
10+
11+
We may add additional rules over time, which will be made clearly
12+
available to participants. Participants are responsible for knowing
13+
and abiding by these rules.

Dockerfile

+35
@@ -0,0 +1,35 @@
# A Dockerfile that sets up a full Gym install
FROM ubuntu:14.04

RUN apt-get update \
    && apt-get install -y xorg-dev \
    libgl1-mesa-dev \
    xvfb \
    libxinerama1 \
    libxcursor1 \
    libglu1-mesa \
    libav-tools \
    python-numpy \
    python-scipy \
    python-pyglet \
    python-setuptools \
    libpq-dev \
    libjpeg-dev \
    curl \
    cmake \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/* \
    && easy_install pip

WORKDIR /usr/local/gym
RUN mkdir gym && touch gym/__init__.py
COPY ./gym/version.py ./gym
COPY ./requirements.txt .
COPY ./setup.py .
RUN pip install -r requirements.txt

# Finally, upload our actual code!
COPY . /usr/local/gym

WORKDIR /root
ENTRYPOINT ["/usr/local/gym/bin/docker_entrypoint"]

Makefile

+7
@@ -0,0 +1,7 @@
.PHONY: install test

install:
	pip install -r requirements.txt

test:
	nose2

README.rst

+208
@@ -0,0 +1,208 @@
1+
gym
2+
******
3+
4+
**OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms.** This is the ``gym`` open-source library, which gives you access to an ever-growing variety of environments.
5+
6+
``gym`` makes no assumptions about the structure of your agent, and is compatible with any numerical computation library, such as TensorFlow or Theano. You can use it from Python code, and soon from other languages.
7+
8+
If you're not sure where to start, we recommend beginning with the
9+
`docs <https://gym.openai.com/docs>`_ on our site.
10+
11+
.. contents:: **Contents of this document**
12+
:depth: 2
13+
14+
Basics
15+
======
16+
17+
There are two basic concepts in reinforcement learning: the
18+
environment (namely, the outside world) and the agent (namely, the
19+
algorithm you are writing). The agent sends `actions` to the
20+
environment, and the environment replies with `observations` and
21+
`rewards` (that is, a score).
22+
23+
The core `gym` interface is `Env
24+
<https://github.com/openai/gym/blob/master/gym/core.py>`_, which is
25+
the unified environment interface. There is no interface for agents;
26+
that part is left to you. The following are the ``Env`` methods you
27+
should know:
28+
29+
- `reset(self)`: Reset the environment's state. Returns `observation`.
30+
- `step(self, action)`: Step the environment by one timestep. Returns `observation`, `reward`, `done`, `info`.
31+
- `render(self, mode='human', close=False)`: Render one frame of the environment. The default mode will do something human friendly, such as pop up a window. Passing the `close` flag signals the renderer to close any such windows.
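As a rough illustration of the agent/environment loop described above, here is a minimal sketch that does not import ``gym`` at all. ``CountdownEnv`` and ``run_episode`` are hypothetical names made up for this example; the toy class only mimics the ``reset``/``step`` shape, and it assumes ``step`` returns ``(observation, reward, done, info)``.

```python
import random


class CountdownEnv(object):
    """Hypothetical toy environment; NOT part of gym.

    The state is an integer counter. Action 1 decrements it, action 0
    does nothing. The episode ends when the counter reaches zero.
    """

    def __init__(self, start=3):
        self.start = start
        self.state = start

    def reset(self):
        # Reset the environment's state and return the first observation.
        self.state = self.start
        return self.state

    def step(self, action):
        # Advance one timestep. Assumed return shape:
        # (observation, reward, done, info).
        if action == 1:
            self.state -= 1
        done = self.state == 0
        reward = 1.0 if done else 0.0
        return self.state, reward, done, {}


def run_episode(env, policy, max_steps=100):
    """Run one episode and return (total_reward, steps_taken)."""
    observation = env.reset()
    total_reward = 0.0
    for t in range(max_steps):
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            return total_reward, t + 1
    return total_reward, max_steps


if __name__ == "__main__":
    rng = random.Random(0)
    # The "agent" here is just a function from observation to action.
    print(run_episode(CountdownEnv(), lambda ob: rng.choice([0, 1])))
```

Since there is no agent interface, the sketch treats the agent as a plain function from observation to action; a real agent would typically keep state and learn from the rewards.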
32+
33+
Installation
34+
============
35+
36+
You can perform a minimal install of ``gym`` with:
37+
38+
.. code:: shell
39+
40+
git clone https://github.com/openai/gym
41+
cd gym
42+
pip install -e .
43+
44+
You'll be able to run a few environments right away:
45+
46+
- `algorithmic <https://gym.openai.com/envs#algorithmic>`_
47+
- `toy_text <https://gym.openai.com/envs#toy_text>`_
48+
- `classic_control <https://gym.openai.com/envs#classic_control>`_ (you'll need ``pyglet`` to render though)
49+
50+
We recommend playing with those environments at first, and then later
51+
installing the dependencies for the remaining environments.
52+
53+
Installing everything
54+
---------------------
55+
56+
Once you're ready to install everything, run ``pip install -e .[all]``.
57+
58+
MuJoCo has a proprietary dependency we can't set up for you. Follow
59+
the
60+
`instructions <https://github.com/openai/mujoco-py#obtaining-the-binaries-and-license-key>`_
61+
in the ``mujoco-py`` package for help.
62+
63+
For the install to succeed, you'll need to have some system packages
64+
installed. We'll build out the list here over time; please let us know
65+
what you end up installing on your platform.
66+
67+
On Ubuntu 14.04:
68+
69+
.. code:: shell
70+
71+
apt-get install -y python-numpy python-dev cmake zlib1g-dev libjpeg-dev xvfb libav-tools xorg-dev python-opengl
72+
73+
Supported systems
74+
-----------------
75+
76+
We currently support Python 2.7 on Linux and OSX.
77+
78+
We will expand support to Python 3 and Windows based on demand. We
79+
will also soon ship a Docker container exposing OpenAI Gym as an API
80+
callable from any platform.
81+
82+
Pip version
83+
-----------
84+
85+
To run ``pip install -e .[all]``, you'll need a semi-recent pip.
86+
Please make sure your pip is at least at version ``1.5.0``. You can
87+
upgrade using the following: ``pip install --ignore-installed
88+
pip``. Alternatively, you can open `setup.py
89+
<https://github.com/openai/gym/blob/master/setup.py>`_ and
90+
install the dependencies by hand.
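If you want to check the version requirement in a script, the comparison can be sketched as below. ``version_tuple`` and ``pip_is_recent_enough`` are hypothetical helpers written for this illustration, and the sketch assumes a purely numeric dotted version string (no ``rc`` or ``dev`` suffixes).

```python
def version_tuple(version):
    """Turn a dotted version string such as '1.5.0' into (1, 5, 0)."""
    return tuple(int(part) for part in version.split("."))


def pip_is_recent_enough(installed, minimum="1.5.0"):
    """Return True if `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    # Pass in whatever version string `pip --version` reports.
    print(pip_is_recent_enough("1.4.1"))
    print(pip_is_recent_enough("8.1.1"))
```

Comparing integer tuples rather than raw strings matters here: as strings, ``"1.10"`` sorts before ``"1.5.0"``, which would give the wrong answer.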
91+
92+
Installing dependencies for specific environments
93+
-------------------------------------------------
94+
95+
If you'd like to install the dependencies for only specific
96+
environments, see `setup.py
97+
<https://github.com/openai/gym/blob/master/setup.py>`_. We
98+
maintain the lists of dependencies on a per-environment group basis.
99+
100+
Environments
101+
============
102+
103+
The code for each environment group is housed in its own subdirectory
104+
`gym/envs
105+
<https://github.com/openai/gym/blob/master/gym/envs>`_. The
106+
specification of each task is in `gym/envs/__init__.py
107+
<https://github.com/openai/gym/blob/master/gym/envs/__init__.py>`_. It's
108+
worth browsing through both.
109+
110+
Algorithmic
111+
-----------
112+
113+
These are a variety of algorithmic tasks, such as learning to copy a
114+
sequence.
115+
116+
.. code:: python
117+
118+
import gym
119+
env = gym.make('Copy-v0')
120+
env.reset()
121+
env.render()
122+
123+
Atari
124+
-----
125+
126+
The Atari environments are a variety of Atari video games. If you didn't do the full install, you can install dependencies via ``pip install -e .[atari]`` and then get started as follows:
127+
128+
.. code:: python
129+
130+
import gym
131+
env = gym.make('SpaceInvaders-v0')
132+
env.reset()
133+
env.render()
134+
135+
This will install ``atari-py``, which automatically compiles the `Arcade Learning Environment <http://www.arcadelearningenvironment.org/>`_. This can take quite a while (a few minutes on a decent laptop), so just be prepared.
136+
137+
Board games
138+
-----------
139+
140+
The board game environments are a variety of board games. If you didn't do the full install, you can install dependencies via ``pip install -e .[board_game]`` and then get started as follows:
141+
142+
.. code:: python
143+
144+
import gym
145+
env = gym.make('Go9x9-v0')
146+
env.reset()
147+
env.render()
148+
149+
Classic control
150+
---------------
151+
152+
These are a variety of classic control tasks, which would appear in a typical reinforcement learning textbook. If you didn't do the full install, you will need to run ``pip install -e .[classic_control]`` to enable rendering. You can get started with them via:
153+
154+
.. code:: python
155+
156+
import gym
157+
env = gym.make('CartPole-v0')
158+
env.reset()
159+
env.render()
160+
161+
MuJoCo
162+
------
163+
164+
`MuJoCo <http://www.mujoco.org/>`_ is a physics engine which can do
165+
very detailed, efficient simulations with contacts. It's not
166+
open-source, so you'll have to follow the instructions in `mujoco-py
167+
<https://github.com/openai/mujoco-py#obtaining-the-binaries-and-license-key>`_
168+
to set it up. You'll have to also run ``pip install -e .[mujoco]`` if you didn't do the full install.
169+
170+
.. code:: python
171+
172+
import gym
173+
env = gym.make('Humanoid-v0')
174+
env.reset()
175+
env.render()
176+
177+
Toy text
178+
--------
179+
180+
These are simple, text-based toy environments. There's no extra dependency to install, so to get started, you can just do:
181+
182+
.. code:: python
183+
184+
import gym
185+
env = gym.make('FrozenLake-v0')
186+
env.reset()
187+
env.render()
188+
189+
Examples
190+
========
191+
192+
See the ``examples`` directory.
193+
194+
- Run `examples/agents/random_agent.py <https://github.com/openai/gym/blob/master/examples/agents/random_agent.py>`_ to run a simple random agent and upload the results to the scoreboard.
195+
- Run `examples/agents/cem.py <https://github.com/openai/gym/blob/master/examples/agents/cem.py>`_ to run an actual learning agent (using the cross-entropy method) and upload the results to the scoreboard.
196+
- Run `examples/scripts/list_envs <https://github.com/openai/gym/blob/master/examples/scripts/list_envs>`_ to generate a list of all environments. (You can also just `browse <https://gym.openai.com/docs>`_ the list on our site.)
197+
- Run `examples/scripts/upload <https://github.com/openai/gym/blob/master/examples/scripts/upload>`_ to upload the recorded output from ``random_agent.py`` or ``cem.py``. Make sure to obtain an `API key <https://gym.openai.com/settings/profile>`_.
198+
199+
Testing
200+
=======
201+
202+
We use `nose2 <https://github.com/nose-devs/nose2>`_ for tests. You can run them via:
203+
204+
.. code:: shell
205+
206+
nose2
207+
208+
You can also run tests in a specific directory by using the ``-s`` option, or by passing in the specific name of the test. See the `nose2 docs <http://nose2.readthedocs.org/en/latest/usage.html#naming-tests>`_ for more details.
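As an illustration of what nose2 picks up, here is a minimal sketch of a test module. It assumes nose2's default discovery of ``unittest.TestCase`` subclasses in files whose names start with ``test``; the file name ``test_example.py`` and the class ``ExampleTest`` are hypothetical and not part of this commit.

```python
# Hypothetical file: test_example.py
import unittest


class ExampleTest(unittest.TestCase):
    def test_theta_split(self):
        # Weights first, bias last, mirroring how
        # examples/agents/_policies.py slices its parameter vector.
        theta = [0.1, 0.2, 0.3]
        self.assertEqual(theta[:-1], [0.1, 0.2])
        self.assertEqual(theta[-1], 0.3)
```

Because it is a plain ``unittest`` test case, the same file can also be run without nose2 via ``python -m unittest test_example``.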

bin/docker_entrypoint

+12
@@ -0,0 +1,12 @@
#!/bin/sh

# This script is the entrypoint for our Docker image.

set -e

# Set up display; otherwise rendering will cause segfaults
rm -f /tmp/.X12-lock
Xvfb :12 -screen 0 800x600x24 +extension RANDR 2>/dev/null &
export DISPLAY=:12

exec "$@"

examples/agents/_policies.py

+19
@@ -0,0 +1,19 @@
# Support code for cem.py

class BinaryActionLinearPolicy(object):
    def __init__(self, theta):
        self.w = theta[:-1]
        self.b = theta[-1]
    def act(self, ob):
        y = ob.dot(self.w) + self.b
        a = int(y < 0)
        return a

class ContinuousActionLinearPolicy(object):
    def __init__(self, theta, n_in, n_out):
        assert len(theta) == (n_in + 1) * n_out
        self.W = theta[0 : n_in * n_out].reshape(n_in, n_out)
        self.b = theta[n_in * n_out : None].reshape(1, n_out)
    def act(self, ob):
        a = ob.dot(self.W) + self.b
        return a
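
To make the packing of the flat parameter vector ``theta`` easier to follow, the sketch below redoes the ``BinaryActionLinearPolicy`` arithmetic with plain Python lists instead of NumPy arrays. ``binary_linear_action`` is a hypothetical helper written for this illustration and is not part of the commit.

```python
def binary_linear_action(theta, ob):
    """Plain-list version of BinaryActionLinearPolicy.act (illustration only).

    theta packs the weights first and the bias last, so an observation
    of length n needs a theta of length n + 1.
    """
    w, b = theta[:-1], theta[-1]
    assert len(w) == len(ob)
    # Same as ob.dot(w) + b for 1-D arrays.
    y = sum(o * wi for o, wi in zip(ob, w)) + b
    # Action 1 when the linear score is negative, else action 0.
    return int(y < 0)


if __name__ == "__main__":
    theta = [0.5, -1.0, 0.25]  # two weights followed by one bias
    print(binary_linear_action(theta, [1.0, 1.0]))  # score is -0.25
    print(binary_linear_action(theta, [1.0, 0.0]))  # score is 0.75
```

The continuous policy follows the same idea with a weight matrix: its ``theta`` must have ``(n_in + 1) * n_out`` entries, the first ``n_in * n_out`` for ``W`` and the rest for ``b``.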
