 .. pandas documentation master file, created by
 
+.. module:: pandas
+
 *********************************************
 pandas: powerful Python data analysis toolkit
 *********************************************
 
-`PDF Version <pandas.pdf>`__
-
-`Zipped HTML <pandas.zip>`__
-
-.. module:: pandas
-
 **Date**: |today| **Version**: |version|
 
-**Binary Installers:** https://pypi.org/project/pandas
-
-**Source Repository:** https://github.com/pandas-dev/pandas
-
-**Issues & Ideas:** https://github.com/pandas-dev/pandas/issues
-
-**Q&A Support:** https://stackoverflow.com/questions/tagged/pandas
-
-**Developer Mailing List:** https://groups.google.com/forum/#!forum/pydata
-
-**pandas** is a `Python <https://www.python.org>`__ package providing fast,
-flexible, and expressive data structures designed to make working with
-"relational" or "labeled" data both easy and intuitive. It aims to be the
-fundamental high-level building block for doing practical, **real world** data
-analysis in Python. Additionally, it has the broader goal of becoming **the
-most powerful and flexible open source data analysis / manipulation tool
-available in any language**. It is already well on its way toward this goal.
-
-pandas is well suited for many different kinds of data:
-
-  - Tabular data with heterogeneously-typed columns, as in an SQL table or
-    Excel spreadsheet
-  - Ordered and unordered (not necessarily fixed-frequency) time series data.
-  - Arbitrary matrix data (homogeneously typed or heterogeneous) with row and
-    column labels
-  - Any other form of observational / statistical data sets. The data actually
-    need not be labeled at all to be placed into a pandas data structure
-
-The two primary data structures of pandas, :class:`Series` (1-dimensional)
-and :class:`DataFrame` (2-dimensional), handle the vast majority of typical use
-cases in finance, statistics, social science, and many areas of
-engineering. For R users, :class:`DataFrame` provides everything that R's
-``data.frame`` provides and much more. pandas is built on top of `NumPy
-<https://www.numpy.org>`__ and is intended to integrate well within a scientific
-computing environment with many other 3rd party libraries.
-
-Here are just a few of the things that pandas does well:
-
-  - Easy handling of **missing data** (represented as NaN) in floating point as
-    well as non-floating point data
-  - Size mutability: columns can be **inserted and deleted** from DataFrame and
-    higher dimensional objects
-  - Automatic and explicit **data alignment**: objects can be explicitly
-    aligned to a set of labels, or the user can simply ignore the labels and
-    let `Series`, `DataFrame`, etc. automatically align the data for you in
-    computations
-  - Powerful, flexible **group by** functionality to perform
-    split-apply-combine operations on data sets, for both aggregating and
-    transforming data
-  - Make it **easy to convert** ragged, differently-indexed data in other
-    Python and NumPy data structures into DataFrame objects
-  - Intelligent label-based **slicing**, **fancy indexing**, and **subsetting**
-    of large data sets
-  - Intuitive **merging** and **joining** data sets
-  - Flexible **reshaping** and pivoting of data sets
-  - **Hierarchical** labeling of axes (possible to have multiple labels per
-    tick)
-  - Robust IO tools for loading data from **flat files** (CSV and delimited),
-    Excel files, databases, and saving / loading data from the ultrafast **HDF5
-    format**
-  - **Time series**-specific functionality: date range generation and frequency
-    conversion, moving window statistics, moving window linear regressions,
-    date shifting and lagging, etc.
-
-Many of these principles are here to address the shortcomings frequently
-experienced using other languages / scientific research environments. For data
-scientists, working with data is typically divided into multiple stages:
-munging and cleaning data, analyzing / modeling it, then organizing the results
-of the analysis into a form suitable for plotting or tabular display. pandas
-is the ideal tool for all of these tasks.
-
-Some other notes
-
-  - pandas is **fast**. Many of the low-level algorithmic bits have been
-    extensively tweaked in `Cython <https://cython.org>`__ code. However, as with
-    anything else generalization usually sacrifices performance. So if you focus
-    on one feature for your application you may be able to create a faster
-    specialized tool.
-
-  - pandas is a dependency of `statsmodels
-    <https://www.statsmodels.org/stable/index.html>`__, making it an important part of the
-    statistical computing ecosystem in Python.
-
-  - pandas has been used extensively in production in financial applications.
-
-.. note::
+**Download documentation**: `PDF Version <pandas.pdf>`__ | `Zipped HTML <pandas.zip>`__
 
-    This documentation assumes general familiarity with NumPy. If you haven't
-    used NumPy much or at all, do invest some time in `learning about NumPy
-    <https://docs.scipy.org>`__ first.
+**Useful links**:
+`Binary Installers <https://pypi.org/project/pandas>`__ |
+`Source Repository <https://github.com/pandas-dev/pandas>`__ |
+`Issues & Ideas <https://github.com/pandas-dev/pandas/issues>`__ |
+`Q&A Support <https://stackoverflow.com/questions/tagged/pandas>`__ |
+`Mailing List <https://groups.google.com/forum/#!forum/pydata>`__
 
-See the package overview for more detail about what's in the library.
+:mod:`pandas` is an open source, BSD-licensed library providing high-performance,
+easy-to-use data structures and data analysis tools for the `Python <https://www.python.org/>`__
+programming language.
 
+See the :ref:`overview` for more detail about what's in the library.
 
 {% if single_doc and single_doc.endswith('.rst') -%}
 .. toctree::
-    :maxdepth: 4
+    :maxdepth: 2
 
     {{ single_doc[:-4] }}
 {% elif single_doc %}
 .. autosummary::
-    :toctree: api/generated/
+    :toctree: reference/api/
 
     {{ single_doc }}
 {% else -%}
 .. toctree::
-    :maxdepth: 4
+    :maxdepth: 2
 {% endif %}
 
 {% if not single_doc -%}
-    What's New <whatsnew/v0.24.0>
+    What's New in 0.24.0 <whatsnew/v0.24.0>
     install
     getting_started/index
-    cookbook
     user_guide/index
-    r_interface
     ecosystem
-    comparison_with_r
-    comparison_with_sql
-    comparison_with_sas
-    comparison_with_stata
 {% endif -%}
 {% if include_api -%}
-    api/index
+    reference/index
 {% endif -%}
 {% if not single_doc -%}
     development/index