Commit c2b1892

docs: code samples for sample, get, Series.round (#295)
BEGIN_COMMIT_OVERRIDE
docs: code samples for `sample`, `get`, `Series.round` (#295)
docs: code samples for DataFrame `set_index`, `items` (#295)
END_COMMIT_OVERRIDE

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
- [ ] Make sure to open an issue as a [bug/issue](https://togithub.com/googleapis/python-bigquery-dataframes/issues/new/choose) before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
- [ ] Ensure the tests and linter pass
- [ ] Code coverage does not decrease (if any source code was changed)
- [x] Appropriate docs were updated (if necessary)
  - `DataFrame.sample`, `Series.sample`: https://screenshot.googleplex.com/kPy5swVACMeBhSo
  - `DataFrame.get`, `Series.get`: https://screenshot.googleplex.com/7hirn5oz2b4L6B3
  - `DataFrame.set_index`: https://screenshot.googleplex.com/3CXARrp5hwV6gau
  - `DataFrame.items`: https://screenshot.googleplex.com/bk3HAiXZQq3TYD9
  - `Series.round`: https://screenshot.googleplex.com/C9c4m84NWNMnAwS

Fixes internal issues 318011542 and 318011745 🦕
1 parent 64bdf76 commit c2b1892

3 files changed: +189 -2 lines changed

third_party/bigframes_vendored/pandas/core/frame.py

Lines changed: 76 additions & 2 deletions
@@ -1187,6 +1187,47 @@ def set_index(
  Set the DataFrame index (row labels) using one existing column. The
  index can replace the existing index.

+ **Examples:**
+
+ >>> import bigframes.pandas as bpd
+ >>> bpd.options.display.progress_bar = None
+
+ >>> df = bpd.DataFrame({'month': [1, 4, 7, 10],
+ ... 'year': [2012, 2014, 2013, 2014],
+ ... 'sale': [55, 40, 84, 31]})
+ >>> df
+ month year sale
+ 0 1 2012 55
+ 1 4 2014 40
+ 2 7 2013 84
+ 3 10 2014 31
+ <BLANKLINE>
+ [4 rows x 3 columns]
+
+ Set the 'month' column to become the index:
+
+ >>> df.set_index('month')
+ year sale
+ month
+ 1 2012 55
+ 4 2014 40
+ 7 2013 84
+ 10 2014 31
+ <BLANKLINE>
+ [4 rows x 2 columns]
+
+ Create a MultiIndex using columns 'year' and 'month':
+
+ >>> df.set_index(['year', 'month'])
+ sale
+ year month
+ 2012 1 55
+ 2014 4 40
+ 2013 7 84
+ 2014 10 31
+ <BLANKLINE>
+ [4 rows x 1 columns]
+
  Args:
  keys:
  A label. This parameter can be a single column key.
@@ -1621,6 +1662,39 @@ def items(self):
  Iterates over the DataFrame columns, returning a tuple with
  the column name and the content as a Series.

+ **Examples:**
+
+ >>> import bigframes.pandas as bpd
+ >>> bpd.options.display.progress_bar = None
+
+ >>> df = bpd.DataFrame({'species': ['bear', 'bear', 'marsupial'],
+ ... 'population': [1864, 22000, 80000]},
+ ... index=['panda', 'polar', 'koala'])
+ >>> df
+ species population
+ panda bear 1864
+ polar bear 22000
+ koala marsupial 80000
+ <BLANKLINE>
+ [3 rows x 2 columns]
+
+ >>> for label, content in df.items():
+ ... print(f'--> label: {label}')
+ ... print(f'--> content:\\n{content}')
+ ...
+ --> label: species
+ --> content:
+ panda bear
+ polar bear
+ koala marsupial
+ Name: species, dtype: string
+ --> label: population
+ --> content:
+ panda 1864
+ polar 22000
+ koala 80000
+ Name: population, dtype: Int64
+
  Returns:
  Iterator: Iterator of label, Series for each column.
  """
@@ -4587,7 +4661,7 @@ def index(self):
  ... 'Location': ['Seattle', 'New York', 'Kona']},
  ... index=([10, 20, 30]))
  >>> df
- Name Age Location
+ Name Age Location
  10 Alice 25 Seattle
  20 Bob 30 New York
  30 Aritra 35 Kona
@@ -4603,7 +4677,7 @@ def index(self):

  >>> df1 = df.set_index(["Name", "Location"])
  >>> df1
- Age
+ Age
  Name Location
  Alice Seattle 25
  Bob New York 30

third_party/bigframes_vendored/pandas/core/generic.py

Lines changed: 94 additions & 0 deletions
@@ -254,6 +254,55 @@ def get(self, key, default=None):

  Returns default value if not found.

+ **Examples:**
+
+ >>> import bigframes.pandas as bpd
+ >>> bpd.options.display.progress_bar = None
+
+ >>> df = bpd.DataFrame(
+ ... [
+ ... [24.3, 75.7, "high"],
+ ... [31, 87.8, "high"],
+ ... [22, 71.6, "medium"],
+ ... [35, 95, "medium"],
+ ... ],
+ ... columns=["temp_celsius", "temp_fahrenheit", "windspeed"],
+ ... index=["2014-02-12", "2014-02-13", "2014-02-14", "2014-02-15"],
+ ... )
+ >>> df
+ temp_celsius temp_fahrenheit windspeed
+ 2014-02-12 24.3 75.7 high
+ 2014-02-13 31.0 87.8 high
+ 2014-02-14 22.0 71.6 medium
+ 2014-02-15 35.0 95.0 medium
+ <BLANKLINE>
+ [4 rows x 3 columns]
+
+ >>> df.get(["temp_celsius", "windspeed"])
+ temp_celsius windspeed
+ 2014-02-12 24.3 high
+ 2014-02-13 31.0 high
+ 2014-02-14 22.0 medium
+ 2014-02-15 35.0 medium
+ <BLANKLINE>
+ [4 rows x 2 columns]
+
+ >>> ser = df['windspeed']
+ >>> ser
+ 2014-02-12 high
+ 2014-02-13 high
+ 2014-02-14 medium
+ 2014-02-15 medium
+ Name: windspeed, dtype: string
+ >>> ser.get('2014-02-13')
+ 'high'
+
+ If the key is not found, the default value will be used.
+
+ >>> df.get(["temp_celsius", "temp_kelvin"])
+ >>> df.get(["temp_celsius", "temp_kelvin"], default="default_value")
+ 'default_value'
+
  Args:
  key: object

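The fallback behaviour documented in the `get` examples follows the same contract as Python's built-in `dict.get`: a missing key yields `default` (which is `None` unless specified) instead of raising. As a rough, library-free analogy, it might be sketched as below. This is not bigframes code; the `columns` dict and the `get_columns` helper are hypothetical names introduced only for illustration.

```python
# Hypothetical stand-in for a table: column name -> list of values.
columns = {
    "temp_celsius": [24.3, 31.0, 22.0, 35.0],
    "windspeed": ["high", "high", "medium", "medium"],
}


def get_columns(table, keys, default=None):
    """Return the requested column(s), or `default` if any key is missing."""
    if isinstance(keys, list):
        # A list of keys succeeds only when every key is present.
        if all(key in table for key in keys):
            return {key: table[key] for key in keys}
        return default
    # A single key behaves exactly like dict.get.
    return table.get(keys, default)


found = get_columns(columns, ["temp_celsius", "windspeed"])
missing = get_columns(columns, ["temp_celsius", "temp_kelvin"])
fallback = get_columns(
    columns, ["temp_celsius", "temp_kelvin"], default="default_value"
)
```

Here `missing` is `None`, which is why the first `df.get(["temp_celsius", "temp_kelvin"])` call in the docstring prints nothing, while `fallback` carries the supplied default.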
@@ -410,6 +459,51 @@ def sample(

  You can use `random_state` for reproducibility.

+ **Examples:**
+
+ >>> import bigframes.pandas as bpd
+ >>> bpd.options.display.progress_bar = None
+
+ >>> df = bpd.DataFrame({'num_legs': [2, 4, 8, 0],
+ ... 'num_wings': [2, 0, 0, 0],
+ ... 'num_specimen_seen': [10, 2, 1, 8]},
+ ... index=['falcon', 'dog', 'spider', 'fish'])
+ >>> df
+ num_legs num_wings num_specimen_seen
+ falcon 2 2 10
+ dog 4 0 2
+ spider 8 0 1
+ fish 0 0 8
+ <BLANKLINE>
+ [4 rows x 3 columns]
+
+ Fetch one random row from the DataFrame (Note that we use `random_state`
+ to ensure reproducibility of the examples):
+
+ >>> df.sample(random_state=1)
+ num_legs num_wings num_specimen_seen
+ dog 4 0 2
+ <BLANKLINE>
+ [1 rows x 3 columns]
+
+ A random 50% sample of the DataFrame:
+
+ >>> df.sample(frac=0.5, random_state=1)
+ num_legs num_wings num_specimen_seen
+ dog 4 0 2
+ fish 0 0 8
+ <BLANKLINE>
+ [2 rows x 3 columns]
+
+ Extract 3 random elements from the Series `df['num_legs']`:
+
+ >>> s = df['num_legs']
+ >>> s.sample(n=3, random_state=1)
+ dog 4
+ fish 0
+ spider 8
+ Name: num_legs, dtype: Int64
+
  Args:
  n (Optional[int], default None):
  Number of items from axis to return. Cannot be used with `frac`.
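The `random_state` argument used in the `sample` examples pins the random draw so that doctest output stays stable. The general idea of seeded, reproducible sampling can be sketched with the standard library alone. This is an analogy only: `sample_rows` is a hypothetical helper, Python's `random` module is a different generator from whatever bigframes uses, and the labels it picks are not expected to match the docstring output.

```python
import random

# Row labels borrowed from the docstring example.
rows = ["falcon", "dog", "spider", "fish"]


def sample_rows(labels, n, seed):
    """Draw n distinct labels using a private, seeded generator."""
    rng = random.Random(seed)  # local generator; global random state is untouched
    return rng.sample(labels, n)


first = sample_rows(rows, 3, seed=1)
second = sample_rows(rows, 3, seed=1)
```

Because both calls build their generator from the same seed, `first` and `second` contain the same labels in the same order.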

third_party/bigframes_vendored/pandas/core/series.py

Lines changed: 19 additions & 0 deletions
@@ -722,6 +722,25 @@ def round(self, decimals: int = 0) -> Series:
  """
  Round each value in a Series to the given number of decimals.

+ **Examples:**
+
+ >>> import bigframes.pandas as bpd
+ >>> bpd.options.display.progress_bar = None
+
+ >>> s = bpd.Series([0.1, 1.3, 2.7])
+ >>> s.round()
+ 0 0.0
+ 1 1.0
+ 2 3.0
+ dtype: Float64
+
+ >>> s = bpd.Series([0.123, 1.345, 2.789])
+ >>> s.round(decimals=2)
+ 0 0.12
+ 1 1.34
+ 2 2.79
+ dtype: Float64
+
  Args:
  decimals (int, default 0):
  Number of decimal places to round to. If decimals is negative,
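The `1.34` shown for `1.345` in the second `round` example may look surprising, but it is consistent with how a 64-bit float stores that literal. A minimal plain-Python sketch, using only the standard library, shows the stored value sitting just below the midpoint. It illustrates the float representation only and makes no claim about how bigframes or BigQuery implements rounding.

```python
from decimal import Decimal

value = 1.345

# Decimal(float) exposes the exact binary value stored for the literal.
exact = Decimal(value)

# The stored value is slightly below the decimal midpoint 1.345 ...
below_midpoint = exact < Decimal("1.345")

# ... so rounding to two places goes down rather than up.
rounded = round(value, 2)
```

Printing `exact` shows the full expansion of the stored double, which makes the downward rounding easy to verify by eye.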
