variable length of a dimension in DataArray #1265

Closed
basnijholt opened this issue Feb 13, 2017 · 7 comments
@basnijholt

I have the following problem. Say I have this code, in which I find the eigenvalues of some system that depends on the variables A, B, C, and D.

In my example the number of eigenvalues (num_eigs) is constant.

However, in general this number is not constant. Is there a way in xarray to handle this issue?

import numpy as np
import xarray as xr
A = np.linspace(0, 40, 10)
B = np.linspace(0, 1, 20)
C = np.linspace(0, 1, 30)
D = np.linspace(-3, 3, 40)

num_eigs = 4

# pretend like this is taking the eigenvalues of a matrix composed with A, B, C, and D
res = np.random.rand(len(A), len(B), len(C), len(D), num_eigs)

# now I need to set the last dimension to `range(res.shape[-1])`
da = xr.DataArray(res, coords=
    [('A', A), ('B', B), ('C', C), ('D', D), ('eig', range(res.shape[-1]))])

For those that ask why I would need this: I do numerics in quantum mechanics and create quantum systems with certain parameters. Some of these parameters could be the system size. This would result in different Hamiltonians (a matrix describing the system) and therefore in a different number of eigenvalues.

@shoyer
Member

shoyer commented Feb 13, 2017

Xarray adds labels to NumPy arrays, so it can't handle variable-length arrays any better than NumPy can.

Basically, your options are to either (a) store NumPy arrays using dtype=object (not really recommended), (b) pad each array up to a common length with NaNs (which xarray uses to mark missing values), or (c) put multiple variables in an xarray.Dataset and use different dimension names for the variable-length dimension.

Depending on your exact use case, either (b) or (c) could be a good solution.
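
A minimal sketch of options (b) and (c) may make the trade-off more concrete. The ragged input here is made up for illustration: `sizes`, `eigs`, and the variable and dimension names such as `eigs_size3` are hypothetical and do not come from the original code.

```python
import numpy as np
import xarray as xr

# Hypothetical ragged input: each system size yields a different
# number of eigenvalues (random numbers stand in for real ones).
rng = np.random.default_rng(0)
sizes = [2, 3, 5]
eigs = {n: np.sort(rng.random(n)) for n in sizes}

# Option (b): pad every row with NaN up to the longest length.
max_len = max(len(v) for v in eigs.values())
padded = np.full((len(sizes), max_len), np.nan)
for i, n in enumerate(sizes):
    padded[i, :n] = eigs[n]

da = xr.DataArray(
    padded,
    coords=[("size", sizes), ("eig", np.arange(max_len))],
)

# count() ignores NaN, so it recovers the true number of eigenvalues.
n_valid = da.count("eig")

# Option (c): one variable per system size, each with its own
# dimension name, so no padding is needed.
ds = xr.Dataset(
    {f"eigs_size{n}": ((f"eig_size{n}",), eigs[n]) for n in sizes}
)
```

With (b) everything lives in one array and label-based selection works across all parameters, at the cost of storing the padding. With (c) nothing is padded, but each variable has to be addressed by its own name.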

@RafalSkolasinski

I believe that this is a common problem in simulations of quantum mechanical systems. I will try to come up with a slightly more realistic / practical example that I hope will help with choosing the best solution.

@shoyer
Member

shoyer commented Feb 13, 2017

I'm definitely happy to look at a more realistic / complete example. My PhD work was actually doing quantum simulations.

@RafalSkolasinski

Hi, I tried to come up with a slightly more interesting but still simple example:

from itertools import product
import numpy as np
import pandas as pd

import holoviews as hv
hv.notebook_extension()


def energies(L, a):
    k = np.pi * np.arange(1, L//a) / L
    return {'exact': k**2, 'approx': 2*(1 - np.cos(k * a)) / a**2}

L = np.arange(10, 21, 2)
a = np.array([1, .5, .25])

data = []
for Li, ai in product(L, a):
    output = dict(L=Li, a=ai)
    output.update(**energies(Li, ai))
    data.append(output)

df = pd.DataFrame(data)

hmap_data = {}
for n, row in df.iterrows():
    key = row.L, row.a
    val = (hv.Points((np.arange(len(row.exact)), row.exact), kdims=['n', 'E']) * 
           hv.Points((np.arange(len(row.approx)), row.approx), kdims=['n', 'E']))
    hmap_data[key] = val
    
hv.HoloMap(hmap_data, kdims=['L', 'a']).select(n=(0, 20), E=(0, 20))

This example is simple and doesn't include any serious simulation. Here I compare the energies of a particle in a 1D box with what would come out of a tight-binding simulation. Simple as it is, it captures a situation that often arises when calculating the spectrum of a finite system: different system sizes give different numbers of energy levels.

This simple example is manageable without any pandas or xarray machinery, but imagine a real simulation made with kwant for a series of different input parameters (system dimensions, gate voltages, chemical potentials, etc.).
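
For reference, here is one way the NaN-padding approach suggested earlier in the thread could be applied to this example. It is only a sketch: `energies`, `L`, and `a` are taken from the code above, while `results`, `max_levels`, `padded_cube`, and the dimension name `n` are names introduced here for illustration.

```python
from itertools import product

import numpy as np
import xarray as xr


def energies(L, a):
    # Same formulas as in the example above.
    k = np.pi * np.arange(1, L // a) / L
    return {"exact": k**2, "approx": 2 * (1 - np.cos(k * a)) / a**2}


L = np.arange(10, 21, 2)
a = np.array([1, 0.5, 0.25])

# Compute every spectrum once and find the longest one.
results = {(Li, ai): energies(Li, ai) for Li, ai in product(L, a)}
max_levels = max(len(r["exact"]) for r in results.values())


def padded_cube(kind):
    # Array of shape (L, a, n), filled with NaN where a spectrum is shorter.
    out = np.full((len(L), len(a), max_levels), np.nan)
    for (i, Li), (j, ai) in product(enumerate(L), enumerate(a)):
        levels = results[(Li, ai)][kind]
        out[i, j, : len(levels)] = levels
    return out


dims = ("L", "a", "n")
ds = xr.Dataset(
    {kind: (dims, padded_cube(kind)) for kind in ("exact", "approx")},
    coords={"L": L, "a": a, "n": np.arange(max_levels)},
)

# Number of actual energy levels for each (L, a) combination.
n_levels = ds["exact"].count("n")
```

Selections such as `ds.sel(L=12, a=0.5).dropna("n")` then return one spectrum with the padding stripped. The padding is wasteful when spectrum lengths differ a lot, so this is a convenience trade-off rather than a general solution.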

@stale

stale bot commented Jan 23, 2019

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity.

If this issue remains relevant, please comment here; otherwise it will be marked as closed automatically.

@stale

stale bot commented Dec 25, 2020

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically

@stale stale bot added the stale label Dec 25, 2020
@keewis
Collaborator

keewis commented Dec 26, 2020

Instead of the workarounds mentioned in #1265 (comment), this should work once the integration with awkward is implemented (see also #4285).

@keewis keewis closed this as completed Dec 26, 2020
4 participants