Skip to content

189 ordinal scale domain item ordering #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 42 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 11 commits
Commits
Show all changes
42 commits
Select commit Hold shift + click to select a range
87ab745
Reifying current default ordering logic with a passing test case and …
monfera Mar 25, 2016
a547482
Ensure that null / undefined removal is in place, even in the presenc…
monfera Mar 25, 2016
69bf832
Ensure that no errors arise from the possibility of broader order spe…
monfera Mar 25, 2016
3fd3cc0
Ensure that it adheres to the categories array even if it doesn't com…
monfera Mar 25, 2016
e2977c5
Adding attribute definitions: categorymode and categories; simplifyin…
monfera Mar 26, 2016
62c4ac5
Renaming of 'categories' to 'categorylist' (it was explicitly deleted…
monfera Mar 27, 2016
85d60d6
Minimal commit to demonstrate the working of categorymode = 'array'
monfera Mar 29, 2016
f6c40b6
Minimal change for implementing 'category ascending' and 'category de…
monfera Mar 31, 2016
e4f2167
factored out orderedCategories into a separate function
monfera Mar 31, 2016
03643fe
factored out orderedCategories into a separate function
monfera Mar 31, 2016
9c366ce
lint orderedCategories
monfera Mar 31, 2016
f626b08
#189 role: info
monfera Apr 7, 2016
2f939af
#189 updating preexisting test cases in line with PR feedback (if the…
monfera Apr 7, 2016
ce35e8b
#189 updating preexisting test cases so that order itself is checked;…
monfera Apr 7, 2016
262c747
#189 updating preexisting test cases: further updates to proper axis …
monfera Apr 7, 2016
5f33a7b
#189 updating preexisting test cases: further updates to proper axis …
monfera Apr 7, 2016
f4214cb
#189 adding a path for when categorymode is not in ['array', 'categor…
monfera Apr 8, 2016
cbc98fd
#189 updating test case with non-utilized categorylist elements at be…
monfera Apr 8, 2016
a806ca2
#189 test case with null value for a category we just want for an axi…
monfera Apr 8, 2016
5ad6b5c
#189 baseline test case based on codepen pointed to by @etpinard
monfera Apr 8, 2016
82076f2
#189 commented out categorymode value ascending/descending as it's no…
monfera Apr 9, 2016
6fb2871
#189 adding test cases for trace lines with mutually exclusive catego…
monfera Apr 9, 2016
616121b
#189 adding test cases for trace lines with partially overlapping, fu…
monfera Apr 9, 2016
89794c6
#189 adding test cases for combinations of category ordering and stac…
monfera Apr 9, 2016
849978e
#189 test case linting
monfera Apr 9, 2016
d6fad10
#189 adding missing docs and reducing assumptions
monfera Apr 9, 2016
7c355b8
#189 adding actual DOM axis tick order tests for a couple of less tri…
monfera Apr 9, 2016
8a2292c
#189 rewriting cateogory sorter to O(1) + one sort call
monfera Apr 9, 2016
9b01bcc
#189 rewriting cateogory sorter: extract out logic
monfera Apr 9, 2016
010451b
#189 rewriting cateogory sorter: switching to switch
monfera Apr 9, 2016
6c9d0b5
#189 rewriting cateogory sorter: misc. improvements
monfera Apr 9, 2016
166aeb4
#189 renaming for uniformity (predominantly createGraphDiv() seems to…
monfera Apr 12, 2016
e472e14
#189 initial round of coercions with test cases
monfera Apr 12, 2016
d16fe6b
#189 additional tests for categorymode coercions
monfera Apr 12, 2016
0314f37
#189 comment fix
monfera Apr 12, 2016
043ac1b
#189 PR feedback and linting
monfera Apr 12, 2016
b98bd65
#189 image based regression test JSONs
monfera Apr 12, 2016
e007f73
#189 reworking the order code because it converted numbers to strings…
monfera Apr 12, 2016
34a5054
#189 adding image based tests
monfera Apr 12, 2016
c6d44d9
#189 comment update
monfera Apr 12, 2016
44a0c3d
#189 switching from O(n) to O(log(N)) complexity unique insertion sort
monfera Apr 12, 2016
613fda3
#189 adding axis attributes to 3D plots
monfera Apr 12, 2016
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions src/plot_api/plot_api.js
Original file line number Diff line number Diff line change
Expand Up @@ -851,10 +851,10 @@ function doCalcdata(gd) {
fullLayout._piecolormap = {};
fullLayout._piedefaultcolorcount = 0;

// delete category list, if there is one, so we start over
// initialize the category list, if there is one, so we start over
// to be filled in later by ax.d2c
for(i = 0; i < axList.length; i++) {
axList[i]._categories = [];
axList[i]._categories = axList[i]._initialCategories.slice();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice find.

}

for(i = 0; i < fullData.length; i++) {
Expand Down
5 changes: 5 additions & 0 deletions src/plots/cartesian/axis_defaults.js
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ var layoutAttributes = require('./layout_attributes');
var handleTickValueDefaults = require('./tick_value_defaults');
var handleTickDefaults = require('./tick_defaults');
var setConvert = require('./set_convert');
var orderedCategories = require('./ordered_categories');
var cleanDatum = require('./clean_datum');
var axisIds = require('./axis_ids');

Expand Down Expand Up @@ -64,6 +65,10 @@ module.exports = function handleAxisDefaults(containerIn, containerOut, coerce,
}
}

containerOut._initialCategories = axType === 'category' ?
orderedCategories(letter, containerIn.categorymode, containerIn.categorylist, options.data) :
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to coerce categorymode and cateogrylist before sending them to orderedCategories.

Coercion is done in this scope by the shortcut coerce function defined in plots/cartesian/layout_defaults.js. The shortcut coerce function is wrapper around Lib.coerce which applies coercion rules based on the type of the user input. For example, an enumerated attribute can only be set to the possible values listed in the attributes definitions, if not, it gets coerced to the default value.

Moreover, we'll need to add logic around categorymode and categorylist. For instance, at the moment, categorylist only has an effect if categorymode is set to array. Meaning that if a user specifies categorylist without categorymode, the defaults categories will be shown even though it is clear that the user wanted custom categories. So, we can be smarter by overriding the catogorymode default if categoylist is an array, similar to what's being done currently for tickmode and tickvals in plots/cartesian/tick_value_defaults.js (see tests here for more info).

We may want to look up lib/nested_properly.js while familiarizing yourself with Lib.coerce (see its test cases here).

Don't hesitate to ask me any questions about our coercion framework. Surprising, it may the trickiest part to understand about plotly.js.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard Looking into this now as it appears to be the last piece in this CR. My test cases will need an update once coercion is in place, to switch to the more data-driven default or override behavior.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard tl;dr blocking question for you at the end.

As an aid to thinking about goals, here's a summary of what coercions I'm planning:

  1. If categorymode is not one of those in the attributes definition, it'll be coerced to the default trace value.
  2. No matter what categorymode is specified or coerced in item#1, if a categorylist is provided and it is an array, categorymode will be coerced into array.

... well it looks like the two things you listed :-) My original take was that categorymode, as a switch, drives the show in an unambiguous way, and the user can flip-flop among modes without worrying about removing and readding (or commenting/uncommenting) the categorylist property. Even now I'm vague about the case where categorymode is category ascending and categorylist is also given. Moreover, it renders the categorymode === 'array' redundant (documentational at best) since the presence of categorylist takes precedence. Maybe we should even do away with categorymode === 'array', making categorymode and categorylist mutually exclusive?

Taking it one step further, shouldn't we just have one property, e.g. called categoryorder that can be category ascending, category descending or an array?

I'm now plunging into the basic coercion so we don't have an unexpected enumerative value but your input to the above question would be useful. It's not a big deal to unify the two properties and at least technically, it would be less ambiguous.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My original take was that categorymode, as a switch, drives the show in an unambiguous way, and the user can flip-flop among modes without worrying about removing and readding (or commenting/uncommenting) the categorylist property.

... and you would be correct.

What smart coercion is trying to remove redundancy. For example,

var layout: {
  xaxis: {
    categorylist: ['apples', 'bananas', 'clementines']
  }
};

should be considered the same as:

var layout: {
  xaxis: {
    categorymode: 'array',  // redundant !!!
    categorylist: ['apples', 'bananas', 'clementines']
  }
};

But,

var layout: {
  xaxis: {
    categorymode: 'category descending',  // still used as a switch 
    categorylist: ['apples', 'bananas', 'clementines']  // not used in graph
  }
};

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard Thanks!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and @monfera I should add, patterns like ⏫ are common in plotly.js.

Don't try to reinvent the wheel. As mentioned before, this block should serve as a good guide for how categorymode and categorylist should interact.

[];

setConvert(containerOut);

coerce('title', defaultTitle);
Expand Down
30 changes: 30 additions & 0 deletions src/plots/cartesian/layout_attributes.js
Original file line number Diff line number Diff line change
Expand Up @@ -446,6 +446,36 @@ module.exports = {
'Only has an effect if `anchor` is set to *free*.'
].join(' ')
},
categorymode: {
valType: 'enumerated',
values: [
'trace', 'category ascending', 'category descending',
'value ascending', 'value descending','array'
],
dflt: 'trace',
role: 'style',
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

role: 'info'.

By the way, the attribute role fields have no use in plotly.js. They are stripped out on the bundle step. They are only used in plotly.py where they indicate which attribute to strip when the strip_style flag is turned on.

description: [
'Specifies the ordering logic for the case of categorical variables.',
'By default, plotly uses *trace*, which specifies the order that is present in the data supplied.',
'Set `categorymode` to *category ascending* or *category descending* if order should be determined by',
'the alphanumerical order of the category names.',
'Set `categorymode` to *value ascending* or *value descending* if order should be determined by the',
'numerical order of the values.',
'Set `categorymode` to *array* to derive the ordering from the attribute `categorylist`. If a category',
'is not found in the `categorylist` array, the sorting behavior for that attribute will be identical to',
'the *trace* mode. The unspecified categories will follow the categories in `categorylist`.'
].join(' ')
},
categorylist: {
valType: 'data_array',
role: 'style',
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here role: 'info'

description: [
'Sets the order in which categories on this axis appear.',
'Only has an effect if `categorymode` is set to *array*.',
'Used with `categorymode`.'
].join(' ')
},


_deprecated: {
autotick: {
Expand Down
33 changes: 33 additions & 0 deletions src/plots/cartesian/ordered_categories.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
/**
* Copyright 2012-2016, Plotly, Inc.
* All rights reserved.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*/


'use strict';

var d3 = require('d3');


/**
* TODO add documentation
*/
module.exports = function orderedCategories(axisLetter, categorymode, categorylist, data) {

return categorymode === 'array' ?

// just return a copy of the specified array ...
categorylist.slice() :

// ... or take the union of all encountered tick keys and sort them as specified
// (could be simplified with lodash-fp or ramda)
[].concat.apply([], data.map(function(d) {return d[axisLetter];}))
.filter(function(element, index, array) {return index === array.indexOf(element);})
.sort(({
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We like using for loops in these situations. Although they may be not as nice to look at than these functional patterns, they are much faster. 🐎

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard Thanks, it'll be a quick update from me.

'category ascending': d3.ascending,
'category descending': d3.descending
})[categorymode]);
};
5 changes: 3 additions & 2 deletions src/plots/cartesian/set_convert.js
Original file line number Diff line number Diff line change
Expand Up @@ -182,8 +182,9 @@ module.exports = function setConvert(ax) {
// encounters them, ie all the categories from the
// first data set, then all the ones from the second
// that aren't in the first etc.
// TODO: sorting options - do the sorting
// progressively here as we insert?
// it is assumed that this function is being invoked in the
// already sorted category order; otherwise there would be
// a disconnect between the array and the index returned

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@monfera can you elaborate on this?

What else have you tried?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard I tried things like this:

  1. I've first tried the change inside the suggested place (https://github.com/plotly/plotly.js/blob/bbb8205903cdd34ae019d5cd6a2c197e9a2550e7/src/plots/cartesian/set_convert.js#L179-L194): instead of pushing into the array, I inserted it to ensure orderedness (insertion sort). More accurately, for simplicity, I just sorted upon each insertion to try before doing too much work (paid attention to the fact that [].sort mutates the array).
  2. Then I also quickly tried doing the sort only after the loop that invokes ax.d2c but the issue is, by that point we already have the indices, and a sort renders them obsolete.

Then I spent a bit of time better understanding concepts, the data flow and the various ways it can be invoked (e.g. plotting multiple lines; overplotting onto an already existing plot etc). Learnt that currently, the categorical X axis ordering is totally driven by the order in which the point tuples arrive and made sample plots. Which was the point at which I gave up on this point for modification and went for the lower hanging fruit of array. The other options will, I think, be more intrusive to the data flow.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard maybe an example is helpful: The first vector has X cat. values ['a', 'c', 'd']. This is already sorted. Then a subsequent series is plotted in the same chart, with values for ['a', 'b', 'c', 'd']. By this point, the function has returned indices such as 0, 1, 2 for ['a', 'c', 'd'], respectively. Then in the next series comes a new point 'b' correctly inserted at index 1 but the previously returned indices will become invalid and the plot will come out nonsensical (more precisely, having checked the plot, the way it came out was consistent with this mechanics).

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard I also considered doing a sorting a lot earlier (more upstream), but it would have issues: 1) the code should possibly reorder only after it's established that it's a categorical axis (it can be user specified or determined heuristically); 2) by that point there are lots of places where _fullData etc. are persisted on objects in the trace order; 3) I believe it would be wrong to change the trace order just for this CR because of the snail trail example and the general possibility that the user does not want to have plotly change the user input order (which is currently implicitly taken as the trace order).

So my current thought is that we need to do the axis tick sorting downstream, nearer the point which is responsible for the d3 data binding order for the axis ticks, and we have to accommodate for the possibility that new lines introduce new points that have to be inserted in lines that have already been added previously.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@monfera Thanks for this very detailed overview.

You've got me convinced, ax._categories can't be sorted from inside the d2c routine.

We need to fill in ax._categories earlier. I think within makeCalcdata makes the most sense at the moment.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filling in ax._categories within the default step would definitely work (albeit with a slight performance hit).

We already need to loop all traces to check for box plot irregularities here, so maybe you could fill in ax._categories within the same loop.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard I'll look into it, thanks! Unrelated: I learnt that date axes take ISO strings "2016-03-29" as values. When I expressly indicate axis type = 'category' it behaves as categories and the axis ticks get rendered as plaintext "2016-03-29" rather than nicely formatted date. So I'm not yet grounded in the approach for coercion which was step #2 in your original comment.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard I'll look into it more closely today, but on an initial look, the suggested place isn't as trivial (to me) and I'd put it slightly after this point, because

  • setAutoType bails if ax.type!=='-' and at the only place it's called from, there is an identical (redundant) check
  • the loop in question depends on isBoxWithoutPositionCoords returning true
  • the inside of the loop (now) is solely dedicated to building up the boxPositions
  • if axis type 'category' isn't explicitly specified, it won't be known until the subsequent call to autoType() at the end

Therefore I'm planning to put the logic after having returned from setAutoType; the earliest point seems to be just before the call to setConvert.

This way, we wouldn't reuse the suggested loop, but the loop over the traces probably doesn't have a measurable performance impact anyway (I'm assuming there usually aren't thousands or 10ks of trace lines).

I'll do some more tests to ensure I'm not overlooking something; just wanted to share current thinking.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anywhere, in plots/cartesian/axis_defaults.js is fine at this stage. We'll fine tune the location for performance once the functionality is in a working state.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard Thanks for the quick note, I'll go ahead.


if(v !== null && v !== undefined && ax._categories.indexOf(v) === -1) {
ax._categories.push(v);
Expand Down
224 changes: 218 additions & 6 deletions test/jasmine/tests/calcdata_test.js
Original file line number Diff line number Diff line change
Expand Up @@ -4,15 +4,16 @@ var createGraphDiv = require('../assets/create_graph_div');
var destroyGraphDiv = require('../assets/destroy_graph_div');

describe('calculated data and points', function() {
describe('connectGaps', function() {

var gd;
var gd;

beforeEach(function() {
gd = createGraphDiv();
});
beforeEach(function() {
gd = createGraphDiv();
});

afterEach(destroyGraphDiv);
afterEach(destroyGraphDiv);

describe('connectGaps', function() {

it('should exclude null and undefined points when false', function() {
Plotly.plot(gd, [{ x: [1,2,3,undefined,5], y: [1,null,3,4,5]}], {});
Expand All @@ -28,4 +29,215 @@ describe('calculated data and points', function() {
expect(gd.calcdata[0][3]).toEqual({ x: false, y: false});
});
});

xdescribe('category ordering', function() {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@monfera these tests are great. The test 🐅 thanks you!

Could you add a couple more cases where multiple traces with shared and/or mutually exclusive categories?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard Thanks :-) Yes, I guess we'll have at least double of the number of test cases (corner cases etc.). Just didn't want to race ahead of your initial feedback and instead, I looked into the unrelated bug you sent my way. Thanks for these specific suggestions.


describe('default category ordering reified', function() {

it('should output categories in the given order by default', function() {

Plotly.plot(gd, [{x: ['c','a','e','b','d'], y: [15,11,12,13,14]}], { xaxis: {
type: 'category'
}});

expect(gd.calcdata[0][0].y).toEqual(15);
expect(gd.calcdata[0][1].y).toEqual(11);
expect(gd.calcdata[0][2].y).toEqual(12);
expect(gd.calcdata[0][3].y).toEqual(13);
expect(gd.calcdata[0][4].y).toEqual(14);
});

it('should output categories in the given order if `trace` order is explicitly specified', function() {

Plotly.plot(gd, [{x: ['c','a','e','b','d'], y: [15,11,12,13,14]}], { xaxis: {
type: 'category',
categorymode: 'trace'
// Wouldn't it be preferred to supply a function and plotly would have several functions like this?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nice 😏 , but only for JS users.

All plotly.js attributes must be JSON serializable so that folks using our python and R libraries can have access to the same features.

// E.g. it's easier for symbol completion (whereas there's no symbol completion on string config)
// See arguments from Mike Bostock, highlighted in medium green here:
// https://medium.com/@mbostock/what-makes-software-good-943557f8a488#eef9
// Plus if it's a function, then users can roll their own.
//
// Also, if axis tick order is made configurable, shouldn't we make trace order configurable?
// Trace order as in, if a line or curve is drawn through points, what's the trace sequence.
// These are two orthogonal concepts. In this round, I'm assuming that the trace order is implied
// by the order the {x,y} arrays are specified.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this round, I'm assuming that the trace order is implied by the order the {x,y} arrays are specified

... and you would be correct.

}});

expect(gd.calcdata[0][0].y).toEqual(15);
expect(gd.calcdata[0][1].y).toEqual(11);
expect(gd.calcdata[0][2].y).toEqual(12);
expect(gd.calcdata[0][3].y).toEqual(13);
expect(gd.calcdata[0][4].y).toEqual(14);
});
});

describe('domain alphanumerical category ordering', function() {

it('should output categories in ascending domain alphanumerical order', function() {

Plotly.plot(gd, [{x: ['c','a','e','b','d'], y: [15,11,12,13,14]}], { xaxis: {
type: 'category',
categorymode: 'category ascending'
}});

expect(gd.calcdata[0][0].y).toEqual(11);
expect(gd.calcdata[0][1].y).toEqual(13);
expect(gd.calcdata[0][2].y).toEqual(15);
expect(gd.calcdata[0][3].y).toEqual(14);
expect(gd.calcdata[0][4].y).toEqual(12);
});
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard Just an FYI, since I started this CR with the test cases, my conception on where order would be present became outdated. As now I understand that gd.calcdata will keep the trace order, I'm updating test cases such that the original trace order is expected, and explicit checks on the x/y tuples are in place. It looks good to me but please tell me if you have an alternative suggestion.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nicely done.


it('should output categories in descending domain alphanumerical order', function() {

Plotly.plot(gd, [{x: ['c','a','e','b','d'], y: [15,11,12,13,14]}], { xaxis: {
type: 'category',
categorymode: 'category descending'
}});

expect(gd.calcdata[0][0].y).toEqual(12);
expect(gd.calcdata[0][1].y).toEqual(14);
expect(gd.calcdata[0][2].y).toEqual(15);
expect(gd.calcdata[0][3].y).toEqual(13);
expect(gd.calcdata[0][4].y).toEqual(11);
});

it('should output categories in categorymode order even if category array is defined', function() {

Plotly.plot(gd, [{x: ['c','a','e','b','d'], y: [15,11,12,13,14]}], { xaxis: {
type: 'category',
categorymode: 'category ascending',
categorylist: ['b','a','d','e','c'] // These must be ignored. Alternative: error?
}});

expect(gd.calcdata[0][0].y).toEqual(11);
expect(gd.calcdata[0][1].y).toEqual(13);
expect(gd.calcdata[0][2].y).toEqual(15);
expect(gd.calcdata[0][3].y).toEqual(14);
expect(gd.calcdata[0][4].y).toEqual(12);
});

it('should output categories in ascending domain alphanumerical order, excluding undefined', function() {

Plotly.plot(gd, [{x: ['c',undefined,'e','b','d'], y: [15,11,12,13,14]}], { xaxis: {
type: 'category',
categorymode: 'category ascending'
}});

expect(gd.calcdata[0][0].y).toEqual(11);
expect(gd.calcdata[0][1].y).toEqual(15);
expect(gd.calcdata[0][2].y).toEqual(14);
expect(gd.calcdata[0][3].y).toEqual(12);
});
});

describe('codomain numerical category ordering', function() {

it('should output categories in ascending codomain numerical order', function() {

Plotly.plot(gd, [{x: ['c','a','e','b','d'], y: [15,11,12,13,14]}], { xaxis: {
type: 'category',
categorymode: 'value ascending'
}});

expect(gd.calcdata[0][0].y).toEqual(11);
expect(gd.calcdata[0][1].y).toEqual(12);
expect(gd.calcdata[0][2].y).toEqual(13);
expect(gd.calcdata[0][3].y).toEqual(14);
expect(gd.calcdata[0][4].y).toEqual(15);
});

it('should output categories in descending codomain numerical order', function() {

Plotly.plot(gd, [{x: ['c','a','e','b','d'], y: [15,11,12,13,14]}], { xaxis: {
type: 'category',
categorymode: 'value descending'
}});

expect(gd.calcdata[0][0].y).toEqual(15);
expect(gd.calcdata[0][1].y).toEqual(14);
expect(gd.calcdata[0][2].y).toEqual(13);
expect(gd.calcdata[0][3].y).toEqual(12);
expect(gd.calcdata[0][4].y).toEqual(11);
});

it('should output categories in descending codomain numerical order, excluding nulls', function() {

Plotly.plot(gd, [{x: ['c','a','e','b','d'], y: [15,11,null,13,14]}], { xaxis: {
type: 'category',
categorymode: 'value descending'
}});

expect(gd.calcdata[0][0].y).toEqual(15);
expect(gd.calcdata[0][1].y).toEqual(14);
expect(gd.calcdata[0][2].y).toEqual(12);
expect(gd.calcdata[0][3].y).toEqual(11);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is correct, the null items should disappear from the calcdata data. Similar to the current behavior.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard Clarifications:

  • When either the category or the value of a trace point is null, it gets mapped to {x: false, y: false} by the current production code, rather than a full disappearance, I haven't changed it.
  • In case a category is enlisted in categorylist but has no point for it in the trace: the x values in the {x,y} index tuples will correctly 'jump over' the category which doesn't have a data point. However there's no artificially generated point for such a category so here as well, we don't exactly have a null value (simply, there won't be a tuple with x as a missing category).

As I'm looking at the generated plots as well, I found something unexpected: when the first N or last N categories in categorylist have no corresponding point(s) in the trace, they're chopped from the axis. My guess is that it is to do with the automatic zooming onto the actual range of the data, usually done in the case of continuous axes.

I believe it's not the desired behavior for categorical axes. If it's indeed a range focus thing, then it's a preexisting condition that arose now because in the past, the axis ticks were driven by the trace points, so trivially, there used to be at least one point for each category. Should I work on solving it, or should we optimize for as quick merging of the baseline functionality as possible?

Here's the data:

Plotly.plot(gd, [{x: ['c','a','e','b','d'], y: [15,11,12,13,14]}], { xaxis: {
    type: 'category',
    categorymode: 'array',
    categorylist: ['s','y','b','x','a','d','z','e','c', 'q', 'k']
}});

and here's the chart:
image

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@monfera thanks for this writeup. You bring up some interesting points.

plotly.js has always been very data-focused. For example, data linked to 'x' and/or 'y' are used to compute axis ranges by default. In other words, data can impact other graph attributes, but we like to keep data arrays immutable.

In the case where categorylist has more entries to 'x', making the auto-range routine consider the orphan categories in categorylist would be equivalent to filling in 'x' with those categories and the corresponding 'y' items will nulls. Note that null 'y' items are taken into account in the auto-range routine (see example).

Therefore, I'd vote for the status quo, where orphan categorylist categories don't affect the auto-range routine. In the case where a user wants to see those orphan categories, adding (category, null) x/y pairs or setting xaxis.range explicitly would suffice.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard Thanks for the answer, it means we don't need to worry about chopping off unused categories on the axis tails for now. I'm just adding a test case that renders the tick for a null value, it's working out fine.


});
});

describe('explicit category ordering', function() {

it('should output categories in explicitly supplied order, independent of trace order', function() {

Plotly.plot(gd, [{x: ['c','a','e','b','d'], y: [15,11,12,13,14]}], { xaxis: {
type: 'category',
categorymode: 'array',
categorylist: ['b','a','d','e','c']
}});

expect(gd.calcdata[0][0].y).toEqual(13);
expect(gd.calcdata[0][1].y).toEqual(11);
expect(gd.calcdata[0][2].y).toEqual(14);
expect(gd.calcdata[0][3].y).toEqual(12);
expect(gd.calcdata[0][4].y).toEqual(15);
});

it('should output categories in explicitly supplied order, independent of trace order, pruned', function() {

Plotly.plot(gd, [{x: ['c',undefined,'e','b','d'], y: [15,11,12,null,14]}], { xaxis: {
type: 'category',
categorymode: 'array',
categorylist: ['b','a','d','e','c']
}});

expect(gd.calcdata[0][0].y).toEqual(13);
expect(gd.calcdata[0][1].y).toEqual(14);
expect(gd.calcdata[0][2].y).toEqual(15);
});

it('should output categories in explicitly supplied order even if not all categories are present', function() {

Plotly.plot(gd, [{x: ['c','a','e','b','d'], y: [15,11,12,13,14]}], { xaxis: {
type: 'category',
categorymode: 'array',
categorylist: ['y','b','x','a','d','z','e','c']
}});

expect(gd.calcdata[0][0].y).toEqual(13);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Specified categories in categorylist that don't have associated values in the data should be understood as nulls.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard Yes, totally agree with both these comments. The test cases came before the implementation, i.e. before some of my learning and haven't been updated.

expect(gd.calcdata[0][1].y).toEqual(11);
expect(gd.calcdata[0][2].y).toEqual(14);
expect(gd.calcdata[0][3].y).toEqual(12);
expect(gd.calcdata[0][4].y).toEqual(15);
});

it('should output categories in explicitly supplied order first, if not all categories are covered', function() {

Plotly.plot(gd, [{x: ['c','a','e','b','d'], y: [15,11,12,13,14]}], { xaxis: {
type: 'category',
categorymode: 'array',
categorylist: ['b','a','x','c']
}});

expect(gd.calcdata[0][0].y).toEqual(13);
expect(gd.calcdata[0][1].y).toEqual(11);
expect(gd.calcdata[0][2].y).toEqual(15);

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I think gd.calcdata[0][2].y should be null in this case and gd.calcdata[0][3].y be equal to 15.

// The order of the rest is unspecified, no need to check. Alternative: make _both_ categorymode and
// categories effective; categories would take precedence and the remaining items would be sorted
// based on the categorymode. This of course means that the mere presence of categories triggers this
// behavior, rather than an explicit 'explicit' categorymode.
});
});
});
});