Skip to content

Add Benchmarks & Split/Unroll-scopes optimizations (x1.6 faster) #182

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

ankostis
Copy link
Contributor

For issue #158:

  1. benchmark branch: Added a benchmark with meta_schemas cheking & extensive sample-schema with references.
  2. Implemented two optimizations:
    2.a. split_scopes branch: Keep URIs in 2 parts (url, fragment) to avoid fragging/defragging since fragments do not join, they always override (~1.1 faster)
    2.b. unroll_scopes branch: Replace resolver.in_scope() context-manager with a push/pop stack of scopes, that does nothing it when no id property exists (~1.5 faster).
  3. split_unroll_scopes branch: Combined both optimizations (~1.6 faster).
  4. For future reference, the timed-results are engraved on the respective methods of wltp/test/test_benchmarks.py TC (unfortunately they are host-specific).

ALL TCs RUN AS OK (apart from split_frags which was post-edited on the combined branch)

* Improve statistics print-outs.
* Engrave timming results in benchmark docstrings.
…ing into a list whe

not null
(instead of using a context-manager each time)

Roughly x 1.5 faster
…g by keeping

fragments separated from URL (and avoid redunant frag/defrag).
…ization-branches.

* FIX 2 forgotten test-case on resolver-URIs from split_scopes.

x 1.8 faster in big referenced model.
@ankostis ankostis changed the title Optimizations: Split and unroll scopes (x1.6 faster) Add Benchmarks & Split/Unroll-scopes optimizations (x1.6 faster) Sep 24, 2014
…s_stack empty when

iteration breaks (no detectable performance penalty).

* Replace non-python-2.6 DefragResult with named-tuple.
* Add test-case checking scopes_stack empty.
@ankostis ankostis mentioned this pull request Sep 24, 2014
@Julian
Copy link
Member

Julian commented Sep 24, 2014

Hey -- thanks a lot, I haven't obviously gotten a chance to really read through most of this yet, but thanks for tackling this.

The one thing I notice so far is that the tests and methods (in_scope in particular) that were removed are public API unfortunately, so we need to keep supporting them even if we don't use them internally.

For the benchmarks, what we'll have to do is basically baseline each machine (there are some tools to do that or we can do it manually) in order for the performance numbers to scale to wherever you run the tests, but it looks like you've definitely got a good start there.

I'll have a read through but it'll probably be a few days, but definitely thanks for working on this.

ankostis referenced this pull request in ankostis/jsonschema Sep 28, 2014
Neglibly slower, BUT reduced stdev and simpler main-loop code.
@dnephin
Copy link
Contributor

dnephin commented Feb 27, 2015

This changes improve the performance on my benchmark by nearly 2x.

What can I do to help get this merged? It looks like the tests are failing and it needs to be rebased with msater.

@ankostis ankostis force-pushed the split_unroll_scopes branch 2 times, most recently from ddbc519 to f0c6f8f Compare March 1, 2015 23:41
@cgurnik
Copy link

cgurnik commented Apr 20, 2015

@Julian Would it be possible for you to take a look at this branch? Our project has a relatively simple schema, and validation using jsonschema is frequently the slowest part of our code. A performance increase of 2x would be a significant benefit to us. As @dnephin mentioned, what can I do to help get this merged in?

@Julian
Copy link
Member

Julian commented Apr 20, 2015

Hi @cgurnik -- @dnephin already successfully managed to get in some performance changes -- can you give your project a shot on current master and see if you get any difference?

I'm trying to find some time to do a release but I was overseas for a few days.

If you don't see results can you possibly include some profiling output, would love to have a look!

@dnephin
Copy link
Contributor

dnephin commented Apr 20, 2015

Most of the changes from this branch were included in my branch, so it might even be appropriate to close this one.

@ankostis
Copy link
Contributor Author

@dnephin Do you mean that this dnephin/jsonschema@1c15827 (branch: perf_take2) contains roughly the 2 perf-improvements mentioned in this PR, but not #184?

@dnephin
Copy link
Contributor

dnephin commented Apr 21, 2015

Right, I included many of them in dnephin@2fda155

@Julian Julian closed this Jun 8, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants