Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepare for 1.0: big cleanup/simplification #84

Closed
41 tasks done
c-cube opened this issue Oct 25, 2016 · 30 comments
Closed
41 tasks done

Prepare for 1.0: big cleanup/simplification #84

c-cube opened this issue Oct 25, 2016 · 30 comments
Assignees
Milestone

Comments

@c-cube
Copy link
Owner

c-cube commented Oct 25, 2016

Preparing version 1.0, with some breaking changes. The goal of this release is not to add significant new features, but to improve consistency and focus on the core features and modules (in particular, droping or reducing the scope of some of the experimental modules).

The working branch is prepare-1.0 .

NOTE: please ask if you want something changed/added to the list!

Todo list

  • more consistent labels (in particular, replace ~or_ by ~default)
  • label functions (iter', map', …) for CCList and the others?
  • doc (doc for of_list #85)
  • remove int_, bool_, etc. in CCOrd
  • make CCIO.File.walk work properly with symlinks, even broken
    • deal with errors and files that do not actually exist
    • [ ] add CCIO.File.is_symlink
    • [ ] add CCIO.File.deref : ?recursive:bool -> string -> string
    • [ ] make CCIO.File.walk_item = [Dir | File | Symlink] (breaking compat)
  • much simpler S-expression library using Lexing (see nunchaku)
  • rewrite bitfield (much simpler version, almost functor-free, Bit_set)
  • improve CCFormat (see improve CCFormat #82)
    • CCFormat.of_to_string : ('a -> string) -> 'a t
    • CCFormat.const : 'a t -> 'a -> unit t (to hide the type 'a)
    • CCFormat.text to mimick Format.pp_print_text
    • CCFormat.return : (unit,…) format4 -> unit (for strings without arguments)
    • CCFormat.some : 'a t -> 'a option t (prints nothing if None, same as arg if Some x)
    • remove start/stop arguments?
    • make sep a argument-free formatter, default ",@ ", using Format.fprint out "%(%)" sep
      to print it in combinators
      OR use unit t with return: arg-free formatter -> unit t and sp = return ",@ " or sth
  • add CCOrd.Infix
  • remove CCPrint (just use format)
    • use pp as pretty-printer everywhere, and remove Buffer.t printers
  • remove containers.advanced (refer to olinq)
  • remove bigarray library:
    • remove Array1 (point to oml or similar instead)
    • remove CCBigstring (point to bigstring library instead)
  • improve CCUnix
    • adding simple functions such as popen (already with_process_*)
    • functions returning lists of lines?
  • remove threads library and make it into its own repo/package (if anyone uses it — need more tests)
  • CCList:
    • flatten CCList.Set
    • flatten CCList.Idx
    • [ ] flatten CCList.Assoc
    • move CCList.Zipper into its own module (in containers.data)
    • what about CCList.Ref? can keep it, probably
  • move CCArray.Sub into its own module (in containers.data), add many tests
  • [ ] move CCString.Sub into its own module (in containers.data)
  • remove containers.string, merging useful functionality into CCString
    • add edit_distance to CCString, from CCLevenshtein
    • [ ] add submodule Edit_distance_automaton (with compiled automaton + index, maybe)
    • just remove CCKMP, make CCString.Find public (provides same functionality)
    • remove App_parse
    • move CCParse into core
    • use parser combinators from oasis-parser for CCParse (string input only!) + update tests
  • use result everwhere, drop the polymorphic variants that remain
  • remove CCError in favor of CCResult
  • use Hashtbl.seeded_hash for implementing better hash combinators in CCHash (from smbc)
  • improve CCGraph (functor designed for local def?)

before pre-release:

  • [ ] remove all @since annotations
  • remove deprecated functions
  • apply ocp-indent
@c-cube c-cube self-assigned this Oct 25, 2016
@c-cube c-cube modified the milestones: 0.21, 1.0 Oct 25, 2016
@c-cube c-cube changed the title big cleanup/simplification Prepare for 1.0: big cleanup/simplification Nov 3, 2016
@bluddy
Copy link
Contributor

bluddy commented Nov 3, 2016

  • Feels like CCUnix should be a separate library. Maybe point to bos instead?
  • What about using Lwt/Async with these data structures? Functions that can 'context switch' between threads while iterating/mapping would be nice.

@c-cube
Copy link
Owner Author

c-cube commented Nov 3, 2016

@bluddy

About bos, at this point it looks like a full competitor in the stdlib race, I think:

depends: ocamlfind & ocamlbuild & topkg >= 0.7.4 & conf-which & base-unix & rresult & astring & fpath & fmt & logs"

so I'd rather keep containers.unix (which is conditionally compiled) for the basics that don't really call for an additional library (in particular, for sub-process management).

  • About lwt/async, which data structures are you referring to, more specifically? You mean a monadic version of to_seq? That would be nice but it requires inversion of control (same as to_gen) which is not possible for many of the opaque standard containers (Map, Set, Hashtbl, Stack, Queue).

@c-cube c-cube modified the milestones: 0.21, 1.0 Nov 3, 2016
@orbitz
Copy link

orbitz commented Nov 4, 2016

What about labels for most functions? I have standardized to doing this on most of my stuff and I find it helps in reading/understanding the code.

Also, if backwards breaking, what about getting rid of that hideous _or label? :)

@c-cube
Copy link
Owner Author

c-cube commented Nov 4, 2016

@orbitz that's an option, indeed. Do you use ~f and ~x for, say, folds? I will definitely replace the ~or_ label by ~default.

@c-cube
Copy link
Owner Author

c-cube commented Nov 4, 2016

Looks like this is going to be a bit more breaking than originally anticipated, anyway, even if only because of printpp

@c-cube
Copy link
Owner Author

c-cube commented Nov 4, 2016

@orbitz I'm also considering having map without labels, and map' with labels, as a convention, but it's probably too noisy.

@orbitz
Copy link

orbitz commented Nov 4, 2016

I'm pro-labels as the first-class style. I just think the readability win is super high. A stdlib style Foo and FooLabel could be an option as well, but my preference is for Foo having labels.

For folds, it seems like ~f and~init are the common names.

Once you decide what style you'd like, give me some modules and I can offer my slave labour to mechanically convert them.

@c-cube
Copy link
Owner Author

c-cube commented Nov 4, 2016

Good, I think I will do it then. Only some already existing functions will have an unlabelled version (so that module List = struct include List include CCList end doesn't break every call to List.map), but the other, new combinators shall be labelled.

@c-cube
Copy link
Owner Author

c-cube commented Nov 4, 2016

It's also the right moment to ask for direly missing features in core modules (e.g. CCList), btw.

@orbitz
Copy link

orbitz commented Nov 4, 2016

By that do you mean you will do map and map' where map' will be the labeled version?

@c-cube
Copy link
Owner Author

c-cube commented Nov 4, 2016

Yes, something like that (or map_label, which is more explicit).

@orbitz
Copy link

orbitz commented Nov 4, 2016

My vote is to not do that and just to have the function everyone expects to exist to be labeled. It just seems like weird complexity to add if you're willing to break backwards compatibility here. I'd rather have CCListLabel.map than CCList.map' or CCList.map_label.

@c-cube
Copy link
Owner Author

c-cube commented Nov 4, 2016

Well, people also expect List.map to be unlabelled, so it feels to me that I should keep the same API (to support the claim that containers is an extension of the stdlib); of course in new functions/modules I can use a label-based API. Come to think of it, a labelled CCList.fold would also make sense…

@orbitz
Copy link

orbitz commented Nov 4, 2016

In that case, I feel like CCListLabel would be a better solution if you want to be backwards compatible with the stdlib. Someone like myself has very little interest in being compatible with the stldib (I use containers because I want something better than the stdlib).

@bluddy
Copy link
Contributor

bluddy commented Nov 4, 2016

I was surprised by how trivial it is to add a Labels module -- simply create the mli file and include the original module, and the functions get translated automatically. Since it's so easy, I don't see a down side with providing alternative label and non-label versions.

@c-cube
Copy link
Owner Author

c-cube commented Nov 4, 2016

I strongly dislike those FooLabels modules, and never use them from the stdlib. I will reflect on this anyway ;-)

@bluddy
Copy link
Contributor

bluddy commented Nov 4, 2016

I've never really used them either... I kind of like letting people choose, but I can see Jane Street's view that it's just sowing confusion. I dunno -- feel free to decide on a path forward.

In an alternate universe, I'm enjoying using the one-and-only OCaml stdlib and focusing on more productive things.

@nilsbecker
Copy link
Contributor

another vote: for keeping the most basic reduction functions unlabeled. i already find

let open List in
somelist 
|> map (fun x -> x * g x)
|> fold_left (fun x y -> x + 2 * y) 0

to be sufficiently verbose, adding ~f: in front feels like boilerplate.

labels are useful for 1. not having to remember argument order 2. readability of code where the fact that an argument is a function is not obvious 3. switching argument order for currying.

in my experience all these are not common problems for the most basic reduction functions like map, fold -- of course for specialized, rarely used functions this is different. so requiring labels for the more basic functions is only motivated by consistency.

so if a more-or-less clear cut can be made between 'workhorse' and 'rare' functions, i would vote for labeling only the rare ones.

@bluddy
Copy link
Contributor

bluddy commented Nov 17, 2016

I generally agree that map shouldn't require ~f, but already with fold_left and fold_right you want ~init or ~zero to indicate which is the accumulator (since they have different argument orders for some reason). And if you start with the folds, the odd one out becomes map... argh. If OCaml had haskell's \ instead of fun, I would feel better about using a few more keystrokes per function.

@nilsbecker
Copy link
Contributor

nilsbecker commented Nov 17, 2016

for the folds i always thought the order is because it's the same order in which you give arguments to f. so that's still ok for me. but when using fold_right with the |> operator the problems start...
so then one would want labels for fold_right but not for fold_left. that does not sound like a good place to draw the line either... but maybe it's the right place since fold_right accesses the list from the end which is 'unnatural'?

@c-cube
Copy link
Owner Author

c-cube commented Nov 17, 2016

Well in my experience it's on map, iter and the likes that I tend to often write List.map big_anonymous_function l, the case in which a label would help…

@bluddy
Copy link
Contributor

bluddy commented Nov 17, 2016

The fact that OCaml makes it trivial to make a labeled version of a non-labeled implementation really makes it tempting IMO. List.L.map could be labeled, for example, as could Map.L.map (from a single functor application of Map), giving you complete control over which one you want.

@ghost
Copy link

ghost commented Nov 17, 2016

I think labels would be pretty nice to have and I would sometimes use them but I would also prefer having separate modules for the labeled interfaces rather than having some parts labeled and other parts not labeled.

@struktured
Copy link
Contributor

I'm for having a labeled version of the apis, even if its main benefit is to make it easier to swap between jane street's core (or base?) and containers. Sometimes I am forced to switch from one to the other and the most common hangup is the labeled argument function.

@c-cube
Copy link
Owner Author

c-cube commented Nov 18, 2016

You're not making my life easy, the choice is going to be tough :p

@c-cube
Copy link
Owner Author

c-cube commented Dec 29, 2016

Ok, I think I will just add Labels version of CCList and CCArray, I don't want to break every single function in the lib. That being said, most items are now filled, so the release is homing in!

@Drup
Copy link
Collaborator

Drup commented Jan 5, 2017

Feature wish for 1.0: Rewrite CCFormat

  • Replace all the ~sep arguments by formatters
  • Remove all the ~start ~stop
  • Add "wrappers" (like put parens around an expression, with appropriate boxes)
  • Add new combinators to concat of formatters and other useful small things
  • Add translation functions back and forth printers of type 'a -> string.

@c-cube
Copy link
Owner Author

c-cube commented Jan 5, 2017

  • I really don't like having ~sep being a formatter…
  • ok for start/stop
  • The wrappers are there already: CCFormat.{within,hbox,vbox,…}.
  • What do you mean, concat?
  • CCFormat.to_string? or something else?

@Drup
Copy link
Collaborator

Drup commented Jan 5, 2017

I'm not sure why you don't like it, it's essential, and it's not a problem if you have good combinators (like const : ''a printer -> 'a -> unit printer and it's partial application to string).

Also, Please add CCOrd.Infix, local opening is unusable otherwise, since CCOrd.compare will shadow the local comparison function you are in the middle of defining ...

@Drup
Copy link
Collaborator

Drup commented Jan 5, 2017

I meant CCFormat.of_to_string: ('a -> string) -> 'a printer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants