string operations are quite inefficient #3294
Comments
Thanks for the analysis and description of how Entity handles strings. String efficiency has long been a problem in Rust because strings have been unique types. I suggest that the fix for this problem is to modify Atomic reference counting in strings doesn't fit so well with Rust's memory model, and shouldn't be needed.
You should use |
A week ago, I discovered Rust. The project is really interesting and is close to my vision of programming languages. I really love the enum/match mechanism and the memory model with shared and unique boxes. Everything looks consistent except the strings. The mechanism with three kinds of strings is not obvious to use and, above all, the generated LLVM code is quite inefficient.
For example, consider the following code:
The constructor makes three calls (upcall_str_new_uniq, glue_drop and llvm.memmove).
The set_a function makes three calls (exchange_malloc, llvm.memmove and glue_drop).
The cmp_a function makes just one mandatory call (upcall_cmp_type), but if you compare with a constant, you add a call to upcall_str_new_uniq and a call to glue_free.
String operations are very common and a good compiler should optimize them, so Rust needs to do so as well.
For several years now, I have been building a language which shares a lot with Rust. I had the same string problems that Rust has, and I think I found something quite efficient.
I use reference-counted objects to hold strings. Each object has a links_count, the string size in bytes, the string size in characters, and the characters themselves in UTF-8. A static string has a links count of -1, which is never incremented or decremented. Strings are shared between threads (but, like Rust, I don't share objects between threads). I use the LLVM cmpxchg instruction to increment or decrement the links count so that it is thread safe.
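To make the scheme above concrete, here is a minimal sketch in present-day Rust. It is not Entity's implementation: the names StrHeader, STATIC_LINKS, adjust_links, retain and release are hypothetical, and the field layout is only assumed from the description (links count, size in bytes, size in characters, UTF-8 data). The compare-exchange loop stands in for the cmpxchg-based increment and decrement, and the -1 sentinel is never modified.

```rust
use std::sync::atomic::{AtomicIsize, Ordering};

/// Sentinel links count for static strings (assumed from the description: -1).
const STATIC_LINKS: isize = -1;

/// Hypothetical string object: links count, byte size, character size, UTF-8 data.
struct StrHeader {
    links_count: AtomicIsize,
    byte_len: usize,
    char_len: usize,
    data: Box<str>,
}

impl StrHeader {
    fn new(text: &str, is_static: bool) -> Self {
        StrHeader {
            links_count: AtomicIsize::new(if is_static { STATIC_LINKS } else { 1 }),
            byte_len: text.len(),
            char_len: text.chars().count(),
            data: text.into(),
        }
    }

    /// Adds `delta` to the links count with a compare-exchange loop and
    /// returns the new count. Static strings are left untouched.
    fn adjust_links(&self, delta: isize) -> isize {
        let mut current = self.links_count.load(Ordering::Relaxed);
        loop {
            if current == STATIC_LINKS {
                return STATIC_LINKS;
            }
            match self.links_count.compare_exchange_weak(
                current,
                current + delta,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return current + delta,
                // Another thread changed the count first: retry with the observed value.
                Err(observed) => current = observed,
            }
        }
    }

    /// Takes one more link to the string.
    fn retain(&self) -> isize {
        self.adjust_links(1)
    }

    /// Drops one link; a result of 0 means the string can be freed.
    fn release(&self) -> isize {
        self.adjust_links(-1)
    }
}

fn main() {
    let shared = StrHeader::new("héllo", false);
    shared.retain();
    println!(
        "{:?}: {} bytes, {} chars, links after release = {}",
        shared.data,
        shared.byte_len,
        shared.char_len,
        shared.release()
    );

    let constant = StrHeader::new("abc", true);
    // Static strings keep the sentinel no matter how often they are linked.
    assert_eq!(constant.retain(), STATIC_LINKS);
    assert_eq!(constant.release(), STATIC_LINKS);
}
```

Because the sentinel check happens before any write, linking a static string costs a single load, and a compiler that knows the string is static can drop the call entirely.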
With the same example:
I have the following LLVM code:
In the constructor, the field is initialized with a static string. The compiler optimizes this case and doesn't generate the increment (which would do nothing for a static string anyway).
I think that the way I manage strings in Entity could be very useful for Rust. If it is decided that this should be done for Rust, I would gladly do the development (or provide some help).
You can have a look at the Entity language (http://code.google.com/p/entity-language/). The final compiler with LLVM generation is very young and handles only a few things so far.