Skip to content

Merge from upstream #31

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 149 commits into from
Nov 7, 2022
Merged

Merge from upstream #31

merged 149 commits into from
Nov 7, 2022

Conversation

chrisseaton
Copy link
Collaborator

No description provided.

mame and others added 30 commits October 25, 2022 13:20
A regexp that ends with an escape following an incomplete UTF-8 char
might cause buffer overrun. Found by OSS-Fuzz.

```
$ valgrind ./miniruby -e 'Regexp.new("\\u2d73\\0\\0\\0\\0          \\\xE6".b)'
==296213== Memcheck, a memory error detector
==296213== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==296213== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==296213== Command: ./miniruby -e Regexp.new("\\\\u2d73\\\\0\\\\0\\\\0\\\\0\ \ \ \ \ \ \ \ \ \ \\\\\\xE6".b)
==296213==
==296213== Warning: client switching stacks?  SP change: 0x1ffe8020e0 --> 0x1ffeffff10
==296213==          to suppress, use: --max-stackframe=8379952 or greater
==296213== Invalid read of size 1
==296213==    at 0x484EA10: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==296213==    by 0x339568: memcpy (string_fortified.h:29)
==296213==    by 0x339568: onig_strcpy (regparse.c:271)
==296213==    by 0x339568: onig_node_str_cat (regparse.c:1413)
==296213==    by 0x33CBA0: parse_exp (regparse.c:6198)
==296213==    by 0x33EDE4: parse_branch (regparse.c:6511)
==296213==    by 0x33EEA2: parse_subexp (regparse.c:6544)
==296213==    by 0x34019C: parse_regexp (regparse.c:6593)
==296213==    by 0x34019C: onig_parse_make_tree (regparse.c:6638)
==296213==    by 0x32782D: onig_compile_ruby (regcomp.c:5779)
==296213==    by 0x313EFA: onig_new_with_source (re.c:876)
==296213==    by 0x313EFA: make_regexp (re.c:900)
==296213==    by 0x313EFA: rb_reg_initialize (re.c:3136)
==296213==    by 0x318555: rb_reg_initialize_str (re.c:3170)
==296213==    by 0x318555: rb_reg_init_str (re.c:3205)
==296213==    by 0x31A669: rb_reg_initialize_m (re.c:3856)
==296213==    by 0x3E5165: vm_call0_cfunc_with_frame (vm_eval.c:150)
==296213==    by 0x3E5165: vm_call0_cfunc (vm_eval.c:164)
==296213==    by 0x3E5165: vm_call0_body (vm_eval.c:210)
==296213==    by 0x3E89BD: vm_call0_cc (vm_eval.c:87)
==296213==    by 0x3E89BD: rb_call0 (vm_eval.c:551)
==296213==  Address 0x9d45b10 is 0 bytes after a block of size 32 alloc'd
==296213==    at 0x4844899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==296213==    by 0x20FA7B: objspace_xmalloc0 (gc.c:12146)
==296213==    by 0x35F8C9: str_buf_cat4.part.0 (string.c:3132)
==296213==    by 0x31359D: unescape_escaped_nonascii (re.c:2690)
==296213==    by 0x313A9D: unescape_nonascii (re.c:2869)
==296213==    by 0x313A9D: rb_reg_preprocess (re.c:2992)
==296213==    by 0x313DFC: rb_reg_initialize (re.c:3109)
==296213==    by 0x318555: rb_reg_initialize_str (re.c:3170)
==296213==    by 0x318555: rb_reg_init_str (re.c:3205)
==296213==    by 0x31A669: rb_reg_initialize_m (re.c:3856)
==296213==    by 0x3E5165: vm_call0_cfunc_with_frame (vm_eval.c:150)
==296213==    by 0x3E5165: vm_call0_cfunc (vm_eval.c:164)
==296213==    by 0x3E5165: vm_call0_body (vm_eval.c:210)
==296213==    by 0x3E89BD: vm_call0_cc (vm_eval.c:87)
==296213==    by 0x3E89BD: rb_call0 (vm_eval.c:551)
==296213==    by 0x3E957B: rb_call (vm_eval.c:877)
==296213==    by 0x3E957B: rb_funcallv_kw (vm_eval.c:1074)
==296213==    by 0x2A4123: rb_class_new_instance_pass_kw (object.c:1991)
==296213==
==296213==
==296213== HEAP SUMMARY:
==296213==     in use at exit: 35,476,538 bytes in 9,489 blocks
==296213==   total heap usage: 14,893 allocs, 5,404 frees, 37,517,821 bytes allocated
==296213==
==296213== LEAK SUMMARY:
==296213==    definitely lost: 316,081 bytes in 2,989 blocks
==296213==    indirectly lost: 136,808 bytes in 2,361 blocks
==296213==      possibly lost: 1,048,624 bytes in 3 blocks
==296213==    still reachable: 33,975,025 bytes in 4,136 blocks
==296213==         suppressed: 0 bytes in 0 blocks
==296213== Rerun with --leak-check=full to see details of leaked memory
==296213==
==296213== For lists of detected and suppressed errors, rerun with: -s
==296213== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
```
…builder/custom_name

Bumps [rb-sys](https://github.com/oxidize-rb/rb-sys) from 0.9.31 to 0.9.34.
- [Release notes](https://github.com/oxidize-rb/rb-sys/releases)
- [Commits](oxidize-rb/rb-sys@v0.9.31...v0.9.34)

---
updated-dependencies:
- dependency-name: rb-sys
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

rubygems/rubygems@6af714b02c
Use `Enumerable#find` to iterate over the candidates, not `Enumerable.each`.

(this makes the code more functional, and - IMO - slightly more idiomatic,
as it avoids setting the "global" (by which I mean: non-local) `tmp`
variable from inside the block)

ruby/tmpdir@d1f20ad694
A code pattern `p + enclen(enc, p, pend)` may lead to a buffer overrun
if incomplete bytes of a UTF-8 character is placed at the end of a
string. Because this pattern is used in several places in onigmo,
this change fixes the issue in the side of `enclen`: the function should
not return a number that is larger than `pend - p`.

Co-Authored-By: Nobuyoshi Nakada <[email protected]>
when it fails to allocate a new page.

Co-authored-by: Alan Wu <[email protected]>
(ruby/erb#23)

Ref: ruby/cgi#26

This native implementation is much faster
and available in `cgi 0.3.3`.

ruby/erb@2d90e9b010
- Add mswin/mswin64 to platforms
- Use TruffleRuby as example instead of Rubinius

Signed-off-by: Takuya Noguchi <[email protected]>
Co-authored-by: André Arko <[email protected]>
It's moved from k0kubun to ruby org.

Also, we don't need JavaScript eval to generate branch if we use
github.ref_name, so v3.0.0 is a version that doesn't use eval.

Co-Authored-By: Nobuyoshi Nakada <[email protected]>

Co-authored-by: Nobuyoshi Nakada <[email protected]>
esent.h is the header for MS essential storage engine (JET) which is not
needed in ruby. basetsd.h has existed since _MSC_VER >= 1200 (VS 6.0)
and is the preferred header to use for WCHAR.
It looks like Cirrus doesn't natively support notifications and they
recomment to use GitHub Actions for it.
https://cirrus-ci.org/guide/notifications/

Because I don't know what the payload looks like, I just added a basic
payload and dumped GitHub context so that we could improve it later.
The internal location in ractor.rb is not usefull at all.
```
$ ruby -e 'Ractor.new {}'
<internal:ractor>:267: warning: Ractor is experimental, ...
```
Ruby CI runs irb and other Ruby core/stdlib tests in the same process.
So adding irb-specific helper to Test::Unit::TestCase could potentially
pollute other components' tests and should be avoided.
nobu and others added 28 commits November 4, 2022 15:43
Since the regexp had expected an empty line before `Co-Authored-By:`
trailer lines, it failed to match when the body has the trailer only.
Xcode no longer links the system include files directory to `/usr`.
Extract the actual header file path from cpp output.
rb_obj_is_kind_of returns a Ruby Qtrue or Qfalse. We should use RTEST
rather than assuming that Qfalse is 0.
We were previously incrementing the max_iv_count on a class in gc
freeing. By the time we free an object though, we're not guaranteed its
class is still valid. Instead, we can do this when marking and we're
guaranteed the object still knows its class.
(ruby/erb#28)

`prepend` is prioritized more than ActiveSupport's monkey-patch, but the
monkey-patch needs to work.

ruby/erb@611de5a865
Bundler's backups changes environment variables starting with
BUNDLER_ORIG_. This causes a lot of noise in tests as the leakchecker
reports them as changed.
* Auto-enable YJIT build when rustc >= 1.58.0 present

* Try different incantation to have rustc output to stdout only

* Add comment, remove whitespace

* Try to detect if we are on a platform on which YJIT is supported
(ruby/erb#29)

Typically, strpbrk(3) is optimized pretty well with SIMD instructions.
Just using it makes this as fast as a SIMD-based implementation for the
no-escape case.

Not utilizing this for escaped cases because memory allocation would be
a more significant bottleneck for many strings anyway. Also, there'll be
some overhead in calling a C function (strpbrk) many times because we're
not using SIMD instructions directly. So using strpbrk all the time
might not necessarily be faster.
This is the same trick used by https://github.com/k0kubun/hescape to
choose the best strategy for different scenarios.

ruby/erb@af26da2858
because it's much slower on M1 ruby/erb#29.
It'd be too complicated to switch the implementation based on known
optimized platforms / versions.

Besides, short strings are the most common usages of this method and
SIMD doesn't really help that case. All in all, I can't justify the
existence of this code.

ruby/erb@30691c8995
fiber machine stack is placed outside of C stack allocated by wasm-ld,
so highest stack address recorded by `rb_wasm_record_stack_base` is
invalid when running on non-main fiber.
Therefore, we should scan `stack_{start,end}` which always point a valid
stack range in any context.
@wks wks merged commit 19e523d into mmtk Nov 7, 2022
wks pushed a commit that referenced this pull request Aug 4, 2023
[Bug #19793]

Dummy frames are created at the top level when requiring another file.
While requiring a file, it will try to convert using encodings. Some of
these encodings will not respond to to_str. If method_missing is
redefined on Object, then it will call method_missing and attempt raise
an error. However, the iseq is invalid as it's a dummy frame so it will
write an invalid iseq to the created NoMethodError.

The following script crashes:

```
GC.stress = true

class Object
  public :method_missing
end

File.write("/tmp/empty.rb", "")
require "/tmp/empty.rb"
```

With the following backtrace:

```
frame #0: 0x00000001000fa8b8 miniruby`RVALUE_MARKED(obj=4308637824) at gc.c:1638:12
frame #1: 0x00000001000fb440 miniruby`RVALUE_BLACK_P(obj=4308637824) at gc.c:1763:12
frame #2: 0x00000001000facdc miniruby`gc_writebarrier_incremental(a=4308637824, b=4308332208, objspace=0x000000010180b000) at gc.c:8822:9
frame #3: 0x00000001000faad8 miniruby`rb_gc_writebarrier(a=4308637824, b=4308332208) at gc.c:8864:17
frame #4: 0x000000010016aff0 miniruby`rb_obj_written(a=4308637824, oldv=36, b=4308332208, filename="../iseq.c", line=1279) at gc.h:804:9
frame #5: 0x0000000100162a60 miniruby`rb_obj_write(a=4308637824, slot=0x0000000100d09888, b=4308332208, filename="../iseq.c", line=1279) at gc.h:837:5
frame #6: 0x0000000100165b0c miniruby`iseqw_new(iseq=0x0000000100d09880) at iseq.c:1279:9
frame #7: 0x0000000100165a64 miniruby`rb_iseqw_new(iseq=0x0000000100d09880) at iseq.c:1289:12
frame #8: 0x00000001000d8324 miniruby`name_err_init_attr(exc=4309777920, recv=4304780496, method=827660) at error.c:1830:35
frame #9: 0x00000001000d1b80 miniruby`name_err_init(exc=4309777920, mesg=4308332496, recv=4304780496, method=827660) at error.c:1869:12
frame #10: 0x00000001000d1bd4 miniruby`rb_nomethod_err_new(mesg=4308332496, recv=4304780496, method=827660, args=4308332448, priv=0) at error.c:1957:5
frame #11: 0x000000010039049c miniruby`rb_make_no_method_exception(exc=4304914512, format=4308332496, obj=4304780496, argc=1, argv=0x000000016fdfab00, priv=0) at vm_eval.c:959:16
frame #12: 0x00000001003b3274 miniruby`raise_method_missing(ec=0x0000000100b06f40, argc=1, argv=0x000000016fdfab00, obj=4304780496, last_call_status=MISSING_NOENTRY) at vm_eval.c:999:15
frame #13: 0x00000001003945d4 miniruby`rb_method_missing(argc=1, argv=0x000000016fdfab00, obj=4304780496) at vm_eval.c:944:5
...
frame #23: 0x000000010038f5e4 miniruby`rb_vm_call_kw(ec=0x0000000100b06f40, recv=4304780496, id=2865, argc=1, argv=0x000000016fdfab00, me=0x0000000100cbfcf0, kw_splat=0) at vm_eval.c:326:12
frame #24: 0x00000001003c18e4 miniruby`call_method_entry(ec=0x0000000100b06f40, defined_class=4304927952, obj=4304780496, id=2865, cme=0x0000000100cbfcf0, argc=1, argv=0x000000016fdfab00, kw_splat=0) at vm_method.c:2720:20
frame #25: 0x00000001003c440c miniruby`check_funcall_exec(v=6171896792) at vm_eval.c:589:12
frame #26: 0x00000001000dec00 miniruby`rb_vrescue2(b_proc=(miniruby`check_funcall_exec at vm_eval.c:587), data1=6171896792, r_proc=(miniruby`check_funcall_failed at vm_eval.c:596), data2=6171896792, args="Pȗ") at eval.c:919:18
frame #27: 0x00000001000deab0 miniruby`rb_rescue2(b_proc=(miniruby`check_funcall_exec at vm_eval.c:587), data1=6171896792, r_proc=(miniruby`check_funcall_failed at vm_eval.c:596), data2=6171896792) at eval.c:900:17
frame #28: 0x000000010039008c miniruby`check_funcall_missing(ec=0x0000000100b06f40, klass=4304923536, recv=4304780496, mid=3233, argc=0, argv=0x0000000000000000, respond=-1, def=36, kw_splat=0) at vm_eval.c:666:15
frame #29: 0x000000010038fa60 miniruby`rb_check_funcall_default_kw(recv=4304780496, mid=3233, argc=0, argv=0x0000000000000000, def=36, kw_splat=0) at vm_eval.c:703:21
frame #30: 0x000000010038fb04 miniruby`rb_check_funcall(recv=4304780496, mid=3233, argc=0, argv=0x0000000000000000) at vm_eval.c:685:12
frame #31: 0x00000001001c469c miniruby`convert_type_with_id(val=4304780496, tname="String", method=3233, raise=0, index=-1) at object.c:3061:15
frame #32: 0x00000001001c4a4c miniruby`rb_check_convert_type_with_id(val=4304780496, type=5, tname="String", method=3233) at object.c:3153:9
frame #33: 0x00000001002d59f8 miniruby`rb_check_string_type(str=4304780496) at string.c:2571:11
frame #34: 0x000000010014b7b0 miniruby`io_encoding_set(fptr=0x0000000100d09ca0, v1=4304780496, v2=4, opt=4) at io.c:11655:19
frame #35: 0x0000000100139a58 miniruby`rb_io_set_encoding(argc=1, argv=0x000000016fdfb450, io=4308334032) at io.c:13497:5
frame #36: 0x00000001003c0004 miniruby`ractor_safe_call_cfunc_m1(recv=4308334032, argc=1, argv=0x000000016fdfb450, func=(miniruby`rb_io_set_encoding at io.c:13487)) at vm_insnhelper.c:3271:12
...
frame #43: 0x0000000100390b08 miniruby`rb_funcall(recv=4308334032, mid=16593, n=1) at vm_eval.c:1137:12
frame #44: 0x00000001002a43d8 miniruby`load_file_internal(argp_v=6171899936) at ruby.c:2500:5
...
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.