
Commit 1bf9e44

cipherboy, zeripath, guillep2k authored
Fix sanitizer config - multiple rules (#11133)
In #9888, it was reported that my earlier pull request #9075 didn't quite function as expected. I was quite hopeful that `ValueWithShadows()` worked as expected (and I thought my testing showed it did), but I guess not. @zeripath proposed an alternative syntax which I like:

```ini
[markup.sanitizer.1]
ELEMENT=a
ALLOW_ATTR=target
REGEXP=something

[markup.sanitizer.2]
ELEMENT=a
ALLOW_ATTR=target
REGEXP=something
```

This was quite easy to adopt into the existing code. I've done so in a semi-backwards-compatible manner:

- The value from `.Value()` is used for each element.
- We parse `[markup.sanitizer]` and all `[markup.sanitizer.*]` sections and add them as rules.

This means that existing configs will load one rule (not all rules). It also means people can use string identifiers (`[markup.sanitizer.KaTeX]`) if they prefer, instead of numbered ones.

Co-authored-by: Andrew Thornton <[email protected]>
Co-authored-by: guillep2k <[email protected]>
1 parent 6b6f20b commit 1bf9e44
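
The section matching described in the commit message can be sketched in Go as follows. This is a minimal illustration rather than the code from the commit: `isSanitizerSection` is a hypothetical helper, and it assumes the `markup.` prefix has already been stripped from the section name.

```go
package main

import (
	"fmt"
	"strings"
)

// isSanitizerSection reports whether a section name (with the "markup."
// prefix already removed) should be read as a sanitizer rule.
// Hypothetical helper, for illustration only.
func isSanitizerSection(name string) bool {
	return name == "sanitizer" || strings.HasPrefix(name, "sanitizer.")
}

func main() {
	names := []string{"sanitizer", "sanitizer.1", "sanitizer.KaTeX", "sanitiser.KaTeX", "asciidoc"}
	for _, name := range names {
		fmt.Printf("%-16s %v\n", name, isSanitizerSection(name))
	}
}
```

Under this check, a bare `[markup.sanitizer]` section and any suffixed section both count as rules, while a section spelled `sanitiser` would not match and would be handled as a renderer instead.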


4 files changed, +38 -39 lines changed


custom/conf/app.ini.sample

+4-2
@@ -976,8 +976,10 @@ SHOW_FOOTER_VERSION = true
 ; Show template execution time in the footer
 SHOW_FOOTER_TEMPLATE_LOAD_TIME = true

-[markup.sanitizer]
-; The following keys can be used multiple times to define sanitation policy rules.
+[markup.sanitizer.1]
+; The following keys can appear once to define a sanitation policy rule.
+; This section can appear multiple times by adding a unique alphanumeric suffix to define multiple rules.
+; e.g., [markup.sanitizer.1] -> [markup.sanitizer.2] -> [markup.sanitizer.TeX]
 ;ELEMENT = span
 ;ALLOW_ATTR = class
 ;REGEXP = ^(info|warning|error)$

docs/content/doc/advanced/config-cheat-sheet.en-us.md

+2-2
@@ -658,7 +658,7 @@ Two special environment variables are passed to the render command:
 Gitea supports customizing the sanitization policy for rendered HTML. The example below will support KaTeX output from pandoc.

 ```ini
-[markup.sanitizer]
+[markup.sanitizer.TeX]
 ; Pandoc renders TeX segments as <span>s with the "math" class, optionally
 ; with "inline" or "display" classes depending on context.
 ELEMENT = span
@@ -670,7 +670,7 @@ REGEXP = ^\s*((math(\s+|$)|inline(\s+|$)|display(\s+|$)))+
 - `ALLOW_ATTR`: The attribute this policy allows. Must be non-empty.
 - `REGEXP`: A regex to match the contents of the attribute against. Must be present but may be empty for unconditional whitelisting of this attribute.

-You may redefine `ELEMENT`, `ALLOW_ATTR`, and `REGEXP` multiple times; each time all three are defined is a single policy entry.
+Multiple sanitisation rules can be defined by adding unique subsections, e.g. `[markup.sanitizer.TeX-2]`.

 ## Time (`time`)

docs/content/doc/advanced/external-renderers.en-us.md

+7-2
@@ -73,7 +73,7 @@ IS_INPUT_FILE = false
 If your external markup relies on additional classes and attributes on the generated HTML elements, you might need to enable custom sanitizer policies. Gitea uses the [`bluemonday`](https://godoc.org/github.com/microcosm-cc/bluemonday) package as our HTML sanitizier. The example below will support [KaTeX](https://katex.org/) output from [`pandoc`](https://pandoc.org/).

 ```ini
-[markup.sanitizer]
+[markup.sanitizer.TeX]
 ; Pandoc renders TeX segments as <span>s with the "math" class, optionally
 ; with "inline" or "display" classes depending on context.
 ELEMENT = span
@@ -86,6 +86,11 @@ FILE_EXTENSIONS = .md,.markdown
 RENDER_COMMAND = pandoc -f markdown -t html --katex
 ```

-You may redefine `ELEMENT`, `ALLOW_ATTR`, and `REGEXP` multiple times; each time all three are defined is a single policy entry. All three must be defined, but `REGEXP` may be blank to allow unconditional whitelisting of that attribute.
+You must define `ELEMENT`, `ALLOW_ATTR`, and `REGEXP` in each section.
+
+To define multiple entries, add a unique alphanumeric suffix (e.g., `[markup.sanitizer.1]` and `[markup.sanitizer.something]`).

 Once your configuration changes have been made, restart Gitea to have changes take effect.
+
+**Note**: Prior to Gitea 1.12 there was a single `markup.sanitiser` section with keys that were redefined for multiple rules, however,
+there were significant problems with this method of configuration necessitating configuration through multiple sections.

modules/setting/markup.go

+25-33
@@ -44,7 +44,7 @@ func newMarkup() {
 			continue
 		}

-		if name == "sanitizer" {
+		if name == "sanitizer" || strings.HasPrefix(name, "sanitizer.") {
 			newMarkupSanitizer(name, sec)
 		} else {
 			newMarkupRenderer(name, sec)
@@ -67,44 +67,36 @@ func newMarkupSanitizer(name string, sec *ini.Section) {
 		return
 	}

-	elements := sec.Key("ELEMENT").ValueWithShadows()
-	allowAttrs := sec.Key("ALLOW_ATTR").ValueWithShadows()
-	regexps := sec.Key("REGEXP").ValueWithShadows()
+	elements := sec.Key("ELEMENT").Value()
+	allowAttrs := sec.Key("ALLOW_ATTR").Value()
+	regexpStr := sec.Key("REGEXP").Value()

-	if len(elements) != len(allowAttrs) ||
-		len(elements) != len(regexps) {
-		log.Error("All three keys in markup.%s (ELEMENT, ALLOW_ATTR, REGEXP) must be defined the same number of times! Got %d, %d, and %d respectively.", name, len(elements), len(allowAttrs), len(regexps))
+	if regexpStr == "" {
+		rule := MarkupSanitizerRule{
+			Element:   elements,
+			AllowAttr: allowAttrs,
+			Regexp:    nil,
+		}
+
+		ExternalSanitizerRules = append(ExternalSanitizerRules, rule)
 		return
 	}

-	ExternalSanitizerRules = make([]MarkupSanitizerRule, 0, len(elements))
-
-	for index, pattern := range regexps {
-		if pattern == "" {
-			rule := MarkupSanitizerRule{
-				Element:   elements[index],
-				AllowAttr: allowAttrs[index],
-				Regexp:    nil,
-			}
-			ExternalSanitizerRules = append(ExternalSanitizerRules, rule)
-			continue
-		}
-
-		// Validate when parsing the config that this is a valid regular
-		// expression. Then we can use regexp.MustCompile(...) later.
-		compiled, err := regexp.Compile(pattern)
-		if err != nil {
-			log.Error("In module.%s: REGEXP at definition %d failed to compile: %v", name, index+1, err)
-			continue
-		}
+	// Validate when parsing the config that this is a valid regular
+	// expression. Then we can use regexp.MustCompile(...) later.
+	compiled, err := regexp.Compile(regexpStr)
+	if err != nil {
+		log.Error("In module.%s: REGEXP (%s) at definition %d failed to compile: %v", regexpStr, name, err)
+		return
+	}

-		rule := MarkupSanitizerRule{
-			Element:   elements[index],
-			AllowAttr: allowAttrs[index],
-			Regexp:    compiled,
-		}
-		ExternalSanitizerRules = append(ExternalSanitizerRules, rule)
+	rule := MarkupSanitizerRule{
+		Element:   elements,
+		AllowAttr: allowAttrs,
+		Regexp:    compiled,
 	}
+
+	ExternalSanitizerRules = append(ExternalSanitizerRules, rule)
 }

 func newMarkupRenderer(name string, sec *ini.Section) {
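
The one-rule-per-section flow introduced in `newMarkupSanitizer` can be approximated with the standard library alone. The sketch below is not the committed code: `Rule` and `parseRule` are hypothetical names, a plain `map[string]string` stands in for `*ini.Section`, and errors are returned instead of logged. The error message passes its arguments in the same order as its format verbs, which is an assumption about what the `log.Error` call in the diff was meant to print.

```go
package main

import (
	"fmt"
	"regexp"
)

// Rule mirrors the shape of MarkupSanitizerRule as used in the diff.
type Rule struct {
	Element   string
	AllowAttr string
	Regexp    *regexp.Regexp // nil when REGEXP is empty
}

// parseRule builds exactly one rule from one section's keys.
// name is the section name without the "markup." prefix.
func parseRule(name string, keys map[string]string) (Rule, error) {
	// The docs require all three keys in each section; REGEXP may be empty.
	for _, k := range []string{"ELEMENT", "ALLOW_ATTR", "REGEXP"} {
		if _, ok := keys[k]; !ok {
			return Rule{}, fmt.Errorf("markup.%s: missing key %s", name, k)
		}
	}

	rule := Rule{Element: keys["ELEMENT"], AllowAttr: keys["ALLOW_ATTR"]}
	if pattern := keys["REGEXP"]; pattern != "" {
		// Validate the pattern at config-parse time.
		compiled, err := regexp.Compile(pattern)
		if err != nil {
			return Rule{}, fmt.Errorf("markup.%s: REGEXP (%s) failed to compile: %v", name, pattern, err)
		}
		rule.Regexp = compiled
	}
	return rule, nil
}

func main() {
	sections := []struct {
		name string
		keys map[string]string
	}{
		{"sanitizer.1", map[string]string{"ELEMENT": "span", "ALLOW_ATTR": "class", "REGEXP": `^(info|warning|error)$`}},
		{"sanitizer.2", map[string]string{"ELEMENT": "a", "ALLOW_ATTR": "target", "REGEXP": ""}},
		{"sanitizer.bad", map[string]string{"ELEMENT": "a", "ALLOW_ATTR": "href", "REGEXP": "("}},
	}

	var rules []Rule
	for _, s := range sections {
		rule, err := parseRule(s.name, s.keys)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		rules = append(rules, rule)
	}
	fmt.Println(len(rules), "rules loaded")
}
```

Because each section yields at most one rule, a bad pattern in one section only drops that section's rule and leaves the others intact.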
