Skip to content

Tests need to be refactored; failures from master with various EOL formats. #241

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sudoforge opened this issue Jun 16, 2017 · 36 comments
Closed

Comments

@sudoforge
Copy link

sudoforge commented Jun 16, 2017

I'm on Fedora 25 (64-bit), using kernel v4.11.3-202.

I am unable to successfully run cargo test from master of this repository, using rustc 1.17.0 (56124baa99 2017-04-24) and cargo 0.18.0 (fe7b0cdcf 2017-04-24). For sanity checking, I have attempted to clone the repository with the git configuration core.autocrlf set to its various values:

$ git clone https://github.com/cobalt-org/cobalt.rs.git && cd cobalt.rs && cargo test ; cd ../
// fails

$ rm -rf ./cobalt.rs && git -c core.autocrlf=true clone https://github.com/cobalt-org/cobalt.rs.git && cd cobalt.rs && cargo test ; cd ../
//fails

$ rm -rf ./cobalt.rs && git -c core.autocrlf=input clone https://github.com/cobalt-org/cobalt.rs.git && cd cobalt.rs && cargo test ; cd ../
//fails

$ rm -rf ./cobalt.rs && git -c core.autocrlf=false clone https://github.com/cobalt-org/cobalt.rs.git && cd cobalt.rs && cargo test ; cd ../
//fails

This issue was originally surfaced in #209 under Windows 10, so the issue does not seem to be limited to my machine.

@epage
Copy link
Member

epage commented Jun 16, 2017

Oye. Thanks for looking into this further.

@Geobert
Copy link
Contributor

Geobert commented Nov 18, 2017

On my end, with cobalt 0.8.0, git -c core.autocrlf=false clone https://github.com/cobalt-org/cobalt.rs.git gives me a clone with all the tests passing.

Cobalt produces Unix EOL files, I don't know if it should produce OS dependant files or not though

So either force one style of EOL through a .gitattributes file, or make Cobalt produces OS dependant EOL file should fix this

@epage
Copy link
Member

epage commented Nov 18, 2017

The thing that confuses me about this is that I don't think cobalt deals with line endings, it just preserves whatever is in the source file. For the tests, this means the fixture and target would both be changed to CRLF and should work.

When we validate a test run, we generate a diff using a specific EOL. I thought this was just advisory only. Either way, it should treat the two files the same.

@epage
Copy link
Member

epage commented Nov 18, 2017

Either way, I'm fine with adding a .gitattribute file and and editorconfig file to better specify how to handle this.

Ideally, the editorconfig file would also say to not strip trailing whitespace. That is an easy way for people to mess up the tests.

@Geobert
Copy link
Contributor

Geobert commented Nov 19, 2017

I've normalize EOL in assert_dirs_eq and it solves many tests but some are still broken. While investigating these, I've notice a weirdness in the produced file: We have a mix of EOL:
save.zip

this is the product from excerpts test, post-11.md.

While my post-11.md has CRLF everywhere because of autocrl=true, the output has a mix of CRLF and LF.

EDIT: but this is not the cause of test failure

@Geobert
Copy link
Contributor

Geobert commented Nov 19, 2017

The failing tests are:

  • excerpts
  • jsonfeed
  • rss

For excerpts test, it seems that having the md file with CRLF changes the output of the except :-/
Haven't investigate the other two yet

I'll try to fix excerpts (so nobody is working on the same thing)

@Geobert
Copy link
Contributor

Geobert commented Nov 21, 2017

The default value for excerpt_separator is "\n\n" that's why we have different output if the markdown file is in Windows EOL mode.

I don't know how to manage it, because on an OS, we can encounter different EOL anyway so we should not rely on the OS to determine the default excerpt separator.

May be a detection? Like looking at the first EOL? And if ever we have a mixed file, the user should normalize (and we spit a warning)

@epage
Copy link
Member

epage commented Nov 21, 2017

So that answers why the tests fail with normalized whitespace

May be a detection? Like looking at the first EOL? And if ever we have a mixed file, the user should normalize (and we spit a warning)

So we have several options

  • Auto-detect as you mentioned
  • Change the default to something newline agnostic
  • Document this issue and give a recommendation on what to override excerpt_separator to
    • If you're windows-only, change excerpt_separator to "\r\n\r\n"
    • On mixing platforms, consider something like <!-- more --> like in some tests
      • This is what the impacted tests would do

The newline-based separator is nice for a post that immediately goes into content because its a lot more natural. So I somewhat lean towards keeping that. The next question is how many users will be mixing platforms to a degree that we need auto-detection vs documentation?

Any other thoughts?

@epage
Copy link
Member

epage commented Nov 21, 2017

We still have the question of why we need to normalize whitespace with autocrlf. See #335

@Geobert
Copy link
Contributor

Geobert commented Nov 22, 2017

I'd like to keep the newline separator too. On the other hand, we can't force people to use one type of EOL. If Windows user manually create a md file, it will be with CRLF. I tend to go for auto detect. The question is: per file detection or just OS detection?

@epage
Copy link
Member

epage commented Nov 28, 2017

Before we decide on a solution, I feel like we need to better understand the problem: why is there a newline discrepancy in the first place. If we auto-detect, what is it changing about our output?

@Geobert
Copy link
Contributor

Geobert commented Nov 28, 2017

I need to investigate this, I suspect a hardcoded "\n" somewhere ^^

@epage
Copy link
Member

epage commented Nov 28, 2017

Thanks for being willing to look into that!

@Geobert
Copy link
Contributor

Geobert commented Nov 28, 2017

Ok I had a quick look and here my findings:

  • excerpt_separator are hardcoded to "\n\n"
  • in render_dump we had "\n" after the "---" separator
  • in create_rss, there's a "\n" as well

I think we need to change them with something consistent across the website being processed and forbid mixed EOL in a website. Maybe add an option in the config file?

@epage
Copy link
Member

epage commented Nov 28, 2017

  • I provided a suggestion above for excerpt_separator, thoughts on it?
  • render_dump doesn't matter, its only for debugging.
  • create_rss / create_jsonfeed will eventually call into extract_markdown_references, is this what you are referring to? or is there another place adding line endings, like the rss crate?

When you made the tests normalize the line endings, that left the RSS tests failing. I thought you said it was because the excerpts weren't correctly extracted because of excerpt_separator? If thats the case, then there is still a hole missing to explain why all of the other tests started passing after normalization.

@Geobert
Copy link
Contributor

Geobert commented Nov 29, 2017

For the excerpt separator, I think we need both: documentation and auto detection. I'm not comfortable with a hardcoded value for it.

for the rss/json feed, I need to see what's happening in the test, I haven't check, you may be right that the excerpt bug is making them failing though :)

@Geobert
Copy link
Contributor

Geobert commented Nov 29, 2017

@epage
Copy link
Member

epage commented Nov 30, 2017

Found it! Or at least another form of "it".

When we parse things with markdown, the line endings are being converted! I confirmed this by converting an .md file that has a layout to CRLF. The layout has CRLF endings but the content section doesn't (I think).

I also suspect syntect is doing the same thing for syntax highlighting.

@Geobert
Copy link
Contributor

Geobert commented Dec 2, 2017

Well done! I hope there's no other occurrence of this bug. To summarize, we need:

  • find a way to manage excerpt separator
  • find a way to manage this .md conversion

BTW, where it this occurrence?

@epage
Copy link
Member

epage commented Dec 5, 2017

@epage
Copy link
Member

epage commented Dec 5, 2017

BTW, where it this occurrence?

markdown is handled at https://github.com/cobalt-org/cobalt.rs/blob/master/src/document.rs#L324

@epage
Copy link
Member

epage commented Dec 5, 2017

So I'm re-evaluating things and playing with options in my head

  • Give up and go back to Normalize EOL before diff #335
  • "Normalize" .liquid (ie Raw) files to \n, causing everything to now magically be consistent
  • Detect the file's line endings and reconvert markdown/syntect from \n to the user's line endings

So we need to weigh out

  • Preserving the user's line endings
  • Consistent line endings
  • Performance
  • Simplicity

Overall, I'm wondering how valuable preserving and consistency are.

For simplicity, I worry how much we'll be playing whack-a-mole either preserving the user's line endings or normalizing them to \n.

Thoughts? Sorry if it feels like I'm yanking you in different directions. This journey has been valuable for us to make an informed decision.

@Geobert
Copy link
Contributor

Geobert commented Dec 5, 2017

No need to apologies here :) I think that we just need to be consistent on the output.
I'm quite keen of the solution of accepting anything and normalize before processing. Anyway, the generated files are not supposed to be edited manually so having them as "\n" should not be a problem.

As for the test failing, either putting back #335 or a .gitattributes will do the job. I think .gitattributes is better as #335 can hide some issues as you said on the PR.

@epage
Copy link
Member

epage commented Dec 6, 2017

I'm quite keen of the solution of accepting anything and normalize before processing.

If for no other reason than for if someone later has a question on why we did something, why are you keen on this solution?

@Geobert
Copy link
Contributor

Geobert commented Dec 6, 2017

I think it's the easiest way to get rid of this issue ^^'
We need to normalize the EOL somehow, and deciding "everything to LR" is the simplest. I'm just worried on perf.

@Geobert
Copy link
Contributor

Geobert commented Dec 16, 2017

So we agree on this solution? I will have some free time next week to work on cobalt so I will begin by solve this before working on the theming subject :)

@epage
Copy link
Member

epage commented Dec 17, 2017

Sorry, been rushing too much on these cobalt / liquid changes, haven't been spending much time thinking on other things.

We need to normalize the EOL somehow

That is one of the assumptions my post listing out options above is re-evaluating. I'm wondering if we should, for now, just leave things be. Maybe its not all that bad to just normalize the whitespace in the tests?

@Geobert
Copy link
Contributor

Geobert commented Dec 17, 2017

No need to apologies, I haven't work on cobalt for weeks myself ;)

Normalizing the EOL in tests will not solve the tests where excerpts are involved :-/

Geobert pushed a commit to Geobert/cobalt.rs that referenced this issue Dec 21, 2017
Geobert pushed a commit to Geobert/cobalt.rs that referenced this issue Dec 21, 2017
@Geobert
Copy link
Contributor

Geobert commented Dec 21, 2017

in order to solve this, we need 2 things: 

  • normalize into Document::parse (using the crate you pointed earlier)
  • .gitattributes to force the files to be LF (so the copied assets are LF)

I'll wait #346 to land in order to incorporate the changes in cobalt.rs into my refactor.

@epage
Copy link
Member

epage commented Dec 23, 2017

This is my recommendation for tests that use the excerpt

Document this issue and give a recommendation on what to override excerpt_separator to
• If you're windows-only, change excerpt_separator to "\r\n\r\n" 
• On mixing platforms, consider something like <!-- more --> like in some tests 
• This is what the impacted tests would do

@Geobert
Copy link
Contributor

Geobert commented Dec 23, 2017

With my fix, the "\r\n" change is not needed, I'm normalizing the file :) This is more friendly, isn't it?
I need to add a test to make sure that the normalization takes place properly, planning to do it next week :)
gitattributes can be fine tuned to keep "\r\n" for some files only. I'm planning on copying some tests fixture into a *_CRLF version that will keep "\r\n" in the source.

The test will makes sure that the produced result will be "\n"

Not sure I'm clear enough here ^^'

@Geobert
Copy link
Contributor

Geobert commented Dec 30, 2017

I have added some tests: basically copy pasted excerpts into excerpts_CRLF and added a rule into .gitattributes to convert EOL as CRLF. So in a clone, we have these format tested. If you are agree with this approach, I'll add a _CR copy for the Mac format as well.

Geobert pushed a commit to Geobert/cobalt.rs that referenced this issue Jan 5, 2018
Geobert pushed a commit to Geobert/cobalt.rs that referenced this issue Jan 5, 2018
Geobert pushed a commit to Geobert/cobalt.rs that referenced this issue Jan 18, 2018
Geobert pushed a commit to Geobert/cobalt.rs that referenced this issue Jan 19, 2018
split build() into smaller functions
uses normalize end line crate 
Add tests for CRLF case
Geobert pushed a commit to Geobert/cobalt.rs that referenced this issue Jan 21, 2018
split build() into smaller functions
uses normalize end line crate 
Add tests for CRLF case


fmt issue


Fix conflict


fix fmt


Fix clippy


Compil


fix clippy
Geobert pushed a commit to Geobert/cobalt.rs that referenced this issue Jan 21, 2018
@Geobert Geobert mentioned this issue Jan 23, 2018
Geobert pushed a commit to Geobert/cobalt.rs that referenced this issue Jan 26, 2018
Geobert pushed a commit to Geobert/cobalt.rs that referenced this issue Jan 26, 2018
Closes cobalt-org#241
add editorconfig
add gitattributes
Geobert pushed a commit to Geobert/cobalt.rs that referenced this issue Jan 26, 2018
Closes cobalt-org#241
add editorconfig
add gitattributes
Geobert pushed a commit to Geobert/cobalt.rs that referenced this issue Jan 27, 2018
Closes cobalt-org#241
add editorconfig
add gitattributes
Geobert pushed a commit to Geobert/cobalt.rs that referenced this issue Jan 27, 2018
Closes cobalt-org#241
add editorconfig
add gitattributes
@epage epage closed this as completed in ef16be0 Jan 27, 2018
@sudoforge
Copy link
Author

The tests excerpts and excerpts_crlf still fail for me with a fresh clone of the repository on Linux, on the master branch, with no core.autocrlf gitconfig setting.

@Geobert
Copy link
Contributor

Geobert commented Mar 12, 2018

How do you clone? Do you use a GUI? GitKraken for instance have its own autocrlf setting and ignore global one and even .gitattributes one.

What is your OS? Still Fedora? I don't have a Fedora to test :-/

@sudoforge
Copy link
Author

sudoforge commented Mar 12, 2018

@Geobert I clone from the command line directly. And no, I switched to Arch Linux last July.

You can quite literally explore my workstation:

There's nothing I've done, no setting I've configured, that should cause this issue. I've experienced test failures (due to line endings) with this project under fresh installs of Debian, Windows, Fedora, and Arch Linux. I'm curious what platform the project contributors use.

@Geobert
Copy link
Contributor

Geobert commented Mar 12, 2018

Can you please give me the EOL encoding of:

  • tests/target/excerpts/index.html
  • tests/target/excerpts_CRLF/index.html
  • tests/fixtures/excerpts/posts/post-1.md
  • tests/fixtures/excerpts_CRLF/posts/post-1.md

epage referenced this issue in epage/cobalt.rs Aug 31, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants