Fix the problem when generating TOC for document with UTF-8 characters #24

Closed
ghost wants to merge 1 commit into master from unknown repository

ghost commented Jun 28, 2016

(replace-regexp-in-string "[^a-z0-9 -]" "")

replaces every non-ASCII character with the empty string, so the generated links do not work on GitHub.

(replace-regexp-in-string "[[:punct:]]" "") 

solves this problem and is compatible with the previous method.

coveralls commented Jun 28, 2016

Coverage remained the same at 97.917% when pulling e8122ba on m31271n:master into c5d4447 on ardumont:master.

ardumont (Owner) commented

Hi,

Thanks for the effort. I'm not sure I understand, though.

Do you have a sample so that I can see what this actually solves?

Cheers,

ghost (Author) commented Jun 29, 2016

Thanks for your reply.

For example, I have a title like "配置 SPF(Sender Policy Framework)记录" (Chinese for "Configure SPF (Sender Policy Framework) record").

The old markdown-toc--to-link is:

(defun markdown-toc--to-link (title)
  "Given a TITLE, return the markdown link associated."
  (format "[%s](#%s)" title
          (->> title
               downcase
               (replace-regexp-in-string "[^a-z0-9 -]" "")
               (s-replace " " "-"))))

Call it with the title:

(markdown-toc--to-link "配置 SPF(Sender Policy Framework)记录")
;; Eval it, we get "[配置 SPF(Sender Policy Framework)记录](#-spfsender-policy-framework)"

The non-ASCII characters disappear from the anchor. With the proposed change, markdown-toc--to-link becomes:

(defun markdown-toc--to-link (title)
  "Given a TITLE, return the markdown link associated."
  (format "[%s](#%s)" title
          (->> title
               downcase
               (replace-regexp-in-string "[[:punct:]]" "")
               (s-replace " " "-"))))

Call it with the same title:

(markdown-toc--to-link "配置 SPF(Sender Policy Framework)记录")
;; Eval it, we get "[配置 SPF(Sender Policy Framework)记录](#配置-spfsender-policy-framework记录)"

The non-ASCII characters are preserved in the anchor.

ardumont (Owner) commented

Wow, yes, that's a problem.
Thanks for the details.

Will take a closer look as soon as I can.

Cheers,

ardumont added a commit that referenced this pull request Jun 29, 2016
ardumont (Owner) commented Jun 29, 2016

When I merge the 0.1.1 branch, this PR will be closed.
I've cherry-picked your commit and iterated on it to fix an edge case:
some titles contain "-", which was being removed.

Thanks for raising this and proposing a fix!
Awesome!

Cheers,
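
The edge case mentioned above can be illustrated with a rough sketch. This is only one possible way to keep "-" while still stripping other punctuation, not the code that was actually merged. The name my/title-to-anchor is hypothetical, and the sketch uses only built-in functions instead of the ->> and s-replace helpers used earlier in the thread.

```elisp
;; Hypothetical helper, not the code that was merged into markdown-toc.
;; Uses only built-in functions, so it does not need dash.el or s.el.
(defun my/title-to-anchor (title)
  "Return a GitHub-style anchor for TITLE, keeping any `-' it contains."
  (let* ((lowered (downcase title))
         ;; Drop every punctuation character except the hyphen.
         (cleaned (replace-regexp-in-string
                   "[[:punct:]]"
                   (lambda (match) (if (string= match "-") "-" ""))
                   lowered t t)))
    ;; Turn each space into a hyphen, as the original function does.
    (replace-regexp-in-string " " "-" cleaned t t)))

(my/title-to-anchor "Step-by-step guide (v2)")
;; => "step-by-step-guide-v2"
```

Passing a function as the replacement lets the hyphen be decided per match, which avoids having to write a character class that excludes a single punctuation character.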

ghost (Author) commented Jun 29, 2016

You are welcome. ;]

shino commented Jul 20, 2016

This fix, with @ardumont's edge-case handling, works perfectly in the Japanese Markdown text I'm writing! Thanks a lot to both of you, @ardumont and @m31271n 👍

ardumont (Owner) commented

@shino @m31271n Thanks for the reminder.
I'll update markdown-toc soon ^^

ardumont mentioned this pull request Jul 20, 2016
ardumont closed this in #25 Jul 20, 2016
ardumont (Owner) commented

> This fix, with @ardumont's edge-case handling, works perfectly in the Japanese Markdown text I'm writing! Thanks a lot to both of you, @ardumont and @m31271n 👍

@shino Also, thanks for the additional feedback; it's awesome.

Cheers,

ghost (Author) commented Jul 20, 2016

Cheers.
