-
Notifications
You must be signed in to change notification settings - Fork 479
Add sphinx bibtex to provide bib file based citations #1032
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
docs/conf.py
Outdated
| def check_if_v071dev0(): | ||
| """Check to see if current version being built is v0.7.1.dev0""" | ||
| result = False | ||
|
|
||
| if fairlearn.__version__ == "0.7.1.dev0": | ||
| print("Detected 0.7.1.dev0 in fairlearn.__version__") | ||
| result = True | ||
|
|
||
| return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before the PR gets merged, we will have to change this version check. I was thinking of using a greater than comparison here. I did look into this and realized we might have to use the packaging library as suggested in Stackoverflow. However I am not sure if we want to introduce another dependency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since this is a doc build dependency, I don't mind adding it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I later realized that packaging is already used by setuptools so most people should already have it installed. But to be explicit, I have added it to the requirements-dev.txt.
|
Thank you for starting to look at this! The versioning of the docs is something we've talked on and off about for a while; this might help animate those discussions. |
adrinjalali
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to use packaging for the version check, and it LGTM otherwise.
romanlutz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for getting the bibtex conversion started! This is great!!!
Make sure to add a line to the changelog (part of docs/user_guide, there's a file somewhere for each version, so just add to the latest one).
docs/refs.bib
Outdated
| @@ -0,0 +1,51 @@ | |||
| Mitigations.rst | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in favor of this, but the comments in this file don't add anything from my point of view. In fact, these papers are cited all over the place rather than just in one file. Keeping such a mapping in sync sounds tricky and I don't really see the value it would add since you can just search for the identifier to find where it's cited. Again, just the comments (including paper comments below), everything else is great! I assume you just copied from Google Scholar or somewhere similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that we often cite papers across several sections, so organizing the bibtex by the section of our docs might not work (perfectly). I'm not sure what a good alternative would be though. I'm okay going with this proposal but ensuring that we don't have duplicates.
The bibtex entries below look like a merge of DBLP versions, which I think is sensible. We shouldn't use Googe Scholar as a source for citations. The main decision is perhaps whether we are okay with the citation ids of the form
<FirstAuthorSurname><Year><PaperKeyword>
I'm fine with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keeping such a mapping in sync sounds tricky and I don't really see the value it would add since you can just search for the identifier to find where it's cited.
Ok that makes sense, I think the step 1 should cover the duplicates, no need for comments. If someone is adding the a new entry, they should check if an entry already exists. I'll modify the PR and the description.
- Check if your required reference already exists in the
docs/refs.bibfile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you just copied from Google Scholar or somewhere similar?
I used the bibtex entries available through arXiv and modified some of the entries to showcase how different formats look.
The bibtex entries below look like a merge of DBLP versions, which I think is sensible. We shouldn't use Googe Scholar as a source for citations. The main decision is perhaps whether we are okay with the citation ids of the form
Thank you for the reference, I did not know that about Google Scholar. I used the ones available through arXiv, they link to the DBLP hosted records. For example - arXiv Page Fairlearn and its DBLP Bibtex. However, they always link back to the arXiv page for the paper.
Takeaway: It might be helpful to establish where we source our bibtex snippets from.
| ~~~~~~~~~~~ | ||
|
|
||
| .. topic:: References: | ||
| .. _references: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this smart enough to include only the cited ones on this page? It will be important moving forward when there are perhaps dozens of citations across the website, but only say, a handful, on this particular page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mirrored what we already had in the documentation. Right now the structure is - Each page like mitigation.rst will only list citations that are used on this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds great. I'm pretty sure that's what @romanlutz had in mind too.
|
|
||
| # Setup for sphinx-bibtex | ||
|
|
||
| # Only use sphinx-bibtex if version is above 0.7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not used in other versions, so it should just be fine(?) or am I missing something?
Oh or is this about the refs file not existing?
Just trying to make sure I get it 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right, it will be fine. However, sphinx-multiversion uses the latest conf.py and for other versions we don't have a refs.bib file which generates a warning. Although it will not stop the documentation build, this can worry some new users, so I made it explicit to use sphinx-bibtex only when the required version is available.
MiroDudik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked the specific bib entries that are being used.
docs/refs.bib
Outdated
| Mitigations.rst | ||
| Agarwal, Beygelzimer, Dudik, Langford, Wallach “A Reductions Approach to Fair Classification”, ICML, 2018. | ||
| @article{agarwal2018reductions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should be using something more similar to this: https://dblp.uni-trier.de/rec/conf/icml/AgarwalBD0W18.html?view=bibtex¶m=0
(so @inproceedings rather than @article)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it I will change to @inproceedings, do you mean to use the condensed format? It does not have URLs so the references will not link back to the papers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that having a URL is definitely nice to have, but the full standard citation seems very verbose. I would go for something in between, maybe (also fixing my name, which is getting a bit mangled):
@inproceedings{agarwal2018reductions,
author = {Alekh Agarwal and
Alina Beygelzimer and
Miroslav Dudík and
John Langford and
Hanna M. Wallach},
title = {A Reductions Approach to Fair Classification},
booktitle = {Proceedings of the 35th International Conference on Machine Learning,
{ICML} 2018},
pages = {60--69},
year = {2018},
url = {http://proceedings.mlr.press/v80/agarwal18a.html},
biburl = {https://dblp.org/rec/conf/icml/AgarwalBD0W18.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I'd rather add more info than less. It's at the bottom of the page anyway...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs/refs.bib
Outdated
| } | ||
|
|
||
| Agarwal, Dudik, Wu “Fair Regression: Quantitative Definitions and Reduction-based Algorithms”, ICML, 2019. | ||
| @article{agarwal2019fair, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs/refs.bib
Outdated
| } | ||
|
|
||
| Hardt, Price, Srebro “Equality of Opportunity in Supervised Learning”, NeurIPS, 2016. | ||
| @article{hardt2016equality, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see: https://dblp.uni-trier.de/rec/conf/nips/HardtPNS16.html?view=bibtex¶m=0
[with NIPS -> NeurIPS]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe 2016 preceeds the renaming of the conference (?). Is it okay to call it by the new name regardless? I'm obviously very much in favor, just trying to understand the conventions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, retroactive renaming is okay... we're providing a link, so it shouldn't matter.
|
So it seems there's quite a bit of discussion on how to reference papers. What I've always done is to cope the bibtex entry from google scholar and let it be formatted by whatever library's rendering them. But it seems @MiroDudik doesn't like that. I understand in a paper that may be important, but I'm not sure if this much discussion and work is worth it everytime we'd be adding new citations. This library is not an academic article, so I'm not sure why the sensitivity on the details of the citations are raised here. For me, as long as people can copy/paste our reference and find it wherever they find their articles, it's good enough. |
MiroDudik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.




Fixes #936
This PR adds the
sphinxcontrib-bibtexextension which allows us to use citations from a.bibfile.Usage
docs/refs.bibfile. If not add your bibtex snippet to therefs.bibfile. To keep the.bibfile organized, please ensure that you do not introduce duplicate entries.<FirstAuthorSurname><Year><PaperKeyword>.refs.bibfile, you can reference your citation using the specified citation key using the syntax:The above citation will be rendered as
[Number].4. To add the bibliography, you need to use
:footbibliography:directive at the bottom of your file. This will list all the citations for the current document automatically!Assumptions
:footcite:instead of the normal:cite:because of the way references are resolved.<FirstAuthorSurname><Year><PaperKeyword>convention for citation keys.Caveats
arXivreferences, the library somehow creates URLs for thearXivpaper and shows both the generated URL as well as the URL under the URL parameter in.bibfile. To avoid this, either only specify aURLparameter oreprintalong witheprinttype.To Do
Look out for
sphinx multiversion build, remember to whitelist the PR branch.mitigations.rsthas the new citations.Example Output
Before


After