BibTeX plugin proposal review

@xUser5000, I went through your proposal again and it's quite good overall: clear, well written, and supported by code examples and diagrams. I've listed my suggested changes below:

UX Document

You have a section about UI/UX, but it's a bit short, and I think getting the UX right is important for understanding how the plugin will work. To improve this, I'd suggest you create a UX document. It can be a simple forum post with a few screenshots. I think it should have at least:

  • A mockup of the config screen - detail what fields there will be, what they will do, etc.
  • Insert Citation popup - you already have this actually but add it to the UX document
  • How the citation will look in the Markdown document
  • How the citation will look in the rendered view

Citation Parser

  • Investigate the existing parser libraries and find out which one is best. Please let us know your reasoning.
  • From your proposal, I understand that the plugin will parse the document then store the citation data somewhere? If so, please describe how and where you plan to save this data.

Testing

To properly test your plugin you'll need some BibTeX files. As I understand it, BibTeX can be a relatively complex format, so I think it would be useful to gather as many examples as you can and check that your plugin works well in all cases. Perhaps ask users on the forum if they can provide BibTeX files that you could use?

Further questions

Rich Text editor support?

You'll need to define how your plugin handles the Rich Text editor. There are in general three levels of compatibility:

  • Doesn't work, and the Markdown is lost if the note is changed from the Rich Text editor
  • Doesn't work, but the Markdown is preserved if the note is changed from the Rich Text editor
  • Works

So you might want to investigate this and see what level you can support.

Editing a citation?

Once the citation has been added to the document, do you need to have something to make it easier to edit or update the citation? Perhaps that could be a stretch goal?

I think this was settled in another topic :slight_smile:

I did some research and took a deep dive into the libraries I mentioned in the proposal and I came up with the following conclusions:

bibtex-js from pcooksey: (excluded) Only for use in HTML documents

Bibtex.js from digitalheir: (excluded) No longer maintained and does not have good documentation

citation.js from larsgw: (Best option)

  • Supports many formats (including BibTeX of course)
  • Ongoing maintenance
  • Used by many people

So, I will use Citation.js

The best option would be to save it in memory (as an array of reference objects for example) to enable fast update and retrieval.
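As a rough illustration of the in-memory approach, here is a minimal sketch. The `Reference` shape and the `ReferenceStore` class are hypothetical names chosen for this example; the real objects would have whatever fields the chosen parser returns.

```typescript
// Hypothetical shape of a parsed reference; the real fields depend on the parser output.
interface Reference {
  id: string; // BibTeX citation key, e.g. "knuth1984"
  title: string;
  authors: string[];
}

// Keeps all references in memory for fast update and retrieval.
class ReferenceStore {
  private refs: Reference[] = [];

  // Replace the whole collection, e.g. after the file has been re-parsed.
  replaceAll(refs: Reference[]): void {
    this.refs = [...refs];
  }

  // Case-insensitive substring match on key, title, and author names.
  search(query: string): Reference[] {
    const q = query.trim().toLowerCase();
    if (q === "") return [...this.refs];
    return this.refs.filter((r) =>
      [r.id, r.title, ...r.authors].some((field) => field.toLowerCase().includes(q))
    );
  }
}

const store = new ReferenceStore();
store.replaceAll([
  { id: "knuth1984", title: "The TeXbook", authors: ["Donald E. Knuth"] },
  { id: "lamport1994", title: "LaTeX: A Document Preparation System", authors: ["Leslie Lamport"] },
]);
const byAuthor = store.search("knuth"); // matches only the first entry
const byTitle = store.search("tex");    // matches both titles
```

A linear scan like this is probably enough for typical library sizes; an index would only be worth adding if measurements showed otherwise.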

I stumbled upon this huge dataset containing more than 4000 BibTeX files. This is more than what I need.

The plugin is considered a Markdown plugin, which means, by definition, it does not support the Rich Text editor, as mentioned here. More specifically, the plugin won't work, and the Markdown is lost if the note is changed from the Rich Text editor.

Indeed, this is important. I'm thinking maybe I can redesign the citation popup to contain, in addition to its current contents, a list of included references (the references used in the note content). The user should be able to remove or add references to the list as needed. In my opinion, this is an excellent stretch goal.

I guess that's reasonable. Also, maybe consider at what point you would parse the file. And how big can a BibTeX file be?

@laurent

Here's my solution:

There will be a global boolean variable (called updates, for example) that indicates whether the file needs to be parsed. Any change to the file will set updates = true. The best approach I can think of is lazy parsing: the file won't be parsed unless the search query changes and updates == true. Here's a visual demonstration:
[Diagram: Parsing Process]
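The flag logic described above could be sketched roughly as follows. This is only an illustration under assumptions: `LazyLibrary`, `LibraryEntry`, and `fakeParse` are hypothetical names, and the stand-in parser reads a toy "key|title" format rather than real BibTeX. The sketch also assumes the flag is reset to false once parsing has run.

```typescript
// Hypothetical entry shape and parser signature, for illustration only.
type LibraryEntry = { id: string; title: string };
type ParseFn = (text: string) => LibraryEntry[];

class LazyLibrary {
  private updates = true; // true when the cached parse result is stale
  private source = "";
  private parsed: LibraryEntry[] = [];
  private parse: ParseFn;
  parseCount = 0; // exposed only to show how often parsing actually runs

  constructor(parse: ParseFn) {
    this.parse = parse;
  }

  // Called when the file path or file content changes: only marks the cache stale.
  setSource(text: string): void {
    this.source = text;
    this.updates = true;
  }

  // Called when the search query changes: parses only if something changed since last time.
  search(query: string): LibraryEntry[] {
    if (this.updates) {
      this.parsed = this.parse(this.source);
      this.parseCount += 1;
      this.updates = false;
    }
    const q = query.toLowerCase();
    return this.parsed.filter((e) => e.title.toLowerCase().includes(q));
  }
}

// Stand-in parser: one "key|title" pair per line. A real plugin would call a BibTeX parser here.
const fakeParse: ParseFn = (text) =>
  text
    .split("\n")
    .filter((line) => line.includes("|"))
    .map((line) => {
      const sep = line.indexOf("|");
      return { id: line.slice(0, sep), title: line.slice(sep + 1) };
    });

const library = new LazyLibrary(fakeParse);
library.setSource("a|First paper");
library.setSource("a|First paper\nb|Second paper"); // changed twice, nothing parsed yet
const allPapers = library.search("paper");
const secondOnly = library.search("second"); // reuses the cached parse result
const parsesSoFar = library.parseCount;
```

In this usage the source is set twice but parsed only once, on the first search, which is the redundant work the lazy design is meant to avoid.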

Why not parse immediately after the file path is set (eager parsing)?

  • Because I don't have full control over the config screen. For example, I cannot show a loading indicator using the current plugin API.
  • The user might want to change the file path several times before switching to the note and doing it this way avoids redundant parsing.

The story is different for big files (e.g., more than 1 GB).
I cannot load the whole file into memory, so a different strategy is needed. When the query changes, I can stream the file, parsing it reference by reference and checking each one against the query without ever loading the entire file. Performance will drop dramatically, but this is the compromise needed to avoid holding such a large amount of data in memory.
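To make the streaming idea concrete, here is a minimal sketch. It assumes, as a simplification, that entries are delimited by balanced braces (as in `@book{...}`); parenthesis-delimited entries, braces inside quoted strings, and stray `@` characters between entries are not handled. `StreamingMatcher` is a hypothetical name, and the match is a plain substring test on the raw entry text rather than a real parse.

```typescript
// Incrementally splits BibTeX text into entries and keeps only those matching a query.
// Only one entry is buffered at a time, so memory use does not grow with file size.
class StreamingMatcher {
  private buffer = ""; // text of the entry currently being read
  private depth = 0;   // current brace nesting depth
  private query: string;

  constructor(query: string) {
    this.query = query.toLowerCase();
  }

  // Feed the next chunk of the file; returns the matching entries completed by this chunk.
  push(chunk: string): string[] {
    const matches: string[] = [];
    for (const ch of chunk) {
      // Skip any text between entries until the next "@" starts a new one.
      if (this.buffer === "" && ch !== "@") continue;
      this.buffer += ch;
      if (ch === "{") {
        this.depth += 1;
      } else if (ch === "}" && this.depth > 0) {
        this.depth -= 1;
        if (this.depth === 0) {
          // The entry is complete: test it against the query, then discard it.
          if (this.buffer.toLowerCase().includes(this.query)) matches.push(this.buffer);
          this.buffer = "";
        }
      }
    }
    return matches;
  }
}

const matcher = new StreamingMatcher("knuth");
// The chunk boundaries deliberately fall in the middle of entries.
const chunks = [
  "@book{knuth1984, title={The {TeX}book}, auth",
  "or={Donald E. Knuth}}\n@article{lamport1994, ",
  "title={LaTeX}, author={Leslie Lamport}}\n",
];
const found: string[] = [];
for (const chunk of chunks) found.push(...matcher.push(chunk));
```

In Node.js the chunks could come from the `data` events of `fs.createReadStream(path, { encoding: "utf8" })`; that wiring is left out so the sketch stays self-contained.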

I'd like to see a person with 1GB of BibTeX references.

I agree it is very rare, and I have not seen anyone do that before. :joy:
So, can I just ignore this case?

@roman_r_m
Can we do a survey, for example, to gather data about the typical size of a BibTeX file?

Sure, why ask me?

Btw, you may also consider storing references in a database for very big libraries.

@laurent What do you think?

Having looked again at the chart, I'm not sure all the extra complexity required to support lazy parsing is actually needed.

So what if we parse twice instead of once? I'm pretty sure that in the time it takes the user to close the options menu, open a note, and start inserting references, you can parse a typical library 10 times if needed.

And presumably the path to the library will not be changed often.

Setting updates does not necessarily mean the library path has changed. It can also mean that the content of the library has changed.

I should have made it clearer: I was responding to this.

Oh, I see.
Well, I don't think it is that complex at all. We're talking about adding a new variable and just changing its value according to the situation. If you compare this diagram with the one here, you won't actually see that much of a difference.

As Roman mentioned, if you want to do a survey, then feel free to go ahead; you don't need our permission. And I think gathering some info about the kind of BibTeX file you can expect is a good idea.

Your diagram for updates makes sense. If it turns out that parsing the files is slow or you have to deal with large files, you can indeed also cache the parsing result on disk or in a database. I suppose you don't need to worry about this for the first version.
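If caching does become necessary later, one possible shape is sketched below. Everything here is an assumption for illustration: `KeyValueStorage` stands in for a file on disk or a database table, `loadWithCache` is a hypothetical helper, and the fingerprint is whatever cheap change indicator is available (for example file size plus modification time, or a content hash).

```typescript
// Hypothetical storage interface: could be backed by a file on disk or a database table.
interface KeyValueStorage {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

type ParsedEntry = { id: string; title: string };
type CacheRecord = { fingerprint: string; entries: ParsedEntry[] };

// Returns the cached parse result when the fingerprint is unchanged, otherwise re-parses
// and overwrites the stored record for that path.
function loadWithCache(
  path: string,
  fingerprint: string,
  readAndParse: () => ParsedEntry[],
  storage: KeyValueStorage
): ParsedEntry[] {
  const cached = storage.get(path);
  if (cached !== undefined) {
    const record = JSON.parse(cached) as CacheRecord;
    if (record.fingerprint === fingerprint) return record.entries;
  }
  const entries = readAndParse();
  const record: CacheRecord = { fingerprint, entries };
  storage.set(path, JSON.stringify(record));
  return entries;
}

// Usage with an in-memory Map standing in for persistent storage.
let parseRuns = 0;
const parseStub = (): ParsedEntry[] => {
  parseRuns += 1;
  return [{ id: "knuth1984", title: "The TeXbook" }];
};
const backing = new Map<string, string>();
loadWithCache("refs.bib", "size=120;mtime=1", parseStub, backing);
loadWithCache("refs.bib", "size=120;mtime=1", parseStub, backing); // served from the cache
const afterChange = loadWithCache("refs.bib", "size=130;mtime=2", parseStub, backing); // re-parsed
```

Storing one record per path keeps stale results from accumulating, at the cost of caching only the latest version of each file.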
