What is the difference between Markdown, Docx and HTML files?

Hi,

I still don’t understand the detailed differences between Markdown, Docx and HTML files.

Could someone please help me to understand it better?

Thanks.

Hmm, I’m not quite sure what you mean, but those are 3 different formats - that’s all.

.docx is a Microsoft document format, which has been used for MS Word for a while now. I believe it is no longer a binary format, but similar to an open text format.

Markdown is a markup language that is easily readable in the source format and can be rendered into HTML. It has limited formatting options, but is widely used in README files and note taking apps.
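To make “can be rendered into HTML” a bit more concrete, here is a minimal Python sketch that handles only bold and italics. The function name `render_inline` is made up for this example, and real Markdown renderers (including whatever Joplin uses) cover far more syntax and edge cases than two regular expressions can.

```python
import html
import re


def render_inline(markdown_text: str) -> str:
    """Convert a tiny subset of Markdown (bold and italics) to HTML.

    Hypothetical helper for illustration only, not a full Markdown renderer.
    """
    # Escape &, < and > so plain text cannot be mistaken for HTML tags.
    result = html.escape(markdown_text, quote=False)
    # Handle **bold** first, otherwise the italic rule would eat the asterisks.
    result = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", result)
    result = re.sub(r"\*(.+?)\*", r"<em>\1</em>", result)
    return result


print(render_inline("Markdown is **easy** to *read*."))
# prints: Markdown is <strong>easy</strong> to <em>read</em>.
```

The source line stays readable as plain text, which is the whole point: the asterisks are the only “code” you have to write.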

HTML is the format used on websites to render content. It’s been around for many years, so I’m not really sure what I should write about it, except that I despise writing HTML code. :wink:

I understand docx and HTML, but I wanted to know why Markdown is required. Why does Joplin require it instead of offering an option to import directly from docx, text or HTML files?

Why did Joplin choose the Markdown file format if it has more limitations than advantages?

Markdown was chosen because it’s easy to write. It’s like plain text, which in many cases is all that’s needed, but with the extra option to create bold text, add images, etc.

HTML is difficult to edit manually, and docx is similar (like tessus, I don’t know if it’s binary or not, but it’s at least some XML or similar).

There are options to import text and markdown, but as you can imagine the app cannot realistically support every existing text format. That’s why you need something like Pandoc to convert to a format supported by Joplin. If we wanted to support everything, the only sensible solution would be to bundle Pandoc, which would be a huge dependency.
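As a rough idea of what the Pandoc step could look like, here is a small Python sketch that wraps the command line `pandoc -f docx -t gfm input.docx -o output.md`. It assumes Pandoc is installed separately and is on your PATH; the function names and the choice of `gfm` (GitHub-flavoured Markdown) as the output format are just assumptions for this example, not something Joplin prescribes.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(source: Path, target: Path) -> list[str]:
    """Build a Pandoc command line that converts a .docx file to Markdown.

    Hypothetical helper: -f is the input format, -t the output format,
    -o the output file.
    """
    return ["pandoc", "-f", "docx", "-t", "gfm", str(source), "-o", str(target)]


def convert_docx(source: Path, target: Path) -> None:
    """Run Pandoc on one file, failing loudly if Pandoc is not installed."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH; install it first")
    subprocess.run(pandoc_command(source, target), check=True)


# Show the command that would be run for an example file name.
print(" ".join(pandoc_command(Path("notes.docx"), Path("notes.md"))))
```

The resulting `.md` file could then go through Joplin’s existing Markdown import. Complex Word formatting will likely not survive the trip intact.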

A long, long time ago, printing was a laborious process that required every letter to be individually carved in metal. Things got a bit easier around 150 years ago, but if you wanted a book to look professionally published you still needed to go to a professional printing house to get it done.

As computers got better, people started asking themselves if they couldn’t use them to make typography and typesetting easier. Turns out they could… except you still needed to be a programmer to format the text. Wanted a word to be bold? Well, better start learning how to code.

Not everybody agreed with this approach, so people started trying to make a What You See Is What You Get editor to format text in. These WYSIWYG editors became really popular. Microsoft made a name for itself with Word, which dates back to the DOS era, and with the Office suite built around it, the descendants of which are still used today on modern desktops.

The problem with these WYSIWYG text editors was that none was as good as actually programming the text to look a certain way. Two initial solutions were found. For people that needed a simplified text formatting input, TeX/LaTeX was developed, and for professional designers, graphical tools that allowed them to alter the way the text looked at levels equal to the old school “ink and paper” guard. The problem was that most people weren’t designers, and even if they were, no one wanted to format an entire book by hand in a graphical editor. LaTeX was good, but still too complex for “normal” people to use, and WYSIWYG editors were still not good enough… so everybody started using the simplest tool that would do the job they had. Offices went with WYSIWYG, professional typesetters went with LaTeX or other tools made for them, and graphic designers… well, they all went with Adobe.

For people that were willing to get a little bit dirty and dive into code, a new solution was found: markup. Markup is basically a simple way of telling a program how to format text that’s easy for humans to understand. Technically LaTeX and Rich Text Format (RTF) are markup, but they were still too complex for most people to bother learning, let alone for typical office usage.

Markdown was the solution. Basically, Markdown is the simplest possible way to format text that remains humanly readable. It’s not as complex or powerful as WYSIWYG editors or TeX and LaTeX editors, but it’s good enough to do the job. The problem is of course that it’s still more complex than WYSIWYG while not as useful as a full LaTeX editor.

Now, if Markdown isn’t as powerful as WYSIWYG editors, why is it even used? Well, it’s a matter of usability. It’s easier to add two asterisks before and after a word than it is to move your right hand from the keyboard to the mouse, select the text and then press the bold button, or to use some convoluted, editor-specific keyboard shortcut to select the text and make it bold. Basically, Markdown is a simple (the simplest?) way to format plain text. It’s useful if you can’t use a mouse (which explains why there are so many Markdown editors for mobile platforms) or don’t want to use one (which explains why it’s so popular with the CLI-obsessed Linux crowd). It’s also popular with programmers who use more complex markup languages like HTML (the HyperText Markup Language), because Markdown can easily be converted to HTML and back (the same goes for other markup languages like LaTeX or RTF), and with anyone else who wants to format their text as they go but doesn’t like how difficult that would be in a WYSIWYG editor like MS Word.

As for why Joplin doesn’t import from .docx: at least at one point in the past, the actual code that formatted the document was hidden from the end user and as such inaccessible. What does that even mean? It meant that you couldn’t get your formatting or, worse, your text out of a Word document unless you had MS Office. Things are a bit better now, but most simple note-taking apps like this one don’t bother with .docx since, even if they did, the formatting could be a lot more complicated than simple Markdown could handle. You can still open a .docx and copy the text over into any Markdown editor, but I doubt importing .docx and all its formatting will ever happen for any Markdown editor.

I agree that we should not bundle Pandoc. But why don’t we have some kind of plugin system, so that any user can install a plugin, use it, and then remove it when it’s no longer needed, like in Notepad++?

Notepad++ has a huge number of plugins, but they are not always needed. Examples: import, file manager, XML formatter, etc.

Please consider a plugin-style feature if possible.

Thanks a lot for the detailed explanation…

You can open the note in Notepad++ with Joplin’s “open with external app” option and use the Notepad++ plugin that you mentioned.

I didn’t understand your answer. I am asking why we can’t have a plugin option that lets users import docx files and do the conversion via the plugin. That way the app would not be overloaded. For most users it would be a one-time activity.

With Joplin you can call an external editor, like Notepad++, which uses plugins.

We are saying that adding such a plugin is too much work for such an edge case.

For those who may be interested, docx is basically a set of text-based XML files zipped together. Just change the “docx” extension of a Word file to “zip” and you can rummage through the contents like any other “zip” file. Inside the zip there should be a file path similar to file.zip\word\document.xml where the text is stored, hidden amongst tons of tags and XML references.
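If you’d rather not rename files by hand, the same rummaging can be sketched in a few lines of Python using only the standard library. The function name `docx_paragraphs` is made up for this example, and the tiny in-memory archive below is a stand-in, not a complete Word file (a real .docx contains more parts than just word/document.xml). The sketch only pulls plain paragraph text and ignores all formatting, images, headers and footers.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by the main WordprocessingML document part.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(docx_file) -> list[str]:
    """Return the plain text of each paragraph in word/document.xml.

    Hypothetical helper; accepts a path or a file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(W + "p"):
        # The visible text lives in <w:t> elements inside each paragraph.
        paragraphs.append("".join(t.text or "" for t in paragraph.iter(W + "t")))
    return paragraphs


# Stand-in archive so the sketch runs without a real Word file at hand.
SAMPLE_XML = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:body>"
    "<w:p><w:r><w:t>Hello </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
    "</w:body></w:document>"
)
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("word/document.xml", SAMPLE_XML)
sample.seek(0)

print(docx_paragraphs(sample))
# prints: ['Hello world', 'Second paragraph']
```

For a real file you would pass its path instead, e.g. `docx_paragraphs("report.docx")`, where `report.docx` is just a placeholder name.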

Thanks, I thought that they changed the format at one point, but I wasn’t 100% sure. Your explanation confirmed it. Thanks again.

The answer to questions such as these is often the same:
We don’t have it, because nobody built it.
Don’t forget that there is no for-profit company behind Joplin, which means you get everything for free, no strings attached, but it also means everything is done because someone did it in their spare time and gave their work away out of the goodness of their heart.

I’m sure if you, or anyone else, create a tool that takes a bunch of DOCX files and imports them into Joplin, nobody will complain and you will be thanked by everyone who needs it.

Another thing is… there don’t seem to be that many people who need it, at least so far, which means the probability of someone taking the time to do it is not very high at the moment.

The good news, though, is it can still be done: if you search this forum, you should find out how. It’s a multi-step process involving Pandoc, but it can be done, and as you mentioned, you only need do it once.

Also see things like this for why some of us prefer markdown to Word. :slight_smile: