Simple Joplin Web Clipper Server

I just wrote a simple Joplin Web Clipper Server for my own little purpose:
pls-split-modules-to-dependency-npm-packages.

Features

  • Converts HTML to Markdown
  • Downloads the pictures referenced in the HTML
  • Treats each directory as a notebook
  • Saves the markdown and pictures under the current (or specified) root folder
  • Lets you choose the rules for stored file and directory names:
    1. a markdown file next to an assets folder of the same name
      • markdown file: ${folder}/${title}.md
      • markdown assets folder: ${folder}/${title}/
      • markdown assets base file name: ${assetBaseName}
    2. a folder named after the note, with index.md (or README.md) inside it as the markdown file
      • markdown file: ${folder}/${title}/index.md
      • markdown assets folder: ${folder}/${title}/
      • markdown assets base file name: ${assetBaseName}
    • or customize the patterns yourself
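
To make the two naming rules concrete, here is a minimal sketch of how such patterns could expand. It is written in Python purely for illustration (h2doc itself is installed through npm), and it is not h2doc's actual implementation: the `LAYOUTS` table and the `expand` helper are hypothetical names, and Python's `string.Template` is used only because it happens to understand the same `${name}` placeholder syntax.

```python
from string import Template
from typing import Tuple

# The two layouts from the list above, written as pattern sets.
LAYOUTS = {
    "sibling": {  # rule 1: markdown file next to a folder of the same name
        "markdown": "${folder}/${title}.md",
        "asset": "${folder}/${title}/",
    },
    "index": {  # rule 2: markdown stored as index.md inside its own folder
        "markdown": "${folder}/${title}/index.md",
        "asset": "${folder}/${title}/",
    },
}


def expand(layout: str, folder: str, title: str, asset_file: str) -> Tuple[str, str]:
    """Return (markdown path, asset path) for one note and one of its assets."""
    patterns = LAYOUTS[layout]
    values = {"folder": folder, "title": title}
    markdown_path = Template(patterns["markdown"]).substitute(values)
    asset_dir = Template(patterns["asset"]).substitute(values)
    return markdown_path, asset_dir + asset_file


print(expand("sibling", "notes", "my-title", "abc.png"))
print(expand("index", "notes", "my-title", "abc.png"))
```

Both layouts keep the assets in the same folder; only the location of the markdown file changes.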

Supported variables and functions:

  • folder: the directory relative to root (comes from Joplin Web Clipper)
  • title: the note title (comes from Joplin Web Clipper)
  • assetBaseName: the asset file name, without the extension
  • date: the date and time in ISO format
  • index: the index number of the asset
  • slug: the smart slug of the title
  • shortid(): returns a short unique ID
  • toSlug(str): converts str to a smart slug
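
For readers unsure what a slug is, the sketch below shows the general idea in Python. It is only a rough stand-in, under the assumption that a slug is a lowercased, separator-joined form of the title: the real toSlug is described as "smart" and its configuration mentions transliteration (for example Pinyin for Chinese), which this ASCII-only version does not attempt. The `to_slug` name and its parameters are hypothetical.

```python
import re
import unicodedata


def to_slug(text: str, separator: str = "-", maintain_case: bool = False) -> str:
    """Rough stand-in for a slug function: ASCII only, no transliteration."""
    # Decompose accented characters, then drop whatever is not plain ASCII.
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    if not maintain_case:
        ascii_text = ascii_text.lower()
    # Collapse every run of non-alphanumeric characters into one separator.
    slug = re.sub(r"[^A-Za-z0-9]+", lambda _match: separator, ascii_text)
    # Remove separator characters left over at either end.
    return slug.strip(separator)


print(to_slug("my title"))
print(to_slug("Café au Lait!", maintain_case=True))
```

With the default separator, "my title" becomes "my-title".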

Usage

Install with npm, or download a binary package from https://github.com/snowyu/h2doc.js/releases

$ npm install -g h2doc@alpha
$ h2doc server
running joblin web clipper server...

$ h2doc (-v|--version|version)
h2doc/0.1.0-alpha.3 linux-x64 node-v12.18.2
$ h2doc --help [COMMAND]

USAGE
$ h2doc COMMAND
...

Configuration

The config file can be named .md-config.(yaml|json) or md-config.(yaml|json).

The config file is searched for in this order:

  1. the current working (root) directory
  2. the user home directory
  3. the application config directory
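
The lookup described above could be sketched like this in Python. This is an illustration under assumptions, not h2doc's code: the `find_config` helper is hypothetical, the order in which the four file names are tried inside one directory is a guess, and the application config directory shown here is a placeholder because its real location is not stated.

```python
from pathlib import Path
from typing import Iterable, Optional

# Candidate file names, taken from the naming rule above.
# Their relative priority inside one directory is an assumption.
CONFIG_NAMES = [
    ".md-config.yaml",
    ".md-config.json",
    "md-config.yaml",
    "md-config.json",
]


def find_config(search_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first config file found, checking directories in order."""
    for directory in search_dirs:
        for name in CONFIG_NAMES:
            candidate = directory / name
            if candidate.is_file():
                return candidate
    return None


# Placeholder application config directory (hypothetical path).
app_config_dir = Path.home() / ".config" / "h2doc"
print(find_config([Path.cwd(), Path.home(), app_config_dir]))
```

The first match wins, so a config in the working directory shadows one in the home directory.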
# .md-config.yaml
output:
  root: . # the root folder, defaults to current working directory.
  exclude:
    - node_modules
  deep: 5 # Specifies the maximum depth of a read directory relative to the root.
  markdown: ${folder}/${title}.md # path pattern of the markdown file; ${slug} can be used to name it after the smart slug
  asset: ${folder}/${title}/
  assetBaseName: ${assetBaseName} # do not include extname
slug: # the smart slug options; a plain string is treated as the separator
  separator: '-' # String to replace whitespace with, defaults to -
  lang: '' # ISO 639-1 two-letter language code, defaults to auto-detected language
  tone: false # add tone numbers to Pinyin transliteration of Chinese, defaults to true
  separateNumbers: false # separate numbers that are within a word, defaults to false
  maintainCase: false # maintain the original string's casing, defaults to false
download: true # whether to download assets
format: # WARNING: these options may change in the future
  headingStyle: 'atx' # setext or atx
  hr: '---'
  bulletListMarker: '*'
  codeBlockStyle: 'fenced' # indented or fenced
  fence: '```' # ``` or ~~~
  emDelimiter: '_' # _ or *
  strongDelimiter: '**' # ** or __
  linkStyle: 'inlined' # inlined or referenced
  linkReferenceStyle: 'full' # full, collapsed, or shortcut
  gfw:
    strikethrough: true # for converting <strike>, <s>, and <del> elements
    tables: true
    taskListItems: true
frontMatter: # which front matter fields to insert into the markdown
  title: true
  url: true
  author: true
  date: true
  publisher: true
  lang: true
  description: true
  image: true
  video: true
  audio: true
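
As an illustration of what the frontMatter switches might control, here is a small Python sketch that builds a front matter block from page metadata and a set of on/off flags. It assumes each flag simply includes or omits the field of the same name; the `build_front_matter` helper, the sample metadata, and the exact output format are hypothetical and may differ from what h2doc writes.

```python
import json


def build_front_matter(meta: dict, enabled: dict) -> str:
    """Render the enabled metadata fields as a YAML front matter block."""
    lines = ["---"]
    for key, is_on in enabled.items():
        if is_on and key in meta:
            # A JSON string is also a valid YAML scalar, so no YAML library is needed.
            lines.append(f"{key}: {json.dumps(meta[key], ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n"


# Sample metadata and flags, made up for this example.
meta = {"title": "my title", "url": "https://example.com/post", "author": "someone"}
enabled = {"title": True, "url": True, "author": False}
print(build_front_matter(meta, enabled))
```

Fields that are switched off, or that the page does not provide, are left out of the block.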

It is dawning on me, but I don’t quite understand it yet (my mind is running on idle at the moment).
Would you mind explaining the use case in a bit more detail (with pics)?

For example (with the default config):

Suppose you want to clip this, with the title “my title”:

<a href='link'><img src='https://.../abc.png' /> </a>

Then start the server and clip it:

h2doc server your-clip-dir
<Ctrl+c> to stop it
cd your-clip-dir
ls -R
my-title.md
my-title/abc.png
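
The mapping from the clipped HTML to those two files can be sketched in Python as follows. This is a simplified illustration rather than h2doc's actual code: `ImageCollector` and `asset_path` are hypothetical names, the slug "my-title" is passed in directly, and nothing is downloaded or written to disk.

```python
from html.parser import HTMLParser
from posixpath import basename, splitext
from urllib.parse import urlparse


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def asset_path(note_slug: str, image_url: str) -> str:
    """Build the asset path: note folder, asset base name, then the extension."""
    file_name = basename(urlparse(image_url).path)
    base_name, extension = splitext(file_name)
    return f"{note_slug}/{base_name}{extension}"


# The clipped snippet from above, with its elided URL kept as written.
html = "<a href='link'><img src='https://.../abc.png' /> </a>"
collector = ImageCollector()
collector.feed(html)
print("my-title.md")
for src in collector.sources:
    print(asset_path("my-title", src))
```

Run as is, it prints the same two paths as the listing above.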