
I didn't know that EPUB is based on HTML. I always had the impression that it used its own binary format.

Using HTML as a base makes a lot of sense.



It's just a zip file with HTML documents and some (ePub-specific) XML files to define metadata, chapters, and a few things like that. I use this "epub-edit" script to edit them:

  #!/bin/zsh
  #
  # Extract epub file to a temp directory, launch shell to edit it, and re-zip
  # it. Nothing about this is really epub-specific as such.
  echo " $@" | grep -q -- ' -h' && { sed '1,2d; /^[^#]/q; s/^# \?//;' "$0" | sed '$d'; exit 0; }  # Show docs
  [ "${ZSH_VERSION:-}" = "" ] && echo >&2 "Only works with zsh" && exit 1
  setopt err_exit no_unset no_clobber pipefail
  
  full=$1:a
  
  tmp=$(mktemp -d)
  bsdtar xf $1 -C $tmp
  
  cd $tmp
  print "Editing $1; press ^D to exit"
  zsh ||:
  
  mv -f $full $full.orig
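  # Build a fresh archive (the original was moved to $full.orig above).
  # Note: strict EPUB readers expect "mimetype" first and uncompressed.
  zip -r $full *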
  cd -
  rm -r $tmp
And then I use vim to edit the HTML files and such.


It's just a zip file. On Linux, macOS, or BSD you can trivially write a script that unzips the ebook and writes its HTML files to one large text stream; pipe that stream into a text-mode web browser and you can read ebooks anywhere with about two lines of code.


W3C standards basically always build on top of other existing W3C standards.



