elfs: (Default)
[personal profile] elfs
For reasons I'm not going to go into, I need two different HTML parsers. One needs to accept almost any arbitrary HTML5 input without the concomitant JavaScript processing, and then spit out a stripped-down, whitelist-tags-and-attributes-only version for storage; the other needs to recognize the full suite, plus a completely alien set of tags into which I'll be throwing some, er, extra functionality.
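
To make the first parser's job concrete, here is a minimal sketch of whitelist-based stripping. It is in Python rather than CoffeeScript, purely for illustration, and leans on the standard library's `html.parser` (which is not a full HTML5 parser). The `ALLOWED` table, the `WhitelistSanitizer` class, and the `sanitize` function are all hypothetical names, and the choice of tags is an assumption. It does not vet URL schemes in `href`, so treat it as a picture of the idea, not something safe to ship.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> set of attribute names to keep.
ALLOWED = {
    "p": set(),
    "em": set(),
    "strong": set(),
    "a": {"href", "title"},
}

# Elements whose text content is dropped entirely, not just the tags.
DROP_CONTENT = {"script", "style"}


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if tag not in ALLOWED:
            return  # unknown tag: drop the tag, keep its text
        kept = [(k, v) for k, v in attrs if k in ALLOWED[tag] and v is not None]
        rendered = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in kept)
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            if self.skip_depth > 0:
                self.skip_depth -= 1
            return
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Re-escape text so stray < and & cannot form new markup.
            self.out.append(escape(data, quote=False))


def sanitize(markup):
    parser = WhitelistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

Calling `sanitize()` on a paragraph that carries an `onclick` handler, a `<b>` tag, and a trailing `<script>` block keeps the paragraph and its text, drops the handler and the `<b>` tags, and discards the script entirely.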

I need this all written in CoffeeScript.

Nobody's done anything like this before, at least not in CoffeeScript. My brain is spinning; I haven't worked with real parsers since my days at F5. Nothing like this was necessary for Isilon or IndieFlix. And, oh my gods, the HTML5 parsing standard is explicit, easy to implement, and huge.

I can use some of the existing JavaScript or Python parsers as starting points, but they're not terribly easy to extend. I'd also like to try to use a parser combinator, because my experience has been that PC grammars are easier to understand. But try as I might, my head explodes when trying to grasp whatever it is I'm trying to do. Still, we'll see. After fridgemagnets, I need a bigger project.
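
For readers who haven't met parser combinators, here is a small sketch of the style, again in Python for illustration rather than CoffeeScript. Each parser is a function taking `(text, pos)` and returning `(value, new_pos)` or `None` on failure; the combinators `seq`, `many`, and `fmap` build bigger parsers from smaller ones. All names here are hypothetical, and the toy grammar only handles a start tag with double-quoted attributes, nowhere near what the HTML5 spec requires.

```python
import re


def regex(pattern):
    """Parser that matches a regular expression at the current position."""
    compiled = re.compile(pattern)

    def parse(text, pos):
        m = compiled.match(text, pos)
        return (m.group(0), m.end()) if m else None
    return parse


def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def many(parser):
    """Apply a parser zero or more times, collecting the results."""
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            if result is None or result[1] == pos:  # stop on failure or no progress
                return values, pos
            value, pos = result
            values.append(value)
    return parse


def fmap(fn, parser):
    """Transform the value a parser produces."""
    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return fn(value), new_pos
    return parse


# The grammar reads almost like its own description.
name = regex(r"[A-Za-z][A-Za-z0-9-]*")
quoted = fmap(lambda s: s[1:-1], regex(r'"[^"]*"'))
attribute = fmap(lambda v: (v[1], v[3]), seq(regex(r"\s+"), name, regex(r"="), quoted))
open_tag = fmap(
    lambda v: (v[1], dict(v[2])),
    seq(regex(r"<"), name, many(attribute), regex(r"\s*"), regex(r">")),
)
```

The appeal is that `open_tag` is defined in terms of `name` and `attribute` the same way a grammar rule would be, so extending it with an alien set of tags means adding rules rather than rewriting a state machine.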

Date: 2012-04-04 07:51 pm (UTC)
From: [identity profile] mikstera.livejournal.com
Which Javascript parsers are you using as your starting point?

Also, what makes the use of coffeescript a hard and fast requirement (which I assume from your choice of words)?

Profile

Elf Sternberg
