C# - Scrape JavaScript array data with CsQuery




Some of the information I want to scrape is contained within the page's JavaScript. It looks similar to this pattern:

<script type="text/javascript">
arrayname["field1"] = 12;
arrayname["field2"] = 42;
arrayname["field3"] = 1442;
</script>
<script type="text/javascript">
arrayname["field4"] = 62;
arrayname["field5"] = 3;
arrayname["field6"] = 542;
</script>

It's mixed in with a hell of a lot of other JavaScript. I only need these values.

I started like so:

var dom = CQ.CreateFromUrl("http://somesite.xxx");
CQ script = dom["script[type='text/javascript']"];

But I cannot think how to actually grab the data. Is the way to go to create a regex and loop over the matches, or is there an approach with better performance?

I can't see how to use CSS selectors on actual JavaScript code. Should I try a different approach?

It seems like you are looking for a server-side JavaScript engine. CsQuery can get the contents of the script tags easily enough, but then you need to run the script and be able to refer to the entities it creates. While in theory one could create some kind of query language to parse out lines of a script, in reality that amounts to running it. If you just need to pull out particular lines containing simple assignments, and the context isn't important, then you're looking at simple regular expressions (or grep) to filter out what you need.
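For the simple-assignment case, the regular-expression route might look roughly like the sketch below. It assumes the script bodies have already been pulled out as plain strings (however you obtain them from the selected script elements), that the array is literally called arrayname as in the question, and that the values are plain integers. The class and method names (ArrayAssignmentScraper, Extract) are made up for illustration and are not part of CsQuery.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of CsQuery: picks integer assignments of the
// form arrayname["key"] = 123; out of raw script text.
public static class ArrayAssignmentScraper
{
    // Assumption: the array is named "arrayname", keys are double-quoted,
    // and values are (optionally negative) integers.
    private static readonly Regex Assignment = new Regex(
        @"arrayname\[\s*""(?<key>[^""]+)""\s*\]\s*=\s*(?<value>-?\d+)\s*;",
        RegexOptions.Compiled);

    // Scans every script body and collects key/value pairs.
    // If a key is assigned more than once, the last assignment wins.
    public static Dictionary<string, int> Extract(IEnumerable<string> scriptBodies)
    {
        var values = new Dictionary<string, int>();
        foreach (string body in scriptBodies)
        {
            foreach (Match m in Assignment.Matches(body))
            {
                values[m.Groups["key"].Value] =
                    int.Parse(m.Groups["value"].Value, CultureInfo.InvariantCulture);
            }
        }
        return values;
    }
}
```

Calling ArrayAssignmentScraper.Extract with the text of each script block returns a dictionary keyed by field name; any other JavaScript in the same blocks is simply ignored. This only works because context doesn't matter here: values that are computed, reassigned conditionally, or built from other variables would still need a real JavaScript engine.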

I have used the Noesis V8 wrapper (http://javascriptdotnet.codeplex.com/), which is on NuGet as Noesis.Javascript.

It's very fast (since it uses Google's V8 engine under the hood); the only real downside is that it's not a pure .NET solution, but once it is set up it's pretty painless. An example of using it is the project https://github.com/jamietre/sharplinter, which uses it to run JSHint.

There are a variety of 100% .NET JavaScript engines such as Jint, IronJS and Jurassic. I have used Jurassic before, and it's probably the fastest because it compiles to bytecode. It's surprisingly complete, but it is not being actively developed, and it is hard to get much support. All of them are much, much slower than V8 and offer no real advantages other than having no non-.NET references.

Unless you really, really need 100% .NET, use javascriptdotnet.

c# csquery
